Xeneize: The Average Tuning System: Scala Archive Statistics

Introduction: The Average Tuning System (ATS)

The Average Tuning System (ATS) represents a set of pitches derived from a descriptive statistical analysis of the extensive Scala Archive, a renowned database encompassing a vast collection of global tunings. The primary objective of this investigation was to identify common structural elements and tendencies across diverse historical and contemporary tuning practices, seeking a statistically informed representation of shared characteristics.

A core aspect of the analysis involved interpreting the data under the assumption that most tuning systems within the archive function as periodic pitch sets. To comprehensively assess the interval content, an interval matrix expansion was performed for each tuning file. The rationale for this step is crucial: cyclic permutation and base changes inherent in periodic sets mean that the initially presented sequence of intervals (the "key") may not fully reveal the system's most prominent interval relationships. Matrix expansion calculates all possible intervals generated within the set, providing a complete picture independent of the starting note. This process revealed that while some systems appear distinct initially, expansion shows they are permutations of the same underlying structure (torsors), often highlighting strong internal interval preferences (like the perfect fifth) not explicit in the original file listing.

Despite the potential for the matrix expansion to alter perceived interval prominence, key statistical findings remained remarkably consistent whether analyzing the initial keys directly or the fully expanded interval matrices. The average (mean), median, and mode for the number of notes per system, as well as the ranking of the most frequent intervals (top 10), showed strong convergence in both scenarios, indicating powerful underlying tendencies within the archive.

The analysis navigated inherent challenges related to data representation, including precision issues arising from converting between fractional ratios and cent values, the limitations of decimal representations for logarithmic pitch data, and the effects of necessary truncation and clustering. While acknowledging these potential sources of error (and noting that analysis on a logarithmic scale would be optimal), the fundamental trends in interval popularity and system size proved robust even when accounting for these factors.

Therefore, while not an exhaustive machine learning approach, this descriptive statistical analysis confidently identifies key features:
Dominant Equave: The octave (2/1) serves as the interval of equivalence in over 95% of the analyzed systems.
Common System Size: The statistical average size was 17 notes, with 12 being the median and clear mode. For the ATS, this was refined to 14 notes, considering practical application constraints (e.g., guitar fretting).
Most Prominent Intervals: The analysis yielded a set of the 14 most frequent intervals, forming the basis of the ATS:

{16/15, 10/9, 7/6, 6/5, 5/4, 4/3, √2, 3/2, 8/5, 5/3, 12/7, 9/5, 15/8, 2/1}

The ATS, constructed from these statistically prevalent components, offers a unique perspective—a tuning system reflecting the central tendencies found within the diverse tapestry of the Scala Archive.

Further details of the analysis, including specific statistical distributions, graphical representations, data processing considerations, and comparisons, are presented in the following draft analysis section. Further research avenues, potentially involving more sophisticated correlation analyses, remain possible.

DRAFT:

This tuning system is a simple descriptive statistical representation of the Scala archive, a renowned curated database of global tunings. It seeks common ground and practical use among diverse world tunings.

Interval   Traditional Western Name
16/15      minor diatonic semitone
10/9       minor whole tone
7/6        septimal minor third
6/5        minor third
5/4        major third
4/3        perfect fourth
√2
3/2        perfect fifth
8/5        minor sixth
5/3        major sixth
12/7       septimal major sixth
9/5        just minor seventh
15/8       classic major seventh
2/1        octave

Statistics and tuning construction:

Across the 5,176 files, system sizes range from 2 to 579 notes. The average system size is 17, with a median of 12. The mode is also 12, appearing 1,546 times, followed by 7-note tunings with 715 occurrences. This signifies a diverse collection, albeit with a notable concentration of systems around the 12-note mark.

Top 5 Sizes

Size   Occurrences
12     1546
7      715
5      231
19     218
8      206

While some files span multiple octaves or include non-reduced intervals below the unison, these instances are relatively rare. Most are periodic tunings aligned with the octave, the archive's most common interval. (Note: rather than merely rare, some of these files are intentionally wrong. The Scala file definition specifies that the 1/1 is omitted and that the list concludes with the equave, so implementations may ignore such values entirely.)
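To make the file convention in the note above concrete, here is a minimal Python sketch of a reader for the .scl text format. It assumes the commonly documented layout: lines starting with "!" are comments, the first remaining line is a description, the second is the note count, and each pitch line is a cents value if it contains a period and a ratio otherwise. The name parse_scl is hypothetical and is not taken from the program used for this analysis.

```python
from fractions import Fraction

def parse_scl(text):
    """Parse .scl text into (description, pitches).

    Cents values come back as float, ratios as Fraction.
    Hypothetical helper, not the analysis program's own reader.
    """
    # Comment lines start with "!" and carry no pitch data.
    body = [ln for ln in text.splitlines() if not ln.lstrip().startswith("!")]
    description = body[0].strip()
    count = int(body[1].split()[0])
    # Only the first token of a pitch line matters; the rest is free text.
    tokens = [ln.split()[0] for ln in body[2:] if ln.strip()]
    pitches = []
    for token in tokens[:count]:
        if "." in token:
            pitches.append(float(token))      # cents
        else:
            pitches.append(Fraction(token))   # ratio such as 3/2 or 2
    return description, pitches

EXAMPLE = """! example.scl
!
Example 5-note tuning
 5
!
 16/15
 6/5
 8/5
 9/5
 2/1
"""

description, pitches = parse_scl(EXAMPLE)
equave = pitches[-1]   # the last listed pitch; 1/1 is implied, not listed
```

Reading the equave as the last listed pitch is what makes a per-file equave count possible; a stricter reader would also validate the note count and reject negative or zero ratios.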

A direct analysis of the files, taking only the first key of each tuning (87,558 notes in total), shows the octave to be the most common interval. It appears in its exact representation in 4,481 files and in close variations in practically all tunings.

The perfect fifth emerges as the second most popular interval, succeeded by the perfect fourth and major third.

Distribution of intervals. The two graphics depict identical data. The first graphic displays both vertical and horizontal axes on a linear scale, while the second utilizes a logarithmic scale for the vertical axis. This logarithmic scale highlights intervals that occur only once, significantly beyond the octave, as well as those appearing below a value of 1.

Top 5 Intervals

Interval   Name               Occurrences
2/1        octave             4481
3/2        perfect fifth      2001
4/3        perfect fourth     1743
5/4        major third        1290
9/8        major whole tone   1095

Assuming all tunings are periodic, cyclical pitch sets, the octave is identified as the interval of equivalence in 4,379 tuning files. The next most common equave is the twelfth, with only 93 files.

When all added tones are calculated, the complete interval matrices of the octave-ending tunings alone yield a total of 2,641,310 intervals, and the list of the most frequent intervals remains largely unchanged.

The two graphics present distinct datasets. The first graphic represents the scan of the initial key in each file, while the second illustrates the scan subsequent to computing all matrices. Both graphics showcase the top 17 intervals, which exhibit remarkable similarity. Each graph encompasses a single octave, with both vertical and horizontal axes set to a logarithmic scale.

(Why is it important to calculate the interval matrix and added tones to determine the most common intervals?

Take this periodic tuning, for example: 16/15 6/5 8/5 9/5 2/1.

If you're not very familiar with intervals, simply seeing the initial key tells you little. However, computing the matrix for this 5-note periodic tuning reveals 14 unique intervals. Among these, the most common are the fifth (3/2), the fourth (4/3), the major whole tone (9/8), and the Pythagorean minor seventh (16/9), none of which appears explicitly in the "first" key.)
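The example above can be reproduced with exact fractions. The following is a minimal Python sketch, not the program used for the analysis; the function name interval_matrix is an assumption, and the key is expected to omit 1/1 and end with the equave.

```python
from collections import Counter
from fractions import Fraction

def interval_matrix(key):
    """Rows of the interval matrix of a periodic tuning.

    `key` lists the pitches above the implied 1/1 and ends with the equave.
    """
    equave = key[-1]
    pitches = [Fraction(1)] + list(key[:-1])
    rows = []
    for i, root in enumerate(pitches):
        row = []
        for j, pitch in enumerate(pitches):
            if i == j:
                continue
            ratio = pitch / root
            if ratio < 1:          # fold pitches below the root up by one equave
                ratio *= equave
            row.append(ratio)
        rows.append(sorted(row) + [equave])
    return rows

KEY = [Fraction(s) for s in "16/15 6/5 8/5 9/5 2/1".split()]
rows = interval_matrix(KEY)

# Count every interval except the equave, which closes each row.
counts = Counter(r for row in rows for r in row[:-1])
for ratio, n in counts.most_common(4):
    print(ratio, n)
```

Each row is one rotation of the tuning, rebuilt from a different starting note. Counting across all rows gives 14 distinct intervals besides the octave, with 3/2 and 4/3 occurring three times each and 9/8 and 16/9 twice each.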

Precision issues affect interval categorization. They result from converting fractions and cents, the two notations of Scala files, to a common decimal representation, which inherits the usual machine-number problems. For example, in the complete matrix of an equal division system, a given size should always imply the same number of distinct intervals, yet floating-point rounding may cause some identical intervals to be counted as different.
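A short Python sketch of the equal-division case, under the assumption that pitches are stored as decimal frequency ratios: every interval of n steps should be the same number no matter which two pitches produce it, but dividing floating-point values may give results that differ in the last bits. The function name and the rounding to 9 decimals are illustrative choices, not the settings used in the analysis.

```python
def edo_matrix_values(n):
    """Distinct float values found in the interval matrix of n equal divisions of 2/1."""
    pitches = [2 ** (k / n) for k in range(n)]
    values = set()
    for i, low in enumerate(pitches):
        for j, high in enumerate(pitches):
            if i == j:
                continue
            ratio = high / low
            if ratio < 1:      # octave-reduce intervals measured downwards
                ratio *= 2
            values.add(ratio)
    return values

raw = edo_matrix_values(12)
rounded = {round(v, 9) for v in raw}

# Ideally 11 distinct intervals lie between unison and octave.
print(len(raw), len(rounded))
```

If the first number printed is larger than 11, the surplus consists of duplicates that differ only by rounding error; rounding to a common precision merges them back into 11.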

Another problem arises in categorizing cent tunings. Some files may refer to the same note, but because their definitions use a different number of digits, no program will consider them equal (701.955 != 701.95).

You can attempt to correct this by limiting all values to the same number of digits, which effectively reduces the number of distinct intervals. However, because the truncation is applied to the decimal format, the loss of definition is uneven across the musical notes, whose original distribution (without repetitions) is nonlinear.

The graph represents the tuning space horizontally and accumulates identical exact repetitions vertically.

Both graphics portray identical data, but the second one illustrates the data after truncation (with a maximum error of approximately 0.2 cents). Both visuals display the top 17 intervals, which remained consistent even after truncation. This reduction resulted in 242,538 unique intervals being compressed to just 9,997. The logarithmic view in the graphic also highlights the uneven definition loss of musical notes post-truncation, which was executed on the decimal data.
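The uneven loss can be estimated with a few lines of Python. This sketch assumes the truncation keeps four decimal places of the frequency ratio; the post does not state the digit count, but four digits is consistent with the roughly 0.2-cent maximum error and with about 10,000 remaining values within one octave. The helper names are hypothetical.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(ratio)

def truncate(ratio, digits=4):
    """Drop everything after `digits` decimal places (assumed digit count)."""
    scale = 10 ** digits
    return math.floor(ratio * scale) / scale

def bin_width_cents(ratio, digits=4):
    """Width in cents of the truncation bin that starts at `ratio`."""
    step = 10 ** -digits
    return cents((ratio + step) / ratio)

near_unison = bin_width_cents(1.0)      # widest bin, just above 1/1
near_octave = bin_width_cents(1.9999)   # narrowest bin, just below 2/1
tet_fifth = truncate(2 ** (7 / 12))     # 12-tone equal temperament fifth
```

A fixed decimal step covers about twice as many cents near the unison as near the octave, so small intervals lose more definition than large ones. Truncating in cents instead would give bins of equal musical width.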

Progressively truncating the notes in this way doesn't significantly alter popularity; even a 2-cent error proved insufficient to dislodge any of the prominent peaks.

Additionally, the graph undergoes intrinsic truncation due to its fixed resolution, which is significantly lower than the data range. Consequently, different notes are depicted on the same pixel. This is used to add a third dimension to the graph, highlighting note concentrations, which are always very close to some of the already favored intervals. For example, the perfect fifth has a concentration of notes next to it, hinting at systems like 12-tone equal temperament, where the fifth is 700 cents. However, without altering the graphical scale, these clusters won't even be apparent.
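One way to picture this pixel-level clustering is to bin cents values into a fixed number of columns and record, per column, both the total occurrences and the number of distinct values. The Python sketch below is only an illustration of that idea; the function name, the column width, and the sample notes are assumptions, not the plotting code behind the graphics.

```python
from collections import Counter, defaultdict

def pixel_columns(cents_values, width, span=1200.0):
    """Bin cents values into `width` columns covering [0, span).

    Returns {column: (occurrences, distinct_values)}.
    """
    totals = Counter()
    distinct = defaultdict(set)
    for c in cents_values:
        column = int(c * width // span)
        totals[column] += 1
        distinct[column].add(c)
    return {col: (totals[col], len(distinct[col])) for col in totals}

# Illustrative notes: the 12-TET fifth, the just fifth twice, the just fourth.
notes = [700.0, 701.955, 701.955, 498.045]
columns = pixel_columns(notes, width=120)   # 10 cents per column
```

The occurrence count would drive the height of a bar, while the distinct-value count could drive its color, so a column that holds both 700 and 701.955 cents stands out from one that holds a single exact value.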

Both graphics represent the analysis of the initial keys, displaying the same dataset. However, the first graphic features a vertical logarithmic scale, while the second employs a linear scale. Presented as a heat map, red areas denote note concentrations (which are not visible in the linear view), while blue indicates fewer notes.

The generated systems employing the 17 most frequent intervals are symmetric in both cases, mirrored about the square root of 2. Half of their intervals are superparticular and the other half are the reduced inversions of those: the perfect fourth and fifth, the major third and minor sixth, the minor third and major sixth, etc.

Nonetheless, some of these intervals are very small in practice. That poses little concern for keyboard or synthesizer configurations, but it imposes constraints on the guitar's limited space, among other factors that make the instrument less suitable for very precise tunings. Besides, 17 was just the average system size.

The final generated system consists of 13 notes, or 14 when including the square root of 2. This selection exhibits near-complete coverage of the tuning space. Graphically, its common-tone aggregate resembles the added tones for the entire collection, which is interesting. The intervals that were left out of the average 17 due to their proximity haven't disappeared entirely; they remain popular, even surpassing those included. For example, although the major whole tone was removed from the main key, it still exists in some of the other keys.
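The mirror symmetry about the square root of 2 can be checked directly on the rational intervals of the final system (the 14 listed earlier, minus √2 and the octave). This is a small Python sketch with hypothetical helper names; it relies only on the fact that mirroring a ratio r about √2 within the octave gives 2/r.

```python
from fractions import Fraction

ATS_RATIONAL = [Fraction(s) for s in
                "16/15 10/9 7/6 6/5 5/4 4/3 3/2 8/5 5/3 12/7 9/5 15/8".split()]

def mirror(ratio, equave=Fraction(2)):
    """Reduced inversion: the reflection of `ratio` about the square root of the equave."""
    return equave / ratio

def is_superparticular(ratio):
    """True for ratios of the form (n + 1)/n."""
    return ratio.numerator - ratio.denominator == 1

# Split the set at sqrt(2): r lies below it exactly when r * r < 2.
lower = [r for r in ATS_RATIONAL if r * r < 2]
upper = [r for r in ATS_RATIONAL if r * r > 2]

symmetric = sorted(mirror(r) for r in lower) == sorted(upper)
lower_superparticular = all(is_superparticular(r) for r in lower)
```

Both checks hold for this set: every interval below √2 is superparticular, and its reduced inversion is the matching interval above √2.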

The first image corresponds to the analysis of the full archive's interval matrix, showcasing the 17 most popular intervals. The second image depicts the same graphic process, computing the interval matrix and accumulating the repetitions vertically, but on the newly generated tuning system. The general contour of both is similar; this type of tuning analysis typically provides the fingerprint of a tuning. This means the 14-note system generates a fingerprint similar to that of the entire database of more than 2.5 million intervals.

The system does not match any of the existing files.

Analysis using subsets of the archive—half or a third selected randomly—still yielded the same most frequent intervals. However, for a more accurate representation of an average world tuning system, it's essential to curate the data better. This would involve handpicking the most well-known tunings that are or were actually in use, rather than relying on the full Scala archive, which contains numerous modern tunings seldom used.

Composition with the average system

Improvisation with the average system

TODO: Additional statistics:

The roughly 500 most frequent intervals are all just, rational, integer-ratio intervals; only after them do cent-defined intervals, such as the octave written as 1200 cents, appear.

How to:

The program developed for this analysis is open-source and available at [LINK]. It's designed for straightforward usage—simply load any .scl file or files, and it will promptly conduct and showcase statistics on them. The analysis comes in two modes: 'direct' examines files as they are, focusing on the first key, while 'full' generates interval matrices for all files. Notably, the 'full' analysis uses a fixed equave of 2:1, a setting implemented after discovering that 95% of the database concludes with a 2:1 equave. This equave parameter can be adjusted within the code for further exploration and customization.

Xeneize

Thursday, August 1, 2024
