Tonal Constancy and the Perceptual Forging of Pitch
(draft)
This study examines the perception of stable tonal functions in musical contexts where those functions are not reliably present in the acoustic signal. Although musical intervals can be described as physical frequency ratios, their experienced roles depend heavily on melodic context, cultural learning, and cognitive pattern-matching. The same pitch height can be interpreted differently depending on the musical environment, revealing that many aspects of tonal structure are perceptually constructed rather than acoustically fixed.
Tonal Constancy is proposed as a unifying principle to explain this behavior. The theory holds that the brain actively maps incoming sound onto a familiar 12-tone framework, projecting learned tonal relationships, such as tonic-dominant hierarchy or the sense of resolution, onto ambiguous pitch material. Listeners often perceive functional categories that are not acoustically encoded but are inferred from melodic shape, stress patterns, and prior exposure.
This perspective has direct implications for microtonality. It distinguishes between systems that the brain can easily assimilate by fitting them into existing categories and systems that resist assimilation because their structures are incompatible with the internalized 12-tone model (microtonal refinement or optimization versus microtonal structuralism). Counterintuitively, some coarse or evenly spaced tuning systems (macrotonal frameworks) can feel more perceptually “alien” than finely divided microtonal scales, precisely because they provide fewer recognizable tonal anchors.
The chapters that follow define a precise vocabulary for these phenomena, present case studies demonstrating tonal reconstruction in action, and outline a classification system for how different pitch structures interact with the listener’s internalized tonal grid. The work also situates these perceptual mechanisms within broader discussions of acoustics, cognition, and cultural history.
Note on Perceptual Validity: The ultimate evidence for Tonal Constancy is the listener's ear. Therefore, the audio examples provided are not merely aesthetic demonstrations; they are the primary data. Formal statistical validation is outside the scope of this initial exploration.
Part I: The Phenomenon - A New Framework for Perception
1: An Introduction to Tonal Constancy.
2: From 'Xenharmonic' to 'Induodecimable': The Need for a Precise Lexicon.
3: A Case Study in Perceptual Forcing: Tonal Reconstruction in 7-EDO.
Part II: The Mechanism - A Classification of Translatability
4: The Four Layers of Duodecimability.
Perceptual Forcing (e.g., 7-EDO)
Relational Reference (e.g., Maqamat, 19-EDO)
Structural Alienation (e.g., Bohlen-Pierce)
Timbral Dissolution (e.g., Gamelan)
5: The Spectrum of Familiarity: Microtonal 'Flavor' vs. New Functional Categories.
A deeper dive into the distinction between Layers 1/2 and Layer 3.
6: Anchor Density: A Model for Perceptual Alienation.
"Anchor Density" spectrum explains the experiential difference between various induodecimable systems.
Contrasts "High-Density" systems (11/13-EDO: slippery, locally familiar) with "Low-Density" systems (Bohlen-Pierce: truly alien, no footholds).
Part III: The Foundation - The Substrate of Hearing
7: The Indispensable Cycle: Range, Resolution, and Perceptual Distance.
"How real is the octave?"
8: The Blueprint for Pitch: A Collision of Two Logics.
Presents the theory that our pitch system arises from the conflict between two forces:
The Logic of Symmetrical Partition (2^n binary division of the cycle).
The Logic of Acoustic Resonance (The asymmetrical, "timbre-locked" force of the 3:2 fifth).
Part IV: The Context - History, Ontology, and Evidence
9: The Riverbeds of Culture: Re-examining Pythagoras and the Great Convergence.
What did Pythagoras really do?
"Practice Before Theory" (lutes, frets) and the convergence of Chinese and Greek systems.
10: Is Anything Fundamental? On Ontology
11: Exhibit A: The Anti-Randomness Engine.
The brain is a relentless ordering engine, and Tonal Constancy is its primary tool for forging meaning from the chaos of sound.
Part I: The Phenomenon
1: An Introduction to Tonal Constancy
Tonal Constancy refers to the brain’s tendency to impose familiar tonal structures onto acoustically unfamiliar or ambiguous pitch data. It explains why listeners often perceive semitones, tonics, or functional resolutions even when the input contains no reliable cues for those categories. Rather than passively receiving frequencies, the auditory system actively organizes them according to learned schemas, a form of cognitive pattern-imposition similar to pareidolia, but operating on pitch relationships.
The term parallels color constancy in vision, where colors remain identifiable across changes in lighting. In audition, small deviations in interval size or tuning system do not disrupt the perceived identity of a pitch pattern; we still recognize the underlying relational “shape.” Even in more extreme tuning systems, such as those far from 12-tone equal temperament, listeners frequently reconstruct familiar functions through melodic motion and contextual inference. In this sense, pitch categories behave less like fixed points and more like directional cues, vectors rather than precise coordinates.
For such reconstruction to occur, the brain relies on a stable internal framework. This suggests a deeper process: the perceptual categorization of cycles. Just as spatial patterns are grouped into shapes and temporal events into meters, pitch information is grouped into circular structures, with the octave being the dominant organizing cycle. This is supported by the harmonic series and phenomena such as the missing fundamental, which provide early perceptual justification for octave grouping.
Within this framework, three related concepts become important:
-Range: the span of frequencies humans can hear.
-Resolution: the smallest perceivable pitch difference, or just-noticeable difference (JND).
-Cycle: the inferred circular structure the brain uses to map pitch into repeating units.
Range and resolution describe physical limits; cycle reflects a cognitive strategy for generating order within continuous acoustic space.
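The idea of a cycle can be made concrete with a small sketch. It assumes a 440 Hz reference and a 1200-cent octave as the cycle; the function names are illustrative and not part of the study's vocabulary.

```python
import math

def cents_from_reference(freq_hz: float, ref_hz: float = 440.0) -> float:
    """Signed distance from the reference frequency, in cents (1200 per octave)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)

def position_in_cycle(freq_hz: float, ref_hz: float = 440.0,
                      cycle_cents: float = 1200.0) -> float:
    """Fold a frequency onto a repeating cycle (assumed here to be the octave)."""
    return cents_from_reference(freq_hz, ref_hz) % cycle_cents

# Frequencies an octave apart land on the same point of the cycle.
for freq in (220.0, 440.0, 660.0, 880.0):
    print(freq, round(position_in_cycle(freq), 1))
```

Passing a different `cycle_cents` value models a non-octave cycle, which is why the cycle is treated here as a parameter rather than a constant.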
This perspective explains several recurring phenomena:
-Why scales such as 7-EDO can still evoke familiar functions like “semitone” or “dominant” when heard in motion.
-Why randomly generated or uniformly spaced tones often suggest tonal relationships.
-Why even experienced musicians may not immediately detect that they have left the 12-tone system when exposed to systems like 13-EDO, provided timbre and pitch motion remain within certain perceptual expectations.
In short, most pitch systems are predisposed to be interpreted tonally unless they are structured to avoid alignment with learned priors.
Tonal Constancy therefore reflects more than musical habit. It may represent a fundamental feature of auditory cognition. This study aims to formalize the concept, propose a classification system for degrees of translatability into tonal space, and explore its implications for theory, composition, and the psychology of hearing.
2: From 'Xenharmonic' to 'Induodecimable': The Need for a Precise Lexicon.
The term xenharmonic was originally coined to describe musical materials that lie outside the framework of Western 12-tone equal temperament (12-EDO). Implicit in its early usage was a sense of perceptual alienness, sounds that could not be reconciled with conventional tonal expectations, or even approximated within the familiar scalar structures of Western music.
Over time, however, the term’s scope has broadened. Today, xenharmonic may be applied to any music using non-standard tunings, alternate instruments, or unfamiliar timbres. As its use has expanded, its precision has diminished. A piece might now be labeled xenharmonic even if it maps closely onto 12-EDO, or if it retains gestures that remain tonally functional within familiar paradigms. In this diluted form, the term no longer guarantees that the music is truly untranslatable into the 12-tone system.
To address this ambiguity, we use a more semantically precise term: induodecimable—from Latin roots meaning not reducible to twelve. It describes musical structures, scales, or timbres that cannot be effectively translated into 12-EDO without a perceptual or functional loss. Unlike xenharmonic, this term emphasizes irreducibility, not just unfamiliarity. Moreover, its morphology is cross-linguistically stable (e.g., induodecimable reads identically in English and Spanish), and it admits extensions for greater specificity, such as indiatonizable, referring to pitch content incompatible with diatonic function.
It is important to note that this property is not binary. Whether a given musical structure is duodecimable (that is, whether it is approximable by 12-EDO) is ultimately a perceptual judgment. Some cases are obvious: inharmonic spectra perceived as noise, or very specific microtonal systems with step sizes that fall far outside typical pitch categories. But others lie in a gray zone where perceptual context, cultural exposure, and learned listening habits strongly shape what we "hear."
This gray zone is precisely where Tonal Constancy becomes critical. Even when a melodic or harmonic structure defies analytical reduction to diatonic scales, listeners often project familiar tonal frameworks onto it, constructing implied functions, modes, or centers through context and inference. The ability to “make tonal sense” of unfamiliar material is not evidence of universality in the structure itself, but of the perceptual elasticity of the listener.
As this study will show, the diatonic scale functions as a tonal attractor—a kind of perceptual sink into which ambiguous or approximate materials are pulled. The 12-tone system serves as its most stable host, offering a resolution of the scale’s unequal steps into evenly spaced units that align (imperfectly, but reliably) with physical redundancies in the harmonic series.
The question then arises: why twelve? The answer does not lie in timbre, in the Pythagorean algorithm, or in the more compelling harmonic semi-group, which, while seemingly more ontologically robust, does not necessarily relate to or reflect our perception. Why does this specific internal subdivision act as the dominant attractor, rather than systems based on ten, fifteen, or nineteen tones? Why does duodecimability seem to represent a perceptual threshold?
The answer is not solely historical. Nor is it purely acoustic. Instead, we find ourselves in the deeper terrain of categorical emergence: how perceptual systems construct stable reference frames from continuous data. Just as we learn to divide the color spectrum into culturally specific “basic colors,” so too do we divide the pitch continuum into categories that are both learned and constrained, by cognition, biology, and acoustics.
This chapter lays the foundation for a more precise taxonomy of perceptual translatability in music. The aim is not only to explore how Tonal Constancy works, but to examine the deeper question: where do musical categories come from at all? At what point do categories cease to form, or become unstable? And when do they dissolve into pure context-dependence; when Tonal Constancy, in effect, runs out?
3: Tonal Reconstruction in 7-EDO and the Elasticity of Pitch Meaning
The 7-tone equal division of the octave (7-EDO) offers a clear demonstration of tonal constancy. Each step spans ~171 cents, and unlike diatonic 12-EDO, the system contains no internally differentiated intervals, no whole/semitone hierarchy, no embedded modal markers, and no natural centers of gravity. Acoustically, it is a uniformly spaced cycle.
Yet when 7-EDO is used melodically, listeners routinely report hearing tonal centers, modal references, and functional cadences. The system behaves musically intelligibly despite lacking structural cues.
This raises a central question:
How does a scale composed entirely of uniform steps give rise to perceived modes, cadences, and tonal direction?
The answer lies in the organization of the sequence rather than the tuning itself. Trajectory, rhythmic emphasis, contour, and learned tonal archetypes guide the perceptual system toward interpretations that are not encoded in the signal. This is tonal constancy: the imposition of familiar pitch relationships onto acoustically ambiguous material.
The 7-EDO “Chameleon Effect”: Functional Multivalence
One of the clearest signs of tonal constancy in 7-EDO is the instability of pitch identity. A single 7-EDO degree can be re-interpreted as multiple 12-EDO categories depending on context. For example, the two-step interval of roughly 343 cents (the “neutral third”) frequently shifts between major-like and minor-like functions.
This is Functional Multivalence (a one-to-many mapping):
In 12-EDO, a pitch class is relatively stable (“C is C”).
In 7-EDO, the same frequency can behave as a major third, a minor third, or something in between, depending on melodic direction, local emphasis, or implied harmony.
The perceptual system prefers to preserve musical grammar—directionality, cadence, and contour—over preserving literal interval size. In effect, the brain “chooses” the pitch identity that best fits the surrounding syntax, even when that identity is not present in the acoustic input.
The following examples (7-EDO followed by 12-EDO reinterpretation) illustrate how a single interval can support multiple functional readings.
The tune has a simple A-A-B-B structure. In both phrases, the final step of the bass line is identical in 7-EDO: a single 171-cent ascent. Yet in the 12-EDO reinterpretation, this same interval is mapped differently in each phrase:
In one case, the step resolves as a 100-cent semitone, supplying a cadential “leading tone.”
In the other, it expands to a 200-cent whole tone, producing a “major seventh”-like resolution.
Thus the same 7-EDO motion supports two distinct tonal functions, determined not by its size but by its role in the phrase.
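The one-to-many mapping can be sketched numerically. The sketch below lists every 12-EDO interval lying within a tolerance window of a given 7-EDO interval. The 75-cent window is a hypothetical value chosen only to make the ambiguity visible; it is not a measured perceptual boundary, and the function names are illustrative.

```python
def edo_interval(steps: int, divisions: int, period_cents: float = 1200.0) -> float:
    """Size in cents of `steps` degrees of an equal division of the period."""
    return steps * period_cents / divisions

def candidate_categories(cents: float, tolerance: float = 75.0) -> list[tuple[int, float]]:
    """12-EDO intervals (in semitones) within `tolerance` cents of `cents`,
    each paired with the signed deviation of the input from that interval.
    The tolerance is an assumption for illustration, not an empirical value."""
    candidates = []
    for semitones in range(0, 13):
        deviation = cents - 100.0 * semitones
        if abs(deviation) <= tolerance:
            candidates.append((semitones, round(deviation, 1)))
    return candidates

one_step = edo_interval(1, 7)    # about 171.4 cents
two_steps = edo_interval(2, 7)   # about 342.9 cents

print(candidate_categories(one_step))   # [(1, 71.4), (2, -28.6)]
print(candidate_categories(two_steps))  # [(3, 42.9), (4, -57.1)]
```

Under this assumption each 7-EDO interval has two plausible 12-EDO readings, so the choice between them has to come from context rather than from interval size.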
Audio Examples 03/04: Neutral Intervals in Context
Additional examples (sine-wave only) show that the effect persists even without harmonic cues. The neutral third of roughly 343 cents is perceived as “major” or “minor” depending on the melodic frame:
Ascending, it often acquires major-third implications.
Descending, or in a minor-leaning contour, it takes on a minor-third quality.
When rendered in 12-EDO, performers naturally “resolve” these ambiguous steps toward the expected functional pitches to satisfy the implied cadence. This is tonal constancy operating directly on interval interpretation.
Trajectory, Momentum, and Torsor Structure
These examples highlight the role of pitch momentum, the way successive intervals form a directed trajectory through pitch space. Even in a perfectly symmetric tuning, melodic movement generates expectation and prepares closure.
This behavior aligns more closely with a torsor than a vector space: there is no absolute reference point, only relational structure. A pitch derives meaning from: its placement in the trajectory, its rhythmic emphasis, and its relation to culturally learned tonal prototypes.
The ear treats 7-EDO not as a static grid but as a flexible relational field.
When Does Meaning Break Down?
This leads to a central set of questions:
How far can an interval deviate before its expected function collapses?
When does a “minor third” cease to be heard as minor?
At what point does tonal constancy fail to rescue the structure?
These boundaries are not fixed. They shift with experience, familiarity, cultural priors, and attentional state. Much of what we “hear” as categorical pitch identity is constructed, not given.
Later chapters will return to these issues when discussing tonal attractors, learned priors, and the emergence of pitch categories.
Summary
Tonal behavior in 7-EDO shows that pitch meaning is elastic. Even in a scale with no intrinsic hierarchy, listeners reconstruct functional roles through trajectory, rhythm, and expectation. The auditory system is not passively reporting interval sizes; it actively infers tonal structure.
Where the tuning system provides symmetry and ambiguity, perception generates hierarchy and direction. This is tonal constancy: a property not of the tuning but of the listener.
Relationship with Shepard tones:
Shepard tones expose a perceptual symmetry in pitch space. They rely on overlapping spectral components and octave-wrapped circular pitch, which leave the tone's vertical position on the pitch helix ambiguous. This creates bistability: the same stimulus can be interpreted as “ascending” or “descending” depending on which branch of the helix the brain commits to. But this requires a very artificial timbre; it is not “natural” in musical terms.
The 7-EDO functional example is a parallel phenomenon that arises naturally. The difference is this:
The ambiguity is not caused by timbre or chroma wrapping.
It is caused by cadential expectation and functional reinterpretation.
The 171-cent step is too small to be a clear “whole step” and too large to be a clear “semitone.” It is ambiguous in scalar context and can take either a 100-cent role or a 200-cent role after functional remapping. This means the listener's tonal model decides the interval class, not the acoustics. It is a cognitive analog to Shepard’s perceptual ambiguity, but purely musical.
This is consistent with what categorical perception, key-dependent interval class assignment, top-down functional bias, and tonal constancy mechanisms would predict.
The example is a miniature interval-multistability illusion. Here the bistability is between a major-step function and a minor-step function mapped onto the same absolute interval.
The 171-cent example shows that interval identity is not acoustically fixed, that functional context can warp interval categorization, that listeners can be misled naturally rather than through artificial timbres, and that pitch space can behave as a bistable perceptual manifold.
Deutsch's Illusions
Deutsch describes a self-reinforcing perceptual loop, the “bootstrapping operation”:
-bottom-up cues: local intervals, sequential grouping, contour, roughness, spectral features
-top-down cues: stored tonal hierarchies, category expectations, Western pitch-class memories
The system settles into a coherent key + a coherent sequence despite ambiguity in the input.
But this account addresses only ambiguities inside 12-EDO, not the deformation of the system itself.
The hidden assumption behind the entire debate, in Deutsch's work as in Krumhansl's, is that the underlying pitch lattice is stable, fixed, and accurate.
Here we extend this:
What if the entire pitch framework is warped?
How far can you stretch the lattice before the bootstrapping collapses?
How does the brain “repair” a scale that violates its statistical priors?
Yet the bootstrapping mechanism should still operate under distortion.
The Deutsch and Krumhansl accounts quietly imply pitch flexibility, although this is never stated outright.
Their model says that local intervals provide sequential cues, stored hierarchies provide tonal mapping, and the system iterates until a stable interpretation emerges.
This has the same form as a stable fixed point under perturbation. If the fifth is slightly detuned, the local interval cue moves, the hierarchical mapping adapts, and the loop tries to settle into a new stable point.
Neither Deutsch nor Krumhansl explored perturbing the system to see when this perceptual homeostasis breaks, but their mechanism predicts that there must be:
a region of stability (duodecimability)
a region of instability (collapse)
a boundary (“breaking point”)
The unasked question: everything from Krumhansl’s probe-tone curves to Deutsch’s illusions was done on the assumption that the octave = 1200 cents, the fifth ≈ 700 cents, the diatonic steps ≈ 100/200 cents, and pitch classes repeat with perfect diatonic symmetry.
What if we perturb the system?
How flat can a fifth be before tonal hierarchy collapses?
How stretched can an octave be and still be recognized?
How much deviation can a major third tolerate before category flipping?
How stable is the bootstrapping loop under systematic scale deformation?
How does the perceptual system “repair” wrong tunings?
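One crude way to frame these questions is to stretch a 12-EDO lattice uniformly and ask which degree first drifts past the midpoint between two nominal categories. The sketch below assumes a nearest-neighbor rule with a 50-cent boundary as a stand-in for categorization; this is a deliberate simplification, not a perceptual finding, and `first_category_flip` is a hypothetical helper.

```python
from typing import Optional

def first_category_flip(stretch: float, max_degree: int = 48,
                        boundary: float = 50.0) -> Optional[int]:
    """Lowest degree k of a uniformly stretched 12-EDO lattice whose pitch
    (k * 100 * stretch cents) has drifted at least `boundary` cents from its
    nominal value k * 100. Returns None if no degree up to max_degree does.
    The 50-cent boundary is an assumed nearest-category rule."""
    for k in range(1, max_degree + 1):
        drift = k * 100.0 * (stretch - 1.0)
        if abs(drift) >= boundary:
            return k
    return None

# stretch = 1.03 gives a 1236-cent "octave"; the octave itself (degree 12)
# drifts 36 cents and stays inside the boundary, but degree 17 does not.
for stretch in (1.0, 1.03, 1.06):
    print(stretch, first_category_flip(stretch))
```

Even this toy model yields the three regions named above: degrees that stay put, degrees that flip, and a boundary between them that moves with the amount of deformation.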
If tonal perception is a dynamic attractor landscape with deformation tolerance, several concepts follow: cycle elasticity, equivoques, duodecimability layers, structural versus mnemonic constancy, melodic contour versus pitch topology, and multiple tunings mapping to the same cognitive attractor.
The same bootstrapping mechanism might be responsible for:
octave constancy (why a stretched octave still “feels like” an octave)
scale constancy (why warped scales still produce diatonic functions)
melody recognition under pitch drift (the “Happy Birthday” experiment)
equivoques (same intervals → different structures)
inverse equivoques (different interval sets → same functional pattern)
All of this falls out naturally from their bootstrapping framework.
How does the mind find a key when the tuning system itself is moving?
Ξ Example A - 7edo
Ξ Example A - 12edo
Ξ Example B - 7edo
Ξ Example B - 12edo
(Image.1) This geometric visualization compares 7-EDO with the diatonic scale in 12-tone equal temperament on a logarithmic scale. Transposition of the 7-EDO structure yields identical intervallic relationships, whereas transposition of the diatonic scale reveals the seven familiar modes of 12-tone music.
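The contrast in the figure can be checked by rotating each scale's step pattern and counting the distinct results. This is a minimal sketch; step sizes are expressed in units of each system's own smallest step, and the constant names are illustrative.

```python
def distinct_rotations(steps: tuple[int, ...]) -> set[tuple[int, ...]]:
    """All distinct cyclic rotations of a step pattern (one per possible mode)."""
    return {steps[i:] + steps[:i] for i in range(len(steps))}

DIATONIC_12EDO = (2, 2, 1, 2, 2, 2, 1)  # major scale, in semitones
SEVEN_EDO = (1, 1, 1, 1, 1, 1, 1)       # seven equal steps

print(len(distinct_rotations(DIATONIC_12EDO)))  # 7
print(len(distinct_rotations(SEVEN_EDO)))       # 1
```

The diatonic pattern yields seven different rotations, one per mode, while the 7-EDO pattern yields a single one, so any modal distinction heard in 7-EDO cannot come from the step pattern.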
Part II: The Mechanism
4: The Four Layers of Duodecimability
Why Pre-Select Pitches at All?
Any tuning system begins with an act of selection: we carve a finite subset out of a continuous pitch continuum. Whether we choose 12-EDO, a just-intonation lattice, or a non-octave structure, this selection presupposes a grid. And the moment a grid is imposed, pitch becomes symbolic, something to be named, navigated, and reasoned with, rather than merely heard.
This raises the foundational question behind duodecimability:
What does pitch selection reveal about the perceptual forces that shape our sense of musical structure?
The Organology of Resistance: Why Frets and Notes Matter
A persistent assumption in microtonal discourse is that “fretless equals freedom”, that removing the grid grants access to an infinite field of pitch possibilities. Under tonal constancy, the opposite is often true.
The Gravitational Pull of the Fretless
On fretless instruments, intonation becomes a closed loop between the ear and the fingers. Because the auditory system continuously seeks harmonic-series alignment, familiar step sizes, and diatonic attractors (the “anti-randomness engine”), players unconsciously micro-correct toward culturally internalized targets. The result:
Fretless improvisation drifts toward just intonation or 12-EDO approximations.
“Microtonal freedom” often collapses back into familiar centers.
Without structural resistance, the instrument’s acoustics and the player's perceptual habits steer the music toward what the ear already knows.
Frets as Cognitive Prosthetics
Frets, keys, and fixed pitches are not restraints; they are tools of resistance. They freeze the geometry of an alternative system long enough for it to be inhabited on its own terms. By shifting navigation from psychoacoustic alignment to spatial/logical constraints (shapes, cycles, finger patterns), frets temporarily disable the brain’s corrective instinct.
They allow for structural alienation: the ability to function within an unfamiliar tuning without immediately reabsorbing it into 12-tone expectations.
Why This Matters for Duodecimability
Without such scaffolding, many “alien” systems are eroded by tonal constancy before they can be meaningfully explored. Fixed geometry protects them from the perceptual gravity of the listener and the performer.
The Need for a Framework
These observations motivate the central question of this chapter:
How strongly does a tuning system gravitate back toward 12-EDO when filtered through human perception, performance practice, and musical habit?
Rather than treating this gravitational pull as an aesthetic defect or a perceptual failure, we can use it as an analytic tool. The concept of duodecimability provides a structured vocabulary for describing how alternative tunings interact with the 12-tone system, not as universal truth, but as our current cultural baseline.
What Duodecimability Measures
Duodecimability is not an evaluation of musical value. It is a practical measure of translatability: how easily a given system can be mapped, approximated, or “rescued” by 12-EDO expectation.
It identifies four layers, ranging from systems that can be subtly aligned with 12-tone tonality to those that resist assimilation even at their acoustic substrate. This framework allows us to distinguish:
systems that behave like dialects or variations of 12-tone practice,
systems that partially align but diverge in key functions,
systems that require structural scaffolding to maintain their identity, and
systems that collapse entirely when filtered through the perceptual pull of tonal constancy.
This is not an argument about what tuning “should be” or which system is superior. Instead, it provides a practical tool for microtonal composition, an explanatory model for instrument design, and a conceptual bridge between psychoacoustics and musical structure.
It clarifies why some tunings feel intuitively compatible with tonal expectations while others feel like entirely new musical species.
The Four Layers of Duodecimability
A tuning system’s “duodecimability” refers to the degree to which its pitches, functions, or perceptual structures can be interpreted (or misinterpreted) through the lens of the 12-tone system.
Each layer below marks a progressively deeper departure from 12-EDO as both perceptual default and theoretical grammar:
Layer 1: Perceptual Forcing ("Duodecimability by Proximity and Momentum")
Definition: A system that is mathematically unrelated to 12-EDO, but is perceptually coerced into a 12-tone framework by the listener’s tonal expectations.
Mechanism: Tonal constancy combined with melodic trajectory. The listener’s brain fills in “missing” functions based on contour, rhythm, and cultural conditioning.
Example: 7-EDO. Despite its equal-step structure (~171 cents per step), melodies played in 7-EDO can imply tonic-dominant relationships, cadential closure, and even modal coloration. The actual intervals don’t match, but the function does. The translation happens inside the brain, not the score.
Layer 2: Relational Reference ("Duodecimability by Analogy and/or Refinement")
Definition: A system that contains more than 12 pitches, but still describes itself in terms derived from the 12-tone world, thus retaining it as a conceptual anchor, even if not a physical one.
Mechanism: Category refinement. These systems acknowledge and orbit around 12-EDO concepts like major/minor, fifths, thirds, etc., often subdividing them or redefining their boundaries.
Examples: Arabic Maqamat: The maqam system, while utilizing microtonal intervals, maintains its identity through scalar steps and modal gravity. These intervals are often described relative to 12-EDO categories, for example as neutral seconds or sub-minor thirds.
19-EDO, 22-EDO: These temperaments expand the lattice without abandoning diatonic principles. Concepts like “tonics,” “leading tones,” and “dominants” persist, even with altered sizes. These systems enhance rather than alienate. (It's worth noting that these highly divided systems enrich the 12-EDO framework and also accommodate other unfamiliar structures. However, they are designed, both theoretically and practically, as better approximations of just intervals and known categories. They can also feature indiatonizable structures but are not typically used in that way, see next chapter on macrotonality.)
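The claim that these temperaments approximate just intervals can be examined with a short comparison. The sketch below picks the nearest degree of each EDO to three just ratios and reports the signed error; the choice of ratios and the function names are assumptions made for illustration.

```python
import math

# Just-intonation targets used for the comparison (an illustrative selection).
JUST_INTERVALS = {
    "perfect fifth (3:2)": 3 / 2,
    "major third (5:4)": 5 / 4,
    "minor third (6:5)": 6 / 5,
}

def ratio_to_cents(ratio: float) -> float:
    """Convert a frequency ratio to cents."""
    return 1200.0 * math.log2(ratio)

def best_approximation(ratio: float, divisions: int) -> tuple[int, float]:
    """Nearest EDO degree to a just ratio, and its signed error in cents."""
    target = ratio_to_cents(ratio)
    step = 1200.0 / divisions
    degree = round(target / step)
    return degree, degree * step - target

for divisions in (12, 19, 22):
    for name, ratio in JUST_INTERVALS.items():
        degree, error = best_approximation(ratio, divisions)
        print(f"{divisions}-EDO {name}: degree {degree}, error {error:+.2f} cents")
```

In this comparison the gains are uneven: 19-EDO brings the minor third to within a fraction of a cent of 6:5, while its fifth is flatter than the 12-EDO fifth. Refinement of known categories involves trade-offs rather than uniform improvement.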
Layer 3: Structural Alienation ("True Induodecimability with Harmonic Timbre")
Definition: A tuning system whose internal logic (how it generates scales, harmonies, and motion) has no connection to diatonic or 12-tone principles.
Mechanism: Constructed from different mathematical or harmonic seeds. Even when using harmonic instruments (e.g., with overtone series), the structural rules cannot be mapped onto 12-EDO categories.
Examples: The Bohlen-Pierce Scale: Built on the tritave (3:1) instead of the octave (2:1), and divided into 13 steps. Its chords are derived from odd harmonics (3, 5, 7...), and its “fifths” and “thirds” have no analogues in 12-EDO. It creates functional progressions, just not the familiar ones. Its step of about 146 cents is locally similar to the 150-cent step of 8-EDO, but the two diverge over longer spans.
5-EDO and 10-EDO: These systems, while octave-repeating, generate arpeggios with highly ambiguous internal logic. A sequence might include chords that individually fit into 12-EDO but collectively drift too far, requiring multiple conflicting translations. The systems resist stable mapping. (see Note).
Certain Just Intonation Lattices: Especially those incorporating the 11th or 13th harmonics. These systems often sound “smooth” or “consonant” when played with harmonic timbres, but their intervallic logic has no fixed counterpart in 12-EDO. Trying to approximate them with standard pitches is like translating poetry using only rhymes, it misses the meaning entirely.
This is the tipping point: these systems can sound traditionally good, but their logic is alien. You can enjoy them, but you can't name them with your old words.
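The local resemblance between Bohlen-Pierce and 8-EDO, and their long-run mismatch, can be quantified. The sketch assumes the equal-tempered form of Bohlen-Pierce (13 equal divisions of the 3:1 tritave); the names are illustrative.

```python
import math

TRITAVE_CENTS = 1200.0 * math.log2(3.0)  # about 1901.955 cents
BP_STEP = TRITAVE_CENTS / 13             # about 146.304 cents
EDO8_STEP = 1200.0 / 8                   # exactly 150 cents

def drift_after(n_steps: int) -> float:
    """How far n steps of 8-EDO overshoot n steps of Bohlen-Pierce, in cents."""
    return n_steps * (EDO8_STEP - BP_STEP)

for n in (1, 4, 8, 13):
    print(n, round(drift_after(n), 2))
```

A single step differs by under 4 cents, but after the 13 steps that close the tritave the two systems are about 48 cents apart, close to a quarter tone.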
Layer 4: Timbral Dissolution ("Induodecimability by Substrate")
Definition: A musical system whose inharmonic timbres prevent any pitch structure from aligning with 12-EDO attractors, rendering not only tuning, but pitch itself, unstable or secondary.
Mechanism: The overtone series is no longer harmonic. Because the partials are not integer multiples of the fundamental, the usual anchors (octave, fifth, third) do not emerge naturally in perception. The substrate itself erodes tonal identity.
Example: Gamelan music. The metallophones used in Balinese and Javanese ensembles produce inharmonic spectra. Their tuning systems (e.g., slendro, pelog) are not “approximations” of 12-EDO; they are entirely separate epistemologies. A tone's pitch is defined by its timbral fingerprint, not by harmonic ratios. Even the concept of an interval can dissolve into a cloud of color and resonance.
Attempting to analyze this using Western theory is like applying Latin grammar to birdcalls: the medium does not support the metaphor. These musics are not microtonal, they are extratonal.
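To illustrate what a non-harmonic overtone series looks like, the sketch below uses the textbook idealization of a uniform bar free at both ends, whose transverse mode frequencies scale approximately with the squares of 3.0112, 5, 7, and 9. This is an assumption standing in for real instruments: actual gamelan keys and gongs are shaped and tuned by their makers and will differ from these values.

```python
import math

# Approximate mode constants for an ideal free-free bar (textbook idealization,
# assumed here for illustration; not measurements of any gamelan instrument).
MODE_CONSTANTS = (3.0112, 5.0, 7.0, 9.0)

def bar_partial_ratios() -> list[float]:
    """Partial frequencies of the ideal bar as ratios to its lowest mode."""
    base = MODE_CONSTANTS[0] ** 2
    return [c ** 2 / base for c in MODE_CONSTANTS]

def cents_above_fundamental(ratio: float) -> float:
    """Interval between a partial and the fundamental, in cents."""
    return 1200.0 * math.log2(ratio)

harmonic_ratios = [1.0, 2.0, 3.0, 4.0]
for bar, harmonic in zip(bar_partial_ratios(), harmonic_ratios):
    print(f"bar {bar:.3f} ({cents_above_fundamental(bar):.0f} cents)  "
          f"vs harmonic {harmonic:.0f} ({cents_above_fundamental(harmonic):.0f} cents)")
```

In this idealization the second partial sits near 2.76 times the fundamental rather than at 2, so there is no spectral octave for the ear to lock onto.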
Closing Reflection
These four layers don’t represent value judgments. They describe degrees of translatability, not superiority or purity. Each layer tells us more about how listeners (trained and untrained) perceive, categorize, and force-fit sound into symbolic boxes.
They also suggest that duodecimability is not a binary. It is a gradient, and perhaps a contested one: where you place a system may depend not only on its design, but on your listening history, your training, and your linguistic tools.
In the next chapter, we turn from classification to emergence: how tonal categories form in the first place, and what kinds of mental scaffolding make tonal constancy (and its resistance) possible.
The Equivoque Principle: Local Identity, Global Divergence
Induodecimability can be misunderstood as a purely microtonal phenomenon, a matter of "notes between the keys." However, the most profound breaks from the 12-tone system occur not when intervals are unrecognizable, but when familiar intervals build impossible structures. We call these Equivoque Scales.
The Equivoque: A sequence of intervals that appears locally identical to a 12-EDO structure (triggering Tonal Constancy) but which, upon accumulation, arrives at a destination that contradicts 12-tone logic.
Case Study: The 5-EDO Paradox: Consider the 5-tone equal division of the octave (5-EDO). Its single step is 240 cents, so two steps span 480 cents. To a listener conditioned by 12-EDO, this 480-cent interval falls comfortably within the category of a "Perfect Fourth" (500 cents). The 20-cent deviation is perceived merely as a "flat" or "mellow" character, a timbral flavor rather than a categorical change.
However, the grammar of the system depends on what happens when we stack them.
-In 5-EDO: Stacking five of these 480-cent intervals (5 * 480) yields exactly 2400 cents: a perfect double octave. The stack resolves into stability. It is a closed cycle.
-In 12-EDO: Stacking five Perfect Fourths (5 * 500) yields 2500 cents; a double octave plus a semitone. The stack creates tension and displacement.
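The arithmetic of the two stacks can be checked with a short sketch. The function names are illustrative assumptions, and "offset from the nearest whole number of octaves" is only one convenient way of expressing where a stack lands.

```python
OCTAVE = 1200  # cents


def stack(interval_cents, count):
    """Total size in cents of `count` copies of one interval stacked end to end."""
    return interval_cents * count


def offset_from_nearest_octave(total_cents):
    """Signed distance in cents from the nearest whole number of octaves."""
    return total_cents - round(total_cents / OCTAVE) * OCTAVE


# Compare the 480-cent near-fourth with the 500-cent perfect fourth, stack by stack.
for n in range(1, 6):
    near_fourth = stack(480, n)
    perfect_fourth = stack(500, n)
    print(n, near_fourth, perfect_fourth, perfect_fourth - near_fourth)

print(offset_from_nearest_octave(stack(480, 5)))  # closes exactly on the double octave
print(offset_from_nearest_octave(stack(500, 5)))  # lands one semitone above it
```

The third printed column grows by 20 cents per stacked interval: each local step is a small deviation, but the five-step total differs by a full semitone.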
The Failure of Translation
If a musician attempts to "translate" a 5-EDO piece based on local intervals, they will play a stack of fourths. But where the 5-EDO piece resolves to a stable octave, the 12-EDO translation lands on a dissonant minor second. The local translation (Note A -> Note B) was "correct," but the macro-translation (Structure A -> Structure B) collapsed.
This reveals that Tonal Constancy operates on a "horizon of prediction." For short segments, the brain assimilates the 480-cent interval as a fourth. But as the segment lengthens, the accumulated error forces the brain to confront a new geometry. The "Equivoque" is the point where the map (12-EDO) no longer matches the territory: Local Similarity \(\neq\) Global Congruence.
The Geometry of the Fretboard
The difference between "tuning deviation" and "structural alienation" is best visualized on the guitar. In 12-EDO, stacking five Perfect Fourths (500c each) across six strings yields 2500c, overshooting the double octave (2400c) by a semitone. To correct this, standard tuning introduces an asymmetry: the interval between the G and B strings is shortened to a Major Third (400c). The symmetry of the instrument is broken to satisfy the cycle of the octave.
In 10-EDO (or 5-EDO), the structural "fourth" is 480 cents. Stacking five of these intervals yields exactly 2400 cents (480 * 5). On a guitar refretted for 10-EDO, the tuning becomes perfectly symmetrical (4,4,4,4,4 steps) while still locking into the double octave.
This creates an "Equi-Pentatonic" chord on the open strings, a sound that is locally recognizable (stack of near-fourths) but globally "impossible" in 12-EDO logic. It is a system where the geometry of performance becomes fundamentally different.
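The open-string geometry described above can be sketched as a running sum of the intervals between adjacent strings. The variable names are assumptions made for illustration; pitches are given in cents above the lowest string.

```python
from itertools import accumulate


def open_strings(intervals_cents):
    """Pitch of each open string, in cents above the lowest string."""
    return list(accumulate(intervals_cents, initial=0))


standard_12edo = [500, 500, 500, 400, 500]   # E-A-D-G-B-E: one fourth shrunk to a major third
all_fourths_12edo = [500] * 5                # fully symmetric, but misses the double octave
ten_edo_step = 1200 // 10                    # 120 cents per 10-EDO step
symmetric_10edo = [4 * ten_edo_step] * 5     # 4 steps = 480 cents between every pair of strings

for name, tuning in (("standard 12-EDO", standard_12edo),
                     ("all fourths 12-EDO", all_fourths_12edo),
                     ("symmetric 10-EDO", symmetric_10edo)):
    print(name, open_strings(tuning))
```

Only the standard tuning and the 10-EDO tuning end on 2400 cents, and only the 10-EDO tuning does so with five identical intervals.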
Induodecimability is not just about "weird notes"; it is about different geometries of connection.
The Two Families of "Equivoques" (structural vs mnemonic)
There are two routes by which local intervals can add up to a global structure, and they are not the same phenomenon.
1. Structural Equivoques
Equivoque Duality Principle:
For any perceptual mapping that preserves local interval categories while altering global structure, there exists a complementary mapping that preserves global structure while altering local intervals.
(purely perceptual or topological, independent of musical memory)
Equivoques: same small intervals → different global cycle (the 5-EDO near-fourth example).
Inverse Equivoques: different small intervals → same global cycle (the stretched diatonic examples, or local perturbations of the 5-EDO near-fourths).
The stretched-diatonic examples belong to this domain: the octave is rescaled from 1200 cents to 1150 (or 1250).
Every interval is slightly warped. But the tonal grammar (scale degrees, melodic motion, cadential weight) stays intact. The listener perceives “the same melody” even if they’ve never heard it before.
This type relies on flexibility across every interval of the scale.
The brain stabilizes identity based on internal relational geometry, not raw acoustics: it assumes “there is a cycle here” and finds the closest consistent one.
2. Mnemonic
(contour-based identification via stored templates)
This is not the same mechanism. The “Happy Birthday dropping in pitch” experiment belongs to a different category.
The tune is massively distorted: chromas land in the “wrong” positions and the interval sizes are inconsistent, yet it is still perfectly recognizable, because the brain is now matching the melody top-down against a stored memory template. Contour dominates over interval precision.
Up/down motion plus rhythm is enough to trigger recognition: identity-from-template, not identity-from-geometry.
This kind of recognition does not imply a robust internal structure in the new tuning system.
It implies melodic memory, not tuning tolerance. This does not mirror equivoques; it is formally separate.
The Optimization Trap:
It is a common misconception that "more notes" equals "more alien." Systems with high step counts such as 19, 31, 53, or 72-EDO are often grouped with radical microtonality. However, under the lens of Tonal Constancy, these systems often function not as departures from the 12-tone framework, but as Hyper-Diatonic Optimizations.
The Availability of "Better" Notes
In systems like 31-EDO or 72-EDO, the density of pitches is so high that the system acts as a "super-set." Within this vast array, one can easily select a subset that matches the just ratios behind familiar 12-tone categories more precisely than 12-EDO itself does (e.g., finding a nearly "pure" 5:4 Major Third).
The Effect: Instead of forcing the listener to confront new categories (as 8-EDO does), the composer, consciously or not, is tempted to overfit the pitch selection to known templates.
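As a rough illustration of how dense systems offer "better" versions of a familiar category, the sketch below finds the step of several equal divisions that lies closest to a just 5:4 major third. The helper names and the particular list of divisions are assumptions chosen for the example.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(ratio)


def best_step(edo, target_cents):
    """Nearest n-EDO interval to a target: (number of steps, size, signed error)."""
    step = 1200 / edo
    k = round(target_cents / step)
    size = k * step
    return k, size, size - target_cents


major_third = cents(5 / 4)  # roughly 386.3 cents
for edo in (12, 19, 31, 53, 72):
    k, size, err = best_step(edo, major_third)
    print(f"{edo}-EDO: {k} steps = {size:.1f} cents (error {err:+.1f})")
```

In this sketch 12-EDO misses the just third by more than 13 cents, while 31-EDO comes within one cent: the denser grid refines the existing category rather than proposing a new one.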
The JND Threshold
This reaches a critical limit in 72-EDO, where the step size (~16.7 cents) approaches the average Just Noticeable Difference (JND) for pitch in melodic contexts.
At this level of granularity, the step is no longer a structural "brick"; it is a nuance.
Tonal Constancy engages effortlessly here. Because the grid is finer than the brain’s categorical error margin, any pitch can be slid perceptually into a standard 12-tone bin.
Rational Metaphysics
Consequently, much of modern microtonal theory has been directed not toward escaping the diatonic gravity well, but toward deepening it. The focus often shifts to a quest to justify 12-tone musical habits using the "purity" of Just Intonation ratios. This approach seeks to "fix" the commas and beating of Western music, perfecting the very structure it claims to expand.
Conclusion on Density
Therefore, high-density systems are not inherently induodecimable. Unless the composer rigorously avoids the "diatonic attractors" hidden within the swarm of notes, these systems tend to collapse back into Layer 2 (Relational Reference). They sound like "better" versions of the familiar, whereas lower-density, structurally incompatible systems (like 10-EDO) sound fundamentally different because they offer no place to hide.
5: The Spectrum of Familiarity — Microtonal Flavor vs. Functional Break
Music doesn’t become "otherworldly" just because it uses strange intervals. In many cases, the strangeness is ornamental and expressive: a kind of seasoning, a flavor layered atop an underlying structure that is still resolutely tonal.
The difference between microtonal flavor and functional departure is a spectrum, not a binary. But it’s crucial because it defines whether a piece of music is interpretable, translatable, or cognitively disorienting. And that distinction hinges on tonal constancy: whether the listener can still rely on familiar perceptual anchors, tonic, cadence, resolution, even as the tuning system mutates around them.
Historic Flavors: Chopin and Meantone Coloration
A famous example: Chopin referred to D minor as the “saddest” key.
At first glance, this seems metaphysical or poetic. But in fact, there was a physical reason. During his time, many pianos were still tuned in unequal temperaments (meantone-derived and well temperaments), systems that favor certain intervals by keeping them close to simple integer ratios. While 12-tone equal temperament (12-EDO) was theoretically known and even in use, it was still rare for instruments to be tuned to it precisely. Ear-based tuning methods favored rational approximations. Practically, unequal tunings were simply easier to set by ear before electronic tuners.
The result: each key had a unique color, a subtle deviation in interval sizes that made D minor sound distinctly different from, say, B minor. These were microtonal inflections, not fundamental departures. The harmonic framework remained diatonic. What changed was the flavor profile of each key.
Modern Examples of Flavor: Bends, Blues, and Maqamat
Today, the idea persists in many styles:
Blues music bends between notes of the pentatonic and chromatic scales, sliding into pitches that don't "exist" in 12-EDO notation. These expressive bends act as stylistic inflections, not harmonic challenges. The tonic remains the tonic.
Arabic Maqamat and Persian Dastgah systems incorporate quarter-tones and nuanced scalar steps, often creating pitches "between the keys." Yet these systems still rely on cadential logic and tonal gravitation. The microtones serve as ornaments, bridges, flavors. They rarely seek to dissolve the entire structure; they aim to enrich it.
In both cases, duodecimability remains possible, even if imperfect. A skilled listener can still find the center of gravity. These are flavored tonalities, not alternative logics.
Functional Break: When Tonal Constancy Fails
What happens, though, when the system no longer submits to interpretation?
Below is an example from 8-EDO, a tuning system that divides the octave into eight equal steps (150 cents each). It contains two interlocking, maximally symmetric diminished seventh chords (4-EDO cycles), and enough pitch density to form chords and melodies. However, the logic of this system is non-diatonic by design.
Try to map its harmonic progressions to 12-EDO, and tonal constancy breaks. No amount of perceptual coercion or melodic expectation can fully translate its motion. The listener doesn’t "mishear" it as tonal; they simply hear it as strange.
Why? Because 8-EDO sits out of phase with 12-EDO. The two grids coincide only on the shared 4-EDO subset (0, 300, 600, and 900 cents: the diminished seventh chord). Every other 8-EDO pitch lies exactly 50 cents from the nearest 12-EDO pitch, the largest possible mismatch. Outside that subset, their intervals don’t approximate one another; they contradict each other. This is the threshold at which duodecimability fails entirely. Translation is not fuzzy; it is impossible.
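One way to quantify the mismatch between the two grids is to measure how far each 8-EDO pitch lies from the nearest 12-EDO pitch. This is a minimal sketch; the function names are assumptions, and distance to a grid point is only a crude proxy for perceptual category fit.

```python
def edo_pitches(n):
    """Pitches of n-EDO in cents within one octave (the octave itself excluded)."""
    return [k * 1200 / n for k in range(n)]


def distance_to_grid(cents, grid_step=100):
    """Distance in cents from a pitch to the nearest multiple of grid_step."""
    remainder = cents % grid_step
    return min(remainder, grid_step - remainder)


distances = [distance_to_grid(p) for p in edo_pitches(8)]
shared_with_12 = [p for p in edo_pitches(8) if distance_to_grid(p) == 0]

print(distances)       # alternates between 0 and 50 cents
print(shared_with_12)  # the common 4-EDO subset
```

Half of the 8-EDO pitches coincide with 12-EDO and the other half sit at the exact midpoint between two 12-EDO pitches, so there is no "nearest" category to fall into.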
Octave Retention vs. Structural Alienation
Interestingly, 8-EDO still uses the octave as a repeating unit. This gives it a slight advantage in group performance and instrument design: parts can be transposed, ranges can be shared.
Compare that to a scale with a very similar step size (about 146 cents), Bohlen-Pierce (13-ED3), a system that replaces the octave (2:1) with the tritave (3:1). While rich in harmonic possibilities (especially with odd harmonics), it loses the universal reference point that the octave provides. The result: true structural alienation, especially in chordal writing. Melodies still function, but harmonies drift into perceptual limbo. An approximate octave does exist (eight steps, a ratio of about 1.97, or roughly 1170 cents), but it is harmonically incoherent with traditional instruments.
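The Bohlen-Pierce figures can be reproduced from the definition of 13 equal divisions of the 3:1 tritave. The constant and function names below are illustrative assumptions.

```python
import math

TRITAVE_CENTS = 1200 * math.log2(3)   # the 3:1 tritave, about 1902 cents
BP_STEP = TRITAVE_CENTS / 13          # one Bohlen-Pierce step, about 146.3 cents


def bp_interval(steps):
    """Size in cents of a given number of Bohlen-Pierce steps."""
    return steps * BP_STEP


near_octave_cents = bp_interval(8)    # closest BP interval to 1200 cents
near_octave_ratio = 3 ** (8 / 13)     # the same interval as a frequency ratio

print(round(BP_STEP, 1), round(near_octave_cents, 1), round(near_octave_ratio, 3))
```

The closest Bohlen-Pierce interval to the octave falls about 30 cents short of 1200, which is why it does not behave as a reliable 2:1 return point.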
This is why 8-EDO, though less famous, can feel more playable. Its symmetrical design makes it excellent for exploring alien harmonic functions while maintaining just enough structure for ensemble use.
The Takeaway: The Diatonic Ghost is Hard to Kill
Even in highly divided systems like 19, 22, or 31-EDO, often used for their greater consonance or intonation precision, diatonic templates resurface. Musicians use them to better approximate known categories, not to invent new ones. In fact, the higher the division, the more tempting it becomes to overfit microtonal pitch space to traditional harmonic roles.
By contrast, systems like 8-EDO or 10-EDO, low-subdivision tunings that avoid rational alignment with 12-EDO, offer fewer handholds. Their symmetry, spacing, and internal logic prevent easy mapping. They don't flavor tonal music; they replace it.
These systems are functionally distinct, and their progressions defy tonal constancy. This is the boundary line: where the mind stops hearing “altered chords” and starts hearing new grammar.
Closing Note
The difference between flavor and functional break is not merely theoretical. It defines whether music can still operate within a shared perceptual vocabulary, or whether it demands the invention of a new one.
In the chapters ahead, we’ll explore this boundary more formally: how tonal categories form, and what kinds of cognitive attractors allow or prevent the perception of coherence when pitch structures drift too far.
Or put more provocatively: when does a microtone become a mutiny?
Example of Duodecimability Using All 31-EDO Notes
The audio/video example below is a reinterpretation of Bach’s Goldberg Variation No. 1.
This variation famously uses all 12 pitch classes while remaining firmly diatonic; an early peak of Bach’s polyphonic chromaticism.
Here, however, the piece is performed on a 31-EDO sampled clavichord, and the adaptation makes use of nearly all 31 available chromas across the octaves.
The obvious question is: why doesn’t the music collapse?
Subtle Modulations, Stable Structure
In 31-EDO the step size is 38.7 cents, so many intervals sit near multiple possible 12-EDO interpretations. For example:
-the semitone can be realized as 77 or 116 cents
-the tritone can appear as 580 or 620 cents
Because of this, several 12-EDO mappings are always available not only via mathematical proximity, but also via melodic trajectory, voice-leading weight, and contextual expectation. Interval function is more flexible than the grid suggests, as shown earlier in the 7-EDO “functional multivalence” examples.
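The two candidate sizes quoted above for the semitone and the tritone can be recovered by listing every 31-EDO interval that falls near a given 12-EDO target. The 25-cent search window is an assumption made for this sketch, not a perceptual threshold taken from the text.

```python
def candidates(edo, target_cents, window=25.0):
    """Intervals of n-EDO within `window` cents of a target: (steps, size in cents)."""
    step = 1200 / edo
    return [(k, round(k * step, 1))
            for k in range(edo + 1)
            if abs(k * step - target_cents) <= window]


print(candidates(31, 100))  # realizations of the 12-EDO semitone
print(candidates(31, 600))  # realizations of the 12-EDO tritone
```

Each 12-EDO category has two plausible 31-EDO realizations in this sketch, one on either side of the target, which is the ambiguity that context then resolves.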
Thus even though the pitch set is far denser and every chroma is eventually touched, the music remains:
-tonal
-diatonic in function
-fully duodecimable
This is not “microtonal structuralism.” It’s microtonal refinement + auditory illusions, very similar to pitch-drift and Shepard-tone-style ambiguity.
Is 31-EDO Structurally Alien?
Yes and no.
31-EDO has its own harmonic logic and can certainly support non-12-tone structures. But because its pitch density is so high, you can choose to selectively improve consonances, or subtly colorize a key, while still remaining within familiar 12-tone perceptual categories.
You don’t leave the “12-EDO flavor”; you simply refine, bend, and tint it.
This is duodecimability in action: many microtonal deviations still funnel back into 12-tone percepts when context supports them.
Video Description
The video shows two circular pitch-class displays:
A 12-EDO clock, marking the original chroma classes of the Bach score.
When each pitch class is used, it remains highlighted with a different color.
(By halfway through the piece, Bach has used all 12.)
A 31-EDO clock, showing all pitches actually played in the 31-EDO adaptation.
As each chroma is used, it remains marked, showing how the entire 12-tone structure “moves” within a larger 31-tone space.
The result is a visualization of how the whole wheel of 31 notes gets painted over time, while the music itself stays remarkably stable.
6: Anchor Density — Diatonic Memory and the Illusion of Familiarity
Tonal constancy does not act on systems uniformly. Some tunings invite diatonic reinterpretation easily. Others resist it entirely. The key difference is not simply the number of steps per octave or the presence of harmonic intervals, but what we call Anchor Density, the frequency and distribution of elements that are close enough to recognizable tonal functions that the brain tries to interpret them as such. The concept aligns with the Perceptual Magnet Effect described by Patricia Kuhl in speech perception, and later applied to music by Carol Krumhansl.
This is not a binary switch. Induodecimability, as introduced earlier, isn’t “on” or “off.” It’s a gradient of perceptual traction, the ease with which the listener's cognitive machinery can hallucinate tonal relevance from non-diatonic material.
Anchors: The Seeds of Tonal Illusion
We define an anchor as a moment, a pitch, a dyad, a short progression that approximates a recognizable function within the 12-EDO system. These are not structural absolutes; they are perceptual affordances. A chord that sounds like a major triad even if it's technically off is an anchor. A cadence that feels like resolution is an anchor, regardless of its tuning origin.
The brain uses these moments to project a familiar tonal grid over unfamiliar territory. It’s a form of perceptual compression, and it’s why even radically mistuned systems can feel “not quite right” instead of “completely alien.”
Case Study: The Diatonic Categorization Experiment
An example of this phenomenon appears in Jason Yust’s "Diatonic Categorization in the Perception of Melodies." In the study, ~30 participants, primarily musicians or audio professionals, were asked to categorize melodies played in an unusual subscale of 13-EDO: a seven-note scale built on steps [0, 2, 4, 8, 10, 11, 12] of the 13-step grid (listed as the indices of the selected notes).
Despite the scale’s deep structural departure from 12-EDO, participants consistently used diatonic terms to describe what they heard: “major third,” “perfect fourth,” “leading tone,” etc. Not a single subject identified the system as non-12-EDO. The internal tonal schema overrode the signal.
This was not 13-EDO heard as a novel tuning. This was 12-EDO imposed on a foreign substrate: an instance of Categorical Assimilation. When a stimulus is within a certain distance of a category prototype (an anchor), the brain shrinks the perceptual distance, pulling the sound into the category.
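Categorical assimilation can be caricatured as snapping each pitch to its nearest 12-EDO category. The sketch below applies that to the seven listed steps of 13-EDO. Nearest-neighbor snapping is an assumption made for illustration; it is not the method of the study and is far cruder than the context-dependent process described here.

```python
STEP_13 = 1200 / 13                          # one 13-EDO step, about 92.3 cents
SCALE_INDICES = [0, 2, 4, 8, 10, 11, 12]     # selected 13-EDO steps, as listed in the text

NAMES_12 = ["unison", "minor second", "major second", "minor third",
            "major third", "perfect fourth", "tritone", "perfect fifth",
            "minor sixth", "major sixth", "minor seventh", "major seventh"]


def assimilate(cents):
    """Snap a pitch to the nearest 12-EDO category: (category name, deviation in cents)."""
    k = round(cents / 100)
    return NAMES_12[k % 12], cents - 100 * k


for index in SCALE_INDICES:
    pitch = index * STEP_13
    name, deviation = assimilate(pitch)
    print(f"step {index:2d}: {pitch:7.1f} cents -> {name} ({deviation:+.1f})")
```

Every note of the subscale lands within 40 cents of some 12-EDO category in this sketch, which is consistent with listeners reaching for diatonic labels.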
The Anchor Density Spectrum
Let’s break down two critical points on the anchor density gradient.
1. High-Density Anchors: Partial Duodecimability
Systems: 11-EDO, 13-EDO, high-fidelity Just Intonation subsets. For example, 13-EDO roughly approximates a chain of thirteen 11th harmonics folded into the octave (1:2): \(11^{x} \times 2^{y_x} \in (1, 2]\) for \(x \in \{0, \dots, 12\}\), where \(y_x\) is the integer octave shift applied to each term. One link of that chain: \(2^{6/13} \approx 1.377 \approx 11/8 = 1.375\).
Perceptual Experience: Slippery, ambiguous, “almost tonal”
Mechanism: Local phrases strongly resemble 12-EDO intervals, triggering familiar categories
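The 13-EDO example listed under "Systems" can be checked numerically by stacking the just 11:8 interval thirteen times, folding each result into one octave, and comparing it with the same chain built from six 13-EDO steps. This is a sketch with assumed names; it folds into the range from 0 up to (but not including) 1200 cents.

```python
import math

ELEVENTH = 1200 * math.log2(11 / 8)   # just 11:8, about 551.3 cents
STEP_13 = 1200 / 13                   # one 13-EDO step


def folded_chain(generator_cents, length):
    """Stack a generator `length` times, folding every result into one octave."""
    return [(k * generator_cents) % 1200 for k in range(length)]


def edo13_chain(length):
    """The same chain built from 6 steps of 13-EDO (about 553.8 cents)."""
    return [((6 * k) % 13) * STEP_13 for k in range(length)]


just = folded_chain(ELEVENTH, 13)
tempered = edo13_chain(13)
deviations = [t - j for j, t in zip(just, tempered)]

for k, (j, t, d) in enumerate(zip(just, tempered, deviations)):
    print(k, round(j, 1), round(t, 1), round(d, 1))
```

The error grows by roughly 2.5 cents per link and reaches about 30 cents at the end of the chain, so "roughly approximates" is the right strength of claim.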
These systems produce local illusions of tonality. You might hear a chord that “feels” like a major triad, your brain engages tonal constancy, and you momentarily experience a key center. But when the next chord arrives, the illusion fails. The logic collapses. The system can't sustain a consistent diatonic mapping over time.
We call this the Shifting Grid Problem: tonal constancy can win a battle, but it loses the structural war. The mind would have to rebuild the entire tonal scaffold for each new event, a computation it can't maintain over time. (Alternatively, this instability can be exploited to bring the less familiar categories forward.)
This explains why many listeners describe such music as “drifting,” “haunting,” or “unstable.” It’s not unfamiliar in total; it’s familiar in fragments, and that inconsistency is deeply disorienting.
2. Low-Density Anchors: Global Induodecimability
Systems: Bohlen-Pierce, 8-EDO, 10-EDO, some dissonant Just Intonation networks
Perceptual Experience: Profound alienation, unfamiliarity, or unclassifiability
Mechanism: Few (or zero) interval categories resemble 12-EDO constructs
Here, tonal constancy doesn't just fail intermittently, it fails completely. There are no islands of recognition. The harmonic series is differently parsed, the scale is divided in unfamiliar ratios, and resolution itself might not even be meaningful.
These are truly induodecimable systems. Even approximate mappings to 12-EDO don’t make sense. The brain has to make a choice: either accept this new musical logic on its own terms, or reject it as noise.
Implications: When Hallucination Meets Constraint
Anchor density reveals a deep cognitive tradeoff. Tonal constancy can only operate within a limited domain of error. Systems like 19-EDO or 22-EDO can stretch that domain without breaking it; systems like 8-EDO snap it in half.
Flavor becomes function when too many anchors accumulate. Microtonality starts as expression—but once enough structural anchors reappear, the listener’s mind imposes full tonality onto the scale.
This is the difference between Microtonal Refinement (Optimization) and Microtonal Structuralism.
7. The Indispensable Cycle: How Pitch Becomes Place
The Octave as Construct
The octave is traditionally treated as foundational: it arises physically from halving a string, frames the diatonic scale, and corresponds to the 2:1 frequency ratio. Yet its perceptual status is not simply a direct reflection of acoustics. The octave is constructed by the auditory system. A tone and its 2:1 counterpart are not perceived as categorically distinct but as variants of the same pitch identity. This equivalence, intuitive to children and indispensable to musicians, feels obvious because it is statistically reinforced, not because it is ontologically necessary.
The harmonic series provides the regularity: doubling a frequency aligns overtones and adds no new spectral information. The brain learns this redundancy and treats 2:1 transformations as perceptual “returns.” Thus, the octave functions less as a physical law than as a highly efficient cognitive shortcut.
Linear Pitch vs Cyclical Pitch
A purely linear pitch space offers range and resolution but no meaningful structure. To illustrate, imagine two listeners with equal pitch discriminability:
• Listener A perceives a standard 20–20,000 Hz range (~10 octaves).
• Listener B perceives only 10–11 Hz while resolving the same number of discriminable steps.
Despite identical resolution, the topologies they inhabit differ fundamentally. Listener A experiences hierarchical structure, anchored positions, intervals, and return points. Listener B experiences undifferentiated detail. “Octave” is not simply a summation of JNDs but a cognitive cycle that turns a line into a map.
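The contrast between the two listeners can be made concrete by expressing each range in octaves, that is, in how many 2:1 returns it contains. The function name is an assumption made for this sketch.

```python
import math


def span_in_octaves(low_hz, high_hz):
    """Width of a frequency range measured in octaves (2:1 cycles)."""
    return math.log2(high_hz / low_hz)


listener_a = span_in_octaves(20, 20_000)   # just under 10 octaves
listener_b = span_in_octaves(10, 11)       # a small fraction of one octave

print(round(listener_a, 2), math.floor(listener_a))   # complete cycles available to A
print(round(listener_b * 1200), math.floor(listener_b))  # B's whole range in cents, zero cycles
```

Listener B's entire range spans about 165 cents, so no pitch in it ever meets its own octave: however fine the resolution, there is no return point.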
Perceptual identity depends on such cycles. Without a return point, some transformation that “comes back” to itself, pitch remains an infinite continuum without landmarks. Cycles enable categories, memory, direction, and tonal meaning.
Statistical Roots of the Cycle
Octave equivalence persists even with sine tones, indicating that the cycle arises from internalized statistical models, not solely from spectral cues. When the 2:1 ratio is moderately distorted (e.g., compressed to 1150 cents or stretched to 1250 cents), melodies retain their identity. Tonal constancy stabilizes the cycle so long as internal melodic relationships remain coherent. The perceptual loop bends but does not immediately break.
This demonstrates that the brain prefers a closed-loop topology and will warp incoming data to maintain it. These “stretched diatonics” invert the phenomenon of equivoque intervals: instead of similar intervals collapsing onto different structures, distorted intervals preserve a familiar structure.
Toward Perceptual Cyclicity
Perceptual Cyclicity can be defined as:
A property of continuous pitch space in which unidirectional motion leads to repeated encounters with qualitatively similar percepts, producing a sense of recurrence despite physical progression.
This introduces internal geometry: antipodes, midpoints, and stable recursive partitions. Even with degraded range or resolution, the relations persist. This is the basis of tonal constancy.
Breaking the Loop: Timbre
Could alternate cycles be trained, say, based on a 2.5:1 ratio? Theoretically, perhaps; practically, statistical coherence collapses. Missing fundamentals, combination tones, and overtone alignment continually reinforce 2:1 relationships. One can stretch the cycle, but replacing it is difficult without abandoning stable pitch identity altogether. Inharmonic timbres (e.g., gamelan metallophones) illustrate this: when spectra diverge from harmonicity, cycles weaken or dissolve.
Why the Brain Needs Cycles
Cycles enable the brain to compress infinite pitch space into a finite topology. They support prediction, reduce memory load, and permit combinatorial structures: scales, chords, modes, symmetry. Without cycles, musical cognition reduces to raw signal detection.
Elasticity and Limits
Empirical evidence suggests that octave perception is highly elastic. When melodies retain coherent internal logic, even large deviations from 1200 cents remain “in tune.” What matters is not the numeric ratio but the density of internally consistent anchors. When enough local regularities accumulate, the brain infers a global cycle, even if the physical data do not demand one.
The brain prefers a stable loop over an accurate line. It will infer structure rather than accept randomness.
Pitch as a Vector on a Loop
The octave is not a metaphysical constant but a perceptual habit grounded in efficiency and statistical regularity. It turns the continuous dimension of frequency into a cyclical space in which pitch acquires identity and direction. Once the loop is established, the brain treats pitch not as a scalar point but as a location within a closed topology.
Functional tonality therefore is fundamentally probabilistic and topological.
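The idea of pitch as a location on a loop can be sketched by splitting a frequency into a height (how many whole cycles above a reference) and a chroma (position within the cycle). The reference frequency, the `cycle_ratio` parameter, and the function names are assumptions for illustration; the default 2.0 corresponds to the octave, and other values only model a hypothetical alternate cycle.

```python
import math


def pitch_on_loop(freq_hz, ref_hz=261.63, cycle_ratio=2.0):
    """Split a frequency into (height, chroma): whole cycles above the
    reference, and fractional position in [0, 1) within the cycle."""
    position = math.log2(freq_hz / ref_hz) / math.log2(cycle_ratio)
    height = math.floor(position)
    return height, position - height


def chroma_distance(a, b):
    """Shortest distance between two chroma values, measured around the loop."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


_, chroma_low = pitch_on_loop(220.0)
_, chroma_high = pitch_on_loop(440.0)
print(chroma_distance(chroma_low, chroma_high))   # octave pair: same place on the loop

_, chroma_root = pitch_on_loop(261.63)
_, chroma_fifth = pitch_on_loop(261.63 * 1.5)
print(chroma_distance(chroma_root, chroma_fifth))  # fifth: a fixed distance around the loop
```

On the line, 220 Hz and 440 Hz are far apart; on the loop they occupy the same location and differ only in height.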
(Video.01 - Color-coded octave equivalence)
Video.01: Octave equivalence is demonstrated through a common chord progression with a well-known tension-resolution character: \(\text{V}_7 \to \text{I}\). Within a 12-tone equal temperament (12-EDO) framework, with middle C standardized at 261 Hz, the progression \(\text{G}_7 \to \text{C}\) is used. An initial sequence, represented in MIDI format, sounds roughly one note per pitch class in each chord. Subsequent sequences introduce randomized octave doublings of chord members, illustrating the preservation of harmonic function and tonal meaning. Introducing other random intervals in further sequences destroys this harmonic function. While the octave's significance may appear self-evident within certain modern consonance models, and given the perceptual flexibilities already observed, such examples reaffirm its fundamental role. The synthesized sounds in these examples use sine waves, eliminating timbral complexity and ensuring that the observed pitch grouping is independent of partials.
The perceptual flexibility of the octave and its role as a framework for monophonic melodic structure are demonstrated through a series of audio examples. Each example features a 12-EDO diatonic major scale in which every interval is rescaled proportionally. The notes of the scale are presented sequentially, followed by a short melody, to illustrate the preservation of tonal meaning and relative intervallic distances despite the distortion. The rescaling changes each step between adjacent notes by less than 10 cents. Specifically, audio example 1 compresses the octave from 1200 cents to 1150 cents, while audio example 2 stretches it from 1200 cents to 1250 cents.
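The proportional rescaling used in these examples, and the claim that adjacent steps change by less than 10 cents, can be checked with a short sketch. The function names are assumptions; the scale is the 12-EDO major scale in cents.

```python
MAJOR_SCALE = [0, 200, 400, 500, 700, 900, 1100, 1200]  # 12-EDO major scale, cents


def rescale(scale_cents, new_octave, old_octave=1200):
    """Proportionally stretch or compress every pitch of a scale."""
    factor = new_octave / old_octave
    return [c * factor for c in scale_cents]


def step_sizes(scale_cents):
    """Sizes of the steps between adjacent notes."""
    return [b - a for a, b in zip(scale_cents, scale_cents[1:])]


def max_step_error(new_octave):
    """Largest change to any adjacent step caused by the rescaling."""
    original = step_sizes(MAJOR_SCALE)
    warped = step_sizes(rescale(MAJOR_SCALE, new_octave))
    return max(abs(w - o) for w, o in zip(warped, original))


for octave in (1150, 1250):
    print(octave, [round(c, 1) for c in rescale(MAJOR_SCALE, octave)],
          round(max_step_error(octave), 2))
```

The whole tones absorb the largest change, a little over 8 cents each, while the accumulated change at the top of the scale is the full 50 cents: small local errors, one large global displacement.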
This is Categorical Perception again. The brain prefers a "closed loop" topology (a circle) over a line, so it will bend the data to close the circle. (These stretched diatonic scales are the inverse of the equivoques: there, similar intervals stack into a different macrostructure; here, different intervals stack into the same macrostructure.)
(Audio.01) 12-EDO diatonic compressed to 1150 cents.
(Audio.02) 12-EDO diatonic stretched to 1250 cents.
(Audio.03) Auditory stimulus used in pitch distance estimation tests. A sequence of randomly generated pitches with constant, randomized step sizes is presented. Participants estimate the overall interval between the first and last pitches. Step sizes and number of notes are withheld from participants to prevent calculation-based responses.
8: Self-Organizing Criticality in the Auditory Cortex: A Collision of Two Logics
Opening Question: Why So Few Categories?
We live in a universe of almost infinite auditory resolution. The human ear can distinguish pitch changes as small as 5 cents under ideal conditions—meaning that, technically, we could divide the octave into hundreds of perceptible steps. And yet, almost universally, humans gravitate toward small sets of pitch categories: five (pentatonic), seven (diatonic), and twelve (chromatic). Why?
Why not 17? Or 53? Why don’t we hear in hundreds of pitch regions the same way we can recognize thousands of faces? What explains this great compression?
To answer that, we must look at the perceptual and physical forces at play, not just what the ear can hear, but what the brain wants to organize. We’ll see that the categories we use are not merely products of hearing, but of balancing two incompatible but equally foundational logics: one mathematical, the other acoustic.
Symmetry vs. Resonance
Imagine pitch categorization as a negotiation between two deep instincts, two different ways of dividing the world into "sensible parts." These two logics are not optional. They are both embedded in our perceptual architecture and the acoustic environment itself.
1. The Logic of Symmetrical Partition (The 2ⁿ Brain)
The brain has an affinity for symmetry, especially binary symmetry. Our nervous system is organized into mirror-like hemispheres, our motor systems operate in bilateral pairs, and our cognition thrives on hierarchical nesting. When it encounters a continuous space, the simplest organizing move is to subdivide it by halves.
This gives us a tidy and scalable system:
1 division = 1 point (trivial)
2 → 2 regions (perceptual opposites)
4 → quarters
8 → eighths
16 → sixteenths, and so on.
This division strategy is computationally elegant and perceptually robust, up to a point.
This is where semantic distinctiveness decay enters. As the cycle gets subdivided into more and more parts, the distance between each mark on the ruler shrinks. At some point, the perceptual difference between one step and the next becomes so small that the brain stops treating them as different enough to matter. It's not that we can’t hear them; it’s that they don't participate in meaningful contrast. We might call this categorical fatigue (JNM, Just Noticeable Meaning).
2. The Logic of Acoustic Resonance (The Timbre Lock)
But there's a second force, and it doesn't care about symmetry or perceptual consonance. It cares about resonance.
The physical world, particularly the world of vibrating strings, tubes, and vocal cords, presents us with one inescapable musical truth: the 3:2 ratio, the perfect fifth.
The perfect fifth is a rebel. It doesn’t neatly fit into binary division. Try cutting the octave in half, then half again, and you won’t land on the fifth. It’s irrational in log₂-space. But perceptually, it's overwhelmingly powerful. In spectral terms, it's the first prominent harmonic interval after the octave. When you play a note and then its fifth, your auditory system doesn’t hear disjoint points—it hears coherence. Overtones lock. Energy feels shared.
This is what gives us tonal hierarchy: the sense that some notes "belong together" and others don't. It introduces direction, gravity, and asymmetry. In Western music, it’s what makes dominant-tonic cadences feel like return journeys.
This is not a culturally invented behavior. It’s a biological reaction to the physics of sound. Every culture that employs pitched instruments stumbles into this asymmetry—whether it encodes it in twelve tones or not.
The Great Compromise: Why 12?
Twelve is not just a convenient number. It is a solution—a local maximum in the space of possible tonal systems.
Twelve is where these two logics, symmetry and resonance, strike a practical truce:
It accommodates the 3:2 fifth closely enough that stacking twelve of them brings you nearly back to the starting pitch class (twelve fifths ≈ seven octaves, overshooting by the Pythagorean comma of about 23.5 cents).
It allows symmetric subdivisions (12 divides cleanly into 2, 3, 4, and 6).
It supports multiple scale subsets (pentatonic, heptatonic, etc.).
It permits hierarchical relationships—tonic/dominant, major/minor—without overloading the perceptual system.
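The resonance side of this compromise can be put in numbers: how far a chain of pure fifths misses a whole number of octaves, and how well a given equal division approximates a single 3:2 fifth. The function names and the selection of divisions are assumptions made for this sketch.

```python
import math

FIFTH = 1200 * math.log2(3 / 2)   # pure 3:2 fifth, about 702 cents


def pythagorean_comma():
    """Overshoot of twelve pure fifths past seven octaves, in cents."""
    return 12 * FIFTH - 7 * 1200


def fifth_error(edo):
    """Signed error in cents of the n-EDO interval closest to a pure 3:2."""
    step = 1200 / edo
    return round(FIFTH / step) * step - FIFTH


print(round(pythagorean_comma(), 2))
for edo in (5, 7, 8, 10, 12, 19, 31, 53):
    print(f"{edo:2d}-EDO: fifth error {fifth_error(edo):+.2f} cents")
```

In this sketch 12-EDO misses the pure fifth by about 2 cents, whereas 8-EDO misses it by about 48 cents; denser systems such as 53-EDO do better still, but at the cost of far more steps.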
In other words: 12 is not perfect, but it’s good enough in enough directions. It's the first stable “cultural attractor” where perceptual symmetry and physical asymmetry can be mapped onto a finite structure that the brain can learn and use fluently.
This is the Dual-Patch Theory. The auditory cortex likely has two competing maps: one for Periodicity (pitch height/symmetry) and one for Harmonicity (spectral matching). 12-EDO is one of the few systems where these two maps overlap cleanly.
Importantly, this isn’t about consonance per se. Consonance is contextual and plastic. This is about structure and how categories emerge, how the brain carves up a cycle into meaningful segments.
Resolution Isn’t Enough: From JND to Musical Meaning
Even with modern tuning systems (19-, 22-, 31-, or 72-EDO) where we can finely sculpt intervals, not every perceptible difference becomes a category.
This is the core distinction:
JND (Just Noticeable Difference): “Can I tell this is different?”
JNM (Just Noticeable Meaning), or Categorical Distinctiveness: “Does this difference mean something?”
This distinguishes Psychophysics (what the ear can do) from Semantics (what the mind acts upon). This is the missing link in microtonal theory.
Tonal constancy plays a key role here. It resists giving category status to subdivisions that don’t contribute to the known structure. It insists on interpreting ambiguous or fine distinctions as versions or “flavors” of known categories, not entirely new ones. This is why microtonal intervals so often get perceived as “bent” versions of 12-tone intervals, unless the tuning is radically unfamiliar or the anchor density is too low to allow reinterpretation.
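A toy version of the JND/JNM split might look like the following. Everything numeric here is an assumption made for illustration: the 10-cent threshold is a placeholder rather than a measured value, and nearest-neighbour rounding to the 12-tone grid is a crude stand-in for the context-dependent categorization described above.

```python
JND_CENTS = 10.0   # assumed discrimination threshold; real values vary by listener and context
STEP = 100.0       # 12-EDO category spacing in cents

def category(cents):
    """Index of the nearest 12-EDO interval category (0 = unison, 7 = fifth, ...)."""
    return round(cents / STEP) % 12

def compare(a, b):
    """Return (noticeable, meaningful) for two interval sizes in cents."""
    noticeable = abs(a - b) >= JND_CENTS      # JND: can I tell this is different?
    meaningful = category(a) != category(b)   # JNM: does it land in another category?
    return noticeable, meaningful

print(compare(700, 703))  # below the assumed JND
print(compare(700, 720))  # audible, but still a "flavor" of the fifth
print(compare(700, 760))  # audible and categorically different
```

The middle case is the interesting one: a difference the ear can detect that still fails to earn a category of its own.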
So we get plateaus: 5, 7, 12. These are systems where the perceptual return on complexity is high—where each added step adds not just a difference, but a function.
Conclusion: The Brain’s Great Synthesis
Our musical categories arise neither from arbitrary cultural evolution nor from fixed laws of physics. They are emergent solutions, stable balances between two incompatible demands:
The brain’s love of symmetry, compression, and predictability.
The world’s asymmetrical offerings, in the form of acoustic resonances and harmonic structure.
The cycle gives us the container. These two forces tell us how to divide it. And Tonal Constancy is the mechanism that enforces these learned partitions—interpreting incoming sound as near or far, familiar or strange, anchored or drifting.
The categories we use are not infinite, not because we can’t tell the difference—but because only some differences rise to the level of meaning. That is the blueprint of pitch. A perceptual system always balancing what it could sense with what it must make sense of.
Neural Mechanisms and Predictive Models of Tonal Constancy
Neuroimaging and electrophysiological studies reveal that specific regions of the auditory cortex selectively respond to structured sound patterns such as speech, melody, and harmonic sequences, but remain relatively inactive during exposure to unstructured noise. Notably, areas such as the planum temporale, located posterior to the primary auditory cortex, appear to engage dynamically when pitch structures exhibit internal regularities, even when those regularities are statistically subtle or culturally learned.
These regions are not merely passively decoding incoming sound—they participate in an active predictive process. The brain constructs internal models of melodic or harmonic progression, and generates expectations for future events. When a pitch contour unfolds predictably, it minimizes error between the expected and actual input, triggering dopaminergic reward responses in associated circuits such as the nucleus accumbens. These reward-linked responses, observed even in anticipation of musical climaxes, suggest that successful pattern prediction is inherently satisfying, reinforcing the learned tonal templates over time.
Such findings align well with a Bayesian perspective: the brain updates internal priors based on the statistical structure of the sound environment, forming what we might call tonal basins—perceptual attractors that stabilize around culturally salient pitch configurations. These basins guide pitch interpretation even when the physical signal is ambiguous, distorted, or derived from non-standard tunings (e.g., 7-EDO or inharmonic timbres). The result is tonal constancy, rooted in statistical expectation, contextual prediction, and hierarchical sensory processing.
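The Bayesian reading can be made concrete with a small sketch. The model below is an assumption, not something drawn from the studies mentioned: a Gaussian likelihood around each 12-tone category, a hypothetical prior that favours major-scale degrees, and an arbitrary spread `sigma`. None of the numbers are measured data.

```python
import math

def posterior(observed_cents, priors, sigma=25.0):
    """Posterior over 12-EDO pitch classes for one observed pitch (cents above tonic)."""
    weights = []
    for k, prior in enumerate(priors):
        # circular distance to the category centre, in cents
        d = (observed_cents - 100 * k + 600) % 1200 - 600
        likelihood = math.exp(-0.5 * (d / sigma) ** 2)
        weights.append(prior * likelihood)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical prior favouring major-scale degrees (assumed weights, not data)
raw = [6, 1, 3, 1, 4, 4, 1, 5, 1, 3, 1, 3]
priors = [r / sum(raw) for r in raw]

post = posterior(686, priors)   # the 7-EDO "fifth", about 686 cents
best = max(range(12), key=lambda k: post[k])
print("686 cents heard as pitch class", best)

tie = posterior(650, priors)    # exactly halfway between tritone and fifth
print("650 cents: tritone", round(tie[6], 3), "fifth", round(tie[7], 3))
```

In the second case the acoustic evidence is perfectly balanced, so the prior alone decides the outcome. That is the "basin" behaviour in miniature: the more strongly learned category captures the ambiguous input.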
Given these parallels, we may consider whether concepts from statistical mechanics or nonlinear dynamical systems could be adapted to describe perceptual behavior in pitch space. Several potential analogies emerge:
Tonal template / basin || Potential well in an energy landscape
Contextual pitch trajectory || Particle path with momentum
Dopaminergic reinforcement || Entropy minimization with energy input
Such analogies are speculative, but not without foundation. Perception is inherently path-dependent, sensitive to both immediate context and long-term learning. The momentum of pitch trajectories, as previously discussed, may correspond to cumulative Bayesian updating or even to forms of inertial processing, where prior motion in tonal space biases future interpretations.
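To show what the first two analogies could mean in practice, here is a deliberately simple toy model. It is a sketch under invented assumptions, not a claim about neural dynamics: the Gaussian wells at 12-EDO positions, the basin `WIDTH`, and the `friction` and `gain` constants are all arbitrary choices.

```python
import math

CENTRES = [100 * k for k in range(13)]  # 12-EDO wells from 0 to 1200 cents
WIDTH = 30.0                            # assumed basin width in cents

def force(x):
    """Negative gradient of E(x) = -sum(exp(-(x - c)^2 / (2 * WIDTH^2)))."""
    return sum(-(x - c) / WIDTH**2 * math.exp(-(x - c) ** 2 / (2 * WIDTH**2))
               for c in CENTRES)

def settle(x, velocity=0.0, friction=0.2, gain=200.0, steps=500):
    """Damped motion of a 'percept' starting at x cents with an initial velocity."""
    for _ in range(steps):
        velocity = (1 - friction) * velocity + gain * force(x)
        x += velocity
    return x

print(round(settle(686)))               # a flat fifth relaxes into the nearest well
print(round(settle(740)))               # at rest, 740 cents falls back toward 700
print(round(settle(740, velocity=20)))  # with upward momentum it can cross into the next well
```

The last call is the path-dependence point: the same starting pitch ends up in a different basin depending on the motion that led into it.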
Tonal constancy may not yet be a widely codified term in music cognition, but the phenomena it describes lie at the heart of musical experience. From the brain's predictive machinery to the statistical learning of pitch spaces, it offers a compelling bridge between acoustic structure, cultural form, and neural function. As we continue to unravel the relationship between sound, meaning, and expectation, tonal constancy may prove to be not just a perceptual curiosity, but a fundamental principle in how humans find coherence in the musical world.
draft
Exhibit A: The Anti-Randomness Engine
These musical examples belong to a larger work in which I explored the role of randomness in pitch selection—beginning with the question of what randomness even means in a musical context. For the purposes of that study, I treated randomness as non-intentional design.
The research (see [link]) investigates in depth why apparently random pitch sets remain musically functional, often without producing the sense of “oddity” one might expect. While this section does not include all of those examples—some pitch systems are far less straightforward to analyze than the 7-, 8-, or 10-EDO cases presented here—the broader study incorporates sets derived from planetary data, mathematical functions, noise distributions, and other sources.
What unites these varied systems is that, despite escaping the 12-tone grid in every permutation and defying conventional tuning logic, they still exhibit traditional musical utility. In practice, many of them do not even sound “microtonal.” Instead, musical intent, cognitive expectation, and perceptual organization stabilize them, allowing listeners to hear them as familiar or “normal.”
The key finding is simple: subdivisions that cover the perceptual cycle roughly uniformly, even with clustering or irregular spacing, still guarantee tonal functions.
This “anti-randomness engine” illustrates how music perception is less about strict mathematical grids and more about how the mind organizes trajectories into meaningful tonal categories.
Conclusion: The Architecture of Alienation
This study began by asking if randomness destroys musical coherence. The evidence suggests the opposite: randomness often ensures it.
We find that Randomness and High Density function as a "perceptual mirror." The sheer statistical spread of random systems provides enough "anchors" (familiar intervals approximating 3:2 or 2:1) that the brain's error-correction mechanism (Tonal Constancy) can effectively project a 12-tone grid onto the chaos. The listener does not hear the randomness; they hear the subset of it that resembles what they already know.
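The statistical side of this argument can be sketched with a quick simulation. The assumptions are mine: pitches are drawn uniformly at random within one octave, an "anchor" is any pair of pitches within 20 cents of a 3:2 fifth (or its inversion, the fourth), and the names `has_anchor` and `anchor_rate` are invented for the example.

```python
import random

def has_anchor(pitches, target=702.0, tol=20.0):
    """True if any ordered pair of pitches (octave-reduced) is within tol cents of target."""
    for a in pitches:
        for b in pitches:
            if abs((b - a) % 1200 - target) <= tol:
                return True
    return False

def anchor_rate(n_pitches, trials=2000, seed=0):
    """Fraction of random n-note sets that contain at least one near-fifth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        scale = [rng.uniform(0, 1200) for _ in range(n_pitches)]
        hits += has_anchor(scale)
    return hits / trials

for n in (3, 5, 7, 12):
    print(n, anchor_rate(n))
```

Under these assumptions the chance of finding at least one near-fifth climbs quickly with the number of notes, which is the sense in which density alone supplies anchors. It says nothing about whether listeners actually use them; that part remains the perceptual claim.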
The creation of genuine "new notes" does not arise from chaos, but from Specific, Alien Structure. It emerges in systems like 5-EDO, 8-EDO, or non-octave scales (Bohlen-Pierce), where the internal geometry is so rigid and "induodecimable" that the brain is forced to abandon its diatonic priors.
In the end, to escape the gravity of the 12-tone system (learned or not), one cannot simply roll the dice. One must build a new geometry that is robust enough to stand on its own.
Start with 12edo music: how far can you stretch the interval sizes without distorting their meaning? Where do they switch categories?
It seems that if each pitch in a 12edo set is randomly varied within ±20 cents, the structure remains perceptually stable: we still hear it as “12edo.” These are duodecimable tunings: systems that, despite numerical deviations, are still interpreted as 12-tone.
This resilience might be due to biological factors (like pitch discrimination thresholds and auditory memory) and/or cultural familiarity. But even that’s not the whole story.
When you introduce pitch trajectory (motion, gesture, melodic phrasing), the categories become even more flexible. A note that might have sounded “too far” statically can function perfectly well within a musical phrase, even shifting its perceived identity based on context.
So, duodecimability is not just a mathematical condition. It’s contextual and perceptual. The brain doesn’t simply match frequencies; it interprets relationships, motion, expectation, and structure.
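The static half of this idea, the ±20 cent jitter, can be written down as a simple test. The sketch covers only the mathematical condition and ignores context entirely, so it is stricter than the perceptual notion argued for here. The function names and the 20-cent tolerance are assumptions for illustration.

```python
import random

def jitter_12edo(max_dev=20.0, seed=None):
    """12-EDO pitch classes, each shifted by a uniform random offset in cents."""
    rng = random.Random(seed)
    return [100 * k + rng.uniform(-max_dev, max_dev) for k in range(12)]

def is_duodecimable(pitches, tol=20.0):
    """Static test: every pitch sits within tol cents of its own distinct 12-EDO slot."""
    slots = [round(p / 100) for p in pitches]
    close_enough = all(abs(p - 100 * s) <= tol for p, s in zip(pitches, slots))
    distinct = len({s % 12 for s in slots}) == len(pitches)
    return close_enough and distinct

print(is_duodecimable(jitter_12edo(max_dev=19, seed=1)))   # jittered 12-EDO
print(is_duodecimable([1200 * k / 7 for k in range(7)]))   # 7-EDO
print(is_duodecimable([1200 * k / 8 for k in range(8)]))   # 8-EDO
```

By this static measure 7-EDO already fails, since several of its steps sit 29 to 43 cents from the grid. A melodic phrase may still pull those steps into 12-tone categories, which is why the static condition alone is not the whole story.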
This has real consequences for microtonal music and composition. Flavor is one thing, the unique color of a tuning, but structural category is another. Tonal constancy means that pitches don’t just sound similar; they mean similarly, and meaning is shaped by use.
In this light, new tonal spaces aren't just a matter of dividing the octave differently. They demand both careful pitch selection and deliberate usage, exploiting or bypassing tonal constancy depending on the desired perceptual effect. Composition becomes an active negotiation between new materials and old cognitive habits.
Meaning-to-Noise Ratio (MNR)
MNR describes how much of the perceived structure carries interpretable, repeatable form versus how much is background variation, or "interpretive entropy".
Lower MNR → ambience, texture
Higher MNR → form, hierarchy, prediction
Rough examples:
±5 cents noise → likely still high MNR
±25 cents → MNR drops, unless trajectory/symmetry/surface redeems it
Non-periodic microtonal sets → potentially high MNR if duodecimable, low if not
References / Further Reading:
Albert Bregman - Auditory Scene Analysis (1990)
Diana Deutsch - research on auditory illusions (1999)
David Temperley - The Cognition of Basic Musical Structures
Maurice Merleau-Ponty - Phenomenology of Perception
Don Ihde - Listening and Voice: Phenomenologies of Sound
Easley Blackwood - research on EDOs
Sethares - Tuning, Timbre, Spectrum, Scale
Burns & Ward (1978) - Categorical Perception of Musical Intervals
Plack et al. (2005) - Pitch: Neural Coding and Perception
Zatorre & Halpern (2005) - Brain regions for pitch category memory
Tillmann et al. (2000) - Implicit Learning of Tonal Structure
Lerdahl (2001) - Tonal Pitch Space