Sunday, July 14, 2024

Tonal Constancy

Tonal Constancy and the Perceptual Forging of Pitch 
(draft)


Human pitch perception is not a passive registration of acoustic frequencies, but an active process of categorical organization. When we listen to melodies or harmonies, the auditory system continuously groups pitches into learned, stable classes: intervals, scale degrees, tonal functions. This process, which I call tonal constancy, is the brain’s mechanism for maintaining coherent pitch categories even when the acoustic input is imprecise, unstable, or atypical.

Tonal constancy combines several interacting principles:

-learned templates (diatonic hierarchy, common-practice cadences, chord prototypes)
-psychoacoustic cues (spectral alignments, sensory dissonance minima)
-Bayesian priors (expectation of tonal centers, step/skip distinctions)
-language-like magnet effects (Kuhl, Krumhansl)

Together these allow the brain to stabilize musical pitch spaces, keeping notes “in tune” conceptually even when they are not acoustically aligned. It is categorical perception with a predictive correction mechanism.

This stabilization has an interesting consequence: many pitch systems far beyond 12-tone equal temperament are still heard as if they belonged to it. I refer to this phenomenon as duodecimability: the degree to which arbitrary pitch material is assimilated into 12-tone categories. In the simplest formalization, duodecimability can be imagined as a categorical tolerance window: deviations of, say, ±15 or ±30 cents still fall into a 12-EDO class. If pitch events fall within the categorical boundaries of 12-EDO, we say they are duodecimable. But in practice, the tolerance is not fixed. It expands and contracts dynamically in response to musical motion, tonal context, and the listener’s expectations. Every musical event:

-strengthens a hypothesized tonal grammar
-increases tolerance for deviations
-increases willingness to “repair” the contour
-reinterprets incoming intervals based on context
-updates prediction biases.

This means that duodecimability is behavioral, not merely acoustic. Once a tonal interpretation gathers inertia, the listener becomes increasingly tolerant of deviations. This explains why 12-EDO can absorb large perturbations (±40c, sometimes more), and why the same interval can be categorized differently depending on how it is approached, how it is resolved, or which tonal hierarchy is currently active. This functional multivalence explains why certain intervals in scales such as 7-EDO can be reinterpreted in multiple ways.
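
The fixed-window formalization mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `deviation_from_12edo` and `is_duodecimable` and the `tolerance` parameter are hypothetical, and the window is held constant here, whereas the argument of this text is that the real tolerance moves with context.

```python
def deviation_from_12edo(cents: float) -> float:
    """Signed distance, in cents, from the nearest 12-EDO pitch (a multiple of 100)."""
    nearest = round(cents / 100.0) * 100.0
    return cents - nearest


def is_duodecimable(cents: float, tolerance: float = 30.0) -> bool:
    """Fixed-window test: True if the pitch lies within +/- tolerance of a 12-EDO pitch.

    The default of 30 cents is an assumption borrowed from the window sizes
    mentioned in the text; it is not a measured threshold.
    """
    return abs(deviation_from_12edo(cents)) <= tolerance


step_7edo = 1200.0 / 7.0  # one 7-EDO step, about 171.43 cents
print(round(deviation_from_12edo(step_7edo), 2))   # distance to the nearest 12-EDO pitch (200)
print(is_duodecimable(step_7edo, tolerance=30.0))  # wide window
print(is_duodecimable(step_7edo, tolerance=15.0))  # narrow window
```

Under these assumptions a single 7-EDO step sits about 28.6 cents below 200 cents, so it passes a ±30-cent window and fails a ±15-cent one. A context-dependent model would replace the constant `tolerance` with a value that is updated after every musical event.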

Evidence for duodecimability appears in controlled experiments with random tunings: even when pitches are drawn from a uniform distribution across the audible range, diatonic structure, tonal centers, modality, and major-minor implications still reappear, although none of them is present in the data as fixed values. Similarly, empirical cases such as pitch drift and modulation experiments demonstrate the brain’s readiness to preserve the internal contour of melodies by “rotating the helix” of 12 tones, rather than abandoning familiar pitch categories. Multivalence is not “in the interval”; it’s in the interaction between interval and prediction.

This reveals a deeper asymmetry: it is generally easier for listeners to reinterpret pitch height than to abandon pitch category. The system first tries to preserve the tonal scaffold, only relinquishing it when the input becomes unequivocally incompatible.

Only a few pitch systems escape this gravitational pull. Tunings such as 5-EDO, the Bohlen-Pierce scale, or certain just-intonation domains trigger an early collapse of duodecimability. The brain cannot map them to the 12-tone prior without contradiction, and so tonal constancy reboots, constructing a new internal grammar shaped by the system’s own logic.

What follows from this is a general cognitive picture:
Tonality is not a fixed property of a tuning system but a strategy the brain applies to maintain coherence.
Duodecimability is the measure of how hard the brain tries to impose that strategy before giving up.

In this work, I aim to separate this perceptual mechanism from traditional music-theoretical concepts (modality, tonal hierarchy, functional harmony). By doing so, we can understand why some microtonal systems feel familiar, and others generate their own internal stability independent of the 12-tone world.

---

This study examines the perception of stable tonal functions in musical contexts where those functions are not reliably present in the acoustic signal. Although musical intervals can be described as physical frequency ratios, their experienced roles depend heavily on melodic context, cultural learning, and cognitive pattern-matching. The same pitch height can be interpreted differently depending on the musical environment, revealing that many aspects of tonal structure are perceptually constructed rather than acoustically fixed.

Tonal Constancy is proposed as a unifying principle to explain this behavior. The theory holds that the brain actively maps incoming sound onto a familiar framework, most likely a 12-tone one, projecting learned tonal relationships, such as tonic-dominant hierarchy or the sense of resolution, onto ambiguous pitch material. Music reveals functional categories that are not acoustically encoded but are inferred from melodic shape, stress patterns, and prior exposure.

This perspective has direct implications for microtonality. It distinguishes between systems that the brain can easily assimilate by fitting them into existing categories, and systems that resist assimilation because their structures are incompatible with the internalized 12-tone model. Counterintuitively, some coarse or evenly spaced tuning systems (macrotonal frameworks) can feel more perceptually “alien” than finely divided microtonal scales precisely because they provide fewer recognizable tonal anchors. (Microtonal Refinement/Optimization vs Microtonal Structuralism)

The chapters that follow define a precise vocabulary for these phenomena, present case studies demonstrating tonal reconstruction in action, and outline a classification system for how different pitch structures interact with the listener’s internalized tonal grid. The work also situates these perceptual mechanisms within broader discussions of acoustics, cognition, and cultural history.

Note on Perceptual Validity: The ultimate evidence for Tonal Constancy is the listener's ear. Therefore, the audio examples provided are not merely aesthetic demonstrations; they are the primary data. Formal statistical validation is outside the scope of this initial exploration.


Part I: The Phenomenon 
    1: An Introduction to Tonal Constancy.
    2: From 'Xenharmonic' to 'Induodecimable': The Need for a Precise Lexicon.
    3: A Case Study in Perceptual Forcing: Tonal Reconstruction in 7-EDO.

Part II: The Mechanism
    4: Layers of Duodecimability.
            Perceptual Forcing (e.g., 7-EDO)
            Relational Reference (e.g., Maqamat, 19-EDO)
            Structural Alienation (e.g., Bohlen-Pierce)
            Timbral Dissolution (e.g., Gamelan)
    5: The Spectrum of Familiarity: Microtonal 'Flavor' vs. New Functional Categories.
    6: Anchor Density: A Model for Perceptual Alienation.
            "Anchor Density" spectrum explains the experiential difference between various induodecimable systems.
            Contrasts "High-Density" systems with "Low-Density" systems.

Part III: The Foundation
    7: The Indispensable Cycle: Range, Resolution, and Perceptual Distance.
            "How real is the octave?"
    8: The Blueprint for Pitch: A Collision of Two Logics.
            Presents the theory that our pitch system arises from the conflict between two forces:
                The Logic of Symmetrical Partition (2^n binary division of the cycle).
                The Logic of Acoustic Resonance (The asymmetrical, "timbre-locked" force of the 3:2 fifth).

Part IV: The Context
    9: The Riverbeds of Culture: Re-examining Pythagoras and the Great Convergence.
            What did Pythagoras really do?
            "Practice Before Theory" (lutes, frets) and the convergence of Chinese and Greek systems.
    10: Is Anything Fundamental? On Ontology
    11: Exhibit A: The Anti-Randomness Engine.
            The brain is a relentless ordering engine, and Tonal Constancy is its primary tool for forging meaning from the chaos of sound.




Part I: The Phenomenon


1: An Introduction to Tonal Constancy


Tonal Constancy refers to the brain’s tendency to impose familiar tonal structures onto acoustically unfamiliar or ambiguous pitch data. It explains why listeners often perceive semitones, tonics, or functional resolutions even when the input contains no reliable cues for those categories. Rather than passively receiving frequencies, the auditory system actively organizes them according to learned schemas, a form of cognitive pattern-imposition similar to pareidolia, but operating on pitch relationships.

The term parallels color constancy in vision, where colors remain identifiable across changes in lighting. In audition, small deviations in interval size or tuning system do not disrupt the perceived identity of a pitch pattern; we still recognize the underlying relational “shape.” Even in more extreme tuning systems, such as those far from 12-tone equal temperament, listeners frequently reconstruct familiar functions through melodic motion and contextual inference. In this sense, pitch categories behave less like fixed points and more like directional cues, vectors rather than precise coordinates.

For such reconstruction to occur, the brain relies on a stable internal framework. This suggests a deeper process: the perceptual categorization of cycles. Just as spatial patterns are grouped into shapes and temporal events into meters, pitch information is grouped into circular structures, with the octave being the dominant organizing cycle. This is supported by the harmonic series and phenomena such as the missing fundamental, which provide early perceptual justification for octave grouping.

Within this framework, three related concepts become important:

-Range: the span of frequencies humans can hear.
-Resolution: the smallest perceivable pitch difference (JND).
-Cycle: the inferred circular structure the brain uses to map pitch into repeating units.

Range and resolution describe physical limits; cycle reflects a cognitive strategy for generating order within continuous acoustic space.
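
The idea of a cycle can be made concrete with a small sketch that folds any frequency into a position inside one repeating unit. The function names and the 440 Hz reference are assumptions chosen for illustration; the cycle length defaults to the 1200-cent octave but is left as a parameter, since the text treats the cycle as something inferred rather than given.

```python
import math


def cents_from_reference(freq_hz: float, ref_hz: float = 440.0) -> float:
    """Distance from the reference frequency in cents (1200 cents per 2:1 ratio)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)


def cycle_position(freq_hz: float, ref_hz: float = 440.0,
                   cycle_cents: float = 1200.0) -> float:
    """Position of a frequency inside the repeating cycle, in cents from its start."""
    return cents_from_reference(freq_hz, ref_hz) % cycle_cents


# Frequencies an octave apart land on the same position in the cycle.
for freq in (220.0, 440.0, 660.0, 880.0):
    print(freq, round(cycle_position(freq), 1))
```

Passing a different `cycle_cents`, for example `1200.0 * math.log2(3)` for a 3:1 cycle such as the one used by the Bohlen-Pierce scale, shows how the same pitches regroup when the organizing cycle changes.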

This perspective explains several recurring phenomena:

-Why scales such as 7-EDO can still evoke familiar functions like “semitone” or “dominant” when heard in motion.
-Why randomly generated or uniformly spaced tones often suggest tonal relationships.
-Why even experienced musicians may not immediately detect that they have left the 12-tone system when exposed to systems like 13-EDO, provided timbre and pitch motion remain within certain perceptual expectations.

In short, most pitch systems are predisposed to be interpreted tonally unless they are structured to avoid alignment with learned priors.

Tonal Constancy therefore reflects more than musical habit. It may represent a fundamental feature of auditory cognition. This study aims to formalize the concept, propose a classification system for degrees of translatability into tonal space, and explore its implications for theory, composition, and the psychology of hearing.


2: From 'Xenharmonic' to 'Induodecimable': The Need for a Precise Lexicon.


The term xenharmonic was originally coined to describe musical materials that lie outside the framework of Western 12-tone equal temperament (12-EDO). Implicit in its early usage was a sense of perceptual alienness, sounds that could not be reconciled with conventional tonal expectations, or even approximated within the familiar scalar structures of Western music.

Over time, however, the term’s scope has broadened. Today, xenharmonic may be applied to any music using non-standard tunings, alternate instruments, or unfamiliar timbres. As its use has expanded, its precision has diminished. A piece might now be labeled xenharmonic even if it maps closely onto 12-EDO, or if it retains gestures that remain tonally functional within familiar paradigms. In this diluted form, the term no longer guarantees that the music is truly untranslatable into the 12-tone system.

To address this ambiguity, we use a more semantically precise term: induodecimable, from Latin roots meaning not reducible to twelve. It describes musical structures, scales, or timbres that cannot be effectively translated into 12-EDO without a perceptual or functional loss. Unlike xenharmonic, this term emphasizes irreducibility, not just unfamiliarity. Moreover, its morphology is cross-linguistically stable (e.g., induodecimable reads identically in English and Spanish), and it admits extensions for greater specificity, such as indiatonizable, referring to pitch content incompatible with diatonic function.

It is important to note that this property is not binary. Whether a given musical structure is duodecimable (that is, whether it is approximable by 12-EDO) is ultimately a perceptual judgment. Some cases are obvious: inharmonic spectra perceived as noise, or very specific microtonal systems with step sizes that fall far outside typical pitch categories. But others lie in a gray zone where perceptual context, cultural exposure, and learned listening habits strongly shape what we "hear."

This gray zone is precisely where Tonal Constancy becomes critical. Even when a melodic or harmonic structure defies analytical reduction to diatonic scales, listeners often project familiar tonal frameworks onto it, constructing implied functions, modes, or centers through context and inference. The ability to “make tonal sense” of unfamiliar material is not evidence of universality in the structure itself, but of the perceptual elasticity of the listener.

As this study will show, the diatonic scale functions as a tonal attractor, a kind of perceptual sink into which ambiguous or approximate materials are pulled. The 12-tone system serves as its most stable host, offering a resolution of the scale’s unequal steps into evenly spaced units that align (imperfectly, but reliably) with physical redundancies in the harmonic series.


  • Duodecimability: A musical object (interval, scale, passage, piece, or tuning behavior) is duodecimable if there exists a translation into 12-tone equal temperament that preserves the same functional structure for a human listener.
    Note-by-note mapping may or may not be close in cents. What must survive is: direction, relative attraction, cadence identity, interval categories, and modulatory logic.
    When this fails, when a mapping is possible but the resulting structure no longer behaves recognizably, then the original is induodecimable.
    Duodecimability is a structural property of music, not merely a property of the tuning system. A tuning may be capable of producing duodecimable and induodecimable music depending on context.

 

The question then arises: why twelve? The answer does not lie in timbre, in the Pythagorean algorithm, or in the more compelling harmonic semigroup, which, while seemingly more ontologically robust, does not necessarily relate to or reflect our perception. Why does this specific internal subdivision act as the dominant attractor, rather than systems based on ten, fifteen, or nineteen tones? Why does duodecimability seem to represent a perceptual threshold?

The answer is not solely historical. Nor is it purely acoustic. Instead, we find ourselves in the deeper terrain of categorical emergence: how perceptual systems construct stable reference frames from continuous data. Just as we learn to divide the color spectrum into culturally specific “basic colors,” so too do we divide the pitch continuum into categories that are both learned and constrained, by cognition, biology, and acoustics.

This chapter lays the foundation for a more precise taxonomy of perceptual translatability in music. The aim is not only to explore how Tonal Constancy works, but to examine the deeper question: where do musical categories come from at all? At what point do categories cease to form, or become unstable? And when do they dissolve into pure context-dependence, the point at which Tonal Constancy, in effect, runs out?


3: Tonal Reconstruction in 7-EDO and the Elasticity of Pitch Meaning


The 7-tone equal division of the octave (7-EDO) offers a clear demonstration of tonal constancy. Each step spans ~171 cents, and unlike diatonic 12-EDO, the system contains no internally differentiated intervals, no whole/semitone hierarchy, no embedded modal markers, and no natural centers of gravity. Acoustically, it is a uniformly spaced cycle.

Yet when 7-EDO is used melodically, listeners routinely report hearing tonal centers, modal references, and functional cadences. The system behaves musically intelligibly despite lacking structural cues.

This raises a central question:

How does a scale composed entirely of uniform steps give rise to perceived modes, cadences, and tonal direction?

The answer lies in the organization of the sequence rather than the tuning itself. Trajectory, rhythmic emphasis, contour, and learned tonal archetypes guide the perceptual system toward interpretations that are not encoded in the signal. This is tonal constancy: the imposition of familiar pitch relationships onto acoustically ambiguous material.

The 7-EDO “Chameleon Effect”: Functional Multivalence


One of the clearest signs of tonal constancy in 7-EDO is the instability of pitch identity. A single 7-EDO degree can be reinterpreted as multiple 12-EDO categories depending on context. For example, the 342-cent degree (two steps, the “neutral third”) frequently shifts between major-like and minor-like functions.

This is Functional Multivalence (a one-to-many mapping):
  • In 12-EDO, a pitch class is relatively stable (“C is C”).
  • In 7-EDO, the same frequency can behave as a major third, a minor third, or something in between, depending on melodic direction, local emphasis, or implied harmony.
The perceptual system prefers to preserve musical grammar—directionality, cadence, and contour—over preserving literal interval size. In effect, the brain “chooses” the pitch identity that best fits the surrounding syntax, even when that identity is not present in the acoustic input.

For reference, the 7-EDO scale in cents is:

[0, 171.4, 342.9, 514.3, 685.7, 857.1, 1028.6]
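
The degrees of 7-EDO follow from dividing the 1200-cent octave into seven equal steps. The sketch below computes them and, for each degree, lists the two 12-EDO pitches it falls between, which is where the competing functional readings come from. The helper names `edo_cents` and `bracketing_12edo` are hypothetical and used only for illustration.

```python
def edo_cents(divisions: int, cycle_cents: float = 1200.0) -> list[float]:
    """Cents values of an equal division of the cycle, starting from 0."""
    return [k * cycle_cents / divisions for k in range(divisions)]


def bracketing_12edo(cents: float) -> tuple[int, int]:
    """The 12-EDO pitches (multiples of 100 cents) at or below and just above a pitch."""
    lower = int(cents // 100) * 100
    return lower, lower + 100


for degree in edo_cents(7):
    lower, upper = bracketing_12edo(degree)
    # Distance above the lower candidate and below the upper candidate.
    print(f"{degree:7.1f}  between {lower} and {upper}  "
          f"(+{degree - lower:.1f} / -{upper - degree:.1f})")
```

The third line of output places the neutral third between 300 cents (minor third) and 400 cents (major third), not close enough to either for interval size alone to settle the category.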

Audio Examples 01/02: Diverging Cadential Mappings


The following examples (7-EDO followed by 12-EDO reinterpretation) illustrate how a single interval can support multiple functional readings.

The tune has a simple A-A-B-B structure. In both phrases, the final step of the bass line is identical in 7-EDO: a single 171-cent ascent. Yet in the 12-EDO reinterpretation, this same interval is mapped differently in each phrase:

In one case, the step resolves as a 100-cent semitone, supplying a cadential “leading tone.”
In the other, it expands to a 200-cent whole tone, producing a “major seventh”-like resolution.

Thus the same 7-EDO motion supports two distinct tonal functions, determined not by its size but by its role in the phrase.

Audio Examples 03/04: Neutral Intervals in Context


Additional examples (sine-wave only) show that the effect persists even without harmonic cues. The 342-cent degree is perceived as “major” or “minor” depending on the melodic frame:

Ascending, it often acquires major-third implications.
Descending, or in a minor-leaning contour, it takes on a minor-third quality.

When rendered in 12-EDO, performers naturally “resolve” these ambiguous steps toward the expected functional pitches to satisfy the implied cadence. This is tonal constancy operating directly on interval interpretation.

Trajectory, Momentum, and Torsor Structure


These examples highlight the role of pitch momentum, the way successive intervals form a directed trajectory through pitch space. Even in a perfectly symmetric tuning, melodic movement generates expectation and prepares closure.

This behavior aligns more closely with a torsor than a vector space: there is no absolute reference point, only relational structure. A pitch derives meaning from: its placement in the trajectory, its rhythmic emphasis, and its relation to culturally learned tonal prototypes.

The ear treats 7-EDO not as a static grid but as a flexible relational field.

This leads to a central set of questions:

When Does Meaning Break Down?
How far can an interval deviate before its expected function collapses?
When does a “minor third” cease to be heard as minor?
At what point does tonal constancy fail to rescue the structure?

These boundaries are not fixed. They shift with experience, familiarity, cultural priors, and attentional state. Much of what we “hear” as categorical pitch identity is constructed, not given.

Later chapters will return to these issues when discussing tonal attractors, learned priors, and the emergence of pitch categories.

Summary

Tonal behavior in 7-EDO shows that pitch meaning is elastic. Even in a scale with no intrinsic hierarchy, listeners reconstruct functional roles through trajectory, rhythm, and expectation. The auditory system is not passively reporting interval sizes; it actively infers tonal structure.

Where the tuning system provides symmetry and ambiguity, perception generates hierarchy and direction. This is tonal constancy: a property not of the tuning, but of the listener.


Relationship with Shepard tones:

Shepard tones expose a perceptual symmetry in pitch space that relies on overlapping spectral components, octave-wrapped circular pitch cues, and ambiguous vertical positioning on the pitch helix. This creates bistability: the same stimulus can be interpreted as “ascending” or “descending” depending on which branch of the helix the brain commits to. But this requires a very artificial timbre. It’s not “natural” in musical terms.

The 7-EDO functional example is a parallel phenomenon, but a "natural" one. The difference is this:

The ambiguity is not caused by timbre or chroma wrapping.
It is caused by cadential expectation and functional reinterpretation.

The 171-cent jump is too small to be a clear “whole step” and too large to be a clear “semitone”. It is ambiguous in scalar context and can act in either a 100-cent role or a 200-cent role after functional remapping. This means the listener's tonal model, not the acoustics, decides the interval class. This is a cognitive analog to Shepard’s perceptual ambiguity, but purely musical.

This is exactly what is predicted by categorical perception, key-dependent interval class assignment, top-down functional bias, and tonal constancy mechanisms.

The example is a miniature interval-multistability illusion. Here the bistability is major-step function vs. minor-step function, mapped onto the same absolute interval.

The 171-cent example shows that interval identity is not acoustically fixed, that functional context can warp interval categorization, that listeners can be tricked naturally rather than through artificial timbres, and that pitch space can behave as a bistable perceptual manifold.


Deutsch's Illusions

Deutsch describes a self-reinforcing perceptual loop, the “bootstrapping operation”:

-bottom-up cues: local intervals, sequential grouping, contour, roughness, spectral features
-top-down cues: stored tonal hierarchies, category expectations, Western pitch-class memories

The system settles into a coherent key + a coherent sequence despite ambiguity in the input.
But she addresses only ambiguities inside 12-EDO, not the deformation of the system itself.

The hidden assumption of the entire debate (Deutsch’s illusions, Krumhansl’s probe-tone work) is that the underlying pitch lattice is stable, fixed, and accurate.

Here we extend this:

What if the entire pitch framework is warped?
How far can you stretch the lattice before the bootstrapping collapses?
How does the brain “repair” a scale that violates its statistical priors?

But the bootstrapping mechanism should still operate even under distortion.

Their theory quietly implies pitch flexibility; it lurks inside the implications.

Their model says that local intervals provide sequential cues, stored hierarchies provide tonal mapping, and the system iterates until a stable interpretation emerges.

This is mathematically the same pattern as a stable fixed point under perturbation. If you slightly detune the fifth, the local interval cue moves, the hierarchical mapping adapts, and the loop tries to settle into a new stable point.

Neither Deutsch nor Krumhansl explored perturbing the system to see when this perceptual homeostasis breaks, but their mechanism predicts that there must be:

a region of stability (duodecimability)
a region of instability (collapse)
a boundary (“breaking point”)
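
The fixed-point reading can be illustrated with a toy loop. This is my own minimal sketch, not Deutsch's or Krumhansl's model: it assumes a 12-EDO template, a single global offset as the only thing the "hierarchy" can adapt, and nearest-category labeling as the "local cue". All names (`TEMPLATE_12EDO`, `nearest_category`, `bootstrap_labels`) are hypothetical. The input is a major scale compressed by 5%, so the errors grow toward the top of the scale.

```python
TEMPLATE_12EDO = tuple(range(0, 1200, 100))  # assumed stored categories, in cents


def nearest_category(cents, template=TEMPLATE_12EDO):
    """Bottom-up cue only: the template pitch closest to the input."""
    return min(template, key=lambda t: abs(cents - t))


def bootstrap_labels(pitches_cents, template=TEMPLATE_12EDO, max_iter=20):
    """Alternate between labeling and re-estimating a global offset until labels stop changing."""
    offset = 0.0
    labels = None
    for _ in range(max_iter):
        # Label each pitch after removing the currently estimated offset.
        new_labels = [nearest_category(p - offset, template) for p in pitches_cents]
        if new_labels == labels:
            break  # fixed point: the interpretation reproduces itself
        labels = new_labels
        # Re-estimate the offset as the mean residual under the new labels.
        offset = sum(p - t for p, t in zip(pitches_cents, labels)) / len(pitches_cents)
    return labels, offset


compressed_major = [0, 190, 380, 475, 665, 855, 1045]  # [0,200,400,500,700,900,1100] * 0.95
print([nearest_category(p) for p in compressed_major])  # one-shot nearest-neighbor labels
print(bootstrap_labels(compressed_major)[0])            # labels after the loop settles
```

In this toy case the one-shot labeling assigns the top note to 1000 cents, while the iterated loop settles on 1100 and so recovers the major-scale pattern. Larger deformations can be fed to the same function to look for the point where the recovered labels change, which is one crude way to picture the boundary listed above.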

The unasked question: everything from Krumhansl’s probe-tone curves to Deutsch’s illusions was done on the assumption that the octave = 1200 cents, the fifth ≈ 700 cents, the diatonic steps ≈ 100/200 cents, and pitch classes repeat with perfect diatonic symmetry.

What if we perturb the system?
How flat can a fifth be before tonal hierarchy collapses?
How stretched can an octave be and still be recognized?
How much deviation can a major third tolerate before category flipping?
How stable is the bootstrapping loop under systematic scale deformation?
How does the perceptual system “repair” wrong tunings?

If tonal perception is a dynamic attractor landscape with deformation tolerance, these questions lead to: cycle elasticity, equivoques, duodecimability layers, structural vs. mnemonic constancy, melodic contour vs. pitch topology, and multiple tunings mapping to the same cognitive attractor.

The same bootstrapping mechanism might be responsible for:

octave constancy (why a stretched octave still “feels like” an octave)
scale constancy (why warped scales still produce diatonic functions)
melody recognition under pitch drift (the “Happy Birthday” experiment)
equivoques (same intervals → different structures)
inverse equivoques (different interval sets → same functional pattern)
perceptual inertia (where memory overrides tuning reality)

All of this falls out naturally from their bootstrapping framework.

How does the mind find a key when the tuning system itself is moving?



Ξ Example A - 7edo
Ξ Example A - 12edo

Ξ Example B - 7edo
Ξ Example B - 12edo


(Image.1) This geometric visualization compares 7-EDO with the diatonic scale in 12-tone equal temperament on a logarithmic scale. Transposition of the 7-EDO structure yields identical intervallic relationships, whereas transposition of the diatonic scale reveals the seven familiar modes of 12-tone music.
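
The contrast described in this caption can be checked by rotating step patterns and counting how many distinct patterns result. The sketch assumes the usual representation of the diatonic scale as steps of 2 and 1 semitones, and of 7-EDO as seven identical steps; the function name `distinct_rotations` is hypothetical.

```python
def distinct_rotations(steps):
    """Set of all cyclic rotations of a step pattern."""
    return {tuple(steps[i:] + steps[:i]) for i in range(len(steps))}


diatonic_steps = [2, 2, 1, 2, 2, 2, 1]   # major scale in 12-EDO semitones
seven_edo_steps = [1, 1, 1, 1, 1, 1, 1]  # seven identical 7-EDO steps

print(len(distinct_rotations(diatonic_steps)))   # one pattern per mode
print(len(distinct_rotations(seven_edo_steps)))  # every rotation is the same pattern
```

The diatonic pattern yields seven different rotations, one per mode, while the uniform pattern yields a single one. Any modal identity heard in 7-EDO therefore cannot come from the step pattern itself.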


[Context-dependent functional reinterpretation of non-12-EDO pitch material

The study tests whether functional identity emerges from musical context rather than from fixed interval size, such that identical pitch distances may be categorized differently depending on their role within a phrase.

Primary Hypothesis (H1)

Listeners trained in 12-EDO will produce convergent 12-EDO transcriptions of short musical phrases constructed from pitch selections that are mathematically unrelated to 12-EDO, indicating context-dependent functional reinterpretation rather than nearest-neighbor pitch matching.

Secondary Hypothesis (H2)

The same pitch interval may be transcribed as different 12-EDO functions depending on musical context, demonstrating functional multivalence driven by melodic trajectory and tonal expectation.

Null Hypothesis (H0)

Transcriptions will primarily reflect nearest mathematical proximity between pitch selections and 12-EDO pitch classes, resulting in divergent or unstable interpretations across participants. 
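
The null hypothesis makes a concrete prediction that can be computed in advance. The sketch below is one possible way to score a transcription against it, under assumptions of my own: transcriptions are encoded as semitone numbers above the phrase's lowest degree, and the function names are hypothetical. The `hypothetical_major` list is an invented placeholder used only to exercise the code; it is not collected data.

```python
def nearest_neighbor_transcription(pitches_cents):
    """H0 prediction: each pitch is written as the closest 12-EDO semitone number."""
    return [round(p / 100.0) for p in pitches_cents]


def departure_rate(transcription, pitches_cents):
    """Fraction of notes where a transcription differs from the H0 prediction."""
    predicted = nearest_neighbor_transcription(pitches_cents)
    differing = sum(1 for t, p in zip(transcription, predicted) if t != p)
    return differing / len(predicted)


seven_edo = [k * 1200.0 / 7.0 for k in range(7)]
print(nearest_neighbor_transcription(seven_edo))  # what H0 predicts for an ascending 7-EDO scale

# Placeholder transcription (NOT real data): the scale written as a 12-EDO major scale.
hypothetical_major = [0, 2, 4, 5, 7, 9, 11]
print(departure_rate(hypothetical_major, seven_edo))
```

A departure rate near zero across participants would favor H0. Consistent, shared departures would favor H1, and departures that change with phrase context for the same interval would bear on H2.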

Design Justification: Why 7-EDO

7-EDO was selected as a test case because its step size (~171 cents) is sufficiently distant from 12-EDO semitones to prevent trivial nearest-neighbor mapping, yet dense enough to support stable melodic motion and phrase-level structure. This makes it well suited for testing whether pitch interpretation is governed by mathematical proximity or by contextual functional inference.

Importantly, the choice of 7-EDO is methodological rather than aesthetic; any pitch system that decouples interval size from familiar categorical boundaries could serve the same role.


This analysis helps by telling us what doesn’t need to be controlled, fixed, or standardized because the brain already does it for free.

Most music theory, tuning theory, and even cognition implicitly ask:

What tuning is right?
What intervals are consonant?
What system is natural?

This work reframes this to:

Which structures are perceptually stable under distortion, and why?

It helps music cognition by isolating a missing variable. A lot of research treats pitch categories, scales, and tonality as static objects.

This highlights something under-theorized:
The dynamics of category reallocation over time.

Categorization is path-dependent, not just grid-dependent.
The symbolic system is downstream of perceptual stability, not upstream.

It helps explain cultural convergence without mysticism: some structures, like 12-EDO, are attractors because they tolerate distortion better under predictive cognition.]



Part II: The Mechanism


4: Layers of Duodecimability


Why start with duodecimability?


Because the 12-fold diatonic categorization, whether its origins are cultural, biological, or both, appears to be the brain’s default prior for extracting tonal meaning. It is the predictive template that absorbs the largest amount of distortion, and it is remarkably difficult to “turn off.”

This is why randomly generated pitches can still form something that passes as 12-tone music, even when the underlying intervals are mathematically far from 12-EDO or just intonation. The familiar examples (pitch drift, shifting, detuning, entire performances slowly rotating away from their starting point yet remaining perfectly recognizable) all point to the same mechanism.
The Bach-in-31-tones reinterpretation demonstrates this vividly: the listener treats the 31 pitches not as new categories but as 12 categories with imperceptible internal modulation. The brain prefers to reinterpret the entire signal as a warped version of 12-tone space rather than adopt a finer contour with more pitch classes.

The layers I describe for induodecimability therefore generalize. They apply to any sufficiently robust pitch structure, whenever tonal constancy locks onto it, whether it is 5-EDO, the Bohlen-Pierce scale, or others. More broadly, these layers are layers of translatability between musical systems, understood not as fixed mathematical grids, but as auditory predictive templates with their own distortion-absorption capacities, precision thresholds, and tolerance for categorical warping. Among these, the 12-tone template appears to be the most flexible. (Later, in the section on cycles and the relationship between JND and JNM (just noticeable meaning), I discuss why a 12-fold partition might offer unusually high predictive utility.)

Likewise, the final layer (4: substrate dependence) is not specific to duodecimal systems, but applies to any pitch model. This layer is included because it becomes relevant for musical practice and analysis. For instance, some music catalogued as “microtonal” is in fact post-pitch music: the “scales” are secondary, and pitch height is no longer the main axis of organization. In such repertoire, the semantics arise from the behavior of inharmonic spectra, noise processes, or timbral trajectories. These works are not merely “induodecimable”; they are unpitchable in the categorical sense. Pitch is not the substrate on which their meaning is built.

Why Pre-Select Pitches at All?


Any tuning system begins with an act of selection: we carve a finite subset out of a continuous pitch continuum. Whether we choose 12-EDO, a just-intonation lattice, or a non-octave structure, this selection presupposes a grid. And the moment a grid is imposed, pitch becomes symbolic, something to be named, navigated, and reasoned with, rather than merely heard.

This raises the foundational question behind duodecimability:
What does pitch selection reveal about the perceptual forces that shape our sense of musical structure?

The Organology of Resistance: Why Frets and Notes Matter


A persistent assumption in microtonal discourse is that “fretless equals freedom”, that removing the grid grants access to an infinite field of pitch possibilities. Under tonal constancy, the opposite is often true.

The Gravitational Pull of the Fretless


On fretless instruments, intonation becomes a closed loop between the ear and the fingers. Because the auditory system continuously seeks harmonic-series alignment, familiar step sizes, and diatonic attractors (the “anti-randomness engine”), players unconsciously micro-correct toward culturally internalized targets. The result:

Fretless improvisation drifts toward just intonation or 12-EDO approximations.

“Microtonal freedom” often collapses back into familiar centers.
Without structural resistance, the instrument’s acoustics and the player's perceptual habits steer the music toward what the ear already knows.

Frets as Cognitive Prosthetics


Frets, keys, and fixed pitches are not restraints; they are tools of resistance. They freeze the geometry of an alternative system long enough for it to be inhabited on its own terms. By shifting navigation from psychoacoustic alignment to spatial/logical constraints (shapes, cycles, finger patterns), frets temporarily disable the brain’s corrective instinct.

They allow for structural alienation: the ability to function within an unfamiliar tuning without immediately reabsorbing it into 12-tone expectations.

Why This Matters for Duodecimability


Without such scaffolding, many “alien” systems are eroded by tonal constancy before they can be meaningfully explored. Fixed geometry protects them from the perceptual gravity of the listener and the performer.

The Need for a Framework


These observations motivate the central question of this chapter:

How strongly does a tuning system gravitate back toward 12-EDO when filtered through human perception, performance practice, and musical habit?

Rather than treating this gravitational pull as an aesthetic defect or a perceptual failure, we can use it as an analytic tool. The concept of duodecimability provides a structured vocabulary for describing how alternative tunings interact with the 12-tone system, not as universal truth, but as our current cultural baseline.

What Duodecimability Measures


Duodecimability is not an evaluation of musical value. It is a practical measure of translatability: how easily a given system can be mapped, approximated, or “rescued” by 12-EDO expectation.

The framework identifies layers, ranging from systems that can be subtly aligned with 12-tone tonality to those that resist assimilation even at their acoustic substrate. This allows us to distinguish:

systems that behave like dialects or variations of 12-tone practice, 
systems that partially align but diverge in key functions, 
systems that require structural scaffolding to maintain their identity, and 
systems that collapse entirely when filtered through the perceptual pull of tonal constancy.

This is not an argument about what tuning “should be” or which system is superior. Instead, it provides a practical tool for microtonal composition, an explanatory model for instrument design, and a conceptual bridge between psychoacoustics and musical structure.

It clarifies why some tunings feel intuitively compatible with tonal expectations while others feel like entirely new musical species.
 
Layers of Duodecimability

A tuning system’s “duodecimability” (translatability) refers to the degree to which its pitches, functions, or perceptual structures can be interpreted (or misinterpreted) through the lens of the 12-tone system.

Each layer below marks a progressively deeper departure from 12-EDO as both perceptual default and theoretical grammar:
 
  • Layer 1: Mathematical Proximity + Melodic-Trajectory Tolerance

Layer 1 covers music and systems that remain close enough to 12-EDO for tonal constancy to maintain a stable 12-tone interpretation, even when the literal pitches deviate. Here lie the just-intonation 12-tone structures, meantone temperaments, and the hyper-diatonic subsets of large EDOs. These systems are essentially colorations or timbral optimizations of the 12-tone framework.

Music that employs microtonal inflection (blues bending, R&B melisma, Flamenco ornaments) still relies on the 12-tone categorical skeleton. The deviations provide flavor, expression, and emotional contour, but the functional grid remains unmistakably diatonic/duodecimal.

Similarly, many quarter-tone traditions (Arabic, Persian, Ottoman, etc.) use extra pitches ornamentally and melodically, enriching the expressive space of 12-tone logic without dissolving it. The added pitches are not themselves duodecimable, but the framework is: these systems exploit the flexibility of categorical perception rather than replacing it.

(An upcoming chapter expands this into the distinction between expression-level deviation and category-level function.)

  • Layer 2: Structural Divergence + Independent Predictive Templates

Layer 2 contains systems whose internal structure cannot be “absorbed” into 12-tone categorization. They require and reliably trigger their own predictive grammar once tonal constancy locks onto them.

5-EDO and the Bohlen-Pierce scale are strongly induodecimable: they can be forced into 12 categories only locally, superficially or momentarily, but their true logic emerges quickly once musical motion reveals their characteristic intervals and cadential behavior. (Conversely, 12-EDO is highly inpentable and in-BP-able.) This leads to the Equivoques.

7-EDO occupies a mixed zone (the system with the latent functional multivalence): some melodic movements are categorically incompatible with 12-tone logic, while others are close enough to feel modally diatonic and therefore fully duodecimable.


This boundary is not binary. It is a continuum of pitch relationships and motion patterns, where different systems reveal their character at different thresholds of contour, function, and expectation. Layer 2 is where divergent musical worlds become perceptually robust.


  • Layer 3: Substrate-Induodecimability (Timbral Dissolution)

A musical system reaches substrate-induodecimability (the loss of any pitch-based translatability) when its sonic material no longer supports stable pitch perception. Here, the breakdown is not in tuning or tonal function but in the underlying spectral substrate. Inharmonic partials, unstable resonances, or nonlinear tone generators prevent the auditory system from assigning a reliable fundamental.
As a result, neither the 12-tone system nor any other can be inferred, not even approximately.

Gamelan metallophones, hyperstring instruments, ring modulation, and other inharmonic systems exemplify this condition. Their perceptual identity is defined not by pitch geometry but by timbral fingerprints and resonant patterns. While one can attempt to “translate” these sounds onto 12-EDO instruments, any such translation is interpretive rather than structural: it preserves personal associations, not categorical equivalence.

In this layer, pitch-based translatability collapses, not because the tuning diverges, but because pitch ceases to be the primary organizing coordinate of the music.

 ----


These layers don’t represent value judgments. They describe degrees of translatability, not superiority or purity. Each layer tells us more about how listeners (trained and untrained) perceive, categorize, and force-fit sound into symbolic boxes.

They also suggest that duodecimability is not a binary. It is a gradient, and perhaps a contested one: where you place a system may depend not only on its design, but on your listening history, your training, and your linguistic tools.

In the next chapter, we turn from classification to emergence: how tonal categories form in the first place, and what kinds of mental scaffolding make tonal constancy (and its resistance) possible.

The Equivoque Principle: Local Identity, Global Divergence

Induodecimability can be misunderstood as a purely microtonal phenomenon, a matter of "notes between the keys." However, the most profound breaks from the 12-tone system occur not when intervals are unrecognizable, but when familiar intervals build impossible structures. We call these Equivoque Scales.

The Equivoque: A sequence of intervals that appears locally identical to a 12-EDO structure (triggering Tonal Constancy) but which, upon accumulation, arrives at a destination that contradicts 12-tone logic.

Case Study: The 5-EDO Paradox: Consider the 5-tone equal division of the octave (5-EDO). Its second step is 480 cents. To a listener conditioned by 12-EDO, this falls comfortably within the category of a "Perfect Fourth" (500 cents). The 20-cent deviation is perceived merely as a "flat" or "mellow" character, a timbral flavor rather than a categorical change.

However, the grammar of the system depends on what happens when we stack them.

-In 5-EDO: Stacking five of these intervals (5 * 480) yields exactly 2400 cents: a perfect double octave. The stack resolves into stability. It is a closed cycle.
-In 12-EDO: Stacking five Perfect Fourths (5 * 500) yields 2500 cents: a double octave plus a semitone. The stack creates tension and displacement.

The Failure of Translation

If a musician attempts to "translate" a 5-EDO piece based on local intervals, they will play a stack of fourths. But where the 5-EDO piece resolves to a stable octave, the 12-EDO translation lands on a dissonant minor second. The local translation (Note A -> Note B) was "correct," but the macro-translation (Structure A -> Structure B) collapsed. 

This reveals that Tonal Constancy operates on a "horizon of prediction." For short segments, the brain assimilates the 480-cent interval as a fourth. But as the segment lengthens, the accumulated error forces the brain to confront a new geometry. The "Equivoque" is the point where the map (12-EDO) no longer matches the territory: Local Similarity \(\neq\) Global Congruence.
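The stacking arithmetic above can be checked in a few lines of Python. This is only a minimal sketch: the function name `stack_remainder` is illustrative, and the only inputs are the 480- and 500-cent fourths already quoted in the text.

```python
def stack_remainder(interval_cents, count, cycle_cents=1200.0):
    """Stack `count` copies of an interval and report where the stack lands.

    Returns (total, remainder): the total size in cents, and how far the
    total sits above a whole number of cycles (octaves by default).
    """
    total = interval_cents * count
    return total, total % cycle_cents


for name, fourth in [("5-EDO", 480.0), ("12-EDO", 500.0)]:
    total, remainder = stack_remainder(fourth, 5)
    print(f"{name}: 5 x {fourth:.0f} = {total:.0f} cents, "
          f"{remainder:.0f} cents above a whole number of octaves")
```

A remainder of zero means the stack closes into the cycle; any other remainder is the accumulated error that local listening does not reveal.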


The Geometry of the Fretboard

The difference between "tuning deviation" and "structural alienation" is best visualized on the guitar. In 12-EDO, stacking Perfect Fourths (500c) overshoots the double octave (2400c) by a semitone (2500c). To correct this, standard inter-string tuning introduces an asymmetry: the interval between the G and B strings is shortened to a Major Third (400c). The symmetry of the instrument is broken to satisfy the cycle of the octave (and to let chords be played "ergonomically").

In 10-EDO (or 5-EDO), the structural "fourth" is 480 cents. Stacking five of these intervals yields exactly 2400 cents (480 * 5). On a guitar refretted for 10-EDO, the inter-string tuning becomes perfectly symmetrical (4,4,4,4,4 steps) while still locking into the double octave.

This creates an "Equi-Pentatonic" chord on the open strings, a sound that is locally recognizable (stack of near-fourths) but globally "impossible" in 12-EDO logic. It is a system where the geometry of performance becomes fundamentally different.
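The two open-string layouts can be compared numerically. The sketch below assumes a six-string instrument and describes each tuning as a list of inter-string intervals in EDO steps; `open_string_pitches` is a hypothetical helper, not an existing library function.

```python
def open_string_pitches(steps_between_strings, edo):
    """Return open-string pitches in cents above the lowest string,
    given the inter-string intervals expressed in steps of an EDO."""
    step_cents = 1200.0 / edo
    pitches = [0.0]
    for steps in steps_between_strings:
        pitches.append(pitches[-1] + steps * step_cents)
    return pitches


# Standard 12-EDO guitar (E A D G B E): four fourths and one major third.
standard = open_string_pitches([5, 5, 5, 4, 5], edo=12)
# 10-EDO refret described above: five identical 480-cent "fourths".
symmetric = open_string_pitches([4, 4, 4, 4, 4], edo=10)

print("12-EDO:", standard)
print("10-EDO:", symmetric)
```

Both layouts span exactly 2400 cents between the outer strings, but only the 10-EDO layout gets there without breaking the pattern.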

Induodecimability is not just about "weird notes"; it is about different geometries of connection.


The Two Families of "Equivoques" (structural vs mnemonic)

There are two routes by which intervals can give rise to structure, and they are not the same phenomenon.

1. Structural Equivoques

Equivoque Duality Principle:
For any perceptual mapping that preserves local interval categories while altering global structure, there exists a complementary mapping that preserves global structure while altering local intervals. (This corresponds to each tonal structure's flexibility and its tolerance for reinterpretation.)

(purely perceptual or topological, independent of musical memory)

Equivoques: same small intervals → different global cycle (the 5edo subfourths example).

Inverse Equivoques: different small intervals → same global cycle (the stretched diatonic examples, or local perturbations of the 5edo subfourths).

This is the domain of the stretched-diatonic examples: the octave from 1200 → 1150 (or 1250).

Every interval is slightly warped. But the tonal grammar (scale degrees, melodic motion, cadential weight) stays intact. The listener perceives “the same melody” even if they’ve never heard it before.
This type relies on total scale interval flexibility.

The brain stabilizes identity based on internal relational geometry, not raw acoustics: it assumes “there is a cycle here” and finds the closest consistent one.
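A stretched diatonic scale can be sketched as follows. One loud assumption: the warp is applied uniformly, so that every degree is scaled by the same factor. The text only says that every interval is slightly warped, so this is the simplest possible case and not a description of the actual audio examples. The names `MAJOR_SCALE_CENTS` and `stretch_scale` are illustrative.

```python
MAJOR_SCALE_CENTS = [0, 200, 400, 500, 700, 900, 1100, 1200]  # 12-EDO major scale


def stretch_scale(scale_cents, new_octave_cents, old_octave_cents=1200.0):
    """Scale every degree by one common factor so that the 'octave' lands on
    new_octave_cents. Proportions between degrees are preserved
    (assumption: uniform warp)."""
    factor = new_octave_cents / old_octave_cents
    return [round(c * factor, 1) for c in scale_cents]


for octave in (1150, 1250):
    warped = stretch_scale(MAJOR_SCALE_CENTS, octave)
    largest_shift = max(abs(w - c) for w, c in zip(warped, MAJOR_SCALE_CENTS))
    print(octave, warped, f"largest shift: {largest_shift:.1f} cents")
```

Under this assumption the largest displacement is at the octave itself (50 cents), while the ratios between scale degrees, the "internal relational geometry", are unchanged.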

2. Mnemonic

(contour-based identification via stored templates)

This is not the same mechanism. The “Happy Birthday drifting in pitch” experiment belongs to a different category.

The tune is massively distorted: chromas land in the “wrong” positions and the interval sizes are inconsistent, but it is still perfectly recognizable, because the brain is now using top-down pattern matching, fitting the melody to a stored memory template. Contour dominates over interval precision.

Up/down motion + rhythm is enough to trigger recognition: identity-from-template, not identity-from-geometry.

This kind of recognition does not imply a robust internal structure in the new tuning system.
It implies melodic memory, not tuning tolerance. This does not mirror equivoques; it is formally separate.

Here we are just recognizing the song, not making a "literal" transcription of the incoming pitches.

---


The Optimization Trap:

It is a common misconception that "more notes" equals "more alien." Systems with high step counts such as 19, 31, 53, or 72-EDO are often grouped with radical microtonality. However, under the lens of Tonal Constancy, these systems often function not as departures from the 12-tone framework, but as Hyper-Diatonic Optimizations.

The Availability of "Better" Notes

In systems like 31-EDO or 72-EDO, the density of pitches is so high that the system acts as a "super-set." Within this vast array, one can easily select a subset that realizes the intervals 12-EDO approximates with greater precision than 12-EDO itself (e.g., finding a nearly "pure" 5:4 Major Third).

The Effect: Instead of forcing the listener to confront new categories (as 8-EDO does), the composer, consciously or not, is tempted to overfit the pitch selection to known templates.
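To see how much "better" the available notes are, one can measure how closely each system's nearest step matches the just 5:4 major third. This is a minimal sketch; `cents` and `best_step` are illustrative helper names, and the choice of EDOs simply follows the ones named in this section.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200.0 * math.log2(ratio)


def best_step(edo, ratio):
    """Return (step index, size in cents, signed error in cents) of the
    EDO step closest to a just ratio."""
    target = cents(ratio)
    step = round(target * edo / 1200.0)
    size = step * 1200.0 / edo
    return step, size, size - target


for edo in (12, 19, 31, 53, 72):
    step, size, error = best_step(edo, 5 / 4)
    print(f"{edo}-EDO: step {step} = {size:.1f} cents ({error:+.1f} vs 5:4)")
```

The 12-EDO third is the furthest from 5:4 of the five, which is the temptation described here: the denser grids contain a closer version of a category the listener already knows.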

The JND Threshold

This reaches a critical limit in 72-EDO, where the step size (~16.7 cents) approaches the average Just Noticeable Difference (JND) for pitch in melodic contexts.

At this level of granularity, the step is no longer a structural "brick"; it is a nuance.

Tonal Constancy engages effortlessly here. Because the grid is finer than the brain’s categorical error margin, any pitch can be slid perceptually into a standard 12-tone bin.

Rational Metaphysics

Consequently, much of modern microtonal theory has been directed not toward escaping the diatonic gravity well, but toward deepening it. The focus often shifts to a quest to justify 12-tone musical habits using the "purity" of Just Intonation ratios. This approach seeks to "fix" the commas and beating of Western music, perfecting the very structure it claims to expand.

Conclusion on Density

Therefore, high-density systems are not inherently induodecimable. Unless the composer rigorously avoids the "diatonic attractors" hidden within the swarm of notes, these systems tend to collapse back into Layer 1. They sound like "better" versions of the familiar, whereas lower-density, structurally incompatible systems (like 10-EDO) sound fundamentally different because they offer no place to hide.


Example of Duodecimability Using All 31-EDO Notes

A transformation of sensory input that preserves perceptual invariants.

The audio/video example below is a reinterpretation of Bach’s Goldberg Variation No. 1.
This variation famously uses all 12 pitch classes while remaining firmly diatonic; an early peak of Bach’s polyphonic chromaticism.

Here, however, the piece is performed on a 31-EDO sampled clavichord, and the adaptation makes use of nearly all 31 available chromas across the octaves.

The obvious question is: why doesn’t the music collapse?

Subtle Modulations, Stable Structure

In 31-EDO the step size is 38.7 cents, so many intervals sit near multiple possible 12-EDO interpretations. For example:

-the semitone can be realized as 77 or 116 cents
-the tritone can appear as 581 or 619 cents

Because of this, several 12-EDO mappings are always available not only via mathematical proximity, but also via melodic trajectory, voice-leading weight, and contextual expectation. Interval function is more flexible than the grid suggests, as shown earlier in the 7-EDO “functional multivalence” examples.

Thus even though the pitch set is far denser and every chroma is eventually touched, the music remains:

-tonal
-diatonic in function
-fully duodecimable

This is not “microtonal structuralism.” It’s microtonal refinement + auditory illusions, very similar to pitch-drift and Shepard-tone-style ambiguity.
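The double realizations of the semitone and tritone quoted above can be reproduced by listing, for each 12-EDO interval class, every 31-EDO step that falls inside a tolerance window. The ±30 cent window is an assumption borrowed from the illustrative figure in the introduction, not a measured constant, and `realizations` is a hypothetical helper.

```python
def realizations(edo, semitones, tolerance_cents=30.0):
    """List (step, size in cents) for every step of `edo` lying within
    tolerance_cents of a 12-EDO interval of the given size in semitones.
    The tolerance is an assumed, illustrative window."""
    target = 100.0 * semitones
    step_cents = 1200.0 / edo
    return [(k, round(k * step_cents, 1))
            for k in range(edo + 1)
            if abs(k * step_cents - target) <= tolerance_cents]


print("semitone:", realizations(31, 1))
print("tritone: ", realizations(31, 6))
print("fifth:   ", realizations(31, 7))
```

With this window, some classes (semitone, tritone) have two candidate steps while others (the fifth) have one. Which candidate is heard as "the" semitone is then decided by context, which is the point made above about melodic trajectory and voice-leading weight.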

Is 31-EDO Structurally Alien?

Yes and no.

31-EDO has its own harmonic logic and can certainly support non-12-tone structures. But because its pitch density is so high, you can choose to selectively improve consonances, or subtly colorize a key, while still remaining within familiar 12-tone perceptual categories.
You don’t leave the “12-EDO flavor”; you simply refine, bend, and tint it.

This is duodecimability in action: many microtonal deviations still funnel back into 12-tone percepts when context supports them.



Video Description

The video shows two circular pitch-class displays:

A 12-EDO clock, marking the original chroma classes of the Bach score.
When each pitch class is used, it remains highlighted with a different color.
(By halfway through the piece, Bach has used all 12.)

A 31-EDO clock, showing all pitches actually played in the 31-EDO adaptation.
As each chroma is used, it remains marked showing how the entire 12-tone structure “moves” within a larger 31-tone space.

The result is a visualization of how the whole wheel of 31 notes gets painted over time, while the music itself stays remarkably stable.

5: The Spectrum of Familiarity: Microtonal Flavor vs. Functional Break


Music doesn’t become "otherworldly" just because it uses strange intervals. In many cases, it is ornamental, expressive, a kind of seasoning, a flavor layered atop an underlying structure that is still resolutely tonal.

The difference between microtonal flavor and functional departure is a spectrum, not a binary. But it’s crucial because it defines whether a piece of music is interpretable, translatable, or cognitively disorienting. And that distinction hinges on tonal constancy: whether the listener can still rely on familiar perceptual anchors, tonic, cadence, resolution, even as the tuning system mutates around them.
 
Historic Flavors: Chopin and Meantone Coloration

A famous example: Chopin reportedly referred to D minor as the “saddest” key.

At first glance, this seems metaphysical or poetic. But there may have been a physical reason. During his time, many pianos were still tuned in meantone or related unequal temperaments, systems optimized for certain intervals using simple integer ratios. While 12-tone equal temperament (12-EDO) was theoretically known and even in use, it was still rare for instruments to be tuned to it precisely. Ear-based tuning methods favored rational approximations. Algorithmically, meantone was simply more practical before electronic tuners.

The result: each key had a unique color, a subtle deviation in interval sizes that made D minor sound distinctly different from, say, B minor. These were microtonal inflections, not fundamental departures. The harmonic framework remained diatonic. What changed was the flavor profile of each key.
 
Modern Examples of Flavor: Bends, Blues, and Maqamat

Today, the idea persists in many styles:

Blues music bends between notes of the pentatonic and chromatic scales, sliding into pitches that don't "exist" in 12-EDO notation. These expressive bends act as stylistic inflections, not harmonic challenges. The tonic remains the tonic. 
 
Arabic Maqamat and Persian Dastgah systems incorporate quarter-tones and nuanced scalar steps, often creating pitches "between the keys." Yet these systems still rely on cadential logic and tonal gravitation. The microtones serve as ornaments, bridges, flavors. They rarely seek to dissolve the entire structure; they aim to enrich it.

In both cases, duodecimability remains possible, even if imperfect. A skilled listener can still find the center of gravity. These are flavored tonalities, not alternative logics.
 
Functional Break: When Tonal Constancy Fails

What happens, though, when the system no longer submits to interpretation?

Below is an example from 8-EDO, a tuning system that divides the octave into eight equal steps (150 cents each). It contains two maximally symmetric diminished scales, and enough pitch density to form chords and melodies. However, the logic of this system is non-diatonic by design.

Try to map its harmonic progressions to 12-EDO, and tonal constancy breaks. No amount of perceptual coercion or melodic expectation can fully translate its motion. The listener doesn’t "mishear" it as tonal; they simply hear it as strange.

Why? Because 8-EDO sits out of phase with 12-EDO. Outside the subset the two systems share (4-EDO, the multiples of 300 cents that also underlie the diminished scale), every 8-EDO pitch falls exactly 50 cents from the nearest 12-EDO pitch, as far from a category center as a pitch can be. Their intervals don’t approximate one another; they contradict each other. This is the threshold at which duodecimability fails entirely. Translation is not fuzzy; it is impossible.
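The out-of-phase relationship can be made concrete by measuring how far each 8-EDO pitch lies from its nearest 12-EDO neighbor. This is a minimal sketch with a hypothetical helper name; it only compares pitch grids and says nothing about how a listener actually categorizes them.

```python
def distances_to_12edo(edo):
    """For each step of an EDO, return (step, cents, absolute distance in
    cents to the nearest 12-EDO pitch)."""
    result = []
    for k in range(edo):
        c = k * 1200.0 / edo
        nearest = 100.0 * round(c / 100.0)
        result.append((k, c, abs(c - nearest)))
    return result


for step, c, distance in distances_to_12edo(8):
    print(f"step {step}: {c:6.1f} cents, {distance:4.1f} from nearest 12-EDO pitch")
```

The even steps coincide with 12-EDO (they form 4-EDO); the odd steps sit at the maximum possible distance of 50 cents, exactly on the boundary between two 12-tone categories.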
 
Octave Retention vs. Structural Alienation

Interestingly, 8-EDO still uses the octave as a repeating unit. This gives it a slight advantage in group performance and instrument design: parts can be transposed, ranges can be shared.

Compare that to the very similar scale, Bohlen-Pierce (13-ED3), a system that replaces the octave (2:1) with the tritave (3:1). While rich in harmonic possibilities (especially with odd harmonics), it loses the universal reference point that the octave provides. The result: true structural alienation, especially in chordal writing. Melodies still function, but harmonies drift into perceptual limbo. An approximate 1.97 ratio (eight BP steps, about 1170 cents), close to the octave, exists, but it is harmonically incoherent with traditional instruments.

This is why 8-EDO, though less famous, can feel more playable. Its symmetrical design makes it excellent for exploring alien harmonic functions while maintaining just enough structure for ensemble use.

(So 8-EDO and Bohlen-Pierce share many equivoques: similar local intervals but contradictory global structures. Some of their scales are briefly in phase.)
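How briefly the two systems stay in phase can be estimated from their step sizes alone. The sketch below treats Bohlen-Pierce as 13 equal divisions of the 3:1 tritave, as stated above; the constant and function names are illustrative.

```python
import math

TRITAVE_CENTS = 1200.0 * math.log2(3)   # about 1902 cents
BP_STEP = TRITAVE_CENTS / 13            # about 146.3 cents
EDO8_STEP = 1200.0 / 8                  # exactly 150 cents


def drift(steps):
    """Cents by which 8-EDO runs ahead of Bohlen-Pierce after `steps` steps."""
    return steps * (EDO8_STEP - BP_STEP)


# Ratio reached by eight BP steps, the near-octave mentioned above.
pseudo_octave_ratio = 3 ** (8 / 13)

for n in (1, 4, 8, 13):
    print(f"after {n:2d} steps: drift = {drift(n):5.1f} cents")
print(f"eight BP steps = ratio {pseudo_octave_ratio:.3f}")
```

A single step differs by under 4 cents, well inside any plausible tolerance, but the difference accumulates to roughly 30 cents by the eighth step, where one system closes on the octave and the other does not.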
 
The Takeaway: The Diatonic Ghost is Hard to Kill

Even in highly divided systems like 19, 22, or 31-EDO, often used for their greater consonance or intonation precision, diatonic templates resurface. Musicians use them to better approximate known categories, not to invent new ones. In fact, the higher the division, the more tempting it becomes to overfit microtonal pitch space to traditional harmonic roles.

By contrast, systems like 8-EDO or 10-EDO, low-subdivision tunings that avoid rational alignment with 12-EDO, offer fewer handholds. Their symmetry, spacing, and internal logic prevent easy mapping. They don't flavor tonal music; they replace it.

These systems are functionally distinct, and their progressions defy tonal constancy. This is the boundary line: where the mind stops hearing “altered chords” and starts hearing new grammar.
 
Closing Note

The difference between flavor and functional break is not merely theoretical. It defines whether music can still operate within a shared perceptual vocabulary, or whether it demands the invention of a new one.

In the chapters ahead, we’ll explore this boundary more formally: how tonal categories form, and what kinds of cognitive attractors allow or prevent the perception of coherence when pitch structures drift too far.

Or put more provocatively: when does a microtone become a mutiny?


6: Anchor Density — Diatonic Memory and the Illusion of Familiarity


Tonal constancy does not act on systems uniformly. Some tunings invite diatonic reinterpretation easily. Others resist it entirely. The key difference is not simply the number of steps per octave or the presence of harmonic intervals, but what we call Anchor Density, the frequency and distribution of elements that are close enough to recognizable tonal functions that the brain tries to interpret them as such. The concept aligns with the Perceptual Magnet Effect described by Patricia Kuhl in speech perception, and later applied to music by Carol Krumhansl.

This is not a binary switch. Induodecimability, as introduced earlier, isn’t “on” or “off.” It’s a gradient of perceptual traction, the ease with which the listener's cognitive machinery can hallucinate tonal relevance from non-diatonic material.
 
Anchors: The Seeds of Tonal Illusion

We define an anchor as a moment, a pitch, a dyad, a short progression that approximates a recognizable function within the 12-EDO system. These are not structural absolutes; they are perceptual affordances. A chord that sounds like a major triad even if it's technically off is an anchor. A cadence that feels like resolution is an anchor, regardless of its tuning origin.

The brain uses these moments to project a familiar tonal grid over unfamiliar territory. It’s a form of perceptual compression, and it’s why even radically mistuned systems can feel “not quite right” instead of “completely alien.”
 
Case Study: The Diatonic Categorization Experiment

An example of this phenomenon appears in Jason Yust’s "Diatonic Categorization in the Perception of Melodies." In the study, ~30 participants, primarily musicians or audio professionals, were asked to categorize melodies played in an unusual subscale of 13-EDO: a seven-note scale built on steps [0, 2, 4, 8, 10, 11, 12] of the 13 (listed as selected step indices, not as successive intervals).

Despite the scale’s deep structural departure from 12-EDO, participants consistently used diatonic terms to describe what they heard: “major third,” “perfect fourth,” “leading tone,” etc. Not a single subject identified the system as non-12-EDO. The internal tonal schema overrode the signal.

This was not 13-EDO as a novel tuning. This was 12-EDO imposed on a foreign substrate: Categorical Assimilation. When a stimulus is within a certain distance of a category prototype (an anchor), the brain shrinks the perceptual distance, pulling the sound into the category.
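As a rough illustration of categorical assimilation, the sketch below takes the scale quoted above, reads its entries as 13-EDO step indices (an assumption about the notation), and finds the nearest 12-EDO category for each. Nearest-neighbor matching is a deliberately crude stand-in for the perceptual process; it is not the method of the study, and `nearest_12edo` is a hypothetical helper.

```python
DEGREES_13EDO = [0, 2, 4, 8, 10, 11, 12]  # scale quoted above, read as 13-EDO step indices


def nearest_12edo(step, edo=13):
    """Return (size in cents, nearest 12-EDO category in semitones,
    signed deviation in cents) for one step of the given EDO."""
    c = step * 1200.0 / edo
    semitone = round(c / 100.0)
    return round(c, 1), semitone, round(c - 100.0 * semitone, 1)


for step in DEGREES_13EDO:
    size, semitone, deviation = nearest_12edo(step)
    print(f"13-EDO step {step:2d}: {size:6.1f} cents -> "
          f"{semitone:2d} semitones ({deviation:+.1f})")
```

Every degree lands within about 40 cents of some 12-tone category, so a listener who pulls each pitch toward its nearest prototype has a candidate label available at every step.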
 

The Anchor Density Spectrum (DRAFT) 


Let’s break down two critical points on the anchor density gradient.
 
1. High-Density Anchors: Partial Duodecimability

Systems: 11-EDO, 13-EDO, high-fidelity Just Intonation subsets. For example, 13-EDO roughly approximates a chain of thirteen 11th harmonics folded into a single octave (2:1): \(11^{x} \times 2^{y_x} \in (1, 2]\) for \(x \in \{0, 1, \dots, 12\}\), where \(y_x\) is the integer that folds each term into the octave. One link of the chain: \(2^{6/13} \approx 1.377 \approx 11/8 = 1.375\).
Perceptual Experience: Slippery, ambiguous, “almost tonal”
Mechanism: Local phrases strongly resemble 12-EDO intervals, triggering familiar categories

These systems produce local illusions of tonality. You might hear a chord that “feels” like a major triad, your brain engages tonal constancy, and you momentarily experience a key center. But when the next chord arrives, the illusion fails. The logic collapses. The system can't sustain a consistent diatonic mapping over time.

We call this the Shifting Grid Problem: tonal constancy can win a battle, but it loses the structural war. The mind would have to rebuild the entire tonal scaffold for each new event, a computation it can't maintain over time. (Alternatively, a composer can use this instability to exploit the less familiar categories.)

This explains why many listeners describe such music as “drifting,” “haunting,” or “unstable.” It’s not unfamiliar in total; it’s familiar in fragments, and that inconsistency is deeply disorienting.
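The claim that 13-EDO roughly approximates a chain of 11th harmonics can be checked numerically. The sketch assumes the chain is generated by 11/8 and mapped to six 13-EDO steps per link, as the \(2^{6/13} \approx 11/8\) example suggests; it folds pitches into [0, 1200) cents, which matches the (1, 2] convention used above except at the unison. Function names are illustrative.

```python
import math


def folded_cents(x):
    """Cents of 11**x folded into one octave, i.e. 11^x * 2^y with integer y."""
    return (x * 1200.0 * math.log2(11)) % 1200.0


def chain_errors(length=13, edo=13, generator_steps=6):
    """Signed error (tempered minus just, in cents) for each link of the
    chain, with the generator 11/8 mapped to `generator_steps` EDO steps."""
    errors = []
    for x in range(length):
        just = folded_cents(x)
        tempered = ((generator_steps * x) % edo) * 1200.0 / edo
        diff = (tempered - just + 600.0) % 1200.0 - 600.0  # wrap into [-600, 600)
        errors.append(round(diff, 1))
    return errors


print(chain_errors())
```

The error grows by about 2.5 cents per link and reaches roughly 30 cents at the far end of the chain, so "roughly approximates" is a fair description: close for nearby links, noticeably looser for distant ones.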
 
2. Low-Density Anchors: Global Induodecimability

Systems: Bohlen-Pierce, 8-EDO, 10-EDO, some dissonant Just Intonation networks
Perceptual Experience: Profound alienation, unfamiliarity, or unclassifiability
Mechanism: Few (or zero) interval categories resemble 12-EDO constructs

Here, tonal constancy doesn't just fail intermittently; it fails completely. There are no islands of recognition. The harmonic series is differently parsed, the scale is divided in unfamiliar ratios, and resolution itself might not even be meaningful.

These are truly induodecimable systems. Even approximate mappings to 12-EDO don’t make sense. The brain has to make a choice: either accept this new musical logic on its own terms, or reject it as noise.
 
Implications: When Hallucination Meets Constraint

Anchor density reveals a deep cognitive tradeoff. Tonal constancy can only operate within a limited domain of error. Systems like 19-EDO or 22-EDO can stretch that domain without breaking it; systems like 8-EDO snap it in half.

Flavor becomes function when too many anchors accumulate. Microtonality starts as expression—but once enough structural anchors reappear, the listener’s mind imposes full tonality onto the scale.

This is the difference between Microtonal Refinement (Optimization) and Microtonal Structuralism.




7: The Indispensable Cycle: How Pitch Becomes Place


The Octave as Construct

The octave is traditionally treated as foundational: it arises physically from halving a string, frames the diatonic scale, and corresponds to the 2:1 frequency ratio. Yet its perceptual status is not simply a direct reflection of acoustics. The octave is constructed by the auditory system. A tone and its 2:1 counterpart are not perceived as categorically distinct but as variants of the same pitch identity. This equivalence, intuitive to children and indispensable to musicians, feels obvious because it is statistically reinforced, not because it is ontologically necessary.

The harmonic series provides the regularity: doubling a frequency aligns overtones and adds no new spectral information. The brain learns this redundancy and treats 2:1 transformations as perceptual “returns.” Thus, the octave functions less as a physical law than as a highly efficient cognitive shortcut.

Linear Pitch vs Cyclical Pitch

A purely linear pitch space offers range and resolution but no meaningful structure. To illustrate, imagine two listeners with equal pitch discriminability:
• Listener A perceives a standard 20–20,000 Hz range (~10 octaves).
• Listener B perceives only 10–11 Hz while resolving the same number of discriminable steps.

Despite identical resolution, the topologies they inhabit differ fundamentally. Listener A experiences hierarchical structure, anchored positions, intervals, and return points. Listener B experiences undifferentiated detail. “Octave” is not simply a summation of JNDs but a cognitive cycle that turns a line into a map.
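The two listeners' ranges can be expressed in octaves and cents with a short calculation. This is a minimal sketch; `span` is an illustrative helper and the frequency limits are the ones given above.

```python
import math


def span(low_hz, high_hz):
    """Return the width of a frequency range as (octaves, cents)."""
    octaves = math.log2(high_hz / low_hz)
    return octaves, 1200.0 * octaves


for name, (low, high) in {"Listener A": (20, 20000), "Listener B": (10, 11)}.items():
    octaves, width_cents = span(low, high)
    print(f"{name}: {octaves:.2f} octaves ({width_cents:.0f} cents)")
```

Listener B's whole world is narrower than a major second, so no 2:1 return point ever occurs inside it, however finely it is resolved.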

Perceptual identity depends on such cycles. Without a return point, some transformation that “comes back” to itself, pitch remains an infinite continuum without landmarks. Cycles enable categories, memory, direction, and tonal meaning.

Statistical Roots of the Cycle

Octave equivalence persists even with sine tones, indicating that the cycle arises from internalized statistical models, not solely from spectral cues. When the 2:1 ratio is moderately distorted (e.g., compressed to 1150 or stretched to 1250 cents), melodies retain their identity. Tonal constancy stabilizes the cycle so long as internal melodic relationships remain coherent. The perceptual loop bends but does not immediately break.

This demonstrates that the brain prefers a closed-loop topology and will warp incoming data to maintain it. These “stretched diatonics” invert the phenomenon of equivoque intervals: instead of similar intervals collapsing onto different structures, distorted intervals preserve a familiar structure.

Toward Perceptual Cyclicity

Perceptual Cyclicity can be defined as:
A property of continuous pitch space in which unidirectional motion leads to repeated encounters with qualitatively similar percepts, producing a sense of recurrence despite physical progression.

This introduces internal geometry: antipodes, midpoints, and stable recursive partitions. Even with degraded range or resolution, the relations persist. This is the basis of tonal constancy.

Breaking the Loop: Timbre

Could alternate cycles be trained, say, based on a 2.5:1 ratio? Theoretically, perhaps; practically, statistical coherence collapses. Missing fundamentals, combination tones, and overtone alignment continually reinforce 2:1 relationships. One can stretch the cycle, but replacing it is difficult without abandoning stable pitch identity altogether. Inharmonic timbres (e.g., gamelan metallophones) illustrate this: when spectra diverge from harmonicity, cycles weaken or dissolve.

Why the Brain Needs Cycles

Cycles enable the brain to compress infinite pitch space into a finite topology. They support prediction, reduce memory load, and permit combinatorial structures: scales, chords, modes, symmetry. Without cycles, musical cognition reduces to raw signal detection.

Elasticity and Limits

Empirical evidence suggests that octave perception is highly elastic. When melodies retain coherent internal logic, even large deviations from 1200 cents remain “in tune.” What matters is not the numeric ratio but the density of internally consistent anchors. When enough local regularities accumulate, the brain infers a global cycle, even if the physical data do not demand one.

The brain prefers a stable loop over an accurate line. It will infer structure rather than accept randomness.

Pitch as a Vector on a Loop

The octave is not a metaphysical constant but a perceptual habit grounded in efficiency and statistical regularity. It turns the continuous dimension of frequency into a cyclical space in which pitch acquires identity and direction. Once the loop is established, the brain treats pitch not as a scalar point but as a location within a closed topology.
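One way to picture "location within a closed topology" is to split a frequency into a cycle count and a position inside the cycle. The sketch below is a minimal illustration, not a model of auditory processing; the function names are hypothetical, the reference defaults to the 261 Hz middle C used in the video example, and `cycle_ratio` is left as a parameter so that stretched cycles can be tried.

```python
import math

def pitch_on_loop(freq_hz, ref_hz=261.0, cycle_ratio=2.0):
    """Return (height, chroma): whole cycles above the reference,
    and the position inside the current cycle, in [0, 1)."""
    cycles = math.log2(freq_hz / ref_hz) / math.log2(cycle_ratio)
    height = math.floor(cycles)
    chroma = cycles - height
    return height, chroma

def chroma_distance(c1, c2):
    """Shortest distance between two positions on the loop (0 to 0.5)."""
    d = abs(c1 - c2) % 1.0
    return min(d, 1.0 - d)

c_low = pitch_on_loop(261.0)    # the reference itself
g_low = pitch_on_loop(391.5)    # a 3:2 fifth above the reference
g_high = pitch_on_loop(783.0)   # the same fifth, one cycle higher

print(c_low, g_low, g_high)
print(chroma_distance(g_low[1], g_high[1]))
```

The two G's differ in height but share (up to rounding) the same chroma, which is the sense in which pitch becomes a position on a loop plus a direction of travel.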

Functional tonality therefore is fundamentally probabilistic and topological.




(Video.01 - Color-coded octave equivalence)

Video.01: Octave equivalence is demonstrated through a common chord progression exhibiting a known tension-resolution characteristic: \(\text{V}_7 \to \text{I}\). Within a 12-tone equal temperament (12-EDO) framework, with middle C standardized at 261 Hz, the progression \(\text{G}_7 \to \text{C}\) is employed. An initial sequence, represented in MIDI format, comprises approximately one pitch class per chord. Subsequent sequences introduce randomized octave doublings of chord members, illustrating the preservation of harmonic function and tonal meaning. Introduction of other random intervals in further sequences results in the loss of this harmonic function. While the octave's significance may appear self-evident within certain modern consonance models and given the observed perceptual flexibilities, such examples serve to reaffirm its fundamental role. The synthesized sounds in these examples utilize sine waves, thus eliminating timbral complexity and ensuring that the observed pitch grouping is independent of partials.

The perceptual flexibility of the octave and its role as a framework for monophonic melodic structure are demonstrated through a series of audio examples. Each example features a 12-EDO diatonic major scale subjected to proportional rescaling. The notes of the scale are presented sequentially, followed by a short melody, to illustrate the preservation of tonal meaning and relative intervallic distances despite the distortion. Because the deviation is spread proportionally across the scale, each adjacent step differs from its 12-EDO size by less than 10 cents. Specifically, audio example 1 rescales the octave from 1200 cents down to 1150 cents, while audio example 2 stretches it from 1200 cents to 1250 cents.
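The per-step deviation can be verified directly. The following sketch assumes the rescaling is strictly proportional (every interval multiplied by the same factor); the function names are hypothetical.

```python
MAJOR_SCALE_CENTS = [0, 200, 400, 500, 700, 900, 1100, 1200]

def stretch_scale(scale_cents, target_octave_cents):
    """Rescale every degree so that the 1200-cent octave lands on the target."""
    factor = target_octave_cents / 1200.0
    return [c * factor for c in scale_cents]

def max_step_deviation(original, rescaled):
    """Largest change in size of any adjacent step, in cents."""
    orig_steps = [b - a for a, b in zip(original, original[1:])]
    new_steps = [b - a for a, b in zip(rescaled, rescaled[1:])]
    return max(abs(n - o) for n, o in zip(new_steps, orig_steps))

for target in (1150, 1250):
    rescaled = stretch_scale(MAJOR_SCALE_CENTS, target)
    dev = max_step_deviation(MAJOR_SCALE_CENTS, rescaled)
    print(target, [round(c, 1) for c in rescaled], f"max step deviation {dev:.2f} cents")
```

A 50-cent change to the octave costs each whole tone about 8.3 cents and each semitone about 4.2 cents, which is why the local intervals stay recognizable while the cycle itself is clearly off.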

This is Categorical Perception again. The brain prefers a "closed loop" topology (a circle) over a line, so it will bend the data to close the circle. (These stretched diatonic scales are the inverse of the equivoques: there, similar intervals stack onto different macrostructures; here, different intervals stack and stand as the same macrostructure.)


(Audio.01) 12-EDO diatonic stretched to 1150 cents.


(Audio.02) 12-EDO diatonic stretched to 1250 cents.

(Audio.03) Auditory stimulus used in pitch distance estimation tests. A sequence of randomly generated pitches with constant, randomized step sizes is presented. Participants estimate the overall interval between the first and last pitches. Step sizes and number of notes are withheld from participants to prevent calculation-based responses. 


8: Self-Organizing Criticality in the auditory cortex. A Collision of Two Logics

Opening Question: Why So Few Categories?

We live in a universe of almost infinite auditory resolution. The human ear can distinguish pitch changes as small as 5 cents under ideal conditions—meaning that, technically, we could divide the octave into hundreds of perceptible steps. And yet, almost universally, humans gravitate toward small sets of pitch categories: five (pentatonic), seven (diatonic), and twelve (chromatic). Why?

Why not 17? Or 53? Why don’t we hear in hundreds of pitch regions the same way we can recognize thousands of faces? What explains this great compression?

To answer that, we must look at the perceptual and physical forces at play, not just what the ear can hear, but what the brain wants to organize. We’ll see that the categories we use are not merely products of hearing, but of balancing two incompatible but equally foundational logics: one mathematical, the other acoustic.

Symmetry vs. Resonance

Imagine pitch categorization as a negotiation between two deep instincts, two different ways of dividing the world into "sensible parts." These two logics are not optional. They are both embedded in our perceptual architecture and the acoustic environment itself.

1. The Logic of Symmetrical Partition (The 2ⁿ Brain)

The brain has an affinity for symmetry, especially binary symmetry. Our nervous system is organized into mirror-like hemispheres, our motor systems operate in bilateral pairs, and our cognition thrives on hierarchical nesting. When it encounters a continuous space, the simplest organizing move is to subdivide it by halves.

This gives us a tidy and scalable system:

1 division = 1 point (trivial)
2 → 2 regions (perceptual opposites)
4 → quarters
8 → eighths
16 → sixteenths, and so on.

This division strategy is computationally elegant and perceptually robust, up to a point. 
This is where semantic distinctiveness decay enters. As the cycle gets subdivided into more and more parts, the distance between each mark on the ruler shrinks. At some point, the perceptual difference between one step and the next becomes so small that the brain stops treating them as different enough to matter. It's not that we can’t hear them; it’s that they don't participate in meaningful contrast. We might call this categorical fatigue (JNM, Just Noticeable Meaning).

2. The Logic of Acoustic Resonance (The Timbre Lock)

But there's a second force, and it doesn't care about symmetry or perceptual consonance. It cares about resonance.

The physical world, particularly the world of vibrating strings, tubes, and vocal cords, presents us with one inescapable musical truth: the 3:2 ratio, the perfect fifth.

The perfect fifth is a rebel. It doesn’t neatly fit into binary division. Try cutting the octave in half, then half again, and you won’t land on the fifth. It’s irrational in log₂-space. But perceptually, it's overwhelmingly powerful. In spectral terms, it's the first prominent harmonic interval after the octave. When you play a note and then its fifth, your auditory system doesn’t hear disjoint points—it hears coherence. Overtones lock. Energy feels shared.

This is what gives us tonal hierarchy: the sense that some notes "belong together" and others don't. It introduces direction, gravity, and asymmetry. In Western music, it’s what makes dominant-tonic cadences feel like return journeys.

This is not a culturally invented behavior. It’s a biological reaction to the physics of sound. Every culture that employs pitched instruments stumbles into this asymmetry—whether it encodes it in twelve tones or not.

The Great Compromise: Why 12?
Twelve is not just a convenient number. It is a solution—a local maximum in the space of possible tonal systems.

Twelve is where these two logics, symmetry and resonance, strike a practical truce:

It accommodates the 3:2 fifth closely enough that stacking twelve of them brings you nearly back to the starting pitch class (twelve fifths ≈ seven octaves, with each fifth ≈ 7 semitones).

It allows symmetrical subdivisions (12 divides cleanly by 2, 3, 4, and 6).

It supports multiple scale subsets (pentatonic, heptatonic, etc.).

It permits hierarchical relationships—tonic/dominant, major/minor—without overloading the perceptual system.

In other words: 12 is not perfect, but it’s good enough in enough directions. It's the first stable “cultural attractor” where perceptual symmetry and physical asymmetry can be mapped onto a finite structure that the brain can learn and use fluently.
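The "good enough" claim about the fifth can be put in numbers. The sketch below compares how closely a few equal divisions of the octave approximate a just 3:2 fifth, and how far twelve stacked just fifths overshoot seven octaves. The selection of EDOs and the function name are illustrative choices.

```python
import math

FIFTH_CENTS = 1200 * math.log2(3 / 2)   # just perfect fifth, about 701.96 cents

def best_fifth(edo):
    """Return (steps, error_cents) for the EDO step count closest to 3:2."""
    step = 1200 / edo
    steps = round(FIFTH_CENTS / step)
    return steps, steps * step - FIFTH_CENTS

# Twelve just fifths versus seven octaves: the Pythagorean comma.
pythagorean_comma = 12 * FIFTH_CENTS - 7 * 1200

for edo in (5, 7, 12, 17, 19, 53):
    steps, err = best_fifth(edo)
    print(f"{edo:>2}-EDO: fifth = {steps} steps, error {err:+.2f} cents")
print(f"Pythagorean comma: {pythagorean_comma:.2f} cents")
```

In 12-EDO the fifth is seven steps and falls short of 3:2 by about 2 cents. Smaller systems such as 5 and 7 miss by a much wider margin, and a comparably accurate fifth in this list only reappears at far higher step counts such as 53.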

This is the Dual-Patch Theory. The auditory cortex likely has two competing maps: one for Periodicity (pitch height/symmetry) and one for Harmonicity (spectral matching). 12-EDO is one of the few systems where these two maps overlap cleanly.

Importantly, this isn’t about consonance per se. Consonance is contextual and plastic. This is about structure and how categories emerge, how the brain carves up a cycle into meaningful segments.

Resolution Isn’t Enough: From JND to Musical Meaning


Even with modern tuning systems (19, 22, 31, 72) where we can finely sculpt intervals, not every perceptible difference becomes a category.

This is the core distinction:

  • JND (Just Noticeable Difference): “Can I tell this is different?”
  • JNM (Just Noticeable Meaning), or categorical distinctiveness: “Does this difference mean something?”

This distinguishes psychophysics (what the ear can do) from semantics (what the mind acts upon). This is the missing link in microtonal theory.
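To see that resolution alone cannot explain the small number of categories, one can compare the step size of several equal divisions against the best-case 5-cent discrimination figure quoted at the start of this chapter. This is only a sketch: the 5-cent value is an ideal-condition figure, and the function name is hypothetical.

```python
JND_CENTS = 5.0   # best-case pitch discrimination, as quoted in this chapter

def step_size_cents(edo):
    """Size of one step in an equal division of a 1200-cent octave."""
    return 1200.0 / edo

EDOS = (5, 7, 12, 19, 22, 31, 72)

for edo in EDOS:
    step = step_size_cents(edo)
    print(f"{edo:>2}-EDO: step {step:6.2f} cents = {step / JND_CENTS:4.1f} JNDs")
```

Even in 72-EDO every step is several JNDs wide, so each one is audibly different from its neighbours. Whatever keeps those steps from becoming categories, it is not a failure of hearing.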

Tonal constancy plays a key role here. It resists giving category status to subdivisions that don’t contribute to the known structure. It insists on interpreting ambiguous or fine distinctions as versions or “flavors” of known categories, not entirely new ones. This is why microtonal intervals so often get perceived as “bent” versions of 12-tone intervals, unless the tuning is radically unfamiliar or the anchor density is too low to allow reinterpretation.

So we get plateaus: 5, 7, 12. These are systems where the perceptual return on complexity is high—where each added step adds not just a difference, but a function.

Conclusion: The Brain’s Great Synthesis

Our musical categories arise not from arbitrary cultural evolution nor fixed laws of physics. They are emergent solutions, stable balances between two incompatible demands:

The brain’s love of symmetry, compression, and predictability.

The world’s asymmetrical offerings, in the form of acoustic resonances and harmonic structure.

The cycle gives us the container. These two forces tell us how to divide it. And Tonal Constancy is the mechanism that enforces these learned partitions—interpreting incoming sound as near or far, familiar or strange, anchored or drifting.

The categories we use are not infinite, not because we can’t tell the difference—but because only some differences rise to the level of meaning. That is the blueprint of pitch. A perceptual system always balancing what it could sense with what it must make sense of.






Neural Mechanisms and Predictive Models of Tonal Constancy

Neuroimaging and electrophysiological studies reveal that specific regions of the auditory cortex selectively respond to structured sound patterns such as speech, melody, and harmonic sequences, but remain relatively inactive during exposure to unstructured noise. Notably, areas such as the planum temporale, located posterior to the primary auditory cortex, appear to engage dynamically when pitch structures exhibit internal regularities, even when those regularities are statistically subtle or culturally learned.

These regions are not merely passively decoding incoming sound—they participate in an active predictive process. The brain constructs internal models of melodic or harmonic progression, and generates expectations for future events. When a pitch contour unfolds predictably, it minimizes error between the expected and actual input, triggering dopaminergic reward responses in associated circuits such as the nucleus accumbens. These reward-linked responses, observed even in anticipation of musical climaxes, suggest that successful pattern prediction is inherently satisfying, reinforcing the learned tonal templates over time.

Such findings align well with a Bayesian perspective: the brain updates internal priors based on the statistical structure of the sound environment, forming what we might call tonal basins—perceptual attractors that stabilize around culturally salient pitch configurations. These basins guide pitch interpretation even when the physical signal is ambiguous, distorted, or derived from non-standard tunings (e.g., 7-EDO or inharmonic timbres). The result is tonal constancy, rooted in statistical expectation, contextual prediction, and hierarchical sensory processing.
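A toy version of this Bayesian updating can be written in a few lines. The sketch below is an illustration under strong simplifying assumptions, not a model from the literature: only the 12 major keys are considered, the prior is flat, and each heard pitch class is treated as far more likely if it belongs to the key's scale (the 0.9 figure is arbitrary).

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)

def key_posterior(pitch_classes, in_scale_prob=0.9):
    """Posterior over the 12 major keys after a sequence of pitch classes."""
    posterior = [1 / 12] * 12                 # flat prior (assumption)
    p_in = in_scale_prob / 7                  # each of the 7 scale tones
    p_out = (1 - in_scale_prob) / 5           # each of the 5 non-scale tones
    for pc in pitch_classes:
        likelihood = [p_in if (pc - tonic) % 12 in MAJOR_SCALE else p_out
                      for tonic in range(12)]
        unnorm = [p * l for p, l in zip(posterior, likelihood)]
        total = sum(unnorm)
        posterior = [u / total for u in unnorm]
    return posterior

# G7 -> C as pitch classes: G B D F, then C E G
heard = [7, 11, 2, 5, 0, 4, 7]
post = key_posterior(heard)
best = max(range(12), key=post.__getitem__)
print(best, round(post[best], 3))
```

After these seven notes the posterior concentrates on tonic 0 (C major), the only major key containing both B and F. A learned, non-uniform prior would simply start the same process from a distribution that already favours familiar keys.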

Given these parallels, we may consider whether concepts from statistical mechanics or nonlinear dynamical systems could be adapted to describe perceptual behavior in pitch space. Several potential analogies emerge:

Neural/Perceptual Concept   || Physical/Mathematical Analog:

Tonal template / basin      || Potential well in an energy landscape
Contextual pitch trajectory || Particle path with momentum
Dopaminergic reinforcement  || Entropy minimization with energy input


Such analogies are speculative, but not without foundation. Perception is inherently path-dependent, sensitive to both immediate context and long-term learning. The momentum of pitch trajectories, as previously discussed, may correspond to cumulative Bayesian updating or even to forms of inertial processing, where prior motion in tonal space biases future interpretations.
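The potential-well analogy can be made tangible with a toy energy landscape. The sketch below is purely illustrative: it assumes a cosine-shaped landscape with one well per 12-EDO step, and lets a pitch slide downhill by gradient descent. The landscape, the rate, and the function name are all assumptions made for the example, not measured quantities.

```python
import math

def relax_to_basin(cents, step=100.0, rate=100.0, iterations=200):
    """Slide a pitch downhill on E(x) = -cos(2*pi*x/step) until it settles."""
    x = cents
    for _ in range(iterations):
        # dE/dx for the cosine landscape; it is zero at every multiple of `step`
        gradient = (2 * math.pi / step) * math.sin(2 * math.pi * x / step)
        x -= rate * gradient
    return x

for heard in (730.0, 385.0, 1162.0):
    print(heard, "->", round(relax_to_basin(heard), 3))
```

A pitch 30 cents sharp of a fifth settles at 700, while one 62 cents above 1100 crosses the ridge and settles at 1200. A context-dependent model would deepen or widen individual wells as evidence accumulates, which this static landscape does not attempt.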

Tonal constancy may not yet be a widely codified term in music cognition, but the phenomena it describes lie at the heart of musical experience. From the brain's predictive machinery to the statistical learning of pitch spaces, it offers a compelling bridge between acoustic structure, cultural form, and neural function. As we continue to unravel the relationship between sound, meaning, and expectation, tonal constancy may prove to be not just a perceptual curiosity, but a fundamental principle in how humans find coherence in the musical world.


draft



Exhibit A: The Anti-Randomness Engine

These musical examples belong to a larger work in which I explored the role of randomness in pitch selection—beginning with the question of what randomness even means in a musical context. For the purposes of that study, I treated randomness as non-intentional design.

The research (see [link]) investigates in depth why apparently random pitch sets remain musically functional, often without producing the sense of “oddity” one might expect. While this section does not include all of those examples—some pitch systems are far less straightforward to analyze than the 7-, 8-, or 10-EDO cases presented here—the broader study incorporates sets derived from planetary data, mathematical functions, noise distributions, and other sources.

What unites these varied systems is that, despite escaping the 12-tone grid in every permutation and defying conventional tuning logic, they still exhibit traditional musical utility. In practice, many of them do not even sound “microtonal.” Instead, musical intent, cognitive expectation, and perceptual organization stabilize them, allowing listeners to hear them as familiar or “normal.”

The key finding is simple:
Uniform subdivisions of the perceptual cycle—even allowing for clustering or irregular spacing—still guarantee tonal functions.

This “anti-randomness engine” illustrates how music perception is less about strict mathematical grids and more about how the mind organizes trajectories into meaningful tonal categories.

Conclusion: The Architecture of Alienation

This study began by asking if randomness destroys musical coherence. The evidence suggests the opposite: randomness often ensures it.

We find that Randomness and High Density function as a "perceptual mirror." The sheer statistical spread of random systems provides enough "anchors" (familiar intervals approximating 3:2 or 2:1) that the brain's error-correction mechanism (Tonal Constancy) can effectively project a 12-tone grid onto the chaos. The listener does not hear the randomness; they hear the subset of it that resembles what they already know.

The creation of genuine "new notes" does not arise from chaos, but from Specific, Alien Structure. It emerges in systems like 5-EDO, 8-EDO, or non-octave scales (Bohlen-Pierce), where the internal geometry is so rigid and "induodecimable" that the brain is forced to abandon its diatonic priors.

In the end, to escape the gravity of the 12-tone system (learned or not), one cannot simply roll the dice. One must build a new geometry that is robust enough to stand on its own.






Start with 12edo music: how far can you stretch the interval sizes without distorting their meaning? Where do they switch categories?

It seems that if each pitch in a 12edo set is randomly varied within ±20 cents, the structure remains perceptually stable: we still hear it as “12edo.” These are duodecimable tunings: systems that, despite numerical deviations, are still interpreted as 12-tone.
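The simplest, purely numerical reading of this claim can be simulated. The sketch below assumes a fixed categorical window of ±50 cents around each 12-EDO step (nearest-step rounding) and ignores context entirely; the function names, the seed, and the ±60-cent comparison case are illustrative choices.

```python
import random

def nearest_12edo_step(cents):
    """Index of the closest 12-EDO category."""
    return round(cents / 100.0)

def jitter(scale_cents, max_dev, rng):
    """Detune every pitch independently by up to +/- max_dev cents."""
    return [c + rng.uniform(-max_dev, max_dev) for c in scale_cents]

def is_duodecimable(original, detuned):
    """True if every detuned pitch still falls in its original category."""
    return all(nearest_12edo_step(o) == nearest_12edo_step(d)
               for o, d in zip(original, detuned))

rng = random.Random(0)
chromatic = [100.0 * k for k in range(13)]
trials = 1000
stable_20 = sum(is_duodecimable(chromatic, jitter(chromatic, 20, rng)) for _ in range(trials))
stable_60 = sum(is_duodecimable(chromatic, jitter(chromatic, 60, rng)) for _ in range(trials))
print(f"+/-20 cents: {stable_20}/{trials} tunings keep every category")
print(f"+/-60 cents: {stable_60}/{trials} tunings keep every category")
```

With ±20 cents no pitch can leave its bin, and even the interval between two detuned notes is off by at most 40 cents, so interval categories survive as well. This static window is only the baseline; the point of the following paragraphs is that real listening moves the boundaries.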

This resilience might be due to biological factors (like pitch discrimination thresholds and auditory memory) and/or cultural familiarity. But even that’s not the whole story.

When you introduce pitch trajectory, motion, gesture, melodic phrasing, the categories become even more flexible. A note that might have sounded “too far” statically can function perfectly well within a musical phrase, even shifting its perceived identity based on context.

So, duodecimability is not just a mathematical condition. It’s contextual and perceptual. The brain doesn’t simply match frequencies, it interprets relationships, motion, expectation, and structure.

This has real consequences for microtonal music and composition. Flavor is one thing, the unique color of a tuning, but structural category is another. Tonal constancy means that pitches don’t just sound similar; they mean similarly, and meaning is shaped by use.

In this light, new tonal spaces aren't just a matter of dividing the octave differently. They demand both careful pitch selection and deliberate usage, exploiting or bypassing tonal constancy depending on the desired perceptual effect. Composition becomes an active negotiation between new materials and old cognitive habits.

Meaning-to-Noise Ratio (MNR)

MNR asks how much of the perceived structure carries interpretable, repeatable form, and how much is background variation or "interpretive entropy".

Lower MNR → ambience, texture
Higher MNR → form, hierarchy, prediction:

±5 cents noise → likely still high MNR
±25 cents → MNR drops, unless trajectory/symmetry/surface redeems it
Non-periodic microtonal sets → potentially high MNR if duodecimable, low if not



References / Further Reading:
Albert Bregman - Auditory Scene Analysis (1990)
Diana Deutsch - research on auditory illusions (1999)
David Temperley - The Cognition of Basic Musical Structures
Maurice Merleau-Ponty - Phenomenology of Perception
Don Ihde - Listening and Voice: Phenomenologies of Sound
Easley Blackwood - research on EDOs
William Sethares - Tuning, Timbre, Spectrum, Scale
Burns & Ward (1978) - Categorical Perception of Musical Intervals
Plack et al. (2005) - Pitch: Neural Coding and Perception
Zatorre & Halpern (2005) - Brain regions for pitch category memory
Tillmann et al. (2000) - Implicit Learning of Tonal Structure
Fred Lerdahl - Tonal Pitch Space (2001)



META:

There’s a surprisingly clean analogy between tonal cognition and quantum amplitude amplification, especially Grover-style search dynamics.

The Quantum Analogy for Tonal Cognition

We can think of the listener’s tonal state as a probability distribution over possible tonal interpretations. At any moment while listening, the brain maintains something like:

“maybe we’re in major”
“maybe the tonic is here”
“maybe this interval is a leading tone”
“maybe this is non-tonal”

Initially, this is a superposition of tonal hypotheses. Nothing is collapsed yet.

Tonal Constancy as The Quantum Prior:

Before any music is played, the distribution is not uniform.
The 12-tone system has been heavily learned, so the listener begins with amplitude concentrated in the familiar states.

This is exactly like starting a quantum search with a biased initial state:
some outcomes begin with higher amplitude because the system “expects” them.

Musical Events as Oracle Calls:

Grover’s algorithm works in two repeated steps: an oracle marks the desired state by flipping its phase, and a diffusion step then reflects all amplitudes about their mean, amplifying the marked state’s probability.
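For readers unfamiliar with the algorithm, the amplification step is easy to simulate classically for a small state space. The sketch below tracks real-valued amplitudes over 8 abstract states; it shows the standard oracle-plus-diffusion iteration only, and says nothing by itself about perception. The state count, the marked index, and the function name are arbitrary.

```python
import math

def grover_probabilities(n_states, marked, iterations):
    """Simulate Grover iterations on a uniform superposition of n_states."""
    amp = [1 / math.sqrt(n_states)] * n_states
    for _ in range(iterations):
        amp[marked] = -amp[marked]               # oracle: phase flip on the marked state
        mean = sum(amp) / n_states
        amp = [2 * mean - a for a in amp]        # diffusion: reflect about the mean
    return [a * a for a in amp]

probs = grover_probabilities(n_states=8, marked=3, iterations=2)
print([round(p, 3) for p in probs])
```

Starting from a probability of 1/8, two iterations push the marked state to roughly 0.945. In the analogy, the "states" would be tonal hypotheses and each musical event would play the role of one partial iteration.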

In perception, each musical event acts like a partial oracle:

A dominant-seventh chord “marks” certain tonal hypotheses as more likely.
A cadential motion boosts the amplitude of one key over others.
A surprising chord reduces amplitude on the previously-favored hypothesis.

Each event “flips the phase” of certain cognitive candidates, not merely as a figure of speech but in the sense that it changes the sign or slope of prediction errors in the auditory predictive model.

This prevents small deviations from accumulating into noise.
The tonal interpretation is kept on track by repeated amplitude steering.

Musical Motion as Amplitude Amplification

The longer a piece stays within a coherent grammar, the more the brain amplifies the probability of its tonal hypothesis.

A cadence is basically: amplitude amplification + hard collapse.

Before the cadence: a superposition of likely tonal centers
During the cadence: rapid amplification of the intended tonic’s probability
After the cadence: the system collapses into “we’re in this key”

This is not mystical; it matches Bayesian perceptual update models.

But the quantum analogy is strong because: the system updates globally, not note-by-note; predictions interfere with new input; the result is a winner-take-all collapse into a single tonal state.

Duodecimability as Quantum Error Correction

Grover’s search tolerates small perturbations: if the amplitudes shift slightly or the oracle is only mildly noisy, the algorithm still tends to recover the right solution.

Similarly:

Even if the pitches are wrong, if the notes drift, if the tuning is warped, the brain maintains the correct tonal category until the error exceeds a threshold.

12-EDO acts like a cognitive error-correcting code: the system tries to repair drift back into categorical bins.

That’s why Happy Birthday drifts in pitch continuously but stays in the same contour/category space: the “oracle” (melodic expectation) keeps re-correcting.
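The drifting-singer case can be sketched as a decoding problem. The example below assumes the listener categorizes each melodic step relative to the previous note (nearest whole number of semitones) and that the singer's reference shifts by a bounded random amount on every note; the ±30-cent drift, the seed, and the function names are illustrative assumptions.

```python
import random

def decode_intervals(pitches_cents):
    """Categorize each melodic step as the nearest whole number of semitones."""
    return [round((b - a) / 100.0) for a, b in zip(pitches_cents, pitches_cents[1:])]

def sing_with_drift(semitones, drift_per_note, rng):
    """Render a melody whose reference pitch wanders a little after every note."""
    sung, offset = [], 0.0
    for s in semitones:
        sung.append(100.0 * s + offset)
        offset += rng.uniform(-drift_per_note, drift_per_note)
    return sung

# First two phrases of "Happy Birthday", in semitones above the first note
melody = [0, 0, 2, 0, 5, 4, 0, 0, 2, 0, 7, 5]

rng = random.Random(1)
sung = sing_with_drift(melody, 30.0, rng)
intended = decode_intervals([100.0 * s for s in melody])
decoded = decode_intervals(sung)
print(decoded == intended, f"final offset {sung[-1] - 100.0 * melody[-1]:+.1f} cents")
```

Because each step is re-categorized against the note before it, the accumulated offset never enters the decision: the contour is recovered exactly as long as no single step drifts by 50 cents or more.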

Induodecimable Systems as Decoherence

When you enter 5-EDO, Bohlen-Pierce, or 8-EDO, predictions fail, categorization becomes impossible, the “oracle” cannot mark the expected tonal states, and the amplitude distribution becomes uniform or chaotic.

This is analogous to quantum decoherence: the superposition no longer evolves according to its internal logic because the environment (the music) is incompatible with the system’s internal basis.

At that point the brain resets and builds a new basis, a new grammar.

This gives a high-level parallel:

High duodecimability: coherent evolution in a stable basis.
Low duodecimability: decoherence and basis reconstruction.




----tonal cognition (draft)

