Tuesday, November 5, 2024

Interval Reduction

This page is dedicated to the interval reduction operation, a foundational concept in music theory that I’ve explained briefly in other articles where it often plays a key role. Interval reduction is a universal yet frequently misunderstood process, with "octave reduction" as one specific example. Here, I offer a formal definition and consistent mathematical notation for interval reduction, aligning it with the notation developed for the interval matrix. This approach provides a basis for broader generalization in music analysis and tuning theory.

Note on Ratio Notation: In music theory literature, ratios and fractions are often used interchangeably, particularly when discussing relationships like string lengths, frequencies, harmonics, or subharmonics. This overlap can lead to ambiguity, especially when referring to intervals without specifying direction. For instance, while octave equivalence remains clear across notations (1:2, 2:1, 1/2, 2/1), we may lose the specific octave reference (above or below) if this is not explicitly noted. The problem is worse with 2:3, 3:2, 2/3, and 3/2: without a stated convention, we cannot be sure whether the interval is a fifth or a fourth.

To address this, we follow the convention that treats ratios like 4:5:6 as indicative of pitch relationships, where each number represents a multiple of a fundamental (1). For example, in this format, 4:5:6 represents a major chord as the intervals 1/1, 5/4 and 6/4=3/2.

Examples:
- Octave (second harmonic): Ratio 1:2, Fraction 2/1, Decimal 2
- Second Subharmonic: Ratio 2:1, Fraction 1/2, Decimal 0.5
- Fifth: Ratio 2:3, Fraction 3/2, Decimal 1.5
- Fourth (below unison): Ratio 3:2, Fraction 2/3, Decimal 0.666...

This notation standard helps clarify both direction and pitch relationships, reducing ambiguity when discussing intervals and chords across different contexts.
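The convention can be made mechanical. The following is only an illustrative sketch: the helper names `ratio_to_fraction` and `chord_to_intervals` are assumptions made for this example, not part of any existing library.

```python
from fractions import Fraction

def ratio_to_fraction(ratio):
    """Read 'a:b' as 'from a to b' and return the fraction b/a."""
    a, b = (int(part) for part in ratio.split(":"))
    return Fraction(b, a)

def chord_to_intervals(ratio):
    """Read 'a:b:c' as multiples of a common fundamental and
    return every member relative to the first one."""
    members = [int(part) for part in ratio.split(":")]
    return [Fraction(m, members[0]) for m in members]

print(ratio_to_fraction("2:3"))   # 3/2, a fifth above
print(ratio_to_fraction("3:2"))   # 2/3, a fourth below unison
print([str(i) for i in chord_to_intervals("4:5:6")])  # ['1', '5/4', '3/2']
```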

Interval Definition:
To avoid ambiguity, interval here refers specifically to a musical interval, where values represent multiples of a fundamental frequency, typically normalized to a unison. When referring to a mathematical range, such as \((1,2]\), we will use the term space.
This distinction helps clarify references to musical intervals, such as the octave (a frequency multiplier with a value of 2), as distinct from the octave space, defined as any interval \((a,2a]\) for \(a \in \mathbb{R^+}\).

Interval Reduction

Definition:
Interval Reduction is a scaling transformation that maps a positive real number \(x\) into the bounds of a specified space, denoted \((1,b]\), by repeatedly multiplying or dividing \(x\) by \(b\) until \(x\) falls within \((1,b]\). Here, \(b,x \in \mathbb{R}^+ \) with \(b > 1\) and \(x \notin (1,b]\).

Notation:
The interval reduction operation can be represented as a mapping into a modular "interval space." We use the following notation to generalize this process concisely:

Using the mod operator:
\[x \bmod 1:b = [x] \] where \([x]\) represents the equivalence class or representative of \(x\) within the \((1,b]\) space.

This notation mirrors modular arithmetic while specifying that it applies to interval reduction:
\[ x \equiv [x] \pmod {1:b} \] To capture all equivalence classes in a similar form:
\[ xb^n \equiv [x] \pmod {1:b},\, n \in \mathbb{Z} \] This expression shows that the fifth 2:3 is the class representative in octave space of the third harmonic (1:3), the sixth (1:6), twelfth (1:12), (1:24), (1:48), and so on, with respect to unison.
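The equivalence can be checked with exact arithmetic. This is a minimal sketch: the name `reduce_interval` is an assumption for illustration, and unlike the strict definition it simply returns \(x\) unchanged when \(x\) already lies in \((1,b]\).

```python
from fractions import Fraction

def reduce_interval(x, b):
    """Scale x by powers of b until it lies in (1, b]."""
    x, b = Fraction(x), Fraction(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    while x > b:
        x /= b
    while x <= 1:
        x *= b
    return x

# The third harmonic and its octave multiples share one representative.
print([str(reduce_interval(h, 2)) for h in (3, 6, 12, 24, 48)])
# ['3/2', '3/2', '3/2', '3/2', '3/2']
```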

Example Application:

For \( x=48, b=2\):
\[48 \bmod 1:2 = 1.5 = 3/2 \text{ (Fifth)}\]

Normalization for Other Ranges:

To perform interval reduction in ranges other than \((1,b]\), normalize the space as follows:

\(x \bmod a:b \Rightarrow x \bmod 1:b/a \)

For example, with \(x=7, a=4, b=5\):
\[7 \bmod 4:5 = 7 \bmod 1:5/4 = 1.174\ldots\]
This approach is particularly clear for rational values in the space. We avoid directly writing \(7 \bmod 5/4\) to prevent confusion with traditional modular arithmetic; using ratio notation, \(7 \bmod 4:5\) explicitly denotes interval modularity, simplifying it as \(7 \bmod 1:5/4\). For spaces involving irrational values, we write the ratio starting from 1, such as \(5 \bmod 1:\sqrt{2}\), to maintain clarity.
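As a sketch of the normalization step, the space \(a:b\) can be handled by reducing in \(1:b/a\). The names `reduce_interval` and `reduce_in_space` are assumptions for this example; exact fractions are used so the result can be compared with the decimal above.

```python
from fractions import Fraction

def reduce_interval(x, b):
    """Scale x by powers of b until it lies in (1, b]."""
    x, b = Fraction(x), Fraction(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    while x > b:
        x /= b
    while x <= 1:
        x *= b
    return x

def reduce_in_space(x, a, b):
    """x mod a:b, computed as x mod 1:(b/a)."""
    return reduce_interval(x, Fraction(b) / Fraction(a))

result = reduce_in_space(7, 4, 5)
print(result, float(result))  # 458752/390625 1.17440512
```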

The process can also be expressed explicitly using logarithms and the classic mod operator:
\[ x \bmod a:b = (b/a)^{\log_{b/a}(x) \bmod 1} \]
Example:
\[5 \bmod 1:2 = 2^{\log_2(5) \bmod 1} = 5/4 = 1.25 \text{ (Major third)}\]
In this case, we map the fractional part of the logarithm (base 2) of 5 into the octave cycle.

Although interval reduction is defined procedurally, through iterative scaling rather than a single closed form, it can be expressed as a function, especially within octave space. This space is particularly relevant to music theory as it relates to chroma equivalence and pitch grouping in human perception. Here, the "octave reduction" is the chroma function:
\[\Xi(x) = x \bmod 1:2 = 2^{\log_2(x) \bmod 1}\]
This function is a special case of our interval reduction process, where \(b = 2\). This "octave reduction" maps any positive number to its equivalent within the octave cycle, representing its chroma.
We can extend this idea to define a general chroma function \(\Xi_{1:b}(x)\), where \(b\) defines the specific interval space:
\[\Xi_{1:b}(x) = x \bmod 1:b\]
Example, for \( x = 7, b = 5/4\):
\[\Xi_{1:5/4}(7) = \Xi_{4:5}(7) = 1.17440512\]
Therefore, \(\Xi_{1:b}(x)\) effectively represents interval reduction within a given space. The chroma function \(\Xi(x)\), previously used in my work, is a specific case of this more general interval reduction function: written without parameters, it defaults to octave space.

The choice of notation can depend on context. For example, if one simply needs to find the chroma of a pitch within a tuning system, the compact chroma function provides a straightforward approach. However, the modular notation for interval reduction clarifies the process when building tuning systems and can also be applied in other mathematical contexts. For instance, in \( x \bmod 1:b = [x]\), the modularity is evident by viewing the operation as \(x \bmod 1:b = x/b^n\) , where \(n\) is the integer that scales \(x\) into the space \((1,b]\). While our primary interest may be in the result, here [x], the value of \(n\) and the sequences it generates with various inputs form the basis of the logarithm algorithm I introduced in [link].
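To make the role of \(n\) concrete, the reduction can return the exponent together with the representative. This is a hypothetical helper (`reduce_with_exponent` is a name chosen here), sketched under the assumption that \(n\) counts divisions by \(b\) as positive and multiplications as negative, so that \(x = [x]\,b^n\).

```python
from fractions import Fraction

def reduce_with_exponent(x, b):
    """Return (n, rep) with x == rep * b**n and rep in (1, b]."""
    x, b = Fraction(x), Fraction(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    n = 0
    while x > b:      # too large: divide and count upwards
        x /= b
        n += 1
    while x <= 1:     # too small: multiply and count downwards
        x *= b
        n -= 1
    return n, x

for value in (3, 5, 48, Fraction(1, 3)):
    n, rep = reduce_with_exponent(value, 2)
    print(value, n, rep)
# 3 1 3/2
# 5 2 5/4
# 48 5 3/2
# 1/3 -2 4/3
```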

Coding:
Most programming languages support logarithmic functions, so the chroma function can be implemented concisely. For example, in Python:

import math

def chroma(x):
    return 2 ** (math.log2(x) % 1)

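This implementation uses logarithmic reduction to map x to its equivalent within the octave. Note that the fractional part of the logarithm lies in [0,1), so the result lies in [1,2): powers of 2 map to 1 rather than 2, unlike the (1,2] convention used above. For a more general approach that applies to any interval space (1,b], here’s the Python code to perform interval reduction for x mod 1:b by iterative scaling: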

def interval_reduction(x, b):
    # Ensure inputs are positive real numbers and x is outside (1, b]
    if x <= 0 or b <= 1 or (1 < x <= b):
        raise ValueError("Ensure x > 0, b > 1, and x is not within (1, b].")

    # Apply interval reduction by scaling x within the (1, b] range
    while x > b:
        x /= b
    while x <= 1:
        x *= b
       
    return x
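For comparison, the logarithmic form can also be generalized to any base \(b\). The sketch below is an assumption-laden variant, not the code used elsewhere on this page: `chroma_general` is a hypothetical name, it accepts any \(x > 0\) (including values already in range), and it moves a result of 1 to \(b\) so that the output follows the \((1,b]\) convention. Being floating point, results are approximate.

```python
import math

def chroma_general(x, b=2.0):
    """Log-based x mod 1:b. The raw formula lands in [1, b), so a
    result of 1 is moved to b to match the (1, b] convention."""
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    y = b ** (math.log(x, b) % 1)
    return b if math.isclose(y, 1.0) else y

print(round(chroma_general(48), 9))         # about 1.5
print(round(chroma_general(7, 5 / 4), 8))   # about 1.17440512
```

The iterative version is exact for rational inputs when used with fractions, while this form needs only one logarithm regardless of how far \(x\) is from the target space.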

Tuesday, August 20, 2024

The Interval Matrix


DRAFT
This article introduces the concept of the interval matrix from a traditional music theory perspective, alongside a software tool designed to create and visualize these matrices. In this context, intervals refer to proportions or ratios between numbers.

The interval matrix is built from all possible representations of a set's values under an equivalence relation, using each element as a base, resulting in a numerical or geometrical table—a matrix—that represents this expansion.

These matrices are not initially intended for conventional matrix operations; the focus lies in the geometric structure that emerges from different sets and their elements' relationships.

Interval Matrix software. Prime numbers up to 19 (set to periodic), with equivalence 1:2 (octave-space)
\(\mathbf{Ä}_{1:2}(P_{19})\)


For an infinite set, the matrix cannot be fully generated. However, if the set has a repeating pattern (period), a minimal generating set can be identified. The matrix is then built and completed from this minimal set as an n-by-n matrix, as is the case for a common musical tuning system (a set of pitches or rhythms used to create or perform music).

This period typically becomes the primary equivalence relation (equave) parameter in the set's function for constructing the matrix and analyzing the intervals within.

The matrix can be constructed for a finite set that isn't meant to repeat. For example, in music, this approach can be used to analyze notes on an instrument where there's no indication to continue calculating additional pitches. This method applies to any finite set. In a finite matrix, each row contains one element less than the previous row.

Set and matrix construction:

For analyzing a set \(S\) that is already normalized and within the desired range—such as in any pre-calculated musical tuning system—the set remains unaltered, and the matrix is built directly \(\mathbf{A}(S)\). The only required parameter is its periodicity: Is the given set a minimal generating set of an infinite set, or does it represent a fixed, finite number of elements?

Most examples here will use periodic matrices. To denote matrix periodicity or non-periodicity, we might use different notation, such as \(\mathbf{Ä}\) for periodic matrices and \(\mathbf{A}\) for non-periodic ones.

The generalization of the interval matrix construction allows us to relate different sets and reductions, enabling us to find congruences between systems. The reduction function (which corresponds to the chroma function when the space is the octave, 1:2) for a real matrix, where the set consists of any real numbers, operates as follows:

The absolute value of each element is taken, and the function then returns this value, reduced or remapped (if necessary) by an equivalence relation:

For a value \(s_x\) larger or smaller than the chosen equivalence relation \(r\), it is reduced to a new element \(\tilde{s}_x\) by applying the operation:

\(\tilde{s}_x = |s_x| \bmod 1:r\)

(This uses the mod symbol because it effectively returns the intervallic remainder. This process involves repeatedly multiplying or dividing \(s_x\) by \(r\) until \(s_x\) falls into \((1, r]\) space. This page has details about interval reduction.)

Since the matrix is defined by reinterpreting the set values with each element as the base, all rows inherently start with 1. Consequently, the reduction, or normalization, is consistently performed as \(\bmod 1:r\)

Optional: A constant \(\delta\) may be applied to each element of the set before performing the base change. (This is relevant for other uses explained in another article.)

The reduction can be notated and performed for sets \(S_{1:r}\) without considering any matrix. It can also be used in constructing the matrix, \(\mathbf{A}_{1:r}(S)\), which implies both reduction and base shifting.

Example: If \(S\) = {1, 2, 3, 5}, then \(S_{1:2}\) ​= {1, 3/2, 5/4, 2}, and \(\mathbf{Ä}_{1:2}(S)\) would yield [{...},{...},{...},{...}]. (reduced and periodic)

Interval Matrix Definitions:

  • Full Interval Matrix: \(\mathbf{A} = \mathbf{A}_{s_n}^{\delta}\)
    This matrix uses the last or largest element of its generating set as the equivalence relation.
  • Local Interval Matrix: \(\mathbf{A}_{s_i}^{\delta}\)
    This matrix uses any element within the generating set as the equivalence relation, except for the largest one.
  • External Interval Matrix: \(\mathbf{A}_{x}^{\delta}\)
    This matrix uses a value outside the generating set as the equivalence relation. 


A full interval matrix built from a periodic set is inherently a symmetric matrix.

A full or local interval matrix is not "useful" for isotropic sets (where the chosen period or relation is a member of the set). This leads to identical and overlapping shifts of the elements.

Musical Interpretation:
For example, the 12-tone equal temperament \(\text{12ed2}\) guitar is an interval matrix (incomplete) representing the infinite set generated by the constant \(2^{1/12}\). Each row is shifted by five elements from the previous row (except between \(\text{G}\) and \(\text{B}\), where the shift is four). The matrix is trivial for this set's intervallic analysis, as columns (frets) are always aligned regardless of the shift or element taken as base.

Interval matrices are typically shifted by one element until they are complete.

Consider this group: \(\langle 2, 3 \mid 3^2 = 1 \rangle \). This represents a set of infinite fifths and octaves. One of its minimal generating sets is \(S\) = (1, 3/2, 2]. The resulting matrix \(\mathbf{Ä}_{1:2}(S)\) has only two rows:

(1,  3/2,  2]
(1,  4/3,  2]
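One possible reading of the periodic construction can be sketched in a few lines. The function names are assumptions for illustration, the leading 1 of each row is omitted, and each base contributes the equave itself through its own reduction (1 maps to \(r\) in the \((1,r]\) space).

```python
from fractions import Fraction

def reduce_interval(x, b):
    """Scale x by powers of b until it lies in (1, b]."""
    x, b = Fraction(x), Fraction(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    while x > b:
        x /= b
    while x <= 1:
        x *= b
    return x

def periodic_interval_matrix(S, r):
    """One row per base s_i: every s_j / s_i reduced into (1, r], sorted."""
    S = [Fraction(s) for s in S]
    return [sorted(reduce_interval(s / base, r) for s in S) for base in S]

for row in periodic_interval_matrix([Fraction(1), Fraction(3, 2)], 2):
    print([str(v) for v in row])
# ['3/2', '2']
# ['4/3', '2']
```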

Interval Matrix Accumulation: \(\text{Acc}(\mathbf{A}(S))\)

This is a new set with all the representations of the elements under the set's equivalence relation. Left unfiltered, it may repeat values, which helps to find prevalent proportions. Isotropic sets always have an accumulation identical to any of their matrix rows. (The accumulation is a vectorization or flattening of the matrix.)

In this case, the infinite set generated by \(\langle 2, 3 \mid 3^2 = 1\rangle\) = { ..., 1/2, 2/3, 1, 3/2, 2, ...} has an interval accumulation (under the equivalence 1:2):  (1, 4/3, 3/2, 2].

The distinction between full, local and external interval accumulations reflects the matrix type.

For example, consider a local matrix \(\mathbf{A}_{1:2}(S)\) constructed from the set {1, 2, 3, 4} in octave space (with an equivalence relation of 1:2). The local accumulation would be:

\(\text{Acc}(\mathbf{A}_{1:2}(S))\) = {1, 4/3, 3/2, 2} (filtered, with non-repeated values)

To obtain the global or full accumulation, the space is set to the largest element in the set. Thus, the matrix built from the set {1, 2, 3, 4} under the equivalence relation 1:4 would yield:

\(\text{Acc}(\mathbf{A}(S))\) = {1, 4/3, 3/2, 2, 8/3, 3, 4} (filtered)
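Both accumulations can be reproduced with a short sketch. It assumes the periodic reading of the matrix (every element divided by every base, then reduced), uses hypothetical function names, and omits the implicit 1 from the output.

```python
from collections import Counter
from fractions import Fraction

def reduce_interval(x, b):
    """Scale x by powers of b until it lies in (1, b]."""
    x, b = Fraction(x), Fraction(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    while x > b:
        x /= b
    while x <= 1:
        x *= b
    return x

def accumulation(S, r):
    """Count every reduced interval s_j / s_i over all bases s_i."""
    S = [Fraction(s) for s in S]
    return Counter(reduce_interval(s / base, r) for base in S for s in S)

local = accumulation([1, 2, 3, 4], 2)   # equivalence 1:2
full = accumulation([1, 2, 3, 4], 4)    # equivalence 1:4
print([str(v) for v in sorted(local)])  # ['4/3', '3/2', '2']
print([str(v) for v in sorted(full)])   # ['4/3', '3/2', '2', '8/3', '3', '4']
```

The `Counter` keeps the unfiltered multiplicities, so `local.most_common()` lists the prevalent proportions first.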

For larger and more complex sets, the accumulation also provides a method for finding a possible natural mode of the set, if any.

Let’s take the pentatonic \(\langle 2, 3 \mid 3^5 = 1 \rangle\). A minimal generating set is { (1, 9/8, 81/64, 3/2, 27/16, 2/1] }, and its full matrix (omitting 1) is:

{9/8,  81/64, 3/2,    27/16, 2/1}
{9/8,    4/3, 3/2,    27/16, 2/1}
{9/8,    4/3, 3/2,    16/9,  2/1} Natural Mode
{32/27,  4/3, 3/2,    16/9,  2/1}
{32/27,  4/3, 128/81, 16/9,  2/1}

The natural mode of any set is the particular representation that includes the most frequent values appearing after shifts; it is the most faithful or weighted representation of the set.
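A possible way to compute this is sketched below. The scoring rule is an assumption made for this example: each row is scored by summing how often its values occur across the whole matrix, and the highest-scoring row is returned. Rows are generated in the order of the input set, which may differ from the row order shown in the table.

```python
from collections import Counter
from fractions import Fraction as F

def reduce_interval(x, b):
    """Scale x by powers of b until it lies in (1, b]."""
    x, b = F(x), F(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    while x > b:
        x /= b
    while x <= 1:
        x *= b
    return x

def natural_mode(S, r):
    """Return the matrix row whose values are the most frequent overall."""
    S = [F(s) for s in S]
    rows = [sorted(reduce_interval(s / base, r) for s in S) for base in S]
    counts = Counter(v for row in rows for v in row)
    return max(rows, key=lambda row: sum(counts[v] for v in row))

pentatonic = [F(1), F(9, 8), F(81, 64), F(3, 2), F(27, 16)]
print([str(v) for v in natural_mode(pentatonic, 2)])
# ['9/8', '4/3', '3/2', '16/9', '2']
```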


How the Interval Matrix App Works

It accepts a list of numbers, always treating them as a minimal generating set (for now).

If the list/set is already a reduced tuning system, the matrix is full and the equave (period, interval of equivalence) parameter should initially be set to match that of the set, typically the last and largest value. The app does not adjust it automatically.

The matrix displays for each element in each row: the original value inserted, the reduced value (if it was reduced), a delta value (if it was displaced), and a rational approximation of the value.

The delta value comes from the delta parameter, usually 0. This value is added to every element in the original set before the rest of the calculations. This is useful for understanding how a minimal set, while maintaining its original absolute difference between members, shapes through this change.

For example, you can start with period/equave 1:2, and this set {1,2,3} reduces to {1, 3/2, 2}, but with delta = 3, it becomes {4, 5, 6}, and reduced, {1, 5/4, 3/2, (2)}.
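The delta example can be reproduced with a small sketch. It does not reflect the app's actual code; `first_row_with_delta` is a hypothetical helper that adds delta, rebases on the first element, reduces into \((1,r]\), and drops duplicates. The leading 1 is omitted and the base itself appears as the equave.

```python
from fractions import Fraction

def reduce_interval(x, b):
    """Scale x by powers of b until it lies in (1, b]."""
    x, b = Fraction(x), Fraction(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    while x > b:
        x /= b
    while x <= 1:
        x *= b
    return x

def first_row_with_delta(S, r, delta=0):
    """Add delta to every element, rebase on the first one, reduce."""
    shifted = [Fraction(s) + delta for s in S]
    base = shifted[0]
    return sorted({reduce_interval(s / base, r) for s in shifted})

print([str(v) for v in first_row_with_delta([1, 2, 3], 2)])           # ['3/2', '2']
print([str(v) for v in first_row_with_delta([1, 2, 3], 2, delta=3)])  # ['5/4', '3/2', '2']
```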

Prime numbers up to 19. Delta = -1, octave-space.
\(\mathbf{Ä}_{1:2}^{-1}(P_{19})\)

The rational approximation has an adjustable tolerance value.

On top of the interval matrix, there is a configurable equal division ruler that helps with intervallic/ratio measures.

The chroma matrix has a fixed equivalence relation of an octave and, by default, starts at red. You can select whether the chromas displayed are absolute or relative to each row. When selecting relative, the full spectrum located in the bottom UI expands to display all the possible chroma shifts. (The full spectrum isn’t really "full"—you set a maximum space to occupy, with a logical maximum of the human hearing range.)

This last part is the most important when dealing with musical tuning systems; practical tuning systems have a simpler chroma matrix.

Unlike Scala files, the 1 must be inserted (remove it to understand what happens). You can, if you want, omit the equave in this list; it will be added (invisibly) from the equave parameter. However, it can be useful to keep it, for example when analyzing a non-octave tuning with an equave of 3 (tritave): if you want to inspect these intervals reduced to an octave, keeping it lets you track it. Note that when the set has an element equal to the equave, you will find two identical rows in the matrix.

Future Development

If you paid close attention to the code of this app and the SFINX app, you may have noticed that they use the same engine. That’s because, as I have pointed out, a guitar is essentially an interval matrix by string length.

My goal is to finally reunite both apps—SFINX was developed to aid in the graphic and diagram generation of scales for microtonal guitars, while the Interval Matrix was developed ideally for geometric analysis of sets and chromas.

(DRAFT)

Link to the apps:

jbcristian.github.io/xeneize/




Thursday, August 8, 2024

Pythagorean Scale ≅ Z/12Z ⊕ Z

A Group-Theoretic Framework for Musical Tuning Systems

This article explores the underlying mathematical structures of a specific type of musical tuning systems through the lens of group theory, offering a unified framework for understanding their generation and properties. While a multitude of tuning systems exist, many share a common foundation, historically referred to as "chaining/stacking and reducing/folding" or its linguistic equivalents (e.g., "encadenamiento y cancelación" in Spanish). 

This method, exemplified in the Pythagorean tuning system, involves repeatedly adding intervals, specifically fifths, and reducing the results by octaves (a 1:2 ratio). This principle finds its parallel in the ancient Mesopotamian and Chinese musical systems demonstrating a universal approach to generating scales and temperaments.

For example, the 3-limit 12-tone Pythagorean scale (also the shí’èr lǜ (十二律) or “twelve-pitch” system) takes twelve consecutive powers of 3 and reduces them by the octave, 1:2. The scale is then presented as the size-ordered values in the \((1, s_n]\) range and can be seen as its minimal generating set, \(S\). These values are used to compute the rest of the ratios for a given instrument based on a fixed reference frequency \(f\). The complete set of pitch ratios \(P\) generated by the system is given by:

\[ P = \bigcup_{k \in \mathbb{Z}} \{ s_n^k \times s \times f \mid s \in S, f \in \mathbb{R/Z} \} \]

This makes the system a group-like structure: it is not only a minimal generating set but a subgroup, a quotient by an equivalence class \(Q/{\sim}\), usually the octave, the free generator.

To illustrate this further (to keep examples shorter), consider the 3-limit pentatonic scale. This scale is generated using the first five powers of 3, reduced by octaves, resulting in the following frequency ratios:

9/8, 81/64, 3/2, 27/16, 2/1

(These ratios are usually presented relative to a fixed base, typically 9/8, giving: 9/8, 4/3, 3/2, 16/9, 2/1)
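The "chain and reduce" step is short enough to sketch directly. The function names are assumptions for illustration; with `count=12` the same function produces the twelve values of the 12-tone Pythagorean scale.

```python
from fractions import Fraction

def reduce_interval(x, b):
    """Scale x by powers of b until it lies in (1, b]."""
    x, b = Fraction(x), Fraction(b)
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    while x > b:
        x /= b
    while x <= 1:
        x *= b
    return x

def chain_scale(generator, count, equave=2):
    """Reduce the first `count` powers of `generator` and sort them."""
    powers = (Fraction(generator) ** k for k in range(count))
    return sorted(reduce_interval(p, equave) for p in powers)

print([str(v) for v in chain_scale(3, 5)])
# ['9/8', '81/64', '3/2', '27/16', '2']
```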

Each pitch within the pentatonic scale can be expressed as a product of powers of the generators. Representing this with a group presentation:

\( \text{Pentatonic} = \langle 2, 3 \,|\, 3^5 = 1 \rangle \) 

Where each pitch \(p\) can be defined as:

\( p = (2^n \times 3^{m \bmod 5}) \) with \( n, m \in \mathbb{Z} \).

This notation, analogous to finitely generated abelian groups like  \( G = \langle a, b\, |\, b^k = 1 \rangle \),  captures the fundamental elements of the system.  Here, the pentatonic group is isomorphic to the direct sum of a cyclic group of order 5 (representing the modulo-constrained generator) and an infinite cyclic group:

\( \langle \text{Pentatonic}\rangle = \mathbb{3_{5}\oplus 2}\cong \mathbb{Z/5Z \oplus Z} \). 

The minimal generating set, often referred to as the "basic region" or "fundamental domain", in this case, corresponds to:

{ (1, 9/8, 81/64, 3/2, 27/16, 2/1] }

(The 1 is usually omitted in presentations like synthesizer tuning files.)

This set embodies the distinct elements of the pentatonic group, excluding octave duplicates.

1     = (2^0  * 3^(0%5)),
9/8   = (2^-3 * 3^(2%5)),
81/64 = (2^-6 * 3^(4%5)),
3/2   = (2^-1 * 3^(1%5)),
27/16 = (2^-4 * 3^(3%5)),
2/1   = (2^1  * 3^(5%5)),


Note that this is only one subgroup, as many subsets could generate different scales. 

The underlying group structure can be exploited to understand the mathematical properties of these systems. For example, an equivalence relation, where octaves are treated as equal, leads to the definition of cosets.  Let \(a\) represent the octave interval (2/1) and \(x, y\) denote any two pitches in the pentatonic scale. The equivalence relation is defined as: 

\( x \sim y \Leftrightarrow x = y \times a^k\) with \(k \in \mathbb{Z}\).

The set of equivalence classes, \( P/{\sim} \), forms a subgroup isomorphic to \(\mathbb{Z/5Z}\):

\( P/{\sim} = \{\,\{\, x_{i\bmod 5} \times 2^k\,\vert\,k,i \in \mathbb{Z}\,\}\,|\, x_i \in P\,\}\cong \mathbb{Z/5Z} \subset P \) 

This implies that the pentatonic scale, with the octave equivalence, reduces to a five-element cyclic group.  Crucially, in this group structure, each pitch is associated with a unique element.

To visualize the relationship between pitches, we can construct an "interval matrix". Each row in the matrix corresponds to a different starting note, and each column represents an interval relative to that note. This matrix highlights the specific relationship between pitches within the system:

\(\langle x_0 \rangle =\) {   9/8, 81/64,    3/2, 27/16, 2/1 }
\(\langle x_1 \rangle =\) {   9/8,   4/3,    3/2, 27/16, 2/1 }
\(\langle x_2 \rangle =\) {   9/8,   4/3,    3/2,  16/9, 2/1 } Natural Mode
\(\langle x_3 \rangle =\) { 32/27,   4/3,    3/2,  16/9, 2/1 }
\(\langle x_4 \rangle =\) { 32/27,   4/3, 128/81,  16/9, 2/1 }

Each row corresponds to the starting note \(x_i\), and the set represents all its related notes generated from the \(P/{\sim}\) system. 

The analysis above illustrates a framework for examining any tuning system, regardless of its specific construction. For instance, a common variation of the diatonic scale uses the generator \(5\) in addition to \(2\) and \(3\): it takes a chain of three fifths and a major third above each note of the chain except the last one. It can also be represented as:

\( \text{Diatonic} = \langle 2, 3, 5\,|\, 3^4 = 1, 5^2 = 1\rangle \)

Traditionally, the major third of Re, with its relative harmonic class \([\text{45}]\), is removed:

\( D/{\langle 2^n\times 3^3\times 5^1 \,\vert\, n \in \mathbb{Z} \rangle^D}\)

Fa ← Do → Sol → Re          [1/3] ← [1] → [3] → [9]
↓    ↓    ↓                   ↓      ↓     ↓
La   Mi   Si    Fa#         [5/3]   [5]   [15]   [45]

These demonstrate that, with the use of additional generators and relations, various traditional and modern scales can be represented as groups.

While it may seem practical to describe these tunings as groups using Equave/Octave Reduction, \( a \cdot b = \text{Octave Reduction}(a \times b)\) this is not strictly necessary. Standard multiplication can also be employed as the group operation, provided the correct relations are specified.

This group-theoretic perspective offers a powerful framework for analyzing and understanding the intricate structure of musical tuning systems. It also clarifies why common musical manipulations such as transposition, retrograde, and inversion can be performed in an abstract manner without needing to work directly with frequencies.  However, systems with a single generator (isotropic* or equally divided), like the 12-tone equal temperament \( 12{\text{ed}}2 = \langle 2^{n/12} \rangle\) with \( n \in \mathbb{Z},\, \langle12{\text{ed}}2\rangle = \{\ldots, 2^{−1/12}, 1, 2^{1/12}, \ldots\} \), are simpler to describe from a construction standpoint (see chromas). In these systems, equivalence relations lead to shifts (cosets), which are invariant and congruent. Unlike other systems, there is no fundamental region, only the generator itself, which results in a trivial analysis as shown by its simple interval matrix. However, subscales derived from any number of equal divisions can still be considered potential group-like structures.

For example, the pentatonic scales derived from 12EDO are also isomorphic to \(\mathbb{Z/5Z}\).

Each element or step is \( g_k(n) = (k + p \times (n \bmod 5)) \bmod 12 \) with \(n,k \in \mathbb{Z},\,p \in \{5,7\} \).

\(5\) and \(7\) are the only non-trivial generators of the additive group \(\mathbb{Z/12Z}\), constrained to a 5-cycle:
(the first 5 classes in the "circle of fifths/fourths")

...
k = -1 { 11, 1, 3, 6,  8 }, \(\flat\)
k =  0 {  0, 2, 4, 7,  9 },
k =  1 {  1, 3, 5, 8, 10 }, \(\sharp\)
...
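The rows can be generated with a few lines. This is a sketch with a hypothetical function name; it uses the generator \(p = 7\) and sorts each row, so the row for \(k = -1\) appears in ascending order rather than starting at 11.

```python
def pentatonic_steps(k, p=7, size=5, divisions=12):
    """Steps (k + p * n) mod divisions for n = 0 .. size - 1, sorted."""
    return sorted((k + p * n) % divisions for n in range(size))

for k in (-1, 0, 1):
    print(k, pentatonic_steps(k))
# -1 [1, 3, 6, 8, 11]
# 0 [0, 2, 4, 7, 9]
# 1 [1, 3, 5, 8, 10]
```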

Applying group theory offers a unified mathematical language to express the essential properties and relations of these seemingly disparate approaches, making this method interesting for research and exploration in the field of music theory. 

\( \text{Golden Harmonics} = \langle \sqrt{\phi}, \sqrt{5}\,|\, (\sqrt\phi)^{10} = 1\rangle \)



*While most systems are typically presented as either Just Intonation or Equal Divisions, this dichotomy becomes problematic when encountering variations. Just Intonation traditionally involves rational intervals, and Equal Divisions involve irrational ones, but this distinction isn't always clear-cut; there are equal divisions of rational intervals and vice versa. A more descriptive categorization is (a spectrum of) non-isotropic and isotropic, abstracting away the nature of generators. Determining the exact categories and properties might be subjective, but these terms broaden the definition, reduce ambiguity, and preserve the original meanings of established terms.

Monday, August 5, 2024

Chroma: A Unifying Principle of Auditory and Visual Perception

Chroma, in music theory, refers to the perceptual quality of a pitch, an attribute pivotal to the understanding of tuning systems. This concept, however, extends beyond the auditory realm. This work examines parallels between auditory and visual perception, not as superficial analogies, but as reflections of deeper structural and logical principles. The mathematical and perceptual foundations of chroma are investigated to uncover connections between the organization of pitch and the continuum of color. By analyzing the color continuum through a methodology derived from music theory—specifically, logarithmic perception and wavelength relationships expressed as ratios, including the construction of musical scales based on spectral locations of color attractors—a color model emerges that addresses true complementary colors. This model offers predictions of complementary hue wavelengths through a consistent mathematical relationship accounting for individual variations, addressing the blue-yellow problem and offering insights into color constancy, afterimages, and stereo vision color mixing. A logical framework for understanding color as a phenomenon is proposed, emphasizing the constraints and structure governing the differentiation of hues, resonating with established modern opponent process theory. This analysis invites a broader reflection on the shared principles underlying sensory experience and the ways in which sound and light may reveal a common order.

Chromas and Mathematical Representation: (DRAFT)

Chroma, as a perceptual attribute of sound, is rooted in the categorization of pitches. This concept, often subject to misinterpretations due to conflation with synesthesia, represents a pitch class under octave equivalence. Chroma is distinct from the broader term "pitch class," which denotes a class within a system of note names. Chroma concerns the identification of perceptually similar pitches, irrespective of absolute frequency differences. This recognition of similarity is attributed to various aspects of auditory perception, including the logarithmic nature of hearing, a preference for harmonic timbres, and the physical interaction of sound waves, encompassing resonance and interference.

  • Octave (Definition): In music, an octave defines the interval between a reference pitch P and another pitch with twice the frequency. This corresponds to the second harmonic. Octaves, in general, from a reference pitch P are defined by all frequencies \(P \times 2^n\), where n is an integer. Specific designations are employed for related intervals: the double octave (corresponding to the fourth harmonic) and the sub-octave (corresponding to the second subharmonic). Octaves are perceptually identified as similar, as exemplified by all C notes on a piano. This perceptual similarity is rooted in the simple mathematical relationship of multiplying or dividing frequency (or wavelength) by 2. This relationship can be considered a perceptually grounded equivalence class of pitches.
  • Chroma (Definition): Chroma denotes the relative perceptual quality of a musical pitch, representing its position within a reference octave space. Chromas are conventionally designated with names such as "fifth," "major third," and "augmented unison." Chroma is not an inherent property of an isolated pitch. Rather, chroma emerges from the relationship between a pitch and a reference point. Specifically, an isolated pitch P does not possess a chroma. When P is considered in relation to a reference pitch A, P and all of its octave duplicates (\(P \times 2^n\), where n is an integer) are considered to share the same chroma, C. This chroma C represents the perceptual quality of that specific pitch relationship with the reference root. It constitutes the "color" of the interval formed between the reference pitch and the pitch in question, irrespective of octave displacement. This implies that chromas are cyclic. While Western music theory may differentiate between a 2nd and a 9th (intervals separated by an octave), these intervals share the same chroma with respect to the root. In tuning systems with finite periodic chromas, this octave equivalence enables the transposition of chords and voicings while retaining their function. It also facilitates musical transcription between ranges, such as from piano (with approximately 80 notes) to guitar (requiring fewer than 30), and it allows for the perceptual substitution of a note with its octave duplicates without loss of tonal meaning and function.
  • Pitch Class (Definition): In modern music theory, "pitch class" (or simply "class") is a more abstract and generalized concept, analogous to an equivalence class in algebra. It denotes a set of pitches related by a specific, defined equivalence relation. This relation is not inherently perceptual and can be arbitrarily defined according to the requirements of a particular musical system or analytical context. While in octave-based systems such as 12-tone equal temperament (12-TET) the most common equivalence relation is octave equivalence (resulting in the frequent coincidence of pitch class and chroma), this constitutes only one possibility. A "pitch class" represents an assigned label or identifier given to a specific pitch. This assignment is absolute for that particular pitch within a defined system but does not inherently convey information regarding the pitch's relationship to other pitches. The class identifies the pitch but does not inherently reveal its intervallic relationship with other pitches or its function within a tonal structure.

The Nature of Chroma and Octave's Flexibility:

While the octave is explained through its mathematical and harmonic basis as a framework for pitch organization, certain perceptual aspects allow for flexibility within the strict mathematical definition of multiplying or dividing by 2. Human perception identifies pitches with a 1:2 frequency ratio as equivalent, sharing the same "color." This perceptual phenomenon aligns with the logarithmic nature of hearing. For example, the perceived interval between pitches with fundamental frequencies at 200 Hz and 400 Hz is equivalent to the perceived interval between pitches at 1500 Hz and 3000 Hz. This principle is exemplified by the practice of tonal music, in which the tension and resolution of chord progressions, such as \(\text{V}_7 \to \text{I}\), remain unaffected by octave displacements of chord members, though the addition of unrelated pitches can disrupt cadence and functional meaning.
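The equality of the two intervals in the 200/400 Hz and 1500/3000 Hz example can be checked on a logarithmic scale. The following is a minimal sketch; the helper name `cents` is an assumption introduced for illustration, using the standard convention of 1200 cents per octave.

```python
import math


def cents(f_low: float, f_high: float) -> float:
    """Size of the interval between two frequencies, in cents (1200 per octave)."""
    return 1200 * math.log2(f_high / f_low)


# Both pairs stand in a 1:2 frequency ratio, so both span exactly one octave,
# even though the differences in hertz (200 versus 1500) are very different.
low_register = cents(200, 400)
high_register = cents(1500, 3000)
```

Measured in hertz the two intervals differ by a factor of 7.5; measured in cents they are identical, which is the sense in which hearing is described as logarithmic.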

One source of octave flexibility is timbre. Timbre influences consonance perception. Techniques of partial manipulation can render otherwise dissonant intervals more consonant, including octaves with frequency ratios deviating slightly from 2 (e.g., 2.1). However, human pitch grouping, in terms of simultaneously presented notes, persists even when listening to pure sine waves devoid of timbral complexity.

(Video.01 - Color-coded octave equivalence)

Video.01: Octave equivalence is demonstrated through a common chord progression exhibiting a known tension-resolution characteristic: \(\text{V}_7 \to \text{I}\). Within a 12-tone equal temperament (12-EDO) framework, with middle C standardized at 261 Hz, the progression \(\text{G}_7 \to \text{C}\) is employed. An initial sequence, represented in MIDI format, comprises approximately one pitch class per chord. Subsequent sequences introduce randomized octave doublings of chord members, illustrating the preservation of harmonic function and tonal meaning. Introduction of other random intervals in further sequences results in the loss of this harmonic function. While the octave's significance may appear self-evident within certain modern consonance models and given the observed perceptual flexibilities, such examples serve to reaffirm its fundamental role. The synthesized sounds in these examples utilize sine waves, thus eliminating timbral complexity and ensuring that the observed pitch grouping is independent of partials.

Another aspect, independent of consonance, is the sequential and melodic use of notes. Monophonically, the octave can be stretched even within a ±100 cent range without loss of tonal meaning within pentatonic or diatonic scales. 

The perceptual flexibility of the octave and its role as a framework for monophonic melodic structure are demonstrated through a series of audio examples. Each example features a 12-EDO diatonic major scale subjected to proportional stretching or compression. The notes of the scale are presented sequentially, followed by a short melody, to illustrate the preservation of tonal meaning and relative intervallic distances despite the stretching. This process results in a relative error distribution of less than 10 cents between adjacent notes. Specifically, audio example 1 compresses the octave from 1200 cents to 1150 cents, while audio example 2 stretches it from 1200 cents to 1250 cents.
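The "less than 10 cents between adjacent notes" figure can be reproduced with a few lines of arithmetic. The sketch below assumes that "proportional stretching" means multiplying every cent value of the scale by the same factor; the function names are hypothetical and are not taken from the software used to produce the audio examples.

```python
# 12-EDO major scale in cents, from the root up to the octave.
MAJOR_SCALE_CENTS = [0, 200, 400, 500, 700, 900, 1100, 1200]


def stretch_scale(scale_cents, target_octave, source_octave=1200.0):
    """Scale every pitch by the same factor so the octave lands on target_octave."""
    factor = target_octave / source_octave
    return [c * factor for c in scale_cents]


def adjacent_step_errors(original, stretched):
    """Change in size of each adjacent step (in cents) caused by the stretching."""
    return [
        (s2 - s1) - (o2 - o1)
        for o1, o2, s1, s2 in zip(original, original[1:], stretched, stretched[1:])
    ]


compressed = stretch_scale(MAJOR_SCALE_CENTS, 1150)  # octave at 1150 cents
expanded = stretch_scale(MAJOR_SCALE_CENTS, 1250)    # octave at 1250 cents
worst_compressed = max(abs(e) for e in adjacent_step_errors(MAJOR_SCALE_CENTS, compressed))
worst_expanded = max(abs(e) for e in adjacent_step_errors(MAJOR_SCALE_CENTS, expanded))
```

Under this assumption the 50-cent change to the octave is spread across the scale, so each whole tone changes by about 8.3 cents and each semitone by about 4.2 cents.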

[audio examples]

The missing fundamental effect and combination tones exemplify our preference for harmonic timbres. Both phenomena involve the perception of phantom pitches, often explained by mathematical and physical principles involving integer multiples between frequencies.

  • Missing Fundamental Effect: A perceptual phenomenon where the fundamental frequency of a harmonic series is perceived even when physically absent, due to neural processing of the harmonic overtones and the brain's sensitivity to octave relationships.
  • Combination Tones: Additional tones perceived when two or more tones are sounded simultaneously, arising from nonlinearities in the auditory system and resulting in the generation of new frequencies.

Thus, despite its inherent perceptual flexibility, the octave constitutes a natural reference for human perception and physical phenomena such as wave behavior. This relationship is well-established musically and mathematically.

Mathematical Representation of Chroma

For readers familiar with tuning theory, the following mathematical treatment of chroma provides a rigorous foundation. For those less mathematically inclined, the core concept is that chroma represents the fractional part of a frequency ratio within an octave. This section introduces the equations and concepts necessary for a precise analysis of chroma and its relationship to musical intervals.

Octave equivalence is mathematically captured by defining chroma as the fractional part of the base-2 logarithm of a pitch frequency ratio, expressed in terms of the octave cycle (1:1 represented as a power of 2):

\( \text{chroma}(x)=2^{\log_2(x)\mod 1} \)

Alternatively, expressed in terms of a normalized ratio modulo operation:

\( \Xi(x) = x \mod 1:2 \)

This signifies that the chroma of a pitch is invariant under octave multiplication or division (scaling by \(2^n\), where \(n \in \mathbb{Z}\)). For instance, relative to 1:1, the chroma of 3, 6, 12, 24, etc. (representing a fifth) is 1.5, corresponding to the frequency ratio 2:3. This approach identifies their equivalent "color" regardless of absolute frequency.

Using \(\log\) and \( \bmod 1\) notation makes the process explicit for coding. For example, the chroma function can be implemented as 2**(math.log2(x) % 1). This method bypasses the need for manual interval reduction (such as repeated division by 2 for values greater than 2 or multiplication for values less than 1).
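The one-line expression can be wrapped in a small function and compared against manual reduction. This is a minimal sketch: the function names are assumptions, and the result lies in the range from 1 (inclusive) to 2 (exclusive), which is what `% 1` produces in Python for positive and negative logarithms alike.

```python
import math


def chroma(x: float) -> float:
    """Reduce a positive frequency ratio to its chroma via 2**(log2(x) mod 1)."""
    if x <= 0:
        raise ValueError("frequency ratios must be positive")
    return 2 ** (math.log2(x) % 1)


def chroma_by_repeated_halving(x: float) -> float:
    """Manual interval reduction, shown only for comparison with chroma()."""
    if x <= 0:
        raise ValueError("frequency ratios must be positive")
    while x >= 2:
        x /= 2
    while x < 1:
        x *= 2
    return x


# 3, 6, 12 and 24 all reduce to the fifth (1.5), as does 3/4 from below the unison.
fifths = [chroma(x) for x in (3, 6, 12, 24, 0.75)]
```

Because the logarithmic form goes through floating-point arithmetic, results are best compared with a tolerance (for example `math.isclose`) rather than with `==`.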

The following mathematical expressions, formally defining an equivalence class and an isomorphism of topological groups, are familiar in principle to musicians. These equations, which define structure preservation, enable the construction of pitch class diagrams, such as the well-known "circle of fifths." These same principles are subsequently employed in the development of a color model and a hue wheel (mapping the visual spectrum to the unit circle in the complex plane).

Chroma can be formalized in terms of ratio equivalence relations. For \( x, y \in (0, \infty) \):

\( x \sim y \Leftrightarrow x = 2^n \times y \, \) for some \( n \in \mathbb{Z} \)

The following mapping is established:

\( (0, \infty)/{\sim} \xrightarrow{\log_2(\bullet)} \mathbb{R}/\mathbb{Z} \xrightarrow{\exp(2\pi i \bullet)} \mathbb{S}^1 \subseteq \mathbb{C} \)

In general, the mapping can be expressed as:

\( [x] \mapsto \log_2(x) + \mathbb{Z} \mapsto e^{2\pi i \log_2(x)} \)
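The equivalence relation and the mapping onto the unit circle can be explored numerically. The sketch below is illustrative only: the names `equivalent` and `chroma_point` and the tolerance value are assumptions, and the integer test is approximate because it relies on floating-point logarithms.

```python
import cmath
import math


def equivalent(x: float, y: float, tol: float = 1e-9) -> bool:
    """x ~ y when x = 2**n * y for some integer n (checked within a tolerance)."""
    n = math.log2(x / y)
    return abs(n - round(n)) < tol


def chroma_point(x: float) -> complex:
    """Image of the class [x] on the unit circle: exp(2*pi*i*log2(x))."""
    return cmath.exp(2j * math.pi * math.log2(x))


# Octave duplicates land on the same point of the circle.
same_point = abs(chroma_point(3) - chroma_point(12)) < 1e-9
# The square root of 2 sits half a turn away from the unison, at -1.
half_turn = chroma_point(math.sqrt(2))
```

Every ratio maps to a point of modulus 1, so the circle carries only the chroma; the octave information is exactly what the quotient by the equivalence relation discards.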

The mathematical nature of chromas reveals that melodies and chords necessitate more than octaves alone; other "colors" or fractional parts of the log₂ scale are essential. The implications of this mathematical understanding of chroma for different tuning systems, particularly those deviating from the familiar octave-based structure, are now considered.

The Impact of Non-Octave Tunings on Music:

In the analysis of any tuning system, an understanding of its chroma content is of paramount importance. The finiteness or infinitude of its chroma set, along with the precision of its octave approximation, constitutes a primary factor in assessing the system's inherent complexity, practical applicability, and potential for integration within established musical frameworks. Chroma content provides fundamental insight into the structural characteristics of the tuning system and the musical operations that are readily supported or significantly challenged. However, without specialized tools, such analysis can be computationally demanding and often impractical for general artistic endeavors. Nonetheless, certain fundamental principles can be grasped without extensive calculation.

A fundamental principle states that for finite generating sets (such as those represented in typical tuning files), a non-octave period implies an infinite chroma set. While such systems offer theoretical interest, their practical application in conventional music creation and workflows is generally limited.

Generating Sets (in the context of tuning): Within the context of musical instruments and software, a generating set constitutes a finite collection of pitches (or frequency ratios) employed to define a tuning system. This set provides the fundamental building blocks from which other pitches can be derived. While some tuning systems are designed to be periodic (repeating at octaves or other intervals), generating sets are utilized even for non-periodic systems. Software or instruments then map these pitches across the audible range. This practical usage should be distinguished from the more abstract mathematical definition of a generating set within group theory.

If a tuning is defined with an octave period (or powers of the octave, 2ᵏ/1, where k is an integer), the chroma set is finite. Conversely, systems in which no whole number of periods equals a whole number of octaves yield infinite chroma sets, because each added period then contributes chromas that have not occurred before.

Consider, for example, the Bohlen-Pierce tuning system, specifically its equal-tempered form known as 13-ED3 (13 equal divisions of the tritave). While this system is often described as having "13 classes," the presence of infinite chromas introduces complexities. On a standard six-string guitar tuned in 13-ED3 (with each string tuned to the 4th fret of the preceding string), 28 unique notes are generated across the fretboard (13 + 3 + 3 + 3 + 3 + 3). Each of these notes represents a distinct chroma relative to the open lowest string. Consequently, the guitarist encounters 28 unique chromas, significantly exceeding the 12 chromas of the conventional system and contradicting the initial assertion of 13 pitch classes.
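The contrast between a finite and a growing chroma set can be checked numerically. The following is a rough sketch that treats each tuning as a bare equal division of its period and counts distinct chromas among consecutive steps; the function name and the rounding precision are assumptions, and the guitar layout itself is not modeled.

```python
import math


def distinct_chromas(period: float, divisions: int, steps: int, decimals: int = 9) -> int:
    """Count distinct chromas among the first `steps` steps of an equal division of `period`."""
    seen = set()
    for k in range(steps):
        octaves = k * math.log2(period) / divisions  # height of step k, in octaves
        # Keep only the fractional part (the chroma); round to absorb float noise,
        # then wrap again in case rounding pushed a value up to 1.0.
        seen.add(round(octaves % 1, decimals) % 1)
    return len(seen)


twelve_edo = distinct_chromas(period=2, divisions=12, steps=28)    # 12-EDO
bohlen_pierce = distinct_chromas(period=3, divisions=13, steps=28)  # 13-ED3
```

With 28 consecutive steps, 12-EDO keeps cycling through the same 12 chromas, whereas every step of 13-ED3 contributes a new one, in line with the 28 chromas counted on the fretboard.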

In octave-based systems, collaboration among musicians is facilitated. Participants can perform within any register, matching pitch classes that share the same chroma (e.g., performing a C major chord across different octaves). In contrast, non-octave tunings present a significantly more challenging collaborative environment.

When participants perform within different "periods" of a non-octave tuning system, the established functional roles of harmony are disrupted. Unless all musicians possess mastery of the tuning's note content and intervallic relationships across all potential periods, coordinated performance becomes exceedingly difficult.

This highlights a significant limitation of non-octave tunings in collaborative musical contexts. While these tunings offer potential for unique sonic exploration, their applicability within shared, traditional musical practices is inherently constrained. The infinite, continuous, and cyclical nature of musical chroma reveals similarities with the visual spectrum, particularly when color-coding is introduced into musical notation for the visualization of pitch relationships, both in complex microtonal scenarios and in conventional 12-tone contexts.

Several perceptual questions then arise:

  • Does the visual spectrum provide sufficient distinct color categories to represent the nuanced distinctions between musical pitches?
  • Where are the perceptual boundaries between adjacent note/color pairings located?
‘the just confines of the colours are hard to be assigned, because they pass into one another by insensible gradation’ (Newton).

When does one color end and another begin? This is strikingly similar to a fundamental question in music theory: when does one pitch function shift to the next? Functional music theory resolves this by prioritizing contextual relationships over the precise, often ambiguous boundaries of intervals.

Color-Coded Octaves, Interval Matrices, and Chroma Analysis

Various music notation systems are currently employed in conventional musical practice, including traditional Western staff notation, MIDI roll notation, and alternative systems incorporating color coding. Many of these systems utilize color to represent octave equivalence and pitch classes, as exemplified by tools such as the "colored piano" and color-coded staff notations.

Modern software facilitates the implementation of such color-coded systems, even within standard 12-tone workflows. These visual aids not only assist in working with microtones and unconventional scales but also enhance learning and comprehension of standard musical concepts.

For instance, the demonstration of octave equivalence presented earlier (Video.01) employs a color-coded MIDI roll. Within this 12-tone context, 12 uniformly distributed colors from the sRGB hue wheel are assigned, with an arbitrary origin (in this case, red is assigned to the pitch class C). This visualization enables rapid identification of the constituent notes within a chord. In one example, a nine-note chord is presented; however, a brief visual inspection reveals only three distinct colors, indicating three pitch classes corresponding to a major chord. Without color coding, individual note class analysis or intervallic calculation would be required—a significantly more time-consuming process. This system provides an efficient visual representation, irrespective of any theoretical connections between musical notes and the color spectrum (a topic addressed subsequently).
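A color assignment of this kind can be sketched with Python's standard `colorsys` module. The function names and the example MIDI note numbers below are assumptions made for illustration; this is not the code behind Video.01.

```python
import colorsys


def pitch_class_color(pitch_class: int, divisions: int = 12) -> tuple:
    """8-bit RGB triple for a pitch class, with class 0 (C) placed at red (hue 0)."""
    hue = (pitch_class % divisions) / divisions  # uniform spacing around the wheel
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


def distinct_colors(midi_notes) -> set:
    """Set of colors used by a chord; octave duplicates collapse to one color."""
    return {pitch_class_color(note % 12) for note in midi_notes}


# Hypothetical nine-note voicing built only from C, E and G in several octaves.
nine_note_chord = [36, 48, 52, 55, 60, 64, 67, 72, 76]
colors = distinct_colors(nine_note_chord)
```

The nine notes collapse to three colors, which is the visual shortcut described above: the number of colors on screen equals the number of pitch classes in the chord.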

The Spiral Harp: A Case for Color Coding

Color coding becomes particularly advantageous in more complex musical contexts, such as the Spiral Harp. This virtual musical instrument generates pitches by interpreting the lengths of spiral polygonal chain segments as string lengths. The instrument supports a wide range of configurations, enabling the creation of complex microtonal setups and interactive performance within intricate, web-like structures.

The Spiral Harp is designed to facilitate free exploration rather than impose a rigid theoretical or methodological framework. However, understanding relationships between notes significantly enhances learning and navigation. Traditional labeling of each string proves impractical due to the instrument's capacity to generate over 1,000 distinct pitches within the audible range. Furthermore, given the infinite number of possible configurations, enumeration or calculation of all string ratio relationships becomes both infeasible and of limited utility.


Color coding offers a practical solution. By assigning an arbitrary origin and denoting octave equivalence with consistent color assignments, performers can readily identify strings belonging to the same chroma or octave class. Strings of varying lengths sharing the same color will produce consonant sonorities, as they belong to the same octave class.

Within this software implementation, the sRGB hue wheel (a cyclic hue circle with uniform angular spacing, though not a perceptually uniform color space) is utilized for color coding. This facilitates the recognition of octave equivalence and also reveals additional intervallic relationships. For instance, complementary colors (those that combine to produce an achromatic percept, such as red-cyan, green-magenta, and blue-yellow) correspond to tritone relationships. This correspondence echoes parallels observed in art and music: both tritones and complementary colors are frequently associated with tension or dissonance.

In equal-tempered tuning, the tritone is the geometric mean of the unison (1) and the octave (2), that is, the square root of 2. Unlike intervals such as perfect fifths and fourths, which invert into one another, the tritone is its own inversion: it remains invariant under inversion, reinforcing its ambiguous, achromatic quality.
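The link between the tritone and the opposite side of a hue wheel can be illustrated by taking the fractional part of the base-2 logarithm as a wheel position. This is a sketch under that assumption; `hue_of` is a hypothetical helper, and positions are expressed as fractions of a full turn (0.5 being the opposite side).

```python
import math


def hue_of(ratio: float) -> float:
    """Wheel position in [0, 1): the fractional part of log2(ratio)."""
    return math.log2(ratio) % 1


TRITONE = math.sqrt(2)  # geometric mean of the unison (1) and the octave (2)

tritone_up = hue_of(TRITONE)        # a tritone above the reference
tritone_down = hue_of(1 / TRITONE)  # its inversion, a tritone below
fifth = hue_of(3 / 2)
inverted_fifth = hue_of(2 / 3)      # a fifth below is a fourth above
fourth = hue_of(4 / 3)
```

Both tritone positions come out at half a turn, so the interval is unchanged by inversion, whereas the fifth and its inversion land at mirror-image positions that sum to a full turn.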

Interval and Chroma Matrices

In music theory, interval matrices serve as analytical tools for tuning systems and instruments. While some systems exhibit octave-based periodicity, others employ alternative periods or lack periodicity altogether, potentially generating an infinite number of chromas. Comprehensive understanding of such systems necessitates the calculation of pitches beyond the minimal generating set, examining the resulting scale extensions and emergent musical possibilities.

For the visualization and analysis of these intervallic relationships, a chroma matrix can be constructed. This matrix constitutes an extended interval matrix with the octave as a fixed period.

Color coding can enhance both interval and chroma matrices, with color assignments based on octave equivalence and an arbitrary reference point.

In a tuning system whose interval or chroma matrix displays only one color, the system consists solely of octave duplications. Conversely, a non-octave tuning system yields a chroma matrix with a growing number of colors as pitches are added, demonstrating its infinite chroma nature.

  • Interval Matrix (Definition): An interval matrix is a tabular representation of the intervals between all pairs of pitches within a given tuning system or scale. It proves particularly useful for analyzing non-equal temperaments or scales characterized by non-uniform intervallic distances between scale degrees. In equal temperaments, the interval matrix exhibits redundant patterns, rendering it less informative. A simple example is the diatonic scale, whose interval matrix reveals the characteristic intervallic structures of its various modes (e.g., Ionian, Dorian, Phrygian).
(Image.02) sRGB color-coded interval matrix of the 3-limit diatonic scale (group presentation: <2, 3 | 3⁷≡1>). Each row represents a cyclic permutation of the scale. The matrix is displayed logarithmically, with a 12-tone equal temperament (12-EDO) ruler for reference.
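An interval matrix of this kind can be built with exact rational arithmetic. The sketch below assumes that the 3-limit diatonic scale is the Pythagorean major scale with the ratios listed in the code; the function names are hypothetical and unrelated to the software referenced in this article.

```python
from fractions import Fraction


def reduce_to_octave(ratio: Fraction) -> Fraction:
    """Bring a ratio into the range from 1 (inclusive) to 2 (exclusive)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def interval_matrix(scale):
    """Row i lists the octave-reduced intervals from degree i to each following degree."""
    n = len(scale)
    return [
        [reduce_to_octave(scale[(i + j) % n] / scale[i]) for j in range(n)]
        for i in range(n)
    ]


# Assumed ratios of the Pythagorean (3-limit) major scale.
PYTHAGOREAN_MAJOR = [
    Fraction(1), Fraction(9, 8), Fraction(81, 64), Fraction(4, 3),
    Fraction(3, 2), Fraction(27, 16), Fraction(243, 128),
]

matrix = interval_matrix(PYTHAGOREAN_MAJOR)
```

Each row is the scale restarted from a different degree: the first row is the Ionian mode itself, and the second row contains the minor third 32/27 and the minor seventh 16/9 characteristic of the Dorian mode.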

Visualization Example

[Video.02]

This animation demonstrates the construction of a chroma matrix using specialized software. The demonstration comprises two examples:

  1. Calculation of the 12-tone system, illustrating how the addition of pitches beyond its period does not introduce new chromas, resulting in a repeating pattern.
  2. Analysis of a non-octave tuning, demonstrating how the addition of pitches reveals new chromas.

For further information on interval matrices and access to software tools for their creation, refer to [link].

Analogous to the limited utility of interval matrices in analyzing equal divisions of the octave (where all permutations yield the same relative set of pitches), chroma matrices also offer limited analytical value for octave-based tunings, even those with unequal divisions. In such systems, the addition of pitches beyond the octave period does not generate new chromas.

Chroma matrices find their primary application in the analysis of non-octave tunings. For example, an interval matrix of the 13-ED3 system (equally dividing the tritave) exhibits identical rows. Given that its period of repetition (the tritave) and its equivalence class are arbitrarily defined, and considering the system's equal division, any local interval matrix provides limited information. Specifically, employing any pitch as the equivalence class results in the same local intervallic relationships, which do not capture the global structure of the system. In this context, chroma analysis provides the most informative approach.

Chromas and Color Perception:

Despite the disparate physical nature of sound and light, and the distinct sensory organs responsible for their processing, both phenomena share fundamental characteristics, including their wave-like and cyclical nature, as well as several related perceptual phenomena.

This analysis posits that the connection between chroma in music and color perception transcends mere associative analogy. A deeper, non-arbitrary relationship, rooted in shared perceptual principles and a common mathematical structure, is proposed, extending beyond simple visualizations.

The analysis reveals a consistent mathematical structure explaining spectral locations for "color attractors" (formerly "unique hues") based on empirical data from individual observers. This structure predicts the positions and relationships between complementary colors through a simple wavelength ratio, coinciding with observations of color constancy, stereo vision color mixing, and afterimage hues.

Demonstration: The Blue-Yellow "Non-Complementarity"

The emergent color mapping, as detailed subsequently, reveals complementary pairs consistent with subtractive color models such as RYB.

(Image.03) Stereo-vision color mixing demonstration.

The presented image juxtaposes two modified copies of a landscape photograph. The left image is predominantly rendered in yellow (255, 255, 0), with some regions incorporating orange (255, 127, 0). The right image is predominantly rendered in blue (0, 0, 255), with some regions incorporating violet (127, 0, 255). Stereoscopic merging of these images results in a green percept where yellow and blue overlap, contradicting the complementary relationship posited by additive color models such as RGB. Instead, the orange regions are neutralized by the blue, and the violet regions by the yellow, resulting in a landscape dominated by green grasslands against achromatic rocks, mountains, and clouds.

Viewing Instructions: While optimal viewing is achieved with stereoscopic equipment such as a VR headset (or even a simple cardboard viewer and mobile device), viewing via the cross-eye technique on a computer monitor is also possible. The image should be displayed at a comfortable size and viewing distance, with the viewer's head held straight and horizontal. By slowly converging the eyes, a focal point where the images merge can be found. Initial attempts may require some time due to potential binocular rivalry. Once the images are fused, the eyes will relax, and the resulting "true" colors will be perceived.


The analysis initially focuses on the proportional arrangement of color attractors within the visible spectrum, utilizing median data from studies on trichromats' and tetrachromats' color wavelength matching. The analytical techniques employed for musical chroma and interval ratio analysis are applied to this data, resulting in the construction of musical scales that exhibit consistent results and extend the observed correspondences. This approach ultimately reveals a common color structure across individuals.

Subsequently, several parallels between vision and hearing are summarized. Some parallels highlight direct analogies between chroma and complementary colors, while others suggest similar perceptual effects arising from distinct mechanisms. Note: Certain parallels, such as those related to sensory conflict (e.g., the beat effect and binocular rivalry), are examined due to their descriptive similarities rather than a hypothesized underlying predictive model based on chroma or complementary color prediction.

Finally, a logical model of color, derived from observations of complementary colors, is introduced, aligning with established modern opponent process theory.

Color Attractors (Definition)

The term "unique hue" lacks a singular, universally accepted definition, similar to the ambiguity surrounding "primary colors." While "unique hue" is often defined as a color without admixture of another hue, this definition is subject to further scrutiny. To differentiate from subsequent definitions, "unique hues" are redefined as "color attractors." This terminology reflects the fact that these particular color sensations, possessing discrete names, are those intuitively considered "primary," as exemplified by green in RGB systems. However, green is not "unique" or "primary" in the sense of being irreducible, as it can also be obtained through color mixing. The possession of a unique name, such as "red," distinguishes color attractors from colors with compound names, such as "yellowish-orange." However, every point on the color continuum represents a unique mix, regardless of its attractor qualities.

The Spectral Octave

A striking initial parallel between sound and light lies in the scale of human perception. The visible spectrum spans a frequency range approximating one octave, defined as the space (a, 2a], where a ∈ ℝ⁺. Observed frequencies range from approximately 400 THz to 800 THz, corresponding to wavelengths ranging from approximately 750 nm to 375 nm (a 1:2 ratio). Given the linear relationship between energy and frequency (\(E = hf\)), photon energy also doubles across the visible spectrum. This electromagnetic range is designated the "spectral octave," mirroring the musical octave and suggesting a shared organizational principle based on frequency ratios. Both the continuum of chromas and the continuum of hues exist within a range corresponding to a doubling in frequency.
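The wavelength and frequency figures quoted for the spectral octave follow from the relation between frequency, wavelength, and the speed of light. A minimal sketch, using the vacuum speed of light and a hypothetical helper name:

```python
SPEED_OF_LIGHT = 299_792_458  # metres per second (exact by definition)


def frequency_thz(wavelength_nm: float) -> float:
    """Frequency in terahertz of light with the given vacuum wavelength in nanometres."""
    wavelength_m = wavelength_nm * 1e-9
    return SPEED_OF_LIGHT / wavelength_m / 1e12


red_edge = frequency_thz(750)     # long-wavelength end of the simplified range
violet_edge = frequency_thz(375)  # short-wavelength end
octave_ratio = violet_edge / red_edge
```

Halving the wavelength doubles the frequency, so the simplified 750 nm to 375 nm range corresponds to roughly 400 THz to 800 THz, and photon energy doubles with it.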

Crucially, both sound and light perception exhibit logarithmic characteristics. This logarithmic nature is reflected in several aspects of light perception. The relationship between physical light intensity and perceived brightness is compressive rather than linear: psychophysics models it either logarithmically (the Weber-Fechner law) or with a power law whose exponent is below 1 (Stevens's power law), and both describe a response that grows ever more slowly as intensity increases. The visual system's adaptation to changes in ambient light color also involves logarithmic processes. At the neural level, logarithmic transformations are common in sensory processing, with neurons often exhibiting a logarithmic or compressive response to stimuli, enabling the encoding of a wide range of input intensities.

Just-Noticeable-Difference of Chroma and Hue:

The just-noticeable difference (JND) for hue, defined as the smallest perceptible change in color, exhibits non-uniformity across the visible spectrum. Empirical studies have demonstrated that the JND is smaller in the blue region (approximately 2 nm at ~400 nm) and larger in the red region (approximately 6 nm at ~700 nm), exhibiting an increase with increasing wavelength. This non-linear distribution suggests a logarithmic relationship between wavelength and perceived hue. Similarly, the distribution of color attractors also presents a non-linear pattern. For instance, the spectral range occupied by violet, blue, and cyan is roughly equivalent to the entire red band, further indicating a logarithmic compression at higher frequencies (shorter wavelengths).

Considering the entire visible spectrum, the average JND approximates 1–2%, a threshold comparable to that of pitch, which is approximately 10–20 cents (equivalent to 1–2% of the musical octave). While these values are subject to variation depending on factors such as timbre, loudness, brightness, and saturation, this parallel suggests the possibility of shared perceptual processes underlying the perception of small changes in both sound and light.


(Image.04) Six pairs of colors are presented, each pair representing a distinct segment of the visible spectrum: red, yellow, green, cyan, blue, and magenta. The hue difference within each pair is set at 6 units on a 360-unit scale. Under the hypothetical assumption of displays emitting monochromatic light for each wavelength, a 6-unit step on a 360-unit scale corresponds to approximately 1–2% of the spectral octave (assuming a simplified visible range of 375 nm to 750 nm).
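The percentages quoted for pitch and hue can be placed on a common footing by expressing each step as a fraction of its cycle. The sketch below only restates the arithmetic of the text; the function names are assumptions, and treating the 360-unit hue scale as one spectral octave is the simplification stated in the caption.

```python
import math


def fraction_of_octave(step: float, steps_per_octave: float) -> float:
    """Size of a step as a fraction of one full octave (musical or spectral)."""
    return step / steps_per_octave


def step_as_ratio(step: float, steps_per_octave: float) -> float:
    """Frequency ratio spanned by the step on a logarithmic octave scale."""
    return 2 ** (step / steps_per_octave)


pitch_jnd_low = fraction_of_octave(10, 1200)   # 10 cents
pitch_jnd_high = fraction_of_octave(20, 1200)  # 20 cents
hue_step = fraction_of_octave(6, 360)          # 6 units on a 360-unit hue scale
```

A 20-cent step and a 6-unit hue step are both one sixtieth of their respective octaves, about 1.7%, which corresponds to a frequency ratio of roughly 1.012.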

Ambiguities in Color Perception Research

Subsequent analysis is based on findings from research investigating "unique hues" and color bands. However, these studies present significant conceptual ambiguities, extending beyond methodological considerations to the very terminology employed. Terms such as "unique hues," "color bands," "finer discrimination," and "richer color experience" lack precise definitions, leading to potential misinterpretations.

Key Terms (Definitions):

  • Color Attractors: These correspond to the previously termed "unique hues" (e.g., red, green, blue, yellow). They represent color sensations associated with discrete names and often considered perceptually "primary."
  • Color Bands: These denote discrete, distinguishable regions of hues within the spectrum, analogous to perceptual "steps."
  • Just-Noticeable Difference (JND): This refers to the smallest detectable change in a sensory stimulus, such as a subtle shift in hue or pitch.

The term "richer color experience" itself is polysemous, encompassing several distinct aspects of vision:

  • Increased Precision/Resolution: Enhanced ability to distinguish fine details and resolve spatial information.
  • Increased Number of Hues: Perception of a wider range of distinct hues, representing qualitatively new color sensations.
  • Increased Saturation/Chroma: Perception of more intense and pure colors.

The conflation of "finer discrimination" (a quantitative measure of JNDs) with "richer color experience" (often implicitly referring to an increase in the number of hues) constitutes a central methodological and terminological problem. Studies often infer the existence of new hues based solely on the observation of more distinguishable "color bands" in a continuous spectrum. 

Furthermore, the term "color bands" itself is ambiguous, sometimes used interchangeably with "unique hues" or "spectral appearances," further compounding the confusion. The lack of control over factors like visual noise and continuous contrast in some studies further complicates the interpretation of "color band" data, as these factors can influence the perceived distinctness of transitions between hues.

It is conceivable that, in some studies, the conclusions regarding a "richer" color experience could be rephrased in terms of a "poorer" experience (e.g., decreased JNDs implying more finely divided bands) without altering the consistency of the reported observations, highlighting the ambiguity of the terminology.

Continuous vs. Discrete Spectra:

The method of spectral presentation significantly influences perceptual discrimination:

  • Discrete Presentation: When the spectrum is presented as distinct, separate bands, observers can typically distinguish over 60 regions due to the clear boundaries between them (corresponding to an average JND of 1–2% of the visible range).
  • Continuous Presentation: In a smooth, continuous gradient, the number of distinguishable regions decreases significantly, averaging around 11, as the transitions between hues become less distinct.

Tetrachromat Perception:

Research on tetrachromats provides a compelling example of the challenges inherent in interpreting color perception data. While tetrachromats report perceiving a greater number of distinct color bands than trichromats, their reported color attractors generally align with those of trichromats. This observation raises questions regarding the interpretation of "color bands."

A key methodological consideration lies in the assessment of "finer discrimination." Simply enumerating distinguishable color bands, especially in a continuous spectrum, does not constitute sufficient evidence of a fundamentally richer color experience. A more rigorous approach involves a two-stage testing protocol:

  1. Discrete Discrimination Testing: Tetrachromats should be assessed for finer discrimination using monochromatic and metameric stimuli presented discretely. This could involve:

    • Discrete JND Testing: Determining whether tetrachromats can distinguish a greater number of steps than trichromats in a series of closely spaced monochromatic lights. A statistically significant increase (e.g., from an average of 60 steps for trichromats to 100 for tetrachromats) would provide evidence of finer discrimination.
    • Spectral Ordering Tasks: Presenting a set of color chips and requiring subjects to arrange them in spectral order. Accurate ordering of a significantly larger number of chips by tetrachromats (compared to the average of approximately 24 for trichromats) would further support the hypothesis of finer discrimination.
  2. Continuous Spectrum Testing: Only after establishing finer discrimination through discrete tests should researchers proceed with continuous spectrum testing and inquiries regarding color bands. Given the near-identical visible range for both trichromats and tetrachromats (with only minor variations), any hypothetical novel unique hue perceived by tetrachromats would necessarily fall between existing trichromat unique hues. Therefore, if tetrachromats genuinely perceived fundamentally new hues, a substantial increase in the number of perceived bands in a continuous spectrum would be expected, exceeding observed values. The absence of such a dramatic increase suggests that tetrachromats perceive more refined gradations within the established trichromat color space rather than entirely new hues. Their perception of more "bands" in a continuous spectrum may reflect a tendency to perceive discrete transitions where trichromats perceive a continuous gradient, rather than a fundamentally richer chromatic experience.



    Finally, the establishment of JNDs for both discrete and continuous presentations should precede inquiries regarding the identification of "color attractors" and their "best exemplars." This methodology allows for the separation of perceptual data from learned color associations. At this stage, variations in color choices reflect individual preferences and learned associations with color terms rather than fundamental differences in spectral perception. For example, when asked to select "blue," individual choices may range from "sky blue" to "marine blue," reflecting variations in preference rather than differences in the perceived hue's spectral location.

    While the data employed for color attractor locations is derived from these studies, the results exhibit general consistency and fall within expected statistical variation. Therefore, despite the potential for minor inaccuracies, this data serves as a reasonable basis for the present analysis.

    The Emergence of Complementary Color Relationships

    The relationship between the proposed complementary color pairs (approximated by color attractors: red-cyan, orange-blue, yellow-violet, and green-magenta) became evident through musical analysis applied to individual sets of color attractor wavelengths. This analysis involved the construction of musical scales from these wavelengths and the subsequent examination of their musical chromas.

    These complementary pairs, identified through a consistent ratio, predict the relationship between inducer hues and afterimages, explain the color mixing observed in stereoscopic vision, and account for the hues produced by color constancy. These predictions deviate from the conventional notion of RGB complementarity.

    Several factors have obscured this relationship in previous research. Individual wavelength-hue data, when visualized at all, is typically presented in tabular form within an arbitrary, linear, horizontal range (Image.05). While such representations may reveal inter-subject alignment, they fail to highlight the internal relationships within individual data sets, including underlying symmetries.

    (Image.05) Color Attractor ("Unique-Hues") 380-780nm, linear scale.

    Circular representations of the visible spectrum, based on wavelength, are found in color science literature, often employing a standard range of 400–700 nm and assigning complementary wavelengths to the line of purples and magenta based on various models or practical needs. These circular arrangements are sometimes non-linear, employing irregular step sizes due to considerations of perceptual uniformity, further obscuring underlying relationships.

    The process of scale construction led to a visual arrangement of individual color attractor data on a logarithmic scale within a spectral octave range (a, 2a). This arrangement revealed a clear symmetrical pattern, approaching near-perfect symmetry in some individuals. This symmetry point, given the logarithmic and octave-based nature of the arrangement, corresponds to the square root of 2.
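
    The logarithmic arrangement described above can be sketched in a few lines. This is only an illustration: the function name wheel_angle, the use of degrees, and the 375 nm lower edge (borrowed from the range used in the figures) are assumptions of this sketch, not the procedure used to produce the images.

```python
import math

def wheel_angle(wavelength_nm, base_nm=375.0):
    """Angle in degrees of a wavelength on a logarithmic octave wheel.

    base_nm is the assumed lower edge of the spectral octave; wavelengths
    one octave apart land on the same angle.
    """
    octaves = math.log2(wavelength_nm / base_nm)
    return 360.0 * (octaves % 1.0)

# The logarithmic midpoint of the octave is base * sqrt(2).
midpoint_nm = 375.0 * math.sqrt(2)
print(round(midpoint_nm, 1))               # midpoint wavelength in nm
print(round(wheel_angle(midpoint_nm), 3))  # its angle on the wheel
```

    On such a wheel, any two wavelengths whose ratio is √2 sit half a turn apart, which is why the symmetry point of the arrangement corresponds to the square root of 2.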

    While variations in color attractor locations are observed across individuals, the internal ratio within each individual remains consistent, allowing for a significant degree of predictive power. For example, given the wavelength position x of "orange" for a specific individual, the corresponding "blue" is predicted to be located at approximately x × 1/√2. This relationship is observed across all complementary pairs (red-cyan, yellow-violet). This consistent ratio is also what determines the location of magenta, placing it precisely opposite green, the central hue of the spectrum. While other color relationships may exhibit some degree of constant ratio and predictability, the recurrence of the square root of 2 ratio three times within each individual's data strengthens the robustness of this relationship.
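
    A minimal sketch of this prediction follows. The function name predict_complement and the fold-back rule (multiplying by √2 when dividing would leave the assumed 375–750 nm octave) are assumptions made for illustration; the 700 nm and 495 nm figures are the red and cyan example values used later in this article.

```python
import math

SQRT2 = math.sqrt(2)

def predict_complement(wavelength_nm, low_nm=375.0):
    """Predict the wavelength half a spectral octave away.

    The candidate x / sqrt(2) is used when it stays above low_nm;
    otherwise x * sqrt(2) is returned so the result remains inside
    the assumed octave starting at low_nm.
    """
    candidate = wavelength_nm / SQRT2
    if candidate < low_nm:
        candidate = wavelength_nm * SQRT2
    return candidate

red_nm = 700.0
cyan_nm = predict_complement(red_nm)
print(round(cyan_nm, 1))                       # predicted complement of red
print(round(predict_complement(cyan_nm), 1))   # applying it twice returns to red
```

    Because the step is exactly half an octave on a logarithmic scale, applying the prediction twice returns the starting wavelength.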

    While some physiometric tests may extend the visible range beyond a doubling of wavelength (e.g., 380–820 nm), such instances are rare, and these extensions do not involve the perception of novel hues beyond red (the longest perceived wavelength, or the lowest frequency). This extra range, and similarly, shorter ranges of vision, while still encompassing the full spectrum, exhibit the same flexibility observed in the musical octave, as discussed previously. These variations in range do not substantially affect the proportions of internal components, analogous to the stretched diatonic scales.

    The average range selected for the spectral octave is 375–750 nm. These values were chosen both because they are commonly used and because they facilitate graphic representation and ruler markings with predictable subdivision increments (+5 nm).

    (Image.06) Color attractor locations (red, orange, yellow, green, cyan, blue, violet; magenta is artificially mirrored across green) for trichromats (left) and tetrachromats (right), plotted on a logarithmic scale within the spectral octave of 375–750 nm.

    To provide a more intuitive understanding of the color mapping than the preceding mathematical description, a step-by-step graphic demonstration is presented in the appendix. The CIE 1931 XYZ color space is utilized for gradient rendering in this demonstration.

    Prior to analyzing the identified complementary pairs and their predictive capacity for afterimage hues and other phenomena, an examination of median values for each group is presented, revealing a richer musical analog than simple tritones (corresponding to the square root of 2) between color ratios.

    Color Attractor Spectral Location and Wavelength-Derived Musical Scales

    Historically, attempts have been made to establish connections between the musical and visual domains. Isaac Newton famously associated the colors of the rainbow with musical notes. Despite the prevalence of equal temperaments, such as the 12-tone system, during his era, Newton's pitch calculations were rooted in Pythagorean metaphysics and rational harmony. However, the challenge of consistently aligning scales, intervals, and light wavelengths with musical octaves prevented the development of a definitive model.

    This analysis adopts a reverse approach, constructing musical scales based on the spectral locations of color attractors rather than imposing existing musical structures onto the light spectrum. These hues, identified as "best exemplars" in color science literature, exhibit notable consistency across studies. While individual variations are observed, reported values typically fall within one standard deviation. The derivation of scales from these data points reveals remarkably stable musical structures, distinct from the rational intervals sought by Newton, yet no less compelling.

    This section presents short musical examples based on tuning systems derived from the wavelengths of color attractors reported in color science literature. It is crucial to note that wavelengths, measured in nanometers, are part of a human-defined measurement system. The scales presented here are not constructed by directly mapping nanometers to frequencies (Hz). Instead, they are based on the proportional relationships between color attractors, abstracting away from specific unit systems.

    For the creation of these musical scales, wavelengths are considered proportionally relative to a base color and adapted for practical implementation on specific instruments. For example, a synthesizer may map a central tone to 261 Hz (middle C), with subsequent scale values expressed as frequency multiples to establish a periodic system. Within this framework, the perceptual spectrum functions as a torsor, where relative relationships are of primary importance.

    Torsor (in the context of color): A torsor describes a set lacking a distinguished origin or zero point, yet possessing a well-defined notion of relative position or displacement. In the context of color, the set of all possible hues constitutes a torsor. The difference between two hues can be defined (e.g., "this hue is 30 degrees clockwise from that hue"), but there is no absolute "zero hue." In this context, the hues form a torsor relative to the scales (nm, Hz, cents, mocts, etc.), meaning that the relationships between hues are preserved regardless of the measurement units employed.

    Mathematical Process Summary:

    The concept of a torsor within the context of hues and the spectral octave can be illustrated through an example.

    While color science typically employs wavelength measurements (nm) within the electromagnetic spectrum, music utilizes audio frequencies (Hz). These quantities are inversely related. Analogous to musical frequency ratio calculation from string lengths (or wavelengths), where the specific frequency value is less important than the ratio itself (assuming constant string tension), the precise terahertz values or photon energy are not directly employed here. Wavelength units (nm) are sufficient for determining proportional frequencies, calculated as inverses of the wavelengths. For example, the frequency ratio from red (700 nm) to cyan (495 nm) is calculated as follows:

    Red (base): 700/700 = 1

    Cyan frequency ratio: 1 × (700/495) ≈ 1.414

    In the generated scales (available for download), ratios are calculated relative to red. However, given the cyclical nature of the system, the choice of base color is arbitrary; the proportional intervals remain invariant regardless of which color is chosen as the root or unison. This invariance exemplifies the torsor nature of hues.
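
    The calculation and the invariance claim can be checked with a short sketch. The red and cyan wavelengths are the ones given above; the 530 nm value for green is a hypothetical placeholder used only to demonstrate a change of base, and the 261 Hz mapping follows the synthesizer example mentioned earlier.

```python
def frequency_ratio(base_nm, target_nm):
    """Frequency ratio of target relative to base (inverse of the wavelength ratio)."""
    return base_nm / target_nm

red_nm, cyan_nm = 700.0, 495.0   # values from the text
green_nm = 530.0                 # hypothetical placeholder, for the change of base only

# Interval from red to cyan, computed with red as the base.
direct = frequency_ratio(red_nm, cyan_nm)

# The same interval, computed from ratios that both use green as the base.
via_green = frequency_ratio(green_nm, cyan_nm) / frequency_ratio(green_nm, red_nm)

print(round(direct, 3), round(via_green, 3))

# Mapping the base color to a central tone turns ratios into frequencies.
base_hz = 261.0
print(round(base_hz * direct, 2))  # pitch assigned to cyan when red sounds at 261 Hz
```

    The interval between two hues comes out the same whichever color serves as the unison, so only the ratios need to be stored; the absolute pitch is a free choice made at the instrument.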

    The position, wavelength, and corresponding musical note assigned to magenta are derived from the observed complementary relationships. Specifically, the frequency ratio assigned to magenta is the frequency ratio of green multiplied by √2. This methodology accounts for individual variations in the spectral octave range (e.g., 370–740 nm, 405–810 nm), which are dependent on the location of the green attractor. The graphics presented here use a constant 375–750 nm range for illustrative purposes; because hues behave as a torsor, fixing the range in this way does not alter the relationships between them.
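
    As a rough sketch of the magenta derivation: the first function applies the √2 rule stated above. The second function is an interpretation rather than something stated in the text: it assumes the spectral octave is centered logarithmically on green, so that its edges lie at green/√2 and green × √2. Both function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2)

def magenta_ratio(green_ratio):
    """Frequency ratio assigned to magenta: the green ratio times sqrt(2)."""
    return green_ratio * SQRT2

def spectral_octave_from_green(green_nm):
    """Assumed reading: octave edges half an octave either side of green."""
    return green_nm / SQRT2, green_nm * SQRT2

# With red (700 nm) as the base and a hypothetical green at 530 nm:
green_ratio = 700.0 / 530.0
print(round(magenta_ratio(green_ratio), 4))

low_nm, high_nm = spectral_octave_from_green(530.0)
print(round(low_nm, 1), round(high_nm, 1))
```

    Under this reading, a green attractor near 530 nm yields an octave close to the 375–750 nm range used in the graphics, and a shift in green moves both edges by the same factor.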

    Three Examples of Unique Hue-Based Scales:

    • Newton’s Unique Hue Wavelengths: Newton's "principal" hues are represented by modern approximations of his qualitative descriptions.
    • Modern Trichromat Research: This scale utilizes median unique hue data from contemporary color vision studies on normal trichromats.
    • Tetrachromat Data: This scale is derived from studies on individuals with genetic predispositions to a fourth photopigment.

    Auditory Examples:

    The following auditory examples demonstrate the translation of unique hues into musical scales, revealing perceptual and structural parallels between light and sound.

    [Audios] ×3

    Musical Properties of Hue-Derived Scales and the Role of Uniform Distribution

    Listeners anticipating strikingly unusual or exotic microtonal sonorities from these hue-derived scales may be surprised by their relative conventionality. While subtle microtonal inflections may be perceptible to trained listeners, the overall impression is often consonant with established musical practice. As previously mentioned, the tritone is not the only interval frequently approximated by frequency ratios derived from hue data; other stable musical intervals, such as the major third and perfect fifth, also emerge from various color combinations. The resulting scales exhibit major and minor chords, and each scale features varying degrees of consonance with other traditional intervallic relationships, corresponding to intervals such as sixths and sevenths. However, a single diatonic scale is not derived from a single root; multiple intervals are present, but their non-uniform distribution prevents direct transposition of chords derived from one color to another. The fact that these scales exhibit musical usability with common timbres, as demonstrated by the piano example in Audio:Trichromats01, is notable.

    This observation raises the question of whether this musical usability is merely coincidental. To address this, the implications of randomness in tuning systems are considered. Prior research ([link]) explored the musical properties of randomly generated tunings, examining various interpretations of randomness, order, and predictability. A key finding was that uniform distribution of pitches within the octave space—even with some allowance for clustering—facilitates conventional musical usage, including tonicization and consonance on standard instruments. This arises from the inherent tendency of random subdivisions of the octave to approximate low-integer rational values, regardless of timbre (within certain tolerances). Constructing a scale with ten unusable pitches proves more challenging than constructing a usable one.

    "Octave space" is defined here as any pitch range of the form (a, 2a], where a ∈ ℝ⁺. Uniformity of pitch distribution is considered within this space, ensuring that any octave-equivalent range within the audible spectrum contains a reasonable density of pitches (approximately 5 to 20). This definition excludes trivial cases such as uniformly distributed pitches concentrated within a narrow frequency range or sparsely distributed across the audible spectrum without regard for octave equivalence.

    This conclusion is further supported by analysis of the Scala archive, a database of over 5,000 world tunings. Interval matrix analysis revealed that approximately 80% of the database exhibits congruence, indicating that many scales share the same intervallic content but with different starting points (modal transpositions/cyclic permutations, thus exhibiting the torsor property). Furthermore, randomly generated numbers, even from pseudo-random number generators (PRNGs), often approximate existing scales within a tolerance of approximately 5 cents. This suggests that tunings resembling established, structurally organized systems can emerge from seemingly random values. This observation led to the development of an "Average Tuning System," a 14-note system capable of approximating at least five notes from any of the 5,000 tunings in the archive within a 10-cent tolerance.
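
    The octave space (a, 2a] and the tolerance-based comparison of tunings can both be sketched briefly. This is not the interval matrix analysis applied to the Scala archive; it is a simplified illustration with hypothetical function names, and the comparison ignores wrap-around at the octave boundary.

```python
import math

def reduce_to_octave_space(value, a=1.0):
    """Fold a positive value into the half-open octave space (a, 2a]."""
    if value <= 0 or a <= 0:
        raise ValueError("value and a must be positive")
    while value > 2 * a:
        value /= 2
    while value <= a:
        value *= 2
    return value

def cents(ratio):
    """Size of a frequency ratio in cents (1200 per octave)."""
    return 1200.0 * math.log2(ratio)

def count_matches(scale_a, scale_b, tolerance_cents=5.0):
    """Count pitches of scale_a within tolerance_cents of some pitch in scale_b."""
    return sum(
        1
        for ra in scale_a
        if any(abs(cents(ra) - cents(rb)) <= tolerance_cents for rb in scale_b)
    )

# A just major scale compared against 12-tone equal temperament.
just_major = [1, 9/8, 5/4, 4/3, 3/2, 5/3, 15/8]
equal_12 = [2 ** (k / 12) for k in range(12)]
print(count_matches(just_major, equal_12, 5.0))
print(count_matches(just_major, equal_12, 16.0))
```

    Note that in the space (a, 2a] the lower edge is excluded, so a value equal to a is folded up to 2a. Widening the tolerance raises the match count, which is the effect a tolerance-based survey of many tunings relies on.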

    As demonstrated in the aforementioned study, music created with numbers derived from diverse sources, including planetary sizes, temperatures, mountain heights, and subatomic particle energies, consistently exhibits musical usability due to the emergence of stable, familiar intervallic relationships. This reinforces the principle that uniform distribution within the octave is a primary factor in creating musically usable scales.

    Therefore, the relative conventionality of the hue-derived scales is not entirely unexpected. The color attractors themselves are well-distributed across the "color octave," naturally facilitating traditional tonal and modal usage.

    However, this statistical predictability does not diminish the significance of these findings. While the musical usability of these scales may be statistically probable, their origin in physical reality and human perception imbues them with additional meaning. These are not merely arbitrary numerical values; they are rooted in the fundamental properties of light and its perception.

    If the visible spectrum spanned a significantly different range—either much smaller (e.g., 400–430 nm) or spanning multiple "spectral octaves" (e.g., 400–3500 nm)—the relationship between color and chroma would become less compelling. The fact that colors exist within a single spectral octave strengthens the perceptual analogy.

    This limited range also addresses the question of whether sufficient color distinctions exist to represent functional harmonies. The answer is affirmative. The fine distinctions made in color perception are analogous to the subtle distinctions made in musical intervals. Just as musicians may debate whether an interval is a "super major second" or a "sub minor third," distinctions are made between colors such as "yellowish orange" and "reddish yellow." This shared phenomenon highlights the fine granularity of both auditory and visual perception.

    Individual Variation and Data Analysis

    Image.07 presents the original tabular visualization of color attractor data for each observer. Adjacent to this table, data from selected subjects are represented within a logarithmic octave wheel visualization. This visualization reveals that the near-perfect symmetrical patterns observed in some individuals are not readily apparent in the original tabular format.

    (Image.07) Color Attractors Symmetries

    While consistency is observed in certain cases, individual variation is also present. Some subjects exhibit deviations from others, even lacking a defined color attractor for cyan, for example. These variations reflect not only potential differences in the semantic interpretation of color names arising from personal preferences or cultural influences but also patterns in the distribution of these "deviations." When all data are plotted on a single logarithmic octave wheel, a logarithmic organization of color perception is suggested.

    The observed ambiguity in the violet/blue region, where overlap in color attractor positions occurs between individuals (up to a clearly defined red region), is characteristic of linearly sampling a logarithmic phenomenon. This is analogous to the distortions that arise in musical interval tempering when the logarithmic nature of pitch is not considered. This effect is also evident in the standard deviations reported by researchers, which demonstrate a decrease in deviation with increasing wavelength.

    For the color values presented in this analysis, frequency ratios have been employed. However, for future research and standardization, the use of millioctaves (mocts) for color measurement is recommended over cents. Cents are biased toward 12-tone equal temperament, whereas millioctaves maintain a consistent decimal scaling. Importantly, millioctaves map more intuitively to the fractional part of the base-2 logarithm, as discussed in the mathematical representation section. The millioctave, as its name implies, divides the octave into 1000 equal logarithmic units. For instance, complementary colors are separated by 500 mocts. This unit also simplifies calculations by obviating the need for wavelength/frequency inversions and direct use of √2.
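
    A minimal sketch of the millioctave conversion, assuming a reference wavelength of 750 nm (the upper edge of the range used in the graphics) as the zero point. The function names are hypothetical, and the red and cyan wavelengths are the example values used earlier.

```python
import math

def mocts_from_wavelength(wavelength_nm, ref_nm=750.0):
    """Position in millioctaves: 1000 times the fractional part of log2(ref / wavelength)."""
    return (1000.0 * math.log2(ref_nm / wavelength_nm)) % 1000.0

def moct_distance(a_nm, b_nm, ref_nm=750.0):
    """Shortest distance around the 1000-moct circle between two wavelengths."""
    d = abs(mocts_from_wavelength(a_nm, ref_nm) - mocts_from_wavelength(b_nm, ref_nm))
    return min(d, 1000.0 - d)

print(round(mocts_from_wavelength(700.0), 1))  # red
print(round(mocts_from_wavelength(495.0), 1))  # cyan
print(round(moct_distance(700.0, 495.0), 1))   # separation of the pair
```

    The inversion from wavelength to frequency is absorbed into the logarithm once, and a complementary pair is then identified by a separation close to 500 mocts, whatever reference wavelength is chosen.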

    Color Wheel Construction: Addressing Color Space Transformations and Limitations

    The construction of the color wheels presented in this analysis requires careful consideration of color space transformations and the inherent limitations of representing the visible spectrum within the RGB color space. Converting a specific wavelength to RGB values involves several factors that can influence the final color representation:

    • CIE XYZ Model Version: Different versions of the CIE XYZ color space (e.g., 1931, 1964, 2012) have slightly different color matching functions, leading to variations in the resulting XYZ coordinates for a given wavelength.
    • Illuminant: The choice of standard illuminant (e.g., D65, A, C) affects the white point of the color space and, consequently, the mapping of wavelengths to XYZ coordinates.
    • Gamma Correction: Gamma correction is a non-linear transformation applied to RGB values to account for the non-linear response of display devices. Different gamma values will result in different RGB representations for the same XYZ coordinates.

    Consequently, obtaining a specific RGB value like (0, 255, 255) for cyan from a wavelength requires careful selection of the CIE XYZ model, illuminant, and gamma. Furthermore, achieving fully saturated RGB values for all spectral hues is often impossible. If a median render of the spectrum with equal power distribution is used, for example, the perceived saturation of red tends to decrease at longer wavelengths, making it difficult to accurately represent individual "best red" values at wavelengths like 710 nm.
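
    The last two stages of this pipeline, from XYZ coordinates to displayable RGB, can be sketched as follows. The sketch assumes the standard sRGB matrix for a D65 white point and the sRGB transfer function; it does not include any color matching function table, so the XYZ inputs must come from whichever CIE model is chosen. The second input triple is purely illustrative (low X, typical of the blue-green region) and is not taken from the data used in this analysis.

```python
def xyz_to_srgb(x, y, z, eps=1e-3):
    """Convert CIE XYZ (Y scaled so white is 1) to 8-bit sRGB and an in-gamut flag."""
    # Linear RGB from XYZ using the standard sRGB matrix (D65 white point).
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )
    # A color is displayable only if every linear channel lies in [0, 1].
    in_gamut = all(-eps <= c <= 1.0 + eps for c in linear)

    def encode(c):
        c = min(max(c, 0.0), 1.0)  # clip out-of-gamut channels
        # sRGB transfer function (the "gamma" step).
        v = 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
        return round(255 * v)

    return tuple(encode(c) for c in linear), in_gamut

# D65 white should map to full white and be in gamut.
print(xyz_to_srgb(0.95047, 1.0, 1.08883))

# Illustrative low-X triple: the red channel goes negative and is clipped.
print(xyz_to_srgb(0.0147, 0.2586, 0.3533))
```

    When the in-gamut flag is false, clipping changes the rendered color, which is one reason a saturated target such as (0, 255, 255) cannot simply be read off from a wavelength.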

    It is crucial to emphasize that the color wheels presented here are primarily concerned with the hue/chroma dimension of color, not with precise representations of luminance or gamma. The goal is to accurately represent the relative positions of hues within the spectrum and their complementary relationships, rather than to create a photometrically accurate rendering of the spectrum.

    Therefore, the CIE XYZ model (specifically the 2012 version in this case) is used primarily to confirm the relative locations of hues and color bands, providing a standardized framework for comparison. However, the final color attractor representations in the wheel are ultimately based on standard RGB values, chosen to represent the perceived hue as accurately as possible within the limitations of the RGB color space. The choice of RGB values for the attractors is done with a focus on maximizing saturation and perceptual distinctiveness, with the understanding that this might not perfectly align with a strict radiometric conversion.

    This approach acknowledges the inherent limitations of representing the full spectrum in RGB while prioritizing the accurate representation of hue relationships, which are central to the analysis presented.

    It is important to note that this color wheel, constructed through a combination of standard color matching functions (CIE XYZ 2012) and individual perceptual data (color attractors and bands), possesses significant predictive power. Specifically, it accurately predicts the complementary relationships observed in stereoscopic color mixing. This predictive capacity is further validated by empirical data from studies on afterimage hues, providing converging evidence for the validity of this model.

    _______

    Prediction Mechanism and Perceptual Parallels

    Beyond the general correspondence implied by the "spectral octave," several other parallels exist between the auditory and visual sensory domains. These parallels manifest in two distinct ways: (1) through shared complementary structures, as revealed by the octave-based color model; and (2) through shared descriptive characteristics or related conceptual frameworks.

    For example, within the context of sensory conflict, binocular rivalry is often compared to the phenomenon of binaural beats. While this comparison highlights similarities in sensory conflict resolution, it does not typically propose a shared model based on chroma or color. However, binocular rivalry is revisited in the context of stereoscopic color vision, where the octave and chroma models are essential for predicting color mixing outcomes.

    The auditory analog of afterimages, the "aftersound" effect, is briefly mentioned as a further example of sensory adaptation. While not directly related through chroma or octave models, the afterimage phenomenon is analyzed to demonstrate its adherence to the same complementary pairs predicted by the octave chroma model.

    Color constancy is also addressed, drawing a parallel with the auditory phenomenon of tonal constancy. These phenomena are not directly linked by the octave model but share similar resolution mechanisms for interpreting "neutral" or ambiguous sensory information to maintain a coherent perceptual experience, particularly in the context of musical scales and color perception. Subsequently, color constancy is analyzed using the octave model, demonstrating that the same complementary color pairs emerge when the brain interprets a physically achromatic object under colored illumination. The perceived hue of the object under colored light corresponds to the complementary of the illuminating light's hue.

    Further parallels, such as chroma shift phenomena like the Abney effect in vision and the effects of stretched tunings on perceived chroma in audition, provide additional insights into shared limits of perceptual precision with respect to hue and chroma.

    (Image.07) - Stereo vision demo. Orange/Yellow snake

    Stereoscopic Color Mixing and the Integration of Binocular Information

    The integration of binocular information in stereoscopic vision raises important questions about the role of trichromacy and opponent processing at various stages of visual processing, from the initial encoding in the retina and early visual pathways (retinal and post-retinal opponency) to the formation of a unified perceptual experience. The fact that color information undergoes at least one further transformation during stereoscopic processing before reaching conscious awareness suggests that opponent mechanisms may operate at later stages of visual processing. This also highlights the binary nature of color opponency in achieving achromatization (the perception of gray or white).

    While opponent processing is well-established in the retina and early visual areas of the brain, the phenomenon of stereoscopic color mixing suggests the possibility of a further stage of opponent processing specifically dedicated to integrating color information from the two eyes. This proposed "final" opponent process could be responsible for the observed cancellation of complementary colors when presented to the two eyes in a stereoscopic configuration. This hypothesis aligns with known mechanisms involved in binocular rivalry and stereoscopic depth perception, both of which require the integration and resolution of potentially conflicting signals from the two eyes. Further research is necessary to fully elucidate the neural basis of this proposed "final" opponent process.

    The propagation of color information in the brain, originating from discrete photoreceptors and culminating in continuous image perception, necessitates interpolation of the discrete signals. This interpolation, evident in the filling-in of the blind spot and the perceived continuity of peripheral vision despite decreasing resolution, represents a point where the limits of qualia become apparent, merging with a lack of conscious experience. This interpolation may occur concurrently with or prior to stereoscopic color mixing, which exhibits complementary relationships predicted by the octave-color mapping, as well as color mixtures resembling subtractive models.

    This suggests that color mixing occurs prior to the formation of unified qualia but interacts with other phenomena, such as color constancy, in complex and not entirely predictable ways, as will be discussed.

    Binocular Rivalry and Resolution

    Binocular rivalry occurs when a different image is presented to each eye. The brain is unable to fuse these disparate images into a single coherent percept, resulting in an alternating perception of the two images, with each image intermittently dominating conscious awareness.

    This rivalry can be directly observed with colored stimuli. When viewing stereoscopically merged blue and yellow squares, for example, rivalry ensues. However, color alone does not fully account for this conflict. Introducing contextual cues, such as the outline of a landscape, facilitates fusion and resolves the rivalry. In the landscape example, where one image is tinted blue and the other yellow, stereoscopic viewing successfully merges the images, and the colors are no longer perceived as conflicting; instead they mix into green, exposing the non-complementary nature of blue and yellow. The blue-yellow "problem" will be addressed later.

    (It's important to note that monocular rivalry also exists, where the alternating perception occurs even when only one eye is stimulated with two different images presented in rapid succession. This further emphasizes the brain's role in resolving sensory conflict, which is not specific to stereoscopic input.)

    The following stereoscopic images are designed to demonstrate two key aspects: (1) binocular complementaries, defined as those opposed in the logarithmic wheel mapping; and (2) color mixtures exhibiting subtractive-like characteristics. For clarity, these demonstrations focus on pairs of color attractors.


    Visual Processing Hierarchy in Stereoscopic Vision

    To investigate the precedence, order, and interactions of various visual processes and effects, several stereoscopic images were designed to elucidate the conditions necessary for a unified perceptual experience and to isolate specific perceptual conflicts. The goal was to create stereoscopic image pairs that fuse naturally while introducing controlled conflicts in specific visual attributes. The analysis of these experiments suggests a hierarchical organization of visual processing, where depth information derived from binocular disparity exerts a dominant influence, often resolving conflicts arising from color and luminance information. These findings also enable the creation of images where perceptual conflicts can be induced in otherwise harmonious stereoscopic image pairs.

    Summary of Observations:

    1. Scene Influence on Color Mixing:




    • Image (a): Colored Background, Colored Ball: This image depicts a soccer ball positioned against a uniform background. The left eye's view has a blue (0, 0, 255) filter applied to the entire image, while the right eye's view has a yellow (255, 255, 0) filter. The soccer ball is rendered in orange (255, 127, 0) in the right eye's view. When these images are fused stereoscopically, the observer perceives a black and white (achromatic) soccer ball against a green background. The disparity information from the ball, combined with the luminance cues, facilitates stable binocular fusion. The disparate color information from the backgrounds is integrated through stereoscopic color mixing, resulting in the perception of green.

    • Image (b): Colored Ball, Monochromatic Background: This image uses the same yellow and blue filters applied only to the soccer ball in each eye's view (yellow for the left eye, blue for the right). The background is rendered as monochromatic gray in both views. When these images are fused stereoscopically, the blue-yellow conflict present in the ball is not resolved into green. Despite the ball being the primary focus of attention and providing depth cues, the consistent achromatic information from the background and the matching depth information facilitate binocular fusion. The color mixing "instruction" is likely interpreted as "deliver the color information to qualia as is," preventing the typical blue-yellow mixing seen in other contexts. Depth information continues to dominate the perceptual strategy. To further highlight the binocular color conflict, small blue and yellow patches are introduced in the respective images, positioned so as not to overlap with the ball (top-right). These patches are perceived as floating, distinct colored regions within the 3D scene, demonstrating clear binocular rivalry. In contrast to Image (a), where the same color information was integrated into a unified green percept, these patches remain distinct due to the lack of a global color mixing instruction.

    • Image (c): Color Inversions: This image pair explores the effects of inverting complementary color filters between the two eyes. The right eye's view features an orange background and a blue ball. The left eye's view inverts these filters, presenting a blue (0, 0, 255) background and an orange (255, 127, 0) ball. When these images are fused stereoscopically, the global color conflict created by the complementary backgrounds is resolved towards a near-achromatic (gray) percept, driven by the depth and luminance information. However, both the ball and the small colored patches (also using the same orange-blue color pair) exhibit pronounced binocular rivalry. This setup demonstrates that the color mixing strategy is determined globally. Despite using the same colors, the intended balance of the global mixing scheme influences the entire visual field. The right eye's view exerts a "push" towards blue, while the left eye's view exerts a "push" towards orange, resulting in the gray background. Crucially, this global influence extends to the local color information as well. The colors of the ball and patches are pushed along in the same direction as their respective backgrounds, amplifying their chromatic contrast and resulting in a more intense perceptual conflict. This amplification manifests as a "brighter" or more saturated rivalry. Close observation of the ball's edges reveals the conflicting colors "bleeding" into the nearby achromatic (gray) grass. This observation directly demonstrates that color interpolation is processed independently of the luminance channel, which retains sharp detail without interference from the color conflict.

    2. Out of Gamut Stereoscopic Colors

    These image pairs further demonstrate the implications of a global color mixing instruction.

    • Pair 1: Gray Landscape with Small Red/Yellow Details: This pair consists of identical gray landscape images, except for small, near-pixel-sized details. In one image, these details are yellow; in the other, they are red. When viewed stereoscopically, depth and luminance information facilitate rapid binocular fusion, resulting in a unified percept. The small red and yellow details are perceived as a desaturated orange. This desaturation occurs because the global mixing instruction, derived from the dominant gray background, averages the local red and yellow information towards a neutral gray.

    • Pair 2: Complementary Filtered Landscape with Small Red/Yellow Details: This pair uses the same gray landscape as the base image but applies complementary color filters: yellow to the entire left image and violet to the entire right image. The small details remain the same as in the first pair: yellow in the left image and red in the right image. When these images are fused stereoscopically, depth and luminance information again resolve the global color conflict, resulting in a near-achromatic (gray) percept for the overall scene, as expected with complementary colors. However, the small details are now perceived as a highly saturated, bright orange, often exceeding the standard RGB color gamut.

    Explanation of the Enhanced Saturation:

    This enhanced saturation occurs because the small details were not fully integrated into the global color mixing strategy. The global instruction, driven by the large areas of yellow and violet, is to create a neutral gray. However, the small red and yellow details, being spatially limited, do not contribute significantly to this global average. Instead, they retain more of their original chromatic information. Because the background is being driven towards gray by the global mixing instruction, the local contrast of the small red and yellow details against this gray background is increased. This increased contrast, combined with the retained chromatic information, leads to the perception of a highly saturated, non-spectral orange. The red detail information is effectively amplified by the opposing forces of the global color mixing, which is trying to achieve gray. This amplification, combined with the similar effect on the yellow side, results in a highly saturated orange percept that can fall outside the standard RGB gamut.

    Contextual Perception: Tonal and Color Constancy

    In music, tonal constancy describes a phenomenon analogous to color constancy in vision. While these phenomena are not necessarily mediated by identical adaptive mechanisms, familiarity may contribute to their enhancement. In music, tonal constancy, as analyzed subsequently, refers to the brain's ability to interpret musical scales with nominally equal step sizes and neutral intervals (from a diatonic perspective) as exhibiting non-equal steps and resolved intervals when required by the musical context. This can be illustrated with more extreme examples than the stretched diatonic scales discussed earlier. Given any melody in any tuning system, each note can be subjected to a degree of pitch variation without losing its tonal meaning. A specific interval that functions as a minor third in one chord or cadence may be perceived as a major third in a different context or melodic trajectory. Similarly, other intervals can exhibit a superposition of functional roles. A sharpened second, for instance, may function as a melodic minor third but, when transposed an octave higher and combined with a suitable fifth or seventh, can function as a ninth. Tonal constancy is further elucidated with auditory examples later in this analysis.

    Color constancy is related to chroma through the octave color wheel and complementary colors. This visual phenomenon, intimately linked to afterimages, refers to the brain's capacity to interpret the color of objects under varying illumination conditions. The visual system adapts to changes in illumination and takes into account both illumination and material properties to discriminate colors.

    As a consequence of color constancy, when an object is illuminated with light of its complementary color, it is perceived as achromatic (gray or white). Conversely, objectively achromatic objects are perceived as tinted with the complementary color of the illuminating light. This effect can be readily demonstrated on computer screens, further confirming the objective nature of gray and its susceptibility to perceptual adaptation.

    As previously mentioned, familiarity plays a role in shaping these effects. Research has shown that color constancy is more pronounced when the shape and actual color of the object are known; in the absence of such prior knowledge, the perceived hue is less salient.

    (Image.08) The "orange" guitar.

    Another factor indicating the active participation of the brain in this effect is the difficulty in simply simulating it. For color constancy to occur, sufficient contextual cues must be present for the brain to interpret a scene, rather than merely an image. This is analogous to binocular rivalry, where contextual cues resolve perceptual conflict. For example, a pure blue image with a small gray square at its center is typically perceived as a blue background with a gray square; the color constancy effect is not elicited by simple color-gray contrast alone. However, a more realistic scene (Image.08) generates a vivid effect, even with a less saturated "simulated" blue light. The image depicts an "orange" classical guitar, which is objectively gray, with the rest of the scene rendered using a pure RGB blue filter (0, 0, 255).

    Given the influence of familiarity on the strength of color constancy, the subsequent images employ Rubik's Cubes within a scene. Rubik's Cubes are commonly used in color constancy demonstrations because, while viewers may associate them with color, they do not typically associate them with a single, fixed color. This object provides sufficient cues to establish a "natural" scene and elicit the color constancy effect across various complementary color settings.


    (Image.09) Color Constancy and Complementary Colors. This image demonstrates eight configurations of illuminating light and the corresponding perceived complementary color on the gray cube. The effect is sufficiently pronounced that discerning the objectively gray regions of the cubes may require careful observation. Some viewers may even be inclined to download the images to verify that the target areas are indeed gray (RGB 85, 85, 85).


    (Image.10)



    Tonal Constancy

    This auditory phenomenon shares conceptual similarities with color constancy, although the underlying mechanisms and models differ. Analogously, if one considers a set of pitches (musical chromas), such as the diatonic scale, as analogous to a set of colors, these pitches can be substantially shifted and retuned without losing their tonal meaning, just as a set of colors can remain identifiable under varying illumination conditions. The stretching or alteration of notes can be considered analogous to different "illuminations" of the set of musical chromas, which nonetheless retain their identifiable relationships.

    As discussed previously, a key difference exists between musical and visual chromas. The color of light can be perceived based on a single frequency or wavelength. In music, however, chroma is relative; a single isolated note does not possess an inherent "color" but acquires a contextual chroma within a chord or melody. In this sense, the color spectrum functions as a torsor relative to sound. Once a note is incorporated into a harmonic or melodic context, its role and degree are defined by its relative chroma. Each note, therefore, possesses multiple chromas relative to the other notes within the musical context. Consequently, different intervallic configurations of the diatonic scale not only remain functionally viable but can also generate additional chromas through transposition. There is no single "yellow" in this analogy; there are multiple "roots," each with its own set of relative chromas.

    In general, tonal constancy refers to the brain's tendency to interpret musical intervals and progressions within a tonal context, even when the actual intervals deviate from standard tunings. A clear demonstration can be provided using 7-tone equal division of the octave (7-EDO), a tuning system in which no interval other than the octave itself coincides with an interval of 12-tone equal temperament.


    Audio Examples

    The following audio examples present melodies in both 7-EDO and their 12-tone equivalents. Despite the equal step sizes in 7-EDO, the melodies evoke a sense of familiar tonal functions. For example, the second step in 7-EDO, a neutral third of about 343 cents (2 × 1200/7 ≈ 342.86, roughly halfway between a minor and a major third), is often perceived as having either a "major" or "minor" quality depending on the surrounding musical context, such as the implied harmony or the melodic contour. This effect, which persists even with pure sine waves (thereby eliminating harmonic artifacts), demonstrates tonal constancy: the listener's brain interprets the neutral intervals within a tonal framework, resolving them into functionally familiar pitches. When the same progression is rendered in 12-tone equal temperament, the listener/performer naturally resolves each step into the "correct" functional pitch to satisfy the implied cadence.
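
    As a rough numerical check of the neutral third mentioned above, the sketch below computes the size of two 7-EDO steps and compares it with the 12-EDO minor and major thirds. The helper name edo_cents is only illustrative and assumes a 1200-cent octave.

```python
def edo_cents(steps: int, divisions: int) -> float:
    """Size in cents of `steps` steps of an equal division of a 1200-cent octave."""
    return 1200.0 * steps / divisions

neutral_third = edo_cents(2, 7)   # two steps of 7-EDO
minor_third = edo_cents(3, 12)    # 300 cents in 12-EDO
major_third = edo_cents(4, 12)    # 400 cents in 12-EDO

print(round(neutral_third, 2))                # 342.86
print(round(neutral_third - minor_third, 2))  # 42.86 cents above the minor third
print(round(major_third - neutral_third, 2))  # 57.14 cents below the major third
```

    The interval sits between the two familiar thirds, which is consistent with context deciding which function it takes.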


    Ξ Example A - 7edo
    Ξ Example A - 12edo

    Ξ Example B - 7edo
    Ξ Example B - 12edo


    (Further examples exploring this phenomenon in other tuning systems, including more complex modulations, can be found on my YouTube channel.)

    (Image.11) This geometric visualization compares 7-EDO with the diatonic scale in 12-tone equal temperament on a logarithmic scale. Transposition of the 7-EDO structure yields identical intervallic relationships, whereas transposition of the diatonic scale reveals the seven familiar modes of 12-tone music.
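
    The transposition claim in the caption can be sketched by rotating step patterns. The code below is only an illustration: it assumes the diatonic scale is written as the usual 12-EDO semitone pattern 2-2-1-2-2-2-1 and 7-EDO as seven identical steps, and it counts how many distinct rotations each pattern has.

```python
def rotations(steps):
    """Return the set of all cyclic rotations of a step pattern."""
    return {tuple(steps[i:] + steps[:i]) for i in range(len(steps))}

seven_edo = [1] * 7                      # seven equal steps of 1200/7 cents each
diatonic_12edo = [2, 2, 1, 2, 2, 2, 1]   # major scale in 12-EDO semitones

print(len(rotations(seven_edo)))       # 1: every transposition looks the same
print(len(rotations(diatonic_12edo)))  # 7: one pattern per diatonic mode
```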

    This phenomenon raises questions regarding the limits of tonal functions. How much can these intervals be shifted or stretched before they lose their tonal meaning? This is a complex question involving individual perceptual variations and the continuous nature of pitch space. Color, chroma, and hue play a central role in exploring this question within the visual domain.

    While the examples demonstrate how a single interval can serve different functions depending on context—particularly within melodic sequences or trajectories—two additional factors warrant consideration. First, familiarity, while difficult to define and quantify precisely, intuitively influences perception and reinforces tonal constancy. Second, the non-Euclidean nature of pitch space, where the cumulative perception of small intervals can lead to an overestimation of the total perceived distance, contributes to the effect. These factors, combined with the influence of surrounding notes on the perception of otherwise neutral intervals, provide a comprehensive explanation for tonal constancy.


    Sense Adaptation and Aftereffects:

    Both vision and hearing exhibit phenomena related to sense adaptation, often referred to as aftereffects. In vision, this is known as the afterimage, while in hearing, it's called the aftersound. While both are related to sensory adaptation, the mechanisms are quite different.

    Aftersounds (Auditory Aftereffects):

    Aftersounds are auditory phenomena where a residual pitch is perceived after exposure to broadband noise with a rejected frequency band. The perceived pitch corresponds to the logarithmic center of the rejected band.
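
    The "logarithmic center" of a band is the geometric mean of its edges. The following minimal sketch shows the calculation; the band edges (800 Hz and 1250 Hz) are made-up illustrative values, not data from any experiment.

```python
import math

def log_center(f_low: float, f_high: float) -> float:
    """Geometric mean of two frequencies, i.e. the midpoint on a log-frequency axis."""
    return math.sqrt(f_low * f_high)

# Hypothetical rejected band, chosen only for round numbers.
print(log_center(800.0, 1250.0))  # 1000.0
```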

    Afterimages (Visual Aftereffects):

    Afterimages are perceptual phenomena where a residual color appears after the removal of an initial stimulus (the inducer). While the basic mechanism is attributed to the time-integral adaptation of photoreceptor cells (cone fatigue), the phenomenon is more complex than a simple depletion of cone sensitivity.

    Several observations highlight this complexity:
    1. Temporal Integration: The perceived afterimage hue is determined by the total exposure time to the inducer, even if the inducer color changes rapidly. For example, a patch flashing between dark red and yellow will produce the same afterimage as a patch of a spatially mixed red-yellow color, provided the total exposure times are equal. This suggests that the adaptation process integrates the stimulus over time.

    2. Edge/Object Influence: A less-known phenomenon demonstrates that afterimages can be perceived even in areas where the inducing color was not directly present. If a sharp-edged shape (e.g., a star) has only its corners tinted red, the afterimage will fill the entire shape, including the gray central area. This suggests that later visual processing stages, such as edge and object detection, influence the appearance of the afterimage, even though the initial trigger is cone fatigue. This process occurs before stereo mixing, as demonstrated by the fact that each eye retains its individual afterimage when viewing merged images (as in the crossed-eye examples discussed previously).
    These observations suggest that while cone fatigue is the initial trigger for afterimages, later visual processing stages play a role in shaping their appearance.

    Types of Afterimages:

    The common notion of afterimages as simply complementary colors (red-cyan, magenta-green) is an oversimplification. Modern research has revealed more complex mappings between inducer and afterimage hues.

    It's also important to distinguish between two types of negative afterimages:
    • Instant Afterimages: These appear immediately after the removal of the inducer.
    • Delayed/Conflict Afterimages: These require a longer exposure time and a dark environment to emerge. They are often experienced as "negative images" that oscillate and gradually fade.
    Many early studies focused primarily on instant afterimages and attempted to explain them solely through cone fatigue and a simple subtraction of the inducer color from the background. These explanations often failed to account for delayed afterimages, which are a common experience (e.g., staring at a bright light and then closing your eyes). Delayed afterimages are better experienced with natural pigments and daylight, requiring total darkness (covering the eyes) to properly emerge. They also exhibit distinct characteristics, such as oscillation (similar in frequency to binocular rivalry) and sequential appearance when multiple inducer colors are used.

    The Octave Hue Wheel and Afterimage Prediction:

    While the data on inducer and afterimage hues exhibits some variation across studies (due to individual differences and methodological variations), a clear pattern emerges when comparing the empirical data to the logarithmic octave hue wheel. This wheel proves remarkably effective in predicting afterimage hues.

    The most compelling evidence comes from examining specific color pairs:
    • Red and Cyan: Red and cyan are considered complementary colors and also exhibit reciprocal afterimage relationships. While there are minor variations in the precise hues reported in studies, the afterimages consistently fall within the red and cyan bands. Importantly, these colors are located directly opposite each other on both the sRGB wheel and the logarithmic octave hue wheel.
    • Green and Magenta: Green and magenta are also complementary colors and demonstrate reciprocal afterimage relationships. Again, slight variations exist within the reported magenta and green hues. These colors are positioned opposite each other on both the sRGB wheel and the octave wheel (where magenta was intentionally placed opposite green during its construction).
    • Blue and Yellow: Blue and yellow are also complementary. However, their afterimage relationships are not strictly reciprocal in the same way as red/cyan and green/magenta. Blue induces an orange afterimage, while yellow induces a purple afterimage. Conversely, orange induces a blue afterimage, and purple induces a yellow afterimage. This deviation from simple reciprocity is crucial: while blue and yellow are opposite on the sRGB wheel, on the octave wheel, blue is opposite orange, and yellow is opposite purple.



    This observation is key: the logarithmic octave hue wheel accurately predicts these non-reciprocal afterimage relationships. The afterimage of a given inducer hue is consistently located on the opposite side of the wheel. This demonstrates the predictive power of this wheel, which is based on a logarithmic representation of the visible spectrum.




    Chroma Shift: Abney Effect and Stretched Tunings

    While some perceptual parallels between vision and hearing are more abstract, others, such as the chroma shift, can be modeled or at least compared more directly. The Abney effect in vision and the phenomenon that led to stretched tunings for pianos both involve a shift in perceived chroma.

    Abney Effect (Visual Chroma Shift):

    The Abney effect describes a visual phenomenon where adding white light to a monochromatic light (a "unique hue") causes a shift in its perceived hue. For example, adding white light to red light makes the red appear more purplish, even though the spectral composition of the original red light remains unchanged. Similarly, adding white light to green makes it appear bluish.

    [Image]

    Stretched Tunings (Auditory Chroma Shift):

    Stretched tunings for pianos address a similar perceptual phenomenon in the auditory domain. In theory, perfectly tuned octaves should have a precise 1:2 frequency ratio. However, in practice, pianos sound more "in tune" when the higher octaves are stretched slightly—meaning the intervals are made slightly wider than the theoretical 1:2 ratio. This compensates for a perceptual chroma shift: higher frequencies are perceived as slightly flatter than their theoretical equivalents, while lower frequencies are perceived as slightly sharper.

    Connecting the Effects:

    The key to understanding the connection between the Abney effect and stretched tunings lies in considering the spectral composition of white light and the harmonic series of musical tones. We can approximate standard white light as a combination of red, green, and blue light in a ratio similar to a major chord (approximately 4:5:6 in frequency ratios). This means that adding white light to a monochromatic light is analogous to adding multiple harmonics (or partials) to a fundamental tone in music.

    For instance, if red is selected as the origin, a {1:1, 4:5, 2:3} or 4:5:6 ratio could be mapped to approximately 435:543:652 THz. A light containing only these red, green, and blue components would therefore be the visual analogue of a major chord.

    Red   ~ 435 THz = f(1)
    Green ~ 543 THz = f(Red * 5/4)
    Blue  ~ 652 THz = f(Red * 3/2)
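
    The mapping above can be reproduced with a few lines of arithmetic. This is only a sketch: it takes 435 THz as the red origin, as in the text, and applies the 4:5:6 ratios exactly; the variable names are illustrative.

```python
from fractions import Fraction

RED_THZ = 435  # origin frequency for red, taken from the text

# 4:5:6 expressed as ratios relative to the lowest member of the chord.
ratios = {"Red": Fraction(1, 1), "Green": Fraction(5, 4), "Blue": Fraction(3, 2)}

chord = {name: float(RED_THZ * ratio) for name, ratio in ratios.items()}
print(chord)  # {'Red': 435.0, 'Green': 543.75, 'Blue': 652.5}
```

    The exact values 543.75 and 652.5 THz correspond to the 543 and 652 THz quoted above.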


    In a piano timbre, the higher octaves contain more partials, and these partials are more evenly distributed across the frequency spectrum, making it sound more "complex" or "rich". This addition of higher partials is analogous to the addition of white light, which changes the hue. Because of the logarithmic nature of pitch perception, the higher partials are perceived as flatter than they are in a linear frequency space, so stretching the octaves is a way to compensate for this perceptual effect.

    Just as the addition of white light shifts the perceived hue of a monochromatic light towards purple (in the case of red) or cyan (in the case of green), the addition of higher partials in a piano timbre shifts the perceived pitch of higher octaves downwards. This is why stretching the octaves makes them sound more "in tune" to our ears.

    It's important to note that the degree of stretching in piano tunings is relatively small and often imperceptible to untrained listeners, much like the subtle hue shifts in the Abney effect are not always consciously noticed.

    Beat Effect (Auditory Conflict):

    The beat effect, or binaural beats, is an auditory phenomenon that arises from the brain's processing of two slightly different frequencies presented separately, one to each ear. This effect is not due to physical interference in the air or within the ear itself, but rather a result of neural processing in the brainstem. The brain creates a perceived "beat" or pulsation at a frequency equal to the difference between the two presented frequencies.

    While both binocular rivalry and the beat effect involve the brain resolving conflicting sensory information, the mechanisms are different. Binocular rivalry is thought to involve competition between neural representations in the visual cortex, while the beat effect arises from interference patterns in neural activity in the brainstem.

    Analogous to the contextual resolution of binocular rivalry, the beat effect can be diminished or eliminated by introducing additional auditory information. Adding harmonic overtones or even broadband noise to the pure tones can mask the beat, even though the conflicting fundamental frequencies are still present. This suggests that richer auditory contexts can also influence how the brain resolves auditory conflicts.





    Complementary colors:

    By representing the visible spectrum circularly and logarithmically within the wavelength range of 375-750 nm (our octave hue wheel), we can effectively visualize the relationship between inducer hues and their corresponding afterimages. Plotting data from individuals regarding their perceived locations of "unique hues" (red, orange, yellow, green, cyan, blue, violet, better defined as color attractors) on this wheel reveals a clear pattern: each afterimage hue is consistently located approximately opposite its inducer hue.

    This relationship holds true for the canonical complementary color pairs: red-cyan, orange-blue, and yellow-purple. While individual variations exist in the precise locations of these hues, they consistently fall within their respective color bands. This symmetrical disposition of unique hues on the wheel allows for a degree of predictability. For example, if the positions of red, orange, and yellow are known for a given individual, the positions of cyan, blue, and purple can be predicted by locating the points directly opposite them on the wheel. This corresponds to a wavelength ratio of approximately 1:√2 (or √2:1 when mapping back into the visible range).
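
    The "directly opposite" relation can be sketched as a half-octave shift followed by a fold back into the 375-750 nm range, which is the same octave-reduction idea applied to wavelength. The function names below are illustrative, and the 700 nm input is just an example value for a red near the long-wavelength end.

```python
import math

LOW_NM, HIGH_NM = 375.0, 750.0  # one wavelength "octave", as used for the hue wheel

def wheel_angle(wavelength_nm: float) -> float:
    """Angle in degrees on the logarithmic wheel; 375 nm and 750 nm coincide at 0."""
    return (360.0 * math.log2(wavelength_nm / LOW_NM)) % 360.0

def opposite(wavelength_nm: float) -> float:
    """Wavelength half an octave away, folded back into [375, 750) nm."""
    shifted = wavelength_nm * math.sqrt(2.0)
    return shifted / 2.0 if shifted >= HIGH_NM else shifted

red = 700.0
print(round(opposite(red), 1))                                       # 495.0
print(round((wheel_angle(red) - wheel_angle(opposite(red))) % 360))  # 180
```

    Under these assumptions a 700 nm red lands at about 495 nm, inside the cyan region, in line with the red-cyan pair described above.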

    This raises an interesting question: can any unique hue be predicted independently, without relying on the location of other hues? Green emerges as a potential candidate. Its position on the octave hue wheel is essentially defined by the logarithmic center of the visible spectrum. Because green is positioned opposite non-spectral magenta (which can be experienced through afterimages or red-violet mixing), its location is dictated by the extreme edges of the visible spectrum. In effect, the "invisible" portions of the electromagnetic spectrum beyond the visible range define the center point (green).

    This independent predictability of green is further supported by empirical observations. For most individuals, the perceived edges of the green band (where green transitions to yellowish or bluish hues) correspond closely to the perceived limits of their visible spectrum—the points where red and violet disappear.

    For example, using the approximate wavelength ranges for Newton's "principal" hues (which, as discussed, are modern interpretations of his qualitative descriptions), red starts around 700 nm and violet ends around 400 nm. The corresponding range for green is approximately 500-570 nm. Applying the 1/√2 ratio (approximately 0.7071) to the long wavelength edge of green (570 nm) yields approximately 403 nm, very close to the perceived limit of violet. Similarly, multiplying the short wavelength edge of green (500 nm) by √2 yields approximately 707 nm, close to the perceived limit of red. This demonstrates that the edges of the green band can effectively predict the perceived limits of the visible spectrum, reinforcing its central and independently defined position on our hue wheel.
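
    The arithmetic in this example is easy to verify. The sketch below uses only the approximate band edges quoted above (500-570 nm for green, 400-700 nm for the visible range); the variable names are illustrative.

```python
import math

green_short_nm, green_long_nm = 500.0, 570.0  # approximate green band from the text

violet_limit = green_long_nm / math.sqrt(2.0)  # long green edge scaled by 1/sqrt(2)
red_limit = green_short_nm * math.sqrt(2.0)    # short green edge scaled by sqrt(2)
center = math.sqrt(400.0 * 700.0)              # logarithmic center of 400-700 nm

print(round(violet_limit, 1))  # 403.1
print(round(red_limit, 1))     # 707.1
print(round(center, 1))        # 529.2
```

    The logarithmic center of the 400-700 nm range, about 529 nm, falls inside the quoted green band.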

    The Color Continuum

    The existence and arrangement of distinct color appearances can be logically deduced from a few fundamental axioms defining color as a phenomenon. This deduction relies solely on the concepts of continuity and achromatism, independent of the physical, biological, or perceptual nature of light or vision. It addresses the abstract essence of color differentiation.

    This model captures color as a foundational map or primal space, from which it is expressed as different subsets in various realms of reality—from abstract color spaces to physical, biological, neurological, subjective, and perceptual layers. These layers represent potentially discrete points of transformation of color information from the underlying continuous experience. However, like colors themselves, these points are not easily separable. Where does the physical layer end and the biological begin? Do electromagnetic waves define the physical, or is it the biological mechanism of photoreception? These boundaries inevitably blur with neurological processes and the conscious phenomena of perceptual and subjective experience. This abstract definition allows color to exist even without strictly defined discrete layers for its manifestation.

    The phenomenon of color naturally leads us to assign discrete names to regions of the continuum. At a certain point along this continuum, the change becomes significant enough to warrant a new categorical designation. This process is influenced by perceptual thresholds and evolutionary pressures to identify certain attractors within the continuum. These attractors—hues that serve as reference points for our perception and language—lead us to name specific color characteristics like red, green, and blue, while other appearances are named for their mixtures, such as yellowish-orange. This categorization leads us to intuitively define certain colors as "unique" or "primary." Yet, colors are never truly unique in the sense of being irreducible; they arise through transformations and mixtures within the continuous underlying space.

    Refining the Concept of Unique Hues

    To begin, we must refine the notion of "unique hues" or "primary colors," separating it from the concept of color attractors. The intuitive definition of a unique hue as a color "without any tint of another" is flawed. Instead, we define co-unique hues relationally:

    Definition of Co-Unique Hues: A pair of hues (A and A⁻¹) are co-unique if and only if neither hue contains any component of the other. They are mutually exclusive in their composition and appearance, representing inverse transformations with respect to achromatism. This definition is independent of qualia, physical properties, spectral location, or naming conventions.

    Axioms of Color

    From this definition, we derive the following axioms:

    1. Achromatism (Complementarity): Color arises from the decomposition of achromatic (white/gray) light. This implies that colors must exist in at least pairs of complementary hues (A and A⁻¹) that, when combined in equal proportions, reconstitute achromatic light. This process is termed achromatization.

    2. Continuity: Hues must transition smoothly into one another, forming a continuous and cyclic spectrum. There can be no discontinuities or gaps.

    Deduction of the Hue Continuum Structure

    Necessity of at Least Four Hues

    The decomposition of achromatic light yields at least one co-unique pair (A and A⁻¹). This creates a rudimentary two-hue spectrum (A – [achromatic point] – A⁻¹), which violates continuity because the transition between A and A⁻¹ necessarily passes through an achromatic point, creating a discontinuity. To resolve this, at least one additional co-unique pair (B and B⁻¹) is required, positioned such that transitions do not lead to achromatization. This establishes a minimum four-hue structure: A – B – A⁻¹ – B⁻¹ – A.

    Derivation of Transition Hues

    While the transitions between these four fundamental hues are continuous, distinct points represent balanced mixtures. These transitional hues are doubly justified: they arise from the mixture of two adjacent fundamental hues (e.g., AB) and as complements of the opposite mixture (e.g., A⁻¹B⁻¹). This gives rise to four additional hues, each located between the initial four.

    Maximum of Eight Most Distinct Hues

    Starting with any co-unique pair and applying the principles of continuity and achromatism, we necessarily arrive at a structure with four fundamental hues and four transition hues, yielding a maximum of eight distinct hue categories. Adding further distinct hues would either create discontinuities or violate the principle of complementarity. Further subdivision of the continuum results in finer variations of these eight categories, not "new" distinct hues.
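
    The eight-hue structure can be written out as a small sketch. The labels are placeholders: lowercase "a" and "b" stand for A⁻¹ and B⁻¹, and evenly spaced positions on the cycle are an assumption made only for illustration. The code interleaves the four fundamental hues with their transition hues and checks that each complement sits on the opposite side of the cycle.

```python
FUNDAMENTAL = ["A", "B", "a", "b"]  # cyclic order A, B, A^-1, B^-1 from the text

def hue_cycle(fundamental):
    """Interleave each fundamental hue with the mixture of it and its successor."""
    cycle = []
    n = len(fundamental)
    for i, hue in enumerate(fundamental):
        successor = fundamental[(i + 1) % n]
        cycle.append(hue)
        cycle.append(hue + successor)  # transition hue between neighbours
    return cycle

def complement(cycle, hue):
    """Hue located half a cycle away from the given hue."""
    i = cycle.index(hue)
    return cycle[(i + len(cycle) // 2) % len(cycle)]

cycle = hue_cycle(FUNDAMENTAL)
print(cycle)                    # ['A', 'AB', 'B', 'Ba', 'a', 'ab', 'b', 'bA']
print(complement(cycle, "A"))   # a   (A is opposite A^-1)
print(complement(cycle, "AB"))  # ab  (AB is opposite the mixture of A^-1 and B^-1)
```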

    Logical Conclusions

    This framework predicts the existence of four fundamentally different unique hues as a requirement for a cyclic continuum of appearances. This prediction is independent of any specific color space or physical implementation and provides a fundamental explanation for the discrete nature of our color categorization. Starting with any arbitrary co-unique pair will necessarily generate the same fundamental four-hue structure to satisfy the axioms of achromatism and continuity. These hues, defined relationally, form the basis for our conventional color categories.

    Why Not Three Components?

    A three-component system, while able to traverse the hue cycle, cannot derive the structure of that cycle from the fundamental axioms of achromatism and continuity. It relies on pre-existing perceptual information about the existence and arrangement of hues.

    1. The concept of co-unique hues, where neither hue contains any component of the other, is impossible to define in a three-component system.

    2. The four-component model, based on co-unique hues and the axioms of achromatism and continuity, provides a logical foundation for the existence and arrangement of distinct hue categories. A three-component system, while perceptually useful, lacks this logical basis and assumes the structure of the hue cycle rather than deriving it.

    Conclusion

    This model seeks to explain why we perceive the hues we do, not just describe how we represent them. The four-component model derives the structure of the hue cycle from first principles, while a three-component model assumes it. Therefore, the notion of "new hues" arising independently is unnecessary. The logical framework, grounded in the axioms of achromatism and continuity, demonstrates that no further fundamental hues are required to complete the continuum.


    Co-Uniques (Complementaries) as generating sets

    The Problem with "Unique Hues" and "New Colors"

    The idea of a "new color" is undoubtedly fascinating and worth exploring. However, much of the publicly available information—ranging from online encyclopedias to research papers—on tetrachromacy and animal color vision often lacks rigorous support and misinterprets key concepts.

    A fundamental problem in color science is the absence of a clear, universally accepted definition of "unique hue." This ambiguity leads to overinterpretation, especially in studies of animal vision and human tetrachromacy.

    The Problem with "Unique Hues"

    The traditional definition of "unique hues" as colors "without any tint of another" is inherently subjective and doesn't hold up to scientific scrutiny. Instead, a relational definition—where co-unique hues are defined as hues that mutually cancel each other—is a more robust and useful concept. This approach aligns with color's symmetry and abstract models, providing a clearer framework for understanding the phenomenon.

    Misconceptions About "New Hues"

    The concept of "new hues" or "new colors" is problematic for several reasons. It suggests the existence of entirely discrete, qualitatively different color sensations that lie outside human experience. Yet, without a precise definition of "hue" tied to physical, neural, and abstract color models, such claims are difficult to validate.

    Shrimp Color Vision

    The claim that mantis shrimp, with their 12 to 16 photoreceptor types, experience colors "beyond human imagination" often stems from a misunderstanding of how photoreceptors translate to color perception. Having more photoreceptor types does not inherently mean perceiving more colors or finer spectral distinctions.

    Shrimp vision operates on the same principles as human trichromatic vision, with overlapping spectral sensitivities forming a continuous mapping between wavelength and receptor stimulation. The "colors" perceived by shrimp are determined by the relative responses of their photoreceptors, not by discrete, otherworldly hues.

    Claiming that shrimp can see more colors because they have more photoreceptors is akin to saying spiders move in more spatial dimensions because they have more legs. The analogy highlights the flawed reasoning behind such assumptions.

    Human Tetrachromacy

    Similar misconceptions surround human tetrachromacy. Even if a person has four cone types, it does not guarantee the perception of a completely new and unimaginable color. The essential question is: What is the complementary hue of this supposed new hue?

    Complementary colors are not just hues that cancel each other; they are co-unique attractors in a symmetrically balanced color space. For every hue we perceive, there exists a complementary hue that is equally distinct. This balance suggests that adding a "new hue" would require an additional complementary hue to maintain symmetry. Without these, the concept of a "new hue" disrupts the coherence of the continuum.

    The Four-Component Model

    The four-component model provides a logical framework to address these issues. By assuming continuity and compactness of the color continuum, it concludes that the number of distinct hue categories must be finite and even. Any further subdivision creates variations of existing categories, not fundamentally new hues.

    Conclusion

    The claims about shrimp vision and human tetrachromacy often ignore fundamental constraints of color theory. They overemphasize the role of photoreceptors without addressing how they map to a coherent and continuous color space. A "new color" does not fit into this continuum without disrupting its structure, symmetry, and balance.

    In essence, the color space as we understand it—defined by relational, continuous, and symmetrical properties—leaves no room for discrete, standalone "new hues." Instead, every color we perceive fits into a logically constrained, interdependent system.


    Applying the Axioms to the Visual Experience

    This abstract model, based on co-unique pairs and the principles of achromatism and continuity, provides a powerful framework for understanding the existence and arrangement of hues. We can now apply this framework to the specific context of human visual experience, demonstrating how it predicts the distribution of the primary unique hues within the visible spectrum.

    As previously established, green occupies a unique and independently predictable position. It resides at the logarithmic center of the visible spectrum, defined by its approximate boundaries of 375 nm and 750 nm. This central placement, determined by the physical limits of human vision, makes green a foundational anchor point.
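    The "logarithmic center" of a range is its geometric mean, that is, the point half an octave from either boundary when the range spans one octave. As a minimal sketch, taking the approximate 375 nm and 750 nm boundaries stated above as given (the variable names are only illustrative):

```python
import math

# Approximate limits of the visible range, as stated in the text.
lower_nm = 375.0
upper_nm = 750.0

# The logarithmic center is the geometric mean of the two limits.
center_nm = math.sqrt(lower_nm * upper_nm)

# Equivalent view: half an octave above the lower limit.
half_octave_nm = lower_nm * 2 ** 0.5

print(round(center_nm, 1))  # about 530.3
```

    The result, roughly 530 nm, falls in the green region of the spectrum, which is the sense in which green sits at the center of a one-octave visible range.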

    Given the axiom of continuity—that hues must transition smoothly into one another—there must be hues on either side of green. At the extreme edges of the visible spectrum, we observe two distinct hues: red and violet. These two hues, in order to maintain continuity, must blend or fuse, creating the non-spectral hue magenta. This magenta, by the axiom of complementarity, is the complementary hue of green.

    Now, applying the axiom of complementarity to red and violet, we deduce the existence of their respective complementary hues: cyan and yellow. These hues naturally emerge adjacent to green on the hue wheel, positioned such that they are complementary to red and violet.

    It's important to note that we did not begin with a co-unique pair as in the purely abstract deduction. Instead, we started with the central hue (green) and the extreme edges of the visible spectrum (red and violet), which define the complementary non-spectral hue magenta. From these three anchor points (green, red/magenta, and violet/magenta), the requirement of complementarity alone yields cyan and yellow as the transitions between green and red/violet. This process generates six distinct hues: red, yellow, green, cyan, violet, and magenta.

    The remaining hues, orange and blue, emerge naturally as a consequence of the continuity axiom. Orange fills the continuous transition between red and yellow, while blue fills the continuous transition between cyan and violet. These transitions do not lead to achromatization, as each hue has sufficient distinctness to avoid blending into gray with its neighbor.

    This process demonstrates how, starting with the independently defined position of green and the boundaries of the visible spectrum, the axioms of complementarity and continuity necessitate the existence and approximate positions of all the primary unique hues. No further unique hues can be added without violating these fundamental principles.


    This abstract color model, based on the logical necessity of co-unique pairs and a continuous spectrum, aligns remarkably well with physical reality, demonstrating a consistent mathematical structure rooted in logarithmic perception and octave-like cycles. This suggests that color, at a fundamental level, may possess an objective basis. However, to fully bridge the gap between this abstract model and our subjective experience, we must consider the biological mechanisms of human vision.

    The model predicts four relative primaries (two co-unique pairs) from which all other hues can be derived. This prediction is reflected in the limitations of three-color additive (RGB) and subtractive (CMY) color spaces. These spaces struggle to accurately reproduce the full gamut of colors, particularly highly saturated oranges and purples. This is because mixing two widely separated hues on the color cycle results in desaturated colors, effectively reducing the achievable gamut. This limitation arises directly from the fact that only three primaries are used, whereas this model demonstrates that four are necessary to fully encompass the color continuum.

    However, it's crucial to understand that any single point on the color continuum is composed of a mixture of at most three of these four relative primaries, not all four simultaneously. This is a direct consequence of how these primaries were logically derived: each transition involves only two adjacent hues.

    This observation leads to a hypothesis about the function of the three types of cone photoreceptors in the human retina. Rather than directly encoding opponent color channels (red-green, blue-yellow), let's consider that each cone type primarily records the intensity of light it absorbs across a broad range of wavelengths. Opponent processing, then, occurs at a later stage, where neural circuits calculate the proportions of stimulation between the different cone types. This proportional encoding allows the visual system to "retrieve" or infer the approximate wavelength or spectral characteristic of the incoming light, which is then processed as a specific visual experience.

    The model elegantly explains metamerism: different spectral distributions of light can produce the same relative cone stimulation ratios, leading to identical color perceptions despite different physical stimuli. It also explains metameric failure: when the spectral distributions are sufficiently different, the cone stimulation ratios might diverge, leading to different perceived colors.
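    A toy calculation can make the metamerism argument concrete. The sketch below assumes a spectrum divided into four wavelength bins and uses made-up cone sensitivities (the numbers in SENSITIVITY are hypothetical, not measured data). Two visibly different spectra produce identical cone stimulation, while a third one does not.

```python
# Hypothetical cone sensitivities over four wavelength bins
# (toy numbers chosen for illustration, not measured data).
SENSITIVITY = {
    "L": [0, 1, 3, 2],
    "M": [1, 3, 2, 0],
    "S": [3, 1, 0, 0],
}

def cone_responses(spectrum):
    """Return each cone's total stimulation for a binned spectrum."""
    return {
        cone: sum(s * p for s, p in zip(weights, spectrum))
        for cone, weights in SENSITIVITY.items()
    }

def cone_ratios(spectrum):
    """Return each cone's share of the total stimulation."""
    responses = cone_responses(spectrum)
    total = sum(responses.values())
    return {cone: value / total for cone, value in responses.items()}

flat = [10, 10, 10, 10]   # equal power in every bin
spiky = [12, 4, 18, 1]    # very different shape, same cone responses
other = [10, 10, 10, 0]   # differs enough to change the ratios

print(cone_responses(flat))   # {'L': 60, 'M': 60, 'S': 40}
print(cone_responses(spiky))  # {'L': 60, 'M': 60, 'S': 40}
print(cone_responses(other))  # {'L': 40, 'M': 60, 'S': 40}
```

    Under these assumptions, flat and spiky are metamers because their cone ratios coincide, whereas other yields different ratios and would be perceived as a different color.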

    In dichromats, the absence of one cone type limits the system to two dimensions of color information. They experience only two unique hues and their mixtures, which blend into gray. In terms of our model, they are essentially trapped in a two-hue cycle, unable to access the transitions that give rise to the full color continuum. This is the paradox of breaking the continuity axiom: they have no way to distinguish the different hues and their transitions.

    The role of the third cone is thus not merely to add a third color dimension but to provide the necessary reference point for calculating proportions. Just as the proportion between two mountain heights changes drastically if the sea level changes, the relative stimulation of two cone types is meaningless without a third reference point. Each cone pair needs a third cone as a "sea level" to determine the relative proportions and, therefore, the perceived color.
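    The sea-level analogy can be checked with simple arithmetic. In this sketch the peak heights and the helper name height_ratio are invented for illustration only:

```python
def height_ratio(peak_a_m, peak_b_m, sea_level_m=0.0):
    """Ratio of two peak heights measured from a given sea level."""
    return (peak_b_m - sea_level_m) / (peak_a_m - sea_level_m)

# Same two peaks, two different reference levels.
print(height_ratio(1000.0, 2000.0))                     # 2.0
print(height_ratio(1000.0, 2000.0, sea_level_m=500.0))  # 3.0
```

    The peaks themselves did not change, yet their proportion went from 2 to 3 once the reference moved. This is the role the text assigns to the third cone.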

    We can further illustrate this using the analogy of a wavelength-detecting machine. Instead of using sharp cut-off filters to create distinct color channels (as in some cameras), the receptors in our eyes act more like filters with overlapping response curves. This allows us to analyze the interaction and find the optimal setting for the peak responses and their spread, ensuring that no two different wavelengths produce the same overall response profile across the three cone types.
     

    (draft)
    Physically-Based, Octave-Modeled, Logarithmic Hue Wheel


    The CIE XYZ 1931 color model remains a foundational tool for accurate color representation. Recent advancements, such as continuous functions for cone responses, have replaced empirical tables with computationally efficient methods, maintaining accuracy while improving usability. Leveraging these tools, we can construct a physically grounded logarithmic hue wheel.

    The following sequence of graphics illustrates the construction of this hue wheel, providing a more intuitive visualization than the abstract group mappings presented in the mathematical section. It also shows which region is assigned to non-spectral magenta and how; interestingly, that region has ample "space" to close the wheel smoothly.

    While electromagnetic waves are often described in terms of wavelengths in color science, music theory typically focuses on frequency ratios. To bridge these perspectives, the graphics and accompanying explanations incorporate both metrics, referencing each value as appropriate for clarity and context.
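    For readers who want to move between the two metrics, the conversion is just the relation between frequency and vacuum wavelength, f = c / λ. A minimal sketch (the function names are my own, chosen for illustration):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by definition of the metre

def nm_to_thz(wavelength_nm):
    """Convert a vacuum wavelength in nanometres to a frequency in terahertz."""
    return SPEED_OF_LIGHT_M_PER_S / (wavelength_nm * 1e-9) / 1e12

def thz_to_nm(frequency_thz):
    """Convert a frequency in terahertz to a vacuum wavelength in nanometres."""
    return SPEED_OF_LIGHT_M_PER_S / (frequency_thz * 1e12) * 1e9

for wavelength_nm in (375, 529, 750):
    print(wavelength_nm, "nm ->", round(nm_to_thz(wavelength_nm), 1), "THz")
```

    With these definitions 750 nm comes out at roughly 400 THz and 375 nm at roughly 800 THz. Halving the wavelength doubles the frequency, so an octave looks the same in either metric, only mirrored.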

    Key Graphics and Steps:

    1. Linear Spectrum (200–1600 THz): A linear representation of the electromagnetic spectrum, highlighting the visible range (~400–750 THz) and including black regions beyond visible light for reference.


    2. Octave Doubling: Frequencies are repeated at factors of \(2^k\), generating three "rainbows" separated by black gaps. Non-spectral magenta, which does not exist in the physical spectrum, will be placed within these gaps.


    3. Logarithmic Scale: The three rainbows are equalized in size by compressing the scale logarithmically.


    4. Magenta Addition: A normal distribution curve fills the gaps, smoothly blending red and blue without altering their intensity. The magenta in this model is defined as the midpoint on the purple line in the CIE XYZ chromaticity diagram. This choice ensures a perceptually balanced transition between red and blue in the logarithmic hue wheel.


    5. Hue Wheel: A single rainbow from magenta to magenta forms a continuous logarithmic hue wheel.
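    The octave doubling and logarithmic scaling steps above amount to taking the base-2 logarithm of a frequency and keeping only its fractional part. The following is a rough sketch of that mapping, not the procedure used to render the graphics. The 400 THz reference and the name hue_angle are assumptions made for the example.

```python
import math

def hue_angle(freq_thz, ref_thz=400.0):
    """Map a frequency to an angle in degrees on a one-octave logarithmic wheel.

    ref_thz is an assumed reference placed at 0 degrees.
    """
    octaves = math.log2(freq_thz / ref_thz)
    # Dropping whole octaves makes f, 2f, f/2, ... share the same angle.
    return (octaves % 1.0) * 360.0

print(hue_angle(200.0), hue_angle(400.0), hue_angle(800.0))  # 0.0 0.0 0.0
print(round(hue_angle(400.0 * math.sqrt(2)), 6))             # 180.0
```

    Frequencies an octave apart land on the same angle, which is why the three "rainbows" collapse into a single wheel, and a frequency half an octave above the reference lands directly opposite it.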


    Technical Details:

    While the graphic displays wavelengths in the visible range (375–750 nm) with markings increasing in 25 nm increments, precise placements are based on spectral and perceptual considerations. Magenta is positioned opposite green (at 529 nm) on the wheel, at wavelengths calculated as follows:
    • \(M_1 = 529 \times \frac{1}{\sqrt{2}} \approx 374\) nm
    • \(M_2 = M_1 \times 2 \approx 748\) nm
    This places magenta at approximately 374 nm and 748 nm.
    The placement of magenta as the midpoint on the purple line of the XYZ chromaticity diagram provides a critical anchor for the non-spectral colors in the wheel. The selection of green as the reference point is also significant. It represents the median of the visible spectrum, aligns closely with the peak of the sun’s emission, and is consistently identified as a distinct hue in color science. Together, these decisions create a perceptually balanced and physically accurate hue wheel.
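    The magenta positions can be verified numerically. This short sketch only reproduces the arithmetic given above, with 529 nm taken from the text as the green reference:

```python
import math

GREEN_NM = 529.0  # reference green used in the text

# Half an octave from green toward shorter wavelengths.
magenta_short_nm = GREEN_NM / math.sqrt(2)

# One octave above that position, on the long-wavelength side.
magenta_long_nm = magenta_short_nm * 2

print(round(magenta_short_nm), round(magenta_long_nm))  # 374 748
```

    Green is then the geometric mean of the two magenta positions, which is what "opposite" means on a logarithmic wheel.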
