What appears random to one observer may not be random to another. Randomness, therefore, is not an intrinsic property of a sequence or system, but a relational one: it depends on the observer’s predictive capacity.
Let's define randomness not as a property of data alone, but of the interaction between data and a predictive structure: randomness exists when the predictor persistently receives maximally surprising outcomes, even as it adapts.
Consider a Bayesian-like predictive system. Such a system is designed to anticipate data (numbers, symbols, events) under fixed constraints (precision, range, discreteness, digital representation). It begins with priors: expectations about values, cycles, distributions, or patterns. With each new observation, the system updates its beliefs according to some learning or adaptation rule, possibly even rewriting its priors. The system may be arbitrarily simple or arbitrarily complex.
-Data: Let \(X = (x_1, x_2, \dots)\) be a (finite or infinite) sequence over an alphabet \((\mathcal{A})\).
-Predictor (Inference System): A predictor \((P)\) is a triple \(P = (\Pi, U, L)\), where:
- \((\Pi_t)\): the internal state at time (t) (priors / beliefs)
- \((U)\): an update rule \( \Pi_{t+1} = U(\Pi_t, x_t)\)
- \((L)\): a prediction rule that assigns a probability distribution \( P_t(\cdot) = L(\Pi_t)\)
(This includes Bayesian predictors, ML models, rule-based systems, and human intuition, in principle.)
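To make the triple concrete, here is a minimal sketch in Python, assuming a finite alphabet and the simplest possible choice of \((U)\) and \((L)\): a frequency counter with uniform pseudo-count priors. The class name and interface are illustrative, not canonical.

```python
from collections import Counter


class FrequencyPredictor:
    """Predictor P = (Pi, U, L): the state Pi_t is a table of symbol counts,
    U increments the observed symbol's count, and L normalizes the counts
    into a probability distribution."""

    def __init__(self, alphabet, prior_count=1.0):
        self.alphabet = list(alphabet)
        # Pi_0: uniform pseudo-counts (the prior)
        self.counts = Counter({a: prior_count for a in self.alphabet})

    def predict(self):
        """L(Pi_t): return P_t(.) as a dict of probabilities."""
        total = sum(self.counts.values())
        return {a: self.counts[a] / total for a in self.alphabet}

    def update(self, x):
        """U(Pi_t, x_t): incorporate the new observation."""
        self.counts[x] += 1.0
```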
Under this view, a sequence is truly random for that system when each successive value is consistently among the least expected outcomes according to the system’s current state. Randomness is thus defined relative to a model: it is the data that maximally violates the model’s expectations over time.
-Instantaneous Surprise: Define surprise at time (t) as: \(S_P(x_t) = -\log P_t(x_t)\) (standard information-theoretic surprise).
-Maximal Failure Condition: Let: \(S_{\max}(t) = \max_{a \in \mathcal{A}} \left( -\log P_t(a) \right)\)
-Relative randomness score: \(R_P(x_t) = \frac{S_P(x_t)}{S_{\max}(t)} \in [0,1]\)
- \((R_P(x_t) \approx 1)\): least-expected outcome
- \((R_P(x_t) \approx 0)\): highly predictable
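Both quantities are easy to compute once a predictor exposes its current distribution \(P_t\). A minimal sketch, assuming strictly positive probabilities so the logarithms stay finite:

```python
import math


def surprise(p_t, x):
    """Instantaneous surprise S_P(x_t) = -log P_t(x_t)."""
    return -math.log(p_t[x])


def relative_randomness_score(p_t, x):
    """R_P(x_t) = S_P(x_t) / S_max(t), which lands in [0, 1]."""
    s_max = max(-math.log(p) for p in p_t.values())
    return surprise(p_t, x) / s_max if s_max > 0 else 0.0


# Example: a predictor that strongly expects "a"
p_t = {"a": 0.90, "b": 0.05, "c": 0.05}
print(relative_randomness_score(p_t, "a"))   # ~0.035: highly predictable
print(relative_randomness_score(p_t, "b"))   # 1.0: a least-expected outcome
```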
Now the key definition.
-Relative Randomness (Sequence-Level)
A sequence \((X)\) is relatively random with respect to predictor \((P)\) if: \(\liminf_{T \to \infty} \frac{1}{T} \sum_{t=1}^T R_P(x_t) = 1\)
Informally: The predictor persistently receives maximally surprising outcomes, even as it adapts.
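In practice one can only inspect a finite prefix, so the empirical analogue is a running mean of \(R_P\) that should stay near 1 for longer and longer prefixes. A sketch, assuming the hypothetical predict()/update() interface above:

```python
import math


def mean_relative_randomness(xs, predictor):
    """Empirical version of (1/T) * sum_t R_P(x_t) on a finite prefix.
    Assumes predictor.predict() returns strictly positive probabilities
    and predictor.update(x) applies the rule U after each observation."""
    total = 0.0
    for x in xs:
        p_t = predictor.predict()
        s_max = max(-math.log(p) for p in p_t.values())
        total += (-math.log(p_t[x]) / s_max) if s_max > 0 else 0.0
        predictor.update(x)
    return total / len(xs)
```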
Relative Randomness and Infinite Generators
Under this definition, true relative randomness exists. One can generate infinitely many random sequences by defining infinitely many predictive systems with different priors and update rules. Each system creates its own form of chaos.
Define a generator \((G_P)\) that outputs: \(x_t = \arg\max_{a \in \mathcal{A}} \left( -\log P_t(a) \right)\), i.e., the symbol to which \((P)\) currently assigns the lowest probability.
This is the “least expected value” generator: essentially an adversary with full access to the learner’s state, whose only objective is persistent maximal surprise.
As priors evolve, their trajectories may themselves appear random. From the perspective of the original system, the data stream becomes increasingly alien and unpredictable; it has no idea what is coming next.
Now \((G_P)\) creates relative randomness for \((P)\), but not for a meta-predictor \((P')\) that models \((P)\).
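A sketch of \((G_P)\) as a plain loop over the predictor’s current distribution (same hypothetical interface as above; ties are broken arbitrarily):

```python
def least_expected_stream(predictor, n_steps):
    """Generator G_P: at each step, emit the symbol the predictor currently
    assigns the lowest probability, then let the predictor update on what
    it just saw."""
    stream = []
    for _ in range(n_steps):
        p_t = predictor.predict()
        x_t = min(p_t, key=p_t.get)   # argmax of surprise = argmin of P_t
        stream.append(x_t)
        predictor.update(x_t)
    return stream


# e.g. with the FrequencyPredictor sketched earlier:
#   least_expected_stream(FrequencyPredictor("abc"), 9)
# settles into a simple cycle a, b, c, a, b, c, ... because each observation
# raises that symbol's estimated probability.
```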
Meta-Systems and Paradox
This idea immediately encounters familiar paradoxes. If a system knows that the next value will be its worst prediction, it can simply adjust its guess. But this adjustment constitutes a new system, one that now incorporates knowledge of the previous system.
This leads to meta-systems: systems that predict other systems’ predictions. One could imagine multiple competing predictors, or even a super-system that outputs the least expected value across all of them. This hierarchy can continue indefinitely, producing a fractal regress of prediction and counter-prediction.
Let: \( P^{(k+1)} = \text{Predictor that models } (P^{(k)}, G_{P^{(k)}}) \)
Then no finite level achieves absolute randomness: randomness is strictly relational, and predictive power decays with meta-depth.
The situation resembles game-theoretic adversaries, no-free-lunch theorems, and epistemic paradoxes such as recursive knowledge collapse (“I know that you know that I know…”), where predictive utility decays to zero.
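To see the collapse at the next level concretely: a meta-predictor \((P')\) that knows \((P)\)'s update rule can simulate it step by step, and therefore already knows which symbol \((G_P)\) will emit next. A sketch under that assumption (the smoothing constant is only there to keep surprise finite; at least two symbols are assumed):

```python
class MetaPredictor:
    """P': keeps its own simulated copy of P and replays P's update rule,
    so it can anticipate the 'least expected' symbol that G_P will emit."""

    def __init__(self, simulated_inner):
        self.inner = simulated_inner   # a private copy of P's state

    def predict(self):
        p_t = self.inner.predict()
        target = min(p_t, key=p_t.get)   # the symbol G_P will output next
        eps = 1e-6                       # smoothing keeps surprise finite
        rest = eps / (len(p_t) - 1)
        return {a: (1.0 - eps if a == target else rest) for a in p_t}

    def update(self, x):
        self.inner.update(x)   # keep the simulation synchronized with P
```

Against \((P')\), the same stream scores \(R_{P'} \approx 0\) at every step: the randomness was relative to \((P)\), not a property of the data.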
Algorithmic Complexity
From the standpoint of Kolmogorov complexity, such sequences may appear non-random. If the generating program can be arbitrarily large while the output grows without bound, the output sequence may have low algorithmic complexity in principle.
But this raises a practical question: if one is only given the data, how does one recover the generating program? Inference becomes increasingly difficult. Applying a Bayesian reconstruction may perform even worse, since the data was explicitly generated to defeat prediction. The original program may be effectively unrecoverable.
Let \(K(X)\) be Kolmogorov complexity:
\(K(X) \text{ small } \nRightarrow X \text{ predictable}\). Relative randomness depends on the recoverability, not the existence, of the generator.
We can define relative incompressibility: \( K_P(X) = \min \{\, |M| : M \text{ allows } P \text{ to predict } X \,\} \)
For such sequences, \((K(X))\) may be low while \((K_P(X))\) is high or undefined (the program exists but is unrecoverable).
Order, Disorder, and Expectation
This framing separates randomness from the usual notions of order and disorder, which are often conflated. A sequence may appear highly ordered yet still be extremely unexpected.
For example, a sequence of 100 zeros may look ordered to a human observer, yet in many contexts it is highly improbable (much like 100 consecutive red outcomes in roulette). Order does not negate randomness if the sequence persistently defeats expectation.
Mismatch of Models
Take certified randomness in quantum computing. Even if outcomes are statistically independent, an observer who knows the typical distribution can still make approximations or occasional correct guesses. The predictions may “look like” the data, even if they fail in detail.
Now imagine a trick: instead of quantum data, the observer is given an image of a flower. Applying statistical tests designed for quantum randomness to such data is a category error, but the observer may never realize it. The failure arises not from the data itself, but from the mismatch between the data and the predictive model.
Randomness, then, is a function of expectation and adaptation. There is always, for any system, a least-predicted outcome. Data that consistently occupies that region is maximally random for that system.
-----
Beyond theory, regardless of the philosophical or mathematical implications, the constructive aspect is appealing. One can experiment with different predictive rules, prior update mechanisms, and resolutions, then visualize the resulting data. Different systems may generate sequences that are equally chaotic but qualitatively distinct in structure or evolution.
(Implementation is complex because it is not just a random number generator; it is closer to a predictor + adversary co-evolution, with asymmetric information and endogenous rule updates.)
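As a starting point, the experimental loop can be quite small: wire one predictor to its own adversary and log the per-step \(R_P\) trace for plotting. A sketch, reusing the hypothetical interface from the earlier examples:

```python
import math


def run_experiment(predictor, n_steps):
    """Drive a predictor with its own least-expected-value adversary and
    record the per-step relative randomness score R_P(x_t)."""
    stream, trace = [], []
    for _ in range(n_steps):
        p_t = predictor.predict()
        x_t = min(p_t, key=p_t.get)                   # the adversary's choice
        s_max = max(-math.log(p) for p in p_t.values())
        trace.append((-math.log(p_t[x_t]) / s_max) if s_max > 0 else 0.0)
        stream.append(x_t)
        predictor.update(x_t)
    return stream, trace
```

Swapping in predictors with different priors, update rules, memories, or resolutions yields streams that are equally adversarial but qualitatively different, which is exactly what one would then visualize.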
What “absolute randomness” would have to mean
Most confusion comes from hidden assumptions. For absolute randomness to exist, it would have to satisfy something like this:
A sequence is random in itself, independent of: who observes it, what model is used, what prior knowledge exists, what inference strategy is applied...
In other words: there exists a sequence \((X)\) such that no possible predictor can do better than chance on it, in any meaningful sense.
Why does this already smell impossible? Prediction is not a single thing. It ranges from simple frequency counting to pattern matching, compression, meta-reasoning about generators, adversarial modeling... So “absolutely random” would mean: no structure exists that any system could exploit. But here is the key move: structure is not intrinsic to data; it is defined by a model. Once that is granted, absolute randomness becomes suspect.
Core impossibility intuition
Claim (informal): There is no sequence that is random with respect to all predictors.
For any fixed sequence, you can always construct a predictor that predicts it. This predictor may be trivial, degenerate, non-general, and useless on any other data, but it exists.
Example: “The predictor that outputs exactly this sequence.” That alone kills absolute randomness as an ontological property.
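For concreteness, that degenerate predictor is easy to write down; a purely illustrative sketch (the tiny smoothing term only keeps surprise finite, and at least two symbols are assumed):

```python
class MemorizingPredictor:
    """The 'cheating' predictor: it stores the target sequence and puts
    nearly all probability mass on the symbol it knows comes next."""

    def __init__(self, sequence, alphabet, eps=1e-9):
        self.sequence = list(sequence)
        self.alphabet = list(alphabet)
        self.eps = eps
        self.t = 0

    def predict(self):
        nxt = self.sequence[self.t]
        rest = self.eps / (len(self.alphabet) - 1)
        return {a: (1.0 - self.eps if a == nxt else rest) for a in self.alphabet}

    def update(self, x):
        self.t += 1   # it does not even look at x; it already 'knows'
```

On the memorized sequence, \(R_P(x_t) \approx 0\) at every step, so the sequence is maximally non-random for this predictor, however it was generated.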
People sometimes object: “That predictor cheats, it memorizes the data.” But that objection already introduces constraints on predictors. And the moment you impose constraints, randomness becomes relative to a class of models. Which is exactly the point.
Restricting predictors doesn’t save absoluteness. Suppose someone says: “Fine, but we only allow reasonable predictors.” Now you have to answer: reasonable by whose standards? Computable? Efficient? Human-like? Polynomial-time? Bayesian? Finite memory? Each restriction changes what counts as random. So we don’t get absolute randomness; we get randomness relative to a model class.
The diagonal argument intuition
1. Fix a predictor \((P)\)
2. Define a generator \((G_P)\) that always outputs the value \((P)\) least expects
3. Then \((P)\) fails maximally on \((G_P)\)
Now generalize:
For predictor \((P_1)\), build \((G_{P_1})\)
For predictor \((P_2)\), build \((G_{P_2})\)
...
There is no single sequence that defeats all predictors simultaneously, because defeating one predictor gives structure that another can exploit; any “universal” test can be diagonalized against.
Absolute randomness would require escaping diagonalization.
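To make “structure another can exploit” concrete: in the sketches above, the stream \((G_{P_1})\) produces against a plain frequency counter turns out to be a deterministic cycle, and even an order-1 Markov predictor \((P_2)\) learns it almost immediately. An illustrative check, tied to those earlier assumptions:

```python
from collections import Counter, defaultdict


def order1_markov_accuracy(stream):
    """P_2: an order-1 Markov model that guesses the most frequent successor
    of the previous symbol, learning its transition counts online."""
    transitions = defaultdict(Counter)
    correct, prev = 0, None
    for x in stream:
        if prev is not None:
            if transitions[prev]:
                guess = transitions[prev].most_common(1)[0][0]
                correct += int(guess == x)
            transitions[prev][x] += 1
        prev = x
    return correct / max(1, len(stream) - 1)


# e.g. on the cyclic stream that defeats the frequency counter:
#   order1_markov_accuracy(list("abc" * 100))   # -> close to 1.0
```

After the first pass through the cycle, nearly every guess is correct: the sequence that was maximally random for \((P_1)\) is almost perfectly predictable for \((P_2)\).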
----
Music: predictable, yet improbable
Music is highly predictable: tonal hierarchies, repetition, phrasing, statistical regularities... But that does not make music probable in any absolute sense. Why? Because the predictor that makes music predictable is extremely specialized, historically contingent, biologically embodied, culturally trained...
From the standpoint of a blind universe with no intelligent life, Bach is astronomically unlikely; even a simple melody is absurdly specific. So you get this paradoxical but coherent statement: music is predictable relative to minds like ours, and wildly improbable relative to the space of all possible sequences.
Randomness inherits its observer: structure does not float freely in the universe. It is selected, stabilized, and recognized by cognitive systems. So anything that is “predictable” in a rich way is already conditioned on a learning process, a history of constraints, a shared model space...
Which means, predictability is local, improbability is global, and randomness lives in the gap between them.
Not: randomness = lack of order
But: randomness = sustained violation of expectation