Chi-Square + Runs Test

When players hit a losing streak, they scream that the game is rigged. When they win, they think they have cracked the code. This RNG Audit Calculator moves past emotion: it runs two statistical tests on a casino’s outcome stream to check whether it is consistent with true randomness, which is the key question when deciding whether a provably fair game can be trusted.

RNG Audit — Chi-square + Runs Test

Paste outcomes (comma- or whitespace-separated). The tool tests whether they are uniformly distributed and serially independent.


Why one statistical test is not enough

An online casino can cheat you without making the game look obviously biased. They can ensure that the overall game pays out the exact advertised RTP (e.g. 97%), passing basic regulatory audits, but manipulate the *sequence* of outcomes. They might cluster winning rounds during low wagers and cluster losing rounds the second you scale up your stakes.

To verify real randomness, a stream must pass two distinct hurdles: statistical uniformity (equal distribution of values) and serial independence (no predictable patterns or memory).

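The P-Value Guide: A p-value is the probability of seeing a result at least this extreme if the stream really were random. As a rule of thumb, a p-value between 0.05 and 0.95 is consistent with a healthy, random stream. A p-value below 0.01 means the outcomes are very unlikely under true randomness, which points to bias or manipulation. For the Chi-Square test, a p-value suspiciously close to 1.0 (above 0.99) means the stream is “too perfect,” which can indicate artificial smoothing. Values between these bands (0.01 to 0.05, or 0.95 to 0.99) are borderline and worth retesting on a larger sample. Keep in mind that a fair stream still produces a p-value below 0.01 in about 1 of every 100 audits, so a single failure is a red flag rather than proof.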

The two pillars of RNG auditing

Our auditing verifier processes your pasted sequence of floating-point numbers or dice rolls through two rigorous mathematical tests:

1. Chi-Square Goodness-of-Fit Test (Uniformity)

The Chi-Square test checks if the outcomes are spread evenly across all possible bins. For example, if you roll a 6-sided die 6,000 times, each face should appear approximately 1,000 times:

χ² = Σ ((Observed_i - Expected_i)^2 / Expected_i)

Where Observed_i is the actual count in bin i, and Expected_i is the theoretical count (N / K when all K bins are equally likely). The test has K - 1 degrees of freedom. The calculator converts this χ² statistic into an approximate p-value using the Wilson-Hilferty approximation.
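The Chi-Square step can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names `chi_square_stat` and `wilson_hilferty_p` and the sample die counts are assumptions made for the example. It assumes equally likely bins and uses the Wilson-Hilferty cube-root transform to turn χ² into an approximate upper-tail p-value.

```python
import math

def chi_square_stat(observed):
    """Return (chi2, degrees_of_freedom) for bin counts tested against equal expected counts."""
    k = len(observed)
    total = sum(observed)
    if k < 2 or total == 0:
        raise ValueError("need at least two bins and at least one outcome")
    expected = total / k  # uniform hypothesis: every bin expects N / K
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2, k - 1

def wilson_hilferty_p(chi2, dof):
    """Approximate upper-tail p-value of a chi-square statistic (Wilson-Hilferty)."""
    c = 2.0 / (9.0 * dof)
    # (chi2 / dof)^(1/3) is approximately normal with mean 1 - c and variance c
    z = ((chi2 / dof) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z) for a standard normal

# Hypothetical counts for 6,000 rolls of a six-sided die
counts = [980, 1020, 1010, 990, 1005, 995]
chi2, dof = chi_square_stat(counts)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {wilson_hilferty_p(chi2, dof):.3f}")
```

For floating-point outcomes between 0 and 1, the counts would first be built by assigning each value to one of K equal-width buckets.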

2. Wald-Wolfowitz Runs Test (Independence) (best paired with the Kolmogorov-Smirnov Test)

A stream can be perfectly uniform but completely predictable. For example, the repeating sequence [1, 4, 2, 5, 3, 6, 1, 4, 2, 5, 3, 6...] hits every face of a die equally often, yet it alternates between low and high values and is obviously not random. The Runs Test counts the number of consecutive stretches of values above or below the median (known as “runs”) to find serial patterns:

Expected_Runs (μ) = ((2 * N1 * N2) / N) + 1

Standard_Deviation (σ) = √((2*N1*N2 * (2*N1*N2 - N)) / (N^2 * (N - 1)))

Where N1 is the number of values above the median, N2 is the number of values below the median, and N is the total count. The calculator derives the standard normal Z-score:

Z = (Observed_Runs - μ) / σ

A high absolute Z-score indicates that the stream has too few runs (indicating clustering) or too many runs (indicating artificial alternation).
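The formulas above translate directly into code. The following Python sketch is an illustration under stated assumptions, not the calculator's implementation: the function name `runs_test` is hypothetical, and values exactly equal to the median are dropped, which is one common convention that the text above does not specify.

```python
import math
from statistics import median

def runs_test(values):
    """Wald-Wolfowitz runs test around the median.

    Returns (observed_runs, expected_runs, z, two_tailed_p).
    """
    med = median(values)
    # Assumption: values equal to the median are ignored
    above = [v > med for v in values if v != med]
    n1 = sum(above)            # count above the median
    n2 = len(above) - n1       # count below the median
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        raise ValueError("need values on both sides of the median")
    # A new run starts whenever neighbouring values fall on different sides
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    mu = (2 * n1 * n2) / n + 1
    sigma = math.sqrt((2 * n1 * n2 * (2 * n1 * n2 - n)) / (n ** 2 * (n - 1)))
    if sigma == 0:
        raise ValueError("sample too small for the normal approximation")
    z = (runs - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal p-value
    return runs, mu, z, p

# A uniform but predictable stream: low and high values strictly alternate
alternating = [1, 4, 2, 5, 3, 6] * 20
print(runs_test(alternating))  # far more runs than expected, large positive Z

# The same values sorted into two blocks: only two runs, large negative Z
clustered = sorted(alternating)
print(runs_test(clustered))
```

Both example streams contain each die face exactly 20 times, so a uniformity check alone would not flag either of them.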

Data Sandwich: Auditing 500 crash rounds

Let’s walk through an actual audit. You extract a history of 500 consecutive crash multipliers from an online casino. You convert them to their raw floating-point seeds (between 0 and 1) and paste them into the auditor:

Chi-Square P-Value: 0.42 (Uniformity check: **PASSED**)
Runs Test Z-Score: -3.85 (P-value: 0.0001, Independence check: **FAILED**)

The Chi-Square p-value of 0.42 gives no evidence against uniformity: values between 0 and 1 appear evenly represented. However, the Z-score of -3.85 reveals a major problem: there are far fewer runs than expected, so high and low outcomes are heavily clustered together. A result this extreme would occur by chance in roughly 1 of every 10,000 truly random samples, which makes it strong statistical evidence of a non-independent, possibly manipulated stream.
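The reported Runs Test p-value can be reproduced from the Z-score of -3.85 with the standard normal distribution. This short Python check uses a hypothetical helper name, `two_tailed_p`, and relies only on the standard library:

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p = two_tailed_p(-3.85)
print(f"p = {p:.4f}")  # rounds to 0.0001
```

The sign of Z does not change the two-tailed p-value; it only tells you the direction of the problem (negative for too few runs, positive for too many).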

Frequently asked questions

What does a failed Runs Test mean?

A failed Runs Test is strong evidence that the outcomes are not independent of one another. In effect, the game has a memory. If the test flags long streaks of losses clustered together, or alternating patterns that repeat predictably, the RNG sequence should be treated as compromised.

How many inputs do I need for a reliable audit?

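For the approximations behind both tests to hold, you should paste at least 100 consecutive outcomes. A common rule of thumb for the Chi-Square test is an expected count of at least 5 per bucket, so using more buckets calls for more data. Paste 500 or more outcomes for a more reliable audit, since larger samples can detect subtler bias.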

Can an RNG pass both tests but still be rigged?

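It is possible, but much harder. If an operator manipulates the outcomes in real time based on player bets, the resulting clustering is likely to show up in the Wald-Wolfowitz Runs Test. These two tests only cover uneven distributions and clustering or alternation around the median, so a clean pass across a large sample is strong evidence against those forms of manipulation rather than absolute proof of fair play.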