
Relation between P-value in a randomness test, number of samples, and entropy

Cross Validated, asked on November 18, 2021

Consider tests of randomness of bit sequences of fixed size $n$ bits as in cryptography (e.g. NIST Special Publication 800-22 page 1-4; notice this uses capital-$P$ for p-values). Define such a test as any deterministic function $T$ that accepts a vector $V$ of $n$ bits, and outputs a p-value $p$ in $]0\dots1]$ (with null hypothesis that $V$ consists of random independent unbiased bits), that is, with the defining property
$$\forall\alpha\in[0\dots1],\;\;\Pr\big(T(V)\le\alpha\big)\,\le\,\alpha$$
where the probability is computed with $V$ a vector of random independent unbiased bits (or equivalently, is computed as the proportion of $V$ such that $T(V)\le\alpha$ among the $2^n$ vectors $V$).
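
For small $n$, the defining property can be checked exhaustively. Here is a minimal sketch (Python; the helper name is mine) that enumerates the $2^n$ equiprobable vectors and checks the inequality at every p-value a given test actually outputs, which suffices because $\Pr(T(V)\le\alpha)$ only changes at those values; the example at the end uses the Non-zero test defined in the list below.

    from itertools import product

    def is_valid_test(T, n):
        """Check the defining property of a p-value test T on n-bit vectors:
        for every alpha actually output by T, the fraction of the 2^n
        equiprobable vectors V with T(V) <= alpha must be at most alpha."""
        vectors = list(product((0, 1), repeat=n))
        pvalues = [T(v) for v in vectors]
        for alpha in set(pvalues):
            fraction = sum(p <= alpha for p in pvalues) / 2 ** n
            if fraction > alpha + 1e-12:   # tolerance for float rounding
                return False
        return True

    def non_zero(v):
        """The Non-zero test: 1/2^n on the all-zero vector, 1 otherwise."""
        return 1 / 2 ** len(v) if not any(v) else 1.0

    print(is_valid_test(non_zero, 6))      # True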

Example tests matching this definition are

  • True, which always outputs $p=1$.
  • Non-zero, which outputs $p=1/2^n$ if all bits in $V$ are zero, and outputs $p=1$ otherwise.
  • Non-stuck, which outputs $p=1/2^{n-1}$ if all bits in $V$ are identical, and outputs $p=1$ otherwise.
  • Balanced, which computes the number $s$ of bits set in $V$ (that is, the Hamming weight of $V$), and outputs as $p$ the probability that, for a uniformly random $V'$ with Hamming weight $s'$, $|2s'-n|$ is at least $|2s-n|$ (a short code sketch follows this list).
    • For $n\le3$, Balanced is the same as Non-stuck.
    • For $n=4$, $p=\begin{cases}
      1/8 &\text{ if } s\in\{0,4\} &\text{ e.g. } V=(0,0,0,0)\\
      5/8 &\text{ if } s\in\{1,3\} &\text{ e.g. } V=(1,0,1,1)\\
      1 &\text{ otherwise} &\text{ e.g. } V=(1,0,0,1)\end{cases}$
    • For $n=5$, $p=\begin{cases}
      1/16 &\text{ if } s\in\{0,5\} &\text{ e.g. } V=(1,1,1,1,1)\\
      3/8 &\text{ if } s\in\{1,4\} &\text{ e.g. } V=(0,0,1,0,0)\\
      1 &\text{ otherwise} &\text{ e.g. } V=(0,1,1,0,1)\end{cases}$
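
As referenced in the Balanced item above, here is a minimal sketch (Python; the function name is mine) that computes that exact p-value for small $n$ and reproduces the case tables above:

    from math import comb
    from fractions import Fraction

    def balanced(v):
        """Exact Balanced p-value: the probability that a uniformly random
        n-bit vector V' has |2s' - n| >= |2s - n|, where s and s' are the
        Hamming weights of v and V'."""
        n, s = len(v), sum(v)
        threshold = abs(2 * s - n)
        favourable = sum(comb(n, k) for k in range(n + 1)
                         if abs(2 * k - n) >= threshold)
        return Fraction(favourable, 2 ** n)

    print(balanced((0, 0, 0, 0)))     # 1/8
    print(balanced((1, 0, 1, 1)))     # 5/8
    print(balanced((1, 1, 1, 1, 1)))  # 1/16
    print(balanced((0, 0, 1, 0, 0)))  # 3/8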

There’s a natural partial order relation among tests: $T$ implies $T'$ when $\forall V,\ T(V)\le T'(V)$. Any test implies True. Balanced implies Non-stuck, but does not imply Non-zero. Some tests, including Balanced and Non-zero, are optimal in the sense that no other test implies them.

Section 2 of the above reference describes 15 tests for large $n$ (thousands of bits), intended to catch some defects relevant to actual random number generators, and to be near-optimal (in the above sense). For example, section 2.1, designated The Frequency (Monobit) Test, is an approximation of Balanced for large $n$ using the complementary error function.
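
For concreteness, here is how I read that approximation (a sketch only; the exact statistic should be checked against section 2.1 of the reference): with $s$ the Hamming weight, the statistic is $s_{obs}=|2s-n|/\sqrt n$ and the p-value is $\operatorname{erfc}(s_{obs}/\sqrt2)$.

    from math import erfc, sqrt

    def monobit(v):
        """Frequency (Monobit) test: large-n Gaussian approximation of the
        Balanced test, using the complementary error function."""
        n, s = len(v), sum(v)
        s_obs = abs(2 * s - n) / sqrt(n)
        return erfc(s_obs / sqrt(2))

    # For moderately large n this is close to the exact Balanced p-value.
    v = (0, 1) * 400 + (1,) * 200     # n = 1000 bits, s = 600 ones
    print(monobit(v))                 # about 2.5e-10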

Q1: Assume that all the $n$ bits in a vector $V$ tested are random independent bits having the same probability $q={1\over2}+\epsilon$ of being set, with $\epsilon$ unknown (besides being smallish), corresponding to Shannon entropy per bit $$E=-q\log_2(q)-(1-q)\log_2(1-q)=1-{2\over\log2}\epsilon^2+\mathcal O(\epsilon^4)$$
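
A quick numerical check of that expansion (my own sketch; $\log2$ in the formula is the natural logarithm of 2):

    from math import log

    def entropy_per_bit(q):
        """Shannon entropy (in bits) of one bit with Pr(bit = 1) = q."""
        return -q * log(q, 2) - (1 - q) * log(1 - q, 2)

    for eps in (0.1, 0.01, 0.001):
        exact = entropy_per_bit(0.5 + eps)
        approx = 1 - 2 / log(2) * eps ** 2
        print(eps, exact, approx)     # the two agree up to O(eps^4)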

The Balanced test for some (large) number $n$ of such bits is applied once, and outputs a small p-value (say $p\le0.001$). That allows us to reject the null hypothesis $H_0$ that $E=1$ (equivalently, $\epsilon=0$) with high confidence (corresponding to the p-value $p$).

What is a tight function $E(p,n)$ such that we can reject the hypothesis $H_1$ that $E\ge E(p,n)$ with some good confidence (corresponding to some known p-value higher than $p$, perhaps $2p$, or $\sqrt p$, or something along those lines)? By “tight function” I mean that the lower the $E(p,n)$ we manage to prove for a given confidence, the better.

Q2: Things are as in Q1, except that the test is unspecified beyond the defining property of p-values. Can we reject the hypothesis $H_1$ that $E\ge E(p,n)$ with good confidence, for whatever $E(p,n)$ and at least the confidence level that was established in Q1? If that conjecture is false, what's a counterexample, and/or is it repairable?

Q3: Things are as in Q2 (or in Q1 if the property sought in Q2 does not apply), except that the bits in the input $V$ might be dependent, but still with Shannon entropy per bit $E$; that is, the distribution of the inputs $V$ is such that $$nE\;=\;-\sum_{V\text{ with }\Pr(V)\ne0}\Pr(V)\log_2(\Pr(V))$$
Can we reject the hypothesis $H_1$ that $E\ge E(p,n)$ with good confidence, for whatever $E(p,n)$ and at least the confidence level that was established in Q1? If that conjecture is false, what's a counterexample, and/or is it repairable?
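
To make the Q3 setting concrete, here is a toy illustration of my own (not from the question): a source that emits each fair bit twice produces strongly dependent bits whose marginals are unbiased, and the displayed formula gives entropy per bit $E=1/2$.

    from itertools import product
    from math import log2

    def entropy_per_bit_joint(dist, n):
        """E such that n*E = -sum over V with Pr(V) != 0 of Pr(V)*log2(Pr(V))."""
        return -sum(p * log2(p) for p in dist.values() if p > 0) / n

    # n = 4: two fair bits a, b, each repeated twice -> V = (a, a, b, b).
    n = 4
    dist = {(a, a, b, b): 1 / 4 for a, b in product((0, 1), repeat=2)}
    print(entropy_per_bit_joint(dist, n))   # 0.5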

One Answer

If I understand your problem correctly, the choice would be multinomial logistic regression, also known as the "conditional maximum entropy model". Concerning your second question, I guess two different concerns can be emphasized: the accuracy of the confidence interval, or the statistical power of the inference. I would especially be concerned with correcting for multiple testing, for example using the Holm-Bonferroni correction of the alpha values or something more contemporary.
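
As an aside on that multiple-testing point, here is a minimal sketch of the Holm-Bonferroni step-down procedure (my own illustration, not specific to the randomness tests above):

    def holm_bonferroni(pvalues, alpha=0.05):
        """Indices of hypotheses rejected by Holm-Bonferroni at
        family-wise error rate alpha."""
        m = len(pvalues)
        order = sorted(range(m), key=lambda i: pvalues[i])
        rejected = []
        for rank, i in enumerate(order):
            if pvalues[i] <= alpha / (m - rank):
                rejected.append(i)
            else:
                break                 # stop at the first non-rejection
        return rejected

    print(holm_bonferroni([0.001, 0.04, 0.03, 0.2]))   # [0]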

For the final question, a method that takes prior probabilities into account would be a Bayesian method. Although I don't have a concrete suggestion here, the problem reminds me of random forests and naive Bayes classifiers. However, these models involve not only dependence but also training, so I am not sure whether those topics are relevant to the problem at hand.

Answered by noumenal on November 18, 2021
