TransWikia.com

Permutation testing for significance of a predictor

Cross Validated Asked on November 6, 2021

I’ve been thinking lately about connecting permutation tests with testing for significance of a variable. Consider the following (very simple) example:

Suppose we want to consider the following linear model:
$$y_i=\beta_0+\beta_1x_{i}+\epsilon_i$$
for some error term $\epsilon_i$, which we don’t assume to follow a normal distribution, and for some categorical variable $x$ which takes values $0$ or $1$ for simplicity. There is a lot of literature on what to do in this case, but I want to somehow relate this to permutation testing. The following idea came to mind:
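
As a side note on why permutation testing fits here: with a binary $x$, the least-squares slope reduces to a difference in group means, so a test based on the slope is a test on the two-sample mean difference. A short sketch of the identity, where $n_1$, $n_0$ (group sizes, $n=n_0+n_1$) and $\bar{y}_1$, $\bar{y}_0$ (group means of $y$) are notation introduced here for illustration:
$$\hat{\beta}_1=\frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}=\frac{\frac{n_1 n_0}{n}\,(\bar{y}_1-\bar{y}_0)}{\frac{n_1 n_0}{n}}=\bar{y}_1-\bar{y}_0$$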

Under the null hypothesis of no effect of $x$ on $y$, we can "permute" the $x$ values assigned to the $y$ values, and for each such allocation calculate the estimate $\hat{\beta}_1$ of $\beta_1$. Under the null, every allocation is equally plausible, so (conditional on the observed values) the estimates from the permuted allocations form a reference distribution for $\hat{\beta}_1$. If the estimate from the actual data falls too far into the left or right tail of this empirical distribution, that casts doubt on the null hypothesis of no effect.
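
To make the idea concrete, here is a minimal sketch of the procedure described above, assuming NumPy is available. The function name `permutation_test_slope`, its parameters, and the simulated data are all hypothetical choices made for illustration. It draws random permutations (a Monte Carlo approximation) rather than enumerating every allocation, and it uses a two-sided comparison on the absolute slope.

```python
import numpy as np


def permutation_test_slope(x, y, n_perm=10_000, seed=0):
    """Monte Carlo permutation test for the slope in y = b0 + b1 * x + error.

    Returns the observed slope, the permuted slopes, and a two-sided p-value.
    (Hypothetical helper written for illustration.)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_centered = y - y.mean()

    def slope(x_vals):
        # OLS slope in simple regression: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
        x_centered = x_vals - x_vals.mean()
        return np.dot(x_centered, y_centered) / np.dot(x_centered, x_centered)

    observed = slope(x)

    # Shuffle the x labels while keeping y fixed, and refit each time.
    perm_slopes = np.array([slope(rng.permutation(x)) for _ in range(n_perm)])

    # Two-sided: count permuted slopes at least as extreme as the observed one.
    # The small tolerance guards against floating-point ties; the +1 terms
    # count the observed allocation itself, so the p-value is never exactly 0.
    n_extreme = np.sum(np.abs(perm_slopes) >= np.abs(observed) - 1e-12)
    p_value = (n_extreme + 1) / (n_perm + 1)
    return observed, perm_slopes, p_value


if __name__ == "__main__":
    # Simulated example with skewed (non-normal) errors and no true effect of x.
    rng = np.random.default_rng(1)
    x = np.repeat([0, 1], 30)
    y = 1.0 + rng.exponential(scale=1.0, size=x.size)
    observed, perm_slopes, p_value = permutation_test_slope(x, y, n_perm=2_000)
    print(f"observed slope: {observed:.3f}, permutation p-value: {p_value:.3f}")
```

Note that the permuted slopes are compared with the observed slope, not with zero, so the size of the estimate only matters relative to the spread of the permutation distribution.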

Does this sound like an OK idea? My concern is that we are purposefully including an insignificant variable, which may or may not distort the results. However, I still think that the permutation test is applicable. Also, a question that popped up while writing this: if $x$ is categorical with $2$ levels and we obtain a large estimate of the coefficient $\beta_1$, does that hint at significance of the variable $x$ because it is categorical?

Finally, thanks to everyone who read that and who will help me resolve this issue!
