When performing linear regression, why does scipy use Wald Statistic followed by a t-test, as opposed to Wald Statistic followed by a Wald test?

The following code performs linear regression on two random 5-element vectors using scipy:

```
import scipy.stats
import random
# Draw some random X and Y
sampleSize = 5
X = [random.uniform(0, 1) for i in range(sampleSize)]
Y = [random.uniform(0, 1) for i in range(sampleSize)]
# named tuple for scipy's linear regression
resultScipy = scipy.stats.linregress(X, Y)
print(resultScipy)
```

Example output is:

```
LinregressResult(slope=-0.36654925096390012, intercept=0.67985896267369861, rvalue=-0.32357812225448918, pvalue=0.59531436421063166, stderr=0.61883685654551934)
```

I very much wanted to know how scipy obtains the pvalue, in this case `0.59531436421063166`

. Note that the scipy’s documentation isn’t particularly enlightening:

p-value:floattwo-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero.

After some python heavy-lifting I’ve written some code, which fully reproduces scipy’s behaviour:

```
import random
import math
import collections
import scipy.stats
import scipy.stats.distributions as distributions
import numpy as np
import sys
# Draw some random X and Y
sampleSize = 5
X = [random.uniform(0, 1) for i in range(sampleSize)]
Y = [random.uniform(0, 1) for i in range(sampleSize)]
# Compute their average
avgX = sum(X)/sampleSize
avgY = sum(Y)/sampleSize
# Partial steps to compute estimators of linear regression parameters.
XDiff = [X_i - avgX for X_i in X]
XDiffSquared = [i*i for i in XDiff]
YDiff = [Y_i - avgY for Y_i in Y]
# B1 is the estimator of slope.
# B0 is the estimator of intercept.
# r is the estimator of Y given X.
B1 = sum(x * y for x, y in zip(XDiff, YDiff)) / sum(XDiffSquared)
B0 = avgY - B1*avgX
r = lambda x: B0 + B1*x
# Partial steps to compute Wald Statistic.
errs = [y - r(x) for x, y in zip(X, Y)]
errStd = math.sqrt((1/(sampleSize-2))*(sum([err**2 for err in errs])))
XStd = math.sqrt((1/(sampleSize))*sum([diff**2 for diff in XDiff]))
stdB1 = errStd / (XStd * math.sqrt(sampleSize))
# Wald Statistic.
W = (B1 - 0)/stdB1
# pvalue of Wald Test of B1 = 0.
pvalueWald = 2*scipy.stats.norm.cdf(-abs(W))
# pvalue of T test of B1 = 0.
pvalueT = 2*distributions.t.sf(abs(W), sampleSize - 2)
# named tuple for our linear regression, with p-value coming from Wald Statistic
LinregressResult = collections.namedtuple("LinregressResult", ["slope", "intercept", "rvalue", "pvalue", "stderr"])
resultWald = LinregressResult(slope=B1, intercept=B0, rvalue=None, pvalue=pvalueWald, stderr=stdB1)
LinregressResult = collections.namedtuple("LinregressResult", ["slope", "intercept", "rvalue", "pvalue", "stderr"])
resultT = LinregressResult(slope=B1, intercept=B0, rvalue=None, pvalue=pvalueT, stderr=stdB1)
# named tuple for scipy's linear regression
resultScipy = scipy.stats.linregress(X, Y)
print(resultWald)
print(resultT)
print(resultScipy)
```

The code computes Wald Statistic on the estimator of the slope (gradient), followed by a `(sampleSize - 2)`

degree of freedom t-test.

Why is the p-value computed in such way? My go-to statistics textbook by Larry Wasserman, suggests that I compute the p-value of linear regression in a different way, by computing Wald Statistic, followed by a Wald test.

The two tests give very different results when the sample size is small and the slope is extreme, with Wald’s test p-values being consistently lower than t-test’s (even by orders of magnitude).

Which approach is better, more correct, and which one should I use?

Cross Validated Asked by Adam Kurkiewicz on December 29, 2020

1 AnswersThis issue was discussed at scipy's issue tracker:

https://github.com/scipy/scipy/issues/7074

It turns out that using Wald Statistic increases the type 1 error rate, as seen in the plots above.

Correct answer by Adam Kurkiewicz on December 29, 2020

1 Asked on January 14, 2021 by doxav

multiarmed bandit optimization queueing real time time series

1 Asked on January 14, 2021

1 Asked on January 14, 2021 by user261225

1 Asked on January 13, 2021 by katy

0 Asked on January 13, 2021

2 Asked on January 13, 2021 by crazydriver

1 Asked on January 13, 2021 by vin

bayesian bayesian network conditional probability inference posterior

0 Asked on January 12, 2021 by g-s-luimstra

convolution filter gradient descent neural networks optimization

1 Asked on January 12, 2021 by somethingsomething

1 Asked on January 12, 2021 by user3136

1 Asked on January 11, 2021 by ad-van-der-ven

bernoulli distribution distributions exponential distribution probability

0 Asked on January 11, 2021 by jonas-palaionis

0 Asked on January 11, 2021 by sventon

2 Asked on January 11, 2021 by benjamin-phua

1 Asked on January 10, 2021 by confusedmathstudent

conditional expectation conditional probability expected value frequency severity

0 Asked on January 10, 2021

2 Asked on January 10, 2021 by stats_nerd

feature engineering feature selection machine learning regression

0 Asked on January 10, 2021 by anto_zoolander

2 Asked on January 10, 2021 by snoopy

Get help from others!

Recent Answers

- Justin Markwell on Unity app crashes when using unmodified custom Android manifest (didn’t find class “UnityPlayerActivity”)
- eric_kernfeld on How to test consistency of responses?
- kjetil b halvorsen on How to test consistency of responses?
- Philipp on How do i draw a ray in unity
- DMGregory on MouseLook Script “Pops” back to the last value when the script is enabled after being disabled or destroyed

Recent Questions

- MouseLook Script “Pops” back to the last value when the script is enabled after being disabled or destroyed
- Unity app crashes when using unmodified custom Android manifest (didn’t find class “UnityPlayerActivity”)
- How do i draw a ray in unity
- How to test consistency of responses?
- How can I understand these variograms?

© 2022 AnswerBun.com. All rights reserved.