What is a P-value?

Cave & C. Supakorn



A way to decide whether to reject the null hypothesis (H0) is to determine the probability of obtaining a test statistic at least as extreme as the one observed under the assumption that H0 is true. This probability is referred to as the “p-value” and it plays an important role in statistics.
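As an illustration of "at least as extreme", the sketch below computes an exact two-sided p-value for a hypothetical coin-tossing experiment (15 heads in 20 tosses, with H0 stating that the coin is fair). The function name `binomial_p_value` and the data are assumptions made for this example only. "Extreme" is taken here to mean at least as far from the expected count as the observed count, which is one of several common conventions.

```python
from math import comb

def binomial_p_value(successes, trials, p0=0.5):
    """Two-sided exact p-value for a binomial count under H0: P(success) = p0.

    Sums the H0 probabilities of every outcome that lies at least as far
    from the expected count as the observed count does.
    """
    expected = trials * p0
    observed_distance = abs(successes - expected)
    total = 0.0
    for k in range(trials + 1):
        # Small tolerance guards against floating-point ties.
        if abs(k - expected) >= observed_distance - 1e-12:
            total += comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
    return total

# Hypothetical data: 15 heads in 20 tosses of a coin assumed fair under H0.
p = binomial_p_value(15, 20)
print(f"p-value = {p:.4f}")
```

The result is the probability, computed assuming H0 is true, of seeing 15 or more heads or 5 or fewer heads.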





What is the true meaning of a p-value and how should it be used?


P-values lie on a continuum between 0 and 1 and provide a relative measure of the strength of evidence against H0. The smaller the p-value, the stronger the evidence for rejecting H0. This leads to the following guidelines: p < 0.001 indicates very strong evidence against H0, p < 0.01 strong evidence, p < 0.05 moderate evidence, p < 0.1 weak evidence or a trend, and p ≥ 0.1 insufficient evidence[1]. Declaring p-values to be either significant or non-significant based on an arbitrary cut-off (e.g. 0.05 or 5%) should be avoided. As Ronald Fisher said:


“No scientific worker has a fixed level of significance at which, from year to year, and in all circumstances he rejects hypotheses; he rather gives his mind to each particular case in the light of his evidence and his ideas”[2].
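The graded guidelines given before the quotation can be written as a small lookup. This is a sketch only: the function name `evidence_label` is hypothetical, and the labels are meant as a descriptive scale for reporting, not as a fixed decision rule of the kind Fisher warns against.

```python
def evidence_label(p):
    """Describe the strength of evidence against H0 for a p-value.

    The boundaries follow the guideline in the text; they are a rough
    descriptive scale, not cut-offs for declaring significance.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "very strong evidence against H0"
    if p < 0.01:
        return "strong evidence against H0"
    if p < 0.05:
        return "moderate evidence against H0"
    if p < 0.1:
        return "weak evidence against H0 (a trend)"
    return "insufficient evidence against H0"

for value in (0.0004, 0.03, 0.07, 0.4):
    print(value, "->", evidence_label(value))
```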


A very important aspect of the p-value is that it does not provide any evidence in support of H0 – it only quantifies evidence against H0. That is, a large p-value does not mean we can accept H0. Take care not to fall into the trap of accepting H0!


For useful conclusions to be drawn from a statistical analysis, p-values should be considered alongside the size of the effect. Confidence intervals are commonly used to describe the size of the effect and the precision of its estimate. Crucially, statistical significance does not necessarily imply practical significance. Small p-values can come from a large sample and a small effect, or a small sample and a large effect.
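To make the last point concrete, the following sketch compares two hypothetical studies using a simple two-sided z-test with a known standard deviation. The helper `z_test_summary`, the numbers, and the choice of a z-test are all assumptions made for illustration. The two studies give essentially the same p-value, yet their confidence intervals describe very different effect sizes.

```python
from math import sqrt
from statistics import NormalDist

def z_test_summary(mean_diff, sigma, n, level=0.95):
    """Return the two-sided p-value and confidence interval for a mean difference.

    Assumes a z-test with known standard deviation sigma and H0: difference = 0.
    """
    standard_error = sigma / sqrt(n)
    z = mean_diff / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    critical = NormalDist().inv_cdf(0.5 + level / 2)
    interval = (mean_diff - critical * standard_error,
                mean_diff + critical * standard_error)
    return p_value, interval

# Hypothetical study A: large sample, small effect.
p_large, ci_large = z_test_summary(mean_diff=0.03, sigma=1.0, n=10_000)
# Hypothetical study B: small sample, large effect.
p_small, ci_small = z_test_summary(mean_diff=1.0, sigma=1.0, n=9)

print(f"A: p = {p_large:.4f}, 95% CI = ({ci_large[0]:.3f}, {ci_large[1]:.3f})")
print(f"B: p = {p_small:.4f}, 95% CI = ({ci_small[0]:.3f}, {ci_small[1]:.3f})")
```

Reporting the interval alongside the p-value shows whether the estimated effect is large enough to matter in practice, which the p-value alone cannot do.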


It is also important to understand that the size of a p-value depends critically on the sample size. As Knaub (1987)[3] explained, with a very large sample size, H0 may be rejected at a very small significance level even when H0 is nearly (i.e. approximately) true. Conversely, with a small sample size, it may be nearly impossible to reject H0.
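The dependence on sample size can be sketched numerically. The example below holds a tiny hypothetical mean difference fixed (0.01 standard deviations, so H0 is nearly true) and only increases the sample size. A two-sided z-test with known standard deviation is assumed, and the function name `two_sided_p` is made up for this illustration.

```python
from math import erfc, sqrt

def two_sided_p(mean_diff, sigma, n):
    """Two-sided z-test p-value for H0: difference = 0, with known sigma."""
    z = mean_diff / (sigma / sqrt(n))
    # erfc keeps precision in the far tail, where 1 - cdf would round to 0.
    return erfc(abs(z) / sqrt(2))

sample_sizes = [100, 10_000, 1_000_000]
# The same tiny observed difference at every sample size.
p_values = [two_sided_p(mean_diff=0.01, sigma=1.0, n=n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n = {n:>9,}: p = {p:.3g}")
```

With the difference held fixed, the p-value falls from close to 1 to a vanishingly small value purely because the sample grows.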


[1] Ganesh H. and V. Cave. 2018. P-values, P-values everywhere! New Zealand Veterinary Journal. 66(2): 55-56.

[2] Fisher RA. 1956. Statistical Methods and Scientific Inference. Oliver and Boyd, Edinburgh, UK.

[3] Knaub JR. 1987. Practical interpretation of hypothesis tests. The American Statistician. 41(3): 246-247.