P-Value Calculator

Enter a z-score and select a one-tailed or two-tailed test to get the p-value instantly. Includes a significance interpretation and critical z-values for the 0.10, 0.05, and 0.01 levels.

Convert Z-score to p-value for hypothesis testing
Guides & Reference

How It Works

P-Value tab: z-score to p-value
Hypothesis testing, statistical significance, research reporting.

Switch to the P-Value tab (opens by default here). Enter your z-score: positive for right-tailed, negative for left-tailed, absolute value for two-tailed. Select one-tailed or two-tailed. The p-value appears with a significance interpretation: p < 0.01 (highly significant), 0.01 ≤ p < 0.05 (significant), 0.05 ≤ p < 0.10 (marginal), p ≥ 0.10 (not significant).

Two-tailed: p = 2×Φ(−|z|) | One-tailed: p = Φ(−z) (right-tailed) or p = Φ(z) (left-tailed)
z=1.96 → p=0.05 (2-tail) | z=2.576 → p=0.01 (2-tail) | z=1.0 → p=0.317 (2-tail)

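The formulas above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the calculator's own code: the function names `p_value_from_z` and `interpret`, and the `tails` option names, are assumptions made for the example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def p_value_from_z(z, tails="two"):
    """Return the p-value for a z-score.

    tails is "two", "right", or "left" (hypothetical option names).
    """
    if tails == "two":
        return 2 * STD_NORMAL.cdf(-abs(z))   # p = 2 * Phi(-|z|)
    if tails == "right":
        return STD_NORMAL.cdf(-z)            # p = Phi(-z)
    if tails == "left":
        return STD_NORMAL.cdf(z)             # p = Phi(z)
    raise ValueError("tails must be 'two', 'right', or 'left'")


def interpret(p):
    """Map a p-value to the four labels used in the text."""
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "significant"
    if p < 0.10:
        return "marginal"
    return "not significant"


for z in (1.96, 2.576, 1.0):
    p = p_value_from_z(z)
    print(f"z={z}: two-tailed p={p:.3f} ({interpret(p)})")
```

`NormalDist().cdf` plays the role of Φ here, so no external statistics package is needed.
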
One-tailed vs two-tailed tests
Directional hypotheses in clinical trials, engineering.

Two-tailed (default): H₁: μ ≠ μ₀, so a difference in either direction counts. p = 2×Φ(−|z|). One-tailed: H₁: μ > μ₀ (right, p = Φ(−z)) or μ < μ₀ (left, p = Φ(z)). A one-tailed test has more power to detect an effect in the predicted direction (it reaches significance more easily) but requires a directional hypothesis justified in advance. Switch using the one-tailed/two-tailed toggle.

Two-tail: p = 2×Φ(−|z|) | One-tail: p = Φ(−z)
z=1.645: two-tail p=0.10, one-tail p=0.05

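To see why the direction has to be fixed in advance, it helps to compare the three p-values for the same z. The sketch below is illustrative only; the helper names are made up for this example. When z falls in the predicted direction, the one-tailed p is exactly half the two-tailed p. When it falls in the opposite direction, the one-tailed p is above 0.5.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal CDF, written as Phi in the text


def two_tailed_p(z):
    return 2 * phi(-abs(z))


def right_tailed_p(z):
    # H1: mu > mu0
    return phi(-z)


def left_tailed_p(z):
    # H1: mu < mu0
    return phi(z)


for z in (1.645, -1.645):
    print(
        f"z={z:+.3f}  two-tailed={two_tailed_p(z):.3f}  "
        f"right-tailed={right_tailed_p(z):.3f}  "
        f"left-tailed={left_tailed_p(z):.3f}"
    )
```
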
Critical z-values and significance levels
Setting significance thresholds in research design.

Three common significance levels: α=0.10 → z_crit=1.645 (two-tailed). α=0.05 → z_crit=1.96. α=0.01 → z_crit=2.576. Reject null when |z_observed| > z_critical. The calculator shows the critical values alongside your p-value for easy comparison. Pre-specify α before data collection to avoid p-hacking.

α=0.05 → z_crit=1.96 (2-tail) | α=0.01 → z_crit=2.576 (2-tail)
z=2.1: reject at α=0.05 (2.1 > 1.96); do not reject at α=0.01 (2.1 < 2.576)

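The critical values above come from the inverse of the standard normal CDF. A minimal sketch, assuming the hypothetical helper names `z_critical` and `reject_null`, and treating the one-tailed case as right-tailed:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_critical(alpha, two_tailed=True):
    """Critical z for significance level alpha."""
    # A two-tailed test puts alpha/2 in each tail; a one-tailed test puts
    # all of alpha in a single tail.
    tail_area = alpha / 2 if two_tailed else alpha
    return STD_NORMAL.inv_cdf(1 - tail_area)


def reject_null(z_observed, alpha, two_tailed=True):
    """Apply the rule: reject when the observed z exceeds the critical z."""
    if two_tailed:
        return abs(z_observed) > z_critical(alpha)
    # Assumption for this sketch: the one-tailed test is right-tailed.
    return z_observed > z_critical(alpha, two_tailed=False)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha}: two-tailed z_crit={z_critical(alpha):.3f}")

print("z=2.1 at alpha=0.05:", reject_null(2.1, 0.05))
print("z=2.1 at alpha=0.01:", reject_null(2.1, 0.01))
```
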
P-value and confidence interval duality
Reporting results in research papers.

p < 0.05 (two-tailed) ↔ 95% CI does not include the null value. p < 0.01 ↔ 99% CI does not include the null value. This duality means: if you compute a 95% CI and it excludes zero (or another null value), your p-value is below 0.05 without computing it separately. The duality holds when the interval and the test are built from the same estimate and standard error; the CI additionally shows the range of plausible effect sizes.

p < 0.05 ↔ 95% CI excludes null value
95% CI [2.1, 8.4] excludes 0 → p < 0.05 implied

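The duality can be checked numerically by recovering the z-score from the interval itself. The sketch below assumes a symmetric, normal-based interval (estimate ± z_crit × standard error); the function name `p_from_ci` is hypothetical, and the approach does not apply to intervals built another way (for example, on a log scale).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_from_ci(lower, upper, null_value=0.0, level=0.95):
    """Two-tailed p-value implied by a symmetric normal-based CI."""
    z_crit = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    estimate = (lower + upper) / 2              # midpoint of the interval
    std_error = (upper - lower) / (2 * z_crit)  # half-width / z_crit
    z = (estimate - null_value) / std_error
    return 2 * STD_NORMAL.cdf(-abs(z))


# The example interval from the text: it excludes 0, so p should be < 0.05.
print(f"95% CI [2.1, 8.4]: p = {p_from_ci(2.1, 8.4):.4f}")

# An interval that includes 0 should give p > 0.05.
print(f"95% CI [-1.0, 5.0]: p = {p_from_ci(-1.0, 5.0):.4f}")
```

An interval whose lower bound sits exactly on the null value corresponds to p = 0.05, which is the boundary case of the equivalence.
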
Type I and Type II errors
Research design, power analysis.

Type I error (α): rejecting a true null hypothesis (false positive). P(Type I) = α = significance level. Type II error (β): failing to reject a false null (false negative). Power = 1−β = probability of detecting a true effect. There is a trade-off: for a fixed sample size and effect size, lowering α (stricter significance) reduces Type I errors but increases Type II errors. Sample size calculations balance both.

Type I: false positive (p < α when H₀ true) | Type II: false negative
α=0.05: 5% chance of false positive if null is true

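The trade-off can be made concrete with the power of a two-tailed one-sample z-test. This is a sketch under stated assumptions: known standard deviation, and a standardized effect size d = (μ − μ₀)/σ, so that the test statistic is centred on d × √n when the effect is real. The function name and the example numbers (d = 0.5, n = 30) are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_tailed_z(effect_size, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed one-sample z-test."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)  # mean of the z statistic under H1
    # Probability that z lands in the upper or the lower rejection region.
    upper = STD_NORMAL.cdf(shift - z_crit)
    lower = STD_NORMAL.cdf(-shift - z_crit)
    return upper + lower


for alpha in (0.10, 0.05, 0.01):
    power = power_two_tailed_z(0.5, 30, alpha)
    print(f"alpha={alpha}: power={power:.3f}, beta={1 - power:.3f}")
```

With a zero effect the function returns α itself, which matches the definition of the Type I error rate. Increasing n raises power at any fixed α, which is why sample size calculations are used to balance the two error types.
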

Quick Reference

Verify these in the calculator above.

P-value | z=1.96, two-tailed | p = 0.05
P-value | z=2.576, two-tailed | p = 0.01
P-value | z=1.645, two-tailed | p = 0.10
One-tail | z=1.645, one-tailed | p = 0.05
P-value | z=3.0, two-tailed | p = 0.003
Not sig. | z=1.0, two-tailed | p = 0.317
Decision | Significant at α=0.05? | |z| > 1.96
Duality | p < 0.05 ↔ 95% CI | excludes null

Tips & Shortcuts

Use two-tailed tests by default. Only use one-tailed when you have a directional hypothesis established before seeing any data — not after.

The p-value depends on both effect size and sample size. With n = 10000, even trivially small effects can produce p < 0.001. Always report the effect size alongside p.
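A quick way to see the sample-size dependence is to hold the effect fixed and vary n. The sketch below assumes a one-sample z-test with a standardized effect size d, so that z = d × √n; the function name and the value d = 0.05 are illustrative choices, not figures from the calculator.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p_for_effect(effect_size, n):
    """Two-tailed p-value when the observed standardized effect is effect_size."""
    z = effect_size * sqrt(n)
    return 2 * NormalDist().cdf(-abs(z))


# The same tiny effect (d = 0.05) at increasing sample sizes.
for n in (100, 1000, 10000):
    p = two_tailed_p_for_effect(0.05, n)
    print(f"n={n:>5}: z={0.05 * sqrt(n):.2f}, p={p:.2g}")
```

The effect size never changes in this loop; only n does, and p moves from clearly non-significant to far below 0.001.
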

P = 0.049 and p = 0.051 are essentially the same evidence — do not treat the 0.05 threshold as a magical cutoff. Report the exact p-value.

For α = 0.05 two-tailed (z_crit = 1.96), the critical region is the top and bottom 2.5% of the standard normal distribution: α/2 in each tail.

The p-value does NOT measure the probability that the null hypothesis is true. It measures how unusual your data would be if the null were true.

Common Mistakes

Saying "p = 0.03 means 3% chance the null is true"

The p-value is conditional on the null being true: it is P(data at least this extreme | H₀), not P(H₀ | data). The probability that the null is true requires a Bayesian analysis with a prior. p = 0.03 means: if H₀ were true, there would be a 3% chance of data at least this extreme.

Using a one-tailed test to reach significance more easily

One-tailed tests require pre-specified directional hypotheses. Choosing a one-tailed test after seeing the direction of the data is p-hacking and inflates the Type I error rate. Reviewers and journals typically expect justification for one-tailed tests.

Treating p > 0.05 as "proving the null"

P > 0.05 means insufficient evidence to reject the null — not proof the null is true. "Absence of evidence is not evidence of absence." A non-significant result might reflect low power (small n) rather than no effect.

Comparing p-values across different studies as if they measure effect size

A result with p = 0.001 does not necessarily reflect a larger effect than one with p = 0.04. The p-value depends on n: the same effect gives a smaller p with a larger n. Use effect sizes (Cohen's d, relative risk, odds ratio) for cross-study comparison.

Not adjusting for multiple comparisons

Running 20 tests at α=0.05 when every null hypothesis is true yields about 1 false positive on average (20 × 0.05 = 1). Use the Bonferroni correction (α_adj = α/k for k tests) or false discovery rate methods when testing multiple hypotheses simultaneously.
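The arithmetic behind this warning is short enough to write out. The sketch below assumes all k null hypotheses are true and, for the family-wise error rate, that the tests are independent; the function names are hypothetical.

```python
def expected_false_positives(alpha, k):
    """Average number of false positives across k tests of true nulls."""
    return alpha * k


def familywise_error_rate(alpha, k):
    """Chance of at least one false positive in k independent tests."""
    return 1 - (1 - alpha) ** k


def bonferroni_alpha(alpha, k):
    """Per-test threshold under the Bonferroni correction."""
    return alpha / k


alpha, k = 0.05, 20
adjusted = bonferroni_alpha(alpha, k)
print("expected false positives:", expected_false_positives(alpha, k))
print(f"P(at least one false positive): {familywise_error_rate(alpha, k):.3f}")
print("Bonferroni per-test alpha:", adjusted)
print(f"family-wise rate after correction: {familywise_error_rate(adjusted, k):.3f}")
```

Without correction, the chance of at least one false positive across 20 independent tests is well above one half; with the Bonferroni threshold it stays below the original α.
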

Frequently Asked Questions

A p-value is the probability of observing a test statistic at least as extreme as your result, assuming the null hypothesis is true. It measures evidence against the null hypothesis, not the probability that the null is true. Small p (< 0.05) → reject null.

The z-score that corresponds to p = 0.05 depends on the test. For a two-tailed test: z = ±1.96. The 5% significance level splits between both tails: 2.5% in each. For a one-tailed test: z = 1.645 (right tail) or z = −1.645 (left tail).

Two-tailed: tests whether a parameter differs from null in either direction. P = 2×Φ(−|z|). One-tailed: tests in one direction only (e.g. "is the new treatment better?"). P = Φ(−z) for right-tailed. Use two-tailed as default unless you have a strong directional hypothesis established before data collection.

p < 0.05 means that, if the null hypothesis were true, there would be less than a 5% chance of seeing results this extreme. Common convention: p < 0.05 is "statistically significant" (reject null at the 5% level). This does NOT mean the effect is practically important or large.

Statistical significance (p < 0.05) means the result is unlikely under the null. Practical significance means the effect is large enough to matter. With large n, tiny meaningless effects become statistically significant. Always report effect size (Cohen's d, odds ratio) alongside the p-value.

A critical z-value is the z-value at which p equals your significance level α. For α=0.05 two-tailed: z_critical = ±1.96. For α=0.01 two-tailed: z_critical = ±2.576. Reject the null when |z_observed| > z_critical.

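p < 0.05 (two-tailed) is equivalent to the 95% CI not containing the null value. If the null is μ=0 and the 95% CI for your mean does not include 0, then p < 0.05. CIs and p-values are built from the same estimate and standard error, so they lead to the same reject/do-not-reject decision; the CI additionally shows the range of plausible effect sizes.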
