P-values, Significance, and the t-test

Misconception 1: The p-value indicates the probability of the null hypothesis being true.

Reality: The p-value actually indicates the probability of observing data as extreme as, or more extreme than, what was observed, assuming the null hypothesis is true. It is not a direct probability of the null hypothesis itself.
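
A minimal sketch of the distinction (the seed, sample size, and shift of 0.4 below are arbitrary choices): the value returned by t.test() is a tail probability computed assuming the null model, not a probability that the null hypothesis is true.

set.seed(1)
x <- rnorm(30, mean = 0.4) # one sample whose true mean happens to be 0.4
tt <- t.test(x)            # H0: the true mean is 0
tt$p.value                 # P(result at least this extreme | H0), not P(H0 | data)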

Misconception 2: A smaller p-value means a more important or impactful result.

Reality: The p-value does not measure the size of an effect or the importance of a result. It only measures the strength of evidence against the null hypothesis. A small p-value indicates that such a result is unlikely to occur if the null hypothesis were true, but says nothing about the practical significance of the finding.
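
As a quick illustration (the sample size and effect below are arbitrary choices), a practically negligible effect can still produce a tiny p-value once the sample is large enough:

set.seed(2)
big_sample <- rnorm(100000, mean = 0.02) # a 0.02 SD effect: trivial in practice
t.test(big_sample)$p.value               # typically far below 0.05 at this n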

Misconception 3: The p-value can tell you the chance of a Type I error (false positive).

Reality: The p-value itself is not the probability of a false positive; it is conditioned on the null hypothesis being true. The predefined alpha level (like 0.05) is what sets the Type I error rate, not the p-value.
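
A short simulation makes this concrete (a sketch; 5000 replicates and n = 25 are arbitrary choices): across repeated studies with a true null, the long-run rejection rate matches alpha, whatever the individual p-values happen to be.

set.seed(3)
rejections <- replicate(5000, t.test(rnorm(25))$p.value < 0.05)
mean(rejections) # close to 0.05: alpha, not the p-value, sets the Type I rate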

Misconception 4: If a test statistic falls within the critical region, the null hypothesis can be rejected with 100% certainty.

Reality: Even if the test statistic falls in the critical region and we reject the null hypothesis, there is still a chance of a Type I error. We never prove or disprove hypotheses with absolute certainty in statistics; we only assess evidence against a null hypothesis.
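
The sketch below (seed and sample size are arbitrary choices) keeps drawing samples under a true null until one lands in the critical region, showing that "significant" results occur even when the null is exactly true:

set.seed(4)
p <- 1
tries <- 0
while (p >= 0.05) {               # the null is true for every sample drawn here
  p <- t.test(rnorm(30))$p.value
  tries <- tries + 1
}
cat("Type I error after", tries, "null samples; p-value:", p, "\n")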

Misconception 5: A non-significant result (p-value above the threshold) means there is no effect or the study was wrong.

Reality: A non-significant result simply means there was not enough evidence to reject the null hypothesis given the data. It could be due to a small effect size, insufficient power, or just random chance.
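
One way to see this is through statistical power. In the sketch below (n = 15 per group and a 0.5 SD effect are arbitrary choices), base R's power.t.test() shows that such a study would detect a genuine effect only about a quarter of the time:

power.t.test(n = 15, delta = 0.5, sd = 1, sig.level = 0.05)$power
# roughly 0.25: most such studies would miss a real 0.5 SD effect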

Misconception 6: The p-value tells you the magnitude of the effect or its clinical importance.

Reality: The p-value does not give any information about how large or important an effect is. Other statistics, like the effect size and confidence intervals, are needed to understand the magnitude of an effect.
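
A sketch of reporting magnitude alongside the p-value (the groups and effect below are arbitrary choices; the pooled-SD formula assumes equal group sizes):

set.seed(6)
g1 <- rnorm(50)
g2 <- rnorm(50, mean = 0.4)
tt <- t.test(g1, g2)
tt$conf.int                       # plausible range for the mean difference
pooled_sd <- sqrt((var(g1) + var(g2)) / 2)
(mean(g2) - mean(g1)) / pooled_sd # Cohen's d, a standardized effect size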

Misconception 7: P-values are not affected by sample size.

Reality: P-values are highly sensitive to sample size. With very large samples, even tiny, trivial effects can produce very small p-values, while in small samples even large effects may fail to reach statistical significance (see the simulation below that tracks p-values as the sample size grows).

Misconception 8: P-values provide a measure of the probability that the observed data was produced by chance alone.

Reality: The p-value does not measure the probability that the observed data was produced by chance alone. It measures how compatible the data are with the null hypothesis.
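
To make this concrete, the p-value can be reproduced by hand as a tail area under the null t-distribution (a sketch with arbitrary data; for a one-sample two-sided test, df is n - 1):

set.seed(8)
x <- rnorm(40, mean = 0.3)
tt <- t.test(x)
t_obs <- unname(tt$statistic)
2 * pt(abs(t_obs), df = 39, lower.tail = FALSE) # matches tt$p.value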

library(tidyverse)

# Define the mean and standard deviation (named mu/sigma to avoid masking base::mean)
mu <- 0
sigma <- 1

# Define the critical value (z = 1.96, the upper cutoff of a two-sided test at the 95% confidence level)
critical_value <- 1.96

# Create a sequence of x values for the normal distribution
x <- seq(-4, 4, length.out = 1000)
# Calculate the density of the normal distribution for each x value
y <- dnorm(x, mu, sigma)

# Create a dataframe for plotting
data <- data.frame(x, y)

# Plot the normal distribution
p <- ggplot(data, aes(x, y)) +
  geom_line() +
  geom_area(data = subset(data, x > critical_value), fill = "red", alpha = 0.5) +
  geom_vline(xintercept = critical_value, linetype = "dashed", color = "blue") +
  annotate("text", x = critical_value + 0.2, y = 0.1, label = "Critical value", hjust = 0) +
  annotate("text", x = critical_value + 0.2, y = 0.05, label = "Critical region (p-value)", hjust = 0) +
  ggtitle("Visualization of P-Value Calculation") +
  xlab("Test Statistic") +
  ylab("Density") +
  theme_minimal()

# Display the plot
print(p)

[Figure: Visualization of P-Value Calculation]

# Simulate 10000 p-values under the null hypothesis
set.seed(123) # For reproducibility
p_values <- replicate(10000, {
  sample_data <- rnorm(30) # 30 samples from a standard normal distribution (mean=0, sd=1)
  test_result <- t.test(sample_data)
  test_result$p.value
})

# Plot the distribution of p-values
hist(p_values, breaks=40, main="Distribution of p-values under the Null Hypothesis", xlab="p-value")
abline(v=0.05, col="red", lwd=2, lty=2)

[Figure: Distribution of p-values under the Null Hypothesis]
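
A follow-up check on the simulation above: under the null hypothesis, p-values are (approximately) uniform on [0, 1], so the fraction falling below 0.05 should sit near 5%.

mean(p_values < 0.05) # expected to be close to 0.05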

# Generating data for null and alternative hypothesis distributions
x_values <- seq(-4, 4, length.out = 1000)
null_distribution <- dnorm(x_values, mean = 0, sd = 1)
alt_distribution <- dnorm(x_values, mean = 1, sd = 1)
critical_value <- qnorm(0.95, mean = 0, sd = 1)  # For a right-tailed test
p_value_area_x <- seq(critical_value, 4, length.out = 1000)
p_value_area_y <- dnorm(p_value_area_x, mean = 0, sd = 1)

# Creating a data frame for ggplot
data <- data.frame(x_values, null_distribution, alt_distribution)

# Plotting
ggplot() +
  geom_line(data = data, aes(x = x_values, y = null_distribution, color = "Null Hypothesis Distribution"), linewidth = 1) +
  geom_line(data = data, aes(x = x_values, y = alt_distribution, color = "Alternative Hypothesis Distribution"), linewidth = 1) +
  geom_area(data = data.frame(p_value_area_x, p_value_area_y), aes(x = p_value_area_x, y = p_value_area_y), fill = "skyblue", alpha = 0.5) +
  geom_vline(xintercept = critical_value, linetype = "dashed", color = "grey") +
  annotate("text", x = critical_value, y = 0.02, label = "Critical Value", hjust = 1) +
  annotate("text", x = critical_value + 2, y = 0.02, label = "P-value Area", color = "blue") +
  scale_color_manual(values = c("Null Hypothesis Distribution" = "skyblue", "Alternative Hypothesis Distribution" = "salmon")) +
  labs(title = "Corrected Visualization of P-value",
       x = "Value",
       y = "Probability Density") +
  theme_minimal() +
  theme(legend.title = element_blank())

[Figure: Null and alternative distributions with the p-value tail area shaded]

# Simulate p-values with increasing sample sizes
set.seed(123)
sample_sizes <- seq(10, 1000, by = 50)
p_values <- sapply(sample_sizes, function(n) {
  data <- rnorm(n, mean = 0.5) # Simulate data with a small effect
  t_test <- t.test(data)
  return(t_test$p.value)
})

data <- data.frame(sample_sizes, p_values)

# Plot
ggplot(data, aes(x = sample_sizes, y = p_values)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(trans = 'log10') + # Log scale to better display p-values
  ggtitle("P-values with Increasing Sample Sizes") +
  xlab("Sample Size") +
  ylab("P-value (log scale)") +
  theme_minimal()

[Figure: P-values with Increasing Sample Sizes]
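
Since the plot above draws a single simulated dataset per sample size, the curve is noisy. A sketch that averages over replicates (200 per sample size and the seed are arbitrary choices) shows the downward trend more cleanly:

set.seed(42)
median_p <- sapply(sample_sizes, function(n) {
  median(replicate(200, t.test(rnorm(n, mean = 0.5))$p.value))
})
plot(sample_sizes, median_p, type = "b", log = "y",
     xlab = "Sample Size", ylab = "Median p-value (log scale)")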

# Define effect sizes with 0.1 increments and sample size
effect_sizes <- seq(0.1, 1.0, by = 0.1) # Effect sizes from 0.1 to 1.0 in increments of 0.1
sample_size <- 100

# Simulate data and calculate p-values for each effect size
p_values <- sapply(effect_sizes, function(d) {
  control_group <- rnorm(sample_size) # Control group, mean = 0
  treatment_group <- rnorm(sample_size, mean = d) # Treatment group with effect size 'd'
  t_test <- t.test(control_group, treatment_group) # Compare the two groups
  return(t_test$p.value)
})

data <- data.frame(effect_sizes, p_values)

# Plotting the relationship between effect size and P-values
ggplot(data, aes(x = effect_sizes, y = p_values)) +
  geom_point() +
  geom_line() + # Adding a line to connect points for better visualization
  ggtitle("P-values for Different Effect Sizes (0.1 Increments)") +
  xlab("Effect Size") +
  ylab("P-value") +
  theme_minimal()

[Figure: P-values for Different Effect Sizes (0.1 Increments)]
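
The simulation above draws one dataset per effect size, so the curve wiggles; a theoretical counterpart can be computed directly with base R's power.t.test(), reusing the effect_sizes and sample_size defined above:

power_by_d <- sapply(effect_sizes, function(d) {
  power.t.test(n = sample_size, delta = d, sd = 1, sig.level = 0.05)$power
})
round(power_by_d, 2) # probability of p < 0.05 at each effect size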

# Define a range of t-values from 0 to 6 and a set of specific sample sizes
sample_sizes <- c(3, 4, 5, 10, 20)
t_values <- seq(0, 6, length.out = 300)

# Calculate p-values for each combination of t-value and sample size
plot_data <- data.frame(
  t_value = rep(t_values, times = length(sample_sizes)),
  p_value = unlist(lapply(sample_sizes, function(n) {
    2 * pt(t_values, df = n - 1, lower.tail = FALSE)
  })),
  sample_size = factor(rep(sample_sizes, each = length(t_values)))
)

# Plot the relationship between p-values and t-statistics
p <- ggplot(plot_data, aes(x = t_value, y = p_value, color = sample_size)) +
  geom_line() +
  labs(title = "P-values of t-statistics for Different Sample Sizes",
       x = "t-statistic",
       y = "P-value",
       color = "Sample Size") +
  theme_minimal()

print(p)

[Figure: P-values of t-statistics for Different Sample Sizes]

# Simulate data with a known small effect size under the alternative hypothesis
set.seed(123)
effect_size <- 0.3
sample_data_null <- rnorm(100) # 100 samples from a standard normal distribution
sample_data_alt <- rnorm(100, mean=effect_size) # 100 samples from a normal distribution with a small effect size

# Conduct a t-test to compare the two groups
test_result <- t.test(sample_data_null, sample_data_alt)

# Output the p-value and effect size
cat("p-value:", test_result$p.value, "\nEffect size:", effect_size)

p-value: 0.4437586
Effect size: 0.3
# Simulate data where the null hypothesis is false
set.seed(123)
true_effect <- 0.5
sample_data_control <- rnorm(100) # Control group with no effect
sample_data_treatment <- rnorm(100, mean=true_effect) # Treatment group with a true effect

# Conduct a t-test
test_result <- t.test(sample_data_control, sample_data_treatment)

# Check if p-value is below 0.05 and output result
if(test_result$p.value < 0.05) {
  cat("Significant at p < 0.05. p-value:", test_result$p.value, "\n")
} else {
  cat("Not significant at p < 0.05. p-value:", test_result$p.value, "\n")
}

Significant at p < 0.05. p-value: 0.02420057
