Non-parametric methods provide powerful alternatives to parametric tests, especially useful when data do not meet the assumptions required for parametric analyses. Here’s a concise overview of the main types of non-parametric methods and their typical applications:
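Mann-Whitney U (Wilcoxon rank-sum) test, two independent groups: wilcox.test(x ~ group, data = df)
Wilcoxon signed-rank test, two paired samples: wilcox.test(before, after, paired = TRUE)
Kruskal-Wallis test, three or more independent groups: kruskal.test(x ~ group, data = df)
Friedman test, repeated measures across three or more conditions: friedman.test(x ~ time | subject, data = df)
Spearman rank correlation, monotonic association between two variables: cor.test(x, y, method = "spearman")
Chi-square test of independence, two categorical variables: chisq.test(table(df$Var1, df$Var2))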
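Of the calls listed above, friedman.test is the one whose data layout is least obvious, and it is not revisited in the worked examples. The following is a minimal sketch on invented data: the `scores` matrix, its values, and the names `res_matrix`, `long`, and `res_formula` are all hypothetical. It shows the matrix interface (rows are subjects, columns are conditions) next to the equivalent formula interface.

```r
# Hypothetical repeated-measures data: 6 subjects (rows), 3 time points (columns).
# The values are invented purely for illustration.
scores <- matrix(
  c(5, 7, 9,
    4, 6, 8,
    6, 6, 7,
    3, 5, 9,
    5, 8, 8,
    4, 7, 10),
  nrow = 6, byrow = TRUE,
  dimnames = list(subject = paste0("s", 1:6), time = c("t1", "t2", "t3"))
)

# Matrix interface: rows are blocks (subjects), columns are groups (time points)
res_matrix <- friedman.test(scores)

# Equivalent long-format call, matching the formula form x ~ time | subject
long <- as.data.frame(as.table(scores), responseName = "x")
res_formula <- friedman.test(x ~ time | subject, data = long)

res_matrix
```

Both calls rank the values within each subject, so they should report the same statistic.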
The ToothGrowth dataset in R examines the impact of Vitamin C on tooth growth in guinea pigs. It includes 60 observations across three variables:
len: Length of odontoblasts (measured in micrometers) indicating tooth growth.
supp: Type of Vitamin C supplement given (OJ for orange juice, VC for ascorbic acid).
dose: Daily Vitamin C dose in milligrams (0.5, 1, and 2 mg/day).
This dataset is useful for statistical analysis, exploring how Vitamin C dosage and supplement type influence tooth growth.
# Load the ToothGrowth data
data("ToothGrowth")
# Load necessary libraries
library(ggplot2)
library(dplyr)
library(tidyr)
# Summary statistics
summary(ToothGrowth)
aggregate(len ~ supp + dose, data = ToothGrowth, FUN = function(x) c(mean = mean(x), sd = sd(x)))
len supp dose
Min. : 4.20 OJ:30 Min. :0.500
1st Qu.:13.07 VC:30 1st Qu.:0.500
Median :19.25 Median :1.000
Mean :18.81 Mean :1.167
3rd Qu.:25.27 3rd Qu.:2.000
Max. :33.90 Max. :2.000
supp | dose | len (mean, sd) |
---|---|---|
<fct> | <dbl> | <dbl[,2]> |
OJ | 0.5 | 13.23, 4.459709 |
VC | 0.5 | 7.98, 2.746634 |
OJ | 1.0 | 22.70, 3.910953 |
VC | 1.0 | 16.77, 2.515309 |
OJ | 2.0 | 26.06, 2.655058 |
VC | 2.0 | 26.14, 4.797731 |
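Because the tests used later work on ranks rather than means, medians and interquartile ranges are a natural companion to the mean/sd table. A minimal sketch using the same aggregate pattern (the name `med_iqr` is just an illustrative choice):

```r
data("ToothGrowth")

# Median and interquartile range of tooth length for each supplement/dose cell
med_iqr <- aggregate(
  len ~ supp + dose,
  data = ToothGrowth,
  FUN = function(x) c(median = median(x), IQR = IQR(x))
)
med_iqr
```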
# Histogram of tooth lengths
ggplot(ToothGrowth, aes(x = len)) +
  geom_histogram(binwidth = 1, fill = "blue", color = "black") +
  ggtitle("Distribution of Tooth Lengths") +
  xlab("Tooth Length") + ylab("Frequency")
# Kernel density estimate (KDE) of tooth lengths
ggplot(ToothGrowth, aes(x = len)) +
  geom_density(fill = "blue", alpha = 0.5) +
  ggtitle("Kernel Density Estimation of Tooth Lengths") +
  xlab("Tooth Length") + ylab("Density")
# Boxplot of tooth length by supplement type
ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  ggtitle("Tooth Length by Supplement Type") +
  xlab("Supplement Type") + ylab("Tooth Length")
# Boxplot of tooth length by dose (dose treated as a factor)
ggplot(ToothGrowth, aes(x = as.factor(dose), y = len, fill = as.factor(dose))) +
  geom_boxplot() +
  ggtitle("Tooth Length by Dose") +
  xlab("Dose (mg/day)") + ylab("Tooth Length")
# Mean tooth length for each supplement and dose combination
ToothGrowth_agg <- ToothGrowth %>%
  group_by(supp, dose) %>%
  summarise(mean_len = mean(len), .groups = "drop")
# Line plot of the group means, one line per supplement
ggplot(ToothGrowth_agg, aes(x = dose, y = mean_len, group = supp, color = supp)) +
  geom_line() +
  geom_point() +
  labs(title = "Mean Tooth Length by Supplement and Dose",
       x = "Dose (mg/day)", y = "Mean Tooth Length") +
  theme_minimal()
# Normality check with QQ plots
qqnorm(ToothGrowth$len)
qqline(ToothGrowth$len)
# Test normality
shapiro.test(ToothGrowth$len)
Shapiro-Wilk normality test
data: ToothGrowth$len
W = 0.96743, p-value = 0.1091
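The Shapiro-Wilk test above pools all 60 animals, which mixes six treatment groups. For a group comparison, what usually matters is the distribution within each group, so a per-group check can be a useful complement. A minimal sketch (the names `p_by_supp` and `p_by_cell` are illustrative; with only 10 animals per supplement-by-dose cell, these tests have little power and should be read alongside the QQ plots):

```r
data("ToothGrowth")

# Shapiro-Wilk p-value within each supplement group (n = 30 each)
p_by_supp <- tapply(ToothGrowth$len, ToothGrowth$supp,
                    function(x) shapiro.test(x)$p.value)

# Shapiro-Wilk p-value within each supplement-by-dose cell (n = 10 each)
p_by_cell <- aggregate(len ~ supp + dose, data = ToothGrowth,
                       FUN = function(x) shapiro.test(x)$p.value)

p_by_supp
p_by_cell
```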
To compare the tooth length (len) between the two supplement types (supp), you can use the Mann-Whitney U test (also known as the Wilcoxon rank-sum test).
wilcox.test(len ~ supp, data = ToothGrowth)
Warning message in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...):
“cannot compute exact p-value with ties”
Wilcoxon rank sum test with continuity correction
data: len by supp
W = 575.5, p-value = 0.06449
alternative hypothesis: true location shift is not equal to 0
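This test assesses whether tooth lengths under orange juice (OJ) tend to differ from those under ascorbic acid (VC), without assuming the data are normally distributed. It is often described as a comparison of medians, which is accurate when the two distributions have the same shape. Here p = 0.06449, so the difference is not significant at the 0.05 level. The warning appears because tied values rule out an exact p-value, so R falls back to a normal approximation.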
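A p-value alone says little about the size of the difference. As a sketch of one way to quantify it, wilcox.test can also return an estimated location shift with a confidence interval, and the W statistic can be rescaled into an estimated probability that an OJ animal has a longer measurement than a VC animal. The names `res` and `prob_superiority` are illustrative.

```r
data("ToothGrowth")

# exact = FALSE requests the normal approximation up front (so no ties warning);
# conf.int = TRUE adds an estimate of the location shift with a confidence interval
res <- wilcox.test(len ~ supp, data = ToothGrowth,
                   exact = FALSE, conf.int = TRUE)
res$estimate   # estimated shift, OJ minus VC
res$conf.int

# W counts the (OJ, VC) pairs in which the OJ value is larger (ties count 1/2),
# so W / (n_OJ * n_VC) estimates P(OJ > VC) + 0.5 * P(tie)
n_oj <- sum(ToothGrowth$supp == "OJ")
n_vc <- sum(ToothGrowth$supp == "VC")
prob_superiority <- unname(res$statistic) / (n_oj * n_vc)
prob_superiority
```

With W = 575.5 and 30 animals per group, this ratio is 575.5 / 900, or roughly 0.64.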
For comparing the effects of different dose levels (0.5, 1, and 2 mg/day) on tooth length, you can use the Kruskal-Wallis test.
kruskal.test(len ~ dose, data = ToothGrowth)
Kruskal-Wallis rank sum test
data: len by dose
Kruskal-Wallis chi-squared = 40.669, df = 2, p-value = 1.475e-09
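This test extends the Mann-Whitney U test to more than two groups. It checks whether at least one dose level tends to produce larger or smaller tooth lengths than the others, without assuming a normal distribution. Here p = 1.475e-09, which is strong evidence that tooth length differs across doses. The test does not say which doses differ from which.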
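A significant Kruskal-Wallis result is usually followed by pairwise comparisons to locate the differences. One common option in base R, though not the only one, is pairwise Wilcoxon rank-sum tests with a multiplicity correction. The sketch below uses the Holm adjustment, which is an assumption of this example rather than something the analysis above prescribes.

```r
data("ToothGrowth")

# Pairwise Wilcoxon rank-sum tests between dose levels, Holm-adjusted;
# exact = FALSE is passed through to wilcox.test() because the data contain ties
posthoc <- pairwise.wilcox.test(ToothGrowth$len, ToothGrowth$dose,
                                p.adjust.method = "holm", exact = FALSE)
posthoc
```

The result holds a lower-triangular matrix of adjusted p-values in `posthoc$p.value`, one entry per pair of dose levels.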
ToothGrowth has no paired observations, since each guinea pig received a single treatment. If the data were paired, for example tooth length measured before and after treatment in the same animals, you would use the Wilcoxon signed-rank test.
# Hypothetical example if you had pre- and post-treatment measurements
#wilcox.test(preTreatmentLengths, postTreatmentLengths, paired = TRUE)
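This test checks whether the paired differences are centered on zero (that is, whether their median is zero, assuming the differences are symmetrically distributed), which suits before-after studies and matched pairs.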
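For a version that actually runs, the sketch below swaps in R's built-in `sleep` dataset, which is not part of the ToothGrowth analysis: it records the extra hours of sleep for the same 10 subjects under two drugs, so the measurements are genuinely paired. The names `drug1`, `drug2`, and `res_paired` are illustrative.

```r
data("sleep")

# Rows with the same ID belong to the same subject, so they form a pair
drug1 <- sleep$extra[sleep$group == "1"]
drug2 <- sleep$extra[sleep$group == "2"]

# Confirm the two vectors line up subject by subject before pairing
stopifnot(identical(sleep$ID[sleep$group == "1"], sleep$ID[sleep$group == "2"]))

# exact = FALSE avoids warnings caused by tied and zero differences
res_paired <- wilcox.test(drug1, drug2, paired = TRUE, exact = FALSE)
res_paired
```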
While non-parametric tests do not assume normality, they still have assumptions (e.g., independence of observations). Always check that these assumptions are met before proceeding. Additionally, rank-based tests compare the locations of distributions, often summarized as medians, rather than means, so interpret your results accordingly.
The faithful dataset comprises 272 observations of the Old Faithful geyser in Yellowstone National Park, featuring:
eruptions: Duration of eruptions (minutes).
waiting: Waiting time until the next eruption (minutes).
data("faithful")
summary(faithful)
hist(faithful$eruptions, breaks = 20, main = "Eruption Duration", xlab = "Duration (minutes)")
hist(faithful$waiting, breaks = 20, main = "Waiting Time to Next Eruption", xlab = "Time (minutes)")
eruptions waiting
Min. :1.600 Min. :43.0
1st Qu.:2.163 1st Qu.:58.0
Median :4.000 Median :76.0
Mean :3.488 Mean :70.9
3rd Qu.:4.454 3rd Qu.:82.0
Max. :5.100 Max. :96.0
qqnorm(faithful$eruptions)
qqline(faithful$eruptions)
qqnorm(faithful$waiting)
qqline(faithful$waiting)
# Shapiro-Wilk test for normality
shapiro.test(faithful$eruptions)
shapiro.test(faithful$waiting)
Shapiro-Wilk normality test
data: faithful$eruptions
W = 0.84592, p-value = 9.036e-16
Shapiro-Wilk normality test
data: faithful$waiting
W = 0.92215, p-value = 1.015e-10
# Categorize 'waiting' into two groups
faithful$waiting_group <- ifelse(faithful$waiting < median(faithful$waiting), "Short", "Long")
# Mann-Whitney U test
wilcox.test(eruptions ~ waiting_group, data = faithful)
Wilcoxon rank sum test with continuity correction
data: eruptions by waiting_group
W = 16952, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0
# Divide 'waiting' into three groups
faithful$waiting_quantiles <- cut(faithful$waiting, breaks = quantile(faithful$waiting, probs = c(0, 1/3, 2/3, 1)), include.lowest = TRUE, labels = c("Low", "Medium", "High"))
# Kruskal-Wallis test
kruskal.test(eruptions ~ waiting_quantiles, data = faithful)
Kruskal-Wallis rank sum test
data: eruptions by waiting_quantiles
Kruskal-Wallis chi-squared = 182.21, df = 2, p-value < 2.2e-16
# Spearman correlation
cor.test(faithful$eruptions, faithful$waiting, method = "spearman")
Warning message in cor.test.default(faithful$eruptions, faithful$waiting, method = "spearman"):
“Cannot compute exact p-value with ties”
Spearman's rank correlation rho
data: faithful$eruptions and faithful$waiting
S = 744659, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.7779721
With rho ≈ 0.78 and p < 2.2e-16, there is strong evidence against the null hypothesis of no monotonic relationship between the two variables: longer eruptions tend to be followed by longer waiting times.
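It can help to see what Spearman's rho computes: it is Pearson's correlation applied to the ranks of the data, with tied values given average ranks. The sketch below checks that identity on `faithful` and, for comparison, computes Kendall's tau, another rank-based measure of monotonic association. The variable names are illustrative.

```r
data("faithful")

# Spearman's rho is Pearson's correlation applied to the ranks
rho_spearman <- cor(faithful$eruptions, faithful$waiting, method = "spearman")
rho_ranks <- cor(rank(faithful$eruptions), rank(faithful$waiting))
all.equal(rho_spearman, rho_ranks)

# Kendall's tau is another rank-based measure of monotonic association
tau <- cor(faithful$eruptions, faithful$waiting, method = "kendall")
c(spearman = rho_spearman, kendall = tau)
```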
# Scatter plot of eruption duration against waiting time
ggplot(faithful, aes(x = waiting, y = eruptions)) +
  geom_point() +
  theme_minimal() +
  labs(title = "Correlation between Eruption Duration and Waiting Time",
       x = "Waiting Time to Next Eruption (minutes)",
       y = "Eruption Duration (minutes)")
# Same scatter plot with a fitted linear regression line
ggplot(faithful, aes(x = waiting, y = eruptions)) +
  geom_point() +
  geom_smooth(method = "lm", color = "blue") +
  theme_minimal() +
  labs(title = "Correlation between Eruption Duration and Waiting Time with Regression Line",
       x = "Waiting Time to Next Eruption (minutes)",
       y = "Eruption Duration (minutes)")
`geom_smooth()` using formula = 'y ~ x'
Non-parametric tests are essential in statistical analysis but come with certain limitations:
Lower Statistical Power: When the assumptions of a parametric test hold, non-parametric tests generally have somewhat less power, making true effects harder to detect.
Median Comparisons: Often focus on medians rather than means, which may not always provide the desired insight.
Complex Model Limitations: Less effective for analyzing complex relationships due to simpler modeling capabilities.
Reduced Parameter Information: Provide less information about population parameters, such as mean and variance.
Sensitivity to Ties: The presence of ties in data can affect the test’s efficiency and discrimination ability.
Interpretation Challenges: Results can be more difficult to interpret and communicate, especially to non-specialist audiences.
Limited Scale: Primarily suited for ordinal or nominal data, potentially underusing the information available in interval or ratio measurements.
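The power limitation can be explored by simulation. The sketch below is one possible setup, not a definitive benchmark: the function `simulate_power` and its defaults (sample size, shift, number of replications, seed) are assumptions chosen for illustration. It draws two normal samples many times and records how often the t-test and the Wilcoxon rank-sum test each reject at the 5% level.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Estimate power by simulation: the share of simulated datasets in which
# each test rejects H0 at the 5% level
simulate_power <- function(n = 20, shift = 0.8, reps = 1000) {
  rejections <- replicate(reps, {
    x <- rnorm(n)                # control group
    y <- rnorm(n, mean = shift)  # treated group, shifted by `shift` SDs
    c(t_test = t.test(x, y)$p.value < 0.05,
      wilcoxon = wilcox.test(x, y)$p.value < 0.05)
  })
  rowMeans(rejections)
}

power <- simulate_power()
power
```

For normally distributed data, the gap is usually small: the asymptotic relative efficiency of the Wilcoxon test against the t-test is 3/π, about 0.955. Replacing rnorm with a heavy-tailed generator such as rt(n, df = 3) is a quick way to see cases where the rank-based test may do better.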