###------------------------------------------------------------------### ### BRM Session 7 - Hypothesis testing: Analysis of Variance (ANOVA) ### ###------------------------------------------------------------------### # Dennis Abel and Lukas Birkenmaier # Intro and setup --------------------------------------------------------- # Based on exercises in chapter 6 of Sarstedt & Mooi (2019) # Load required packages library(tidyverse) # Many functions including ggplot library(car) # For Levene's test library(lsr) # several useful functions, especially post hoc and effect estimations # Working directory setwd("[insert directory here]") # Use read_rds-function to load rds data oddjob <- read_rds("oddjob.rds") # Research design --------------------------------------------------------- # Our aim is to identify the factors that influence customers' overall price/performance # satisfaction with the airline and explore the relevant target groups for future # advertising campaigns. The following question should be answered: # 1. Does the overall price/performance satisfaction differ according to the # traveler's status? # In order to address this question, we need the following variables: # overall price/performance satisfaction: overall_sat hist(oddjob$overall_sat) # Traveler's status: status counts_status <- table(oddjob$status) barplot(counts_status) # Formulate hypothesis # 1. H0: Overall satisfaction means are the same between status groups # 1. H1: Overall satisfaction means differ at least for two status groups # Choose significance level # Use significance level (alpha) of 0.05, which means that we allow a maximum # chance of 5% of mistakenly rejecting a true null hypothesis # Select appropriate test # We start by defining the testing situation, which is to compare the mean overall # price/performance satisfaction scores across three status groups. # [The textbook classifies these scores as measured on a ratio scale, I assume this is a typo # and it means interval]. # We know that a comparison of three or more groups involve a one-way ANOVA or Kruskal-Wallis test. # What we know about this sample is that it is a random subset and we also know # that these are independent observations # Check assumptions ------------------------------------------------------- # Next, we need to check if our dependent variable is normally distributed # Shapiro-Wilk test for whole sample shapiro.test(oddjob$overall_sat) # Shapiro-Wilk test for each status group by(oddjob$overall_sat, oddjob$status, shapiro.test) # The results show that the p-values of the Shapiro-Wilk test are smaller than .05, indicating that the # normality assumption is violated for all three status groups. # Visual inspection with quantile plots (qqplots) ggplot(oddjob, aes(sample = overall_sat, colour = status)) + stat_qq() + stat_qq_line()+ facet_wrap(~status) # The dots appear to follow the line reasonably well but deviations are # detectable. Visual inspection unclear but Shapiro-Wilk test is definitive. # As we find no support for normality, we may have to use the a one-way ANOVA F-test when # the variances are equal or a Kruskal-Wallis rank test when the variances are not equal. # So the decision on which to choose depends on whether the variances are equal. # We inspect this with Levene's test leveneTest(oddjob$overall_sat, oddjob$status) leveneTest(oddjob$overall_sat, oddjob$status, center=mean) ## By default, the test utilizes # the median. We can switch to mean with the option "center=mean" # The Levene's test statistic clearly suggests that the population variances are equal, # as the test's p-value (0.404) is well above 0.05. Thus, we can utilize the one-way # Anova F-test. # Calculate test statistic and effect size -------------------------------- # We will use the aov-function to calculate the test anova <- aov(formula = overall_sat ~ status, data = oddjob) summary(anova) # R doesn't use the terms "between-group" and "within-group". The between groups variance # corresponds to the effect that the status has on the outcome variable and the within # groups variance corresponds to the ``leftover'' variability, so it calls that the # residuals. # Between group sum of squares SSb = 51.76 # Within group sum of squares SSw = 2758.46 # The model has an F-value of 9.963. We also learn that the p-value is smaller than the # significance level (0.001). Therefore, # we can reject the null hypothesis that there is no difference # in satisfaction between status groups and conclude that the overall # price/performance satisfaction differs significantly across at least two of the # three status groups. # Post-hoc test to identify which groups differ # The textbook suggests various post hoc tests depending on whether population variances # and sample sizes per group differ. The available post-hoc tests in SPPS exceed the # available approaches in R. For our purposes, we can use the widely used Holm # correction or the conservative Tukey's honestly significant difference test (Tukey's HSD) # We use the posthocPairwiseT-function from the lsr-package posthocPairwiseT(anova, p.adjust.method="holm") # Writing a conclusion on pairwise comparisons: # Post hoc tests (using the Holm correction to adjust p) indicated that Blue status group # members have a significantly larger satisfaction with the price/performance ratio of # Oddjob Airways compared to Silver group members (p=.0008) and Gold group members (p=.0021). # We found no evidence that Silver and Gold members differ (p=.78). # Alternative TukeyHSD TukeyHSD(anova) plot(TukeyHSD(anova, conf.level=.95), las = 2) # Furthermore, we can calculate the effect size. # For this, we use the etaSquared-function from the lsr-package etaSquared(x=anova) # The first output corresponds to the eta-squared statistic. The second one is a partial # eta-squared and only relevant for more complicated ANOVAs (not covered in this course). # Remember, the eta-squared is the between sum of squares divided by the total sum of squares. # The interpretation is straightforward: it refers to the proportion of the variability # in the outcome variable than can be explained in terms of the predictor. # Eta-squared is closely related to the squared correlation and the square-root of eta-squared # can be interpreted as if it referred to the magnitude of a Pearson correlation. sqrt(0.0184) ## 0.13 equals a rather small relationship. # Interpret the results: # 1. Gold and Blue members, as well as Silver and Blue members, differ significantly in # their mean overall price/performance satisfaction. # 2. There is no difference between Gold and Silver member in their mean overall price/ # performance satisfaction. # 3. Membership status explains only a minimal share of the customer's price/performance # satisfaction. Hence, other factors - presently not included in the model - explain # the remaining variation in the outcome variable.