Chapter 11 The Power of R & R Markdown

Indentation and spacing are very important! In the body text, when in doubt, add more blank lines and spaces rather than fewer. In the YAML header, entries may not be recognized properly if they are indented incorrectly.
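To see why spacing in the YAML header matters, the header can be parsed directly. The following is a minimal sketch, assuming the yaml package is installed (it comes along as a dependency of rmarkdown); the two header strings are made-up examples, not taken from the course material.

```r
library(yaml)

# Correctly indented header: toc is nested under html_document,
# which in turn is nested under output
good_header <- "
output:
  html_document:
    toc: true
"

# Same header without indentation: all three keys end up on the top level
bad_header <- "
output:
html_document:
toc: true
"

good <- yaml.load(good_header)
bad  <- yaml.load(bad_header)

good$output$html_document$toc  # the option sits where it was nested
is.null(bad$output)            # output no longer contains any settings
```

In the second header the file still parses without an error, which is why this kind of mistake is easy to overlook.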

11.1 Reporting with the `apa` & `papaja` packages

You already know the papaja package from theme_apa() in data visualization.
The package also has many wrapper functions that make reporting in R & R Markdown a lot easier:
- apa_print()
- $\chi^2$ test reporting cannot be achieved with this, so we use apa::chisq_apa() for that

11.1.1 Usage

# Chi-squared test (add format = "rmarkdown" if needed)
apa::chisq_apa(chisq.test(seminar$v07_genre, seminar$soul_dummy))
# Warning in chisq.test(seminar$v07_genre, seminar$soul_dummy): Chi-squared
# approximation may be incorrect
# chi^2(2) = 2.14, p = .344

# t-test
papaja::apa_print(t.test(v05_skill_tech ~ v01_gender, data = seminar, alternative = "greater"))$full_result
# [1] "$\\Delta M = 10.50$, 95\\% CI $[-109.56, \\infty]$, $t(1.28) = 0.39$, $p = .377$"

# ANOVA
papaja::apa_print(aov(v08_loudness ~ as.factor(v07_genre), data = seminar))$full_result |> suppressMessages()
# $as_factorv07_genre
# [1] "$F(2, 10) = 2.56$, $\\mathit{MSE} = 120.80$, $p = .126$, $\\hat{\\eta}^2_G = .339$"
# suppressMessages() is a useful helper function that suppresses any messages emitted by the expression it wraps. Here the message would just tell us: "For one-way between subjects designs, generalized eta squared is equivalent to eta squared. Returning eta squared."

11.1.2 Usage in R Markdown

By using apa_print(model)$full_result in inline code, we can automatically report results inside our documents.
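As a minimal sketch of this workflow, the code below stores the result of a t-test, and the final comment shows how it could be referenced in the running text. It uses R's built-in sleep data as a stand-in for the seminar data, and the object name tt is only illustrative.

```r
library(papaja)

# Welch t-test on the built-in sleep data (stand-in for the seminar data)
tt <- apa_print(t.test(extra ~ group, data = sleep))

# For a t-test, full_result is a single string containing LaTeX math
tt$full_result

# In the text of the .Rmd file (outside of code chunks), one could then write:
# The groups were compared with a Welch t-test, `r tt$full_result`.
```

Storing the apa_print() result in an object first keeps the inline code short and avoids re-running the test for every value that is reported.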

11.2 The `papaja` Package

Reporting is very easy with these packages.
They provide "wrapper" functions around statistical tests that allow for dynamic, correctly formatted reports.
Most of this can be accomplished with the papaja package.
Only the $\chi^2$ test needs the apa package, but the principle is the same.

11.2.1 Markdown Templates

papaja has R Markdown templates and they are awesome for writing scientifically sound papers according to APA guidelines

There are many other awesome templates that are available for styling and creating R Markdown documents. The example in Figure 10.1 was written with the oxforddown template! You can write nicely formatted letters with the linl package!

11.3 Taking full advantage of R

Conduct Analysis, e.g. an ANOVA
Write about your analysis & report results (with proper statistics)
Choose and create an appropriate visualization, e.g. a boxplot

11.3.1 Walkthrough

We will use the palmerpenguins package (install.packages("palmerpenguins")) and see whether penguins of different species have different bill lengths.

library(palmerpenguins)

# Test assumption of homoscedasticity
car::leveneTest(aov(bill_length_mm ~ species, data = penguins)) # not significant, so equal variances can be assumed
# Levene's Test for Homogeneity of Variance (center = median)
#        Df F value Pr(>F)
# group   2  2.2425 0.1078
#       339

# Run & save ANOVA
penguinmodel <- aov(bill_length_mm ~ species, data = penguins)

# Run post-hoc tests (Tukey HSD)
TukeyHSD(penguinmodel)
#   Tukey multiple comparisons of means
#     95% family-wise confidence level
# 
# Fit: aov(formula = bill_length_mm ~ species, data = penguins)
# 
# $species
#                       diff       lwr        upr     p adj
# Chinstrap-Adelie 10.042433  9.024859 11.0600064 0.0000000
# Gentoo-Adelie     8.713487  7.867194  9.5597807 0.0000000
# Gentoo-Chinstrap -1.328945 -2.381868 -0.2760231 0.0088993

For the ANOVA, we want to report the full statistics, i.e. the full_result element returned by papaja::apa_print().

library(papaja)
apa_result <- apa_print(penguinmodel)$full_result

Create a boxplot and add results right in the caption!

library(ggplot2)

# na.rm belongs in the geom; ggplot() itself silently ignores it
ggplot(penguins, aes(x = species, y = bill_length_mm, color = species, fill = species)) +
  geom_boxplot(alpha = .7, na.rm = TRUE) + theme_apa() + theme(legend.position = "none") +
  labs(x = "Penguin Species", y = "Bill Length (mm)", 
       caption = latex2exp::TeX(unlist(apa_result))) +
  scale_color_brewer(palette = 5) +
  scale_fill_brewer(palette = 5)

Note that it is a little complex to include the result right in the caption:
- ggplot2 cannot render LaTeX notation directly, so we need the latex2exp package
- the TeX() function from that package cannot handle the list output of apa_print()$full_result, so we need to unlist() it
Using R Markdown, it is much easier to include the statistic right in the text with inline code:
- `r apa_print(penguinmodel)$full_result`
- $\rightarrow$ $F(2, 339) = 410.60$, $\mathit{MSE} = 8.76$, $p < .001$, $\hat{\eta}^2_G = .708$
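The list output mentioned above can be inspected directly. This is a small sketch under the assumption that palmerpenguins and papaja are installed; the names res and caption_text are only illustrative.

```r
library(palmerpenguins)
library(papaja)

penguinmodel <- aov(bill_length_mm ~ species, data = penguins)
res <- suppressMessages(apa_print(penguinmodel))

# For an ANOVA, full_result is a list with one entry per model term
is.list(res$full_result)
names(res$full_result)

# unlist() flattens it to a character vector, which latex2exp::TeX() accepts
caption_text <- unlist(res$full_result)
is.character(caption_text)
```

With only one term in the model, unlist() returns a single string, so it can be passed to the caption as it is.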

Exercise

Create an R Markdown Document that includes the same type of analyses of variance with all steps that we have conducted before. Please use the penguins dataset from the palmerpenguins package and take "species" as the independent variable and "flipper_length_mm" as the dependent variable.

car::leveneTest(aov(flipper_length_mm ~ species, data = penguins))
penguinmodel2 <- aov(flipper_length_mm ~ species, data = penguins)
posthoc <- TukeyHSD(penguinmodel2)
apa_result <- apa_print(penguinmodel2)$full_result

ggplot(penguins) +
  geom_boxplot(aes(x = species, y = flipper_length_mm, color = species, fill = species), 
               alpha = .7, na.rm = TRUE) +
  theme_apa() + 
  theme(legend.position = "none") +
  labs(x = "Penguin Species", y = "Flipper Length (mm)") +
  scale_color_brewer(palette = 5) +
  scale_fill_brewer(palette = 5)

Wrap-Up & Further Resources

Rmd offers many options for customization
Analyses can be conducted and reported in the same document
We can benefit from many automations, e.g. chapter numbering
Single values/results can be reported with inline code
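How inline code is evaluated can be tried out without creating a file. The following is a minimal sketch, assuming the knitr package is installed (it is what R Markdown uses under the hood); the sentence in rmd_text is a made-up example based on the built-in mtcars data.

```r
library(knitr)

# A one-line R Markdown document containing an inline code expression
rmd_text <- "The mtcars data set has `r nrow(mtcars)` rows."

# knit() evaluates the inline expression and returns the rendered text
rendered <- knit(text = rmd_text, quiet = TRUE)
rendered
```

The same mechanism runs when a full .Rmd file is knitted, so any value computed in a chunk can be dropped into a sentence this way.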

Palmer Penguins, from the palmerpenguins package