A web visualisation that demonstrates the difference in power and Type 1 error rates between Welch’s and 2-sample t-tests can be found here.

  1. Try different means, standard deviations, and sample sizes for the two populations. See how the power and Type 1 error rates of Welch’s and 2-sample t-tests change.
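If the web visualisation is unavailable, the same comparison can be approximated by simulation. The following is a minimal sketch in base R, not the code behind the visualisation: the helper `rejection_rates()` and every parameter value (sample sizes, means, standard deviations, number of simulations) are illustrative assumptions. It draws two normal samples repeatedly and records how often each test rejects the null hypothesis at $\alpha = 0.05$.

```r
# Proportion of simulated datasets in which each test rejects H0.
# All argument values used below are illustrative assumptions.
rejection_rates <- function(n1, n2, mean1, mean2, sd1, sd2,
                            n_sim = 2000, alpha = 0.05) {
  p_values <- replicate(n_sim, {
    x <- rnorm(n1, mean = mean1, sd = sd1)
    y <- rnorm(n2, mean = mean2, sd = sd2)
    c(student = t.test(x, y, var.equal = TRUE)$p.value,   # pooled-variance 2-sample t-test
      welch   = t.test(x, y, var.equal = FALSE)$p.value)  # Welch's t-test
  })
  # p_values is a 2 x n_sim matrix; average the rejections per test
  rowMeans(p_values < alpha)
}

set.seed(1)

# Equal population means: the rejection rate estimates the Type 1 error rate
type1 <- rejection_rates(n1 = 10, n2 = 40, mean1 = 0, mean2 = 0, sd1 = 3, sd2 = 1)

# Different population means: the rejection rate estimates power
power <- rejection_rates(n1 = 10, n2 = 40, mean1 = 0, mean2 = 2, sd1 = 3, sd2 = 1)

print(rbind(type1, power))
```

The settings above give the smaller group the larger standard deviation, a situation in which the pooled-variance test tends to reject a true null hypothesis more often than the nominal 5%. Changing `n1`, `n2`, `sd1`, and `sd2` shows how the two tests converge when variances and sample sizes are equal.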

A common error when comparing two groups is to test each group mean separately against the same null hypothesized value, rather than directly comparing the two means with each other. This is known as the fallacy of indirect comparison.

  1. Do babies look more like their fathers or their mothers? Christenfeld and Hill (1995) tested this by obtaining pictures of a series of babies and their mothers and fathers. A photograph of each baby, along with photos of three possible mothers and of three possible fathers, was shown to a large number of volunteers. Each volunteer was asked to pick which woman and which man were the parents of the baby based on facial resemblance. The percentage of volunteers who correctly guessed a parent was used as the measure of a given baby’s resemblance to that parent.

    P5Q2

    1. If there were no facial resemblance of babies to parents, then the mean resemblance should be 33.3%, the percentage of correct guesses expected by chance. Convert the data frame from wide to long format.
    2. The package ggpubr is really handy for producing publication-ready plots. Plot a graph showing the mean and corresponding 95% confidence interval for each parent using ggerrorplot(). Add a horizontal line for resemblance = 33.3% using geom_hline().
    3. Conduct a one-sample t-test for each parent to test whether the mean resemblance differs from 33.3%.
    4. Based on the one-sample tests in the previous part, the researchers from the study concluded that babies resembled their fathers more than they resembled their mothers. Is this conclusion valid? Now conduct a two-sample t-test to test directly whether the mean resemblance differs between the two parents.
  2. A paper in Nature examined how fungi cause negative density dependence (NDD) in plants. The hypothesis was that a higher plant density attracts more fungal pests, causing reduced survival. This is one of their figures.

    Untitled

    1. This is their claim: “Suppressing fungi using the fungicide Amistar reduced the strength of NDD so that the mean slope was no longer significantly different from 1 ($t_{48}=-1.54, P=0.130$). Neither Ridomil nor Engeo reduced the strength of NDD, with the mean slope remaining significantly less than 1 in both treatments.” What might be a potential problem in their analysis?
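The mechanics of the baby-resemblance question (reshape, plot, one-sample tests, then a direct comparison) can be sketched as follows. This is an illustration under stated assumptions, not a model answer: the data frame `wide`, its column names, and all of its values are made up and are not the Christenfeld and Hill data. The sketch assumes the tidyr, ggplot2, and ggpubr packages are installed.

```r
library(tidyr)
library(ggplot2)
library(ggpubr)

# Hypothetical wide-format data: one row per baby, with the percentage of
# volunteers who correctly picked each parent (made-up values)
wide <- data.frame(
  baby   = 1:8,
  father = c(50, 45, 38, 60, 30, 42, 55, 36),
  mother = c(40, 35, 30, 48, 28, 44, 39, 33)
)

# Wide to long: one row per baby-parent combination
long <- pivot_longer(wide, cols = c(father, mother),
                     names_to = "parent", values_to = "resemblance")
long <- as.data.frame(long)

# Mean and 95% confidence interval per parent, with the chance level marked
p <- ggerrorplot(long, x = "parent", y = "resemblance", desc_stat = "mean_ci") +
  geom_hline(yintercept = 33.3, linetype = "dashed")
print(p)

# Indirect comparison: each parent tested separately against 33.3%
p_father <- t.test(wide$father, mu = 33.3)$p.value
p_mother <- t.test(wide$mother, mu = 33.3)$p.value

# Direct comparison: the two groups tested against each other
# (t.test() uses Welch's test by default)
p_direct <- t.test(resemblance ~ parent, data = long)$p.value

c(father_vs_chance = p_father, mother_vs_chance = p_mother, direct = p_direct)
```

Only `p_direct` addresses whether the two parents differ from each other. The two one-sample p-values can fall on opposite sides of 0.05 even when the direct comparison shows no clear difference, which is the same issue to look for in the fungicide claim.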

We then move on to the real-life data.

  1. Rising atmospheric carbon dioxide and higher tree mortality associated with climate change have been hypothesised to drive increases in the abundance of lianas (climbing woody vines) in tropical forests. Researchers counted the number of lianas in a series of 1-ha plot sites located in primary Amazonian forest in two surveys. The first survey was conducted between 1997 and 1999 and the second survey was an average of 13.6 years later, in 2012.

    P5Q4

    1. Plot a graph to visualise your data. Do any assumptions of the statistical tests appear to be violated?
    2. What is the mean abundance of lianas on the dates of the two surveys?
    3. Did the abundance of lianas change significantly between the two surveys? Carry out an appropriate test.
    4. How large was the increase? Calculate the 95% confidence interval for the mean change in liana abundance between the two survey dates. You will need to use the function qt(0.975, df = ___) to calculate the critical *t*-statistic and the formulae $CI=\bar X\pm t_{\alpha(2),df}SE$ and $SE=s/\sqrt n$.
  2. Conservation efforts include reintroduction of species into the wild from captive breeding programs. Researchers infected mice with a parasite to test the effect of rewilding. Rewilded mice were released into outdoor fenced enclosures, whereas control mice were kept in the lab. After 3 weeks, they compared the nematode burdens of mice in the two groups. The data are in $\log_{10}$ units of total nematode biomass per individual mouse (in nanograms).

    P5Q5

    1. Plot a graph to visualise your data. Do any assumptions of the statistical tests appear to be violated?
    2. Compute the mean nematode burdens in the two groups of mice. Which group seems to have the higher burden?
    3. Calculate a 95% confidence interval for the difference in these groups’ mean nematode burdens.
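The confidence-interval formulae $CI=\bar X\pm t_{\alpha(2),df}SE$ and $SE=s/\sqrt n$ can be applied by hand and checked against t.test(). The sketch below uses base R only. All numbers are made up for illustration and are not the liana or mouse data; the variable names (`first`, `second`, `rewilded`, `lab`) are likewise assumptions.

```r
# --- Paired design: the same plots measured in two surveys (made-up counts) ---
first  <- c(12, 20, 15, 9, 30, 18, 22, 14)
second <- c(18, 24, 21, 10, 41, 25, 26, 20)

d  <- second - first            # change within each plot
n  <- length(d)
se <- sd(d) / sqrt(n)           # SE = s / sqrt(n)
t_crit <- qt(0.975, df = n - 1) # two-sided critical value for a 95% CI

ci_manual  <- mean(d) + c(-1, 1) * t_crit * se
ci_builtin <- t.test(second, first, paired = TRUE)$conf.int

print(ci_manual)
print(ci_builtin)               # should match ci_manual

# --- Two independent groups (made-up log10 values) ---
rewilded <- c(2.9, 3.4, 3.1, 3.8, 3.6, 3.3)
lab      <- c(2.1, 2.6, 2.4, 2.0, 2.8, 2.5)

# Welch's test by default; conf.int is the 95% CI for mean(rewilded) - mean(lab)
welch <- t.test(rewilded, lab)
print(welch$conf.int)
```

The paired interval is computed on the within-plot differences, so its degrees of freedom are the number of plots minus one. For two independent groups, t.test() reports the interval for the difference in means directly, using Welch's approximation for the degrees of freedom unless `var.equal = TRUE` is set.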