Hands-on_Ex04Chpt10

Author

Lin Lin

Visualizing the uncertainty of point estimates

  • A point estimate is a single number, such as a mean.

  • Uncertainty is expressed as a standard error, confidence interval, or credible interval.

  • Important: don’t confuse the uncertainty of a point estimate with the variation in the sample.
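The difference between sample variation and the uncertainty of a point estimate can be seen in a small simulation. The sketch below is illustrative only: the population (mean 70, standard deviation 15), the sample sizes, and the seed are all assumptions and are not taken from the exam data. It uses base R so that it runs without extra packages.

```r
set.seed(1234)  # arbitrary seed, used only to make the run reproducible

# Hypothetical sample sizes drawn from an assumed population of scores
sample_sizes <- c(10, 100, 1000, 10000)

summary_by_n <- do.call(rbind, lapply(sample_sizes, function(n) {
  scores <- rnorm(n, mean = 70, sd = 15)  # assumed population parameters
  data.frame(
    n    = n,
    mean = mean(scores),
    sd   = sd(scores),            # variation in the sample
    se   = sd(scores) / sqrt(n)   # uncertainty of the mean (point estimate)
  )
}))

print(summary_by_n)
```

As the sample grows, the sd column should stay close to the population value while the se column shrinks: more data does not make individual scores less variable, it only makes the estimate of the mean more precise.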

    #| warning: false
pacman::p_load(tidyverse, plotly, crosstalk, DT, ggdist, gganimate)
exam <- read_csv("data/Exam_data.csv")
Rows: 322 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): ID, CLASS, GENDER, RACE
dbl (3): ENGLISH, MATHS, SCIENCE

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Visualizing the uncertainty of point estimates: ggplot2 methods

The code chunk below:

  • groups the observations by RACE,

  • computes the count of observations, mean, standard deviation and standard error of the MATHS score by RACE, and

  • saves the output as a tibble data table called my_sum.

    #| warning: false
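  my_sum <- exam %>%
    group_by(RACE) %>%
    summarise(
      n = n(),             # number of students in each race group
      mean = mean(MATHS),  # mean maths score
      sd = sd(MATHS)       # sample standard deviation
      ) %>%
    # standard error as computed in this exercise; the usual formula is sd/sqrt(n)
    mutate(se = sd/sqrt(n-1))

  knitr::kable(head(my_sum), format = 'html')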
RACE       n      mean        sd        se
Chinese  193  76.50777  15.69040  1.132357
Indian    12  60.66667  23.35237  7.041005
Malay    108  57.44444  21.13478  2.043177
Others     9  69.66667  10.72381  3.791438
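The se column above can be checked by hand. The sketch below copies the n and sd values from the table and compares two denominators: sqrt(n - 1), which the code chunk uses, and sqrt(n), which is the conventional formula for the standard error of the mean. The column names se_n and se_n_minus_1 are hypothetical and introduced only for this comparison.

```r
# Values of n and sd copied from the my_sum table above
check <- data.frame(
  RACE = c("Chinese", "Indian", "Malay", "Others"),
  n    = c(193, 12, 108, 9),
  sd   = c(15.69040, 23.35237, 21.13478, 10.72381)
)

# Denominator used in the exercise
check$se_n_minus_1 <- check$sd / sqrt(check$n - 1)

# Conventional standard error of the mean
check$se_n <- check$sd / sqrt(check$n)

print(check)
```

The se_n_minus_1 column should reproduce the se values in the table. The two versions are nearly identical for the large Chinese and Malay groups, and differ most for the small Indian and Others groups.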


The code chunk below plots the standard error of the mean maths score by race.

  ggplot(my_sum) +
  geom_errorbar(
    aes(x=RACE,
        ymin=mean-se,
        ymax=mean+se),
    width=0.2,
    colour="black",
    alpha=0.9,
    linewidth=0.5) +
  geom_point(
    aes(x=RACE,
        y=mean),
    stat="identity",
    color="red",
    size = 1.5,
    alpha=1) +
  ggtitle("Standard error of mean maths score by race")
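The same plot can be adapted to show a confidence interval instead of one standard error. The following is a minimal sketch, not part of the original exercise: it assumes a 95% t-based interval with n - 1 degrees of freedom, reuses the summary values from the my_sum table (including its se column as computed there), and introduces the hypothetical column names t_crit, lower and upper.

```r
library(ggplot2)

# Summary values copied from the my_sum table shown earlier
my_sum <- data.frame(
  RACE = c("Chinese", "Indian", "Malay", "Others"),
  n    = c(193, 12, 108, 9),
  mean = c(76.50777, 60.66667, 57.44444, 69.66667),
  se   = c(1.132357, 7.041005, 2.043177, 3.791438)
)

# Assumption: 95% interval from the t distribution with n - 1 degrees of freedom
my_sum$t_crit <- qt(0.975, df = my_sum$n - 1)
my_sum$lower  <- my_sum$mean - my_sum$t_crit * my_sum$se
my_sum$upper  <- my_sum$mean + my_sum$t_crit * my_sum$se

p <- ggplot(my_sum, aes(x = RACE)) +
  geom_errorbar(
    aes(ymin = lower, ymax = upper),
    width = 0.2,
    colour = "black",
    linewidth = 0.5) +   # linewidth replaces size for lines since ggplot2 3.4.0
  geom_point(
    aes(y = mean),
    colour = "red",
    size = 1.5) +
  ggtitle("95% confidence interval of mean maths score by race")

p
```

Because the t critical value grows as the group gets smaller, the intervals for the Indian and Others groups widen more than their standard errors alone would suggest.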