Solved – Recommendations for first time teacher (Intro to Biostatistics)

biostatistics teaching

I'm teaching my first class this fall (Intro to Biostatistics). Anyone have any suggestions for teaching statistics better? Perhaps some example you wish your first teacher had used? I'm using Principles of Biostatistics by Pagano and Gauvreau.

EDIT: DETAILS

This is an online class that meets twice a week for 1.5 hours. Students will listen to me lecture while viewing a PowerPoint/Beamer presentation (boring?) mixed with a little tablet/pen action (exciting?). The class covers VERY introductory statistics and is taught mostly to biomedical engineering undergraduates, plus some non-statistics graduate students (nursing, medicine, public health, etc.).

Syllabus:

1) What is biostatistics?

2) Probability

3) Diagnostic tests (e.g. sensitivity, specificity, ROC curves; included mostly because it lets us apply some of what we learned in probability, e.g. Bayes' rule)

4) Distributions

5) Sampling Distributions

6) Confidence Intervals

7) Hypothesis testing (one sample, two sample, proportions)

8) Power and Sample size calculations

9) Non-parametric methods

10) Contingency tables (chi-square test, Fisher's exact test, McNemar's test, relative risk, odds ratios)

11) Correlation
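To make item 3 concrete, here is a minimal sketch of the Bayes' rule calculation behind a diagnostic test. The sensitivity, specificity, and prevalence values are made up for illustration and do not come from any real study.

```r
# Hypothetical test characteristics (assumed values, for illustration only)
sensitivity <- 0.90   # P(test positive | disease)
specificity <- 0.95   # P(test negative | no disease)
prevalence  <- 0.01   # P(disease)

# Total probability of a positive test
p_pos <- sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' rule: P(disease | test positive), the positive predictive value
ppv <- sensitivity * prevalence / p_pos

# Same idea for P(no disease | test negative), the negative predictive value
p_neg <- specificity * (1 - prevalence) + (1 - sensitivity) * prevalence
npv   <- specificity * (1 - prevalence) / p_neg

round(c(PPV = ppv, NPV = npv), 3)
```

With a rare condition, the positive predictive value comes out far lower than the sensitivity, which is usually the point students find surprising.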

The main objective is for students to learn the core concepts of statistical inference, e.g. how do you quantify the question "Which drug is better?" and questions like it.

For the sections listed above, do you have any words of advice or caution for teaching them?

For example: I've seen and heard of many different approaches to introducing students to the Wald test vs. the score test when doing inference on proportions. If taught poorly, students easily get confused ("Why are there two of these?", "Which one do I use?", "They look the same to me."). Some teachers don't even mention these names and just say: do this for confidence intervals and this other thing for hypothesis testing. How would you approach this problem, or others like it?
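To make the Wald/score distinction concrete, here is a minimal sketch for a single proportion. The data (7 successes in 20 trials) and the null value 0.5 are hypothetical. The only difference between the two statistics is which proportion goes into the standard error: the estimate (Wald) or the null value (score).

```r
# Hypothetical data: 7 successes in 20 trials, testing H0: p = 0.5
x  <- 7
n  <- 20
p0 <- 0.5
z  <- qnorm(0.975)          # critical value for a 95% interval
p_hat <- x / n

# Wald: the standard error is evaluated at the estimate p_hat
wald_se   <- sqrt(p_hat * (1 - p_hat) / n)
wald_stat <- (p_hat - p0) / wald_se
wald_ci   <- p_hat + c(-1, 1) * z * wald_se

# Score: the standard error is evaluated at the null value p0
score_se   <- sqrt(p0 * (1 - p0) / n)
score_stat <- (p_hat - p0) / score_se

# Inverting the score test gives the Wilson interval
centre     <- (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
half_width <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
score_ci   <- centre + c(-1, 1) * half_width

# prop.test() without continuity correction uses the score approach
builtin <- prop.test(x, n, p = p0, correct = FALSE)

rbind(wald = wald_ci, score = score_ci)
c(wald = wald_stat, score = score_stat)
```

The two statistics share a numerator, so students can see that "which one do I use?" comes down to which standard error is plugged in.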

Best Answer

Here are a few examples which worked well for me when I was teaching statistics.

I like to begin the class with the martingale, because somehow everybody finds a winning strategy at roulette interesting, and it is fairly easy to grasp. Then later you can have people try it out for themselves, if you are doing computer labs and can find an online roulette simulator. [Warning: I once had a lab of students do this, and one of them ended up with a $60,000 profit. After that, it was not easy to convince them that the martingale is bad.]
A good way to illustrate faulty reasoning about independence is Munchausen syndrome by proxy. Allegedly, several people went to prison because the doctor who described this syndrome claimed in court that the deaths of children within the same family were independent events.
Everybody finds bad graphics like this one entertaining, and students often enjoy collecting them for themselves and bringing them to class.
When talking about expected value, the St. Petersburg paradox is a good one. Most people can understand it fairly quickly and it shows that the definition of expected value is tricky.
When teaching the central limit theorem, it's useful to have a wacky bimodal distribution to hand. A good one is the distribution of the last two digits of the years on the one-cent coins which the students happen to have in their pockets. I got this one from a professor at Oberlin College.
Identifying a fake series of coin flips is a good one because the students can try it out on their friends.
The British magician Derren Brown has quite a few videos which relate to probability and statistics and are also entertaining. I used to show clips of these in class sometimes.
Finally, and most importantly, use data sets from the students' fields whenever you can. It doesn't matter exactly what, but it's really important to show them data of the type that they might plausibly collect in the future. Most students don't choose to take a statistics course. Showing students how it applies to them can make a huge difference to their enjoyment. There are statistics papers on virtually everything, even poetry. Or, if you are teaching life tables, instead of using boring data, how about making one for tyrannosaurs like in these notes?
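If no online roulette simulator is handy for the martingale lab mentioned above, a few lines of R can stand in for one. This is only a sketch under assumed rules: an even-money bet on a single-zero wheel (win probability 18/37), a starting bankroll of 1000, a base stake of 1, and at most 200 spins per session. The function name play_martingale and all of its defaults are made up for illustration.

```r
# One session of the martingale: double the stake after every loss,
# return to the base stake after every win, and stop early once the
# bankroll cannot cover the next stake. Returns the session profit.
play_martingale <- function(bankroll = 1000, base_bet = 1, n_spins = 200,
                            p_win = 18 / 37) {
  start <- bankroll
  bet   <- base_bet
  for (i in seq_len(n_spins)) {
    if (bet > bankroll) break          # cannot afford the next stake
    if (runif(1) < p_win) {
      bankroll <- bankroll + bet       # win: collect the stake
      bet <- base_bet                  # and reset
    } else {
      bankroll <- bankroll - bet       # loss: pay the stake
      bet <- bet * 2                   # and double
    }
  }
  bankroll - start
}

set.seed(1)
profits <- replicate(10000, play_martingale())

mean(profits > 0)   # share of sessions that end ahead
mean(profits)       # average profit per session
min(profits)        # worst session
hist(profits, breaks = 50, main = "Martingale session profits", xlab = "profit")
```

Showing the whole histogram, rather than a single lucky session, is one way to counter the student who walks away with a large profit: many small wins sit next to a few very large losses, and the long-run average is negative because every individual bet has negative expected value.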

Related Solutions

Distributions Sampling Teaching – Effective Strategies for Teaching the Sampling Distribution Concepts

In my opinion, sampling distributions are the key idea of statistics 101. You might as well skip the course as skip that issue. However, I am very familiar with the fact that students just don't get it, seemingly no matter what you do. I have a series of strategies. These can take up a lot of time, but I recommend skipping / abbreviating other topics, so as to ensure that they get the idea of the sampling distribution. Here are some tips:

Say it distinctly: I first explicitly mention that there are 3 different distributions that we are concerned with: the population distribution, the sample distribution, and the sampling distribution. I say this over and over throughout the lesson, and then over and over throughout the course. Every time I say these terms I emphasize the distinctive ending: sam-ple, samp-ling. (Yes, students do get sick of this; they also get the concept.)
Use pictures (figures): I have a set of standard figures that I use every time I talk about this. They show the three distributions pictured distinctly, and typically labeled. (The labels that go with this figure are on the PowerPoint slide and include short descriptions, so they don't show up here, but obviously it's: population at the top, then samples, then sampling distribution.)
Give the students activities: The first time you introduce this concept, either bring in a roll of nickels (some quarters may disappear) or a bunch of 6-sided dice. Have the students form into small groups and generate a set of 10 values and average them. Then you can make a histogram on the board or with Excel.
Use animations (simulations): I write some (comically inefficient) code in R to generate data & display it in action. This part is especially helpful when you transition to explaining the Central Limit Theorem. (Notice the Sys.sleep() statements, these pauses give me a moment to explain what is going on at each stage.)

N = 10
number_of_samples = 1000


iterations  = c(3, 7, number_of_samples)  
breakpoints = seq(10, 91, 3)  
meanVect    = vector()  
x           = seq(10, 90)  
height      = 30/dnorm(50, mean=50, sd=10)  
y           = height*dnorm(x, mean=50, sd=10)  

dev.new(height=7, width=5)  # portable; windows() only exists on Windows
par(mfrow=c(3,1), omi=c(0.5,0,0,0), mai=c(0.1, 0.1, 0.2, 0.1))  

for(i in 1:iterations[3]) {  
  plot(x,y, type="l", col="blue", axes=F, xlab="", ylab="")  
  segments(x0=20, y0=0, x1=20, y1=y[11], col="lightgray")  
  segments(x0=30, y0=0, x1=30, y1=y[21], col="gray")  
  segments(x0=40, y0=0, x1=40, y1=y[31], col="darkgray")  
  segments(x0=50, y0=0, x1=50, y1=y[41])  
  segments(x0=60, y0=0, x1=60, y1=y[51], col="darkgray")  
  segments(x0=70, y0=0, x1=70, y1=y[61], col="gray")  
  segments(x0=80, y0=0, x1=80, y1=y[71], col="lightgray")  
  abline(h=0)  

  if(i==1) {  
    Sys.sleep(2)  
  }  
  sample = rnorm(N, mean=50, sd=10)  
  points(x=sample, y=rep(1,N), col="green", pch="*")  

  if(i<=iterations[1]) {  
    Sys.sleep(2)  
  }  
  xhist1 = hist(sample, breaks=breakpoints, plot=F)  
  hist(sample, breaks=breakpoints, axes=F, col="green", xlim=c(10,90),  
       ylim=c(0,N), main="", xlab="", ylab="")  
  if(i==iterations[3]) {  
    abline(v=50)  
  }  

  if(i<=iterations[2]) {  
    Sys.sleep(2)  
  }  
  sampleMean = mean(sample)  
  segments(x0=sampleMean, y0=0, x1=sampleMean,   
           y1=max(xhist1$counts)+1, col="red", lwd=3)  

  if(i<=iterations[1]) {  
    Sys.sleep(2)  
  }  
  meanVect = c(meanVect, sampleMean)  
  hist(meanVect, breaks=x, axes=F, col="red", main="",   
       xlab="", ylab="", ylim=c(0,((N/3)+(0.2*i))))  
  if(i<=iterations[2]) {  
    Sys.sleep(2)  
  }  
}  

Sys.sleep(2)  
xhist2 = hist(meanVect, breaks=x, plot=F)  
xMean  = round(mean(meanVect), digits=3)  
xSD    = round(sd(meanVect), digits=3)  
histHeight = (max(xhist2$counts)/dnorm(xMean, mean=xMean, sd=xSD))  
lines(x=x, y=(histHeight*dnorm(x, mean=xMean, sd=xSD)),   
      col="yellow", lwd=2)  
abline(v=50)  

txt1 = paste("population mean = 50     sampling distribution mean = ",  
             xMean, sep="")  
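# compute 10/sqrt(N) instead of hardcoding 3.162, so the label stays right if N changes
txt2 = paste("SD = 10     10/sqrt(", N,") = ", round(10/sqrt(N), 3),
             "     SE = ", xSD, sep="")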
mtext(txt1, side=1, outer=T)  
mtext(txt2, side=1, line=1.5, outer=T)

Reinstantiate these concepts throughout the semester: I bring the idea of the sampling distribution up again each time we talk about the next subject (albeit typically only very briefly). The most important place for this is when you teach ANOVA, as the null hypothesis case there really is the situation in which you sampled from the same population distribution several times, and your set of group means really is an empirical sampling distribution. (For an example of this, see my answer here: How does the standard error work?.)
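The ANOVA connection can be shown with a few lines of simulation. This is a sketch, not the code from the linked answer: it reuses the population from the animation (mean 50, SD 10, samples of size 10) and, purely to make the pattern visible, an unrealistically large number of groups.

```r
set.seed(42)
k     <- 2000   # number of groups (assumed; far more than a real study would have)
n     <- 10     # observations per group
mu    <- 50
sigma <- 10

# Null hypothesis case: every group is drawn from the same population.
# Each column of the matrix is one group.
groups      <- matrix(rnorm(k * n, mean = mu, sd = sigma), nrow = n)
group_means <- colMeans(groups)

# The group means form an empirical sampling distribution of the mean
sd(group_means)     # empirical standard error
sigma / sqrt(n)     # theoretical standard error

# With equal group sizes, the between-group mean square is n * var(group means)
# and the within-group mean square is the average of the group variances.
# Under the null hypothesis both estimate sigma^2, so their ratio (F) is near 1.
ms_between <- n * var(group_means)
ms_within  <- mean(apply(groups, 2, var))
f_stat     <- ms_between / ms_within
c(ms_between = ms_between, ms_within = ms_within, F = f_stat)
```

Shifting the mean of some columns before computing the group means makes the between-group mean square grow while the within-group one stays put, which is a quick way to show what a large F is picking up.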

Solved – What are hot research topics for PhD dissertation in Biostatistics

Leah Welty, Emerging Trends, 2013
Davidian, Cutting Edge: Emerging trends in biostatistics, 2012
Modern Issues and Methods in Biostatistics, Springer, 2011
