Statistical tests for data on occurrence of reptiles found under different types of artifical cover objects

biostatisticscombinatoricsstatistical significance

I am studying small-bodied ground snakes and skinks (lizards). I use artifical cover objects to determine their presence.

Artificial cover objects (hereafter CO) consisting of 7 different types of materials were placed at seven different sites within my study area. At each site, seven CO were placed on each side of a mowed path along a 60m transect (10m between successive CO). At each site there were 14 total CO (two of each material type: plywood, metal, rubber mat, etc.). Each time the COs were checked at a site I recorded any reptiles present under them and how many were present (red-bellied snake, garter snake, prairie skink). I want to determine which statistical test(s) is/are most appropriate to assess if there was any difference in CO use by the snakes and skinks that used the COs. My null hypothesis is that there was no preference by or any of the three reptile groups with regard to the 7 different CO materials.

Best Answer

As a biologist I really like you experiment! So I thought I give you an idea of how this type of data could be analyzed. I am working in R so my post will be R-centric, but the general process is the same across other statistical programming software packages.

1.) First of all since you are measuring counts, you probably want to look into a Poisson GLM (there a lot of great posts here on this, for example to get started see here, there are many other links in there too).

2.) More specifically, you probably want to look into a GLMM (generalized mixed effects model) since in ecology we always have some sort of clustering, such as measurements within sites, that we need to account for. This can be done using a so called "mixed-model". I am mostly using the glmmTMB() function from the glmmTMB package

3.) With the Poisson GLM, you should also be aware of overdispersion and how to deal with it once you encounter it. A great package to learn, explore and further educate yourself is the DHARMa package in R.

4.) Then you should check whether those model terms are significant. One way to do this is to use the Anova() function from the car package.

5.) Then you need to extract the "model means" for the significant effects. There are many packages that can do that for example via the emmeans, or ggeffects packages.

6.) And lastly, you need some sort of plot that visualizes your results. I am using the ggplot() function for this from the tidyverse packages

Final words: This is just a very general approach I would take to get started. This process takes time and you will usually encounter issues along the way. Answers to many of those issues you can be found here on this site. If you don't, you can consider asking a new question.

So here is the approach from above, provided as a reproducible example. I hope that helps.

# required packages
library(tidyverse)
library(glmmTMB)
library(car)
#> Loading required package: carData
#> 
#> Attaching package: 'car'
#> The following object is masked from 'package:dplyr':
#> 
#>     recode
#> The following object is masked from 'package:purrr':
#> 
#>     some
library(DHARMa)
#> This is DHARMa 0.4.6. For overview type '?DHARMa'. For recent changes, type news(package = 'DHARMa')
library(emmeans)
library(multcomp)
#> Loading required package: mvtnorm
#> Loading required package: survival
#> Loading required package: TH.data
#> Loading required package: MASS
#> 
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> 
#> Attaching package: 'TH.data'
#> The following object is masked from 'package:MASS':
#> 
#>     geyser

# Set a seed for reproducibility
set.seed(123)

# Creating a tibble with the example dataset and simulated Poisson counts
snake_skink_data <- tibble(
  Site = rep(1:7, each = 7 * 3),
  Material = rep(rep(
    c(
      "Plywood",
      "Metal",
      "Rubber Mat",
      "Plastic",
      "Carpet",
      "Grass",
      "Shingle"
    ),
    each = 3
  ), times = 7),
  Red_bellied_Snakes = rpois(7 * 7 * 3, lambda = 1),
  Garter_Snakes = rpois(7 * 7 * 3, lambda = 1),
  Prairie_Skinks = rpois(7 * 7 * 3, lambda = 2)  # Adjusted lambda for skinks to make it different
)

# Reshape the data
snake_skink_data_long <- snake_skink_data %>%
  pivot_longer(
    cols = c(Red_bellied_Snakes, Garter_Snakes, Prairie_Skinks),
    names_to = "Species",
    values_to = "Count"
  )


# model using a generalized linear mixed-effects model with the Poisson family distribytion and a log link
m <- glmmTMB(Count ~ Species * Material + (1 | Site),
          family = poisson(link = "log"),
          data = snake_skink_data_long)

# check model fit
# specifically check whether numbers look OK (no repeated zeros, or ones, or something that just doesn't look right)
# also check whether your random effect levels are correct: see under "Conditional model" and then the "groups:" part
# in this case it makes sense since we have 7 sites
summary(m)
#>  Family: poisson  ( log )
#> Formula:          Count ~ Species * Material + (1 | Site)
#> Data: snake_skink_data_long
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>   1283.8   1373.7   -619.9   1239.8      419 
#> 
#> Random effects:
#> 
#> Conditional model:
#>  Groups Name        Variance  Std.Dev. 
#>  Site   (Intercept) 4.651e-10 2.157e-05
#> Number of obs: 441, groups:  Site, 7
#> 
#> Conditional model:
#>                                              Estimate Std. Error z value
#> (Intercept)                                  -0.27193    0.25000  -1.088
#> SpeciesPrairie_Skinks                         0.94098    0.29477   3.192
#> SpeciesRed_bellied_Snakes                     0.22314    0.33541   0.665
#> MaterialGrass                                 0.17185    0.33931   0.506
#> MaterialMetal                                 0.48551    0.31774   1.528
#> MaterialPlastic                               0.27193    0.33184   0.819
#> MaterialPlywood                              -0.20764    0.37339  -0.556
#> MaterialRubber Mat                            0.36291    0.32554   1.115
#> MaterialShingle                              -0.06454    0.35940  -0.180
#> SpeciesPrairie_Skinks:MaterialGrass          -0.14775    0.40414  -0.366
#> SpeciesRed_bellied_Snakes:MaterialGrass      -0.60263    0.49199  -1.225
#> SpeciesPrairie_Skinks:MaterialMetal          -0.41489    0.38481  -1.078
#> SpeciesRed_bellied_Snakes:MaterialMetal      -0.18540    0.43359  -0.428
#> SpeciesPrairie_Skinks:MaterialPlastic        -0.22431    0.39720  -0.565
#> SpeciesRed_bellied_Snakes:MaterialPlastic    -0.08961    0.44921  -0.199
#> SpeciesPrairie_Skinks:MaterialPlywood         0.55451    0.42551   1.303
#> SpeciesRed_bellied_Snakes:MaterialPlywood     0.38996    0.48072   0.811
#> SpeciesPrairie_Skinks:MaterialRubber Mat     -0.49296    0.39768  -1.240
#> SpeciesRed_bellied_Snakes:MaterialRubber Mat -0.41420    0.45674  -0.907
#> SpeciesPrairie_Skinks:MaterialShingle        -0.09369    0.42676  -0.220
#> SpeciesRed_bellied_Snakes:MaterialShingle     0.24686    0.46993   0.525
#>                                              Pr(>|z|)   
#> (Intercept)                                   0.27671   
#> SpeciesPrairie_Skinks                         0.00141 **
#> SpeciesRed_bellied_Snakes                     0.50587   
#> MaterialGrass                                 0.61253   
#> MaterialMetal                                 0.12652   
#> MaterialPlastic                               0.41252   
#> MaterialPlywood                               0.57815   
#> MaterialRubber Mat                            0.26495   
#> MaterialShingle                               0.85749   
#> SpeciesPrairie_Skinks:MaterialGrass           0.71467   
#> SpeciesRed_bellied_Snakes:MaterialGrass       0.22062   
#> SpeciesPrairie_Skinks:MaterialMetal           0.28096   
#> SpeciesRed_bellied_Snakes:MaterialMetal       0.66894   
#> SpeciesPrairie_Skinks:MaterialPlastic         0.57226   
#> SpeciesRed_bellied_Snakes:MaterialPlastic     0.84188   
#> SpeciesPrairie_Skinks:MaterialPlywood         0.19251   
#> SpeciesRed_bellied_Snakes:MaterialPlywood     0.41725   
#> SpeciesPrairie_Skinks:MaterialRubber Mat      0.21512   
#> SpeciesRed_bellied_Snakes:MaterialRubber Mat  0.36448   
#> SpeciesPrairie_Skinks:MaterialShingle         0.82624   
#> SpeciesRed_bellied_Snakes:MaterialShingle     0.59936   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# check residuals (those will look fine since we generated the data using Poisson counts)
# in reality this may be different
resids <- simulateResiduals(m, plot = FALSE)
plot(resids)


# check for significant effects
car::Anova(m)
#> Analysis of Deviance Table (Type II Wald chisquare tests)
#> 
#> Response: Count
#>                    Chisq Df Pr(>Chisq)    
#> Species          77.1332  2     <2e-16 ***
#> Material          7.3407  6     0.2905    
#> Species:Material 13.1689 12     0.3569    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# get model means (species only since it was the only significant term)
em <- emmeans(m, specs = ~ Species, type = "response")
#> NOTE: Results may be misleading due to involvement in interactions

diff_letters <- multcomp::cld(em, Letters = LETTERS)

tibble(diff_letters) %>%
  # trim whitespace
  mutate(.group = trimws(.group)) %>%
  #plot results
  ggplot(aes(x = Species, y = rate)) + geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = rate - SE, ymax = rate + SE),
                position = "dodge",
                width = 0.25) +
  geom_text(aes(label = .group), nudge_y = 0.2) + 
  labs(y = "Average reptile counts")

^{Created on 2024-03-02 with reprex v2.1.0}

Best Answer

Related Solutions

Solved – Statistical tests for count data with many zeros

Related Question