Solved – JAGS Multinomial mixture model with missing data

jagsmultinomial-distribution

I am trying to fit a multinomial mixture model to data from a stream depletion survey. The data were collected by selecting a stream site that is a standard length (usually 150-200m depending on width), blocking the upper and lower end of the site off with a block net to assume closure (i.e., no fish are immigrating or emigrating during the survey), electrofish the entire length and collect all fish over multiple passes. At the end of each pass the number of fish are recorded and the fish are released outside of the sampling area. In my data set the number of passes range from 2 to 6 and I have up to five sites sampled every two years for a total of eight years. I have the response (number of fish collected) in a three way matrix (year x site x pass) which is modeled using a multinomial distribution. Below is a some code to create the type of data I am modeling.

fish_fakeData<-c(4,8,9,NA,7,16,1,4,10,NA,3,5,NA,NA,5,NA,0,2)
dim(fish_fakeData)<-c(3,2,3) 

The error I am getting would the thrown for element 1,1,3 where there is an NA for the last pass but values for the first two passes. The NA indicates that site did not have three passes for that particular year.

The basic model I am fitting in JAGS is

model{
for (l in 1:speciesN){
  a[l]~dnorm(0,0.01)I(-10,10)
  for (k in 1:siteN){
   beta0[l,k] ~ dunif(0,100)
  }
}

for (i in 1:obs){

logit(p[sp[i],site[i]])<-a[sp[i]]

mu[sp[i],site[i],1]<- p[sp[i],site[i]]
mu[sp[i],site[i],2]<- p[sp[i],site[i]]*(1-p[sp[i],site[i]])
mu[sp[i],site[i],3]<- p[sp[i],site[i]]*(1-p[sp[i],site[i]])*(1-p[sp[i],site[i]])

mu_1[sp[i],site[i]]<-mu[sp[i],site[i],1]
mu_2[sp[i],site[i]]<-mu[sp[i],site[i],2]
mu_3[sp[i],site[i]]<-mu[sp[i],site[i],3]

pi0[sp[i],site[i]]<- 1 - mu[sp[i],site[i],1]-mu[sp[i],site[i],2]-mu[sp[i],site[i],3]
pcap[sp[i],site[i]]<-1-pi0[sp[i],site[i]]

for(j in 1:3){
   muc[sp[i],site[i],j]<-mu[sp[i],site[i],j]/(pcap[sp[i],site[i]]+0.0000001) #add a small offset to prevent division by 0
}

#observed counts model
ncap[year[i],site[i]]~dbin(pcap[year[i],site[i]],N[year[i],site[i]])

#Abundance model
N[year[i],site[i]] ~ dpois(lambda[year[i],site[i]])
log(lambda[year[i],site[i]])<- beta0[year[i],site[i]]

y[year[i],site[i],1:3]~dmulti(muc[year[i],site[i],1:3],ncap[year[i],site[i]])

}
}

When I try to fit the model with the missing data, more specifically sites having NA's on pass 3, I get the error:
Error in jags.model("model.txt", data = dataList, inits = initsList, n.chains = nChains, :
RUNTIME ERROR:
Compilation error on line 57.
y[1,1,1:3] has missing values

I know this is referencing the NA's in the response but I thought JAGS handled missing response data automatically? Or is this not the case with the multinomial distribution? If there cannot be missing data in a multinomial distribution are there any suggestions to get around it without dropping sites or years to have a balanced design?

Thanks!

EDIT

I think I found a work-around. Rather than making the number of passes static in the line

y[year[i],site[i],1:3]~dmulti(muc[year[i],site[i],1:3],ncap[year[i],site[i]])

I added a new variable for number of passes at each year/site combination so that

[year[i],site[i],1:npass[i]]~dmulti(muc[year[i],site[i],1:npass[i]],ncap[year[i],site[i]])

Does this fix make sense? I know I also need to modify the code so that mu values that are not needed are not estimated (e.g., mu_3 for site 1 in the data example).

Best Answer

You can't use missing data with the multinomial distribution

See e.g. here from the Patuxent folk for some relevant coding work-arounds

http://www.mbr-pwrc.usgs.gov/workshops/unmarked/Slides/Slides_Multimix.pdf

Related Question