Some of you might have read this nice paper:
O’Hara RB, Kotze DJ (2010) Do not log-transform count data. Methods in Ecology and Evolution 1:118–122.
Currently I am comparing negative binomial models with Gaussian models on transformed data. Unlike O’Hara and Kotze (2010), I’m looking at the special case of small sample sizes, and in a hypothesis-testing context.
I used simulations to investigate the differences between the two.
Type I Error simulations
All computations were done in R.
I simulated data from a factorial design with one control group ($μ_c$) and 5 treatment groups ($μ_1, \dots, μ_5$). Abundances were drawn from a negative binomial distribution with a fixed dispersion parameter ($θ = 3.91$). Mean abundances were equal in all groups, i.e. the null hypothesis was true.
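A minimal sketch of this data-generating step, using only base R. The helper name simulate_counts() and the group labels are made up for illustration; only the design (one control, five treatments, equal means, θ = 3.91) comes from the description above.

```r
# Hypothetical helper: simulate one dataset under the null hypothesis.
simulate_counts <- function(n, mu, theta = 3.91) {
  # One control plus five treatment groups, n replicates each
  levels_g <- c("control", paste0("T", 1:5))
  group <- factor(rep(levels_g, each = n), levels = levels_g)
  # rnbinom() in its (mu, size) parametrisation: Var(y) = mu + mu^2 / size
  y <- rnbinom(length(group), mu = mu, size = theta)
  data.frame(group = group, y = y)
}

set.seed(1)
dat <- simulate_counts(n = 3, mu = 8)
table(dat$group)
```

Because every group shares the same mean, any rejection of the null model on such data is a Type I error.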
For the simulations I varied the sample size (3, 6, 9, 12) and the abundances (2, 4, 8, …, 1024). 100 datasets were generated and analysed using a negative binomial GLM (MASS::glm.nb()), a quasi-Poisson GLM (glm(..., family = 'quasipoisson')) and a Gaussian linear model on log-transformed data (lm(...)).
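For concreteness, the three fits could look roughly like this on a single simulated dataset. This is a sketch, not the original code: the seed, the group labels, and in particular the log(y + 1) transformation are assumptions (the exact transformation is not stated above).

```r
library(MASS)  # provides glm.nb()

set.seed(42)
levels_g <- c("control", paste0("T", 1:5))
dat <- data.frame(
  group = factor(rep(levels_g, each = 12), levels = levels_g),
  y     = rnbinom(6 * 12, mu = 64, size = 3.91)
)

# Negative binomial GLM (theta is estimated from the data)
fit_nb <- glm.nb(y ~ group, data = dat)
# Quasi-Poisson GLM (dispersion estimated from the Pearson residuals)
fit_qp <- glm(y ~ group, data = dat, family = "quasipoisson")
# Gaussian model on transformed counts; log(y + 1) is an assumption
fit_lm <- lm(log(y + 1) ~ group, data = dat)

# Each model has one intercept plus five treatment contrasts
sapply(list(nb = fit_nb, qp = fit_qp, lm = fit_lm),
       function(m) length(coef(m)))
```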
I compared each model with its null model using a likelihood ratio (LR) test (lmtest::lrtest()) for the Gaussian model and the neg. bin. GLM, as well as F-tests (anova(..., test = 'F')) for the Gaussian model and the quasi-Poisson GLM.
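The four tests can be sketched as follows. To keep the example free of extra packages, the LR test is computed by hand from logLik() rather than with lmtest::lrtest(); lr_pvalue() is a hypothetical helper, and log(y + 1) is again an assumed transformation.

```r
library(MASS)  # provides glm.nb()

set.seed(42)
levels_g <- c("control", paste0("T", 1:5))
dat <- data.frame(
  group = factor(rep(levels_g, each = 12), levels = levels_g),
  y     = rnbinom(6 * 12, mu = 64, size = 3.91)
)

# Hypothetical helper: LR test of a full model against a nested null model
lr_pvalue <- function(full, null) {
  ll_full <- logLik(full)
  ll_null <- logLik(null)
  stat <- 2 * (as.numeric(ll_full) - as.numeric(ll_null))
  df   <- attr(ll_full, "df") - attr(ll_null, "df")
  pchisq(stat, df = df, lower.tail = FALSE)
}

nb_full <- glm.nb(y ~ group, data = dat)
nb_null <- glm.nb(y ~ 1, data = dat)
lm_full <- lm(log(y + 1) ~ group, data = dat)  # transformation is an assumption
lm_null <- lm(log(y + 1) ~ 1, data = dat)
qp_full <- glm(y ~ group, data = dat, family = "quasipoisson")
qp_null <- glm(y ~ 1, data = dat, family = "quasipoisson")

p <- c(
  lr_nb = lr_pvalue(nb_full, nb_null),                       # LR test, neg. bin. GLM
  lr_lm = lr_pvalue(lm_full, lm_null),                       # LR test, Gaussian model
  f_lm  = anova(lm_null, lm_full)[2, "Pr(>F)"],              # F-test, Gaussian model
  f_qp  = anova(qp_null, qp_full, test = "F")[2, "Pr(>F)"]   # F-test, quasi-Poisson GLM
)
round(p, 3)
```

Both LR tests refer the statistic to a chi-squared distribution with 5 degrees of freedom, while the F-tests also account for the residual degrees of freedom.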
If needed I can provide the R code, but see also here for a related question of mine.
Results
For small sample sizes, the LR tests (green: neg. bin.; red: Gaussian) lead to an inflated Type I error rate. The F-tests (blue: Gaussian; purple: quasi-Poisson) seem to work even for small sample sizes.
The LR tests give similar (inflated) Type I error rates for both the LM and the GLM.
Interestingly, the quasi-Poisson GLM works quite well (though it, too, is tested with an F-test).
As expected, the LR test also performs well as the sample size increases (it is asymptotically correct).
For the smallest sample sizes there were some convergence problems for the GLM (not shown), but only at low abundances, so this source of error can be neglected.
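The rejection rates themselves can be estimated along the following lines. This sketch is restricted to the Gaussian model, which needs only base R and has no convergence issues; type1_rate() is a hypothetical helper, and both the 5% significance level and the log(y + 1) transformation are assumptions not stated above.

```r
# Hypothetical helper: share of null datasets rejected by the LR and F-test
type1_rate <- function(n, mu, n_sim = 100, theta = 3.91, alpha = 0.05) {
  levels_g <- c("control", paste0("T", 1:5))
  group <- factor(rep(levels_g, each = n), levels = levels_g)
  pvals <- replicate(n_sim, {
    # Equal means in all groups, so every rejection is a Type I error
    y <- log(rnbinom(length(group), mu = mu, size = theta) + 1)
    full <- lm(y ~ group)
    null <- lm(y ~ 1)
    lr <- 2 * (as.numeric(logLik(full)) - as.numeric(logLik(null)))
    c(lr = pchisq(lr, df = 5, lower.tail = FALSE),
      f  = anova(null, full)[2, "Pr(>F)"])
  })
  rowMeans(pvals < alpha)
}

set.seed(123)
sizes <- c(3, 6, 9, 12)
rates <- sapply(sizes, type1_rate, mu = 64)
colnames(rates) <- paste0("n=", sizes)
rates
```

With only 100 datasets per setting, each estimated rate has a Monte Carlo standard error of roughly 0.02 near the nominal level, so small differences between tests should not be over-interpreted.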
Questions

Note that the data were generated from a neg. bin. model, so I would have expected the neg. bin. GLM to perform best.
However, in this case a linear model on transformed abundances performs better. The same holds for the quasi-Poisson GLM (F-test). I suspect this is because the F-test does better with small sample sizes. Is this correct, and if so, why?

The LR test does not perform well because it relies on asymptotics. Are there possibilities for improvement?

Are there other tests for GLMs which may perform better? How can I improve testing for GLMs?

What type of models for count data with small sample sizes should be used?
Edit:
Interestingly, the LR test for a binomial GLM does work pretty well:
Here I drew data from a binomial distribution, with a setup similar to the one above.
Red: Gaussian model (LR test + arcsine transformation), Ocher: binomial GLM (LR test), Green: Gaussian model (F-test + arcsine transformation), Blue: quasi-binomial GLM (F-test), Purple: nonparametric.
Here only the Gaussian model (LR test + arcsine transformation) shows an inflated Type I error rate, whereas the binomial GLM (LR test) does pretty well in terms of Type I error. So there also seems to be a difference between distributions (or maybe between glm and glm.nb?).
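A rough sketch of the binomial variant for a single dataset, in base R. The number of trials (10), the success probability (0.5), the sample size and the arcsine square-root form of the transformation are all assumptions chosen for illustration; the nonparametric test is left out.

```r
set.seed(7)
n      <- 6     # replicates per group (assumption)
trials <- 10    # binomial trials per observation (assumption)
prob   <- 0.5   # common success probability under H0 (assumption)

levels_g <- c("control", paste0("T", 1:5))
group <- factor(rep(levels_g, each = n), levels = levels_g)
succ  <- rbinom(length(group), size = trials, prob = prob)
dat   <- data.frame(group = group, succ = succ, fail = trials - succ)

# Binomial GLM, LR (deviance) test against the null model
bin_full <- glm(cbind(succ, fail) ~ group, data = dat, family = binomial)
bin_null <- glm(cbind(succ, fail) ~ 1, data = dat, family = binomial)
p_lr_bin <- anova(bin_null, bin_full, test = "Chisq")[2, "Pr(>Chi)"]

# Quasi-binomial GLM, F-test
qb_full <- glm(cbind(succ, fail) ~ group, data = dat, family = quasibinomial)
qb_null <- glm(cbind(succ, fail) ~ 1, data = dat, family = quasibinomial)
p_f_qb <- anova(qb_null, qb_full, test = "F")[2, "Pr(>F)"]

# Gaussian model on arcsine square-root transformed proportions, F-test
dat$asin_p <- asin(sqrt(dat$succ / trials))
lm_full <- lm(asin_p ~ group, data = dat)
lm_null <- lm(asin_p ~ 1, data = dat)
p_f_lm <- anova(lm_null, lm_full)[2, "Pr(>F)"]

c(lr_binomial = p_lr_bin, f_quasibinomial = p_f_qb, f_gaussian = p_f_lm)
```

One structural difference from the count case: the binomial GLM has no extra dispersion parameter to estimate, whereas glm.nb() has to estimate θ from the same small sample.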