Two-way ANCOVA with slight heteroscedasticity

I am about to perform a 2-way ANCOVA but I reject the null hypothesis in Levene’s Test with a p-value of 0.023. See standard deviation and sample sizes below.

[Image: table of group standard deviations and sample sizes]

I have googled up and down and found people saying that: 1) I can still perform the ANCOVA because the smaller group ($n=32$) also has the smaller variance, and 2) I should be good to go as long as the largest variance is no more than 4 times the smallest variance, i.e. something like this:

[Image: illustration of the rule of thumb comparing the largest and smallest group variances]
Does somebody here know a citable source for these two claims? They would both save me from going crazy!
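In the meantime, as a quick check of the second rule of thumb, the variance ratio itself is easy to compute. A minimal sketch with hypothetical groups (the SDs and sample sizes below are placeholders, not my actual data):

```python
import numpy as np

# hypothetical groups; substitute your actual ANCOVA cells
rng = np.random.default_rng(0)
specs = [(1.0, 50), (1.2, 45), (0.9, 32), (1.4, 60)]  # (sd, n) placeholders
groups = [rng.normal(0, sd, size=n) for sd, n in specs]

variances = [np.var(g, ddof=1) for g in groups]
ratio = max(variances) / min(variances)
print(f"max/min variance ratio: {ratio:.2f}")  # rule of thumb: concern if > 4
```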

Thanks a lot in advance!

Calculating Within-group variance

My study looks at attitudes towards a concept across four professional groups: Physicians, Nursing, Pharmacy, and Allied Health. I want to see whether there are differences in attitudes between the groups (i.e. across the professions) as well as within the groups (amongst members of the same profession). I used a validated survey instrument comprised of 27 Likert-type items, from which I extracted three components using PCA. I created each component as a new variable in SPSS by averaging the items that comprise the component (e.g. Variable 1 = mean of questions 1 to 11; Variable 2 = mean of questions 12 to 24; etc.). To get at the between-group results, I ran ANOVA/Welch's ANOVA and the relevant post-hoc tests (Tukey's HSD, Games-Howell, etc.) to determine where there are statistically significant differences in means between the four groups, for each of the three new variables: for example, between physicians and nursing, or between pharmacy and nursing.

I now also want to determine whether there are differences in attitudes WITHIN each group. What I mean is: amongst all of the physicians, for example, is there significant variation in the responses of physician respondents on a particular component? So I am not comparing two or more groups; rather, I want to look at the variance within a single profession group. Can I even do that? I thought that interpreting the SD might give me this, but how do you determine whether an SD value is statistically significant?
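One way to make "is this SD significant?" concrete is to test the within-group variance against a benchmark value: under normality, $(n-1)s^2/\sigma_0^2$ follows a chi-square distribution with $n-1$ degrees of freedom. A minimal sketch with hypothetical physician scores (the benchmark $\sigma_0^2$ is an assumption chosen purely for illustration):

```python
import numpy as np
from scipy import stats

# hypothetical physician scores on one component (placeholder data)
rng = np.random.default_rng(1)
x = rng.normal(3.5, 0.6, size=40)

# chi-square test of H0: sigma^2 = sigma0^2 (benchmark is an assumption)
sigma0_sq = 0.25
n = len(x)
s_sq = np.var(x, ddof=1)
chi2_stat = (n - 1) * s_sq / sigma0_sq

# two-sided p-value (equal-tail)
p = 2 * min(stats.chi2.cdf(chi2_stat, n - 1), stats.chi2.sf(chi2_stat, n - 1))
```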

Test equality of binomial variances across four groups

I have four 100×1 vectors of binary outcomes of a particular experiment. I want to test for equality in variance across all of the four different treatment groups.

At the moment I have used Levene's test to do this. However, I wanted to check whether this is a reasonable thing to do when dealing with binary data.
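For what it's worth, with Bernoulli data the variance is $p(1-p)$, so equal variances correspond to equal proportions (up to the symmetry $p \leftrightarrow 1-p$). A sketch comparing Levene's test on the raw 0/1 data with a chi-square test of equal proportions, using hypothetical vectors:

```python
import numpy as np
from scipy import stats

# four hypothetical 100x1 binary outcome vectors (placeholder proportions)
rng = np.random.default_rng(2)
groups = [rng.binomial(1, p, size=100) for p in [0.3, 0.35, 0.4, 0.5]]

# Levene's test applied directly to the 0/1 data
W, p_levene = stats.levene(*groups)

# for Bernoulli data, var = p(1-p), so a chi-square test of equal
# proportions targets the same question more directly
counts = np.array([[g.sum(), len(g) - g.sum()] for g in groups])
chi2, p_prop, dof, _ = stats.chi2_contingency(counts)
```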



Homogeneity of variances – F-test and Levene's test yield different results – which one to trust?

I want to perform a t-test on two independent samples, each with n = 50. To check the assumptions, I ran an F-test for homogeneity of variances, which was significant. Then I used SPSS, which runs Levene's test by default, and this Levene's test was not significant. I don't have the data here, but with n = 50 (the data are rental prices) the data are probably normally distributed. Which test should I trust, and why?
thanks in advance for your answers!
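For reference, both tests are easy to reproduce side by side. A sketch with hypothetical rental-price samples (note that SPSS's default Levene test centers at the group mean):

```python
import numpy as np
from scipy import stats

# hypothetical rental prices for two independent samples of n = 50
rng = np.random.default_rng(3)
a = rng.normal(900, 150, size=50)
b = rng.normal(950, 200, size=50)

# F-test for equality of variances (sensitive to non-normality)
F = np.var(a, ddof=1) / np.var(b, ddof=1)
p_F = 2 * min(stats.f.cdf(F, 49, 49), stats.f.sf(F, 49, 49))

# Levene's test (more robust to departures from normality);
# center='mean' matches the SPSS default
W, p_levene = stats.levene(a, b, center='mean')
```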

What's the rationale behind the degrees of freedom in Levene's test?

I’ve been reading the Wikipedia page for Levene’s test, and it cites the degrees of freedom as (k – 1, N – k), where k is the number of different groups to which the sampled cases belong, and N is the total number of cases in all groups. However, it does not explain why this is so. There is a very thorough answer here which would suffice to answer this question in relation to the chi square goodness of fit. However, I have not been able to find a satisfactory answer to the question in relation to Levene’s test.
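The short version is that Levene's statistic is a one-way ANOVA $F$ computed on the absolute deviations $z_{ij} = |x_{ij} - \bar{x}_j|$, which is where the ANOVA degrees of freedom $(k-1, N-k)$ come from. This equivalence can be checked numerically (a sketch with simulated groups):

```python
import numpy as np
from scipy import stats

# three simulated groups with unequal variances
rng = np.random.default_rng(4)
groups = [rng.normal(0, s, size=n) for s, n in [(1.0, 20), (1.5, 25), (2.0, 30)]]

# one-way ANOVA on absolute deviations from each group's mean...
z = [np.abs(g - g.mean()) for g in groups]
F_anova, p_anova = stats.f_oneway(*z)

# ...equals Levene's test with mean centering
W, p_levene = stats.levene(*groups, center='mean')
```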

“One-tailed” Levene Test

F-tests can be two-tailed (to test that $s_1^2 \ne s_2^2$) or one-tailed (to test that $s_1^2 > s_2^2$).

How can I modify Levene/Brown-Forsythe to be “one-tailed”, that is, to test $s_1^2 > s_2^2$ instead of $s_1^2 \ne s_2^2$?

Here is a demo:


The image shows normally distributed training data (n=1000) and a model. An F-test is used to compare the variance of one point’s residuals (n=2) to the variance of all of the residuals (n=2000), so the point is an outlier if its residual variance is “too large.” The points are colored by p-value, where light points fit the model and dark points are outliers, and you can see that the two-tailed Brown-Forsythe rejects points that are too close to the model as well as too far.

Note: A different non-parametric, one-sided variance test would be fine as well.

Glen_b gave the information I needed, but I thought I would leave some implementation details (using scipy).

import numpy as np
from scipy import stats

#basic F-test (a, b are the two residual samples)
F = np.var(a, ddof=1) / np.var(b, ddof=1)
df1, df2 = len(a) - 1, len(b) - 1
Fp = stats.f.sf(F, df1, df2)

#Brown-Forsythe
BF, BFp = stats.levene(a, b, center='median')

#two-tailed t-test on transformed data
za = np.abs(a - np.median(a))
zb = np.abs(b - np.median(b))
t, tp_two_tailed = stats.ttest_ind(za, zb)

#the two-tailed t-test recapitulates the BF test
assert np.isclose(t**2, BF)
assert np.isclose(BFp, tp_two_tailed)

#one-tailed t-test p-value
df = len(za) + len(zb) - 2
tp = stats.t.sf(t, df)

[Image: scatter plots of p-values]

The figure above shows scatter plots of the p-values from the one-tailed $F$-test and the two-tailed BF test (left), and from the one-tailed $t$-tests (right). Red points are “too close” ($s_1^2 < s_2^2$).

Questions about Factorial MANOVA

I have a few questions about the factorial MANOVA below which I hope can be answered:

1) What type of follow-up tests should be done after finding significant interaction effects in a factorial MANOVA?

2) According to what I have read, Box's test can essentially be ignored if I have equal sample sizes for the different levels of my independent variable; or, if a significant result is found (violating the assumption), Pillai's Trace can be used instead. Is this correct?

3) Is it necessary to use a Bonferroni correction?

4) Even if Levene's test yields significant results for a dependent variable (violating the assumption), I have read that it is okay to go ahead with follow-up tests as long as the sample sizes are equal and the standard deviations are within 20% of each other, or the largest variance is not more than 4 times the smallest. Is this correct?

5) Lastly, if the factorial MANOVA is run and it is not appropriate to proceed because of violated assumptions, how would you report this in a paper?

Thanks and hope that at least some of these questions can be addressed!

Test for equal variability in mixed model setting

I have a setting where I would normally model the variability in measurements with a linear mixed model, which in R would look as follows.

lmer(measure ~ equipment + (1|operator) + (equipment|batchid), data=mydataset) 

So basically

  • a fixed effect of equipment
  • a random intercept per operator
  • a random effect of equipment per batch (random intercept and equipment slope for each batch)

Now, in this model I would like to define a test which checks whether the variability explained by equipment (fixed effect + random effect) differs between equipment levels. There are 2 equipment levels, and the unit on which data are collected is called a batch.

Where can I find the specification of such a test?

Assumptions of two-way ANOVA and k-fold cross validation

I want to compare 3 classifiers (kNN, SVM and CT) by using their classification accuracies on 10 folds, to highlight any differences between them.

I think this could be done with a two-way ANOVA analysis, where classifiers = factors and folds = blocks, provided some assumptions about the data hold.

Following Wikipedia, the assumptions are:

  1. The populations from which the samples are obtained must be normally distributed.

  2. Sampling is done correctly. Observations for within and between groups must be independent.

  3. The variances among populations must be equal (homoscedastic).

  4. Data are measured on an interval or ratio scale.

I need some help on how to verify these in my case.

  1. Do I have to verify that, for every classifier, its 10 accuracies are normally distributed, and/or that, for each fold, the 3 accuracies are normally distributed?

  2. Observations within groups are independent because I use a different test set for every fold. Am I wrong? Observations between groups are independent because I assume the classifiers act independently. Don't they?

  3. Do I have to verify that each group described in the first point has the same variance?

  4. No problems in my case.

Is there a quick way to verify all of the assumptions in Matlab?
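I don't use Matlab myself, but as a sketch of the corresponding checks in Python/scipy (with hypothetical fold accuracies; Matlab's Statistics Toolbox has analogous functions):

```python
import numpy as np
from scipy import stats

# hypothetical 10-fold accuracies for the three classifiers (placeholders)
rng = np.random.default_rng(5)
acc = {name: rng.normal(m, 0.03, size=10)
       for name, m in [('kNN', 0.85), ('SVM', 0.88), ('CT', 0.82)]}

# assumption 1: normality of each classifier's 10 accuracies (Shapiro-Wilk)
normality_p = {}
for name, a in acc.items():
    stat, p = stats.shapiro(a)
    normality_p[name] = p

# assumption 3: homogeneity of variances across classifiers (Levene)
W, p_levene = stats.levene(*acc.values())
```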

Does Levene's test assume separate samples?

I want to run Levene's test to test the equality of variances between a full sample and a number of sub-samples. I can't find anything about Levene's test that states whether this would violate the assumptions of the test. In other words, given the null hypothesis that $\mathrm{Var}(X_{1}) = \mathrm{Var}(X_{2})$, does Levene's test require that $X_{1} \cap X_{2} = \varnothing$?
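I don't have a citation, but note that Levene's test is built on a one-way ANOVA, which assumes independent groups. scipy will happily compute the statistic for overlapping samples, but the resulting p-value cannot be taken at face value. A minimal sketch:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
full = rng.normal(0, 1, size=200)
sub = full[:50]  # sub-sample drawn from (and overlapping) the full sample

# this runs mechanically, but the groups share observations,
# so the independence assumption behind the test is violated
W, p = stats.levene(full, sub)
```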
