Objectives ANOVA Spring 2001

EDRM612 Syllabus/Objectives

Spring Semester, 2001

Introduction

1. Identify the independent and dependent variables in a research problem.
The independent variable is the variable that is used to explain, cause, or predict the dependent variable. The dependent variable is the effect, or the variable to be predicted. In causal terms, the independent variable causes the dependent variable.

2. Know how the statistical procedures in this course (ANOVA, ANCOVA, Correlation, and Multiple Regression) are related to other statistical procedures (Chi square, Discriminant Analysis, t-test, MANOVA, and Canonical Analysis).
The statistical procedures listed in the table below are classified by the number and type of dependent and independent variables analyzed.

Analysis type	Independent	Dependent
Chi square	1 Categorical	1 Categorical
Discriminant Analysis	1+ Interval	1 Categorical
t-test	1 Categorical (2 groups)	1 Interval
ANOVA	1+ Categorical	1 Interval
ANCOVA	1+ Categorical and 1+ Interval	1 Interval
Correlation	1 Interval	1 Interval
Multiple Regression	1+ Interval	1 Interval
MANOVA	1+ Categorical	2+ Interval
Canonical Analysis	2+ Interval	2+ Interval

3. Know the type of data required to do the statistical procedures of this course (correlation, multiple regression, ANOVA, ANCOVA).
In order to do a correlation analysis you must have two variables in which the data consist of matched or paired cases. The two paired variables are usually referred to as X and Y. For correlation analysis either variable can be designated as X or Y. For multiple regression you have one dependent variable (Y) and one or more independent variables (X). Both X and Y are analyzed as if they were interval or better data (there are procedures that can transform other variables). ANOVA analyzes one interval (or better) dependent variable (the same type as multiple regression), but the independent variable is treated as if it were a categorical variable. ANCOVA combines characteristics of ANOVA and multiple regression. One interval (or better) dependent variable is used (the same as ANOVA and multiple regression), but two types of independent variables are used: one or more categorical independent variables (the same as ANOVA) and one or more interval (or better) independent variables (the same as multiple regression).

Simple Analysis of Variance

4. Know the relationship between the terms factor, level, treatment, group, and variable.
The independent variables in ANOVA are called factors. The values of the factors are called levels, treatments, or groups. The term treatment is usually reserved for an experimentally assigned group and level is sometimes reserved for variables that are quantitative in nature (even though analyzed as groups).

5. Know the main assumptions of ANOVA.
The main assumptions of ANOVA are:
    a. interval data on Y (the dependent variable)
    b. normal distribution on Y for the population from which each group was selected
    c. equal variance on Y for the population from which each group was selected
    d. observations are independent of each other (this is almost always satisfied)

6. Know how to identify violations of the assumptions of ANOVA.
Assumption a is determined by the type of variable used. The other assumptions depend on the data used. Scatterplots of the data can reveal violations of assumptions b and c. Levene's test of homogeneity of variance is also used to test assumption c.

7. Know what to do when the assumptions of ANOVA are violated.
ANOVA is fairly robust to violations of assumptions b and c. If either of these is severely violated, use a nonparametric test. Assumption a must be satisfied.

8. Know the meaning of the sums of squares in the ANOVA table in terms of deviation scores.
SS_Between = the sum of the squared deviations of the group means from the grand mean, with each group mean counted once for every case in that group.
SS_Within = the sum of the squared deviations of the individual scores from their group means.
SS_Total = the sum of the squared deviations of the individual scores from the grand mean.
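The three definitions above can be checked on a small example. The following is a minimal sketch in plain Python; the group labels and scores are hypothetical and chosen only so the arithmetic is easy to follow.

```python
# Hypothetical scores for three groups (illustrative data, not from the course).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)
group_means = {g: sum(s) / len(s) for g, s in groups.items()}

# SS_Between: squared deviation of each group mean from the grand mean,
# counted once for every case in the group.
ss_between = sum(
    len(s) * (group_means[g] - grand_mean) ** 2 for g, s in groups.items()
)

# SS_Within: squared deviation of each score from its own group mean.
ss_within = sum(
    (x - group_means[g]) ** 2 for g, s in groups.items() for x in s
)

# SS_Total: squared deviation of each score from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print(ss_between, ss_within, ss_total)  # 54.0 6.0 60.0
```

In this example SS_Between and SS_Within add up to SS_Total (54 + 6 = 60), which is the partition of variation that the ANOVA table reports.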

9. Know the relationship between explained, unexplained, and total variation and sum of squares for between, within, and total.
Explained variation = Between sum of squares.
Unexplained variation = Within sum of squares.
Total variation = Total sum of squares.

10. Know the relationship between each mean square in an ANOVA table and a variance.
Mean squares are variances: each mean square is a sum of squares divided by its degrees of freedom. For example, MS Within can be called the error variance.

11. Know the mean squares used to compute an F ratio.
F = MS Between / MS Within.
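As a rough illustration of objectives 10 and 11, the sketch below turns sums of squares into mean squares and an F ratio for a one-way design. The function name and the input values are hypothetical. The degrees of freedom used (number of groups minus 1 for Between, number of cases minus number of groups for Within) are the standard one-way values, which the text above does not spell out.

```python
def one_way_f(ss_between, ss_within, n_total, n_groups):
    """Return (MS Between, MS Within, F) for a one-way ANOVA."""
    df_between = n_groups - 1        # standard one-way df for Between
    df_within = n_total - n_groups   # standard one-way df for Within
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within  # the error variance
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical summary values: 9 cases in 3 groups.
ms_between, ms_within, f_ratio = one_way_f(54.0, 6.0, n_total=9, n_groups=3)
print(ms_between, ms_within, f_ratio)  # 27.0 1.0 27.0
```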

12. Know how to use scatterplots, boxplots, and error bar charts to evaluate the differences between group means.
Scatterplots give you the most information: they show every case in each group, but they do not indicate the mean or median of each group. Boxplots provide less individual detail but give the median and other quartiles for each group. Error bar charts indicate the mean of each group along with a confidence interval around the mean. If the confidence intervals in the error bar chart overlap substantially, the differences are unlikely to be significant; intervals that overlap only slightly can still accompany a significant difference.

13. Know the meaning of fixed and random effects.
Fixed effects are those where the groups studied are the only groups to which the results are to be applied. Random effects are those where the groups are a sample of those of interest. Most research using ANOVA deals with fixed effects. These are the only problems we will deal with in EDRM612.

14. Know the meaning of small, medium, and large effect sizes in ANOVA and how they are computed.
Effect sizes in ANOVA refer to the differences between the means. They are frequently interpreted in either of two ways: in standard deviation units (z scores) or as an eta squared value. When different analyses are compared or combined (e.g., in meta-analysis), each difference of means is converted to a z score. Eta squared summarizes the differences between all means being compared in one analysis. Conventional standards for interpreting z score effect sizes are .2 (a .2 standard deviation difference between two means) for a small effect, .5 for a medium effect, and .8 for a large effect. The corresponding eta squared guidelines are .01 for a small effect, .06 for a medium effect, and .14 for a large effect.
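The two kinds of effect size can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's procedure. Eta squared is computed with its usual definition (Between sum of squares divided by Total sum of squares), which the text does not give. The function names, the "negligible" label, and the choice to treat each cutoff as a lower bound are assumptions made for the example.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variation explained by group membership."""
    return ss_between / ss_total


def standardized_difference(mean_1, mean_2, sd):
    """Difference between two means in standard deviation units."""
    return (mean_1 - mean_2) / sd


def label_eta_squared(value):
    """Apply the .01 / .06 / .14 guidelines as lower bounds (an assumption)."""
    if value >= 0.14:
        return "large"
    if value >= 0.06:
        return "medium"
    if value >= 0.01:
        return "small"
    return "negligible"  # label chosen for this sketch only


# Hypothetical values.
print(eta_squared(12.0, 100.0))                    # 0.12
print(label_eta_squared(0.12))                     # medium
print(standardized_difference(105.0, 100.0, 10.0)) # 0.5
```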

15. Given a data set with more than two groups of one independent variable, use SPSS to test for the significance of the difference between the means and interpret the results correctly (One-way ANOVA).

16. Know when tests of multiple comparisons are appropriate and why they are needed.
When more than two means are being compared, the initial test between all means is called an omnibus test. Tests of multiple comparisons are helpful for comparing pairs or sets of these means. They are needed to compensate for the inflated alpha rate that results when many tests are conducted.

17. Know the meaning of contrast or familywise (experimentwise) alpha rates.
If there were really no differences between groups, using a contrast alpha rate of .05 for each test would result in finding a significant difference in 5% of the contrasts. Using a familywise alpha rate of .05 would result in finding one or more significant differences in only 5% of the analyses you did, regardless of the number of comparisons made in each analysis.
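The gap between the two alpha rates can be made concrete with a short calculation. The sketch below assumes the tests are independent, which pairwise comparisons on the same data are not exactly, so the result is an approximation. The Bonferroni adjustment shown is one common correction that the text above does not name; it is included only as an illustration, and the function names are hypothetical.

```python
def familywise_alpha(per_contrast_alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - per_contrast_alpha) ** n_tests


def bonferroni_alpha(familywise, n_tests):
    """Per-contrast alpha that keeps the familywise rate at or below target."""
    return familywise / n_tests


# Four groups give 6 pairwise comparisons.
print(round(familywise_alpha(0.05, 6), 4))  # 0.2649
print(round(bonferroni_alpha(0.05, 6), 4))  # 0.0083
```

With six comparisons each tested at .05, the chance of at least one false positive is roughly 26%, which is why an unadjusted contrast alpha rate is described as inflated.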

18. Know the meaning of post-hoc and a priori tests.
Post-hoc tests are done after finding a significant F in an omnibus test. Tests specified before the omnibus F test is done are called "a priori" tests.

19. Know the issues involved in determining which test of multiple comparisons is best to use.
Different tests vary in their power, in how they control familywise Type I error, in their appropriateness with unequal variances or unequal sample sizes among groups, in their sensitivity to the number of comparisons being run (all or a subset), and in whether they require the comparisons to be specified in advance.

20. Given SPSS output from a one-way ANOVA, interpret the results of a multiple comparison test.
Results are reported either as p values (significance) for every pair of means or as homogeneous subsets of means that are not significantly different from each other.

Factorial Analysis of Variance

21. Know the meaning of the term "factorial design".
A factorial design is a design that includes two or more factors. Factorial designs are frequently referred to by the number of factors, such as a two-way design, three-way design, etc. They are also referred to by the number of categories in each factor, such as 2x4 or 3x2x5 designs.

22. Know the meaning of row, column, layer, simple, and main effects.
The factors in a two-way design are usually called rows and columns and in a three-way design they are called rows, columns, and layers. The term "effect" is a general term referring to the difference in means between rows, between columns, and between layers in a factorial design. The row effect is the difference between the row means. If the difference between the means is large you would have a large row effect. Similarly you might have a large column or layer effect in a three-way design. Row, column, and layer effects together are also called "Main" effects. Simple effects are effects of each independent variable at only one level of the other independent variables. You might have a row effect (difference between rows) at column one. In contrast, other components of the ANOVA model covered later (interaction and covariate effects) might be of "Secondary" importance.

23. Know the meaning of two-way and three-way interaction.
A two-way interaction is when the row effect is not consistent (not the same) over the columns and the column effect is not consistent over the rows. For example, if the overall difference between two rows is 10 points, interaction would occur if the difference between the rows was not 10 points within each of the columns.

A three-way interaction is when the two-way interaction is not consistent over the third factor. For example, the overall two-way interaction may be that the row effect is three times as large in column one as in column two overall, but it is not consistent for each of the layers of the third variable. The difference in row effect between the two columns may be four times as much at one layer but only two times as much in the other layer.

24. Know the meaning of ordinal and disordinal interaction.
Ordinal interaction occurs in a two-way interaction when the order of the categories from highest to lowest is maintained across all levels of the other factor even though the differences between the categories are not consistent. Disordinal interaction occurs when the inconsistency results in a different ordering of the categories at different levels of the other factor. Interaction is expressed graphically as non-parallel lines. The lines cross in disordinal interaction but do not cross in ordinal interaction.
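The distinction can be sketched as a check on a table of cell means. The function below is a hypothetical helper, not an SPSS feature. It describes the pattern in the sample means only and says nothing about statistical significance. It compares the ordering of the rows within each column, so the answer can differ if the roles of rows and columns are swapped, and ties between cell means are not handled specially.

```python
def classify_interaction(cell_means, tol=1e-9):
    """Classify the pattern in cell_means[row][column].

    Returns "none" when the row differences are the same in every column
    (parallel lines), "ordinal" when the differences vary but the rows keep
    the same rank order in every column, and "disordinal" otherwise.
    """
    n_rows = len(cell_means)
    n_cols = len(cell_means[0])

    # Parallel lines: each row's distance from row 0 is constant over columns.
    parallel = all(
        abs(
            (cell_means[r][c] - cell_means[0][c])
            - (cell_means[r][0] - cell_means[0][0])
        ) <= tol
        for r in range(n_rows)
        for c in range(n_cols)
    )
    if parallel:
        return "none"

    # Rank order of the rows (lowest to highest) within each column.
    orderings = {
        tuple(sorted(range(n_rows), key=lambda r: cell_means[r][c]))
        for c in range(n_cols)
    }
    return "ordinal" if len(orderings) == 1 else "disordinal"


# Hypothetical 2x2 tables of cell means.
print(classify_interaction([[10.0, 20.0], [5.0, 15.0]]))  # none
print(classify_interaction([[10.0, 20.0], [5.0, 8.0]]))   # ordinal
print(classify_interaction([[10.0, 5.0], [5.0, 10.0]]))   # disordinal
```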

25. Know how to use clustered bar charts, clustered boxplots, and multiple line charts to determine interaction.
Interaction effects in clustered bar charts and clustered boxplots are indicated by differences between the bars (bar chart) or medians (boxplot) that vary from cluster to cluster. Interaction effects in a multiple line chart are indicated by non-parallel lines.

26. Know the components of a two-way ANOVA table and the meaning of sums of squares for rows, columns, and interaction in terms of cell, marginal, and expected means.
SS_Row = the weighted sum of the squared differences between the row means and the grand mean.

SS_Column = the weighted sum of the squared differences between the column means and the grand mean.

SS_Interaction = the weighted sum of the squared differences between the cell means and the expected cell means.

The expected cell means are those that would exist if each dimension (row, column, layer, etc.) were consistent across the other dimensions.
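A small worked example may help with these definitions. The sketch below assumes a two-way design with the same number of cases in every cell, so each weight is simply the number of cases a mean is based on. It also assumes the usual additive form for the expected cell mean (row mean + column mean - grand mean), which the text describes in words but does not write out. The cell means and cell size are hypothetical.

```python
# Hypothetical 2x2 design with equal cell sizes (illustrative values only).
cell_means = [[10.0, 20.0], [5.0, 8.0]]  # cell_means[row][column]
n_per_cell = 5

n_rows = len(cell_means)
n_cols = len(cell_means[0])

# Marginal means and grand mean (simple averages because cells are equal-sized).
row_means = [sum(row) / n_cols for row in cell_means]
col_means = [
    sum(cell_means[r][c] for r in range(n_rows)) / n_rows for c in range(n_cols)
]
grand_mean = sum(row_means) / n_rows

# Expected cell means if there were no interaction (additive assumption).
expected = [
    [row_means[r] + col_means[c] - grand_mean for c in range(n_cols)]
    for r in range(n_rows)
]

# Each marginal mean is weighted by the number of cases it is based on.
ss_row = n_per_cell * n_cols * sum((m - grand_mean) ** 2 for m in row_means)
ss_col = n_per_cell * n_rows * sum((m - grand_mean) ** 2 for m in col_means)
ss_interaction = n_per_cell * sum(
    (cell_means[r][c] - expected[r][c]) ** 2
    for r in range(n_rows)
    for c in range(n_cols)
)

print(ss_row, ss_col, ss_interaction)  # 361.25 211.25 61.25
```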

27. Know the advantage of factorial designs over separate one-way designs.
Row and column sums of squares (and mean squares) would be identical in a factorial design and separate one-way designs with the same data (in the most common type of analysis). The error term, however, will usually be smaller in a factorial design, therefore resulting in a larger F. The factorial design also allows you to study the interaction of the separate effects. This is not possible in separate one-way designs.

28. Given a data set, be able to complete an analysis of a two-way ANOVA using SPSS and interpret the results.

29. Given a printout of a two-way ANOVA from SPSS, be able to interpret the main effect and interaction results.
Means are given in two places: a Descriptive Statistics table gives the actual or weighted means. An Estimated Marginal Means table gives unweighted means. The unweighted means are an estimate of what the means would be if the groups had been equal in size. It is important to select the appropriate means for your interpretation.
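The difference between the two kinds of means shows up only when cell sizes are unequal. The following sketch uses hypothetical cell means and cell sizes for a 2x2 design and computes the row means both ways. It illustrates the idea only and is not a reproduction of how SPSS computes its tables.

```python
# Hypothetical 2x2 design with unequal cell sizes (illustrative values only).
cell_means = [[10.0, 20.0], [5.0, 8.0]]  # cell_means[row][column]
cell_ns = [[2, 8], [5, 5]]               # cases per cell


def weighted_row_mean(means, ns):
    """Row mean with each cell weighted by its number of cases."""
    return sum(m * n for m, n in zip(means, ns)) / sum(ns)


def unweighted_row_mean(means):
    """Row mean treating every cell as if it had the same number of cases."""
    return sum(means) / len(means)


weighted = [weighted_row_mean(m, n) for m, n in zip(cell_means, cell_ns)]
unweighted = [unweighted_row_mean(m) for m in cell_means]

print(weighted)    # [18.0, 6.5]
print(unweighted)  # [15.0, 6.5]
```

The first row has unequal cell sizes, so its weighted mean (18.0) is pulled toward the larger cell, while the second row, with equal cell sizes, has the same mean either way.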

30. Be able to complete an ANOVA table given summary data (number of subjects, sum of squares, and number of levels for each factor).
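One way to practice this objective is to fill in the table programmatically and compare it with a hand calculation. The sketch below assumes a two-way fixed-effects design in which every F ratio uses MS Within as the error term, and it uses the standard degrees of freedom for such a design. The function name and the summary values are hypothetical.

```python
def two_way_anova_table(ss_row, ss_col, ss_inter, ss_within,
                        n_subjects, row_levels, col_levels):
    """Build a two-way ANOVA table (fixed effects) from summary data."""
    ss = {"Row": ss_row, "Column": ss_col,
          "Interaction": ss_inter, "Within": ss_within}
    df = {
        "Row": row_levels - 1,
        "Column": col_levels - 1,
        "Interaction": (row_levels - 1) * (col_levels - 1),
        "Within": n_subjects - row_levels * col_levels,
    }
    ms_within = ss["Within"] / df["Within"]

    table = {}
    for source in ("Row", "Column", "Interaction", "Within"):
        ms = ss[source] / df[source]
        # F is computed only for the effects, never for the error term.
        f = None if source == "Within" else ms / ms_within
        table[source] = {"SS": ss[source], "df": df[source], "MS": ms, "F": f}

    table["Total"] = {"SS": sum(ss.values()), "df": n_subjects - 1,
                      "MS": None, "F": None}
    return table


# Hypothetical summary data: 20 subjects in a 2x2 design.
table = two_way_anova_table(40.0, 90.0, 20.0, 160.0,
                            n_subjects=20, row_levels=2, col_levels=2)
for source, row in table.items():
    print(source, row)
```

Here MS Within is 160 / 16 = 10, so the F ratios for rows, columns, and interaction are 4, 9, and 2.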

31. Know the meaning of and how to interpret ANOVA results using Type I, II, and III Sum of Squares.
Type I SS adjusts each effect for those listed before it in the list of effects. It would be used if a hierarchy of cause-and-effect factors is being hypothesized. Type II SS adjusts the main effects for each other (not the interaction) and the interaction effect for both main effects. It is equivalent to the regression approach. Type III SS evaluates differences between the unweighted means. Type III is the default in SPSS and the most commonly used. It assumes that differences in sample sizes are the result of random events, as would normally occur in an experiment in which the experimental and control groups had equal n's at the beginning of the experiment. Type IV, which is useful with missing cells (cells with no subjects), will not be covered in this class.

32. Know the meaning of tests of simple main effects and when to conduct them and pairwise comparisons in a factorial design.
Tests of simple main effects are one-way F tests for individual levels of one or more of the independent variables using the overall error term. Pairwise multiple comparisons can be done on either the marginal means or the cell means, as in a one-way design. Tests of simple main effects are conducted after a significant interaction has been found. In many cases the interaction can be interpreted satisfactorily by inspecting the means, without using simple main effects tests. The SPSS procedures for simple main effects and pairwise comparisons in factorial designs require complicated LMATRIX syntax commands. This is necessary to ensure that the correct error term is used. An alternative procedure would be to run a one-way F test in SPSS but compute the F ratios by hand using the overall error term from the two-way analysis.

33. Know how a repeated measures design differs from one-way and factorial ANOVA designs.
In a repeated measures design, the groups being compared are composed of the same subjects. Because of this, the error term for the difference between the means needs to be computed differently to account for the similarity that is expected when measuring the same persons. In effect, the difference between subjects across all variables (groups) is removed from the error term which results in a more powerful test. Many designs include both a between-subjects factor and a within-subjects (repeated measures) factor.
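The effect of removing between-subjects differences from the error term can be seen in a small calculation. The sketch below uses hypothetical scores for three subjects measured under three conditions and assumes a single within-subjects factor. It computes the F ratio twice: once with subject differences removed from the error term, and once treating the same scores as if they came from independent groups.

```python
# Hypothetical scores: one row per subject, one column per condition.
scores = [
    [2.0, 4.0, 6.0],
    [5.0, 6.0, 10.0],
    [8.0, 11.0, 11.0],
]
n_subjects = len(scores)
n_conditions = len(scores[0])

all_scores = [x for row in scores for x in row]
grand_mean = sum(all_scores) / len(all_scores)
subject_means = [sum(row) / n_conditions for row in scores]
condition_means = [
    sum(scores[s][c] for s in range(n_subjects)) / n_subjects
    for c in range(n_conditions)
]

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
ss_conditions = n_subjects * sum((m - grand_mean) ** 2 for m in condition_means)
ss_subjects = n_conditions * sum((m - grand_mean) ** 2 for m in subject_means)

# Repeated measures: subject differences are removed from the error term.
ss_error = ss_total - ss_conditions - ss_subjects
df_conditions = n_conditions - 1
df_error = (n_subjects - 1) * (n_conditions - 1)
f_repeated = (ss_conditions / df_conditions) / (ss_error / df_error)

# Same scores treated as independent groups: subject differences stay in error.
ss_within = ss_total - ss_conditions
df_within = n_subjects * n_conditions - n_conditions
f_independent = (ss_conditions / df_conditions) / (ss_within / df_within)

print(ss_subjects, ss_error)                    # 54.0 4.0
print(f_repeated, round(f_independent, 3))      # 12.0 1.241
```

In this example most of the unexplained variation is due to stable differences between subjects, so the repeated measures F ratio is much larger than the independent-groups F ratio for the same condition means.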