Spring Semester, 2001
Introduction
1. Identify the independent and dependent variables
in a research problem.
The independent variable
is the variable that is used to explain, cause, or predict the dependent
variable. The dependent variable is the effect, or the variable to be predicted.
In causal terms, the independent variable causes the dependent variable.
2. Know how the
statistical procedures in this course (ANOVA, ANCOVA, Correlation, and
Multiple Regression) are related to other statistical procedures (Chi square,
Discriminant Analysis, t-test, MANOVA, and Canonical Analysis).
The statistical procedures
listed in the table below are classified by the number and type of dependent
and independent variables analyzed.
Analysis type | Independent | Dependent |
Chi square | 1 Categorical | 1 Categorical |
Discriminant Analysis | 1+ Interval | 1 Categorical |
t-test | 1 Categorical (2 groups) | 1 Interval |
ANOVA | 1+ Categorical | 1 Interval |
ANCOVA | 1+ Categorical and 1+ Interval | 1 Interval |
Correlation | 1 Interval | 1 Interval |
Multiple Regression | 1+ Interval | 1 Interval |
MANOVA | 1+ Categorical | 2+ Interval |
Canonical Analysis | 2+ Interval | 2+ Interval |
3. Know the type
of data required to do the statistical procedures of this course (correlation,
multiple regression, ANOVA, ANCOVA).
In order to do a correlation
analysis you must have two variables in which the data consists of matched
or paired cases. The two paired variables are usually referred to as X
and Y. For correlation analysis either variable can be designated as X
or Y. For multiple regression you
have one dependent variable (Y) and one or more independent variables (X).
Both X and Y are analyzed as if they were interval or better data (there
are procedures that can transform other variables). ANOVA analyzes one
interval (or better) dependent variable (the same type as multiple regression),
but the independent variable is treated as if it were a categorical variable.
ANCOVA combines characteristics of ANOVA and multiple regression. One interval
(or better) dependent variable is used (the same as ANOVA and multiple
regression), but two types of independent variables are used: one or more
categorical independent variables (the same as ANOVA) and one or more interval
(or better) independent variables (the same as multiple regression).
Simple Analysis of Variance
4. Know the relationship
between the terms factor, level, treatment, group, and variable.
The independent
variables in ANOVA are called factors. The values of the factors are called
levels, treatments, or groups. The term treatment is usually reserved for
an experimentally assigned group and level is sometimes reserved for variables
that are quantitative in nature (even though analyzed as groups).
5. Know the main
assumptions of ANOVA.
The main assumptions
of ANOVA are:
a. interval data on X (the dependent variable)
b. normal distribution on X for the population from which each group was
selected
c. equal variance on X for the population from which each group was selected
d. observations are independent of each other (this is almost always satisfied)
6. Know how to identify
violations of the assumptions of ANOVA.
Assumption a is determined
by the type of variable used. The other assumptions depend on the data
collected. Scatterplots of the data can reveal violations of assumptions b and c.
Levene's test of homogeneity of variance is also used to test assumption c.
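As a rough sketch of what Levene's test computes: in its mean-centered form, the statistic is the one-way ANOVA F ratio calculated on the absolute deviations of each score from its own group mean. The function name and the scores below are made up for illustration. Converting the statistic to a p value (which SPSS reports) requires an F distribution, which this sketch does not include.

```python
from statistics import mean

def levene_statistic(groups):
    """Mean-centered Levene statistic: the ANOVA F ratio on absolute deviations."""
    # Replace each score by its absolute distance from its own group mean.
    abs_dev = [[abs(x - mean(g)) for x in g] for g in groups]
    all_dev = [z for g in abs_dev for z in g]
    grand = mean(all_dev)
    k, n_total = len(abs_dev), len(all_dev)
    # One-way ANOVA sums of squares, computed on the deviations.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in abs_dev)
    ss_within = sum((z - mean(g)) ** 2 for g in abs_dev for z in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical scores: the second group is far more spread out than the first.
tight = [2, 3, 4]
spread = [1, 5, 9]
w = levene_statistic([tight, spread])
print(round(w, 3))
```

A larger statistic indicates that the groups differ more in spread, which is evidence against assumption c.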
7. Know what to
do when the assumptions of ANOVA are violated.
ANOVA is fairly robust
to violations of assumptions b and c. If either is severely violated,
use a nonparametric test. Assumption a must be satisfied.
8. Know the meaning
of the sums of squares in the ANOVA table in terms of deviation scores.
SSBetween
= the sum of the squared deviations between the group means and the grand
mean.
SSWithin
= the sum of the squared deviations between the individual scores and the
group means.
SSTotal =
the sum of the squared deviations between the individual scores and the
grand mean.
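The three definitions above can be written in symbols. The notation is an assumption chosen for this sketch, not taken from the course materials: X_ij is the score of individual i in group j, n_j is the size of group j, k is the number of groups, X-bar_j is the mean of group j, and X-bar is the grand mean. Each group mean's deviation is counted once per individual, which is why n_j appears in the first line.

```latex
\begin{aligned}
SS_{\text{Between}} &= \sum_{j=1}^{k} n_j \,(\bar{X}_j - \bar{X})^2 \\
SS_{\text{Within}}  &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \\
SS_{\text{Total}}   &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
                     = SS_{\text{Between}} + SS_{\text{Within}}
\end{aligned}
```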
9. Know the relationship
between explained, unexplained, and total variation and sum of squares
for between, within, and total.
Explained variation
= Between sum of squares.
Unexplained variation
= Within sum of squares.
Total variation =
Total sum of squares
10. Know the relationship
between each mean square in an ANOVA table and a variance.
Each mean square is a variance: a sum of squares divided by its degrees
of freedom. For example, MS within can be called error variance.
11. Know the mean
squares used to compute an F ratio.
F = MS Between/ MS
Within.
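The pieces of the one-way ANOVA table can be computed by hand from raw scores. The following is a minimal sketch, not the SPSS procedure; the function name and the scores are made up for illustration, and the degrees of freedom used are the usual k - 1 (between) and N - k (within).

```python
from statistics import mean

def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n_total = len(groups), len(all_scores)

    # Deviations of group means from the grand mean, counted once per case.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Deviations of individual scores from their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Deviations of individual scores from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

# Hypothetical scores for three groups of three cases each.
result = one_way_anova([[2, 3, 4], [4, 5, 6], [6, 7, 8]])
for name, value in result.items():
    print(name, value)
```

In this made-up example the between and within sums of squares add up to the total, as described in objective 9.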
12. Know how to
use scatterplots, boxplots, and error bar charts to evaluate the differences
between group means.
Scatterplots give
you the most information - they show every case in each group but there
is no indication of what the mean or median is for each group. Boxplots
provide less individual detail but give the median and other quartiles
for each group. Error bar charts indicate the mean of each group along
with a confidence interval around the mean. If the confidence intervals in the
error bar chart overlap substantially, the differences are unlikely to be
significant; a slight overlap does not rule out a significant difference.
13. Know the meaning
of fixed and random effects.
Fixed effects are
those where the groups studied are the only groups to which the results
are to be applied. Random effects are those where the groups are a sample
of those of interest. Most research using ANOVA deals with fixed effects.
These are the only problems we will deal with in EDRM612.
14. Know the meaning
of small, medium, and large effect sizes in ANOVA and how they are computed.
Effect sizes in ANOVA
refer to the differences between the means. They are frequently interpreted
in either of two ways: in standard deviation units (z scores) or an eta
squared value. When different analyses are compared or combined (e.g.,
in meta analysis) each difference of means is converted to a z score. Eta
squared summarizes differences between all means being compared in one
analysis. Conventional standards for interpreting z score effect sizes
are .2 (.2 standard deviation difference between two means) for a small
effect, .5 for a medium effect, and .8 for a large effect. The corresponding
eta squared guidelines are .01 (small), .06 (medium), and .14 (large).
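The text above does not give formulas, so the sketch below assumes the usual definitions: the standardized difference is the difference between two means divided by the pooled standard deviation, and eta squared is the between sum of squares divided by the total sum of squares. Function names and data are hypothetical; the cutoffs are the ones listed above.

```python
from math import sqrt
from statistics import mean, variance

D_CUTOFFS = (0.2, 0.5, 0.8)        # small, medium, large (standard deviation units)
ETA_CUTOFFS = (0.01, 0.06, 0.14)   # small, medium, large (eta squared)

def standardized_difference(group_a, group_b):
    """Difference between two means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def eta_squared(ss_between, ss_total):
    """Proportion of total variation explained by group membership."""
    return ss_between / ss_total

def label(value, cutoffs):
    """Map an effect size onto the conventional verbal labels."""
    small, medium, large = cutoffs
    size = abs(value)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "below small"

# Hypothetical data: two groups of three scores, plus SS values from an ANOVA table.
d = standardized_difference([4, 5, 6], [2, 3, 4])
eta2 = eta_squared(24, 30)
print(d, label(d, D_CUTOFFS))
print(eta2, label(eta2, ETA_CUTOFFS))
```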
15. Given a data set with more than two groups of one independent variable, use SPSS to test for the significance of the difference between the means and interpret the results correctly (One-way ANOVA).
16. Know when tests
of multiple comparisons are appropriate and why they are needed.
When more than two
means are being compared the initial test between all means is called an
omnibus test. Tests of multiple comparisons are helpful to compare pairs
or sets of these multiple means. Tests of multiple comparisons are needed
to compensate for the inflated alpha rate that results when many tests are conducted.
17. Know the meaning
of contrast or familywise (experimentwise) alpha rates.
If there were really no differences between
groups, using a contrast alpha rate of .05 for each test would result in
finding a significant difference in about 5% of the contrasts. Using a familywise
alpha rate of .05 would result in finding at least one significant difference
in about 5% of the analyses you did, regardless of the number of comparisons
made in each analysis.
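To see why the distinction matters, the sketch below estimates how the familywise rate grows when each contrast is tested at .05. It assumes the tests are independent, which pairwise comparisons on the same data are not, so the figures are only approximate. The function names are hypothetical.

```python
def pairwise_comparisons(n_groups):
    """Number of distinct pairs of means among n_groups groups."""
    return n_groups * (n_groups - 1) // 2

def familywise_alpha(contrast_alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - contrast_alpha) ** n_tests

for n_groups in (3, 4, 5):
    n_tests = pairwise_comparisons(n_groups)
    rate = familywise_alpha(0.05, n_tests)
    print(n_groups, "groups:", n_tests, "pairs, familywise rate about", round(rate, 3))
```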
18. Know the meaning
of post-hoc and a priori tests.
Post-hoc tests are
done after finding a significant F in an omnibus test. Tests specified
before the omnibus F test is done are called "a priori" tests.
19. Know the issues
involved in determining which test of multiple comparisons is best to use.
Different tests vary
in their power, in how they control familywise Type I error, in their
appropriateness with unequal variances or sample sizes among groups, in their
sensitivity to the number of comparisons run (all or a subset), and in whether
the comparisons must be specified in advance.
20. Given SPSS output
from a one-way ANOVA, interpret the results of a multiple comparison test.
Results are reported
either as p values (significance) for every pair of means or as
homogeneous subsets, which group together means that are not significantly
different from each other.
Factorial Analysis of Variance
21. Know the meaning
of the term "factorial design".
A factorial design
is a design that includes two or more factors. Factorial designs are frequently
referred to by the number of factors, such as a two-way design, three-way
design, etc. They are also referred to by the number of categories in each
factor, such as 2x4 or 3x2x5 designs.
22. Know the meaning
of row, column, layer, simple, and main effects.
The factors in a two-way
design are usually called rows and columns and in a three-way design they
are called rows, columns, and layers. The term "effect" is a general term
referring to the difference in means between rows, between columns, and
between layers in a factorial design. The row effect is the difference
between the row means. If the difference between the means is large you
would have a large row effect. Similarly you might have a large column
or layer effect in a three-way design. Row, column, and layer effects together
are also called "Main" effects. In contrast, other components of the ANOVA
model covered later (interaction and covariate effects) might be of "Secondary"
importance. Simple effects are effects of one independent variable at only one
level of the other independent variables. For example, you might have a row
effect (difference between rows) at column one.
23. Know the meaning
of two-way and three-way interaction.
A two-way interaction
is when the row effect is not consistent (not the same) over the columns
and the column effect is not consistent over the rows. For example, if
the overall difference between two rows is 10 points, interaction would
occur if the difference between the rows was not 10 points within each
of the columns.
A three-way interaction is when the two-way interaction is not consistent over the third factor. For example, the overall two-way interaction may be that the row effect is three times as large in column one as in column two overall, but it is not consistent for each of the layers of the third variable. The difference in row effect between the two columns may be four times as much at one layer but only two times as much in the other layer.
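A small sketch of these definitions, using made-up cell means for a 2x2 design with equal cell sizes. It describes the pattern in the means only; whether an interaction is significant is decided by the F test. The three-way check compares differences between row effects rather than the ratios used in the example above, and all function names are hypothetical.

```python
def row_effects_by_column(cell_means):
    """Row 1 mean minus row 2 mean, computed separately within each column."""
    first_row, second_row = cell_means
    return [a - b for a, b in zip(first_row, second_row)]

def has_two_way_interaction(cell_means, tol=1e-9):
    """True when the row effect is not the same in every column."""
    effects = row_effects_by_column(cell_means)
    return max(effects) - min(effects) > tol

def has_three_way_interaction(layers, tol=1e-9):
    """True when the change in row effect across columns differs between layers.

    Each layer is a 2x2 table of cell means.
    """
    contrasts = []
    for cell_means in layers:
        effects = row_effects_by_column(cell_means)
        contrasts.append(effects[0] - effects[1])
    return max(contrasts) - min(contrasts) > tol

additive = [[30, 20], [20, 10]]     # row effect is 10 in both columns
interacting = [[30, 20], [15, 15]]  # row effect is 15 in column one, 5 in column two

print(row_effects_by_column(additive), has_two_way_interaction(additive))
print(row_effects_by_column(interacting), has_two_way_interaction(interacting))
print(has_three_way_interaction([interacting, interacting]))
print(has_three_way_interaction([interacting, additive]))
```

In the second table the overall row difference is still 10 points (the average of 15 and 5), but it is not 10 points within each column.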
24. Know the meaning
of ordinal and disordinal interaction.
Ordinal interaction
occurs in a two-way interaction when the order of the categories from highest
to lowest is maintained across all levels of the other variable even though
the differences between the categories are not consistent. Disordinal interaction
occurs when the inconsistency results in a different ordering of the categories
at each level of the other variable. Interaction is expressed graphically
as non-parallel lines. The lines cross in disordinal interaction but do
not cross in ordinal interaction.
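The distinction can be checked directly from a table of cell means. This is a descriptive sketch with hypothetical names and data: it compares the ordering of the row categories within each column, and it does not handle tied cell means specially. Plotting the other factor on the horizontal axis can give a different classification.

```python
def classify_interaction(cell_means, tol=1e-9):
    """Classify the pattern of cell means (rows x columns)."""
    n_rows, n_cols = len(cell_means), len(cell_means[0])

    # Profile of each column: every row's mean relative to the first row.
    profiles = [[cell_means[r][c] - cell_means[0][c] for r in range(n_rows)]
                for c in range(n_cols)]
    parallel = all(abs(a - b) <= tol
                   for profile in profiles
                   for a, b in zip(profile, profiles[0]))
    if parallel:
        return "no interaction"

    # Order of the rows from highest to lowest mean within each column.
    orderings = [sorted(range(n_rows), key=lambda r: cell_means[r][c], reverse=True)
                 for c in range(n_cols)]
    same_order = all(order == orderings[0] for order in orderings)
    return "ordinal" if same_order else "disordinal"

print(classify_interaction([[30, 20], [20, 10]]))  # parallel lines
print(classify_interaction([[30, 20], [15, 15]]))  # lines converge but do not cross
print(classify_interaction([[30, 10], [15, 20]]))  # lines cross
```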
25. Know how to
use clustered bar charts, clustered boxplots, and multiple line charts
to determine interaction.
Interaction effects
in a clustered bar chart and boxplots are indicated by varying differences
between the bars (bar chart) or medians (boxplot) for each cluster. Interaction
effects in a multiple line chart are indicated by non-parallel lines.
26. Know the components
of a two-way ANOVA table and the meaning of sums of squares for rows, columns,
and interaction in terms of cell, marginal, and expected means.
SSRow = the weighted sum of the squared differences between the row means and the grand mean.
SSColumn = the weighted sum of the squared differences between the column means and the grand mean.
SSInteraction = the weighted sum of the squared differences between the cell means and the expected cell means.
The expected cell means are those that would exist if each dimension (row, column, layer, etc.) was consistent across the other dimensions.
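For a two-way design with equal cell sizes, the expected cell mean under "no interaction" is the row mean plus the column mean minus the grand mean. The sketch below assumes that balanced case; the function name, the cell means, and the cell size are made up for illustration.

```python
from statistics import mean

def two_way_sums_of_squares(cell_means, n_per_cell):
    """SS for rows, columns and interaction from cell means (equal cell sizes)."""
    n_rows, n_cols = len(cell_means), len(cell_means[0])
    row_means = [mean(row) for row in cell_means]
    col_means = [mean([row[c] for row in cell_means]) for c in range(n_cols)]
    grand_mean = mean(row_means)

    # Cell means that would occur if row and column effects simply added up.
    expected = [[row_means[r] + col_means[c] - grand_mean for c in range(n_cols)]
                for r in range(n_rows)]

    # Each marginal mean is weighted by the number of cases it is based on.
    ss_row = n_per_cell * n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ss_col = n_per_cell * n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    ss_interaction = n_per_cell * sum(
        (cell_means[r][c] - expected[r][c]) ** 2
        for r in range(n_rows) for c in range(n_cols))
    return expected, ss_row, ss_col, ss_interaction

# Hypothetical 2x2 design with 5 cases per cell.
expected, ss_row, ss_col, ss_interaction = two_way_sums_of_squares(
    [[30, 20], [15, 15]], n_per_cell=5)
print(expected)
print(ss_row, ss_col, ss_interaction)
```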
27. Know the advantage
of factorial designs over separate one-way designs.
Row and column sums
of squares (and mean squares) would be identical in a factorial design
and separate one-way designs with the same data (in the most common type
of analysis). The error term, however, will usually be smaller in a factorial
design, therefore resulting in a larger F. The factorial design also allows
you to study the interaction of the separate effects. This is not possible
in separate one-way designs.
28. Given a data set, be able to complete an analysis of a two-way ANOVA using SPSS and interpret the results.
29. Given a printout
of a two-way ANOVA from SPSS, be able to interpret the main effect and
interaction results.
Means are given in
two places: a Descriptive Statistics table gives the actual or weighted
means. An Estimated Marginal Means table gives unweighted means. The unweighted
means are an estimate of what the means would be if the groups had been
equal in size. It is important to select the appropriate means for your
interpretation.
30. Be able to complete an ANOVA table given summary data (number of subjects, sum of squares, and number of levels for each factor).
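As an illustration of this objective, the sketch below fills in the degrees of freedom, mean squares, and F ratios of a two-way table from summary data. It assumes a fixed-effects design with equal cell sizes, so every F uses MS within as its error term and the sums of squares add up to the total. The function name and the summary values are hypothetical.

```python
def complete_two_way_table(ss, n_subjects, row_levels, col_levels):
    """Fill in df, MS and F for each source, given the sums of squares."""
    df = {
        "rows": row_levels - 1,
        "columns": col_levels - 1,
        "interaction": (row_levels - 1) * (col_levels - 1),
        "within": n_subjects - row_levels * col_levels,
    }
    ms = {source: ss[source] / df[source] for source in df}
    table = {}
    for source in df:
        # F is the effect's mean square divided by the error (within) mean square.
        f_ratio = ms[source] / ms["within"] if source != "within" else None
        table[source] = {"SS": ss[source], "df": df[source],
                         "MS": ms[source], "F": f_ratio}
    table["total"] = {"SS": sum(ss[source] for source in df),
                      "df": n_subjects - 1, "MS": None, "F": None}
    return table

# Hypothetical summary data: 20 subjects in a 2x2 design.
summary_ss = {"rows": 500, "columns": 125, "interaction": 125, "within": 320}
table = complete_two_way_table(summary_ss, n_subjects=20, row_levels=2, col_levels=2)
for source, line in table.items():
    print(source, line)
```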
31. Know the meaning
of and how to interpret ANOVA results using Type I, II, and III Sum of
Squares.
Type I SS adjusts
each effect for those listed before it in the list of effects. It would
be used if there is a hierarchy of cause and effect factors being hypothesized.
Type II SS adjusts the main effects for each other (not the interaction)
and the interaction effect for both main effects. It is equivalent to the
regression approach. Type III SS evaluates differences between the unweighted
means. Type III is the default and most common in SPSS. It assumes that
differences in sample sizes are a result of random events which is what
would normally occur in an experiment where the experimental and control
groups would have equal n's at the beginning of the experiment. Type IV
will not be covered in this class. It is useful with missing cells (cells
with no subjects).
32. Know the meaning
of tests of simple main effects and when to conduct them and pairwise comparisons
in a factorial design.
Tests of simple main
effects are one-way F tests for individual levels of one or more of the
independent variables using the overall error term. Pairwise multiple comparisons
can be done on either the marginal means or cell means as in a one-way
design. Tests of simple main effects are conducted after a significant interaction
has been found. In many cases the interaction can be interpreted satisfactorily
by inspection of the means and not using simple main effects tests. The
SPSS procedures for simple main effects and pairwise comparisons for factorial
designs require using complicated lmatrix syntax commands. This is necessary
to ensure that the correct error term is used. An alternative procedure
would be to do a one-way F test using SPSS but compute the F ratios by
hand using the overall error term from the two-way analysis.
33. Know how a repeated
measures design differs from one-way and factorial ANOVA designs.
In a repeated
measures design, the groups being compared are composed of the same subjects.
Because of this, the error term for the difference between the means needs
to be computed differently to account for the similarity that is expected
when measuring the same persons. In effect, the difference between subjects
across all variables (groups) is removed from the error term which results
in a more powerful test. Many designs include both a between-subjects factor
and a within-subjects (repeated measures) factor.
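The effect of removing between-subject differences from the error term can be seen in a small worked example. This sketch assumes a single within-subjects factor with one score per subject per condition; the function name and scores are made up. It computes the F ratio twice from the same data: once with the subject variation removed from the error term, and once treating the conditions as independent groups.

```python
from statistics import mean

def repeated_measures_f(scores):
    """scores[s][c] is the score of subject s under condition c."""
    n_subjects, n_conditions = len(scores), len(scores[0])
    all_scores = [x for row in scores for x in row]
    grand = mean(all_scores)
    condition_means = [mean([row[c] for row in scores]) for c in range(n_conditions)]
    subject_means = [mean(row) for row in scores]

    ss_total = sum((x - grand) ** 2 for x in all_scores)
    ss_conditions = n_subjects * sum((m - grand) ** 2 for m in condition_means)
    ss_subjects = n_conditions * sum((m - grand) ** 2 for m in subject_means)
    # What is left after removing condition and subject differences.
    ss_error = ss_total - ss_conditions - ss_subjects

    ms_conditions = ss_conditions / (n_conditions - 1)
    df_error = (n_subjects - 1) * (n_conditions - 1)
    f_repeated = ms_conditions / (ss_error / df_error)

    # Same data analyzed as independent groups: subject differences stay in the error.
    df_within = n_conditions * (n_subjects - 1)
    f_independent = ms_conditions / ((ss_subjects + ss_error) / df_within)
    return f_repeated, f_independent

# Hypothetical scores: three subjects (rows) measured under three conditions (columns).
f_repeated, f_independent = repeated_measures_f([[1, 3, 5], [3, 4, 8], [5, 8, 8]])
print(f_repeated, round(f_independent, 3))
```

With these made-up scores the repeated measures F is several times larger than the independent-groups F, because most of the within-condition variation comes from stable differences between subjects.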