The previous page dealt with the Two way analysis of variance with repeated measures on both factors. In general, the previous pages provided the computational principles underlying the calculation of effects and error terms, showing how this could be done using Excel. The calculations for a three-way anova with or without repeated measures, however, are somewhat daunting. Here, instead, we deploy only a conceptual understanding of the statistical analysis.
We illustrate the relevant concepts with a three-way anova with repeated measures on one factor. The Before and After effects of using three types of learning Maps on learning to bake a cake are examined using student volunteers from four different Domains of study. On one afternoon, a group of 9 subjects studying Mathematics bake a cake using a specified recipe and ingredients in a Food Studies kitchen. The following morning they are given an hour of instruction in the technique of Mind Mapping and then an hour of instruction on how to bake a cake, during which time they are asked to use Mind Mapping to support their learning. They bake a second cake in the afternoon. A second group of 9 subjects studying Mathematics bake a cake, are instructed in the technique of Topic Mapping and then cake making, and bake a second cake. A third group of 9 subjects studying Mathematics bake a cake, are instructed in the technique of Intended Learning Outcome (ILO) Mapping and cake making, and bake a second cake.
This experimental protocol is replicated for a second group of 27 students studying History, a third group of 27 studying Art, and a fourth group of 27 studying Psychology. For all 108 subjects, their two cakes are each evaluated by a Food Studies technologist and given a score out of 100. Notice the repeated measures: the same subjects bake one cake, are given instruction on the techniques of a given type of Mapping, use their Mapping type while being instructed on baking a cake, and bake a second cake.
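The layout of the experiment described above can be sketched in a few lines of Python. This is only an illustration of the design: the names `DOMAINS`, `MAP_TYPES`, `OCCASIONS`, and the way subjects are labelled are assumptions made for the sketch, and no cake scores are involved.

```python
from itertools import product

# Factor levels as described in the text; the variable names are illustrative.
DOMAINS = ["Maths", "History", "Art", "Psychology"]
MAP_TYPES = ["Mind", "Topic", "ILO"]
OCCASIONS = ["Before", "After"]  # the repeated-measures factor
SUBJECTS_PER_GROUP = 9

# One independent group of subjects for each Domain x Map Type combination.
groups = list(product(DOMAINS, MAP_TYPES))

# Label each subject by group and a within-group index (a hypothetical labelling).
subjects = [
    (domain, map_type, index)
    for domain, map_type in groups
    for index in range(1, SUBJECTS_PER_GROUP + 1)
]

# Every subject bakes two cakes, so contributes one score per occasion.
observations = [(subject, occasion) for subject in subjects for occasion in OCCASIONS]

print(len(groups), "groups of subjects")
print(len(subjects), "subjects")
print(len(observations), "cake scores")
print(len(groups) * len(OCCASIONS), "cells of means and variances")
```

The counts show why the design is described as having repeated measures on one factor: the 12 groups are independent of each other, but each of the 24 cells shares its subjects with one other cell.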
The raw cake scores (sorry!) are not shown here. Instead, the relevant means and variances of the raw data are shown in Table 1. Table 1. Cake mean scores and variances for 108 subjects in 4 Domains, Before and After using 3 Map Types.
The research question here is whether the use of any Mapping technique improves the cake score of the subjects compared to any other Map Type, and if any improvement depends on the Domain of study of the subject. The subjects provide their own control data, in that their baseline cake baking skill is tested before their instruction on Map Type and on cake making, so that their cake score afterwards gives a better picture of whether the Map Type or Domain have any effect. The question could be answered by calculating a change score for each subject, being their After score minus their Before score, and then asking whether the change shown by one Map Type is larger than that shown by another, and whether this change is different depending upon the subjects' Domain of study. As discussed in the page on Two dependent samples, this is a natural and straightforward approach to analysing any pre-post or before-after data, but it has a hidden and therefore problematic issue — relating any data to the change score can give spurious results. In addition, analysing the change scores does not allow an answer to the preliminary question of whether the subjects in each group started off with equal or similar cake baking skills. If the average pre-test score of each group is different, then any apparent increase (or decrease) in the average post-test score of a group could be due to its higher (or lower) pre-test rather than due to the effect of the group's Map Type or Domain of study. These two problems with analysing change scores are addressed by not using a two-way analysis on change scores but instead by investigating the interaction effect using a three-way analysis. We remember that a significant interaction suggests the effects of one factor depend upon the levels of the second factor. 
In the case of our data, significant interaction suggests the differences between Before and After scores depend upon whether the subjects were instructed in Mind Mapping, Topic Mapping, or ILO Mapping and whether they were studying Mathematics, History, Art, or Psychology. This gives the answer to the question whether a given Map Type or a given Domain of study improves the learning of our subjects, and importantly this answer does not depend on any differences between the groups' baseline cake baking skills before the experiment.
Workflow for analysing the three-way anova
We note that the three-way anova involves three factors and hence three main effects: Domain, Map Type, and Pre-post. It involves three first-order or two-way interaction effects: Domain × Map Type, Domain × Pre-post, and Map Type × Pre-post. And it involves one second-order or three-way interaction effect: Domain × Map Type × Pre-post. We have met and dealt with main effects and two-way interaction effects already.
The new issue concerns the meaning and significance of a three-way interaction, and that is where the interpretation of a three-way anova begins. The interaction effect of three factors, for example Domain, Map Type, and Pre-post, asks whether there are any differences between Before and After in the pattern of the Domain × Map Type interaction in cake scores, whether there are any differences between Maths, History, Art, and Psychology students in the pattern of the Map Type × Pre-post interaction in cake scores, and whether there are any differences between Mind, Topic, and ILO maps in the pattern of the Domain × Pre-post interaction in cake scores.
If significant, the three-way interaction effect of Domain × Map Type × Pre-post suggests one or more, if not all, of the following:
- the trend of the differences between Before and After for the various Domains of study depends upon the particular Map Type;
Another way of expressing a significant three-way interaction effect is that it suggests one or more, if not all, of the following:
- the pattern of trends seen in the Domain × Map Type profile plot, and the pattern seen in the Map Type × Domain profile plot, are different Before and After;
In either case, a significant three-way interaction leads to the analyses of three sets of simple interaction effects:
- two Domain × Map Type simple interaction effects, one at Pre-post = Before, and one at Pre-post = After;
- three Domain × Pre-post simple interaction effects, one at each Map Type;
- four Map Type × Pre-post simple interaction effects, one at each Domain.
If any of these simple interaction effects are significant, we undertake an analysis of simple simple (yes, two "simple"s) main effects and related pairwise comparisons for the two factors involved. Where a factor is not involved in a significant simple interaction, its simple main effect may be examined for significance and, if significant, its pairwise comparisons. An insignificant three-factor interaction effect leads to the examination of the main interaction effects: Domain × Map Type, Domain × Pre-post, and Map Type × Pre-post.
Just before starting on the data analysis, it is worth pausing and considering which of the effects reflect the interests of the experiment. First is that we would like to demonstrate improved learning, and this would be given by a significant Pre-Post main effect. Then, we would like to know if improved learning is seen across all three Map Types (an insignificant Map Type main effect) or if one Map Type affects learning more than another Map Type (a significant Map Type × PrePost interaction). Similarly, we would like to know if improved learning is seen across all four Domains (an insignificant Domain main effect) or if learning in one Domain is affected more than in another Domain (a significant Domain × PrePost interaction). Finally, probably the whole point of the experiment is to see whether improved learning in a Domain is associated with a particular Map Type. Perhaps the Art students learn better with Mind Maps and the History students learn better with Topic Maps, and that is shown by a significant Domain × Map Type × PrePost interaction.
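The seven effects of the three-way anova listed in this section can be enumerated mechanically, which may help when checking that a summary table is complete. The sketch below is an illustration only; the name `effects` is an assumption of the example.

```python
from itertools import combinations

FACTORS = ["Domain", "Map Type", "Pre-post"]

# Order 1 = main effects, order 2 = two-way interactions, order 3 = three-way interaction.
effects = {
    order: [" × ".join(combo) for combo in combinations(FACTORS, order)]
    for order in (1, 2, 3)
}

for order, names in effects.items():
    print(order, names)
```

With three factors there are three main effects, three two-way interactions, and one three-way interaction, seven effects in all.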
Calculating the statistical significance of three-way anova effects
To rehearse our understanding: the variation shown by the means of a factor, for example
Factor A, is compared to what would be expected if the means were not
significantly different, that is, if they came from the same population. The
variance of a mean based upon N scores is given by the population variance
divided by N. The population variance is estimated by the variance shown by the
scores of the subjects. If the observed variance of the means is larger than
expected, we infer that one or more means did not come from the same population.
In order to compare the variance of the means with the variance seen in the
subject scores, the observed variance of the means is multiplied by N and is
called MS(A). The variance of the subject scores is in general called MS(error).
The comparison is made with the F ratio where MS(A) is divided by MS(error). The variance of the subject scores is generally given by the average of the cell variances. In the case of repeated measures on one factor, for example Factor C, there are two views of the subject scores available. There is the view given by the subject scores in each cell, as before, and then there is a second view given by averaging the subjects' scores over the levels of Factor C. The average variation shown by the cell scores is called error or cells and is used to test the A, B, and A×B effects. The variance shown by the averaged subject scores over Factor C is called between subjects, and is subtracted from the error or cells variance to give a residual or within subjects variance, used to test the C, A×C, B×C, and A×B×C effects. These error terms are explained in more detail in the page dealing with the Two way anova with repeated measures on one factor. In the case of repeated measures on two factors, there are four views of the subject scores available. There is the view given by the subject scores in each cell, as before. The other three views are given by the subjects' scores averaged over B, over C, and over both B and C together. The average variation shown by the cell scores is called error or cells and may be used to test the A effect. The variances shown by the averaged subject scores over B, over C, and over both B and C are usually used to calculate error terms for testing the B, C, A×B, A×C, B×C, and A×B×C effects. These error terms are explained in more detail in the page dealing with the Two way anova with repeated measures on both factors. Similarly, in the case of repeated measures on all three factors, there are eight views of the subject scores available. There is the view given by the subject scores in each cell, as before. 
The other seven views are given by the subjects' scores averaged over A, over B, over C, over both A and B, both A and C, both B and C, and finally over all A, B, and C. The average variation shown by the cell scores is called error or cells, but is not used in testing any of the anova effects in the 3-way RM on ABC. The variances shown by the averaged subject scores are subtracted from the cells variation and used for testing. In the case of a three-way anova with repeated measures on one factor, Table 2 summarises the error terms appropriate for the test of each effect. Table 2. Error terms for the three-way anova with repeated measures on one factor.
Summary table
The summary table for the three-way anova with repeated measures on the Pre-Post factor, for the data of Table 1, is shown in Table 3.
Table 3. Three-way anova with repeated measures on one factor.
Terminology for the error terms can vary in publications reporting analyses of variance. In particular, the within-subjects error term might be termed 'Subjects x C' if the repeated measures are over Factor C. What is more or less universal is that the between subjects sources of variation are grouped together in the summary table and their between-subjects error term is shown with them, whatever it is called; while the within subjects sources of variation are separately grouped in the summary table with their within-subjects error term, whatever that is called.
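The degrees of freedom in a summary table of this kind can be checked with a short calculation. The sketch below uses the standard degrees of freedom for a balanced design with two between-subjects factors and one repeated factor. The function name `summary_df` and the row labels are assumptions of the example and are not taken from Table 3.

```python
def summary_df(p, q, r, n):
    """Degrees of freedom for a balanced p x q x r design, repeated measures on the r-level factor.

    p, q: levels of the two between-subjects factors (here Domain and Map Type)
    r:    levels of the repeated factor (here PrePost)
    n:    subjects in each of the p*q groups
    """
    return {
        # Between-subjects sources
        "A": p - 1,
        "B": q - 1,
        "A×B": (p - 1) * (q - 1),
        "Error (between)": p * q * (n - 1),
        # Within-subjects sources
        "C": r - 1,
        "A×C": (p - 1) * (r - 1),
        "B×C": (q - 1) * (r - 1),
        "A×B×C": (p - 1) * (q - 1) * (r - 1),
        "Error (within)": p * q * (n - 1) * (r - 1),
    }

df = summary_df(p=4, q=3, r=2, n=9)
for source, value in df.items():
    print(f"{source:16s} {value}")
print("Total", sum(df.values()))  # should equal the number of scores minus 1
```

For this experiment both error terms have 96 degrees of freedom, and the rows add up to 215, one less than the 216 cake scores.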
Profile plots
We usually postpone drawing the three-factor interaction profile plots until we know they are needed, that is, until we know if the three-factor interaction is significant. When interpreting the results of three-way and higher analyses of variance, unnecessary inspection of the high order interaction plots can be misleading. The PrePost × Domain × MapType interaction is significant, however, and so the profile plots are shown below. Notice their large number.
It may be useful to pause and consider the meaning of a significant three-factor interaction here. As identified earlier, probably the point of the experiment is to see whether improved learning in a Domain is associated with a particular Map Type. That is shown by a significant Domain × Map Type × PrePost interaction, so yes, the result suggests that improved learning depends upon which Map Type and which Domain.
One set of three-factor interaction profile plots concerns the Domain × MapType simple interaction at Pre-post = Before and at Pre-post = After, and then concerns the same interaction with the factors plotted on the other axis, hence MapType × Domain at Pre-post = Before and at Pre-post = After, giving 4 plots.
The second set of three-factor interaction profile plots concerns the Domain × PrePost simple interaction at MapType = Mind, MapType = Topic, and MapType = ILO, and the same interaction but with the factors on the other axis, hence PrePost × Domain at MapType = Mind, MapType = Topic, and MapType = ILO, giving 6 plots.
The third set of three-factor interaction profile plots concerns the MapType × PrePost simple interaction at Domain = Maths, Domain = History, Domain = Art, and Domain = Psychology, and the same interaction shown with the axes exchanged, hence PrePost × MapType at Domain = Maths, Domain = History, Domain = Art, and Domain = Psychology, giving 8 plots. These are plots of the simple interaction effects of MapType × PrePost and PrePost × MapType at the different levels of Domain.
There are 18 three-factor interaction profile plots. The easiest place to start is usually with the plots with the fewest lines, so we look at the nature of a significant Domain × MapType × PrePost interaction effect as shown in the profile plots of PrePost × MapType for Maths, History, Art, and Psychology (the ones just above this paragraph). We can see that the profile lines are reasonably parallel in the Art plot, roughly parallel in the Maths and History plots, but cross over in the Psychology plot. We see that the patterns of the PrePost × MapType profile lines are similar in Art, Maths, and History, but quite different in Psychology. The A×B interaction depends upon the particular level of C. We see that the trend of the differences between Before and After for various Map Types depends upon the particular Domain of study, and vice versa, that the trend of the differences between Before and After for each Domain of study depends upon the particular Map Type used by the students.
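The count of 18 profile plots follows directly from the number of levels of each factor: every pair of factors is plotted once for each level of the remaining factor, and then again with the axes exchanged. A minimal sketch, in which the dictionary names `LEVELS` and `plots` are assumptions of the example:

```python
from itertools import combinations

LEVELS = {"Domain": 4, "MapType": 3, "PrePost": 2}

plots = {}
for pair in combinations(LEVELS, 2):
    # The factor left out of the pair supplies the levels "at" which the pair is plotted.
    (third,) = set(LEVELS) - set(pair)
    # Two orientations of each plot: factors exchanged between axis and lines.
    plots[pair] = 2 * LEVELS[third]

for pair, count in plots.items():
    print(" × ".join(pair), count)
print("Total", sum(plots.values()))
```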
Simple interaction effects
Inspection of the profile plot of PrePost × MapType for Art suggests an insignificant simple interaction effect, the plot of PrePost × MapType for History and the plot of PrePost × MapType for Maths might or might not suggest a significant simple interaction effect, while that for Psychology almost certainly suggests a significant simple interaction effect.
The error terms for testing simple interaction effects are established by considering the composition of the simple effect. Each simple 2-way interaction effect is part of the combination of the main 2-way interaction effect and the 3-way interaction effect:
SS(A×B at C1) + SS(A×B at C2) = SS(A×B) + SS(A×B×C)
If the main 2-way and 3-way interactions are tested by the same error term, then the simple interaction is tested by that error term. If the main 2- and 3-way interactions are tested by different error terms, then a pooled error term needs to be constructed for the simple interaction effect, being the pooling of the SS and df of the different error terms involved. Table 4 shows the error terms appropriate for testing simple interaction effects for the three-way anova with repeated measures on one factor.
Table 4. Error terms for the simple interaction effects of the three-way anova with repeated measures on the PrePost factor.
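The decomposition of the simple interaction effects stated in this section can be checked numerically. The sketch below computes the sums of squares from cell means in a balanced design and confirms that the simple A×B effects at each level of C add up to SS(A×B) + SS(A×B×C). The cell means are hypothetical values invented for the check, not the cake data, and the function name is an assumption of the example.

```python
from itertools import product

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def interaction_sums_of_squares(m, n):
    """m[a][b][c] is the mean of the n scores in cell (a, b, c) of a balanced design."""
    A, B, C = range(len(m)), range(len(m[0])), range(len(m[0][0]))
    r = len(C)
    # Marginal means over one, two, and all three factors.
    m_ab = [[mean(m[a][b][c] for c in C) for b in B] for a in A]
    m_ac = [[mean(m[a][b][c] for b in B) for c in C] for a in A]
    m_bc = [[mean(m[a][b][c] for a in A) for c in C] for b in B]
    m_a = [mean(m_ab[a][b] for b in B) for a in A]
    m_b = [mean(m_ab[a][b] for a in A) for b in B]
    m_c = [mean(m_ac[a][c] for a in A) for c in C]
    grand = mean(m_a)

    # Main 2-way interaction: each A×B mean is based on n*r scores.
    ss_ab = n * r * sum(
        (m_ab[a][b] - m_a[a] - m_b[b] + grand) ** 2 for a, b in product(A, B)
    )
    # 3-way interaction.
    ss_abc = n * sum(
        (m[a][b][c] - m_ab[a][b] - m_ac[a][c] - m_bc[b][c]
         + m_a[a] + m_b[b] + m_c[c] - grand) ** 2
        for a, b, c in product(A, B, C)
    )
    # Simple A×B interaction at each level of C.
    ss_simple = [
        n * sum((m[a][b][c] - m_ac[a][c] - m_bc[b][c] + m_c[c]) ** 2
                for a, b in product(A, B))
        for c in C
    ]
    return ss_simple, ss_ab, ss_abc

# Hypothetical cell means for a 2 x 3 x 2 design with 9 scores per cell.
cell_means = [
    [[50.0, 58.0], [52.0, 66.0], [55.0, 60.0]],
    [[48.0, 57.0], [60.0, 61.0], [51.0, 70.0]],
]
ss_simple, ss_ab, ss_abc = interaction_sums_of_squares(cell_means, n=9)
print(sum(ss_simple), ss_ab + ss_abc)  # the two totals agree
```

Because the simple effects combine a 2-way and a 3-way source, their error term has to cover whatever error terms those two sources use, which is why pooling is sometimes needed.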
For the anova of Table 3, SS(Pooled) = SS(Between) + SS(Within) = 8428.33 + 4384.78 = 12813.11, df(Pooled) = df(Between) + df(Within) = 96 + 96 = 192, and so MS(Pooled) = 66.73, which is the error term required for the analysis of the simple interaction effects of PrePost × MapType at each of the Domains of Art, Maths, History, and Psychology. These are shown in Table 5. Note that similar tables are required (but not provided!) for the three simple interaction effects of PrePost × Domain and the two simple interaction effects of Domain × MapType. Table 5. Simple PrePost × MapType interaction effects summary table.
The simple PrePost × MapType interaction effects at Maths, History, and Art are not significant, so the next step is the examination of the simple main effects of PrePost and MapType at each of these Domains. These insignificant simple interaction effects of PrePost × MapType tell us that improved learning, if any, is similar for the Map Types in Maths, Art, and History. The simple PrePost × MapType interaction effect at Psychology is significant, and the next step is the examination of the simple simple main effects of PrePost and of MapType at Psychology. This significant simple interaction effect tells us that improved learning, if any, in Psychology is different depending upon Map Type.
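The pooled error term used for these simple interaction tests can be reproduced from the sums of squares and degrees of freedom quoted above from Table 3. The sketch below simply repeats that arithmetic; the variable names are assumptions of the example.

```python
# Error sums of squares and degrees of freedom as quoted in the text.
ss_between, df_between = 8428.33, 96
ss_within, df_within = 4384.78, 96

# Pooling adds the SS and the df of the two error terms before dividing.
ss_pooled = ss_between + ss_within
df_pooled = df_between + df_within
ms_pooled = ss_pooled / df_pooled

print("MS(Between) =", round(ss_between / df_between, 1))
print("MS(Within)  =", round(ss_within / df_within, 1))
print("MS(Pooled)  =", round(ms_pooled, 2), "on", df_pooled, "df")
```

Note that the pooled mean square is the pooled SS divided by the pooled df. It happens to equal the simple average of MS(Between) and MS(Within) here only because the two error terms have the same degrees of freedom.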
Simple main effects
The error terms for testing simple main effects arising from an insignificant simple interaction effect are established by considering the composition of the simple main effect. Each simple main effect is a part of the combination of the main effect and the 2-way interaction effect involved:
SS(B at C1) + SS(B at C2) = SS(B) + SS(B×C)
If the main effect and the 2-way interaction are tested by the same error term, then the simple main effect is tested by that error term. If tested by different error terms, then a pooled error term needs to be constructed. Table 6 shows the error terms appropriate for testing simple main effects for the three-way anova with repeated measures on one factor.
Table 6. Error terms for the simple main effects of the three-way anova with repeated measures on the PrePost factor.
The example analysis of the insignificant simple PrePost × MapType interaction effects requires the examination of the simple main effects of PrePost at Domain and of MapType at Domain. The error terms are MS(Within) and MS(Between) respectively.
It may be helpful to spell out the meaning of a test of a simple main effect here. For example, the simple main effect of Map Type at Domain=Maths is a test of the differences between the three Map Type means (Mind, Topic, ILO) overall PrePost (overall Before and After) for the Maths students. This may not be a very interesting effect from the point of view of the experiment and what we would like to find out, because the Map Type means are the average of Before and After. On the other hand, for example, the simple main effect of PrePost at Domain=Maths is a test of the difference between the two PrePost means (Before, After) overall Map Types for the Maths students. This is a much more interesting effect from the point of view of the experiment and what we would like to find out, because the PrePost means are the average of the three Map Types.
Table 7 provides the simple main effects of PrePost for Maths, History, and Art, while Table 8 provides the simple main effects of MapType for Maths, History, and Art. Note that the profile plot of a main effect, whether simple simple, simple, or main, is a single line being the means averaged over one or other of the other factors. The profile plots shown earlier are plots of interaction effects and do not properly provide the necessary visualisation of a main effect. The relevant simple main effect plots are shown below each table.
Table 7. Simple main effects of PrePost overall Map Types.
The simple main effects for PrePost overall Map Types for Maths, History, and Art are all significant. Inspection of the profile plots shows that the After mean scores are all significantly higher than the Before mean scores. There is no need for pairwise comparisons here, because the PrePost factor comprises two levels. If the PrePost effect is significant, the two means concerned are significantly different, and it only needs inspection of the values or the profile plot to identify which is the significantly higher mean.
Table 8. Simple main effects of Map Type overall PrePost.
The simple main effect for Map Type overall PrePost is significant for Maths and not significant for History and Art. For History and Art, the mean scores (averaged over Before and After) for Mind Mapping, Topic Mapping, and ILO Mapping are all similar. Inspection of the profile plot for Maths suggests that the mean score (averaged over Before and After) for Mind Mapping is significantly lower than for Topic and ILO Mapping, and for ILO Mapping is significantly higher than for Topic Mapping. This suggestion needs pairwise comparisons to identify exactly which means are different from which other means. The required error term for pairwise comparisons is given by the error term which was used in the parent analysis. The simple main effect of MapType at Domain=Maths was tested against MS(Between). The standard error for the difference between one Map Type mean and another, each based on n items of data, is given by √ (2MS / n). Note that the value for "n" is the number of data items which made up the mean in question. Here, n = 18, being 9 data points for the Before measurement and 9 data points for the After measurement. MS here is MS(Between) = 87.8, hence SEdiff = √ (2 · 87.8 / 18) = 3.12. The difference between the Mind Map Type mean, 52.7, and the ILO Map Type mean, 62.75, for Maths overall PrePost = 10.05. Hence t = 10.05/3.12 = 3.22 with df of df(Between) = 96, p < .01, and we may see that the ILO mean is significantly higher than the Mind mean for Maths overall PrePost.
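The pairwise comparison just described can be reproduced with a few lines of arithmetic. This sketch uses only the values quoted in the text; the function name `se_diff` is an assumption of the example, and the p value is not computed here.

```python
import math

def se_diff(ms_error, n):
    """Standard error of the difference between two means, each based on n scores."""
    return math.sqrt(2 * ms_error / n)

ms_between = 87.8   # error term of the parent analysis, MS(Between)
n = 18              # 9 Before scores plus 9 After scores per Map Type mean
mean_mind, mean_ilo = 52.7, 62.75

se = se_diff(ms_between, n)
t = (mean_ilo - mean_mind) / se

print("SE(diff) =", round(se, 2))
print("t =", round(t, 2), "with df = 96")
```

The same function serves for any of the pairwise comparisons on this page, provided the MS and n passed to it are those of the parent analysis and of the means being compared.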
Simple simple main effects
The error terms for testing simple simple main effects arising from a significant simple interaction effect are established by considering the composition of the simple simple main effect. Each simple simple main effect is a part of the combination of the main effect, the two 2-way interaction effects involved, and the 3-way interaction effect. For example, where each of the three factors has two levels:
SS(A at B1 at C1) + SS(A at B1 at C2) + SS(A at B2 at C1) + SS(A at B2 at C2) = SS(A) + SS(A×B) + SS(A×C) + SS(A×B×C)
If the main effect, the two 2-way interaction effects, and the 3-way interaction effect are tested by the same error term, then the simple simple main effect is tested by that error term. If tested by different error terms, then a pooled error term needs to be constructed. Table 7 shows the error terms appropriate for testing simple simple main effects for the three-way anova with repeated measures on one factor.
Table 7. Error terms for the simple simple main effects of the three-way anova with repeated measures on the PrePost factor.
Earlier, we saw that the simple PrePost × MapType interaction effect at Psychology was significant, and the next step is the examination of the simple simple main effects of PrePost and of MapType at Psychology. The error term for testing PrePost at Map Type = X and Domain = Psychology is MS(Within), while for testing Map Type at PrePost = Y and Domain = Psychology it is MS(Pooled).
Because PrePost is a factor with just two levels, the simple simple main effects of PrePost at Domain = Psychology and at MapType = Mind, = Topic, and = ILO are given by pairwise comparisons of the Before and After means at Domain = Psychology and at MapType = Mind, = Topic, and = ILO. From Table 7, the error term for these pairwise comparisons is MS(Within). The standard error for these pairwise comparisons is given by √ (2MS / n), where MS is MS(Within) = 45.7 (with df(Within) = 96), and n = 9, hence SE(diff) = 3.19. The results are shown in Table 8.
Table 8. Pairwise comparisons for simple simple main effects.
We may see that, for Psychology students, scores are significantly higher After using Mind Mapping and Topic Mapping but are not significantly different between Before and After for ILO Mapping. The simple simple main effects of MapType at Domain = Psychology and at PrePost = Before and = After are given in Table 9. Table 9. Simple simple main effects.
We may see that, for Psychology students, there are significant differences between Map Type scores Before, but there are no significant differences between Map Type scores After. Exactly which Map Type scores are different Before may be given by pairwise comparisons between the Before Map Type scores of 45.6, 53.7, and 61.8. SE, based upon MS(Pooled), = 3.85, and the results are t(Mind-Topic) = -2.1, p = .04, t(Mind-ILO) = -4.2, p < .001, and t(Topic-ILO) = -2.1, p = .04. For some reason, Psychology students started with significantly different Before scores on all three Map Types.
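The three t values for the Before scores of the Psychology students can be reproduced from the means and the pooled error term quoted in the text. The sketch below is illustrative only; the dictionary `before_means` and the other names are assumptions of the example, and the p values are not computed.

```python
import math
from itertools import combinations

ms_pooled = 66.73   # MS(Pooled), on 192 df
n = 9               # each Before mean is based on 9 subjects
before_means = {"Mind": 45.6, "Topic": 53.7, "ILO": 61.8}

se = math.sqrt(2 * ms_pooled / n)
print("SE(diff) =", round(se, 2))

t_values = {}
for first, second in combinations(before_means, 2):
    # Difference taken as first minus second, matching the sign used in the text.
    t_values[(first, second)] = (before_means[first] - before_means[second]) / se
    print(f"t({first}-{second}) = {t_values[(first, second)]:.1f}")
```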
Summary
The three-way anova introduces the 3-way interaction, a new type of effect which is concerned with differences between the three 2-way interaction effects. A 2-way interaction effect may be thought of as a pattern of trends of one main effect as compared with the trends of the other. The 3-way interaction is a pattern of patterns, where the pattern of one 2-way interaction effect is compared to the pattern of the others.
The logic of the analysis of a three-way ABC anova design follows a workflow which always starts with the A×B×C 3-way interaction. If the 3-way interaction is significant, 2-way interaction effects are not interpreted. Instead, 2-way simple interaction effects are investigated, being A×B at each level of C, A×C at each level of B, and B×C at each level of A. If there are p levels of factor A, q levels of factor B, and r levels of factor C, the workflow here is just as though there are r AB anovas, q AC anovas, and p BC anovas. The terminology adds "simple" to the analyses undertaken in these simple two-way anovas. Their main effects are now simple main effects, and if any simple interaction effect is significant then its simple simple main effects are examined.
There is an interesting adjustment to the usual workflow guideline that a main effect should not be interpreted if it is involved in an interaction effect. Because the 3-way interaction is a pattern of patterns, the trend of a main effect may be interpreted if it does not appear in any significant simple interaction consequent upon a significant 3-way interaction. For example, following a significant A×B×C interaction, it may be that one or more of the simple interaction effects of A×B at each level of C are significant, but that none of the simple A×C and B×C effects are significant. In this case, Factor C "escapes" involvement in any significant interaction, and so its main effect can be interpreted.
Factor C was not "involved" in the significant A×B×C interaction because the 3-way interaction is exclusively concerned with the 2-way A×B, A×C, and B×C interactions. If the 3-way interaction is not significant, the three 2-way interaction effects may be examined, and the workflow here is just as though there are three two-way anovas: an AB anova, an AC anova, and a BC anova. The three-way anova may have repeated measures on one factor, two factors, or all three factors. The issues here lie with the choice of the correct error MS for the main, 2-way, and 3-way effects, and for the simple main and simple interaction effects, as well as with the choice of the correct SE in the pairwise comparisons t-tests following significant main, simple main, or simple simple main effects. Previously in these web pages on Psychostats, we have called the between-subjects error variation in a paired or repeated measures design "SS(Subjects)", the variation seen in the cells "SS(Cells)", and have subtracted SS(Subj) from SS(Cells) to yield "SS(Residual)". In the page on Two way repeated measures on both factors we noted that the between subjects variation was better thought of as a Subjects × Repeated Factor variation. In the case of a two-way AB anova with repeated measures on B, the subjects variation would better be termed "B × Subjects", and for a two-way AB anova with repeated measures on both A and B, the subjects variation comprises three sources termed "A × Subjects", "B × Subjects", and "AB × Subjects". A three-way ABC anova with repeated measures on C has one source of subjects variation termed "C × Subjects", and this is the design we have examined in detail in this page, using the previous explanations found in the Two way repeated measures on one factor page. 
A three-way ABC anova with repeated measures on B and C has three sources of subjects variation termed "B × Subjects", "C × Subjects", and "BC × Subjects", and with some work could be analysed using the information found in the Two way repeated measures on both factors page. A three-way ABC anova with repeated measures on A, B, and C has seven sources of subjects variation termed "A × Subjects", "B × Subjects", "C × Subjects", "AB × Subjects", "AC × Subjects", "BC × Subjects", and "ABC × Subjects", and there is no attempt in this web site to illustrate its analysis.
©2025 Lester Gilbert