# Assignment: Causal explaining

#### ASSIGNMENT: CAUSAL EXPLANATIONS AND NONEXPERIMENTAL STUDIES

Under Johnson’s (2001) classification system, many nonexperimental studies are either descriptive or predictive. For those, the notion of causation is not relevant. However, a goal for many explanatory nonexperimental research studies is to explore potentially causal relationships. A causal relationship is one in which a given action is likely to produce a particular result.

The terms *independent* and *dependent* refer to the different roles variables play in experimental studies. If a causal relationship exists, then the outcome (the measured dependent variable, or DV) depends on, or is a direct result of, the nature of the assigned independent treatment condition. Strictly speaking, these terms are not applicable in nonexperimental research, although they are often used. The more appropriate terms in nonexperimental studies are **criterion** and **predictor** variables, criterion being the presumed outcome of one or more predictor variables. When the intent is to use nonexperimental research to study potential cause-and-effect relationships where experimentation is not possible, the concepts of independent variable (IV) and DV may still be of interest, but conclusions about causation that can be made from nonexperimental studies are weaker than those that can be made from true experimental studies. Additionally, great care needs to be taken to ensure that nothing essential has been overlooked.

As explained earlier, the distinction is often made between nonexperimental studies that involve both categorical and quantitative variables and those that involve only quantitative variables. Considering only two variables for the sake of simplicity, an example of the first type of study is a comparison of gender differences in mathematics achievement in high school. Gender, with male and female as the two categories, is considered the independent variable and some mathematics achievement score is the measured dependent variable. Examples where both variables are quantitative might be an examination of the relationship between test scores and time spent studying, or between scores on some measure of motivation and scores on an achievement test. Examples like these, of very simple cases involving only two variables, are neither very interesting nor very informative. Additional variables could be included in order to examine more complex relationships.

No matter which type of design or which type of variable is used, evidence of a relationship would not be convincing evidence of causality. Recall the example described earlier about investigating the relationship between education level and salary and the two ways that education level could be measured. Regardless of whether education level was construed as categorical (highest degree earned) or as quantitative (number of years of schooling), it should not be concluded that one’s educational level *caused* or produced a different level of salary. If dramatic differences across the five groups with different degrees were found such that those with higher education had higher median salaries, all that can be concluded is that there was a *relationship* between educational level and salary. This same conclusion would be possible if results indicated a strong positive correlation between years of schooling and salary: that people with fewer years of school tended to have low salaries and people with more years of school tended to have high salaries (see Figure 4.1 for graphical representation of a positive relationship). The scatter plot for a negative relationship would go from the upper left corner to the lower right corner, indicating that low scores on one variable tended to go with high scores on the other variable.
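To make the idea of direction concrete, here is a minimal Python sketch that computes a Pearson correlation coefficient for one positive and one negative relationship. All of the numbers are invented for illustration, and the `hours_absent` variable is a hypothetical stand-in for any measure that falls as schooling rises; none of it comes from the studies discussed in this chapter.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data, invented for illustration only.
years_of_schooling = [10, 12, 12, 14, 16, 16, 18, 20]
salary_thousands = [28, 35, 31, 42, 50, 47, 61, 66]   # rises with schooling
hours_absent = [40, 33, 35, 26, 20, 22, 12, 9]        # falls with schooling

r_positive = pearson_r(years_of_schooling, salary_thousands)
r_negative = pearson_r(years_of_schooling, hours_absent)

print(f"schooling vs. salary:  r = {r_positive:.2f}")
print(f"schooling vs. absence: r = {r_negative:.2f}")
```

The sign of each coefficient gives the direction of the relationship and its magnitude gives the strength. Neither value says anything about whether schooling caused the other variable.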

The differences in the wording of the research questions in the previous two cases reflect the nature of the variables used (categorical or quantitative). They would require different analysis strategies, either to test if the median values did differ more than you might expect by chance, or to determine the strength and direction of the relationship. Differences in wording or analysis do not, however, reflect any difference in the nature of the relationship between the variables. Explanatory nonexperimental research articles often have conclusions phrased in causal language. Therefore, the next section is a review of the essential elements needed to establish cause-and-effect relationships and a discussion of their applicability to nonexperimental studies.

### Requirements for Causality

Three conditions must be met in order to argue that some variable X (the presumed independent variable) causes another variable Y (the presumed dependent variable).

1. The two variables X and Y must be related. If they are not related, it is impossible for one to cause the other. For nonexperimental research, that means that it must be demonstrated that differences in X are associated with differences in Y.

2. Changes in X must happen before observed changes in Y. This is always the case when X is a manipulated treatment variable in an experiment. But establishing that a cause happened before an effect needs to be documented in some way or logically explained in nonexperimental studies. This is impossible to do when the data are cross-sectional and collected simultaneously.

3. There is no possible alternative explanation for the relationship between X and Y. That is, there is no plausible third variable that might explain the observed relationship between X and Y, possibly having caused both of them.

In nonexperimental studies, the first requirement can be established easily with correlational analyses. The second could also be established if longitudinal data are used so that predictor variables are measured before the criterion. The third requirement is more difficult to demonstrate. To do so requires a thorough knowledge of the literature and the underlying theory or theories governing the topic being investigated, logical arguments, plus testing and ruling out of alternative possibilities.

The fact that two variables are related does not inform us of which one influences the other. There are at least three reasons why two variables could be related, and the correlation alone cannot tell us which one holds. The three potential explanations are: (1) that X causes or influences Y, (2) that Y causes or influences X, or (3) that Z, a third variable, causes both X and Y. Consider the following headline: “Migraines plague the poor more than the rich.” It could be argued that the stresses of living in poverty and other poverty-related conditions could trigger migraine headaches. It could also be argued that migraines cause one to miss work and eventually lose employment, thereby inducing poverty for a subset of individuals prone to migraines. Which is the correct interpretation? From the correlation alone, it is impossible to tell.
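The third explanation can be sketched with simulated data. In the Python example below, X and Y never influence each other; both are built from a shared variable Z plus independent noise. The sample size, seed, and noise levels are arbitrary assumptions, and the partial correlation is offered only as one common way of checking whether a measured third variable accounts for a relationship, not as a method prescribed by this chapter.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 2000                # arbitrary sample size

# Z drives both X and Y; X and Y have no direct link to each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)
r_xz = pearson_r(x, z)
r_yz = pearson_r(y, z)

# First-order partial correlation of X and Y, holding Z constant.
partial_xy_given_z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"r(X, Y)     = {r_xy:.2f}")
print(f"r(X, Y | Z) = {partial_xy_given_z:.2f}")
```

X and Y show a moderate correlation even though neither affects the other, and the relationship largely disappears once Z is held constant. In a real study Z has to be identified and measured first, which is why knowledge of the literature matters.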

Although there is no formal way to prove causation in nonexperimental research, it may be possible to suggest it. This is done through careful consideration, by referring to the three conditions for cause, by presenting logical arguments, and by testing likely alternatives in order to make a case for the *likely* conclusion of a causal relationship. One must be careful, however, not to phrase conclusions as proof of causation.

### Ruling Out Alternative Hypotheses

To demonstrate the process for ruling out alternative hypotheses, we will use a medical example. Consider the process a doctor goes through in diagnosing a new patient’s illness. First, the doctor considers the symptoms. The list of symptoms is used to select potential problems with similar symptoms and to rule out problems with different symptoms. Tests are ordered to confirm the most likely diagnosis and remedies are tried. If the test results are negative or the remedies do not work, then the original diagnosis is discarded, and other possible diagnoses are considered and tested. How does this process relate to research? The first step is matching observations (the reported symptoms) to theory (known symptoms for an illness). The second step is to test a hunch or tentative hypothesis (initial diagnosis) and rule out alternative hypotheses (other potential diagnoses). The process continues until a reasonable conclusion is reached. The analogy breaks down at the end: ideally, the doctor reaches the correct diagnosis and the patient is cured, whereas results in nonexperimental studies are never that conclusive.

Given a theory that is driving the research, how does one rule out potential alternative hypotheses? One way is to consider all likely **confounding **or **lurking variables**. In an experimental study, two variables are confounded when their effects on a dependent variable cannot be distinguished. The following example, although purely correlational, should clarify the concept of confounding or lurking variables.

One would expect that grades and standardized tests, such as SAT scores, would be related more to each other than they would to socioeconomic status (SES). In many studies, however, SES and SAT appear to have a much stronger relationship than do grades and SAT. Rebecca Zwick and Jennifer Green (2007) explored reasons for such results with data from a random sample of 98,391 students from 7,330 high schools. They performed two different analyses. In the first analysis, they found the correlation for grades and SAT for the entire sample and, in the second analysis, they did so for each school individually and then averaged the school-level results to get one overall measure of relationship. The second analysis produced a much stronger relationship between grades and SAT scores than did the first analysis. This is because the first analysis ignored the fact that there are school-level differences in SES as well as other variables.

Figure 4.2 should help you visualize this discussion. In part A, the two smaller ovals represent a scatter plot of scores for two schools, where both grades and SAT scores tend to be higher in School 2 than in School 1. The lines bisecting these two ovals provide a linear representation of the relationship between the variables within each school and are called **regression lines**. Both ovals are rather narrow in width, being fairly close to their regression lines, and thereby give a visual representation of a relatively strong positive relationship between grades and SAT *within* each school. The larger oval represents the relationship between grades and SAT scores as it would appear across or *between* schools, that is, if school membership were ignored in the analysis. It is much more spread out around its regression line (the dotted line), erroneously indicating a much weaker relationship between grades and SAT. The two smaller ovals correspond to Zwick and Green’s second analysis (2007) and the larger oval to their first analysis. Ignoring the differences between the schools confounds the relationship between grades and SAT being investigated.
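The contrast between a pooled analysis and a within-school analysis can be sketched in Python with a small invented dataset. The two schools, their grade-point values, and their SAT scores are hypothetical and chosen only to mimic the pattern in part A. This is not Zwick and Green's data, and averaging two correlations is only a loose stand-in for their school-level procedure.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical schools: (grades, SAT scores). School 2 is somewhat higher on
# grades and much higher on SAT, as might happen with school-level SES gaps.
schools = {
    "School 1": ([2.0, 2.4, 2.8, 3.2, 3.6], [850, 900, 980, 1010, 1080]),
    "School 2": ([2.2, 2.6, 3.0, 3.4, 3.8], [1150, 1210, 1260, 1330, 1370]),
}

# Within-school analysis: correlate inside each school, then average.
within = {name: pearson_r(g, s) for name, (g, s) in schools.items()}
mean_within_r = sum(within.values()) / len(within)

# Pooled analysis: ignore school membership entirely.
all_grades = [g for grades, _ in schools.values() for g in grades]
all_sat = [s for _, sat in schools.values() for s in sat]
pooled_r = pearson_r(all_grades, all_sat)

print(f"mean within-school r = {mean_within_r:.2f}")
print(f"pooled r             = {pooled_r:.2f}")
```

Within each school the points lie close to a line, so the within-school correlations are high. Pooling adds a large school-level gap in SAT that grades do not track, which spreads the points out and lowers the overall correlation.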

Part B of Figure 4.2 shows a worst-case scenario of ignoring a lurking variable. Suppose the relationship between two variables, X and Y, is negative for each of two groups. This is shown by the two smaller ovals, where lower scores on X tend to go with higher scores on Y and vice versa within each group. Ignoring groups, however, would produce a positive relationship, which would be a completely wrong conclusion.

**FIGURE 4.2.** Representation of Effects of Confounding Variables. Panel A (axes: Grades and SAT; ovals for School 1 and School 2): Fairly strong positive relationship between grades and SAT within each school. Weak relationship when school membership is ignored. Panel B (axes: X and Y; ovals for Group 1 and Group 2): Fairly strong negative relationship between X and Y within each group. A seemingly positive relationship when group membership is ignored.
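The sign reversal described for Part B can be reproduced with a few invented numbers. In the Python sketch below, the two groups and all of their X and Y values are hypothetical, chosen so that the relationship is negative inside each group while Group 2 sits higher on both variables.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical groups: within each, higher X goes with lower Y.
group1_x, group1_y = [1, 2, 3, 4, 5], [6, 4, 5, 2, 3]
group2_x, group2_y = [11, 12, 13, 14, 15], [16, 14, 15, 12, 13]

r_group1 = pearson_r(group1_x, group1_y)
r_group2 = pearson_r(group2_x, group2_y)

# Ignoring group membership: combine everyone into one sample.
r_pooled = pearson_r(group1_x + group2_x, group1_y + group2_y)

print(f"Group 1 r = {r_group1:.2f}")
print(f"Group 2 r = {r_group2:.2f}")
print(f"Pooled r  = {r_pooled:.2f}")
```

Both within-group correlations are negative, yet the pooled correlation is positive because the difference between the groups dominates the combined sample. An analysis that ignored group membership would report the wrong direction.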

### REFLECTION QUESTIONS

By now, you should be able to

1. List and explain the three essential requirements to argue cause.

2. Explain why even a strong correlation does not imply causation.

3. Describe why ruling out alternative hypotheses is important.

4. Find one or two nonexperimental studies in your field of study where hypotheses were tested or where a theory was explored. What extraneous variables or potential alternative hypotheses were discussed? Can you think of others that were not discussed? How might inclusion of those variables have changed results?