Chapter 11 Correlation
11.1 What is correlation?
Correlation answers the research question: How are two variables related (in strength and direction)? Strictly speaking, the correlation coefficient only measures linear relationships. There are four possible relationships between two variables:
Positively related variables: As one variable increases, the other variable also increases. Higher scores of variable x are associated with higher scores of variable y.
Negatively related variables: As one variable increases, the other variable decreases. Higher scores of variable x are associated with lower scores of variable y.
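No relationship: As one variable increases, the other variable shows no consistent tendency to increase or decrease.
Non-linear relationship: As one variable increases, the other changes in a systematic way that does not follow a straight line (for example, rising and then falling). Because correlation only measures linear relationships, it can miss this kind of pattern.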
When you graph the two variables, they can form a variety of patterns:
See these examples with their corresponding values of r. What do you think is indicated by r?
Note: These graphs are not histograms. They are scatterplots of the actual values of the two variables. Each dot represents one case.
11.2 Correlation and Causation
The phrase “correlation does not equal causation” means that when you compute a correlation between two variables, you may find them to be related, but you cannot conclude that one variable causes the other. This is because there are other explanations for the data (besides X causes Y):
- A third variable is causing both X and Y
- Y causes X
The only way to make a causal statement with your research is to run an experiment. An experiment is a research design that has two features:
- A manipulation (i.e., an independent variable): A variable you control the value of (plants getting water, drug or placebo)
- Random assignment: Each participant has an equal chance of being assigned to each level of your manipulation
If you don’t have random assignment, your research is quasi-experimental. If you don’t have random assignment or a manipulation, your research is non-experimental. We sometimes call quasi-experiments and non-experiments “correlational research designs,” but this name is misleading. Correlational research designs really mean “not an experiment” and are separate from the statistical technique of correlation. In this section, we are learning about the statistical technique of correlation. You can use the statistical technique of correlation whether you have a correlational research design or an experiment.
11.3 Sample Size
Sample size must be sufficiently large to detect a linear relationship. Larger samples increase power (they make it more likely that you will reject the null hypothesis when it is false). Power is especially important when your measures are not reliable or when the linear relationship you are trying to find is weak. A general rule of thumb is to have at least 50 cases, each measured on both variables.
Your sample can also be too large. With an extremely large sample, the estimate of the correlation is so precise that almost any non-zero value will be statistically significant. Thus, even the tiniest effects will be significant (a very tiny correlation of r = .02 would lead you to reject the null hypothesis), even though such an effect may have little practical importance.
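To see how sample size alone can drive significance, the sketch below applies the t formula from Section 11.6 to the same tiny correlation, r = .02, at several sample sizes. The sample sizes and the helper name `t_from_r` are illustrative choices.

```python
import math

def t_from_r(r, n):
    """Observed t for testing rho = 0, using t = r*sqrt(n-2)/sqrt(1-r^2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.02  # the very small correlation mentioned in the text

# Same r, increasingly large (illustrative) sample sizes.
for n in (50, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: t = {t_from_r(r, n):.2f}")
```

For large samples, the two-tailed critical value of t at the .05 level is about 1.96. In this sketch, the observed t for r = .02 stays far below that value at n = 50 and n = 1,000 but exceeds it once n reaches roughly 10,000.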
11.4 Computing a Correlation
You can find the value of r, which will tell you the strength and direction of the relationship between two variables. You can do this easily with your calculator. The full name of this statistic is the Pearson product-moment correlation.
- Create two lists, one for each variable. Enter your two variables into the two lists, keeping each case's two scores in the same position.
- Use the “Two-var Stats” function to compute r.
- Interpret r: Is the relationship positive or negative? Is the relationship strong or weak?
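For readers who want to see what the calculator is doing, the steps above can be sketched in Python. The five data pairs are hypothetical. The computation uses the standard sums-based formula for the Pearson correlation, built from the kinds of sums a two-variable statistics function reports (Σx, Σy, Σxy, Σx², Σy²).

```python
import math

# Hypothetical data: five cases, each measured on two variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# The sums that feed the computation.
n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Pearson product-moment correlation from the sums.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

# Interpret the direction from the sign of r.
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.2f} ({direction})")  # prints: r = 0.77 (positive)
```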
11.5 Correlations are Sensitive to Outliers
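If either of your variables includes an outlier, it can distort your correlation. An outlier that falls away from the overall pattern will weaken the correlation, while an extreme score that falls in line with the pattern can make the correlation look stronger than it really is. Because of this, you want to consider whether the extreme score belongs as part of your data set. The most common cause of an outlier in real-world data is a mistake during data entry.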
11.6 Hypothesis Testing for a Correlation
A hypothesis test using correlation follows the same steps as any other hypothesis test. Hypotheses: The two formal hypotheses for a correlation are:
\(H_0:\rho=0\)
\(H_1:\rho\ne0\)
In words, the null hypothesis is that the population correlation coefficient (called rho [ρ]) is equal to zero. In other words, there is no linear relationship. The alternative hypothesis is that the population correlation coefficient is not equal to zero. In other words, there is a linear relationship.
Rho (\(\rho\)) is used instead of \(r\) in the formal hypothesis. They are the same thing, except that \(\rho\) is for a population and \(r\) is for a sample.
Analysis. The second step is to perform the analysis. When writing up your results, give the full name of the analysis technique you used. In this case, it is “the Pearson product-moment correlation.” This is when you compute r.
Decide. By using a t-table, you can determine if your value of r is so far from zero that it is unlikely to have occurred by chance. First, use your value of r to get a value of t. Then, find the critical value of t using the t-table. Compare your observed t (the one you calculated) to critical t. If the absolute value of your observed t is greater than critical t, reject the null hypothesis.
\(t=\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\)
\(df=n-2\)
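As a worked illustration with hypothetical values (not taken from a real data set), suppose you found \(r=.50\) in a sample of \(n=27\) cases:
\(t=\frac{.50\sqrt{27-2}}{\sqrt{1-.50^2}}=\frac{2.50}{\sqrt{.75}}\approx2.89\)
\(df=27-2=25\)
A standard t-table gives a two-tailed critical value of 2.060 for \(df=25\) at the .05 level. Because 2.89 is greater than 2.060, you would reject the null hypothesis in this example.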
Conclude. Did you reject or retain the null hypothesis? What is your conclusion? Interpret the meaning of your results. For example, “the correlation revealed a positive relationship between fitness and lifespan. This means that people who exercised more tended to live longer. This was a large effect.”
11.7 Effect size
Aside from the direction of the relationship, you can also talk about its strength. A strong correlation means that the data points of the scatterplot lie close to a straight line. A weak correlation means that the data points of the scatterplot are scattered far from that line. When you report a correlation, give the proper interpretation of its effect size (small, medium, or large).
11.7.1 Interpretation of \(\eta^2\) and \(r^2\) (Cohen, 1988)
These are reference points, not firm cutoffs. For example, an \(r^2\) of .056 is closest to .06, so it would be interpreted as a medium effect size.
Effect Size | Interpretation |
---|---|
\(\eta^2 = r^2 = .01\) | Small effect |
\(\eta^2 = r^2 = .06\) | Medium effect |
\(\eta^2 = r^2 = .14\) | Large effect |
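One way to apply the table is to square r and see which reference point is closest. The sketch below does this. The "nearest reference point" rule and the function name `nearest_effect_label` are assumptions made for illustration, since the values in the table are reference points rather than firm cutoffs.

```python
# Reference points for r^2 from the table above (Cohen, 1988).
BENCHMARKS = {"small": 0.01, "medium": 0.06, "large": 0.14}

def nearest_effect_label(r):
    """Return (r^2, label of the closest reference point).

    Picking the nearest reference point is an illustrative rule,
    not an official cutoff.
    """
    r_squared = r ** 2
    label = min(BENCHMARKS, key=lambda name: abs(BENCHMARKS[name] - r_squared))
    return r_squared, label

# Hypothetical correlations to interpret.
for r in (0.10, 0.24, 0.50):
    r_squared, label = nearest_effect_label(r)
    print(f"r = {r:.2f}, r^2 = {r_squared:.3f}: {label} effect")
```

Because the rule depends only on \(r^2\), a negative correlation gets the same label as a positive correlation of the same size.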