## Should we use correlations in research?

What is a correlation? Well, a correlation is defined as a relationship between two variables. And just like any kind of relationship there are both positive and negative aspects to correlational designs. I’m going to start with a bit of correlation basics before discussing the negative aspects to correlations (to get them out of the way first), and then I will talk about the more positive side to this type of design before finally reaching a conclusion to the question “should we use correlations in research?”

I’m pretty sure that by now everyone knows that we can have positive or negative relationships between variables, or even no relationship at all. However, we need to be able to measure how strong the relationship between the two variables is. If you look at the image below, it shows the different relationships that can occur and their strength. The strength of the relationship is assigned a numerical value, with -1.00 being a perfect negative correlation and +1.00 being a perfect positive correlation. The first image (top left) shows no correlation (a value of 0.00 shows no linear relationship between the two variables being measured). However, the bottom right image shows a correlation equal to 0.99, which is suggestive of a very strong positive relationship between variables. Similarly, a value of -0.99 shows a very strong negative relationship.
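For anyone who likes to see the numbers, here is a minimal sketch of how that value can be calculated, assuming we are talking about the Pearson correlation coefficient. The function name `pearson_r` and the tiny data sets are made up purely for illustration; they don’t come from any study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables move together around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves around its own mean
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
rises_with_x = [2, 4, 6, 8, 10]   # goes up in step with x
falls_with_x = [10, 8, 6, 4, 2]   # goes down in step with x
no_trend = [2, 1, 3, 1, 2]        # no overall upward or downward trend

r_positive = pearson_r(x, rises_with_x)
r_negative = pearson_r(x, falls_with_x)
r_none = pearson_r(x, no_trend)

print(r_positive, r_negative, r_none)
```

The three results sit at the extremes described above: a perfect positive correlation, a perfect negative correlation and no linear correlation.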

Now that little introduction is out of the way we can get into the more interesting stuff (can’t believe I’ve just said that!). As I just mentioned above, a correlation of 1.00 (+/-) shows a perfect correlational relationship between two variables. But even so, we cannot infer causation from correlational research. For instance, we may see that there is a relationship between A and B, but we do not know whether A causes B. This is the first of the negative aspects of correlational studies that I will discuss. One of the main problems to do with causation is that we often do not have tight control over variables, so we may not always know whether the two variables we aim to study are the only variables in play. The third variable problem suggests that there may be a third variable at play in a study that you are not aware of! So we might think that there is a relationship between A and B, when in fact a third variable (let’s call it C) is affecting A, or B, or both!

I presented a piece of evidence in one of my comments the other week that shows the third variable problem in action and probably helps to get across what I mean. Li (1975)* wanted to find out which variables were the best predictors of the use of birth control in Taiwan. To cut a long story short, it was found that the variable that correlated most highly with the use of birth control was the number of electrical items that there were in the house! Clearly the researchers could tell something was amiss there; no way is owning a kettle going to increase the use of birth control, right? Exactly, and this is why researchers realised that there was a third variable contributing to the correlation they were seeing. After some more research they discovered that the third variable was actually how well educated the individuals were; those who attended school regularly learnt about birth control, and they probably got better jobs and so could afford more electrical appliances. So it wasn’t actually owning a toaster (A) that was causing the use of contraceptives (B), but our third variable, education (C). Whilst in this piece of research it was quite easy to spot that something else was contributing to the correlation, it isn’t always that easy, and things such as this can often go unnoticed.
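To make the third variable problem a bit more concrete, here is a small simulation sketch. Everything in it is an assumption for illustration: the data are randomly generated (not Li’s), the variable names are hypothetical, and I have simply built a world where education (C) drives both appliances (A) and birth control use (B) while A and B never influence each other. The partial correlation formula is one standard way of holding a third variable constant; it is not something the original study is known to have used.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def partial_r(r_ab, r_ac, r_bc):
    """Correlation between A and B with C statistically held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

random.seed(1975)  # fixed seed so the simulated data are reproducible
n = 5000

# Simulated (not real) data: education feeds into both other variables,
# but appliances and birth control never influence each other directly.
education = [random.gauss(0, 1) for _ in range(n)]
appliances = [c + random.gauss(0, 1) for c in education]
birth_control = [c + random.gauss(0, 1) for c in education]

r_ab = pearson_r(appliances, birth_control)
r_ac = pearson_r(appliances, education)
r_bc = pearson_r(birth_control, education)
r_ab_given_c = partial_r(r_ab, r_ac, r_bc)

print("A vs B:", round(r_ab, 2))
print("A vs B, holding C constant:", round(r_ab_given_c, 2))
```

In this made-up world, A and B show a clear positive correlation even though neither affects the other, and that correlation all but disappears once C is taken into account. Of course, this only works when you have actually measured C, which is exactly the difficulty with third variables you are not aware of.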

And unfortunately the third variable problem isn’t the only negative aspect to correlations. We can’t see which way the relationship goes; does A affect B, or is it B that affects A? Often it is difficult to know for definite which direction the relationship goes. Gentile and Anderson (2003)** were interested in studying the relationship between aggression and the use of video games. Their study found that the amount of time that children spent playing violent video games (D) correlated positively with aggressive behaviour (E). However, there is no way that we can say that the violent video games were causing children to act aggressively. Yes, possibly violent video games can increase aggressive tendencies, but it is just as plausible that children who are already more aggressive choose to play violent video games. In other words, it is a “bi-directional model”, as we don’t know which is the determining factor.
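Here is a rough sketch of why the correlation itself can’t settle the direction. Again, the data are simulated and the names are hypothetical: in one made-up world gaming time drives aggression, and in the other aggression drives gaming time. The 0.6 effect size is an arbitrary assumption, not a figure from Gentile and Anderson.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(2003)  # fixed seed so the simulated data are reproducible
n = 5000

# World 1 (simulated): time spent gaming (D) drives aggression (E)
games_1 = [random.gauss(0, 1) for _ in range(n)]
aggression_1 = [0.6 * d + random.gauss(0, 1) for d in games_1]

# World 2 (simulated): aggression (E) drives time spent gaming (D)
aggression_2 = [random.gauss(0, 1) for _ in range(n)]
games_2 = [0.6 * e + random.gauss(0, 1) for e in aggression_2]

r_world_1 = pearson_r(games_1, aggression_1)
r_world_2 = pearson_r(games_2, aggression_2)

# The coefficient is also symmetric: swapping the variables changes nothing
r_swapped = pearson_r(aggression_1, games_1)

print(round(r_world_1, 2), round(r_world_2, 2), round(r_swapped, 2))
```

Both worlds produce much the same positive correlation, and swapping the two variables gives the same value again, so the number on its own tells us nothing about which variable is doing the causing.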

Now that we’ve got two of the main negatives out of the way, I’m going to show you that correlations aren’t all bad. Correlations are used throughout research as they are an easy way to determine if there is a relationship between variables. Correlational studies are often used in medicine. For example, McNeal and Cimbolic (1986)*** noticed a correlation between depression and low serotonin levels. This has consequently led to the development of new drugs to treat depression, such as Selective Serotonin Reuptake Inhibitors (SSRIs), which increase the levels of serotonin in the brain. Without correlational studies, we might miss relationships like this!

Correlations are also good because they allow researchers to study naturally occurring relationships between variables that it would be unethical to manipulate in, for example, a laboratory experiment. One study http://www.cadca.org/resources/detail/study-finds-correlation-between-rapid-rise-unemployment-and-alcohol-abuse found that there was a correlation between increasing unemployment levels and instances of alcohol abuse, suicides and homicides. You can read more about it in the link above, but the study collected information from various sources such as the World Health Organisation (WHO). It was found that unemployment increases of 3% were correlated with a 28% increase in alcohol-related deaths. The reason I mention this piece of research is that we couldn’t possibly test it in a laboratory, as it would be extremely unethical to make people unemployed to see how their health deteriorated as a result. Therefore researchers have to use the information that is available for them to observe. This is why correlational studies can be a great benefit to researchers, as they show us things that we may otherwise miss. They’re relatively easy to run and can produce some extremely useful results without manipulating any variables, simply by observing natural interactions between different variables.

I suppose I should really conclude before this gets even longer. Correlational research is a pain in the neck when it comes to inferring causation: we just can’t do it. But do we always need to know if one thing causes another? The third variable problem is, let’s face it, similar to problems that arise in laboratory experiments. We say we are better able to infer causation in lab experiments because they are controlled; however, extraneous variables can still go unnoticed. Correlations are good at showing relationships, and can lead to further research once a relationship is established.

The End.

(Oh, actually I haven’t answered the question: “should we use correlations in research?”… My simple answer is yes. Why not? As my conclusion shows, there are strengths and weaknesses, but nothing that’s bad enough to dismiss correlational research altogether.)

*Li (1975) in S. L. Jackson’s Research Methods and Statistics: A Critical Thinking Approach

**Gentile, D.A. and Anderson, C.A. (2003). Violent video games: the newest media violence hazard. In D. A. Gentile (Ed.) Media violence and children.

***McNeal, E.T. and Cimbolic, P. (1986). Antidepressants and biochemical theories of depression