Friday, November 3, 2017

Pearson correlation in Stata

In Stata, Pearson correlation is carried out with the following command:

pwcorr var1 var2

Here, var1 and var2 are the two variables you're correlating. Unless you've actually named your variables var1 and var2, you'll have to insert your own variable names here.

Bear in mind that you can correlate as many variables as you like. If you were correlating four variables, your code would be:

pwcorr var1 var2 var3 var4
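One thing worth knowing once you correlate more than two variables: pwcorr stands for pairwise correlation, so each coefficient is computed from every observation that is non-missing on that particular pair of variables. If your data have missing values, different cells of the matrix can rest on different sample sizes. A minimal sketch of how you might check this, using the same placeholder names var1 to var4 as above:

```stata
* Pairwise deletion (the default); obs prints the N behind each coefficient
pwcorr var1 var2 var3 var4, obs
* Listwise deletion: keep only observations complete on all four variables
pwcorr var1 var2 var3 var4, listwise
* correlate also uses listwise deletion, but has no significance options
correlate var1 var2 var3 var4
```

If none of your variables has missing values, all three commands return the same coefficients.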

What pwcorr returns is the Pearson correlation coefficient (r), which can vary from -1 (perfect negative correlation) to 1 (perfect positive correlation), with 0 representing no linear relationship at all. However, the r value alone isn't of much use. You'll want to know whether the correlation is statistically significant, in which case you will alter the code as follows:

pwcorr var1 var2, sig
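The sig option prints the p-value beneath each coefficient. pwcorr has a few other options that can be combined with it; the sketch below shows two that are often useful once the matrix grows, again with placeholder variable names that you would replace with your own:

```stata
* Mark coefficients significant at the 5% level with an asterisk
pwcorr var1 var2 var3 var4, sig star(.05)
* Adjust the p-values for the number of comparisons being made
pwcorr var1 var2 var3 var4, sig bonferroni
```

With four variables there are six distinct correlations, so an adjustment such as bonferroni is one way to guard against a spurious "significant" result turning up by chance.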

Let's try the sig option on a real-world example. Copy and paste the following lines of code directly into the Command window in Stata and press the RETURN key.

webuse census13
pwcorr brate pop, sig


In this Census dataset, brate is birth rate and pop is population, so we are correlating the population of a state with its birth rate.

Once you enter the commands, a correlation matrix is produced. Obviously, the correlation of birth rate with itself is 1. The correlation of birth rate with population yields an r value of -0.283, and it is statistically significant, because the p-value printed beneath it is below .05. It therefore seems that the more populous the state, the lower its birth rate tends to be.
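If you are curious where that p-value comes from, it can be reproduced by hand from r and the number of observations, using a t statistic with n - 2 degrees of freedom. The lines below are a rough sketch rather than anything you need to run routinely. They use correlate instead of pwcorr because its stored results r(rho) and r(N) refer to the first two variables listed, and the scalar names corr_r, corr_n, corr_t and corr_p are simply labels chosen for this illustration.

```stata
webuse census13, clear
quietly correlate brate pop
* Save the coefficient and the sample size that correlate leaves behind
scalar corr_r = r(rho)
scalar corr_n = r(N)
* t statistic with n - 2 degrees of freedom
scalar corr_t = corr_r * sqrt((corr_n - 2) / (1 - corr_r^2))
* Two-sided p-value
scalar corr_p = 2 * ttail(corr_n - 2, abs(corr_t))
display "r = " %6.3f corr_r "   p = " %6.4f corr_p
```

Because neither variable has missing values in this example, the p-value displayed should agree with the one pwcorr prints under the sig option.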





Correlation is just the beginning of the possible analyses of the relationship between these two variables. Elsewhere, we've provided tutorials on performing ordinary least squares (OLS) regression, also known as linear regression, which is a common procedure conducted after correlation. We've also provided guidance on how to construct scatter plots of the relationship between variables in order to complement your statistical analysis.
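As a quick taste of those next steps, here is a minimal sketch using the same Census example. It is not a substitute for the fuller tutorials; it only shows the two commands you would most likely reach for first.

```stata
webuse census13, clear
* Scatter plot of birth rate against population, with a fitted line overlaid
twoway (scatter brate pop) (lfit brate pop)
* Simple OLS regression of birth rate on population
regress brate pop
```

In a regression with a single predictor like this one, the R-squared is the square of the Pearson r, and the p-value on the slope matches the p-value of the correlation.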

Convinced of our expertise? Let 272Analytics assist with data analysis and/or methodology for your quantitative thesis or dissertation.
