Sunday, November 5, 2017

Histograms in Stata

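A histogram is a graph of the distribution of a continuous variable. Let's create a histogram using census13, one of Stata's example datasets. The webuse command downloads it from Stata's website, so you will need an internet connection. Copy and paste the following code into your Stata command window: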

webuse census13
hist pop



The resulting histogram indicates that population is not normally distributed. The distribution is skewed to the right: most states have relatively small populations, and a few have very large ones.
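
To judge the departure from normality more directly, one option is to overlay a normal curve on the histogram and look at the skewness statistic. This is a minimal sketch, assuming the census13 data loaded above is still in memory:

* overlay a normal density on the histogram
histogram pop, normal
* detailed summary statistics, including skewness and kurtosis
summarize pop, detail

The normal option draws a normal density with the same mean and standard deviation as pop. A positive skewness value in the summarize output would be consistent with a long right tail.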



Now let's say you wanted to see the distribution of population by region. Try this command:

hist pop, by(region)




The resulting graph, with one histogram panel per region, shows you that the Western states tend to have lower populations.
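
One way to check this reading of the graph is to put summary statistics next to it. A minimal sketch, again assuming the census13 data is in memory:

* count, mean, and median of population within each region
tabstat pop, by(region) statistics(n mean median)

A median well below the mean within a region would suggest that a few large states are pulling that region's average up.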

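There are other options within the histogram command (hist is an accepted abbreviation) that can be useful. Let's say you want to increase your number of bins. By default, Stata chooses the number of bins from the number of observations, k = min(sqrt(N), 10*ln(N)/ln(10)), and reports the number it used in the Results window. In some cases, you can make your histogram more informative by increasing the number of bins. Try this code: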

histogram pop, bin(20)



See the differences between this histogram and the one you produced earlier?
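
You can also control the bins indirectly by fixing their width instead of their number. This is a sketch that assumes pop is stored as a raw head count, so the width of 2,000,000 is only an illustrative value:

* each bin spans 2,000,000 people (illustrative value)
histogram pop, width(2000000)

Note that bin() and width() are alternatives, so specify one or the other, not both.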

Some other ways to manipulate histograms in Stata are to (a) add labels and (b) change the y axis to a different measure. By default, the y axis measures density, but you can change it to percent:

histogram pop, percent

Try adding labels to each bar, keeping the percent option, and expanding to 20 bins:

histogram pop, bin(20) percent addlabel



Like all Stata graphics, histograms can be customized in many ways with a fairly simple and intuitive series of commands. That's one of Stata's benefits in comparison to other software packages.
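
As a small illustration of that flexibility, the sketch below adds a title and an x axis label to the last histogram and then saves it to disk. The title text and the file name pop_hist.png are placeholders chosen for this example:

* same histogram as before, with a title and an x axis label
histogram pop, bin(20) percent addlabel title("State populations") xtitle("Population")
* save the graph currently in memory as a PNG file in the working directory
graph export pop_hist.png, replace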

Convinced of our expertise? Let 272Analytics assist with data analysis and/or methodology for your quantitative thesis or dissertation.





