Stata for Students: Describe

This article is part of the Stata for Students series. If you are new to Stata, we strongly recommend reading all the articles in the Stata Basics section.

The describe command gives you a variety of useful information about your data set.

Setting Up

If you plan to carry out the examples in this article, make sure you've downloaded the GSS sample to your U:\SFS folder as described in Managing Stata Files. Then create a do file called desc.do in that folder as described in Doing Your Work Using Do Files and start with the following code:

capture log close
log using desc.log, replace

clear all
set more off

use gss_sample

// do work here

log close

If you plan on applying what you learn directly to your homework, create a similar do file but have it load the data set used for your assignment.

Using describe

If you run describe all by itself, you'll get a description of all the variables in the data set:

describe

This produces the output:

Contains data from U:\sfs\gss_sample.dta
  obs:           254                          
 vars:           895                          22 Jun 2016 15:52
 size:       277,622                          
-----------------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------------------
prestg10        byte    %8.0g      LABA       Rs occupational prestige score (2010)
sppres10        byte    %8.0g      LABA       Spouse occupational prestige score (2010)
papres10        byte    %8.0g      LABA       Father's occupational prestige score (2010)
mapres10        byte    %8.0g      LABA       Mother's occupational prestige score (2010)
prestg105plus   byte    %8.0g      LABA       Rs occupational prestige score using threshold method (2010)
sppres105plus   byte    %8.0g      LABA       Spouse occupational prestige score using threshold method (2010)
papres105plus   byte    %8.0g      LABA       Father's occupational prestige score using threshold method (2010)
mapres105plus   byte    %8.0g      LABA       Mother's occupational prestige score using threshold method (2010)
sei10           double  %12.0g     LABB       R's socioeconomic index (2010)
spsei10         double  %12.0g     LABB       R's spouse's socioeconomic index (2010)
pasei10         double  %12.0g     LABB       R's father's socioeconomic index (2010)
masei10         double  %12.0g     LABB       R's mother's socioeconomic index (2010)
sei10educ       double  %12.0g     LABB       Percentage of some college educ in OCC10 based on ACS 2010
spsei10educ     double  %12.0g     LABB       Percentage of some college educ in SPOCC10 based on ACS 2010
pasei10educ     double  %12.0g     LABB       Percentage of some college educ in PAOCC10 based on ACS 2010
masei10educ     double  %12.0g     LABB       Percentage of some college educ in MAOCC10 based on ACS 2010

This is just the first page. With 895 variables, the describe output for the GSS is very long. Remember you can press 'q' or click on the red stop sign button to have Stata quit what it is doing.

A few highlights of this output:

  • This data set has 254 observations, which in this case means 254 people who responded to the General Social Survey. It is a subset of the complete GSS results.
  • It has 895 variables.
  • The variable name is what you need to use in your commands.
  • The variable label can help you understand what each variable means, though it's no substitute for the complete GSS documentation.
  • All of these variables have an entry in the value label column, which means descriptive labels are attached to some or all of their numeric values. Commands like tab will show you the value labels by default, but your code must refer to the actual values.
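
The distinction between value labels and actual values in the last point can be checked directly. The following is a minimal sketch, assuming gss_sample is already loaded as in the setup above; it uses the variable sex and its value label SEX (the name that describe reports in the value label column for that variable):

```stata
* Sketch: compare the labeled and unlabeled views of one variable.
* Assumes gss_sample is already loaded.
describe sex            // the value label column shows the label name, SEX
label list SEX          // lists each numeric value and the label attached to it
tabulate sex            // the table displays the labels
tabulate sex, nolabel   // the same table, displaying the underlying numeric values
```

The numbers that label list reports are what a condition in your code needs to use, not the label text.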

If you want information about a specific variable, put its name right after describe:

describe sex

This produces:

              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------
sex             byte    %8.0g      SEX        RESPONDENTS SEX
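
Since the full describe listing for 895 variables is long, a more compact overview can sometimes be enough. The sketch below assumes gss_sample is already loaded and uses two of describe's standard options; it is meant as an illustration, not a replacement for the full output:

```stata
* Sketch: shorter alternatives to the full describe listing.
* Assumes gss_sample is already loaded.
describe, short     // header information only, without the per-variable table
describe, simple    // variable names only, in compact columns
```

The simple option is handy for scanning names quickly, but it drops the variable labels, so you may still need the regular describe output for the variables you end up using.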

With so many variables, it can be hard to find what you need in the GSS. One useful trick:

describe *edu*

This will describe all variables that contain "edu" anywhere in their name. The output is:

              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------------------------------------
sei10educ       double  %12.0g     LABB       Percentage of some college educ in OCC10 based on ACS 2010
spsei10educ     double  %12.0g     LABB       Percentage of some college educ in SPOCC10 based on ACS 2010
pasei10educ     double  %12.0g     LABB       Percentage of some college educ in PAOCC10 based on ACS 2010
masei10educ     double  %12.0g     LABB       Percentage of some college educ in MAOCC10 based on ACS 2010
coneduc         byte    %8.0g      LABAB      CONFIDENCE IN EDUCATION
educ            byte    %8.0g      LABAJ      HIGHEST YEAR OF SCHOOL COMPLETED
immeduc         byte    %8.0g      IMMEDUC    LEGAL IMMIGRANTS SHOULD HAVE SAME EDUCATION AS AMERICANS
inteduc         byte    %8.0g      INTEDUC    INTERESTED IN LOCAL SCHOOL ISSUES
maeduc          byte    %8.0g      LABAJ      HIGHEST YEAR SCHOOL COMPLETED, MOTHER
nateduc         byte    %8.0g      LABBL      IMPROVING NATIONS EDUCATION SYSTEM
nateducy        byte    %8.0g      LABBL      EDUCATION -- VERSION Y
paeduc          byte    %8.0g      LABAJ      HIGHEST YEAR SCHOOL COMPLETED, FATHER
sexeduc         byte    %8.0g      SEXEDUC    SEX EDUCATION IN PUBLIC SCHOOLS
speduc          byte    %8.0g      LABAJ      HIGHEST YEAR SCHOOL COMPLETED, SPOUSE
usedup          byte    %8.0g      USEDUP     HOW OFTEN DURING PAST MONTH R FELT USED UP

This is not a complete list of variables related to education, and it includes one variable that has nothing to do with education: usedup, which matches only because its name happens to contain "edu". But if you're interested in looking at education issues using the GSS, it's a start.
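
Two other ways to search are sketched below, again assuming gss_sample is already loaded. The lookfor command searches variable labels as well as variable names, so it can find variables that a name pattern misses; the search words educ and school are just illustrative choices:

```stata
* Sketch: other ways to search for variables.
* Assumes gss_sample is already loaded.
lookfor educ school   // variables whose name or label contains either word
describe *educ        // names that end in "educ"; narrower than *edu*
```

The narrower pattern *educ leaves out usedup, but it also leaves out nateducy, so any pattern is a tradeoff between missing variables and picking up unrelated ones.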

Complete Do File

The following is a complete do file for this section.

capture log close
log using desc.log, replace

clear all
set more off

use gss_sample

describe
describe sex
describe *edu*

log close

Last Revised: 7/18/2016