The American Jobs Machine

By Erik Olin Wright and Rachel Dwyer

June, 2000

Technical Appendix:
Data and details for the strategy of the analysis


The data we use, like the data used in the Stiglitz report, come from the Current Population Survey (CPS) for the period 1963-1999. The CPS is the major U.S. government household survey of employment and labor force participation, conducted monthly by the Bureau of the Census for the Bureau of Labor Statistics. The sample, selected to be representative of the civilian, non-institutional population of the United States, is very large and thus allows for fairly fine-grained analyses of labor market trends.

Throughout this analysis we will restrict our investigation to jobs held by employees, thus excluding the self-employed. In principle the problem of job expansion should include all active participants in the labor force, both employees and self-employed. However, the CPS does not contain comparable earnings data for both self-employed and employees, and thus it is difficult to create a unified analysis of job expansion including both categories of earners. For present purposes, therefore, we will restrict the analysis to employees. We will also, for most of the analysis, restrict our attention to full-time employment, defined as working 35 hours a week or more.

No usable CPS data are currently available before 1962, so our analysis must begin after the 1961 start date of the expansion. In addition, the CPS data files that are available for the early 1960's have possible reliability and validity problems -- in large part a result of the difficulties of data collection and storage in that period. For example, employment level estimates in some years are appreciably higher or lower than those reported by other sources. The 1962 data are particularly problematic, so we decided not to use that year. Later years become progressively better. We decided to begin the analysis in 1963 as a compromise between the quality of the data and our need to include as many years of the economic expansion as possible. We also ran the analysis using different years as our starting point, and the broad patterns of our conclusions were maintained. Despite the problems with the data, there are no other sources of employment data for the 1960's as comprehensive and appropriate for our purposes as the CPS.

There are a number of changes in the data over the years that affect our methodological strategy and coding regimes. In particular, the sampling structure of the CPS dataset we use changes from the 1960s to the 1990s, and a number of the central survey questions used to define the variables of our analysis change between the two periods as well.

For the 60's analysis, we use the annual CPS demographic data, collected in March of each year. For the 90's analysis, we use the Bureau of Labor Statistics (BLS) earner study data collected from the Outgoing Rotation Groups (ORG) in each month of the survey. The BLS data are more complete and better suited to our purposes, but the earner study did not begin until 1979. In the earlier period, therefore, we had to rely on the March annual demographic files. Differences in the survey questions between these two sets of files require somewhat different strategies of variable construction in the 60's compared to the 90's. In addition, in the same year the BLS earner study began, the CPS changed its measure of race.

Summary of major differences between the 1960's data and the 1990's data:

1. The March sample size used in the 1960s analysis is roughly 1/3 that of the ORG. (The ORGs consist of 1/4 of each month's sample. An entire year of ORG data is thus 3 times the sample size of one month.)

2. The March full-time/part-time status is based on hours worked at all jobs, whereas the ORG full-time/part-time status is based on hours worked at the main job.

3. The March median earnings of jobs are calculated on a restricted group, because of the necessity of using the demographic data on the year before the survey, whereas the ORG median earnings are based on all current main jobs.

4. From 1962-1978 there is a race variable specifying white, black and other, but no variable identifying Hispanic heritage. From 1979 on, the race variable is a little more detailed, including codes for American Indian and Asian, and there is an additional variable identifying whether the respondent is of Hispanic heritage. As a result, from 1979 on the white category consists of non-Hispanic whites, whereas before 1979 all who consider themselves white remain in that category.

5. There are significant changes in the occupation and industry coding schemes in the two periods.


1960's: March Annual Demographic Data

Before the BLS earner study began in 1979, the CPS was composed of some basic labor force questions asked each month, supplemented at times by another topic depending on the specific month. The basic labor force data do not include data on earnings. However, some data on earnings have been collected since 1961 in the March annual demographic supplement -- a battery of questions asked about activities in the last year including some data about work. Therefore, for the 60's, we require questions both from the basic labor force section of the survey, and from the March annual demographic survey.

There are basically two steps in the construction of the data we use: First we need to calculate the median earnings for every cell in the occupation-by-industry matrix. This is needed to rank-order the job-cells in order to construct the job quality deciles. Second, we need to measure the number of people within each of these cells at the beginning and end of the period under study.
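The two steps can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis. All function names are hypothetical, the counts are unweighted (the real CPS requires sampling weights), and the rule used to cut the ranked cells into deciles, namely equal shares of start-of-period employment, is an assumption that the text above does not spell out.

```python
from collections import defaultdict
from statistics import median


def cell_medians(records):
    """Step 1: median hourly wage for each (occupation, sector) cell.

    `records` is an iterable of (occupation, sector, hourly_wage) tuples.
    """
    wages = defaultdict(list)
    for occ, sector, wage in records:
        wages[(occ, sector)].append(wage)
    return {cell: median(values) for cell, values in wages.items()}


def assign_deciles(medians, start_counts):
    """Rank cells by median wage and group them into deciles (1 = lowest).

    Assumption: each decile holds roughly 10% of start-of-period
    employment. A cell's decile depends on the cumulative share of
    employment in lower-ranked cells.
    """
    ranked = sorted(medians, key=lambda cell: medians[cell])
    total = sum(start_counts.get(cell, 0) for cell in ranked)
    if total <= 0:
        raise ValueError("no start-of-period employment in ranked cells")
    deciles = {}
    cumulative = 0
    for cell in ranked:
        deciles[cell] = min(10, int(10 * cumulative / total) + 1)
        cumulative += start_counts.get(cell, 0)
    return deciles


def job_growth_by_decile(deciles, start_counts, end_counts):
    """Step 2: net change in the number of people per job-quality decile."""
    growth = defaultdict(int)
    for cell, decile in deciles.items():
        growth[decile] += end_counts.get(cell, 0) - start_counts.get(cell, 0)
    return dict(growth)


# Toy data, invented purely for illustration.
records = [("a", "x", 5.0), ("a", "x", 7.0), ("b", "x", 10.0), ("b", "y", 20.0)]
start = {("a", "x"): 50, ("b", "x"): 30, ("b", "y"): 20}
end = {("a", "x"): 60, ("b", "x"): 25, ("b", "y"): 40}

medians = cell_medians(records)
deciles = assign_deciles(medians, start)
growth = job_growth_by_decile(deciles, start, end)
```

With only three cells the toy deciles are coarse. With several hundred occupation-by-sector cells the groups come much closer to equal tenths of employment.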

Median hourly wage of jobs

The annual demographic survey collects the following data, which we then use to develop a measure of earnings:

· screening questions used to identify employees
· average weekly earnings last year (calculated from 2 questions, income from wages and salary last year and weeks worked last year)
· occupation of longest job held last year
· industry of longest job held last year

Since there are no data on hours worked in the last year, we use one of the basic labor force questions:

· hours worked last week on all jobs

In order to get reasonable measures of median job earnings for the categories in our occupation-by-industry matrix, it was necessary to restrict the sample for this task to people who did not change jobs during the last year. There are two reasons for this: the average weekly earnings variable applies to the entire year, not to a particular job, and the hours worked variable refers to the current job, not the longest job held last year. Since there are no explicit data on job changes, our calculations of job-cell median earnings for the 1960s use only people who worked full-time for the full year last year and who work in the same occupation and industry this year as they did last year.
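The sample restriction and the hourly wage calculation for the March files can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The field names are hypothetical, the 50-week cutoff for "full year" is an assumption (the text does not give a threshold), and last year's and this year's occupation and industry codes are assumed to have already been recoded to a common scheme.

```python
FULL_YEAR_WEEKS = 50  # assumption: the appendix does not define "full year"


def march_hourly_wage(person):
    """Estimated hourly wage for one March respondent.

    Returns None when the person falls outside the restricted sample
    (not an employee, not full-time full-year, or changed occupation
    or industry) or when hours are missing.
    """
    if not person["is_employee"]:
        return None
    if not person["full_time_last_year"]:
        return None
    if person["weeks_worked_last_year"] < FULL_YEAR_WEEKS:
        return None
    same_job = (
        person["occ_last_year"] == person["occ_current"]
        and person["ind_last_year"] == person["ind_current"]
    )
    if not same_job:
        return None
    hours = person["hours_last_week"]
    if hours <= 0:
        return None
    # Average weekly earnings last year, from the two supplement questions.
    weekly = person["wage_income_last_year"] / person["weeks_worked_last_year"]
    # Hours come from the basic labor force question on last week.
    return weekly / hours


# Invented respondent, for illustration only.
stayer = {
    "is_employee": True,
    "full_time_last_year": True,
    "weeks_worked_last_year": 52,
    "wage_income_last_year": 5200.0,
    "hours_last_week": 40,
    "occ_last_year": "craft", "occ_current": "craft",
    "ind_last_year": "durable mfg", "ind_current": "durable mfg",
}
mover = dict(stayer, occ_current="sales")

stayer_wage = march_hourly_wage(stayer)
mover_wage = march_hourly_wage(mover)
```

The mover is dropped from the median calculation because last year's earnings and last week's hours may describe different jobs.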

Counts of people in cells in beginning and end of the period

The basic labor force questions provide the following data which we use to measure the number of people in job cells at the beginning and end of each period:

· screening questions used to identify employees
· occupation of main job worked last week
· industry of main job worked last week
· usual full-time/part-time status on all jobs

Occupation and sector coding scheme (32x21 = 672 cells)

From 1963-67, the CPS uses a 2-digit code which does not fully map into the categories we used for the later expansions, so the occupation codes for the 60's are essentially a different scheme. In addition, the codes are slightly different for the respondent's current occupation and sector and for the respondent's "last year's" occupation and sector (which we used to calculate the median earnings of the jobs in the occupation-by-industry matrix). There are 32 codes for current occupation but only 21 codes for last year's occupation. For the period 1963-67, therefore, we could calculate the job-cell median earnings only for 21 occupational categories; from 1967-70 we could do this calculation for the entire set of 32 categories. We combined these two sets of calculations to generate our rank-ordering of job-cells. Sector is more manageable. In the 60's it was possible to generate a 21-category sector variable. (A 22-category variable, the same as the one used in the 1990s, was available from 1968, but since it was unavailable for the early 1960s we had to use the 21-category variable for the entire period.)
 

1990's: Bureau of Labor Statistics Earner Study - the Outgoing Rotation Group Files

In the 1990s analysis the specific part of the CPS data we use is called the "outgoing rotation group." Each household in the sample is interviewed for four consecutive months, then not interviewed for eight months, followed by four more consecutive months of interviews. Part of the sample changes every month so the sample is composed of households at various stages along their interview course in a given calendar month. Households in their fourth and eighth months of interviews are called "outgoing rotation groups"-- in month four, outgoing temporarily for the eight months off interviewing, and in month eight, outgoing permanently. A basic module of the survey is repeated each month; however the survey does vary in other respects depending in part on where the household is along its interviewing course. Questions on "usual weekly earnings" and "usual hours worked per week" are asked only of the outgoing rotation groups (households in month 4 or 8). Responding to the great social science interest in earnings, every year the CPS prepares a file containing the combined outgoing rotation interviews over the whole year. The file contains the labor force and employment series, the earnings and hours worked variables, and some basic demographic characteristics for about 30,000 individuals per month for 12 months. We use an extract from these files prepared by Daniel Feenberg at the National Bureau of Economic Research.
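The 4-8-4 rotation pattern described above can be made concrete with a short sketch. It assumes a steady state in which one new rotation group enters the sample every month. The function names are hypothetical and months are simply counted as integers.

```python
def interview_months(first_month):
    """Months in which a household entering at `first_month` is interviewed:
    four consecutive months, eight months off, four more months."""
    first_spell = [first_month + i for i in range(4)]
    second_spell = [first_month + 12 + i for i in range(4)]
    return first_spell + second_spell


def outgoing_months(first_month):
    """The fourth and eighth interviews, when earnings questions are asked."""
    months = interview_months(first_month)
    return [months[3], months[7]]


def outgoing_share(calendar_month):
    """Return (outgoing groups, groups in sample) for one calendar month,
    assuming a new rotation group has entered every month."""
    outgoing = 0
    in_sample = 0
    # Only groups that entered within the last 16 months can be in sample.
    for first in range(calendar_month - 15, calendar_month + 1):
        if calendar_month in interview_months(first):
            in_sample += 1
            if calendar_month in outgoing_months(first):
                outgoing += 1
    return outgoing, in_sample


schedule = interview_months(0)
outgoing, in_sample = outgoing_share(20)
# Twelve months of outgoing groups, expressed in units of one monthly sample.
annual_org_in_monthly_samples = 12 * outgoing / in_sample
```

Under these assumptions two of the eight groups interviewed in any month are outgoing, which is the one-quarter share behind the sample size comparison in the summary of differences.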

Median hourly earnings for jobs and counts

The Bureau of Labor Statistics initiated the "earner study" in the CPS in order to remedy some of the holes in the data described above for the 1960s. The earner study combined with the basic labor force questions contains all the data we need for this project:

· screening questions used to identify employees
· occupation of main job worked last week
· industry of main job worked last week
· usual hours worked on main job
· usual weekly earnings on main job (for salaried workers)
· usual hourly wage on main job (for workers paid hourly)

We construct an hourly wage variable for salaried workers by dividing weekly earnings by usual hours worked per week.
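A minimal sketch of this construction follows. The field names are hypothetical and do not correspond to actual ORG variable names, and the handling of missing or zero hours is an assumption.

```python
FULL_TIME_HOURS = 35  # full-time threshold used throughout the analysis


def org_hourly_wage(worker):
    """Hourly wage on the main job for one ORG respondent.

    Hourly workers report a wage directly. For salaried workers the
    wage is usual weekly earnings divided by usual weekly hours.
    Returns None if hours are missing or zero (assumed handling).
    """
    if worker["paid_hourly"]:
        return worker["hourly_wage"]
    hours = worker.get("usual_hours")
    if not hours or hours <= 0:
        return None
    return worker["weekly_earnings"] / hours


def is_full_time(worker):
    """Full-time status based on usual hours at the main job."""
    hours = worker.get("usual_hours")
    return bool(hours) and hours >= FULL_TIME_HOURS


# Invented respondents, for illustration only.
salaried = {"paid_hourly": False, "weekly_earnings": 800.0, "usual_hours": 40}
hourly = {"paid_hourly": True, "hourly_wage": 12.5, "usual_hours": 30}

salaried_wage = org_hourly_wage(salaried)
hourly_worker_wage = org_hourly_wage(hourly)
```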

Note on comparability: Since the basic labor force questions and the March annual demographic survey continue in much the same form in the 1990's that they had in the 1960's, we also conducted the 1990's analysis using the same methodological strategy we use for the 1960's to see if the change in measures affects the general contours of our results. The results for the 90's are essentially the same regardless of the method used.

Occupation and sector coding scheme (45x22 = 990 cells)

In 1992, the occupation and sector codes were changed to those based on the 1990 Census. We are able to construct a full 45x22 occupation by sector matrix similar to the one used in the Stiglitz report.