United States Department of Labor

Tag Archives: Big Data

BLS Microdata Now More Easily Accessible to Researchers across the Country

I am pleased to announce that BLS is now part of the Federal Statistical Research Data Center Network.

Researchers at universities, nonprofits, and government agencies can now go to 24 secure research data centers across the United States to analyze microdata from our National Longitudinal Surveys of Youth and our Survey of Occupational Injuries and Illnesses. Before, researchers had to visit our headquarters in Washington, D.C., to use these data.

Making our underlying data more accessible for researchers from coast to coast is a huge step forward, and I hope it will lead to a surge in research using BLS data. I believe that having more researchers use BLS data will not only showcase new uses of the data but also improve our products by encouraging researchers from BLS and other organizations to collaborate. It also supports transparency because external researchers can analyze inputs to our published statistics.

Another key benefit to having BLS data alongside datasets from the U.S. Census Bureau and the National Center for Health Statistics is that researchers can combine data from two or more agencies. Using multiple datasets allows researchers to match data to answer new questions with no more burden on our respondents. Put simply, more data = better research = better decisions that rely on research.

Researchers are enthusiastic about adding BLS data to the research data center network.

“We at the Federal Reserve Bank of Atlanta are excited that more BLS microdata are available to researchers. Policy questions are usually complicated. Matched data from different sources can give researchers a much better understanding of economic relationships. That will help us provide more informed policy advice,” said John Robertson, senior policy adviser at the Federal Reserve Bank of Atlanta.

Over the next year, we will add more BLS data to the research data centers based on user demand.

Researchers can also still visit us at our D.C. headquarters to access our full suite of microdata. To learn more and to apply, see our BLS Restricted Data Access page.

Entrepreneurship Facts: Announcing New Research Data on Job Creation and Destruction by Firm Age and Size

I’m delighted to announce that we now have new research data on job gains and losses by firm age and size across industries and states.

For many years, policymakers, economists, and others have debated whether small or large firms create more jobs. Our Business Employment Dynamics program, which measures gross job gains and losses to help us understand net employment changes, informs that debate with data on firm size. A related question is whether startups or older establishments create more jobs. Again, BLS has a stat for that. We have data on employment and business survival rates by the age of the establishment.

While it’s useful to know the age of an establishment—that is, a single location of a business—for some questions, we need to know the age of the firm. A firm may include several or even many establishments. To understand entrepreneurship in particular, we want to know how both the age and size of firms affect job gains, job losses, and employment growth.

With these new data we can answer many interesting questions, including:

  • How much do older firms contribute to job growth? Firms 10 years or older created 800,000 jobs, or 29 percent of the total 2.7 million net employment gain in the year ending March 2015. See the chart below.
  • How much do startup firms contribute to job growth? In the year ending March 2015, startup firms—firms less than 1 year old—created 1.7 million jobs or 60 percent of total employment growth. More than half these jobs were from firms with fewer than 10 employees.
  • How does the age or size of the firm affect the rate of business closures? In 2015, 788,000 establishments closed. Of these, 55 percent were from firms 10 years or older; 16 percent were from firms 5 to 9 years old; and 28 percent were from firms less than 4 years old. Of the establishments that closed from March 2014 to March 2015, 91,000 of them, or 12 percent of the total, had 500 or more employees.
  • Which firm-age group accounted for most job losses during the last two recessions? Firms 10 years or older lost the most jobs during both recessions. Again, see the chart below.


The new research data measure annual gross job gains and gross job losses by firm age and size from March of one year to March of the next. We get the data on firms from the Quarterly Census of Employment and Wages by linking individual establishments over time. Besides firm age and size, we also measure establishment age and size. We have two methods to examine size. One method compares the current size of firms or establishments with the size at the beginning of the year (the base-sizing method). The other method compares the current size with the average size over the year (the average-sizing method).
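As a rough illustration of the ideas in the paragraph above, the following Python sketch computes gross job gains, gross job losses, and the net change for a handful of made-up firms, then assigns each firm a size class two ways. The `Firm` class, the size cutoffs, and the employment figures are all hypothetical. Classifying by starting employment and by the simple average of starting and ending employment is only one reading of the base-sizing and average-sizing methods, not the program's actual procedure.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    name: str
    emp_start: int  # employment in March of the first year
    emp_end: int    # employment in March of the following year


def gross_flows(firms):
    """Return (gross job gains, gross job losses, net change)."""
    gains = sum(max(f.emp_end - f.emp_start, 0) for f in firms)
    losses = sum(max(f.emp_start - f.emp_end, 0) for f in firms)
    return gains, losses, gains - losses


def size_class(employment):
    """Map an employment level to a size class (hypothetical cutoffs)."""
    if employment < 10:
        return "fewer than 10"
    if employment < 500:
        return "10 to 499"
    return "500 or more"


def base_size_class(firm):
    """Assumed base-sizing: classify by size at the beginning of the year."""
    return size_class(firm.emp_start)


def average_size_class(firm):
    """Assumed average-sizing: classify by the mean of start and end size."""
    return size_class((firm.emp_start + firm.emp_end) / 2)


# Made-up firms, for illustration only.
firms = [
    Firm("A", 8, 14),     # expanding small firm
    Firm("B", 600, 450),  # contracting large firm
    Firm("C", 0, 5),      # startup with no employment a year earlier
]

gains, losses, net = gross_flows(firms)
print(f"gross gains={gains}, gross losses={losses}, net change={net}")
for firm in firms:
    print(firm.name, base_size_class(firm), "|", average_size_class(firm))
```

In this toy example, firm A falls in the smallest class under the base-sizing reading but in the middle class under the average-sizing reading, which shows how the choice of method can shift where a firm's job gains are counted.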

I really want to know how you like these new data and what we can do to make them more useful. I invite you to explore the data and share your comments. Your feedback will help us develop the dataset and possibly move it into our regular production. Please write your comments below, or you can email the Business Employment Dynamics staff.

How Does BLS Deal with Uncertainty in Our Measures?

I recently spoke in Pittsburgh at the 2015 Policy Summit on Housing, Human Capital, and Inequality. The Federal Reserve Banks of Cleveland, Philadelphia, and Richmond sponsored this event. I spoke on a panel with Professor Charles Manski of Northwestern University and Jeffrey Kling of the Congressional Budget Office about measuring uncertainty in federal statistics. You can watch the full discussion below.

When I speak to groups around the country or write in the Commissioner’s Corner, I always discuss the importance of having good information to make good decisions. Federal, state, and local policymakers use information from BLS, and so do private businesses, nonprofit organizations, and households. But how do the users of our data and analyses know they can rely on BLS information? Our users shouldn’t simply have blind faith. After all, households, businesses, and governments make decisions based on our data, and those decisions can involve a lot of money. Users of statistics need to understand that all measures have limitations. Data are a tool. Just like screwdrivers or spatulas, data have specific uses and different levels of precision. Data users need to choose the right tools for their purpose and use them correctly. Our goal is to measure the true state of the economy, but data users must recognize that all measures of the truth come with some uncertainty.

So what are the sources of uncertainty in our measures? One source is what we call sampling error. Most statistics we publish at BLS come from sample surveys. Sampling error is the uncertainty that results by chance because we collect the information from a sample instead of the full population. Even though we select our samples carefully using scientific methods, the characteristics of a sample still may differ from those of the population. We rely on sample surveys because it is far too expensive to ask questions of all workers or all businesses every time we need new information about the labor market and economy. Fortunately, statisticians have developed tools to measure sampling error. We publish these measures on our website. For example, you can see whether the most recent monthly changes in our measures of the labor force, employment, and unemployment are statistically significant. If we want to reduce sampling error, we can increase the size of our samples. Larger samples cost more money, but our measures of sampling error can help us decide whether the benefit of reducing that source of uncertainty is worth the cost.
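To make the idea of sampling error concrete, here is a minimal Python sketch. It assumes a simple random sample and a proportion-type estimate such as an unemployment rate. Actual BLS surveys use more complex sample designs and variance estimators, so the formulas, the 1.645 critical value (a 90-percent confidence level), and the sample sizes are illustrative assumptions only.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


def confidence_interval(p, n, z=1.645):
    """Approximate confidence interval (z=1.645 is roughly 90 percent)."""
    se = proportion_standard_error(p, n)
    return p - z * se, p + z * se


def change_is_significant(p1, n1, p2, n2, z=1.645):
    """Test whether a change between two independent samples exceeds
    what sampling error alone could plausibly produce."""
    se_diff = math.sqrt(
        proportion_standard_error(p1, n1) ** 2
        + proportion_standard_error(p2, n2) ** 2
    )
    return abs(p2 - p1) > z * se_diff


# Hypothetical figures: a 5.0 percent rate measured from two sample sizes.
se_small = proportion_standard_error(0.05, 10_000)
se_large = proportion_standard_error(0.05, 40_000)
print(f"standard error, n=10,000: {se_small:.5f}")
print(f"standard error, n=40,000: {se_large:.5f}")

low, high = confidence_interval(0.05, 10_000)
print(f"approximate 90% interval: {low:.4f} to {high:.4f}")

# A 0.1-point change is within sampling error here; a 1.0-point change is not.
print(change_is_significant(0.050, 10_000, 0.051, 10_000))
print(change_is_significant(0.050, 10_000, 0.060, 10_000))
```

Under these assumptions, quadrupling the sample size only halves the standard error, which is why the cost of a larger sample has to be weighed against the gain in precision.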

Other types of uncertainty are harder to measure. For example, some people and businesses choose not to respond to our surveys. If those who don’t respond have different characteristics from those who respond, it could bias our measures. Even when people and businesses agree to participate in a survey, they might not answer every question or their answers might not be accurate. It’s hard to measure the effects of these challenges in collecting information about the economy. We try to minimize the sources of uncertainty, however. For example, we try to design our surveys to make it easier for people and businesses to respond. We show people and businesses how they benefit from responding. We test our survey questionnaires carefully to make sure they are clear and easy to answer. We seek out other sources of information to supplement our surveys, using what many people call “big data.”
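The way nonresponse can bias a measure can be shown with simple arithmetic. The Python sketch below uses two made-up groups that differ both in the quantity being measured and in how often they respond. All of the numbers and the `Group` class are hypothetical, chosen only to show the mechanism.

```python
from dataclasses import dataclass


@dataclass
class Group:
    population: int    # number of units in the group
    respondents: int   # number of units that answer the survey
    mean_value: float  # average of the measured quantity in the group


def population_mean(groups):
    """True mean across everyone, weighting each group by its population."""
    total = sum(g.population for g in groups)
    return sum(g.population * g.mean_value for g in groups) / total


def respondent_mean(groups):
    """Unadjusted mean among respondents only, weighting by respondents."""
    total = sum(g.respondents for g in groups)
    return sum(g.respondents * g.mean_value for g in groups) / total


# Hypothetical groups: the second group responds less often and has a
# higher value of the measured quantity.
groups = [
    Group(population=600, respondents=480, mean_value=10.0),
    Group(population=400, respondents=160, mean_value=20.0),
]

true_mean = population_mean(groups)
observed_mean = respondent_mean(groups)
print(f"true mean: {true_mean}")
print(f"mean among respondents: {observed_mean}")
print(f"nonresponse bias: {observed_mean - true_mean}")
```

If both groups had the same mean value, or responded at the same rate, the two means would match. The bias appears only when responding and the measured characteristic are related.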

Most of all, we communicate with our data users about the strengths and limitations of our data and the methods we use to compile them. We’re always looking for better, clearer ways to explain our data, and I welcome you to share your ideas.

Government Statistics in a World of Big Data

“Big data” is a buzzword you hear often these days. Since long before the term existed, BLS and other federal statistical agencies have used alternative data sources, which today would be labeled “big data,” to revolutionize the way we do business.

Last week I participated in a panel, sponsored by the American Enterprise Institute, to discuss the current and future role of federal statistical agencies in this era of big data. (See the video of the discussion.)

My fellow panelists and I agreed on one point early on: our dislike for the term “big data”!

Former U.S. Census Bureau Director Robert Groves prefers the term “organic data,” while Burning Glass CEO Matthew Sigelman refers to big data as “open market” data sources. Billion Prices Project cofounder Alberto Cavallo defines big data as “new technologies for data collection.”

Whatever term we use, we all agreed that government and private-sector data should be viewed as complementary or mutually reinforcing.

During my presentation, I discussed how big data can complement government surveys. I talked about how the Billion Prices Project, which Cavallo cofounded at the Massachusetts Institute of Technology, relates to the BLS Consumer Price Index.

The Billion Prices Project provides the extreme timeliness of a daily price index and large sample sizes that serve the almost instant needs of some data users, particularly investors.

The Consumer Price Index measures changes in the cost of living for a representative consumer buying a representative market basket. This comprehensive approach is critical to serving policymakers, Social Security recipients, and many others who use the Consumer Price Index in government programs and private contracts.

Far from being a competition, these two approaches provide important, though different, ways to measure and track the economy. Or, as I like to say, two lenses are always better than one.

I was happy to learn the panelists appreciate the key role that federal statistical agencies must play in the emerging world of big data. All parties need to work together to better use all the information we have, whether survey data or big data. Indeed, blending these two types of data creatively will produce new and better ways to inform sound decision making by our nation’s businesses, families, and policymakers. That’s a win-win for everyone.

Women in Statistics: Beyond the Headline

BLS Commissioner Erica Groshen and Department of Labor Chief Economist Heidi Shierholz wrote this post about women in the statistics profession. This post also was published in the U.S. Department of Labor Blog.

As the top two economists at the Labor Department, we took notice of a recent article in The Washington Post. The article, entitled “Women flocking to statistics, the newly hot, high-tech field of data science,” stated that statistics is the one STEM (science, technology, engineering, and math) profession where women are taking the lead.

As women and as economists, we see this as welcome news.

We both strongly believe that guiding more women into careers in science and math is essential. It’s good for women and their families because there are so many new, exciting, and rewarding opportunities in this field. And our whole economy will benefit as more talented women participate fully in these innovative activities. At the Department of Labor we produce a wealth of data on this topic, so we wanted to take a look at the numbers beyond the headline.

People in general are entering statistics jobs, and women seem to be holding their own. The total employment number for statisticians has grown quite a bit in recent years, from 28,000 in 2010 to 72,000 in 2013. Women accounted for 38.3 percent of those 72,000 statisticians, according to Current Population Survey data. In comparison, the “computer and mathematical occupations” category as a whole was 26.1 percent female.

One telling sign about the potential rise of women in this field is that, prior to 2013, the number of female statisticians was too small to publish. We look forward to making historical comparisons and tracking trends as we get more data on the number of female statisticians in the years to come.

One promising factor is that statistics as a profession is expected to see strong growth in coming years. The BLS Occupational Outlook Handbook profile of statisticians shows the employment of statisticians is projected to grow 27 percent from 2012 to 2022, much faster than the projected growth rate of 11 percent across all occupations. To contrast this with some data from our profession, employment of economists is expected to grow 14 percent in the same time period.

In part because of such strong growth, the field also offers a competitive salary. The median wage for a statistician is $79,290 per year. About a quarter of statisticians work for the government, mostly at the federal level, like us.

Julie Gershunskaya, a BLS statistician who received her Ph.D. in Survey Methodology from the University of Maryland in 2011, said she found in school that students focused much more on each other’s qualifications and experience than on gender.

Already, women earn a larger share of advanced degrees in statistics than in similar fields. According to the American Mathematical Society’s 2013 Annual Survey of the Mathematical Sciences in the U.S., women accounted for 44 percent of Ph.D.’s granted in statistics/biostatistics, compared with 27 percent of all other mathematical sciences doctoral degrees combined.

The state where the most statisticians work is Maryland, but if you want to make the most money, head to California, the District of Columbia, or New Jersey, the three top-paying places for statisticians, according to BLS Occupational Employment Statistics.

So, is the headline that “women are flocking to statistics” reflected in our current data? Not quite yet. We can only see a few glimmers thus far, but it’s important to remember that the most detailed data can be slow to reflect the rapid changes we now see anecdotally.

This data snapshot confirms the outlook is indeed bright for the field of statistics as a whole. The rise of women in statistics is especially exciting; we hope it continues and carries forward to other STEM professions as well.