Economic Statistics and Data Analysis

Instruction: The problem sets are designed to be challenging (especially if you are new to data analysis)
and very time-intensive, so plan ahead. In general, the problem sets consist of both solving theoretical problems and analyzing and interpreting real data. You may discuss the questions with your classmates, but you are required to hand in your own independently written solutions, do-files, and log-files. No late work will be accepted, and I do NOT accept electronic copies. Please do not email me your assignments,
as you will not receive any credit. All the data necessary for the problem set are available on UBlearns.
Important: It is extremely important to write a clean, well-commented program for transparency and
replication purposes in all empirical work. You should always be able to reproduce your results from the raw
data to support your claims.
There are 3 items to hand in: (1) a typed write-up (i.e., Word file) answering the assigned questions,
reporting your results, and interpreting your findings; if a question asks for graphs or tables, these must
appear in the Word file in an organized manner with your interpretation; (2) a do-file (i.e., program text file);
and (3) a log-file (i.e., output text file that shows the results). You MUST use Stata. For questions
involving data analysis, you will NOT get any credit if you do not provide the program code and the output.
You may not use Excel. Do not submit a raw log-file that contains errors.
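One possible way to organize a do-file so that it produces the required log-file is sketched below. The file names ps1.do and ps1.log are hypothetical placeholders, and the layout is only a suggestion, not a required template.

```stata
* ps1.do: hypothetical do-file name for this problem set
clear all
set more off
capture log close                      // close any log left open by an earlier run
log using "ps1.log", text replace      // plain-text log-file to hand in

* Question 1: describe the BUSJOBS data
* ... commands for each question go here, each with a comment explaining the step ...

log close
```

Running the do-file from top to bottom in a fresh Stata session is a simple check that the results can be reproduced from the raw data.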
1. [Empirical Exercise] (10 points) What types of jobs are available for students who graduate with
a business degree? The website careerbuilder.com lists job opportunities classified in a variety of
ways. A recent posting had 25,120 jobs. The BUSJOBS data on UBlearns show the types of jobs and the
numbers of postings listed under the classification “business administration” on a recent day.
Describe these data using the methods you learned in Chapter 1, and write a short summary about
jobs that are available for those who have a business degree. Include comments on the limitations
that should be kept in mind when interpreting this particular set of data.
2. [Empirical Exercise] (40 points) This exercise focuses on data management using a dataset that is
downloaded directly from an original source. First, you will learn how to download a dataset
from the Bureau of Labor Statistics (BLS). We are going to use the aggregate Current Population
Survey data prepared by the BLS to compute the unemployment rate for different demographic
groups. The data are available at https://www.bls.gov/data/ under the unemployment selection.
Select “Top Picks” of Labor Force Statistics including the National Unemployment Rate (Current
Population Survey – CPS). Select the overall unemployment rate as well as the unemployment rates by
gender, race/ethnicity, and education. Select “Retrieve data.” Then use the formatting options
to get the data in “column format” for all years and all time periods. (If you do not
select this format, setting up the data for a time-series analysis will be a coding challenge.)
You will download 11 Excel files from the website. Second, you need to learn
how to merge the datasets. Finally, plot four sets of graphs: (1) overall unemployment rate over
time since 1948, (2) unemployment rate over time by gender, (3) unemployment rate over time by
race/ethnicity, and (4) unemployment rate over time by education. Describe your findings in
words (max 1/2 page). Hint: (1) Convert each Excel file to a csv file. (2) Use Stata’s merge command
to merge all 11 files into a single file. (3) Sort the data by year and month, then create a time
variable to use with the tsset command. (4) Use the tsline command.
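The four steps in the hint could be sketched roughly as follows for three of the 11 series. Everything specific here is an assumption for illustration: the file names (overall.csv, men.csv, women.csv), the column names after import (year, period, value), and the period coding ("M01" for January). Check the downloaded files and adjust accordingly, including any extra header rows that must be removed before the first row holds the column names.

```stata
* Hypothetical file names, one csv per downloaded series.
* Assumes each csv has columns year, period (e.g. "M01"), and value.

* Steps 1-2a: import each csv, build year/month keys, save as a Stata dataset
foreach f in overall men women {
    import delimited using "`f'.csv", varnames(1) clear
    generate month = real(substr(period, 2, 2))   // "M01" -> 1
    rename value ur_`f'
    keep year month ur_`f'
    save "`f'.dta", replace
}

* Step 2b: merge the series one-to-one on year and month
use "overall.dta", clear
foreach f in men women {
    merge 1:1 year month using "`f'.dta", nogenerate
}

* Step 3: sort, create a monthly time variable, and declare the time series
sort year month
generate date = ym(year, month)
format date %tm
tsset date

* Step 4: plot the series over time
tsline ur_overall, title("Overall unemployment rate")
tsline ur_men ur_women, title("Unemployment rate by gender")
```

Merging on year and month keeps one row per month, which is the layout tsset expects.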

 

ECO 380 Problem Set 1 Economic Statistics and Data Analysis

3. [Empirical Exercise] (10 points) Use EDUSEV to answer the following questions.
a. Which variables are categorical? (1 point)
b. What percent of the students are female? (1 point)
c. Make a suitable graph that describes the shape, center, and spread of the distribution of
students’ IQ scores. (2 points)
d. In general, IQ scores are usually said to be centered at 100. Is this true for these data? (2
points)
e. Make a suitable graph that describes the shape, center, and spread of the distribution of
self-concept scores. (2 points)
f. Can you identify any suspected outliers? Why? (2 points)
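A few Stata commands that may be useful for this kind of question are sketched below. The variable names female, iq, and selfconcept are hypothetical; use describe to find the actual names in EDUSEV.

```stata
* female, iq, and selfconcept are hypothetical variable names
describe                       // variable names, storage types, and labels
tabulate female                // frequency and percent in each category
histogram iq, frequency        // shape of the distribution
summarize iq, detail           // mean, median, sd, and percentiles
graph box selfconcept          // points beyond the whiskers are plotted individually
```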
4. [Empirical Exercise] (10 points) Use TALK to answer the following questions. People often
generalize that women are more talkative than men. Is this supported by data? One study
designed to examine this stereotype collected data on the speech of 42 women and 37 men in the
U.S.
a. Calculate the mean and standard deviation of the number of words spoken per day by gender.
Report the results by gender. (2 points)
b. Use the 68-95-99.7 rule to describe the distribution by gender. Report the results. (4
points)
c. Describe the skewness of the distribution by gender. Support your statement by
constructing an appropriate graph of your choice. (2 points)
d. Do you think that applying the 68-95-99.7 rule in this situation is reasonable? Do you think that the
data support the generalization that women are more talkative than men? Explain your
answer. (2 points)
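As a rough sketch of how the 68-95-99.7 intervals could be computed in Stata: for an approximately Normal distribution, about 68%, 95%, and 99.7% of observations fall within 1, 2, and 3 standard deviations of the mean. The variable names words and female are hypothetical; check the actual names in TALK.

```stata
* words and female are hypothetical variable names
tabstat words, by(female) statistics(mean sd n)

* Intervals implied by the rule for one group (here female == 1)
quietly summarize words if female == 1
local m = r(mean)
local s = r(sd)
forvalues k = 1/3 {
    local lo = `m' - `k'*`s'
    local hi = `m' + `k'*`s'
    display "mean +/- `k' sd: " %9.1f `lo' " to " %9.1f `hi'
}
```

Repeating the second part with female == 0 gives the intervals for the other group.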

5. [Empirical Exercise] Use COLLEGE to answer the following questions. (Total 20 points)
a. Report the basic descriptive statistics of all the variables contained in the dataset
(i.e., mean, standard deviation, and median). (2 points)
b. Make a scatterplot of undergraduate population and population with the least-squares
regression line. (4 points) [Hint: explore the ‘lfit’ command.]
c. Focus on California, the state with the largest population. Is this state an outlier when
you consider only the distribution of population? Why? (2 points)
d. Is California an outlier when viewed in terms of the relationship between number of
undergraduate college students and population? Why? (2 points)
e. Repeat (c) and (d) using the logs of both variables. (4 points)
f. Delete the four largest states, run your own regression, and report the results. (2 points)
g. What is the equation of your least-squares regression line? (2 points)
h. Interpret the value of r² from your regression. (2 points)
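The ‘lfit’ hint in part (b) can be illustrated with a short sketch. The variable names undergrads and population are hypothetical; check the actual names in COLLEGE with describe.

```stata
* undergrads and population are hypothetical variable names
twoway (scatter undergrads population) (lfit undergrads population)

* Logs of both variables, as in part (e)
generate ln_undergrads = ln(undergrads)
generate ln_population = ln(population)
twoway (scatter ln_undergrads ln_population) (lfit ln_undergrads ln_population)

* Least-squares fit; the output reports the coefficients and R-squared
regress undergrads population
```

Overlaying scatter and lfit in one twoway call draws the fitted line on the same axes as the data, which makes unusual points easier to spot.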

 
