Part B: How to download data and set up a new data set in Stata
ADD NEW VARIABLE
- Go to: http://data.worldbank.org/
- Have a look at the page and choose: Indicators
- Go to the group of indicators titled Economy & Growth and choose: GDP (current US$)
- Choose: Download data > EXCEL
- Save the new file to your desktop as gdp_original
- Select and delete rows 1-3, then do the same for columns B through AR, so that only the country names and the years 2000-2017 (i.e. 18 years) remain
- Save the file as: gdp
- Select the header cells B1-S1, which contain the years 2000-2017
- Press Ctrl+F to open the Find and Replace window
- Find: 20; Replace with: the new variable name followed by 20 (say, gdp20); then choose: Replace All
- Save the file as: gdp
- Open Stata and use the menu: File > Import > Excel spreadsheet. Browse to the file, tick "Import first row as variable names", and click OK (or select the whole table, copy it, and paste it into the Data Editor of a new Stata file).
- Make sure the option stating that the first row contains variable names is selected.
- Open the Data Editor in Stata, then open the menu: Data > Variables Manager.
- Select all the variables in the list on the left except the first. On the right, next to Format, click Create and set the total digits to 15, to increase the number of digits that Stata displays. Alternatively, you can type the command: format %15.0g gdp2000-gdp2017 in the Command window.
- Create an index variable: generate cnameid=_n
- Rename the variable CountryName by typing the command: rename CountryName cname
- Place the variable cnameid before the variable cname: order cnameid, before(cname)
- Add a label to the variable cnameid: label variable cnameid "Country id"
- Save the stata file as gdp
- Open the Data Editor and have a look at your dataset.
- Reshape your data from wide to long with the command: reshape long gdp, i(cname cnameid) j(year)
- Open the Data Editor and have a look again.
- Sort your dataset by cnameid and year: sort cnameid year
- Save the stata file as gdp_final
- Open the data editor and have a look again.
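The Stata commands in the steps above can also be collected in a do-file. The following is only a minimal sketch: instead of the imported World Bank spreadsheet it types in a tiny made-up table (the countries Alpha and Beta and all numbers are hypothetical, and only three years are used), so that it runs on its own.

```stata
* Sketch only: made-up wide data standing in for the imported spreadsheet
clear
input str12 CountryName gdp2000 gdp2001 gdp2002
"Alpha" 100 110 120
"Beta"  200 210 220
end

format %15.0g gdp2000-gdp2002               // show up to 15 digits
generate cnameid = _n                       // index variable, one per country
rename CountryName cname
order cnameid, before(cname)
label variable cnameid "Country id"

* Wide to long: one row per country and year, values stored in gdp
reshape long gdp, i(cname cnameid) j(year)
sort cnameid year
list, sepby(cnameid)
```

After the reshape, each country appears once per year, and the year taken from the old variable names (gdp2000, gdp2001, ...) is stored in the new variable year.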
ADD NEW VARIABLE
- Go to: http://data.worldbank.org/
- Go to the group of indicators titled Economy & Growth and choose: Foreign direct investment, net inflows (BoP, current US$)
- Choose: Download data > EXCEL
- Save the new file to your desktop as fdi_original
- Select and delete rows 1-3, then do the same for columns B through AR, so that only the country names and the years 2000-2017 (i.e. 18 years) remain
- Save the file as: fdi
- Select the header cells B1-S1, which contain the years 2000-2017
- Press Ctrl+F to open the Find and Replace window
- Find: 20; Replace with: the new variable name followed by 20 (say, fdi20); then choose: Replace All
- Save the file as: fdi
- Open Stata and use the menu: File > Import > Excel spreadsheet. Browse to the file, tick "Import first row as variable names", and click OK (or select the whole table, copy it, and paste it into the Data Editor of a new Stata file).
- Make sure the option stating that the first row contains variable names is selected.
- Open the Data Editor in Stata, then open the menu: Data > Variables Manager.
- Select all the variables in the list on the left except the first. On the right, next to Format, click Create and set the total digits to 15, to increase the number of digits that Stata displays. Alternatively, you can type the command: format %15.0g fdi2000-fdi2017 in the Command window.
- Create an index variable: generate cnameid=_n
- Rename the variable CountryName with: rename CountryName cname
- Place the variable cnameid before the variable cname: order cnameid, before(cname)
- Add a label to the variable cnameid: label variable cnameid "Country id"
- Save the stata file as fdi
- Open the Data Editor and have a look at your dataset.
- Reshape your data from wide to long with the command: reshape long fdi, i(cname cnameid) j(year)
- Open the Data Editor and have a look again.
- Sort your dataset by cnameid and year: sort cnameid year
- Save the Stata file as fdi_final (pay attention: you must save the file in the same folder where you saved gdp_final.dta).
- Open the data editor and have a look again.
- Close the fdi_final.dta
- Open the stata file gdp_final
- To merge the data from fdi_final.dta into the gdp_final.dta dataset, give the command: merge 1:1 cname year using fdi_final
- Put the year variable at the beginning: order year cnameid, before(cname)
- Sort your dataset by cnameid and year: sort cnameid year
- Delete the unnecessary variable: drop _merge
- Label the variable gdp: label variable gdp "Gross domestic product"
- Label the variable fdi: label variable fdi "Foreign direct investment"
- Save the new file as: merged_data
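The merge steps above can be tried out on a small scale before running them on the real files. The sketch below is an illustration under stated assumptions: both datasets are made up (hypothetical countries Alpha and Beta, invented values), and the FDI data are kept in a temporary file (fdi_toy, a name introduced here) rather than in fdi_final.dta.

```stata
* Sketch only: made-up long-format FDI data, saved to a temporary file
clear
input str12 cname year fdi
"Alpha" 2000 5
"Alpha" 2001 6
"Beta"  2000 7
"Beta"  2001 8
end
tempfile fdi_toy
save `fdi_toy'

* Made-up long-format GDP data, playing the role of gdp_final.dta
clear
input str12 cname cnameid year gdp
"Alpha" 1 2000 100
"Alpha" 1 2001 110
"Beta"  2 2000 200
"Beta"  2 2001 210
end

* Each cname-year pair appears exactly once in each file, hence 1:1
merge 1:1 cname year using `fdi_toy'
tabulate _merge                             // check the match before dropping
drop _merge
order year cnameid, before(cname)
sort cnameid year
label variable gdp "Gross domestic product"
label variable fdi "Foreign direct investment"
list, sepby(cnameid)
```

It is worth looking at the tabulation of _merge before dropping the variable: with real data, any observations that are not matched usually point to countries or years that appear in only one of the two files.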
Before running panel data analysis (please leave these steps for the end of the semester and after the completion of our lecture about panel data analysis)
- Tell Stata that you are using panel data by giving the command: xtset cnameid year
- Run: xtreg gdp fdi, fe r
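As a rough illustration of these two commands, the sketch below declares a tiny made-up panel and runs a fixed-effects regression on it. The three countries and all values are hypothetical, so the estimates mean nothing; the option vce(robust) is written out on the assumption that the r in the step above stands for robust standard errors.

```stata
* Sketch only: made-up panel of 3 countries observed over 4 years
clear
input cnameid year gdp fdi
1 2000 100 5
1 2001 112 6
1 2002 125 8
1 2003 140 9
2 2000 200 7
2 2001 205 7.5
2 2002 230 9
2 2003 260 12
3 2000 50 1
3 2001 61 2
3 2002 58 2
3 2003 80 4
end

* Declare the panel structure: cnameid identifies countries, year is time
xtset cnameid year

* Fixed-effects (within) regression of gdp on fdi with robust standard errors
xtreg gdp fdi, fe vce(robust)
```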
Find out more useful information at: http://dss.princeton.edu/training/