***************************************************.
*** This file prepares datasets for use in the practical session:
***** Census Programme Workshop on Spatial and Social Classification .
***** LEEDS, 8 June 2010 .
***** Practical session on the GESDE services .
***** Prepared by Paul Lambert, University of Stirling.
********************************************************.

** PRELIMINARIES: YOU'LL NEED TO EDIT THESE TO APPROPRIATE PATHS ON YOUR MACHINE .
**.
******.

define !pathlfs1 () 'C:\data\lfs\teaching\' !enddefine.
* LFS teaching dataset: UK Data Archive Study Number 4736 .

define !pathess1 () 'C:\data\ess\2002\' !enddefine.
* European Social Survey, Round 1 dataset, available from http://www.europeansocialsurvey.org/ .

define !pathghs1 () 'C:\data\ghs\time_series\' !enddefine.
* General Household Survey Time Series datasets, UK Data Archive Study Number 5664 .

define !pathbh () 'C:\data\bhps\bh11709\' !enddefine.
* British Household Panel Survey, UK Data Archive Study Number 5151.

*** You'll also need 'path1' (for derived data) and 'path9' (for temporary data) .
*** as defined on the workshop analysis file itself.

***************************************************************************.
***************************************************************************.
***************************************************************************.
*** THE EXTRACT FILE FROM THE 2002 LFS, USED IN EXAMPLE 1.1(i).

import file=!pathlfs1+"lfs2002.por".
fre var=region quart .
select if (region=10 & quart=2) .
fre var=nation.
delete variables nation cry01 region quart .
fre var=sex soc2km manage status nstat nsecm .

* Derive employment status: 'manage' sets the employee categories first, .
* then 'nsecm' overrides them for the self-employed categories.
compute ukempst = 0 .
if (manage=3) ukempst=7 .
if (manage=2) ukempst=6.
if (manage=1) ukempst=4.
if (nsecm=1) ukempst=1.
if (nsecm=8.1 | nsecm=8.2) ukempst=2.
if (nsecm=9.1 | nsecm=9.2 ) ukempst=3.
* Note - This derivation is imperfect: full self-employment detail cannot be recovered from NS-SEC .
*      - categories 3.3, 3.4, 4.3 and 4.4 could all be empst 1, 2 or 3.
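* Optional check (a sketch added for illustration, not part of the original workshop file) .
* Cross-tabulating the two source variables against the derived 'ukempst' should show .
* which 'manage' and 'nsecm' codes feed each derived category, and how many cases .
* remain at 0 (Unknown). It assumes 'manage', 'nsecm' and 'ukempst' exist as created above.
crosstabs tables=manage nsecm by ukempst.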
variable label ukempst "Employment status (derived from 'manage' and 'nsecm')".
add value labels ukempst 0 "Unknown" 1 "Self-employed with 25 or more employees"
   2 "Self-employed with fewer than 25 employees" 3 "Self-employed without employees"
   4 "Manager" 6 "Supervisor" 7 "Employee" .
fre var=ukempst.

descriptives var=all.
sav out=!path1+"lfs_2002extract.sav" .

***************************************************************************.
***************************************************************************.
*** THE EXTRACT FILE FROM THE EUROPEAN SOCIAL SURVEY, USED IN EXAMPLE 1.1(ii).

import file=!pathess1+"ESS1e06_1.por".
fre var=cntry .
select if (cntry='CH' | cntry='CZ' | cntry='IE' | cntry='GB' | cntry='HU'
         | cntry='PL' | cntry='PT' | cntry='SE' | cntry='SI').
fre var=cntry.

fre var=emplrel.
compute empstat=emplrel.
recode empstat (1=1) (2,3=2) (else=-999).
fre var=empstat.
missing values empstat (-999).
fre var=empstat.

fre var=jbspv njbspv.
compute supvn=-999.
if (jbspv=2) supvn=0.
if (jbspv=1 & njbspv ge 0 & njbspv le 10) supvn=1.
if (jbspv=1 & njbspv ge 11 & njbspv lt 8000) supvn=11.
fre var=supvn.

fre var=emplrel.
compute stdempst=emplrel.
* Assumes the ESS coding of 'emplrel': 1=employee, 2=self-employed, 3=working for own family business.
recode stdempst (1=6) (2=2) (3=5) (else=0).
add value labels stdempst 0 "Unknown" 2 "Self-employed" 5 "Family worker" 6 "Employee" .

select if (stfeco=0 | stfeco=8 | stfeco=9 | stfeco=10).
fre var=gndr.
select if (gndr=1 | gndr=2).

fre var=tvtot trstplt imbghct hincfel stfeco .
cro tvtot trstplt imbghct hincfel stfeco by cntry.

descriptives var=cntry gndr iscoco emplrel jbspv njbspv stdempst empstat supvn
   tvtot trstplt imbghct hincfel stfeco .
sav out=!path1+"ess_2001extract.sav"
   /keep=cntry gndr iscoco stdempst empstat supvn emplrel jbspv njbspv
         tvtot trstplt imbghct hincfel stfeco .

********************************************************.
***************************************************************************.
*** THE EXTRACT FILE FROM THE BHPS EDUCATIONAL DATA, USED IN EXAMPLE 1.2 (i).
** Retrieve age-left-school data from cross-wave file: .
get file=!pathbh+"xwavedat.sav" /keep=pid scend.
sort cases by pid.
sav out=!path9+"m1.sav".

** Retrieve highest qualifications data (and job info) from individual file (at wave q).
get file=!pathbh+"qindresp.sav" /keep=pid qage qsex qqfedhi qqfachi qqfvoc qjbcssm .
sort cases by pid.

** Merge the files and save the combined data.
match files file=* /in=waveq /file=!path9+"m1.sav" /by=pid.
fre var=waveq.
select if waveq=1.
fre var= qqfedhi qqfachi qqfvoc .
sav out=!path1+"bhps_educ_extract.sav" /drop=waveq.

**********************.
***************************************************************************.
***************************************************************************.
*** THE EXTRACT FILE FROM THE GHS TIME SERIES DATA, USED IN EXAMPLE 1.1(ii).

get stata /file=!pathghs1+"ghs72-04_mar07_esds.dta".
*compute randomvar=trunc(rv.uniform(1,100)). /* Could be used to reduce sample size */.
*descriptives var=randomvar.
*select if (randomvar >= 1 & randomvar <= 10).
descriptives var=all.
fre var=year pcountry.
select if (page >= 18 & page <= 65 & psex >= 1 & psex <= 2 & pcountry=1).
fre var=pcountry.
descriptives var=year pcountry pweightc page psex pcob1 pfcob1 pmcob1 pedfull degree
   pgenhlth pdoctalk pcigsmk1 pcigsmk .
sav out= !path1+"ghs7204_extract.sav"
   /keep=year pcountry pweightc page psex pcob1 pfcob1 pmcob1 pedfull degree
         pgenhlth pdoctalk pcigsmk1 pcigsmk .

******************************************************************************************.
********************************************************.
***************************************************************************.
*** THE EXTRACT FILE FROM THE BHPS ETHNICITY DATA, USED IN EXAMPLE 1.3.

* Retrieve data.
get file=!pathbh+"xwavedat.sav" /keep= pid sex race racel plbornc memorig.
sort cases by pid.
save out=!path9+"m1.sav".

get file=!pathbh+"qindresp.sav" /keep= pid qhid qopfamc qqfedhi qxrwtuk1 qage qfimn qfihhmn .
sort cases by pid.
match files file=* /in=waveq /file=!path9+"m1.sav" /by=pid.
fre var=waveq.
select if waveq=1.
descriptives var=all.

fre var= memorig qage sex .
select if (qage >= 15 & qage <= 150).

descriptives var= qfimn qfihhmn.
missing values qfimn qfihhmn (-9 thru 0).

missing values qqfedhi (-9, -7, 13).
fre var=qqfedhi.
compute degree=(qqfedhi=1 | qqfedhi=2).
compute diploma=(qqfedhi >= 3 & qqfedhi <= 5).
compute lowlev=(qqfedhi=9 | qqfedhi=12).
fre var=qqfedhi degree diploma lowlev.

sav out=!path1+"gemde_bhps_extract.sav".

**********************************************************************.