*******************************************************
*******************************************************
******* GEMDE WORKSHOP, 28 JANUARY 2010
**
******* ILLUSTRATIVE EXAMPLE OF DATA FOR MUGS AND MIRS: BHPS WAVE Q DATA
**
** This lab exercise prepared by Paul Lambert, University of Stirling
**
**** GEMDE is a product of the ESRC funded DAMES Research Node, www.dames.org.uk, hosted at
**** University of Stirling and National e-Science Centre, University of Glasgow
*******************************************************
*******************************************************


****
** 0) Preliminaries:

clear
set mem 150m
global path1 "c:\gemde\lab\data\" /* Path where the BHPS data extract we've supplied to you is found */
global path2a "c:\gemde\lab\mugs\2\" /* Path for existing and newly created MUGs */
global path2b "c:\gemde\lab\mirs\2\" /* Path for existing and newly created MIRs */
global path5 "c:\gemde\lab\macros\" /* Path for macros called by this do file */
global path9 "c:\temp\" /* Path for saving temporary copies of files */

*******************************************************
*******************************************************
*** 1) Open the data and explore the information on ethnicity and religion
***

use $path1\gemde_bhps_extract.dta, clear

tab1 race racel, missing
* 14909 people in this extract: most but not all have data for 'race' (close to the 1991 census question)
*   and most but not all for 'racel' (close to the 2001 census question).
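* A possible extra check (a sketch, not part of the original lab): cross-tabulate
*   whether each measure holds a valid value. This assumes that valid categories are
*   coded as positive integers and that missing data are negative codes or system
*   missing; 'has_race' and 'has_racel' are illustrative names, dropped again afterwards.
capture drop has_race has_racel
gen has_race = (race > 0 & race < .)
gen has_racel = (racel > 0 & racel < .)
tab has_race has_racel
drop has_race has_racel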
bysort race: tab racel, missing
* Some have data on one but not the other; some have conflicting values between questions

*******************************************************
*******************************************************


*******************************************************
*******************************************************
*** 2) Define a new MUG, and document it:

* For analytical purposes, we'd typically recode ethnicity measures before analysis,
*   combining together sparse categories

**** i) Class-led example:

* Exploring the data for patterns / possible recodes
tab racel memorig
summarize qage
table racel, c(mean qage n qage)

capture drop eth2
gen eth2=racel
recode eth2 1/4=1 6/17=2 5 18=3 *=.m /* Recodes to three groups */
replace eth2=3 if (racel==2 & memorig ~= 7) /* Makes 'white irish' a minority only if not in NI sample */
capture label drop eth2l
label define eth2l 1 "White UK" 2 "Black or Asian" 3 "Other white/other"
label values eth2 eth2l
numlabel eth2l, add
tab eth2 , missing
tab racel eth2
* => We've created a new, 3 category MUG which is intended to differentiate ethnic groups in Britain
*   in terms of typical age profiles
label save eth2l using $path2a\bhps_3_category_labels.do, replace

**** ii) Try a different derivation yourself:
* (Remove the *'s and edit the text)

*capture drop eth3
*gen eth3=racel
*recode eth3 [INSERT VALUES HERE ]*=.m
*capture label drop eth3l
*label define eth3l [INSERT LABELS HERE]
*label values eth3 eth3l
*numlabel eth3l, add
*tab eth3, missing
* => The MUG you've created is summarised in the list of value labels.
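* To see that summary directly (a sketch, not part of the original lab), the value
*   labels can be listed. 'eth2l' is the label set defined in the class-led example;
*   substitute 'eth3l' if you completed the exercise above.
label list eth2l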

*******************************************************
*******************************************************


*******************************************************
*******************************************************
*** 3) Exploit an existing MIR to link the two MUGs 'race' and 'racel'

** => At GEMDE, I've deposited a Stata macro for 'harmonising' the 'race' and 'racel' measures
**   (It's visible at: C:\gemde\lab\macros\bhps_ethnicity_combined.do)

tab1 race racel, missing

** Tasks of the macro: follow the advice recommended by ONS on harmonising ethnicity,
*   plus take advantage of the household clustering in BHPS to impute race for those with
*   missing values on both variables but with valid values for some household sharers

do "$path5\bhps_ethnicity_combined.do"
xethbhps race racel qhid xeth xethh

** => We've now created two new variables 'xeth' and 'xethh' which are 'harmonised' measures
describe xeth xethh
numlabel eth_ons3, add
tab1 xeth xethh

*******************************************************
*******************************************************


*******************************************************
*** 4) Create a new MIR summarising aggregate statistical data on minority groups

*** Example 4.1: An analysis of average attitudes by ethnic group

* Review ethnicity and attitudes
tab1 race racel qopfamc sex
capture drop qopfamc2
clonevar qopfamc2=qopfamc
recode qopfamc2 -9=.m -7=.p -1=.n 5=1 4=2 3=3 2=4 1=5 *=.e
capture label drop attrevl
label define attrevl 1 "Strongly disagree" 2 "Disagree" 3 "Neither agree, disagree" 4 "Agree" 5 "Strongly agree"
label values qopfamc2 attrevl
numlabel _all, add
tab qopfamc2, missing
table xethh sex , c(mean qopfamc2 n qopfamc2 )
table xethh sex [aw=qxrwtuk1], c(mean qopfamc2 n qopfamc2 )

** Now prepare derived statistics from this summary: average attitudes by ethnic group
sav $path9\temp1.dta, replace

* All adults: weighted mean attitude, plus unweighted counts of men and women
use $path9\temp1.dta, clear
gen men=(sex==1)
gen fem=(sex==2)
collapse (mean) qopf_all=qopfamc2 (rawsum) nmen=men nfem=fem [aw=qxrwtuk1] , by(xethh)
sort xethh
sav $path9\m1.dta, replace

* Men only
use $path9\temp1.dta, clear
keep if sex==1
collapse (mean) qopf_men=qopfamc2 [aw=qxrwtuk1] , by(xethh)
sort xethh
sav $path9\m2.dta, replace

* Women only
use $path9\temp1.dta, clear
keep if sex==2
collapse (mean) qopf_fem=qopfamc2 [aw=qxrwtuk1] , by(xethh)
sort xethh
sav $path9\m3.dta, replace

* Merge the three summaries into one file, one row per ethnic group
use $path9\m1.dta, clear
sort xethh
merge xethh using $path9\m2.dta
tab _merge
drop _merge
sort xethh
merge xethh using $path9\m3.dta
tab _merge
drop _merge

keep if xethh >= 1 & xethh <= 11
label variable qopf_all "All adults, agreement that 'A woman and her family would all be happier if she goes out to work' [weighted]"
label variable qopf_men "Men, agreement that 'A woman and her family would all be happier if she goes out to work' [weighted]"
label variable qopf_fem "Women, agreement that 'A woman and her family would all be happier if she goes out to work' [weighted]"
label variable nmen "Unweighted number of male respondents"
label variable nfem "Unweighted number of female respondents"
codebook, compact
label data "Ethnic group averages: agreement that it is better for women to work, BHPS w17"
sav $path2b\bhps_qopfamc.dta, replace

format qopf_men %8.4g
format qopf_fem %8.4g
describe, short
list xethh qopf_men qopf_fem nmen nfem

graph hbar (mean) qopf_men qopf_fem, ///
   over(xethh) legend(order(1 2) label(1 "Men") label(2 "Women") ) ///
   bar(1, bcolor(gs8)) bar(2, bcolor(gs13)) ///
   subtitle("Agree that 'A woman and her family would all be happier if she goes out to work' ", span) ///
   note("Source: BHPS, 2007. Weighted using UK national weights", span)

*******************************************************
*******************************************************


*******************************************************
*******************************************************
*** Example 4.2: A SOR model

use $path1\gemde_bhps_extract.dta, clear
do "C:\gemde\lab\macros\bhps_ethnicity_combined.do"
xethbhps race racel qhid xeth xethh
numlabel eth_ons3, add
tab1 xeth xethh

summarize qfimn qfihhmn
tab1 qqfedhi sex qage qopfamc
tab1 degree diploma lowlev, missing
capture drop fem
gen fem=sex==2
recode qopfamc -9=.m -7=.p -1=.n
tab1 fem qopfamc
tab1 xethh
summarize fem qage degree diploma lowlev qfimn qfihhmn qopfamc

slogit xethh fem qage degree diploma lowlev qfimn qfihhmn qopfamc , baseoutcome(4)
* (Generates SOR scores summarising socio-economic/demographic measures, parameterised around the
*   difference between Pakistani (0) and White (1) groups)
* The score dimensions are most influenced by age, income, and gender attitudes.
* The commands below export them into a plain text data file.
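* A sketch only (not part of the original lab): the extraction below picks out the
*   scale parameters by position (rows 9 to 18), which relies on the model having
*   eight predictors. Displaying the parameter names first is one way to confirm that
*   those rows hold what is expected. The names 'bnames' and 'cnames' are introduced
*   here for illustration.
matrix bnames = e(b)
local cnames : colfullnames bnames
display "`cnames'"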
matrix list e(b)
matrix temp=e(b)'
matrix sors=temp[9..18,1]
matrix input eth=(1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 4)
matrix eth2=eth'
matrix list sors
matrix list eth2
* (svmat appends the column number to the stub, so the new variables are 'sor_score1' and
*   'ethnic1'; the shorter names used below are resolved by Stata's variable abbreviation)
svmat sors, names(sor_score)
svmat eth2, names(ethnic)
list sor_score ethnic in 1/11
replace sor_score = 0 if ethnic==4
keep if ethnic >= 1 & ethnic <= 11
keep sor_score ethnic
summarize
label values ethnic eth_ons3
list
label data "Socio-demographic SOR scores for UK Ethnic Group (using BHPS analysis of educational attainment, gender and age)"
sav $path2b\bhps_ethnic_group_SOR_scores.dta, replace
gen eth2=ethnic
outsheet eth2 ethnic sor_score using $path2b\bhps_ethnic_group_SOR_scores.dat, replace

gen one=1
capture drop pos
gen pos=3
replace pos=9 if ethnic==2 | ethnic==6
graph twoway (scatter sor_score one , mcolor(dkgreen) mlabel(ethnic) mlabcolor(gs6) mlabv(pos) ) , ///
   ytitle(" ") xscale(off) ylabel(,nogrid) ///
   title("SOR model dimension scores for BHPS ethnic groups", span) ///
   subtitle("Identified principally by age, gender attitudes and household income", span) ///
   note("Source: BHPS wave 17, n = 12626, % 'White' = 97.3 ")

*******************************************************
*******************************************************


*******************************************************
**** Example 4.3: Try creating one yourself...


*******************************************************
** EOF
*******************************************************
*******************************************************