Project Themes

The work of DAMES involves developing four groups of social science provisions. These comprise work on:

alongside three more specialist topics in social science research:

All of these projects are concerned, in different ways, with tasks of 'data management'. This term is sometimes used for different purposes, but in the DAMES Node we use it to refer to activities, typically undertaken by social researchers themselves, concerned with manipulating data. Examples include checking data for inconsistencies (‘cleaning data’); linking together different data files; and coding or recoding measures (‘operationalising variables’). Such tasks are a substantial and often challenging part of social research.
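As a rough illustration of the three kinds of task named above (cleaning, linking and recoding), the following Python sketch works on a tiny invented dataset. All records, field names, occupation codes and class labels are hypothetical and chosen only for the example; it is not a description of any DAMES service.

```python
# Hypothetical survey records; all field names and codes are illustrative only.
respondents = [
    {"id": 1, "age": 34, "occupation_code": "2310"},
    {"id": 2, "age": -5, "occupation_code": "9110"},  # inconsistent age value
    {"id": 3, "age": 51, "occupation_code": "2310"},
]

# Hypothetical second data file, keyed by respondent id.
regions = {1: {"region": "Scotland"}, 3: {"region": "Wales"}}

# Hypothetical recoding scheme from occupation codes to broad classes.
OCCUPATION_TO_CLASS = {"2310": "professional", "9110": "routine"}


def clean(records):
    """Cleaning: keep only records whose age lies in a plausible range."""
    return [r for r in records if 0 <= r["age"] <= 120]


def link(records, lookup):
    """Linking: attach fields from a second file by matching on id."""
    return [{**r, **lookup.get(r["id"], {})} for r in records]


def recode(records, scheme):
    """Recoding: derive a new variable from an existing one."""
    return [
        {**r, "social_class": scheme.get(r["occupation_code"], "unclassified")}
        for r in records
    ]


prepared = recode(link(clean(respondents), regions), OCCUPATION_TO_CLASS)
```

In practice each of these steps usually involves many more checks and judgement calls than shown here, which is part of why such tasks are time-consuming.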

These four social science themes run alongside four groups of computer science research activities which focus on using e-Science approaches to deliver these provisions. Administratively, the Node therefore has 8 research themes, as follows:

Social science themes:
1.1) Grid Enabled Specialist Data Environments
1.2) Data Resources for Micro-Simulation on Social Care Data
1.3) Linking e-Health and Social Science Databases
1.4) Training and Interfaces for Management of Complex Survey Data

Computer science themes:
2.1) Description, discovery and use of data and services through use of metadata and data abstraction
2.2) Techniques to handle data from multiple sources
2.3) Workflow modelling for social science
2.4) Security driven data management

Below we provide some further details on the 8 research themes of DAMES.


Meanings of the term 'data management'.



Theme 1.1: Grid Enabled Specialist Data Environments

This theme (which we sometimes abbreviate as GE*DE, pronounced 'Geode') is in many ways a direct extension of a previous project which involved most of the same researchers, called 'Grid Enabled Occupational Data Environment' (GEODE). This project was funded by an ESRC Small Grant in e-Social Science between October 2005 and March 2007. That project's website is now updated as part of the DAMES Node: http://www.geode.stir.ac.uk/.

The GE*DE theme deals with specialist social science data on occupations, on educational qualifications, and on ethnicity. Our basic argument is that each area has a substantial body of specialist social science data, for example data on how to use occupations to classify people into a social class scheme. Many datasets on these topics are potentially of use to a wide group of social scientists. However, some degree of specialist knowledge - and manual effort! - is frequently required before a researcher can effectively access and exploit relevant specialist data. Therefore, in DAMES, we are interested in facilitating access to, and the distribution of, such specialist data.

The precise way we will provide services in these areas will vary by each topic, reflecting slightly different types of data and different user requirements. A provisional description of our likely activities can be found in some slides from one of our sessions at the ESRC NCRM Oxford Research Methods Festival (slides on theme 1.1 - ppt).

The lead in Theme 1.1 is taken by Paul Lambert and Vernon Gayle (see Personnel). Paul Lambert's research background is particularly focussed upon data on occupations and on ethnicity; Vernon Gayle has worked for many years on projects concerned with data on educational qualifications.

Theme 1.2: Data Resources for Micro-Simulation on Social Care Data

Theme 1.2 is concerned with linking together data from different sources which is relevant to the analysis of social care needs in the UK - with a particular focus upon analysis through 'micro-simulation modelling'. The idea here is that different forms of data (which are available under different levels of security restriction) can be linked for analytical purposes (often dynamically).

Theme 1.2 has a very specific focus which will contribute to research on social care needs and demographic trends - it should benefit research on the costing of social service interventions for older populations, and their effectiveness. However, it is also hoped that the model for data linkage developed in this theme will grow to be instructive to a wider range of application areas.

A significant activity in this Theme is about finding approaches to effectively link together different types of data which can inform the same analyses. This will include:

The lead in Theme 1.2 is taken by Alison Bowes, David Bell, Alison Dawson and Ken Turner (see Personnel). David Bell (Dept Economics) led a recent project on micro-simulation (see OPERA). Alison Bowes and Alison Dawson (Dept Applied Social Science) have backgrounds in the collection, review and analysis of different forms of social care data. Ken Turner has worked on various projects concerned with data on daily living.

Theme 1.3: Linking e-Health and Social Science Databases

The primary focus of this theme is on the topic of health inequalities and how social science data can be utilised in conjunction with e-Health data and projects to monitor and enhance understanding of health inequalities across a wide spectrum of clinical, biomedical and health related fields.

There is a great deal of e-Health data relevant to this sort of enquiry. The main challenge in this theme concerns methods of access to relevant data that satisfy the various security conditions applying to confidential personal data.

Theme 1.3 is likely to begin by focussing specifically on certain examples in the analysis of health data, but in so doing it aims to develop much wider ranging services relevant to linking and enhancing secure data resources.

The lead in Theme 1.3 is taken by Margaret Maxwell and Richard Sinnott (see Personnel). Margaret Maxwell (Dept Applied Social Science) has a background in health inequalities, and Richard Sinnott (National e-Science Centre) in security infrastructures, and in applications using bio-medical data.

Theme 1.4: Training and Interfaces for Management of Complex Survey Data

This theme involves programmes of training activities, and the development of generic services, for data management of complex social survey data.

Theme 1.4 is more wide ranging than the other social science themes. Its generic provisions are designed to complement and generalise the provisions developed under themes 1.1, 1.2 and 1.3.

Theme 1.4 is led by Vernon Gayle and Paul Lambert (see Personnel). As much as possible, it is intended that activities and services from this theme will be coordinated with other major UK led capacity building activities, such as programmes within the ESRC NCRM and RDI initiatives, and other programmes which Vernon Gayle and Paul Lambert participate in - the projects Longitudinal Data Analysis for Social Science Researchers; Scottish Social Survey Network; and the Lancaster-Warwick-Stirling NCRM Node.

Theme 2.1: Description, discovery and use of data and services through use of metadata and data abstraction

This theme will address the challenges involved in providing easy but secure access to distributed heterogeneous data resources. An important consideration is to develop approaches which are compatible with the standards adopted by major social science data providers, such as the UK Data Archive.

The theme has specified work-packages covering metadata support; data abstraction; semantically-based data discovery; and data usability.

The lead in Theme 2.1 is taken by Ken Turner, Jesse Blum and Guy Warner (see Personnel). Slides from an introductory talk on metadata which were prepared by Jesse Blum for a social science audience are available here (ppt).

Theme 2.2: Techniques to handle data from multiple sources

This research theme arises from the observation that social science datasets are often distributed, disaggregated and uncoordinated. Specialist data such as is examined in Themes 1.1, 1.2 and 1.3 is often stored in differing formats, with differing metadata descriptions, requiring differing access techniques. Where such datasets hold related data, the social science researcher faces considerable challenges in extracting the information that they require from different sources, and merging the data into a uniform body. This theme investigates ameliorating this problem through providing grid services that "virtually fuse" disparate data sources in order to answer research questions through uniform query processing.
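The idea of "virtually fusing" sources can be sketched, in a much simplified form, as a set of adapters that present differently formatted data in one common shape, so that a single query runs across all of them without first merging the files. The Python sketch below is only an illustration under that assumption: the two sources, their field names and the `query` function are hypothetical, and real grid services would also need to handle remote access, metadata and security.

```python
import csv
import io
import json

# Two hypothetical sources holding related data in differing formats.
CSV_SOURCE = "person_id,qualification\n1,degree\n2,none\n"
JSON_SOURCE = (
    '[{"pid": 3, "highest_qual": "degree"}, {"pid": 4, "highest_qual": "school"}]'
)


def read_csv_source(text):
    """Adapter: yield CSV rows in a common {id, qualification} shape."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"id": int(row["person_id"]), "qualification": row["qualification"]}


def read_json_source(text):
    """Adapter: yield JSON items in the same common shape."""
    for item in json.loads(text):
        yield {"id": item["pid"], "qualification": item["highest_qual"]}


def query(predicate):
    """Run one query over every source through its adapter."""
    sources = (read_csv_source(CSV_SOURCE), read_json_source(JSON_SOURCE))
    return [record for source in sources for record in source if predicate(record)]


graduates = query(lambda record: record["qualification"] == "degree")
```

The researcher writes the query once, against the common shape; the differences in format and field naming are confined to the adapters.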

Work-packages within this theme concern 'data abstraction techniques', 'data fusion techniques', and 'query processing'. The lead in Theme 2.2 is taken by Simon Jones and Guy Warner (see Personnel).

Theme 2.3: Workflow modelling for social science

'Workflows' are often examined within an e-Science framework as an approach which allows for the recording of patterns and processes within research activities, and building services in response to those patterns. In this theme, work will look at describing and supporting workflow models appropriate to social science tasks of data management. Examples might include the sequential steps involved in the manipulation then analysis of social survey datasets; or the progression from data access, to linking together datasets, to undertaking analyses on the new linked dataset.
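To make the notion of a recorded sequence of steps more concrete, here is a minimal Python sketch in which a workflow is simply an ordered list of named steps, and running it keeps a log of what was done. The step names, the sample data and the `run_workflow` helper are all hypothetical; this is not the workflow model developed in the theme.

```python
def run_workflow(steps, data=None):
    """Apply each named step in order, logging the step name and row count."""
    log = []
    for name, step in steps:
        data = step(data)
        log.append((name, len(data)))
    return data, log


def access(_):
    """Step 1: obtain the (here, hard-coded and invented) survey data."""
    return [
        {"id": 1, "income": 200},
        {"id": 2, "income": None},
        {"id": 3, "income": 400},
    ]


def drop_missing(rows):
    """Step 2: manipulate the data by removing rows with missing income."""
    return [row for row in rows if row["income"] is not None]


def analyse(rows):
    """Step 3: analyse the prepared data (a simple mean)."""
    mean = sum(row["income"] for row in rows) / len(rows)
    return [{"mean_income": mean}]


steps = [("access", access), ("drop_missing", drop_missing), ("analyse", analyse)]
result, log = run_workflow(steps)
```

Because the steps are held as data rather than buried in a script, the same sequence can be inspected, repeated on a new dataset, or shared with another researcher.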

The lead in Theme 2.3 is taken by Ken Turner, Simon Jones, Larry Tan and Guy Warner (see Personnel).

Theme 2.4: Security driven data management

This work will build on Themes 2.1, 2.2 and 2.3 and focus on supporting social science scenarios requiring finer grained security. This will leverage a range of technology-oriented projects at NeSC Glasgow such as SPAM-GP, VPman, DyVOSE and GLASS, combined with a range of clinical, epidemiological and geographic information system research projects at NeSC Glasgow such as VOTES, SFHS and EuroDSD, where linkage with social data could greatly benefit research capacity.

Ultimately our focus in this theme is to draw the results of the other themes together and demonstrate the added value offered by the DAMES infrastructure. The key areas of added value we will focus on are far richer linkage of data resources and services for the social science community, and the usability of the DAMES infrastructure for accessing and sharing data resources.

Theme 2.4 is led by Richard Sinnott, John Watt and Susan McCafferty (see Personnel).

Under construction, last update: 31/07/08