Compiling the International Dataset
The project relied on a comparative parallel survey design: the research teams created the survey for each country collaboratively, devising equivalent questions that allowed the same phenomena to be explored whilst taking account of national context. This design is an alternative to integrated and post-hoc designs. Integrated designs emphasise the collaborative development of near-identical questions and thus tend to focus on structures and practices that lend themselves more readily to transnational comparison (e.g. Cranet). Post-hoc comparative designs attempt to align questions from similar national surveys at the point of data analysis (e.g. WERS in Britain), which constrains the scope of comparisons. Our design retained the advantages of the integrated approach while building in flexibility. Thus there was one set of questions that were identical across countries and a further set that addressed equivalent phenomena but were phrased differently.
More specifically, the survey instrument contained four sets of questions. The first set, such as the questions relating to country of origin and size levels, was phrased in exactly the same way in each survey. A second set of questions was almost identical, but national teams added or removed response options because of institutional differences or differences in the salience of the question in each country. Data transformation was therefore required to produce equivalent data. A third set of questions was functionally equivalent: the questions addressed the same issue, but because the institutions governing the activity varied across countries, each question was adapted to take this into account. For example, in exploring the influence of unions, the survey instrument needed to reflect the different national arrangements underpinning union presence within firms. Here again data transformations were required. A fourth set of questions was thematically equivalent: the questions addressed the same phenomenon, but because of institutional differences the structures and practices examined were unique to each country. These questions provided valuable national contextual insights that expand upon some of the functionally equivalent and identical data. Considerations of functional equivalence and, for non-core questions, data availability in each national dataset mean that the scope of comparative analysis varies by issue, with some analyses involving two- or three-country rather than four-country comparisons.
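The kind of transformation needed for the almost identical questions can be illustrated with a minimal sketch. It is not the project's actual syntax, which was written in SPSS. The country labels, response codes and category names below are hypothetical and serve only to show how differing national response options might be collapsed into a common set of comparative categories.

```python
# Hypothetical codebook: for each country, map the national response code
# to a common comparative category. All labels and codes are illustrative.
CODEBOOK = {
    "country_a": {1: "none", 2: "some", 3: "some", 4: "all"},  # four national options
    "country_b": {1: "none", 2: "some", 3: "all"},             # three national options
}


def harmonise(country, national_code):
    """Return the comparative category, or None if the code is unmapped."""
    return CODEBOOK.get(country, {}).get(national_code)


# Each response is a (country, national response code) pair.
responses = [("country_a", 3), ("country_b", 3), ("country_a", 9)]
harmonised = [harmonise(country, code) for country, code in responses]
```

In this sketch the same national code (3) yields different comparative categories depending on the country, and an unmapped code is returned as missing rather than guessed.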
Because the research involved a comparative parallel design in which many questions were not identical, integrating the datasets was not straightforward. An international working group undertook substantial and painstaking work to identify the identical and equivalent questions and to produce a code book defining the SPSS transformations required; syntax was then written to convert the original national variables into new comparative variables. The integration of the national data into a single comparative international dataset was centralised to minimise error, with an expert located in the UK undertaking the merging process. Substantial cross-checking was carried out to ensure the integrity of the international data, with subject experts taking responsibility for each section. This had the advantage that data were checked across countries rather than solely within countries. The checking process was extremely time-consuming, involving variable-by-variable checks between the merged and unmerged data and discussion of the conceptual utility of the transformed data.
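The logic of recoding, merging and then checking each variable against the unmerged data can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the SPSS syntax used in the project. The variable names (`union_rec`, `cb_cover`, `union_presence`), country labels and mappings are all hypothetical.

```python
from collections import Counter


def recode(records, country, mapping, source_var, target_var):
    """Convert one national variable into the new comparative variable."""
    return [
        {"country": country, "id": rec["id"], target_var: mapping.get(rec[source_var])}
        for rec in records
    ]


def check_variable(national, merged, country, mapping, source_var, target_var):
    """Compare the frequencies expected from the unmerged national data
    with those observed for that country in the merged data."""
    expected = Counter(mapping.get(rec[source_var]) for rec in national)
    observed = Counter(rec[target_var] for rec in merged if rec["country"] == country)
    return expected == observed


# Hypothetical national datasets with differently named and coded variables.
national_a = [{"id": 1, "union_rec": 1}, {"id": 2, "union_rec": 2}, {"id": 3, "union_rec": 2}]
national_b = [{"id": 1, "cb_cover": "yes"}, {"id": 2, "cb_cover": "no"}]
map_a = {1: "present", 2: "absent"}
map_b = {"yes": "present", "no": "absent"}

# Merge the recoded national data into one comparative dataset.
merged = (
    recode(national_a, "country_a", map_a, "union_rec", "union_presence")
    + recode(national_b, "country_b", map_b, "cb_cover", "union_presence")
)

# Variable-by-variable check, run separately for each country.
checks = {
    "country_a": check_variable(national_a, merged, "country_a", map_a, "union_rec", "union_presence"),
    "country_b": check_variable(national_b, merged, "country_b", map_b, "cb_cover", "union_presence"),
}
```

A frequency comparison of this kind would catch cases lost or duplicated during merging. It would not reveal a conceptually inappropriate mapping, which is why the checks described above also involved discussion among subject experts.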