Appendix 1: methodology for the quantitative component of The Learning Curve
As part of the Learning Curve programme, the Economist Intelligence Unit (EIU) undertook a substantial quantitative exercise to analyse the performance of national education systems in a global context. The EIU set two main objectives for this work: to collate and compare international data on the outputs of national school systems in a comprehensive and accessible way, and to use the results to help set the editorial agenda for the Learning Curve programme.
The EIU was aided by an Advisory Panel of education experts from around the world. The Panel provided advice on the aims, approach, methodology and outputs of the Learning Curve’s quantitative component. Feedback from the Panel was fed into the research in order to ensure the highest level of quality.
The EIU developed three outputs as part of the quantitative component of the Learning Curve: an exhaustive data bank of high-quality national education statistics, an index measuring national cognitive skills and educational attainment, and research on correlations between educational inputs, educational outputs and wider society. Each is described in more detail below.
Learning Curve Data Bank
The Learning Curve Data Bank (LCDB) provides a large, transparent and easily accessible database of annual education inputs and outputs and socio-economic indicators on 50 countries (and one region – Hong Kong) going back to 1990 when possible. It is unique in that its aim is to include data that are internationally comparable. The user can sort and display the data in various ways via the website that accompanies this report.
Country selection to the Data Bank was on the basis of available education input, output and socio-economic data at an internationally comparable level. A particularly important criterion was participation in the international PISA and/or TIMSS tests. Forty countries (and Hong Kong) were included as 'comprehensive-data' countries within the Data Bank, and ten countries as 'partial-data' countries, according to availability of data.
The EIU's aim was to include only internationally comparable data. Wherever possible, OECD data or data from international organisations were used to ensure comparability. For the vast majority of indicators, the EIU refrained from using national data sources and, when possible, used interpolation and extrapolation to fill missing data points. Different estimation methods were used, including regression when found to be statistically significant, linear estimation, averages between regions, and deductions based on other research. The source for each data point is cited in the Data Bank. The data were last collected and/or calculated in September 2012.
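As an illustration of the linear estimation mentioned above, the following is a minimal sketch of filling gaps between two known years with straight-line values. The function name and the figures are hypothetical; the sketch does not reproduce the EIU's actual procedure, which also drew on regression, regional averages and other research.

```python
def interpolate_linear(series):
    """Fill gaps between known years with straight-line estimates.

    `series` maps year -> value. Years lying between two known years
    are filled in; years outside the known range are left untouched.
    """
    known_years = sorted(series)
    filled = dict(series)
    for start, end in zip(known_years, known_years[1:]):
        # Constant annual change between the two known data points.
        step = (series[end] - series[start]) / (end - start)
        for year in range(start + 1, end):
            filled[year] = series[start] + step * (year - start)
    return filled


# Hypothetical pupil-teacher ratios with 2001 and 2002 missing.
observed = {2000: 20.0, 2003: 17.0}
completed = interpolate_linear(observed)
print(completed[2001], completed[2002])
```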
Over 60 indicators are included, structured in three sections: inputs to education (such as education spending, school entrance age, pupil-teacher ratio, school life expectancy and teacher salaries), outputs of education (such as cognitive skills measured by international tests such as PISA, literacy rates, graduation rates, unemployment by educational attainment and labour market productivity) and socio-economic environment indicators (such as social inequality, crime rates, GDP per capita and unemployment). The Data Bank’s indicators were used to create the Index and conduct a correlations exercise.
Global Index of Cognitive Skills and Educational Attainment
The Global Index of Cognitive Skills and Educational Attainment compares the performance of 39 countries and one region (Hong Kong is used as a proxy for China due to the lack of test results at a national level) on two categories of education, cognitive skills and educational attainment. The index provides a snapshot of the relative performance of countries based on their education outputs.
Country and indicator selection
For data availability purposes, country selection to the Index was based on whether a country was a 'comprehensive-data' country within the Data Bank. Guided by the Advisory Panel, the EIU’s goal in selecting indicators for the Index was to establish criteria by which to measure countries’ output performance in education. Initial questions included: What level of cognitive skills are national education systems equipping students with, and how are students performing on internationally comparable tests at different ages? What are levels of reading, maths and science in these countries? How successful are national education systems at attaining a high level of literacy in the population? How successful are national education systems at educating students to secondary and tertiary degree level?
Based on this set of questions, the EIU chose objective quantitative indicators, grouping them into two groups: cognitive skills and educational attainment. For cognitive skills, the Index uses the latest reading, maths and science scores from PISA (Grade 8 level), TIMSS (Grade 4 and 8) and PIRLS (Grade 4). For educational attainment, the Index uses the latest literacy rate and graduation rates at the upper secondary and tertiary level. Data for some countries were more recent than for others; when a country's latest available data point was five years older than the most recent data available across countries, the EIU chose not to include it, although this was very rarely an issue.
The EIU made estimations when no internationally comparable data were available. For example, a number of countries’ Grade 8 TIMSS Science scores were estimated by regression with PISA Science scores, when the regression was found to be statistically significant. In addition, when OECD data were not available for graduation rates, national ministry or statistics bureau data were sanity-checked and then used if deemed internationally comparable.
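The regression-based estimate described above can be sketched as a simple least-squares fit. This is only an illustration: the scores are invented, the function name is hypothetical, and the statistical significance check that the EIU applied before using a regression is not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented scores for countries that sat both tests (not real results).
pisa_science = [480.0, 500.0, 520.0, 540.0]
timss_science = [470.0, 495.0, 520.0, 545.0]

intercept, slope = fit_line(pisa_science, timss_science)

# Estimate a missing Grade 8 TIMSS Science score from a PISA Science score.
estimated_timss = intercept + slope * 510.0
print(round(estimated_timss, 1))
```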
Calculating scores and weightings
In order to make indicators directly comparable across all countries in the Index, all values were normalised into z-scores. This process enables the comparison and aggregation of different data sets (on different scales), and also the scoring of countries on the basis of their comparative performance. A z-score indicates how many standard deviations an observation is above or below the mean. To compute the z-score, the EIU first calculated each indicator’s mean and standard deviation using the data for the countries in the Index, and then the distance of the observation from the mean in terms of standard deviations.
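The normalisation step can be sketched as follows. The indicator values and country labels are placeholders, and the use of the population standard deviation is an assumption, since the text does not state whether the population or sample formula was applied.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return each value's distance from the mean in standard deviations."""
    data = list(values.values())
    mu = mean(data)
    # Assumption: population standard deviation (the source does not say
    # whether the population or sample formula was used).
    sigma = pstdev(data)
    return {country: (v - mu) / sigma for country, v in values.items()}


# Placeholder indicator values for three hypothetical countries.
scores = {"A": 480.0, "B": 500.0, "C": 520.0}
normalised = z_scores(scores)
print(normalised)
```

A country exactly at the mean receives a z-score of zero, and countries equally far above and below the mean receive scores of equal size and opposite sign.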
The overall index score is the weighted sum of the underlying two category scores. Likewise, the category scores are the weighted sum of the underlying indicator scores. As recommended by the Advisory Panel, the default weight for the Index is two-thirds to cognitive skills and one-third to educational attainment. Within the cognitive skills category, the Grade 8 tests’ score accounts for 60% while the Grade 4 tests’ score accounts for 40% (Reading, Maths and Science all account for equal weights). Within the educational attainment category, the literacy rate and graduation rates account for equal weights. The user can, however, change the weightings and recalculate scores according to personal preference via the website that accompanies this report.
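A minimal sketch of the default weighting scheme is given below. All names are hypothetical and the z-scores are invented. The split within educational attainment (half to literacy, half to graduation rates combined) is an assumption; the text says only that the literacy rate and graduation rates carry equal weights.

```python
# Default weights as stated in the text.
CATEGORY_WEIGHTS = {"cognitive": 2 / 3, "attainment": 1 / 3}
COGNITIVE_WEIGHTS = {"grade8": 0.6, "grade4": 0.4}
# Assumption: literacy and graduation each take half of the category.
ATTAINMENT_WEIGHTS = {"literacy": 0.5, "graduation": 0.5}


def weighted_sum(scores, weights):
    """Combine scores using weights that sum to one."""
    return sum(weights[name] * scores[name] for name in weights)


def overall_score(cognitive, attainment):
    """Aggregate indicator z-scores into category scores, then the Index."""
    category = {
        "cognitive": weighted_sum(cognitive, COGNITIVE_WEIGHTS),
        "attainment": weighted_sum(attainment, ATTAINMENT_WEIGHTS),
    }
    return weighted_sum(category, CATEGORY_WEIGHTS)


# Invented z-scores for one hypothetical country.
example = overall_score(
    cognitive={"grade8": 1.0, "grade4": 0.5},
    attainment={"literacy": 0.0, "graduation": -1.0},
)
print(round(example, 3))
```

Keeping the weights in separate tables mirrors the option, described above, of changing the weightings and recalculating scores.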
Areas for caution
Because indexes aggregate different data sets on different scales from different sources, building them invariably requires making a number of subjective decisions. This index is no different. Each 'area for caution' is described below.
Z-scores for PISA, TIMSS and PIRLS
It is important to note that, strictly speaking, the z-scores for PISA, TIMSS and PIRLS are not directly comparable. The methodology applied both by the OECD and the International Association for the Evaluation of Educational Achievement (IEA) to calculate the performance of the participating countries consists of comparing the performance of the participating countries to the respective mean performance. (The countries’ ‘raw’ test scores before normalisation are not published; just their scores in comparison to the other participants.) Thus, which countries participate in each test and how well they perform in comparison to the other participants has a direct impact on the resulting final scores. Given that the samples of countries that take the PISA, TIMSS and PIRLS tests are not exactly the same, there are limitations to the comparability of their scores.
The EIU has chosen not to change these scores to account for this lack of direct comparability; however, it did consider other options along the way. The main alternative suggestion from the Advisory Panel was to use a pivot country in order to transform the z-scores of other countries in comparison to that pivot country’s z-score. Although this method is used in some studies, after substantial consideration, the EIU decided not to employ this method for the purpose of an index. The resulting z-scores after transformation depend heavily on the choice of pivot country; choosing one country as a pivot over another affects countries’ z-scores quite substantially. The EIU did not feel it was in a position to make such a choice. Despite these limitations to test scores’ direct comparability, the EIU believes that the applied methodology is the least invasive and most appropriate to aggregate these scores.
Graduation rate data
Some members of the Advisory Panel questioned the use of graduation rates in the Index because it is not clear whether they add value as a comparative indicator of education performance. Unlike test results and literacy rates, standards for gaining an upper secondary and tertiary degree do differ across countries. Notwithstanding, the EIU believes that graduation rates do add value in evaluating a national educational system's performance, as there is common acceptance that national education systems should aim for their citizens to gain educational qualifications, especially at the secondary level. Including graduation rate data in the Index therefore rewards countries that have put this aim into practice, albeit at varying levels of quality.
Because of the variation in how countries measure graduation rates, the EIU followed the Panel's suggestion in using OECD graduation rate data, which use one main definition. When OECD data were not available, national ministry or statistics bureau data were sanity-checked and then used if deemed comparable. In some cases, no data on graduation rates were available. In these cases, the EIU awarded the country the mean score for this indicator. One disadvantage of giving a country the mean score is that if in reality it performs worse than the average in this indicator, the Index boosts its score, and vice versa.
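Awarding the mean score to a country with no data is equivalent to giving it a z-score of zero on that indicator. The sketch below illustrates this under two assumptions: that the mean and standard deviation are computed only from countries with data, and that the population standard deviation is used. The rates and country labels are invented.

```python
from statistics import mean, pstdev


def z_scores_with_mean_fill(values):
    """Z-scores where a missing value (None) receives the mean score."""
    # Assumption: statistics are computed from countries with data only.
    available = [v for v in values.values() if v is not None]
    mu = mean(available)
    sigma = pstdev(available)  # assumption: population standard deviation
    return {
        country: 0.0 if v is None else (v - mu) / sigma
        for country, v in values.items()
    }


# Invented graduation rates (%); hypothetical country "D" has no data.
rates = {"A": 70.0, "B": 80.0, "C": 90.0, "D": None}
result = z_scores_with_mean_fill(rates)
print(result["D"])
```

If country "D" would in reality have scored below the mean, the zero it receives here overstates its performance, which is the disadvantage noted above.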
The EIU used the most recent data available. Because graduation rates are based on the pattern of graduation existing at the time, they are sensitive to changes in the educational system, such as the addition of new programmes or a change in programme duration. As an extreme example, Portugal’s upper secondary graduation rate, which ranged between 50% and 65% from the early 2000s to 2008, rose to 104% in 2010 as a result of the government’s “New Opportunities” programme, launched to provide a second chance for those individuals who left school early without a secondary diploma. In order to treat countries consistently, the Index takes the 2010 figure. Although this inflates Portugal’s score in this indicator, this inflation should eventually fall out of the Index should it be updated on an annual or bi-annual basis. Given the limitations of graduation rate data, the EIU followed the Panel's suggestion of giving a smaller weighting (one-third) to educational attainment.
It is also important to note that the tertiary graduation rate indicator covers only tertiary-type A programmes. Tertiary-type B programmes are not included. This methodology was chosen largely because not all countries collect data and organise their education systems along the lines of A and B. As per the OECD, tertiary-type A programmes are largely theory-based and are designed to provide qualifications for entry into advanced research programmes and professions with high requirements in knowledge and skills. These programmes are typically delivered by universities, and their duration ranges from three to five years, or more at times. Tertiary-type B programmes are classified at the same academic level as those of type A, but are often shorter in duration (usually two to three years). They are generally not intended to lead to further university-level degrees, but rather to lead directly to the labour market.
Although excluding tertiary-type B programmes makes for a more relevant comparison among countries, it also slightly disadvantages a number of countries that have particularly high type B graduation rates (as these rates are not included). These countries are Canada, Ireland, Japan and New Zealand. Nonetheless, this exclusion has a limited impact on these countries’ ranking in the Index.
The EIU had wanted to include other education performance indicators in the Index, such as how well national education systems prepare students for the labour market and the performance of vocational studies. However, data availability was a limiting factor. The EIU found that sufficient data that isolate educational attainment within labour market outcomes were not available, and internationally comparable data on vocational studies covering all countries in the Index were not readily available either.
Correlations
Using the data for the ‘comprehensive-data’ countries in the Data Bank, the EIU undertook a correlations exercise to test relationships across countries between education inputs, outputs and wider society. The EIU tested for correlations between the inputs to and outputs of education, the inputs to education and socio-economic environment indicators (as a proxy for wider society), and the outputs of education and socio-economic environment indicators.
Definition of a correlation and thresholds used
The correlation coefficient is a measure of the degree of linear relationship between two variables. While in regression the emphasis is on predicting one variable from the other, in correlation the emphasis is on the degree to which a linear model may describe the relationship between two variables. Importantly, the presence of a correlation does not imply causality.
In order to ensure that relationships being found were indeed strong, the EIU looked for at least a 0.65 level of correlation (the higher it is, the stronger the relationship). It is important to acknowledge that some social science research uses a lower level of correlation, but the EIU wished to maintain a high level to avoid finding relationships between indicators that might not be significant.
Correlation tests were conducted on an indicator-by-indicator basis, between two variables over time (on an annual basis) and at three-year growth rates (for example, the three-year growth rate of 1999 (1996-99) against the three-year growth rate of 2007 (2004-07)). For the latter tests, adjustments were made to include TIMSS and PIRLS tests even though these are not taken every three years (they are taken every four and five years respectively). The EIU used the same time lags across countries on the same indicator, as per the Panel’s suggestions.
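A three-year growth rate of the kind compared in these tests can be sketched as below. The formula (simple percentage change over the three-year window) is an assumption, as the text does not define it, and the figures and function name are invented for illustration.

```python
def three_year_growth(series, year):
    """Relative change between `year - 3` and `year`.

    Assumption: growth is the simple percentage change over the window;
    the source does not give the exact formula.
    """
    return series[year] / series[year - 3] - 1.0


# Invented annual values of one indicator for one hypothetical country.
indicator = {1996: 100.0, 1999: 110.0, 2004: 120.0, 2007: 150.0}

early = three_year_growth(indicator, 1999)  # covers 1996-99
late = three_year_growth(indicator, 2007)   # covers 2004-07
print(round(early, 3), round(late, 3))
```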
When looking for evidence of a strong correlation, the EIU sought a strong relationship over time. For example, although there may have been evidence of a strong correlation between one input variable in 1990 and an output variable in 2005, a strong level of correlation would also need to be found for 1991 and 2006, 1992 and 2007, and so on, for at least a number of years. In addition, correlation tests were only run if there were at least 15 countries with relevant data for both of the indicators being assessed.
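The checks described above (a fixed time lag, the 0.65 threshold, the 15-country minimum and persistence across several year pairs) can be sketched as follows. This is an illustration under stated assumptions, not the EIU's implementation: the function names are hypothetical, `min_year_pairs` is a stand-in for the unspecified "number of years", the threshold is assumed to apply to the magnitude of the coefficient, and the data are synthetic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def strong_over_time(inputs, outputs, lag, threshold=0.65,
                     min_countries=15, min_year_pairs=3):
    """Check whether input and lagged output correlate strongly often enough.

    `inputs` and `outputs` map year -> {country: value}.
    `min_year_pairs` is a hypothetical stand-in for "a number of years".
    """
    strong_pairs = 0
    for year, by_country in inputs.items():
        later = outputs.get(year + lag, {})
        shared = sorted(set(by_country) & set(later))
        if len(shared) < min_countries:
            continue  # too few countries with data for both indicators
        r = pearson([by_country[c] for c in shared],
                    [later[c] for c in shared])
        # Assumption: the threshold applies to the magnitude of r.
        if abs(r) >= threshold:
            strong_pairs += 1
    return strong_pairs >= min_year_pairs


# Synthetic data: 15 placeholder countries, outputs linear in inputs.
countries = [f"C{i}" for i in range(15)]
inputs = {y: {c: float(i) for i, c in enumerate(countries)}
          for y in (1990, 1991, 1992)}
outputs = {y + 15: {c: 2.0 * v + 1.0 for c, v in row.items()}
           for y, row in inputs.items()}
print(strong_over_time(inputs, outputs, lag=15))
```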
Factors affecting the correlations
The EIU did not find a great number of strong relationships. Given the complexity of education, this was not totally surprising. However, other factors may also account for the lack of correlations. For one, not all indicators were available going back 15-20 years in time. There was also a lack of data availability for some countries (some of this due to the Data Bank’s focus on ensuring that data being used were internationally comparable). Finally, other qualitative factors that are difficult to measure, such as culture and the quality of teaching, were not included in the Data Bank. These factors may have a significant impact on education outputs, but the EIU was not able to take these into account within the correlations exercise.