News from Hamburg: Big Steps Forward towards Reliable Metrics to Harmonise Learning Assessment Data Globally

By Silvia Montoya, Director of the UNESCO Institute for Statistics (UIS), and Dirk Hastedt, Executive Director of the International Association for the Evaluation of Educational Achievement (IEA).

This blog was also published by Norrag.

On the day that the UNESCO Institute for Statistics (UIS) released new global numbers of children and adolescents not learning, representatives from regional and international learning assessments gathered in Hamburg, Germany. They had answered the UIS’ call to come together to help tackle measurement issues around the coverage and comparability of data for SDG Indicator 4.1.1: the proportion of children and young people (a) in Grade 2 or 3, (b) at the end of primary education, and (c) at the end of lower secondary education achieving at least a minimum proficiency level in reading and mathematics.

General agreement on interim reporting

First, the Hamburg meeting agreed with a UIS proposal for an interim reporting period that will cover 2017-2019. This was a major breakthrough and an endorsement of the work by the UIS and its technical partners to support international consistency in reporting against Indicator 4.1.1. Some might prefer tabulating results of lower reliability to having no results at all; consequently, the UIS is willing to provide the best possible map based on available data.

For the first year of the interim reporting period, the UIS reported in 2017 on those countries participating in cross-national assessments (CNAs). The Institute’s reporting included regional assessments such as SACMEQ, PASEC, LLECE and PILNA, as well as citizen-led assessments like ASER and UWEZO and assessments with global coverage such as PISA, TIMSS and PIRLS. It is important to note that the minimum proficiency levels (MPLs) reported are not globally harmonised because each assessment has its own definitions.

Leveraging all existing data and addressing equity

For 2018, we are proposing once again to maximise the use of available data from cross-national assessments (CNAs). At the same time, we will fill the gaps for countries that do not participate in CNAs by using national assessment data and citizen-led assessments, as long as they provide the exact information the indicator requires, e.g. minimum proficiency levels defined by a cut-off point and descriptors of those levels. Reporting based on public examinations has also been opened for discussion.

There was agreement during the meeting that reporting based on regional or international studies would yield the highest-quality data. From this perspective, it would be best if countries participated in these assessments. There is, however, doubt that all countries are willing to do so. Therefore, the UIS will continue to use all available information and will fill in the gaps to ensure that a maximum number of countries are reporting. Footnotes or annotations will be used to provide the minimum proficiency levels used by the different assessments, and specifications will be added about the coverage and quality of the data collection. These footnotes and annotations matter because they will help to ensure that the results are fit-for-purpose and relevant for action.

In short, the UIS will present the available data and clearly indicate which countries participated in which assessments and whether the data come from a national source (either official or citizen-led). Importantly, by drawing on citizen-led assessments, UNICEF’s MICS and OECD’s PISA for Development results when available, the data will include not only children in the classroom but also those who are out of school. The Sustainable Development Goals (SDGs) stress equity, which means reporting learning for all children. There was agreement during the meeting that even if there is a lack of information for children who are not enrolled in school, the reporting should reflect out-of-school populations as well.

Green light for UIS Reporting Scales

Meanwhile, the UIS will continue to develop the UIS Reporting Scales for each domain and point of measurement (based on conceptual work around domains and sub-domains/strands of learning) to facilitate alignment between assessment programmes that measure the same domains. The scales will enable countries to pursue different options for assessments and set goals for progress, depending on the programme they choose for Indicator 4.1.1 reporting, and yet allow for some equating or harmonisation of the results.

In Hamburg, there was consensus among the meeting participants that a set of scales should be built for both reading and mathematics at each of the three key points of measurement: at the end of Grade 2 or 3, at the end of primary education and at the end of lower secondary education. The set of scales will thus address multidimensionality in the way knowledge and skills are built and facilitate information sharing. The experts preferred this approach to developing a single reporting scale for all of the measurement points and will explore in the near future whether any overlap exists between the scales.

Each reporting scale starts by mapping cross-national and national curricula and assessment frameworks, with all their performance levels (PL) in each domain (reading and mathematics). This work, undertaken by the UNESCO International Bureau of Education (IBE), consists of mapping PLs and descriptors to better understand which content and skills are involved. For example, PISA uses six proficiency levels to describe reading. Based on their scores, students who are categorised at Level 2 and above are considered to have met minimum proficiency levels.
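As a rough illustration of how a cut-off on a proficiency scale turns into an Indicator 4.1.1 value, the short Python sketch below counts the share of students at or above a minimum level. The function name and the sample data are hypothetical, and the sketch ignores the sampling weights and plausible values that real assessments use, so it should be read as a simplification rather than as any programme's actual procedure.

```python
def share_meeting_mpl(levels, minimum_level=2):
    """Return the proportion of students at or above the minimum level.

    `levels` is a list of integer proficiency levels, one per student.
    The default cut-off of 2 mirrors the PISA reading example in the text.
    """
    if not levels:
        raise ValueError("need at least one student")
    meeting = sum(1 for level in levels if level >= minimum_level)
    return meeting / len(levels)


# Hypothetical proficiency levels for ten students (0 = below Level 1).
sample_levels = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6]
print(f"Share meeting the MPL: {share_meeting_mpl(sample_levels):.0%}")
```

Because each assessment sets its own cut-off, the same calculation with a different `minimum_level` or a different scale gives figures that are not directly comparable, which is the problem the reporting scales are meant to address.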

In the last stage, we will work together to identify common recommended benchmarks for minimum levels of proficiency used in cross-national – as well as national – assessments. This implies harmonising national assessments to adequately monitor progress.

Expanding comparability and coverage with innovative solutions

Regional assessments allow for comparison across countries within regions, but it is also important that countries are compared in a fair way. The Hamburg meeting agreed to address this challenge while expanding the coverage of learning data by linking regional and international assessments.

Under the technical leadership of the UIS, representatives from both IEA and regional assessment programmes confirmed their strong interest in linking with TIMSS. The experts proposed a process whereby two to three countries per region would participate in both the regional assessment and TIMSS in 2019. Although this solution could be extended to PIRLS and the Literacy and Numeracy Assessment (LaNA), the focus is on TIMSS because it and most regional assessments will be conducted again in 2019. By comparing the scores of these participating countries across both assessments, we can gain valuable insight for the discussions on benchmarking and the definition of MPLs.

The remaining countries of the region can then report on the TIMSS scale using results from the “ring” countries that have administered both assessments. This solution is pragmatic and somewhat similar to the “Ring Comparison” methodology used to link country-level data in the purchasing power parity measure. The Ring Comparison allows each region to be independent of other regions, whilst adopting the estimation methodologies that are best suited to its country characteristics and statistical capacities.
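To make the linking idea concrete, here is a minimal Python sketch of one simple approach, mean-sigma linear linking, which is an assumption chosen for illustration and not the method agreed in Hamburg. Scores from ring countries on both scales are used to estimate a linear conversion, which is then applied to a regional score from a country that did not take TIMSS. All function names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma_link(regional_scores, timss_scores):
    """Estimate slope and intercept mapping the regional scale to TIMSS.

    Both lists hold scores from the ring countries, which sat both
    assessments. The conversion matches the mean and the standard
    deviation of the two score distributions.
    """
    slope = pstdev(timss_scores) / pstdev(regional_scores)
    intercept = mean(timss_scores) - slope * mean(regional_scores)
    return slope, intercept


def to_timss_scale(regional_score, slope, intercept):
    """Convert one regional-scale score to the TIMSS scale."""
    return slope * regional_score + intercept


# Hypothetical ring-country results on the two scales.
ring_regional = [480, 500, 520]
ring_timss = [400, 450, 500]

slope, intercept = mean_sigma_link(ring_regional, ring_timss)

# A hypothetical non-ring country that only took the regional assessment.
print(round(to_timss_scale(510, slope, intercept), 1))
```

An operational link would need far more than three data points, along with checks that the two assessments measure comparable content, which is why the mapping of frameworks and performance levels matters.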

The IEA stands ready to develop alternative design options that will support work between regional assessment programmes and countries. The Global Alliance to Monitor Learning (GAML) will host and coordinate these efforts to foster the participation of all stakeholders in the discussion. It will ensure synergies between this endeavour and the various streams of work under way, notably the work of other technical partners such as the Australian Council for Educational Research (ACER).

While TIMSS measures results for the primary level, it is possible to build on MICS, PASEC and LLECE for early grades (linking their results to PIRLS, TIMSS Grade 4 and LaNA). And although there is no regional assessment at the lower secondary level, linking TIMSS and PISA could be an option using a similar methodology.

But one note of caution emerged from Hamburg: the need for funding to ensure that any or all of this can happen. The technical work is converging. Countries need money to administer national and international assessments. Regional learning assessments also need funding to strengthen themselves while supporting countries’ capacities. And global initiatives need continuous financial support to ensure that they deliver through to the end. Participants appealed for donor support and called for an Investment Case that will document the necessary funds. The Global Partnership for Education (GPE) will work with the UIS to develop this as soon as possible.
