Understanding standardisation in the assessment of student performance

What is a raw score or mark?

A raw mark is the mark or score that a student is awarded based on their response to the questions in a test or any other assessment task.

What is a scaled score or mark?

A scaled score is one that has been through a process of adjustment, often referred to as scaling, standardisation, or moderation. The adjustment removes the effect of intervening variables, such as differences in the difficulty of assessments, so that scores can be compared between subjects, within a subject over time, and so on.

Why is there a need to standardise marks?

Standardisation is necessary to ensure fairness in assessment by awarding a mark that reflects the ability of each student relative to the cohort, i.e. the whole group of students taking the same assessment. Assessments (examinations, tests, assignments and so on) vary in difficulty from year to year, and subjects also vary in difficulty.

The process of standardisation adjusts the raw mark so that students are not advantaged or disadvantaged based on the year they sit the assessment or by their subject choice. After standardisation the distribution of marks is the same from year to year and subject to subject.

After subject marks have been standardised, a student's marks in different subjects can be added together to create an aggregate score. These aggregate scores can then be used for ranking and selection purposes.

Standardisation allows us to compare fairly. It ensures, for example, that a score of 50 in English has the same value as a score of 50 in Accounting, and that a score of 50 in English in 2013 has the same value as a score of 50 in English in 2015. This is only achieved by standardising scores from different subjects, and scores from the same subject across years, to a common standard or reference.
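To make the idea concrete, here is a minimal sketch of one common approach, linear (z-score) standardisation, in Python. The text above does not say which method is actually used, so the function name `standardise`, the target mean of 50, the target standard deviation of 15, and the marks themselves are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def standardise(raw_marks, target_mean=50.0, target_sd=15.0):
    """Linearly rescale raw marks so the cohort has the target mean and SD.

    Assumption: a simple linear (z-score) method; real systems may use
    other procedures.
    """
    cohort_mean = mean(raw_marks)
    cohort_sd = pstdev(raw_marks)
    if cohort_sd == 0:
        # Every student got the same mark, so place them all at the target mean.
        return [target_mean for _ in raw_marks]
    return [
        target_mean + target_sd * (mark - cohort_mean) / cohort_sd
        for mark in raw_marks
    ]


# Hypothetical raw marks: the same five students on a hard and an easy paper.
hard_paper = [20, 30, 40, 50, 60]
easy_paper = [60, 70, 80, 90, 100]

hard_scaled = standardise(hard_paper)
easy_scaled = standardise(easy_paper)

# Each student holds the same position in both cohorts, so the scaled
# marks match even though the raw marks differ by 40.
for hard, easy in zip(hard_scaled, easy_scaled):
    print(round(hard, 1), round(easy, 1))
```

In this sketch a student's scaled mark depends only on where they sit relative to the cohort, not on how hard the paper happened to be.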

What is standardisation and scaling of marks?

Scaling is one part of a standardisation process that is used throughout the world in assessment systems. The purpose of standardising scores awarded to a student, or group of students taking a subject, is to reflect their ability in comparison with the overall ability of the population, irrespective of subject choice, school, or the year of the assessment. For example, for Form 6, a country may want to combine the grades from English and the best 3 subjects to give a single grade which is then used to rank students for selection into places in Form 7.

Before marks from different subjects are combined, they must be standardised within the subject to account for differences in the difficulty of the assessment, differences between schools, and differences in the marking of both the internal and external components of the assessment. A process of standardisation is then applied to make sure that a mark or grade from one subject has the same value as the same mark or grade from another subject. In other words, a mark of 50 (or Grade 4) in English has the same value as a mark of 50 in Agriculture.

Combining raw marks from different subjects is meaningless. Marks need to be of similar value before they are combined, and standardisation is the procedure that converts marks from different subjects to the same value so that they can be compared.
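The Form 6 example above (English plus the best 3 subjects) can be sketched in Python as follows. This is an illustration only: the function name `aggregate`, the student names, and the marks are hypothetical, the marks are assumed to have been standardised already, and "best 3" is assumed to mean the best three subjects other than English.

```python
def aggregate(standardised, compulsory="English", best_n=3):
    """Sum the compulsory subject and the best_n other standardised marks."""
    others = sorted(
        (mark for subject, mark in standardised.items() if subject != compulsory),
        reverse=True,
    )
    return standardised[compulsory] + sum(others[:best_n])


# Hypothetical, already-standardised marks for two students.
students = {
    "Student A": {"English": 55, "Accounting": 62, "Agriculture": 48,
                  "Biology": 70, "History": 51},
    "Student B": {"English": 60, "Accounting": 45, "Agriculture": 58,
                  "Biology": 52, "History": 49},
}

totals = {name: aggregate(marks) for name, marks in students.items()}

# Rank students from the highest aggregate to the lowest.
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals)
print(ranking)
```

The sum is only meaningful because every mark going into it is on the same scale; running the same code on raw marks would mix values that are not comparable.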

What are the implications of each type of score for educational institutions and employers?

Senior secondary qualifications are used for selecting candidates for scholarships, for places in tertiary institutions and for employment. It would be unwise for institutions and employers to compare raw aggregate marks between students in different years or different subjects. These raw marks have different and incompatible values, so they are meaningless for comparison and hence invalid and unfair for selection.

Standardised marks account for variability in the assessment, such as differences between subjects, differences in the ability of those taking the subjects, differences in the quality of assessments from year to year, and so on. Employers can use the standardised marks to see who is ranked higher. To determine abilities and what students can do, they should look at the individual subject scores.

In what ways does the Pacific Community (SPC) support a country in the assessment methods that it chooses to use?

Countries identify the purpose of their assessment, whether it be for ranking students, or to indicate a level of skill and knowledge achieved.

SPC can support countries to develop the most appropriate assessment method that suits the purpose of assessment in each country.

If the primary purpose of the assessment is to rank for selection, then a norm-referenced system (comparing students against each other in the same group) is adopted. This would be a system that uses standardisation of marks. If, however, the main purpose is to decide who has achieved a particular standard, then a criterion-referenced system (comparing students against a fixed standard or set of criteria) is adopted. The purpose of the assessment is the determining factor in a country's choice of system. The support from SPC will always be based on fitness-for-purpose and other sound assessment principles.
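The difference between the two systems can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the function names, the scores, the number of places, and the standard of 50 are assumptions, not values taken from any country's system.

```python
def select_norm_referenced(scores, places):
    """Rank students against each other and take the top `places`."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:places]


def select_criterion_referenced(scores, standard):
    """Keep every student whose score meets the fixed standard."""
    return [name for name, score in scores.items() if score >= standard]


# Hypothetical scores for four students.
scores = {"Student A": 72, "Student B": 58, "Student C": 65, "Student D": 49}

# Norm-referenced: only two places are available, so only the top two are chosen.
top_two = select_norm_referenced(scores, places=2)

# Criterion-referenced: everyone who reaches the standard is recognised.
met_standard = select_criterion_referenced(scores, standard=50)

print(top_two)
print(met_standard)
```

In the first case the outcome for a student depends on how the rest of the group performed; in the second it depends only on the fixed standard.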

**Primary school students in a classroom in Samoa.** *IMAGE CREDIT: Doreen Tuala*