Language testing and methods of assessment

Methods of Assessment

- 20. What is item analysis? What is the facility value of an item? What is its discrimination index?
- 21. What are the main difficulties and limitations of language testing?

- 20. What is item analysis? What is the facility value of an item? What is its discrimination index?

Item analysis is a way of judging the value of individual items in norm-referenced testing. The main information to be obtained about each item (e.g. Multiple Choice or True / False) is ITEM DIFFICULTY and ITEM DISCRIMINATION.

ITEM DIFFICULTY - finding out the percentage of people who get the item right in the try-out group.

In norm-referenced testing, one rejects items which are too easy or too difficult because the purpose is to discriminate.

The difficulty of an item = its FACILITY INDEX: the % who give the right answer. The usual aim of the test setter is to achieve middling facility indices, ranging from about 40% to 60%.
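The facility index can be worked out in a few lines. The following is only a minimal sketch: the function names, the list-of-booleans input and the sample try-out group are all hypothetical, and the 40-60% band is simply the range quoted above.

```python
def facility_index(responses):
    """Percentage of the try-out group who got the item right.

    responses: one boolean per candidate (True = correct answer).
    This input format is an assumption made for illustration.
    """
    if not responses:
        raise ValueError("need at least one response")
    return 100.0 * sum(responses) / len(responses)


def is_middling(facility, low=40.0, high=60.0):
    """True if the facility index falls inside the 40-60% band."""
    return low <= facility <= high


# Hypothetical try-out group: 10 candidates, 5 of whom answer correctly.
item = [True, True, False, True, False, False, True, False, True, False]
fi = facility_index(item)
print(fi, is_middling(fi))  # 50.0 True
```

An item that 9 out of 10 candidates get right would have a facility index of 90% and would normally be rejected as too easy for a norm-referenced test.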

The DISCRIMINATION of an item is judged by comparing those individuals who succeed on a given item with those who score highly on the test as a whole:

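Discrimination for any given item = (Correct Tops - Correct Bottoms) / (1/2 x Number of students)

Here Correct Tops is the number of students in the top-scoring half of the group (on the test as a whole) who get the item right, and Correct Bottoms is the corresponding number in the bottom-scoring half.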
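The discrimination formula can be sketched as follows. This is an illustration under stated assumptions, not a definitive procedure: it assumes that "Tops" and "Bottoms" are the top and bottom halves of the group ranked by total test score, that the number of students is even, and that each student is recorded as a (total score, item correct?) pair. The function name and the sample data are hypothetical.

```python
def discrimination_index(results):
    """(Correct Tops - Correct Bottoms) / (half the number of students).

    results: list of (total_score, item_correct) pairs, one per student.
    Assumption: Tops/Bottoms = top/bottom half when ranked by total score.
    """
    n = len(results)
    if n < 2 or n % 2:
        raise ValueError("this sketch assumes an even number of students")
    # Rank students from highest to lowest total score on the whole test.
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    half = n // 2
    correct_tops = sum(1 for _, ok in ranked[:half] if ok)
    correct_bottoms = sum(1 for _, ok in ranked[half:] if ok)
    return (correct_tops - correct_bottoms) / half


# Hypothetical group of 8 students: (total score, got this item right?)
group = [(95, True), (90, True), (85, True), (80, False),
         (60, True), (55, False), (40, False), (30, False)]
print(discrimination_index(group))  # (3 - 1) / 4 = 0.5
```

On this reading the index runs from +1 (only the top half get the item right) through 0 (no discrimination) to -1 (only the bottom half get it right).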

- 21. What are the main difficulties and limitations of language testing?

One limitation is the state of our knowledge of language and language learning. Content validity: does what we are testing represent language? Does it match up to the goals of language learning and the objectives of the language learner?

Difficulties in testing communicative competence / interaction - administrative difficulties.

Objective and norm-referenced tests tend to point towards measures of receptive learning. Learners don't have to produce language in many of these tests. They may produce no language at all! Imagine an education system which awards students qualifications in the Teaching of English on the basis of Multiple Choice Tests where they only need to answer A, B, C or D. Such receptive tests fail to reference the productive skills of speaking and writing. The suggestion that a person is qualified to teach English on the basis of the data yielded by Multiple Choice Tests is clearly ridiculous unless their future occupation consists of setting more Multiple Choice Tests, thus depriving another generation of the real product i.e. the training needed to speak and to write.

There is therefore what is termed a validity/reliability tension in what is deceptively described as 'objective' testing.

The main advantage of objective and norm-referenced tests is that the actual marking is easy: it can be done mechanically or by overlay. The tests can also be pre-tested on a test population and compared over different years. Yet these tests are not as objective as the marking system suggests. What is the point if productive skills are sidelined to make exams quicker to mark and easier to administer?

When designing so-called 'objective' tests, a good deal of judgement needs to be used in relation to the rejection or acceptance of items (e.g. A, B, C, or D). Good distractors are needed: if the incorrect answers are 'too obviously wrong', the correct answer can be picked without any real knowledge of the subject matter. 'Common sense' rather than 'knowledge of specific subject matter' can gain a large number of correct answers. These sorts of tests also invite guessing, since there are only 4 or 5 alternatives for each item.
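The scale of the guessing problem can be shown with a little arithmetic. The sketch below assumes purely blind guessing, with every alternative equally likely to be chosen and exactly one correct answer per item; the function name and the 40-item test are hypothetical.

```python
def expected_chance_score(n_items, n_alternatives):
    """Average number of items answered correctly by blind guessing.

    Assumes one correct answer per item and each alternative
    being equally likely to be picked.
    """
    if n_items < 0 or n_alternatives < 1:
        raise ValueError("invalid test dimensions")
    return n_items / n_alternatives


# Hypothetical 40-item test.
print(expected_chance_score(40, 4))  # 10.0 items, i.e. 25%
print(expected_chance_score(40, 5))  # 8.0 items, i.e. 20%
```

Weak distractors make matters worse: if one of four alternatives is obviously wrong, the candidate is in effect guessing among three.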

A much wider sample of grammar, vocabulary & phonology can generally be included in an objective test than a subjective one, but objective tests can never test ability to communicate in the target language, nor can they evaluate actual performance.