# Reliability

‘Reliability’ generally refers to the extent to which a test can be expected to give the same results when administered on a different occasion (test-retest reliability) or to which the components of a test give consistent results (internal consistency).

Internal consistency is a measure of whether each item in a test measures the same concept. There are several methods of calculating this; the most commonly used is Cronbach’s alpha, which is based on the ratio of the sum of the individual item variances to the overall subtest score variance. However, Cronbach’s alpha presumes a complete set of responses to the items, since all items need to contribute equally to the factor score, which is not the case for all the Rapid subtests. An alternative formula is the standardised Cronbach’s alpha (shown below), which is based on the average non-redundant inter-item correlation.

Table 12 shows the standardised Cronbach’s alpha estimates. An internal consistency of α > .7 is generally considered adequate, whilst α > .8 is deemed good. As Table 12 shows, the majority of the subtests have a good level of internal consistency, with a few at an adequate level. Mobile phone (8–10) shows a lower level of internal consistency because of the strict discontinuation rule on this subtest (whereby the test stops when the student fails both items at a level, as in other digit span tests). However, a normal Cronbach’s alpha calculation (based on the remaining, more difficult items being failed after discontinuation) estimates the internal consistency of this subtest at .831.

Table 12. Internal consistency

| Subtest | Standardised α |
|---|---|
| Crayons (4–6) | .822 |
| Crayons (7) | .736 |
| Races (4–6) | .786 |
| Races (7) | .730 |
| Rhymes (4–6) | .856 |
| Rhymes (7) | .823 |
| Mobile phone (8–10) | .629 |
| Funny words (8–10) | .805 |
| Word chopping (8–10) | .813 |
| Mobile phone (11–15) | .693 |
| Non-words (11–15) | .728 |
| Segments (11–15) | .803 |

Test-retest reliability estimates the degree to which a test provides stable measurements over time. A small subset of the Rapid standardisation sample (n = 200) repeated the Rapid subtests 4–6 weeks after the first administration. Correlations (using Pearson’s r) between scores on the two sittings are given in Table 13. A correlation of .60 is considered an adequate level of test-retest reliability, with .70 considered good. As can be seen in Table 13, Rhymes shows a good level of test-retest reliability. The remaining subtests are mostly within or around the acceptable level, although Races and Mobile phone are a little below. Earlier research on LASS found lower correlations on the memory subtests than on the literacy subtests, which appeared to be due to the greater susceptibility of these tasks to practice effects arising from enhanced motivation and the application of strategic thinking at the retest.

Table 13. Test-retest reliability

| Subtest | Pearson’s r |
|---|---|
| Crayons | .63 |
| Races | .53 |
| Rhymes | .76 |
| Mobile phone | .57 |
| Funny words (Non-words) | .59 |
| Word chopping (Segments) | .62 |