Let’s save the Trees – Online Testing to the Rescue

 

 

Sivakumar Alagumalai

Jonathan Anderson

 

The Flinders University of South Australia

 

 

Traditionally, the process of much research commences with a survey /

test done on paper and culminates in transforming the data from hard

copies to a digital form on the computer before any statistical

analysis can be attempted. This can be a huge undertaking, especially

if numerous test instruments are used so as to understand or answer a

comprehensive research question. At the end of the study, the ‘used’

hard copies are destroyed or recycled. This can be both an expensive

and time-consuming activity. This paper examines an approach and trial

on internet-based testing (net-testing), and also highlights advantages

online testing has over traditional methods of data collection. An IT

course currently being taught at the School of Education is used to

highlight net-testing. Psychometric differences between traditional and

net-testing methods are discussed. Findings on students’ views about

net-testing are also reported. Technical details of setting up

net-tests and procedures for manipulation of data and statistical

analyses are discussed.

 

 

 

 

 

 

 

Please send suggestions and queries to:

sivakumar.alagumalai@flinders.edu.au

Introduction

 

Computer administered and managed tests (CAMT) are here.

Computerised-Linear Tests (CLT) and Computer-Adaptive Tests (CAT),

which are forms of CAMT, have been shown "to reduce testing time, to

obtain more information about the test takers, to increase test

security, to provide instant scoring and to be scheduled more easily

than paper-and-pencil-administered tests" (PAPAT) (Bugbee, 1996, p.282).

 

Most, if not all, CAMT have been conducted on standalone workstations

and on Local-Area-Network (LAN) systems. This paper seeks to provide

the leap from testing in a LAN environment to Wide-Area-Network (WAN)

systems, with special reference to the internet. The paper also

attempts to provide an extension of CAMT to survey scales and

opinionnaires. CAMT raises problems related to the psychometric

properties of the scales used, and these are examined. The paper also discusses

common views of students using CAMT. Further, the paper

addresses issues related to the technical set-up and problems of

internet-based CAMT (net-testing). The paper concludes with

illustrations of data management and analyses.

 

 

 

Advantages of net-testing

 

Rapidly emerging technologies are now making it possible for all schools

and institutions to have the infrastructure for CAMT (Sandals, 1992).

Sandals indicates that technological enhancements and their related

psychometric capabilities have brought psychological testing, and testing in

related fields, into the teaching and learning arena.

 

There are advantages, disadvantages and differences to be found in

net-tests (and CAMT) as compared with traditional PAPAT. Numerous

researchers (Sandals, 1992; Glowacki et al., 1995; Kumar & Helgeson,

1995; Bugbee, 1996; Lloyd et al., 1996) indicate the advantages of

net-tests. Tables 1 and 2 (adapted from Sandals, 1992) summarise the

major advantages.

 

Tables 1 and 2. Major advantages of net-tests (Sandals, 1992)

 

 

Figure 1 is an example of a survey form used to obtain

students’ feedback about teaching.

 

Figure 1. Sample student feedback form

 

The advantage of putting such survey forms and any CAMT on the internet

is that it allows students access at any time. However, in the

case of CAMT, safeguards against student collusion must be in place.

One possible option for this is to have a

large database of items. A large pool of items enables the setting of

parallel tests with items of comparable difficulty. Thus no

two students will sit the same test, i.e. one with identical

items. Optimally, it is recommended that net-tests be administered at

an assigned period to reduce variations in test conditions.
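
The item-pool idea above can be sketched as follows. This is only a minimal illustration, not the procedure used in the trial: the item bank, the three difficulty bands, the blueprint and the function name `build_parallel_form` are all hypothetical. Each form draws the same number of items from each band, so forms are comparable in difficulty while differing in content.

```python
import random
from collections import defaultdict


def build_parallel_form(item_bank, items_per_band, seed):
    """Draw one test form with a fixed number of items from each difficulty band."""
    rng = random.Random(seed)  # a per-student seed makes each form reproducible
    by_band = defaultdict(list)
    for item_id, band in item_bank:
        by_band[band].append(item_id)
    form = []
    for band in sorted(items_per_band):
        # Sample without replacement so no item repeats within a form.
        form.extend(rng.sample(by_band[band], items_per_band[band]))
    return form


# Hypothetical bank: 30 items, ten in each of three difficulty bands.
BANDS = ("easy", "medium", "hard")
ITEM_BANK = [(f"Q{n:02d}", BANDS[n % 3]) for n in range(30)]
BLUEPRINT = {"easy": 2, "medium": 2, "hard": 1}

form_a = build_parallel_form(ITEM_BANK, BLUEPRINT, seed=1001)
form_b = build_parallel_form(ITEM_BANK, BLUEPRINT, seed=1002)
```

Random drawing makes identical forms unlikely rather than impossible; a production system would need an explicit check if strictly distinct forms are required.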

 

Although there are technical problems associated with net-tests and

net-surveys, there exists great potential for large participation rates.

In line with the arguments of Anderson and Alagumalai (1996) and Alagumalai,

Anderson and Mala (1997) on the accessibility and availability of

cyber-based HyperText Markup Language (HTML) forms across platforms,

research and testing programs can be effectively managed globally.

 

Furthermore, net-tests allow for a wider modality of testing, which is

impossible through PAPAT. Use of sound, animations and video-clips can

be easily incorporated into net-tests and allow for authentic testing.

The use of these modalities in real-time in net-tests answers the

queries raised by Keeves and Alagumalai (in print), in providing a

solution to improving measurement in science and mathematics

education. Thus, the implications for net-tests and net-surveys are

much wider and far reaching.

 

 

Psychometric properties of net-tests and net-surveys

 

Associated with net-tests and net-surveys is the mode-effect of the

examinee (Parshall & Kromrey, 1993). This raises further questions about

the psychometric equivalence of CAMT and PAPAT. Validity,

reliability, parallax issues and students’ preferences are inevitable

problems that need to be addressed.

 

Olson (1986) indicated that there were no significant differences in

measurement precision between PAPAT and CAMT. Furthermore, CAT was the

most precise in estimating ability and required 75% less time to

ascertain this. In line with this, Perkins (1993) reports that there

was no significant difference in scores between CAMT and PAPAT. This

has important implications for the ‘transferability’ or parallax

between tests. Mazzeo et al. (1991) argue that in CLT, the constructs

being measured were not affected by ‘mode-of-administration’ effect.

However, the mode-effect needs to be further explored with CAT.

 

Associated with the mode-effect are the reliabilities and validities of

net-tests. Watson et al. (1990) and Watson et al. (1992) indicate that the

reliabilities of CAMT and PAPAT are about equal. However, Sukigara

(1996) argues that test-retest reliabilities of CAMT are slightly

higher than those of PAPAT. Although Russell et al. (1986) report that CAMT

yields relatively more reliable scores, it may be necessary to re-examine

reliabilities based on Classical Test Theory. Reliability checks

through Item Response Theory (IRT), especially with Person-Separation

Index and Item-Separation Index may shed new light on this issue of

reliability of net-tests as compared to PAPAT. Furthermore, it may be

necessary to look into time and rate of response of students, which can

be easily captured through net-tests, and then compared to that of

PAPAT.
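
As an illustration of the separation indices mentioned above, the sketch below computes a person-separation reliability and separation index from ability estimates and their standard errors. It uses the usual Rasch-style definitions (adjusted variance divided by observed variance, and adjusted standard deviation divided by root-mean-square error); the use of the population variance and the function name `person_separation` are assumptions made for this example, and the figures are invented. The same calculation applied to item difficulty estimates would give the Item-Separation Index.

```python
from math import sqrt
from statistics import mean, pvariance


def person_separation(abilities, standard_errors):
    """Return (separation reliability, separation index) for a set of estimates."""
    observed_var = pvariance(abilities)                   # variance of the estimates
    error_var = mean(se ** 2 for se in standard_errors)   # mean-square measurement error
    true_var = max(observed_var - error_var, 0.0)         # variance adjusted for error
    reliability = true_var / observed_var
    separation = sqrt(true_var) / sqrt(error_var)
    return reliability, separation


# Invented ability estimates (in logits) and standard errors for four examinees.
reliability, separation = person_separation([-1.0, 0.0, 1.0, 2.0], [0.5, 0.5, 0.5, 0.5])
```

Computing these indices separately for the net-test and PAPAT groups would allow the two modes to be compared on the same footing.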

 

In discussing validities of CAMT, Bugbee (1996, p.286) stresses that,

"validity of computer-based versions of a test must be proved by the

developers." Thus, the onus is on the test developer to ensure

parallel internal and external validities of test forms, especially

when random item selection is used to create parallel test

forms. If these parallel tests fulfil the equality of ranks but not the

score distribution equivalence, then the scores need to be re-scaled.

Currently available software allows for automatic rescaling (Gibbs &

Lario, 1995).
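
One simple way to re-scale scores when rank order is preserved but the score distributions differ is linear equating, which matches the mean and standard deviation of one form to those of the other. The sketch below is an assumption about how such a rescaling could be done; it is not a description of the software cited above, and the function name `linear_equate` and the score lists are hypothetical.

```python
from statistics import mean, pstdev


def linear_equate(scores_x, scores_y):
    """Return a function that maps form-X scores onto the scale of form Y."""
    mean_x, sd_x = mean(scores_x), pstdev(scores_x)
    mean_y, sd_y = mean(scores_y), pstdev(scores_y)
    slope = sd_y / sd_x  # positive, so the rank order of examinees is unchanged

    def to_y_scale(x):
        return mean_y + slope * (x - mean_x)

    return to_y_scale


# Invented scores on two parallel forms taken by comparable groups.
form_x_scores = [10, 12, 14, 16, 18]
form_y_scores = [20, 25, 30, 35, 40]
to_y = linear_equate(form_x_scores, form_y_scores)
rescaled = [to_y(x) for x in form_x_scores]
```

After rescaling, the form-X scores have the same mean and spread as the form-Y scores, which is the distributional equivalence the paragraph above calls for.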

 

Hence, to reduce parallax errors between PAPAT and CAMT and to make

meaningful comparisons between them, all test form versions need to have

approximately equal validities, reliabilities and correlation with

criterion variables and equal correlation among multiple versions. The

current state of software control through HTML forms,

Common-Gateway-Interfaces (CGIs), Integrated-Developmental-Environment

(IDE) and Java-applets removes the complexities of adjusting the

psychometric properties and indices of net-tests and net-surveys.

 

Computer familiarity and scores on CAMT still require further research,

as current findings are inconclusive (Mazzeo et al., 1991; Perkins, 1993). Lee

(1986) showed that past computer experience significantly affected

performance on CAMT. As CAMT may discriminate against those who have

not had experience in CAMT, administrators need to take cognisance of

students’ previous experience (Heywood, 1989).

 

Perkins (1993) found that computer ownership lowered anxiety of

computer use. Further to this, he also found that computer use and

exposure to computers during Computer-based Instruction were not sufficient

to lower anxiety. This latter finding contradicts what was

reported by our students.

 

In the Multimedia Literacy course taught in the School of Education at

Flinders University, students indicated lowered anxiety levels during

CAMT and they attributed it to the frequent exposure to computer use.

All lecture materials and slides for the Multimedia Literacy course

were made available to the students through the department’s intranet

facility. Furthermore, all assignments and tutorials were done on

computers.

 

Students were given parallel versions of a PAPAT and CAMT and scores

for both these tests were identical. Most students indicated

‘acclimatisation’ to computers and had spent, on average, four to

nine hours a week on their assignments and projects. Practice on

model net-tests prior to the actual net-test appeared to have lowered

students’ anxiety. It can be argued that constant exposure to computers,

and teaching and learning through computers, especially through

flexible delivery, enhanced their performance on net-tests. The whole

structure and approach of the course, which differs from the

traditional definition of Computer-based Instruction, has helped

to lower high anxiety levels, leading to more positive attitudes towards

net-tests and better performance on them.

 

Open-book net-tests, in which students are allowed to browse the internet for

information and notes and to use textbooks and personal notes, may

further help remove net-test anxiety. With the emphasis on the processes

and skills of information use, it may be necessary to move in the

direction of open-book net-tests; this needs further exploration.

 

Setting up the Net-test

 

CGIs and IDE-links are the basis for all forms of net-tests and

net-surveys. The HTTP protocol used by the internet is generally a

one-way street, going from servers to clients. However, browsers can

send specific requests to the server. Thus there is also a

return path from the requester. CGIs function on this path (McComb, 1996). CGIs

pass data in two ways: one is URL-based and can be displayed readily,

while the other is hidden. The commonly used request methods are GET and

POST.

 

McComb (1996, p.550) indicates that "CGI programs that use the GET

method are generally easier to write, but the URL is limited to 256

characters. The POST method is ideal when lots of data has to be

provided by the client, and there are no restriction to the number of

character used."
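
The difference between the two methods can be illustrated with a short sketch. The field names and the URL are hypothetical. The same encoded form data is carried in the URL for GET, where it is visible, and in the request body for POST, where it is not.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical survey answers entered on an HTML form.
answers = {"student_id": "9601234", "q1": "b", "q2": "d"}
encoded = urlencode(answers)

# GET: the data is appended to the URL after '?', so it can be read
# (and bookmarked) by anyone who sees the address.
get_url = "http://example.edu/cgi-bin/survey?" + encoded
get_fields = parse_qs(urlsplit(get_url).query)

# POST: the same encoded string travels in the body of the request,
# so the URL stays short and the data is not displayed.
post_body = encoded.encode("ascii")
post_fields = parse_qs(post_body.decode("ascii"))
```

In both cases the program on the server ends up with the same fields, which is why a form can usually be switched from one method to the other without changing the scoring routines.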

 

Tied to these CGI functions are the necessary IDE-links that pass

commands from the CGIs to the databases that store the information.

Microsoft’s Access *.mdb database format was used as it provided

several advantages over other database engines (Garcia, 1997).

 

The CLT versions of net-tests and net-surveys have a similar form

structure. As illustrated in Figure 1, the form has a title and an

introduction, followed by the test questions or survey items. At

the end of these items are the "SEND FORM / SUBMIT" and "RESET ENTRIES"

buttons. Behind these buttons are the associated CGI functions. On

completing the net-survey or net-test, the user clicks on the first

button, which initiates the sequence of events summarised in

Figure 2.

 

 

[Diagram panels: CLIENT, SERVER, CLIENT]

 

Figure 2. Technical structure for net-tests and net-surveys

 

The front-end panel was a form designed using JavaScript and HyperText

Markup Language (HTML). This HTML form provides the user or client with

the necessary entry fields and instructions. When users ‘submit’ their

responses, the related CGI routines respond, through the server software,

by producing another form, this time acknowledging receipt of the user’s

information.

 

All the client’s information is captured in an Access database that has

fields corresponding to the headings of the test database and sits in the

directory where the CGI functions are located. Website Professional

Version (Beta) II server software was used to pass data between the

database engine, the compiled CGI functions and the form on the

internet. The IDE handled the transfer of all data received to the

database, which can be exported to a spreadsheet for data management

and analysis.
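
The capture-and-export step can be sketched as below. This is an illustration only: it uses Python's built-in SQLite engine as a stand-in for the Access database, and the table name, field names and acknowledgement text are hypothetical.

```python
import csv
import io
import sqlite3

FIELDS = ("student_id", "q1", "q2", "q3")  # one database field per form heading


def open_store():
    """Create an in-memory table whose columns mirror the form fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE responses (student_id TEXT, q1 TEXT, q2 TEXT, q3 TEXT)")
    return conn


def save_response(conn, form):
    """Store one submitted form and return the acknowledgement page."""
    conn.execute(
        "INSERT INTO responses (student_id, q1, q2, q3) VALUES (?, ?, ?, ?)",
        [form.get(name, "") for name in FIELDS],
    )
    conn.commit()
    return "<html><body><p>Thank you. Your responses have been received.</p></body></html>"


def export_csv(conn):
    """Dump all stored responses as CSV text that a spreadsheet can open."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(
        conn.execute("SELECT student_id, q1, q2, q3 FROM responses ORDER BY rowid")
    )
    return out.getvalue()


conn = open_store()
page = save_response(conn, {"student_id": "9601234", "q1": "b", "q2": "d", "q3": "a"})
sheet = export_csv(conn)
```

Because every submission lands in a table with one column per item, no re-keying is needed before the data reaches the spreadsheet.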

 

Net-tests can also have any level of password protection. In the

Multimedia Literacy course, a two-level password protection was used.

Students first had to enter their name and student identification

number, which were then passed to the server. The compiled routines

checked the received information against a database of students’

personal particulars. If and only if the information received and held

were identical were students allowed to proceed to their respective

tutorial/test group password, which was given out only when the whole class was

present in the computer laboratory. This was to prevent ‘outsiders’

hacking into the test system and to prevent cheating. Only when the

second-level password tallied with the examiner’s did students

get to sit their net-test. Figure 3 is an example of a one-level

password protection system, where all information is passed to and

checked by the server in one sweep.
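
The two-level check described above might be sketched as follows. The student record, group name and password are invented for the example, and the function names are hypothetical; the compiled routines used in the course are not reproduced here.

```python
import hmac

# Hypothetical data held on the server.
STUDENTS = {"9601234": "A. Student"}             # student ID -> registered name
GROUP_PASSWORDS = {"tutorial-A": "gum-tree-42"}  # announced in the laboratory


def check_identity(student_id, name):
    """Level 1: the name and ID received must match the particulars held."""
    held = STUDENTS.get(student_id)
    return held is not None and hmac.compare_digest(held.encode(), name.encode())


def check_group_password(group, password):
    """Level 2: the password must tally with the one set by the examiner."""
    held = GROUP_PASSWORDS.get(group)
    return held is not None and hmac.compare_digest(held.encode(), password.encode())


def may_sit_test(student_id, name, group, password):
    """A student reaches the net-test only after passing both levels in turn."""
    return check_identity(student_id, name) and check_group_password(group, password)
```

For example, `may_sit_test("9601234", "A. Student", "tutorial-A", "gum-tree-42")` succeeds, while a correct identity paired with a wrong group password does not. A one-level system would simply send all four values together and apply both checks in a single request.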

 

Figure 3. One-level password protection

 

 

In the CLT form of the net-test, a pre-determined number of items are

presented. The use of buttons and check-boxes for alternatives in

multiple-choice tests allows for easy checking and correction.

Free-response answers can also be collected through text-fields (Figure 4).

Unlike numeric responses which can be easily analysed on a spreadsheet,

free-response items need to be looked at individually by the examiner.
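
A minimal sketch of this split between machine-scored and examiner-marked items is given below. The answer key, field names and sample response are hypothetical.

```python
# Hypothetical key for the multiple-choice items; q4 is a free-response item.
ANSWER_KEY = {"q1": "b", "q2": "d", "q3": "a"}


def score_response(response, answer_key):
    """Score keyed items automatically and set aside the rest for the examiner."""
    score = sum(
        1 for item, correct in answer_key.items() if response.get(item) == correct
    )
    to_mark = {
        item: text
        for item, text in response.items()
        if item not in answer_key and item != "student_id"
    }
    return score, to_mark


response = {
    "student_id": "9601234",
    "q1": "b",
    "q2": "c",
    "q3": "a",
    "q4": "HTML forms send data to the server.",
}
score, to_mark = score_response(response, ANSWER_KEY)
```

Here two of the three keyed items are correct, and the text entered for `q4` is returned unchanged for individual marking.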

 

Figure 4. CLT-version of net-test (HTML Form)

 

The CAT-version poses greater technical challenges than the CLT-version

of the net-test illustrated in Figure 4. It is

both memory- and resource-intensive. Administering a CAT-version of a

net-test is comparable to going through a series of password levels.

Information is sent to and fro between the client and the server. Each response

of the client is checked by the compiled functions to

select the next item to suit the user’s ability level. Trials

have been done with a maximum of three students sitting a

CAT-version of a net-test, with the webserver running on a Pentium 100 MHz

machine with 16 MB of RAM. It was very slow. With current developments

towards faster machines and larger memory, it is hoped that the CAT-version of the

net-test will become commercially viable.
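
The item-by-item exchange can be sketched as a simple loop. The selection rule used in the trial is not stated, so this example assumes a common one: present the unused item whose difficulty is closest to the current ability estimate, then move the estimate up or down by a step that halves after each response. The item difficulties, the step sizes and the simulated examinee are all hypothetical.

```python
def next_item(ability, difficulties, used):
    """Pick the unused item whose difficulty is nearest the current estimate."""
    candidates = [item for item in difficulties if item not in used]
    return min(candidates, key=lambda item: abs(difficulties[item] - ability))


def update_ability(ability, correct, step):
    """Raise the estimate after a correct answer, lower it otherwise."""
    return ability + step if correct else ability - step


def run_cat(difficulties, answer_fn, n_items, start=0.0, step=1.0):
    """Administer n_items adaptively; each pass is one client-server round trip."""
    ability, used = start, []
    for _ in range(n_items):
        item = next_item(ability, difficulties, used)
        used.append(item)
        ability = update_ability(ability, answer_fn(item), step)
        step = max(step / 2, 0.125)  # smaller adjustments as evidence accumulates
    return ability, used


# Hypothetical item difficulties (in logits) and a simulated examinee who
# answers correctly whenever the item difficulty is at most 0.5.
DIFFICULTIES = {"A": -2.0, "B": -0.8, "C": 0.0, "D": 1.0, "E": 2.0}
ability, used = run_cat(DIFFICULTIES, lambda item: DIFFICULTIES[item] <= 0.5, n_items=3)
```

Each pass through the loop corresponds to one request and one reply, which is why the load on the server grows with both the number of students and the number of items.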

Data manipulation, analysis and Conclusion

 

Traditionally, all test responses have to be entered through the

keyboard. This may be a challenging task, especially if it involves a

large number of test-items and respondents. Further to the huge job of

entering the data, one has to also understand the software in which the

data is being entered. Data manipulation and analyses bring a further

challenge to the examiner or researcher. The process of data entry,

manipulation and analysis is simplified by net-tests.

 

As the response of the examinee / client goes directly into a

spreadsheet, many man-hours can be saved. The IDE-concept together with

the availability of developer’s versions of spreadsheets (Microsoft’s

Excel) enables the programmer to write routines for scoring and also

incorporate interactive functions for the examiner to print information

collected about the examinee. Similarly, immediate feedback can be

given through HTML forms to the client. The possibilities are there to

be explored.

 

Student assessment is an area of work which can be expensive,

especially in terms of examiners’ time and the amount of resources

used. Net-tests and net-surveys offer great potential savings in both

management time and the large amounts of paper used. Net-tests and

net-surveys are the way to go if we want to save the trees and,

through them, our own existence on Earth. As educators and researchers

shift gear into the era of net-tests and net-surveys, we can but

pray for technology to bring us through.

 

 

THE NET PRAYER

 

In you we trust, not only for information retrieval,

but also for assessing information digestion and assimilation.

 

Please do not fail us, for you are governed by Murphy’s Laws;

… your hiccups are our "DNS Failure",

… your sighs are our "URL not Found",

… your frustrations are our "Server not Found", and

… your indifferences are our "Still Connecting".

 

For if you do give up on us,

you may create regurgitation and future inhibitions,

both for you and your future descendants.

References

 

Alagumalai, S., Anderson, J., & Mala, V. (1997). Software evaluation –

A Pedagogic solution. Paper presented at the 11th Conference of the

Australian Association for Research in Education, Brisbane, Australia

(1-5 December, 1997).

 

Anderson, J., & Alagumalai, S. (1996). From text-based pedagogy to

cyber-based methodology. Paper presented at the 10th Joint Conference

of the Australian Association for Research in Education and the

Singapore Educational Research Association, Singapore (25-29 November,

1996).

 

Bugbee, A. C. (1996). The Equivalence of paper-and-pencil and

computer-based testing. Journal of Research on Computing in Education,

28(3), pp. 282-299.

 

Bunderson, C., Inouye, D., & Olsen, J. (1989). The four generations of

computerised education measurement. In Robert Linn (Ed.), Educational

Measurement, (3rd Ed.), pp. 367-407. NY: American Council of Education

/ Macmillan.

 

Eaves, C.R. (1984-1985). Educational Assessment in the United States.

Diagnostique, 10, pp. 5-39.

 

Garcia, J. (1997). Make your database sing. VisualBasic Programmer’s

Journal, 7(9), pp. 95-97.

 

Gibbs, W.J., & Lario, G.A. (1995). TestMaker: A computer-based test

development tool. In Proceedings of Association of Small Computer Users

in Education (ASCUE) Summer Conference. North Myrtle Beach, South

Carolina. (18-22 June, 1995)

 

Glowacki, M.L., et al. (1995). Developing computerised tests for

classroom teachers: A pilot study. Paper presented at the Annual

Meeting of the Mid-South Educational Research Association. Biloxi, MS.

(8-10 November, 1995).

 

Heywood, J. (1989). Assessment in Higher Education. Chichester: John

Wiley & Sons.

 

Hicken, S. (1993). Administering comprehensive examinations using

computers. Collegiate Microcomputer, 11(3), pp. 194-198.

 

Keeves, J.P. & Alagumalai, S. (in print). Advances in Measurement in

Science Education. In Fraser, B.J., & Tobin, K.G. (Eds.) International

Handbook of Science Education. Dordrecht, The Netherlands: Kluwer

Academic Publishers. pp. 1229-1244.

 

Kumar, D.D., & Helgeson, S.L. (1995). Trends in computer applications

in science assessment. Journal of Science Education and Technology,

4(1), pp. 29-36.

 

Lee, J.A. (1986). The effects of past computer experience on

computerised aptitude test performance. Educational and Psychological

Measurement, 46(3), pp. 723-733.

 

Lloyd, D., Martin, J.G., & McCaffery, K. (1996). The introduction of

computer-based testing on an engineering technology course. Assessment

and Evaluation in Higher Education, 21(1), pp. 83-90.

 

Mazzeo, J. E., et al. (1991). Comparability of computer and

paper-and-pencil scores for two CLEP General Examinations. College

Entrance Examination Board Report. NY: Educational Testing Service.

 

McComb, G. (1996). JavaScript Sourcebook. New York: John Wiley & Sons,

Inc.

 

Olson, J.B. (1986). Comparison and equating of paper-administered,

computer-administered and computerised-adaptive tests of achievement.

Paper presented at the Annual Meeting of the American Educational

Research Association, San Francisco, CA. (April, 1986).

 

Parshall, C., & Kromrey, J.D. (1993). Computer testing versus

paper-and-pencil testing: An analysis of examinee characteristics

associated with mode effect. Paper presented at the Annual Meeting of

the American Educational Research Association. Atlanta, GA. (12-16

April, 1993).

 

Perkins, B. (1993). Differences between computer-administered and

paper-administered computer anxiety and performance measure.

 

Russell, G.K., Peace, K.A., & Mellsop, G.W. (1986). The reliability of

a micro-computer administration of the MMPI. Journal of Clinical

Psychology, 42, pp. 120-122.

 

Sandals, L.H. (1992). An overview of the uses of computer-based

assessment and diagnosis. Canadian Journal of Educational

Communication, 21(1), pp. 67-78.

 

Sukigara, M. (1996). Equivalence between computer and booklet

administrations of the new Japanese version of the MMPI. Educational

and Psychological Measurement, 56(4), pp. 570-584.

 

Watson, C.G., Manifold, V., Klett, W.G., Brown, J., & Thomas, D.

(1990). Comparability of computer- and booklet-administered Minnesota

Multiphasic Personality Inventories among primarily chemical dependent

patients. Psychological Assessment: Journal of Consulting and Clinical

Psychology, 2, pp. 276-280.

 

Watson, C.G., Thomas, D., & Anderson, P.E. (1992). Do

computer-administered Minnesota Multiphasic Personality Inventories

underestimate booklet-based scores? Journal of Clinical Psychology, 48,

pp. 744-748.

 

 

 

 

 
