Every time we automate something new, it sparks both ecstasy and terror. That is true even for something as mundane as automated essay grading, which has been around for a while but was added to the EdX platform earlier this year.
The next important example is the Computerised Instrument for Writing Evaluation (CIWE, pronounced "kiwi"). It was designed in conjunction with the Carmel California Evaluation Centre, following work done on the Alaska Writing Programme (it is not clear whether this is the same event as the Alaska Assessment Project mentioned above). This method collected around 500 student essays in text form. The students ranged from grade school to high school to university applicants and graduate students. CIWE used 13 factors to grade an essay, of which the four most important were fluency, sentence development, word use (vocabulary), and paragraph development. However, as mentioned previously, the E-rater and the Intelligent Essay Assessor remain the leading automated essay graders to date.
Automated Essay Grading Using Machine Learning
The Intelligent Essay Assessor is a commercial implementation of the LSA approach. Later in this paper we discuss a trial of this system on first-year university student essays. Landauer et al. (1998) report that LSA has been tried with five scoring methods, each varying the manner in which student essays were compared with sample essays; primarily, the methods differed in how the cosines between the relevant vectors were computed. For each method, an LSA space was constructed from domain-specific material and the student essays. Foltz (1996) also reports that LSA grading is about as reliable as human graders. Landauer (1999) reports a further test on GMAT essays, where adjacent agreement with human graders was between 85% and 91%.

Larkey (1998) implemented an automated essay grading approach based on text categorisation techniques, text complexity features, and linear regression methods. The Information Retrieval literature discusses techniques for classifying documents according to their relevance to given retrieval queries (van Rijsbergen, 1979). Larkey's approach
".. is to train binary classifiers to distinguish "good" from "bad" essays, and use the scores output by the classifiers to rank essays and assign grades to them."