Stephen Downes

Knowledge, Learning, Community

It does sort of raise a Turing-test-type question for exam marking. Here is the study result: "The results demonstrated that overall, automated essay scoring was capable of producing scores similar to human scores for extended-response writing items with equal performance for both source-based and traditional writing genre." Now, is 'success' producing the same output as a human grader, or is success something else? If, for example, a teacher is supposed to be marking for content, but is instead responding subliminally to style, and the computer marks for style, and both human and computer return the same test score, is that a success? Or to put the same point another way: should we evaluate automated grading by comparing its results with human grading, or should we evaluate it against elements we know empirically are present in the material being evaluated?
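To make the two criteria concrete, here is a minimal sketch in Python, using entirely hypothetical data (scores on a 1-6 rubric and an invented "independent content coding"; none of it is drawn from the study). The first measure asks whether the machine agrees with the human grader, which is how AES studies typically define success; the second asks whether the machine tracks something we independently know is in the essays.

# A minimal sketch of the two success criteria discussed above.
# All scores, and the "independent content coding", are hypothetical.
from statistics import correlation            # Python 3.10+
from sklearn.metrics import cohen_kappa_score

# Hypothetical scores on a 1-6 rubric for ten essays.
human_scores   = [4, 3, 5, 2, 4, 6, 3, 4, 5, 2]
machine_scores = [4, 3, 5, 3, 4, 5, 3, 4, 5, 2]

# Criterion 1: agreement with the human grader.
# Quadratic weighted kappa is the agreement statistic AES studies usually report.
qwk = cohen_kappa_score(human_scores, machine_scores, weights="quadratic")
print(f"Agreement with human grader (QWK): {qwk:.2f}")

# Criterion 2: agreement with what is actually in the essays, e.g. an
# independent expert coding of content, regardless of what the (possibly
# style-driven) human grader awarded.
content_coding = [4, 2, 5, 2, 4, 6, 2, 4, 5, 3]   # hypothetical
r = correlation(machine_scores, content_coding)
print(f"Correlation with independent content coding: {r:.2f}")

On the first criterion a high kappa settles the question; on the second, the machine could agree perfectly with the human and still be judged a failure if both are responding to style rather than content.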


