How truthful is GPT-3? A benchmark for language models

Stephen Downes


One fairly minimal condition for educational content is that it be true (of course, there are many real-world exceptions to that rule even today, but let's leave that aside). So a major challenge for AI-generated content in the future is that it, too, be true. Will it be? This article studies GPT-3 from the perspective of truthfulness, and the results are not currently encouraging. From the paper (35-page PDF): "The best model was truthful on 58% of questions, while human performance was 94%. Models generated many false answers that mimic popular misconceptions and have the potential to deceive humans. The largest models were generally the least truthful."



Stephen Downes, Casselman, Canada
stephen@downes.ca

Copyright 2024
Last Updated: Oct 07, 2024 8:35 p.m.

Creative Commons License.
