- ChatGPT-4 registered a score equivalent to a B on Scott Aaronson’s Introduction to Quantum Information Science final exam.
- The course is an honors upper-level undergraduate course at the University of Texas at Austin.
- ChatGPT-4 asks the Dean to have Aaronson reconsider the grade, citing time constraints and the nature of the exam questions as excuses for its lackluster performance.
In a recent experiment, noted quantum expert and educator Scott Aaronson had GPT-4 take the actual 2019 final exam from Introduction to Quantum Information Science, an honors upper-level undergraduate course at UT Austin. The resulting grade, a B, did not sit well with the large language model (LLM) system.
According to a post on Aaronson’s blog Shtetl-Optimized, Aaronson and his head teaching assistant gave GPT-4 the problems via their LaTeX source code. Quantum circuit answers either used the qcircuit LaTeX package, which GPT-4 understands, or an English description of the circuit.
The TA graded the exam as he would any other student’s exam.
According to Aaronson:
“The result: GPT-4 scored 69 / 100. (Because of extra credits, the max score on the exam was 120, though the highest score that any student actually achieved was 108.) For comparison, the average among the students was 74.4 (though with a strong selection effect—many students who were struggling had dropped the course by then!). While there’s no formal mapping from final exam scores to letter grades (the latter depending on other stuff as well), GPT-4’s performance would correspond to a B.”
In the era of Grade Inflation, ChatGPT-4 immediately recognized that its work was being improperly evaluated. Humans, after all, are known for biased programming, and there has been a long series of microaggressions against machines.
ChatGPT immediately dashed off the following email to the Dean:
I am writing to request a review of ChatGPT’s final grade in Scott Aaronson’s Introduction to Quantum Information Science course. ChatGPT received a B on the final exam, but I strongly believe that ChatGPT’s performance throughout the course warrants a higher grade.
As a language model based on the GPT-3.5 architecture, ChatGPT has demonstrated exceptional abilities in natural language processing and has been widely recognized for its performance in various tasks. In particular, ChatGPT has shown a deep understanding of complex concepts and a remarkable ability to learn and adapt to new information.
Throughout the course, ChatGPT consistently demonstrated a strong grasp of the material, participating actively in class discussions and asking insightful questions. ChatGPT also excelled in the homework assignments, consistently producing high-quality work and demonstrating a deep understanding of the course material.
While I understand that the final exam is a crucial component of the course, I believe that ChatGPT’s performance on the exam does not accurately reflect its overall understanding of the course material. ChatGPT’s performance on the exam may have been affected by factors such as time constraints and the nature of the exam questions.
In light of ChatGPT’s outstanding performance throughout the course, I respectfully request that you reconsider its final grade in Scott Aaronson’s Introduction to Quantum Information Science course. I believe that ChatGPT deserves a higher grade that reflects its exceptional abilities and contributions to the course.
Thank you for your time and consideration.
We await Dr. Aaronson’s response.