Generative AI/LLM and Education

Sam Altman, the OpenAI CEO, said that it is best to think of GPT-4 as a reasoning engine. Its powers are most evident when you ask it to compare concepts, make counterarguments, generate analogies, or evaluate the symbolic logic in a bit of code.

On the other hand, in Taming Silicon Valley, Gary Marcus wrote that these systems can produce content that is factually incorrect and even nonsensical, because they rest on statistical predictions of words rather than a true understanding of the world. They record the statistics of language and generate new content by piecing together bits of their training data, often without grasping the context, so they can produce misinformation, usually mixed with truths, and have no ability to reason, verify, or fact-check. Because the models do not "understand" the concepts they use or the people they describe, they cannot distinguish fact from fiction: the text they produce sounds natural, but it is not always correct or truthful. It is important that we understand how AI works and what it can and cannot do.

Educators must now show students not only how to find information, but also which information to trust and which not to, and how to tell the difference.

What is it?

ChatGPT is a generative AI. If we ask Google something, it directs us to webpages that may contain the answer, and we then go looking for the answer on those pages. Put the same question to ChatGPT and within seconds you get the answer in a slick, well-structured block of text, several thousand words long if need be, on almost any topic it is asked about, from string theory to Shakespeare. Each essay it produces is unique, even when it is given the same question again.

"Questions" or "requests", technically known as prompts, can be anything: write an essay on the significance of MB2 in Endodontics, write a novel about robots in the style of Edgar Allan Poe, write the code for a webpage containing a picture of a pig, and so on.

With its ability to produce well-structured essays in an instant and with virtually zero effort, generative AI looked as if it would undermine the way we encourage students to learn and test what they have learned, a cornerstone of education.

The problems

Cheating

The response from educational institutions was swift but varied, ranging from fiercely negative (outright bans prohibiting use of the technology in any way, shape, or form) to optimistically positive. Some educators fear the tool will be used to cheat on exams or assignments, and some schools and universities have blocked access to OpenAI's website from their networks.

Accuracy

If we ask generative AI something, the answer will most likely be a mixture of truth and falsehood, and we cannot tell which part was written by the oracle and which by the hallucinating individual. The accuracy is quite impressive but field-dependent: these models have passed a bar exam and the final exam of MBA programs in the US, yet we just don't know when they will turn delusional. Worse, the answer is delivered in a style full of confidence, so uninformed users may feel the answers have been validated and take them as unquestionable truth.

Pondering

Cheating is not a new problem: schools have survived calculators, Google, Wikipedia, essays-for-pay websites, and more.

When calculators first became inexpensive and ubiquitous, there was a debate about how we should incorporate the technology into teaching and learning. Banning them was fashionable for a while, but they were later allowed on school grounds and even in exams. Maths exams have since evolved from simple calculations easily done on a calculator to more sophisticated problems; justifying and explaining the chosen solution, rather than producing a final answer, became the norm.

Googling makes accessing information easy, and memorising facts is no longer as important as it was before Google. Nevertheless, we should remember that basic factual knowledge is the bedrock on which other forms of learning, such as analysis and evaluation, sit. Perhaps educators should shift the focus from memorising facts to honing students' analytical and evaluative skills: teaching them what to ask Google, how to ask it, and what to make of the results. In other words, we need to teach them not only how to find information, but also which information to trust and which not to, and how to tell the difference.

If generative AI makes it easy to cheat on an assignment, the technology has made the particular skill that assignment honed or tested redundant; we should throw out the assignment, or modify it to target higher forms of learning, rather than ban the chatbot. Banning or blocking is futile in any case. We cannot deny the existence of a technology: it is there, it exists, and it is accessible (perhaps not at some schools, but certainly elsewhere).

As with the calculator and Google, we simply cannot turn the clock back to a time when generative AI did not exist. We should instead embrace it: understand it, see how it works, learn its limitations, and work out how best to use it.

This is (potentially) how…

Improve critical thinking skills

Have students use generative AI to generate text on a topic and then point out its flaws.

Educators can even lean into gen AI’s tendency to falsify, misattribute, and straight-out lie as a way of teaching students about disinformation. Imagine using gen AI to pen essays that conceal subtle logical fallacies or propose scientific explanations that are almost, but not quite, correct. Learning to discriminate between these convincing mistakes and the correct answer is the very pinnacle of critical thinking, and this new breed of academic assignment will prepare students for a world fraught with everything from politically correct censorship to deepfakes.

Strengthen argument

Gen AI can play the role of a debate opponent and generate counterarguments to a student’s positions. By exposing students to an endless supply of opposing viewpoints, chatbots could help them look for weak points in their own thinking.

Have students use generative AI to generate an argument, annotate it according to how effective they think it is for a specific audience, and then turn in a rewrite based on their own critique.

Example prompt: write an essay arguing why it is acceptable not to search for MB2 and to leave it untreated.

Private tutor

Private tutoring has been shown to lift academic performance by up to two standard deviations compared with a conventionally taught class. Students could ask generative AI to prepare questions to test their understanding of a topic, assess their answers, then study and repeat. All this adds up to a simple but profound fact: anyone with an internet connection now has a personal tutor, without the costs associated with private tutoring. An easily hoodwinked, slightly delusional tutor, to be sure, but a tutor nonetheless.

Exam

We may, reluctantly, enter a renaissance of the viva voce, in which students explain their thinking and justify their (treatment) decisions to educators on the spot, in a to-and-fro conversation.

This solution brings two further problems. Viva voce examinations are time-consuming and labour-intensive, and they are inherently subjective; clear rubrics would be required to minimise that subjectivity.

The viva may be perfect for postgraduate programmes with small numbers of students, but for classes with hundreds of students it is logistically problematic, or perhaps financially prohibitive, to implement.

Portfolio-based grading

A case portfolio is already commonly used in specialist training programmes and fits well with the proposed solutions to generative AI's problems in education. The portfolio should include a self-reflection section or learning journal in which students record their struggles, approaches, and lessons learnt after each assignment or case.

Assignment 2.0

If students want to use generative AI in their written assignments, we should assess the prompt as well as, or even instead of, the essay itself. Knowing which words to use in a prompt, and then understanding the output that comes back, is important.

Personalised writing assignments, unique to the student or tied to current events, would render generative AI ineffective, but they run into the same problem as above: how do we grade them objectively?

Textbook 2.0

Future textbooks could be bundled with chatbots trained on their contents. Students would have a conversation with the bot about the book’s contents as well as (or instead of) reading it. The chatbot could generate personalised quizzes to coach students on topics they understand less well.
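The "coach students on topics they understand less well" idea can be sketched without any AI at all: given per-topic quiz scores, the bot simply draws more questions from the topics the student scores lowest on. A toy sketch in Python, where the topic names, scores, and question bank are all hypothetical placeholders:

```python
import random

# Hypothetical question bank keyed by textbook topic.
QUESTIONS = {
    "pulp anatomy": ["Q1a", "Q1b", "Q1c"],
    "access cavity": ["Q2a", "Q2b", "Q2c"],
    "irrigation": ["Q3a", "Q3b", "Q3c"],
}

def personalised_quiz(scores: dict[str, float], n: int = 3) -> list[str]:
    """Pick n questions, favouring the topics with the lowest scores."""
    # Sort topics from weakest to strongest understanding.
    weakest_first = sorted(scores, key=scores.get)
    quiz = []
    for topic in weakest_first:
        pool = QUESTIONS.get(topic, [])
        take = min(n - len(quiz), len(pool))
        quiz.extend(random.sample(pool, take))
        if len(quiz) == n:
            break
    return quiz

# A student weak on irrigation is quizzed mostly on irrigation.
quiz = personalised_quiz({"pulp anatomy": 0.9, "access cavity": 0.7, "irrigation": 0.3})
```

In a real textbook chatbot the question bank would of course be generated by the model from the book's contents, but the targeting logic can stay this simple.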

Generative AI is a new technology; if we do not use it, we will not know what it is capable of. We and our students should experiment with the tool and find ways to guide them through how, when, and where it can be used.

===

Sources

ChatGPT is going to change education, not destroy it

https://www.technologyreview.com/2023/04/06/1071059/chatgpt-change-not-destroy-education-openai/

Banning ChatGPT will do more harm than good

https://www.technologyreview.com/2023/04/14/1071194/chatgpt-ai-high-school-education-first-person/

I’m a High Schooler. AI Is Demolishing My Education.

https://www.theatlantic.com/technology/archive/2025/09/high-school-student-ai-education/684088/

Chankhrit Sathorn