Last month, OpenAI launched its newest AI chatbot product, GPT-4. According to the folks at OpenAI, the bot, which uses machine learning to generate natural language text, passed the bar exam with a score in the 90th percentile, passed 13 of 15 AP exams and got a nearly perfect score on the GRE Verbal test.
Inquiring minds at BYU and 186 other universities wanted to know how OpenAI’s tech would fare on accounting exams. So, they put the original version, ChatGPT, to the test. The researchers say that while it still has work to do in the realm of accounting, it is a game changer that will change the way everyone teaches and learns, and for the better.
“When this technology first came out, everyone was worried that students could now use it to cheat,” said lead study author David Wood, a BYU professor of accounting. “But opportunities to cheat have always existed. So for us, we’re trying to focus on what we can do with this technology now that we couldn’t do before to improve the teaching process for faculty and the learning process for students. Testing it out was eye-opening.”
Since its debut in November 2022, ChatGPT has become the fastest-growing technology platform ever, reaching 100 million users in under two months. In response to intense debate about how models like ChatGPT should factor into education, Wood decided to recruit as many professors as possible to see how the AI fared against actual university accounting students.
His co-author recruiting pitch on social media exploded: 327 co-authors from 186 educational institutions in 14 countries participated in the research, contributing 25,181 classroom accounting exam questions. They also recruited undergraduate BYU students (including Wood’s daughter, Jessica) to feed another 2,268 textbook test bank questions to ChatGPT. The questions covered accounting information systems (AIS), auditing, financial accounting, managerial accounting and tax, and varied in difficulty and type (true/false, multiple choice, short answer, etc.).
Although ChatGPT’s performance was impressive, the students performed better. Students scored an overall average of 76.7%, compared to ChatGPT’s score of 47.4%. On 11.3% of questions, ChatGPT scored higher than the student average, doing particularly well on AIS and auditing. But the AI bot did worse on tax, financial, and managerial assessments, possibly because ChatGPT struggled with the mathematical processes those question types require.
When it came to question type, ChatGPT did better on true/false questions (68.7% correct) and multiple-choice questions (59.5%), but struggled with short-answer questions (between 28.7% and 39.1%). In general, higher-order questions were harder for ChatGPT to answer. In fact, sometimes ChatGPT would provide authoritative written descriptions for incorrect answers, or answer the same question in different ways.
“It’s not perfect; you’re not going to be using it for everything,” said Jessica Wood, currently a freshman at BYU. “Trying to learn solely by using ChatGPT is a fool’s errand.”
The researchers also uncovered some other fascinating trends through the study, including:
- ChatGPT doesn’t always recognize when it is doing math and makes nonsensical errors such as adding two numbers in a subtraction problem, or dividing numbers incorrectly.
- ChatGPT often provides explanations for its answers, even if they are incorrect. Other times, ChatGPT’s descriptions are accurate, but it will then proceed to select the wrong multiple-choice answer.
- ChatGPT sometimes makes up facts. For example, when providing a reference, it generates a real-looking reference that is completely fabricated. The work and sometimes the authors do not even exist.
That said, the authors fully expect GPT-4 to improve exponentially on the accounting questions posed in their study, and on the issues mentioned above. What they find most promising is how the chatbot can help improve teaching and learning, including the ability to design and test assignments, or perhaps be used for drafting portions of a project.
“It’s an opportunity to reflect on whether we are teaching value-added information or not,” said study coauthor and fellow BYU accounting professor Melissa Larson. “This is a disruption, and we need to assess where we go from here. Of course, I’m still going to have TAs, but this is going to force us to use them in different ways.”