Because large language models work by predicting the next word in a sentence, they are more likely to use common words like “the,” “it,” or “is” instead of wonky, rare words. This is exactly the kind of text that automated detector systems are good at picking up, Ippolito and a team of researchers at Google found in research they published in 2019.
But Ippolito’s study also showed something interesting: the human participants tended to think this kind of “clean” text looked better and contained fewer mistakes, and thus that it must have been written by a person.
In reality, human-written text is riddled with typos and is incredibly variable, incorporating different styles and slang, while “language models very, very rarely make typos. They’re much better at generating perfect texts,” Ippolito says.
“A typo in the text is actually a really good indicator that it was human written,” she adds.
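As a rough illustration of the common-word signal, the sketch below measures what share of a text's words come from a short list of very frequent English words. The word list, the tokenizer, and the function name are assumptions made for illustration; this is not the method used in the 2019 research, and a real detector would more likely rely on a model's word probabilities than on a fixed list.

```python
import re

# Hypothetical, hand-picked list of very frequent English words (an assumption).
COMMON_WORDS = {"the", "it", "is", "a", "an", "of", "and", "to", "in", "that"}


def common_word_share(text: str) -> float:
    """Return the fraction of word tokens that appear in COMMON_WORDS."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    hits = sum(1 for token in tokens if token in COMMON_WORDS)
    return hits / len(tokens)
```

A higher share is at most a weak hint, and it becomes less reliable the shorter the text is.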
Large language models themselves can also be used to detect AI-generated text. One of the most successful ways to do this is to retrain the model on some texts written by humans, and others created by machines, so it learns to differentiate between the two, says Muhammad Abdul-Mageed, who is the Canada research chair in natural-language processing and machine learning at the University of British Columbia and has studied detection.
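The training idea can be sketched at toy scale. The example below does not retrain a large language model; it swaps in a tiny bag-of-words naive Bayes classifier, which is an assumption made so the code can run on its own. The class name `TinyDetector` and the labels "human" and "machine" are hypothetical. What it shares with the approach Abdul-Mageed describes is only the setup: labeled examples of both kinds go in, and a model that tells them apart comes out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyDetector:
    """Naive Bayes stand-in for a detector trained on labeled text (illustrative only)."""

    LABELS = ("human", "machine")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, examples):
        """examples: iterable of (text, label) pairs, label in LABELS."""
        for text, label in examples:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))

    def predict(self, text):
        """Return the label with the higher smoothed log-probability."""
        vocab = set(self.word_counts["human"]) | set(self.word_counts["machine"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.LABELS:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Add-one smoothing on both the prior and the word likelihoods.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for token in tokenize(text):
                score += math.log((counts[token] + 1) / (total_words + len(vocab) + 1))
            scores[label] = score
        return max(scores, key=scores.get)
```

Calling `train` with a list of `(text, label)` pairs and then `predict` on new text returns one of the two labels. With only a handful of examples the result means very little; the point is the shape of the pipeline, not its accuracy.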
Scott Aaronson, a computer scientist at the University of Texas on secondment as a researcher at OpenAI for a year, meanwhile, has been developing watermarks for longer pieces of text generated by models such as GPT-3: “an otherwise unnoticeable secret signal in its choices of words, which you can use to prove later that, yes, this came from GPT,” he writes in his blog.
A spokesperson for OpenAI confirmed that the company is working on watermarks, and said its policies state that users should clearly indicate text generated by AI “in a way no one could reasonably miss or misunderstand.”
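To give a feel for how a “secret signal in its choices of words” could work in principle, here is a simplified sketch. It is not Aaronson's or OpenAI's scheme, whose details are not described here. It assumes a secret key and a keyed hash that marks roughly half of all word pairs as “green”; a generator holding the key prefers green words, and anyone with the same key can later count how many word pairs are green. All function names are hypothetical.

```python
import hashlib
import hmac


def is_green(key: bytes, prev_word: str, word: str) -> bool:
    """Keyed pseudorandom coin flip on a word pair; about half of pairs are green."""
    message = f"{prev_word}\x00{word}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return digest[0] % 2 == 0


def choose_word(key: bytes, prev_word: str, candidates: list) -> str:
    """Generator side: pick the first green candidate, else fall back to the first."""
    for word in candidates:
        if is_green(key, prev_word, word):
            return word
    return candidates[0]


def green_fraction(key: bytes, words: list) -> float:
    """Detector side: share of adjacent word pairs that are green under this key."""
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(key, a, b) for a, b in pairs) / len(pairs)
```

In this sketch, ordinary text should score near 0.5 under any key, while text produced with `choose_word` scores well above that under the right key. The gap only becomes convincing over many words, which fits the point that such watermarks are aimed at longer pieces of text.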
But these technical fixes come with big caveats. Most of them don’t stand a chance against the latest generation of AI language models, as they are built on GPT-2 or other earlier models. Many of these detection tools work best when there is a lot of text available; they will be less efficient in some concrete use cases, like chatbots or email assistants, which rely on shorter conversations and provide less data to analyze. And using large language models for detection also requires powerful computers, and access to the AI model itself, which tech companies don’t allow, Abdul-Mageed says.