Here’s a groaner for you: The greyhound stopped to get a hare cut.
Don’t blame dad for this one. Blame the machines.
A pun generator might not sound like serious work for an artificial intelligence researcher—more the sort of thing knocked out over the weekend to delight the labmates come Monday. But for He He, who designed just that during her postdoc at Stanford, it’s an entry point to a devilish problem in machine learning. He’s aim is to build AI that’s natural and fun to talk to—bots that don’t just read us the news or tell us the weather, but can crack jokes or compose a poem, even tell a compelling story. But getting there, she says, runs up against the limits of how AI typically learns.
Gregory Barber covers cryptocurrency, blockchain, and artificial intelligence for WIRED.
Neural networks are natural imitators, learning patterns of language by scouring vast amounts of text. If coherency is your aim, that approach works well—so well, in fact, that recent advances have sparked an ethical debate about whether people could abuse AI to generate convincing fake news. But the resulting prose is as dry as the newspaper text and Wikipedia articles typically used to train such networks. Neural networks, in other words, are rule-abiding to a fault, and that makes them terrible jokers. A well-crafted joke teeters at the edge of coherency without wading into nonsense, He says, and neural networks simply don’t have the sense to strike that balance. Besides, the whole point of creativity is to be, well, novel. “Even if we had a long list of puns it could learn from, that would miss the point,” she says.
Instead, He and her team, which included Nanyun Peng and Percy Liang, tried to give their AI some creative wit, using insights from humor theory. To anyone who’s dared craft a pun, the intuition will sound familiar. For a pun to work, He decided it needs to be surprising in a local context (“stopped to get a hare cut” makes little sense on its own) but also have an “aha” factor that ties it all together (in this case, thanks to the word “greyhound”). He and her team anoint this tension with proper academese: the “local-global surprisal principle.” To make a pun, the neural network is given a pair of homophones (hair/hare) and generates a sentence that’s ordinary with the first word, but elicits surprise when the second word is swapped in. Then, to pull it back from the cusp of gibberish, it inserts another word that gives the overall sentence a bit more logic.
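The trade-off described above can be sketched numerically. The probabilities below are invented purely for illustration (He's actual system scores candidates with a trained language model); the sketch just shows how a word that is surprising in its local context but expected given the whole sentence comes out ahead:

```python
import math

# Toy conditional probabilities, invented for illustration only.
P_LOCAL = {            # P(word | "stopped to get a ___ cut")
    "hair": 0.50,      # expected locally
    "hare": 0.002,     # surprising locally
}
P_GLOBAL = {           # P(word | full sentence, which mentions "greyhound")
    "hair": 0.05,      # the dog makes "hair" less fitting overall
    "hare": 0.20,      # "greyhound" ties the pun together
}

def surprisal(p):
    """Surprisal in bits: low-probability words are more surprising."""
    return -math.log2(p)

def pun_score(word):
    # Local-global surprisal principle: reward words that are surprising
    # in the local context but expected given the global sentence.
    return surprisal(P_LOCAL[word]) - surprisal(P_GLOBAL[word])

for w in ("hair", "hare"):
    print(w, round(pun_score(w), 2))   # "hare" scores much higher
```

With these toy numbers, "hare" earns a large positive score while "hair" goes negative, which is exactly the tension the principle is built to capture.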
Next, He staged a pun contest, pitting the AI against (human) humorists. According to the crowdworkers who rated the puns, the results were … not great for the machines, at least by human standards. While He’s system produced puns that were much funnier than a previous AI-driven attempt, it only beat the humans 10 percent of the time. Plus, the puns were stuck in a rather rudimentary structure (and struggled at times with grammar). Some examples:
That’s because negotiator got my car back to me in one peace.
Even from the outside, I could tell that he’d already lost some wait.
Well, gourmet did it, he thought, it’d butter be right.
“We’re nowhere near solving this,” He says.
Still, Roger Levy, director of MIT’s computational psycholinguistics lab, says the approach is a promising step toward building AI with a bit more personality. “Humor is an intrinsically challenging aspect of studying the mind. But it’s also fundamental to what makes us human,” he says. Four years ago, Levy described a computational approach to predicting whether a pun is funny—work that would eventually become the foundation for He’s joke-generation method. Levy says he had planned on testing something like the local-global surprisal principle, which is more fine-tuned than the theories used in his paper. The concept made sense, intuitively, but he didn’t yet have the data to prove it. “It’s really cool to see that actually pan out,” he says.
More broadly, the humor research highlights the need to bring more human intelligence to neural nets, Levy says. Recently, he’s been using surprise as a way to study other aspects of how AI understands language. “Surprisal is one of the most central concepts in both AI and cognitive science,” Levy says. In humans, it reflects when we encounter new or unexpected information, and can be measured with a proxy, like tracking eye movements as we read. In machines, it’s measured with probabilities—a word with a lower probability in a given context is more surprising.
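In practice that relationship is usually written as surprisal = −log P(word | context). A minimal sketch, with probabilities chosen purely for illustration (a real system would read them off a language model):

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits: the negative log probability of a word in context."""
    return -math.log2(prob)

# Illustrative probabilities, not from a real model: after "stopped to
# get a hair...", "cut" is likely, while the homophone "hare" is not.
p_expected = 0.30     # hypothetical P("cut" | context)
p_surprising = 0.001  # hypothetical P("hare" | context)

print(surprisal(p_expected))    # small: the word was expected
print(surprisal(p_surprising))  # large: the word was surprising
```

The logarithm means surprisal grows smoothly as probability shrinks, so halving a word's probability always adds exactly one bit of surprise.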
That makes surprise a handy way of comparing how human brains and machines reason their way through language—a way of probing the inner workings of our respective black boxes. By subjecting neural networks to a set of psycholinguistic tests intended to study how humans handle ambiguous language, Levy found he could begin to see where the machines were unexpectedly set off-kilter or blew past challenges in un-humanlike ways. Adjusting for those differences, he says, could be the key to designing AI with more humanlike behavior.
In the meantime, He says she hopes to apply her general pun approach to more difficult creative tasks, like storytelling. The idea, she says, is to let the neural network do what it’s good at and then edit the result with human intelligence. A neural network could be trained to generate a dull string of perfectly coherent sentences, for example, and then learn to edit that output into a creative short story based on theories of narrative. “The goal is to make stories that are more creative and interesting,” He says. “I want AI to write stories about things humans wouldn’t think to write about.”