What do you get when you cross Chinese AI with comedy?

For all the things artificial intelligence can do, one area where it has consistently struggled is cracking jokes.

But a team of Chinese scientists is trying to solve the problem in collaboration with their international peers, developing a model that can crack its own jokes and comment on images it encounters in the way many internet users do.

That may highlight the basic problem they face: humour is entirely subjective, depends on context and cultural factors, and is notoriously hard to translate.

But while admitting “the humour generated here may hit differently for everyone”, the researchers from Sun Yat-sen University in southern China, with collaborators from Singapore Management University and Harvard University, are hoping to one day create a model that can boost creativity.

In a survey of 154 internet users, the AI’s jokes in English, Japanese and Mandarin Chinese were judged funnier than those produced by comparable models such as LLaVA-1.5 by Microsoft Research and GPT-4V by OpenAI.

Given the prompt “What a baby might be thinking when lifted up high?”, the model answered: “Now I have the urge to pee.”

And when shown the poster for the movie Forrest Gump, the model’s response was: “Sorry, the spot is taken.”

Oogiri jokes generated by AI. Photo: Handout

The problem with trying to use AI to generate humour is that traditional large language models employ logic-based chain-of-thought (CoT) reasoning, which is effective for mathematical or scientific tasks but struggles with creativity.

Researchers addressed this by developing a “Leap-of-Thought” (LoT) capability, which fosters innovation by linking disparate concepts and making intellectual jumps.

“[The idea is] thinking outside the box, which bridges disparate ideas and facilitates conceptual leaps. Embracing LLMs with a strong LoT ability can unlock significant potential for innovation, contributing to advancements in creative applications,” the researchers said in a paper first presented at last year’s CVPR, a conference on computer vision and pattern recognition.

A revised version was uploaded to the open-access archive website arXiv this April.

The team drew inspiration from the whimsical and imaginative Japanese game Oogiri, where players are challenged to come up with creative and humorous responses to pictures.

The team created a data set named Oogiri-GO, which contains over 130,000 samples in Chinese, Japanese and English, featuring humorous responses sourced from the internet.

By using a method called Associable Instruction Tuning and the model’s self-refinement, the researchers trained Alibaba’s Qwen model, endowing it with human-like leap-of-thought thinking that allowed it to respond to images and texts.

Alibaba is the parent company of the South China Morning Post.

In one example, the model captioned a picture of two brown dogs with a white one in the middle under a green blanket as “Matcha Oreo”. In another, a picture of a child and a dog apparently having a tearful argument was captioned: “Is there another dog in your life?!”

The research team believes the model could go on to develop smarter, more entertaining interactive content, such as automatically generating humorous sketches or writing creative scripts.

“To the best of our knowledge, we are the first to profoundly explore the Leap-of-Thought capability in multimodal large language models,” the paper said.

The team hopes this research could signify a further blurring of the boundaries between humans and AI, saying: “The LoT ability serves as a cornerstone for creative exploration and discovery in LLMs.”
