Friends AI — A Simulated Sitcom with GPT4
For our latest research project, we put the cast from Friends into the Simulation to generate infinite episodes.
Our goal was to compare the quality of AI-generated scripts with those of real episodes to see where state-of-the-art large language models fall short and where the simulation could step in. We deliberately kept the visual fidelity low so that the system can be adapted easily to other types of shows.
It’s important to use well-known, recognizable characters to judge the results. Is this really Chandler making a joke? Is this something Phoebe would say? Is Joey’s reaction true to his character?
Watch the generated 15-min episode of Friends AI below.
Also, if you missed the media buzz around AI Seinfeld, you should definitely check it out. It brilliantly shows the creative potential and pitfalls of the technology with no human in the loop. And just recently, the cast of M*A*S*H took ChatGPT for a spin to author new scenes, as reported by the NY Times. Yet another project is the anime series Always Break Time on Twitch: https://www.twitch.tv/alwaysbreaktime
To make Friends AI, we generated basic avatars that somewhat resemble the original cast so that they can interact with their virtual world. Next, we paired these AI actors with fine-tuned GPT3 models and with GPT4, which landed just in time. The simulation then generates episodes, including the title, synopsis, scenes, beats and dialogue. Lastly, the AI Director and Show Runner systems we developed internally make sure everything is presented to us in a fashion similar to the original TV show we all love.
Generative AI
Sitcoms can show off a longevity rarely seen in other formats. Episodes are often self-contained, character-centric and well written. A sense of progression is then injected through changing relationships and the new challenges each character is confronted with in their life.
AI sitcoms have yet to prove that they can build a devoted fanbase. They have a hard time because they are either derivatives of something that already exists, overabundant 24/7 streams, or simply lacking in character.
Right now, we are witnessing a radical change in the creative process. We believe AI will blur the line between authors and fans and change how IP owners are organized. It allows for massively co-created and adaptive experiences.
However, large language models (including the latest GPT4) have some known drawbacks in the context of dialogue generation and storytelling:
Coherence and Memory: The system forgets the overall plot and character intentions over time.
Hallucination and inappropriate content: The system comes up with random topics or characters that don’t exist, and it goes off the rails.
Non sequiturs and timing: The system does not know how a scene should advance or when it should end with a punchline.
Let’s look at two approaches to mitigate some of these problems: Beats and Fine-Tuning.
Units of Entertainment
You’ve probably heard of acts, scenes and maybe even sequences, but as any stand-up comedian will tell you, the beat is what keeps the audience hanging on their every word.
A beat sheet breaks a scene (or act or sequence) down into smaller moments long before the author focuses on specific dialogue or the actor gets to improvise. Through beats, authors have the power to turn the world of their characters upside down within seconds.
Fine-tuning GPT3
We used the following Friends dataset to create a fine-tuned model that generates beats per scene:
https://github.com/emorynlp/character-mining
What’s important here is that the dialogue is already broken down into scenes, because we first had to augment the dataset scene by scene. We asked GPT to generate beats for every real episode, so that we ended up with new synthesized data that nicely aligns with each episode’s synopsis and dialogue.
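The pairing step can be sketched in a few lines of Python. This is a minimal sketch, not the tooling used for the project: the function names `build_record` and `to_jsonl`, their parameters, and the exact placement of the `DIALOGUE` marker are assumptions. Only the `prompt` and `completion` field names and the `SEASON`/`EPISODE`/`SCENE` headers are taken from the training sample shown in this post.

```python
import json


def build_record(season, episode, scene, synopsis, dialogue_lines, beats):
    """Pair a scene's synopsis and dialogue (prompt) with its beats (completion).

    Hypothetical helper: the header layout is an assumption modeled on the
    abridged training sample in this post.
    """
    prompt = (
        f"SEASON {season:02d}\nEPISODE {episode:02d}\n{synopsis}\n"
        "DIALOGUE\n" + "\n".join(dialogue_lines)
    )
    # One beat per line, each wrapped in parentheses.
    completion = f"\nSCENE {scene:02d}\n" + "\n".join(f"({b})" for b in beats)
    return {"prompt": prompt, "completion": completion}


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)
```

Each scene then becomes one line of the training file, with the synthesized beats as the target the model learns to produce.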
Now we are able to train a new model that mimics the plot points, narrative threads and arcs within a typical Friends scene.
Here is an abridged example of the training data (1 of 326):
{"prompt":"SEASON 01\nEPISODE 01\nMonica and the gang introduce Rachel to the 'real world' after she leaves her fiance at the altar. Ross struggles with his rediscovered feelings for Rachel.\nDIALOGUE\nMonica: There's nothing to tell! He's just some guy I work with![...]
"completion":"\nSCENE 01\n(Monica introduces Rachel to the group)\n(Joey suggests Ross go to a strip joint)\n(Rachel explains why she left her wedding at the last minute)\n(Ross reveals he told his parents about Carol)\n(Rachel explains why she came to Monica's apartment)\n[...]
And this is an example of generated beats for a new episode using the fine-tuned model:
(Joey tells Chandler that he has to audition for a part that requires him to play a man with three breasts.) (Joey explains that he will audition by reciting a scene from Twelfth Night.) (Ross interrupts, asking if he can borrow a book about dating.) (Chandler and Joey tell Ross to wait until they are done and then he can have the book.) (Joey stands in the living room, holding a script while Chandler lounges on the couch.)
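Before beats like these can drive a scene, the raw model output has to be split into individual beats. The following is a minimal sketch, assuming every beat is wrapped in one pair of parentheses with no nesting, as in the output above. The names `BEAT_PATTERN` and `parse_beats` are hypothetical and not part of the project.

```python
import re

# Matches one parenthesized beat; nested parentheses are not supported.
BEAT_PATTERN = re.compile(r"\(([^()]*)\)")


def parse_beats(raw):
    """Return the text of every parenthesized beat, in order of appearance."""
    return [m.strip() for m in BEAT_PATTERN.findall(raw) if m.strip()]
```

Text outside parentheses is ignored, so stray tokens the model emits between beats do not end up in the beat list.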
Show Runner
The show runner system defines the fundamental show structure:
Type of Show
Number of seasons
Number of episodes per season
Number of scenes per episode
Cast, Location (Sets) etc. per scene
Number of beats per scene
Scene Data and Scene State
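A show structure like this could be modeled with a few plain data classes. This is only a sketch: the class and field names are our own invention for illustration, not the names used in the actual Show Runner system.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    cast: list[str]
    location: str  # the set the scene plays on
    beats: list[str] = field(default_factory=list)
    state: dict[str, str] = field(default_factory=dict)  # scene data / scene state


@dataclass
class Episode:
    title: str
    synopsis: str
    scenes: list[Scene] = field(default_factory=list)


@dataclass
class Show:
    show_type: str
    seasons: int
    episodes_per_season: int
    scenes_per_episode: int
    beats_per_scene: int

    def check(self, episode: Episode) -> list[str]:
        """Return a list of problems; an empty list means the episode fits."""
        problems = []
        if len(episode.scenes) != self.scenes_per_episode:
            problems.append(
                f"expected {self.scenes_per_episode} scenes, got {len(episode.scenes)}"
            )
        for i, scene in enumerate(episode.scenes, start=1):
            if len(scene.beats) != self.beats_per_scene:
                problems.append(
                    f"scene {i}: expected {self.beats_per_scene} beats, "
                    f"got {len(scene.beats)}"
                )
        return problems
```

Keeping the structure separate from the generated content makes it possible to reject or regenerate an episode that does not match the configured format.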
The AI director and staging system we developed in Unity makes sure all actors and cameras are on their marks for the scene. It also injects emotional tags and actions into the prompt to further enhance the dialogue generation.
This allows for highly variable and potentially interactive episodes with just the beat structure as the backbone.
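Injecting emotional tags and actions into the prompt could look roughly like the sketch below. The tag syntax, the prompt wording and the function name `build_beat_prompt` are assumptions made for illustration. The real system lives in Unity and is not shown in this post.

```python
def build_beat_prompt(beat, location, actors):
    """Render one beat plus staging state as plain text for the dialogue model.

    `actors` maps a character name to a dict with optional "emotion" and
    "action" keys (hypothetical stand-ins for the director's tags).
    """
    lines = [f"LOCATION: {location}", f"BEAT: ({beat})", "CAST:"]
    for name in sorted(actors):  # sorted for a stable, reproducible prompt
        tags = actors[name]
        entry = f"- {name} [emotion: {tags.get('emotion', 'neutral')}]"
        action = tags.get("action")
        if action:
            entry += f" [action: {action}]"
        lines.append(entry)
    lines.append("Write the next lines of dialogue for this beat.")
    return "\n".join(lines)
```

Because the tags come from the simulation state, changing an actor's emotion or action changes the prompt and therefore the generated dialogue, while the beat itself stays fixed.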
Dialogue generation with GPT4
Our attempts to use a fine-tuned dialogue model based on the show’s cast were promising when compared to GPT3, but in the end the fine-tuned model did not yield better results than the latest GPT4 model. GPT4 mitigates most of the coherence and hallucination issues mentioned above. It also has a good understanding of the characters in Friends because, like its predecessors, it was already trained on the same data.
So all that was needed was the beat structure we generate during scene changes and the simulation data to guide it through the episode.
The dialogue is then generated on the fly. For audio, we used the ElevenLabs API because of its good quality, easy voice cloning and overall speed.
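Generating dialogue beat by beat can be sketched as a simple loop that feeds the most recent lines back in as context. This is an illustrative sketch, not the project's implementation: `run_scene`, `history_size` and the `generate(beat, context)` callable are hypothetical, and the callable stands in for whatever language model call is actually used.

```python
from collections import deque


def run_scene(beats, generate, history_size=6):
    """Generate dialogue beat by beat, passing recent lines along as context.

    `generate(beat, context)` is a hypothetical callable that wraps the
    language model and returns a list of dialogue lines for that beat.
    """
    history = deque(maxlen=history_size)  # keeps only the most recent lines
    script = []
    for beat in beats:
        new_lines = generate(beat, list(history))
        script.extend(new_lines)
        history.extend(new_lines)
    return script
```

Passing the model in as a callable keeps the loop testable without network access, and the bounded history limits how much earlier dialogue is sent with each request.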
Perspective on the future
Even though this experimental project shows that the issues we saw years ago still persist, large language models have improved so significantly that new generative seasons of our favorite TV shows seem very doable in the near future.
GPT4 specifically was able to interpret the guard rails the simulation provided and write dialogue that feels appropriate for each character without losing coherence and with a good sense of humour.
We would love to push this experiment even further. There is no shortage of ideas in this new world of generative AI. What if Friends were set in the future with a much older cast, or in a Western town? What if Chandler and Monica had never got married?
Here are a few general directions we currently think about:
Make even more use of simulation data to create new episodes in playful and interactive ways. What would user interaction look like? Episode to episode, scene to scene, moment to moment: changing the cast’s emotions, needs and personalities, prompting the system with new obstacles, or doing a cameo.
How would we keep viewers engaged with infinite 24/7 episodes? How much would that change the format? How much would it change the characters’ lives? How can we make it more reliable and safe for public display?
Can we generalize the Show Runner concept to adapt it to more dramatic series like True Detective or NYPD Blue?
Which series should we tackle next? An animated show like South Park, with timely subjects and on the edge of inappropriate AI-generated content?
We are super excited for whatever comes next!
The "Friends AI" research project is an experimental, non-commercial endeavor aimed at exploring the potential of artificial intelligence, voice synthesis, and deep learning technologies to recreate the images and voices of the original cast members from the television series "Friends".
The Project is not affiliated with, endorsed by, or connected in any way to the creators, producers, or copyright holders of "Friends," the Cast, or any related parties. All intellectual property rights, trademarks, and copyrights associated with "Friends" remain the exclusive property of their respective owners.
Sets by Koen J. https://3dwarehouse.sketchup.com/user/0998643389915521058863890/Koen-J