A little while ago, an old friend of mine got in touch with me to ask about the feasibility of a project she had in mind.
“I made this account for weird image based memes,” Lucy said. “I wondered if there was a way to make like a bot that spews out stupid little pithy sayings.”
The memes she had made were things like this (you can see more on her Instagram):
As you can see, they’re idiosyncratic in their use of slang, topical subjects, and sheer creativity!
“The images are easy to find,” she said, “but I want to show it loads of options of mum says memes so it can just build its own text for them and I’ll match the pics.”
I thought this should be doable, and a fun learning experience for me, so I accepted the challenge!
The way I approached this was to use a large language model. Large language models are AI models that have been trained on enormous swathes of text on the internet. They learn which words are likely to be found together, and thus can be used to complete or generate text. They can also be fine-tuned for particular tasks. For example, you could feed a large language model a whole load of Shakespeare plays, and ask it to generate text in that style.
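To give a toy flavour of what “learning which words are likely to be found together” means, here is a tiny bigram model in Python. (This is a drastic simplification – real large language models use transformer networks, not word-pair counts – and the little corpus below is just a few invented lines in the style of the memes.)

```python
import random
from collections import defaultdict

# A toy "training corpus" -- a few invented lines, not real training data.
corpus = (
    "mum says read the news . "
    "mum says be open and honest . "
    "mum says we do not have to go ."
).split()

# "Training": count which word follows which.
following = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    following[a].append(b)

def generate(start="mum", max_words=8, seed=0):
    """Generate text by repeatedly picking a word seen after the previous one."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words - 1):
        options = following.get(words[-1])
        if not options:
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(generate())
```

Each word is chosen based only on the one before it, so the output drifts into nonsense quickly – a large language model does something far more sophisticated, but the “predict what comes next” spirit is the same.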
I used a particular large language model called GPT-2, which has been open-sourced by the company OpenAI. GPT stands for “Generative Pre-trained Transformer” (a transformer is a particular neural network architecture). Since its release, GPT-2 has been succeeded by the even more impressive GPT-3, and OpenAI are now working on a fourth iteration, but these are not publicly available due to concerns over risks and biases (more on this later).
So, using some nice Python libraries, I was able to take GPT-2 and fine-tune it for our meme text generation task. I typed up all of Lucy’s previous “mum says” memes, and fed them to the model as training data. (If you’re interested in the code, you can find it in my GitLab repo.)
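If you want a feel for the shape of the code, here is a rough sketch. The post doesn’t name the libraries, so I’m using gpt-2-simple purely as an illustration; the three memes are from the training data shown below, and the step count and model size are just example settings.

```python
from pathlib import Path

# A few of the typed-up "mum says" memes -- stand-ins for the full set.
MEMES = [
    "Mum says read the news.",
    "Mum says be open and honest.",
    "Mum says we don't have to go.",
]

def write_training_file(memes, path="mum_says.txt"):
    """Write one meme per line: the plain-text format fine-tuning tools expect."""
    Path(path).write_text("\n".join(memes) + "\n", encoding="utf-8")
    return path

def finetune(training_file, steps=200):
    # gpt-2-simple (pip install gpt-2-simple) is one popular option; imported
    # lazily here so the data preparation above runs without it installed.
    import gpt_2_simple as gpt2
    gpt2.download_gpt2(model_name="124M")  # smallest public GPT-2 checkpoint
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, training_file, model_name="124M", steps=steps,
                  print_every=10, sample_every=50)  # print samples as it trains
    return sess

training_file = write_training_file(MEMES)
```

Calling `finetune(training_file)` would then kick off the training run, printing out generated samples along the way – which is where the strangeness below came from.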
I set the model fine-tuning to Lucy’s style, and got it to print out some samples as it went along. This was immediately quite eerie. Some of the time it would spit out text that was identical or very close to the training data:
Mum says read the news.
Mum says be open and honest.
Mum says we don’t have to go.
Mum says I am always late to the party.
Mum says your tweets are greatly appreciated.
Then there were odd phrases that it repeated over and over again:
Mum says ( light lyric )
Mum says ( light lyric )
Mum says ( light lyric )
Mum says ( light lyric )
…
Then there were some sections that seemed to be very specific memories of something GPT-2 had seen on the internet:
Released under a Creative Commons 2.0 licence.
– @marcabranthawkins
Web created by Julian Assange.
Copyrights Julian Assange.
Embed this page in your website:
<iframe src="http://www.guardian.co.
Then there were bits in other languages (or imitations of other languages):
omexander r infinitum quoque lange est argentum.
,ou flowa vero.
née obtus inter alia.
um tenem.
um mut emie.
une faut dans le seigneur.
né ami.
né sommera.
une vuelta se rendu de se rende.
une vuelta le se souri le vesse.
une vuelta le mots sa faisait.
I’ve run the fine-tuning a few separate times, and have seen interesting behaviour each time. On one occasion it was like the computer was having a nightmare – it kept saying “DELETE ALL THE FILES, DELETE ALL THE FILES!”. You begin to see why some people think that these models might be “slightly conscious”. (They’re not – but the emulation of language is disconcerting.)
Now that I have fine-tuned the model, I am able to generate new chunks of text beginning with “Mum says”. Sometimes it spits out the training data, but I see this behaviour less as I turn up the model’s “creativity”. I have been delighted with some of the results, and have found myself snorting aloud as I read through. Some of them are so unexpected!
Mum says FYI use your brain.
Mum says anyway you are a bother.
Mum says (sigh)
Mum says don’t be awful.
Mum says find love.
Mum says unfollow casserole queen.
Mum says sex sells chips.
Mum says open libraries keep secrets.
Mum says Twitter unites campers.
Mum says crank it up a notch and make it big on Friday.
Mum says demon baby you have it tough like that.
Mum says you will not be shoemakers, drunk sailors, thieves or anything like that in a world filled with dim lights and lonely hour candles.
Mum says old times were boring.
Mum says nail art is just like work.
Mum says politics is just a party.
Mum says escape from material limitations.
Mum says i am cringe dig it.
Mum says life’s too short to indulge too few calories.
Mum says collect flowers.
Mum says there is no future to be afraid of.
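As an aside, the “creativity” setting I mentioned is usually the sampling temperature: the model’s scores (logits) for each candidate next word are divided by a temperature before being turned into probabilities, so higher temperatures flatten the distribution and make unlikely words more likely to be picked. A minimal sketch, with made-up numbers:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    # Divide logits by the temperature before the softmax:
    # T < 1 sharpens the distribution (more predictable text),
    # T > 1 flattens it (more "creative", riskier text).
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for three candidate next words, say ["read", "be", "unfollow"]
logits = [2.0, 1.0, 0.1]

cautious = softmax_with_temperature(logits, temperature=0.5)
creative = softmax_with_temperature(logits, temperature=1.5)

# The favourite word dominates at low temperature and loses ground at high.
print(max(cautious), max(creative))
```

At low temperature the model keeps picking its safest words – hence the near-copies of the training data – while at higher temperatures you get gems like “Mum says sex sells chips”.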
It’s amazing how it’s picked up on the style, and incredible how novel it’s managed to be with it. Lucy has been as delighted as me, and has made some hilarious new memes with the AI-generated text (sometimes using an AI image generator to create the background):
I should end with a little warning, as it’s not quite all fun and games. A common criticism of these language models (and the reason GPT-3 and GPT-4 are unlikely to be made publicly available) is that they have been trained on some fairly dubious parts of the internet, and can therefore repeat problematic content. Even in this fun meme text task, I have seen it. When I ask the model to generate text, it doesn’t shy away from offensive or discriminatory language. For our task there is always a “human in the loop” (in fact two – I have been curating the generated text before sending it to Lucy, and she chooses which sentences to make into memes). But it highlights the issue of using these sorts of models for more automated tasks, such as writing articles or performing as chatbots.
Anyway, you can rest assured that “Computer Mum” is in safe hands. I’m hoping to do more work with her – maybe create an application where you can click a button and get a random “Mum says” statement (with moderation, to avoid anything offensive). Watch this space.