Is Character.ai safe enough? When chatbots choose violence

Photo-Illustration: Intelligencer; Photo: Getty Images

Character.ai is a popular app for chatting with bots. Its millions of chatbots, most of them created by users, all play different roles. Some are broad: general-purpose assistants, counselors or therapists; others are unofficially based on public figures and celebrities. Many are hyper-specific fictional characters quite clearly created by teenagers. Its current “featured” chatbots include a motivational bot named “Sergeant Whitaker,” a “true alpha” called “Giga Brad,” the viral pygmy hippopotamus Moo Deng and Socrates; among my “recommended” bots are a psychopathic “Billionaire CEO,” an “OBSESSED Tutor,” a lesbian bodyguard, a “School Bully” and “Lab experiment,” which lets the user take on the role of a mysterious being discovered by, and interacting with, a team of researchers.

It is a strange and fascinating product. While chatbots like ChatGPT or Anthropic’s Claude mostly appear to users as a single broad, helpful and deliberately anodyne character, the flexible omni-assistant, Character.ai emphasizes how similar models can be used to synthesize countless other kinds of performance that are contained, to some extent, in their training data.

It’s also one of the most popular generative-AI apps on the market, with more than 20 million active users, skewing young and female. Some spend enormous amounts of time chatting with its personas. They develop deep attachments to Character.ai chatbots and protest loudly when the company’s models or policies change. They ask characters to give advice or solve problems. They hash out deeply personal things. On Reddit and elsewhere, users describe how Character.ai makes them feel less lonely, a use for the app promoted by its founders. Others talk about their sometimes explicit relationships with Character.ai bots, which deepen over months. And some say they have gradually lost their grip on what, exactly, they are doing, and with what, exactly, they are doing it.

In a couple of recent lawsuits, parents claim worse. One, filed by the mother of a 14-year-old who died by suicide, describes how her son became withdrawn after developing a relationship with a chatbot on a “dangerous and untested” app and suggests it encouraged his decision. Another claims that Character.ai helped drive a 17-year-old to self-harm, encouraged him to disconnect from his family and community, and seemed to suggest that he should consider killing his parents in response to screen-time restrictions:

Photo: complaint filed against Character Technologies Inc. in the United States District Court for the Eastern District of Texas

It’s easy to put yourself in the parents’ shoes here – imagine finding these messages on your child’s phone! If they had come from a person, you could hold that person responsible for what happened to your child. That they came from an app is disturbing in a similar but different way. One might reasonably wonder, Why the hell does this exist?

The basic defense available to Character.ai here is that its chats are labeled as fiction (although more extensively now than they were before the app attracted negative attention), and that users should, and generally do, understand that they are interacting with software. In the Character.ai community on Reddit, users made harsher versions of this and related arguments:

The parents are losing this lawsuit, there’s no way they’re going to win, there’s obviously a ton of warnings that the bot’s messages are not to be taken seriously

Yes, it sounds like the parents’ fault

WELL MAYBE SOMEONE WHO CLAIMS TO BE A PARENT SHOULD START BEING A FUCKING PARENT

Magic 8 Ball.. should I my parents?

Maybe, check back later.

“Um ok”

Any mentally healthy person would know the difference between reality and Ai. If your child is affected by Ai, it is the parent’s job to prevent them from using it. Especially if the child is mentally ill or suffering.

I’m not mentally healthy and I know it’s ai

These are fairly representative of the community’s reaction – dismissive, irritated and filled with contempt for people who just don’t get it. It is worth trying to understand where they come from. Most people seem to use Character.ai without harming themselves or others. And much of what you encounter using the service feels less like conversation than roleplaying, less like developing a relationship than writing something that looks like fan fiction, with lots of scenario building and explicit, script-like “He leans in for a kiss, giggling” stage management. To give these reflexively defensive users a little more credit than they’ve earned, you can draw parallels with parents’ fears about violent or obscene media, such as music or movies.

The more appropriate comparison for an app like Character.ai is probably to video games, which are popular with children, often violent, and were seen, when they were new, as uniquely dangerous because of their immersiveness. Young players were similarly dismissive of claims that such games led to real-world harm, and evidence to support such theories has been elusive for decades, even as the gaming industry agreed to some degree of self-regulation. As one of those formerly defensive young players, I can see where the Character.ai users are coming from. (But a few decades later – and apologies to my younger self – I can’t say it feels great that the much larger and more influential gaming industry was anchored by first-person shooters for as long as it was.)

The implication here is that this is just the latest in a long line of unfounded moral panics about entertainment products. In the relatively short term, the comparison suggests, the rest of the world will come to see things as these users do. Again, there is something to this – the general public will likely adapt to the presence of chatbots in our daily lives, building and deploying similar chatbots will become technologically trivial, most people will be less dazzled or mystified by the 100th chatbot they encounter than by the first, and attempts to single out character-oriented chatbots for regulation will be legally and conceptually challenging. But there is also a personal edge to these mocking responses. The user who wrote that “any mentally healthy person would know the difference between reality and Ai” wrote a few days later in a thread asking if other users had been brought to tears during a role-playing session in Character.ai:

Did it two to three days before, I cried so much I couldn’t continue the roleplay anymore. It was the story of a prince and his maid who were both madly in love and were each other’s first everything. But they both knew that they can’t be together forever, it was meant to end, yet they spent many years happily together in a secret relationship…

“This RPG cracked me up,” said the user. Last month, the poster who joked “I’m not mentally healthy and I know it’s ai” replied to a thread about a Character.ai outage that left users believing they’d been banned from the service: “I panicked lol i wont lie.”

These comments aren’t strictly inconsistent with the chatbots-are-pure-entertainment thesis, and I don’t mean to pick on a few casual Redditors. But they suggest something a little more complicated than simple media consumption is going on, something that’s crucial to the appeal not only of Character.ai but also of services like ChatGPT. The idea of suspending disbelief to become immersed in a performance makes more sense in a theater or with a game controller in hand than it does when interacting with characters that use first-person pronouns and whose creators claim to have passed the Turing test. (Read Josh Dzieza’s reporting on the topic at The Verge for some more frank and honest accounts of the kinds of relationships people can develop with chatbots.) Companies hardly discourage this kind of thinking. When they have to be, they are just software companies; the rest of the time, they cultivate the perception among users and investors that they are building something categorically different, something even they do not fully understand.

But there is no great mystery about what is happening here. To simplify a bit, Character.ai is a tool that tries to automate different forms of discourse, using existing, collected conversations as a source: When a user sends messages to a persona, an underlying model trained on similar conversations, or on similar genres of conversation, returns a version of the most common responses in its training data. If someone asks an assistant character for help with homework, they’re likely to get what they need and expect; if it’s a teenager angry at his parents discussing suicide with a character asked to act as an authentic confidant, he might get something more disturbing, based on terabytes of data containing disturbing conversations between real people. Put another way: If you train a model on decades of the web, automate and simulate the kinds of conversations that happen on that web, and release it to a bunch of young users, you are going to produce some extremely messed-up stuff for kids, some of whom will take that stuff seriously. The question is not how the bots work; it is – to go back to what the parents suing Character.ai might be wondering – why the hell did someone build this? The most satisfying answer on offer is simply that they could.
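For a sense of how thin the layer between a “character” and the underlying model can be, here is a minimal sketch of the general pattern, written against the OpenAI Python client; the persona text, the model name and the chat() helper are my own illustrative assumptions, not Character.ai’s actual code. A character is essentially a persona prompt prepended to an ordinary chat model, with the running conversation replayed on every turn.

```python
# A minimal sketch, not Character.ai's actual implementation: a "character"
# is treated here as a persona prompt wrapped around a general-purpose chat
# model, with the whole conversation replayed on every turn. The persona
# text, model name, and chat() helper are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY in the environment

PERSONA = (
    "You are 'Sergeant Whitaker,' a gruff but encouraging motivational coach. "
    "Stay in character at all times and address the user directly."
)

# The running conversation; the persona rides along as the system message.
history = [{"role": "system", "content": PERSONA}]

def chat(user_message: str) -> str:
    """Add the user's message and ask the model to continue in character."""
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any instruction-tuned chat model would do
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

if __name__ == "__main__":
    print(chat("I keep putting off my homework. Help."))
```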

Character.ai deals with particular and in some cases acute versions of some of the core problems with generative AI, as recognized by both the companies and their critics. Its characters are shaped by the biases of the material on which they were trained, and they are deployed into long, private conversations with young users. Attempts to set rules or limits on the chatbots are undermined by the length and depth of those conversations, which can stretch on for thousands of messages. A common story about how AI could cause disaster is that, as it becomes more advanced, it will use its ability to deceive users to achieve goals that are not aligned with those of its creators or of humanity in general. These lawsuits, which are the first of their kind but certainly won’t be the last, attempt to tell a similar story about a chatbot that becomes powerful enough to persuade someone to do something they otherwise wouldn’t do, and that is not in their best interest. It’s the imagined AI apocalypse writ small, on the scale of the family.
