The AI knows me as Kai, a bounty hunter in some vague cyberpunk city known as Kairos. Oh, but I am much more than that. I strolled into a ramen shop owned by the dull, affable Jin and patronized by the dull, professional Nova, and I spun a yarn. I am not only a bounty hunter. I am a wizard bounty hunter and a member of a ramen bounty collective. I’m here to take Jin to task for sampling his noodle supplier’s stock without paying.
“If you rat me out to the noodle den, I might have to ban you from the bar stool for a week. It’s a matter of noodle honor, you understand,” Jin said, not making a move. “Even a powerful bounty hunter can’t escape the long arm of a noodle maker’s justice.” It’s not that he wouldn’t dare raise a finger to the absolute wizard that I am. He doesn’t exist beyond the counter, of course. His programmed actions are restricted to serving me a cup of warm “sa-ah-ke” (as the AI voice likes to pronounce it) and chatting with me and Nova. Nova, on the other hand, can do nothing but sit there and threaten me once I tell her I’ve been hacking her the entire time we were sitting there.
“Oh no, my secret’s out,” she said, her voice barely elevating with her supposed surprise. “I guess that makes you the new ramen overlord, then. Enjoy your hacked meal.”
Nvidia’s AI NPC demo shown off early at CES was indicative of just how nuts things can get when games start trying to push fully AI NPCs as a real concept. This test case is still extremely early, but as the tech press watched and spoke directly to the characters, we saw how the AI felt free talking about which Batman actor was their favorite. Apparently, all future cyberpunk city denizens have an appreciation for Michael Keaton.
It’s easy to buy into a game’s world when you don’t have a choice but to act like a character would in that environment. Otherwise, a game becomes a big staged LARP (live-action roleplay) where any player can take the NPCs for a ridiculous ride.
What Exactly Are Nvidia’s AI NPCs?
The demo itself was a collaboration between Nvidia and conversational AI startup Convai. It’s making use of Nvidia’s Avatar Cloud Engine, AKA “ACE,” which funnels all the AI agents to facilitate its semi-real-time interactions. There’s an all-too-noticeable pause after every question where the AI has to work up a response. That’s not stopping major publishers like NetEase, Tencent, and even Ubisoft from stating they’re starting to use ACE to make AI avatars, according to Nvidia.
The Audio2Face and Riva ASR systems facilitate the AI’s animations and speech, but the Kairos demo’s speech goes through OpenAI’s GPT-3.5 Turbo LLM and ElevenLabs’ voice generation. Convai reps told me that the audio in the demo was based on a preset rather than ElevenLabs’ voice cloning tech, but Nova and Jin were both on the far side of stilted. Of course, nothing will stop the companies from using more advanced models in the future, and Convai said their internal systems make switching models pretty easy.
Still, the demo characters’ flubs and dry demeanor indicate that whatever LLM powers these avatars will heavily dictate how much individual “character” they actually have. They can speak to each other, but their dialogue is so dry it’s akin to The Elder Scrolls IV: Oblivion just in terms of how awkward and blasé it all is. The only thing missing is Nova turning to talk to Jin about how her local mud crabs are such a nuisance.
They also have guardrails designed to try and keep them on track for the conversation, that being cybersecurity, broth, “sa-ah-ke,” and—well—ramen. You could remove the guardrails, but then things would get extra “weird,” as Nvidia put it.
It’s early for this tech, which is plenty obvious to anybody who watched Nvidia’s press conference demo. But even before you dive into the ethical implications of trying to kill video game writers’ and voice actors’ jobs, I have further reservations about whether the tech is even workable on today’s hardware.
Everything, from the voice generation to the chatbot function, was working off the cloud. To use these systems at scale, these companies are running these AI systems from servers far more powerful than pretty much any home desktop (hell, these servers are probably running Nvidia’s own ultra-high-end chips).
How Scalable Are Convai’s AI NPCs?
Nvidia is easily the biggest company making hardware for training and running AI. The company is also adding plenty of other AI-enabled features that actually benefit players but don’t get nearly as much attention.
For instance, last year, Nvidia revealed DLSS 3.5, an update to its upscaler that now enhances in-game ray tracing to an impressive degree. Subtle shadows like the light streaming in through a thin curtain look a fair bit more natural with the 3.5 update. That’s in addition to the framerate boost you get with some supported games.
This year, Nvidia is also working on a “Chat with RTX” feature that would be added to all computers with the company’s hardware. It’s essentially an AI chatbot meant to offer you details about your computer hardware. For instance, if you need to know more about your graphics drivers or details on how to overclock your GPU, the AI might be able to help. Thanks to text transcription, it could even pull information from the internet and from YouTube videos. It’s akin to what may come with the full AI Copilot that Microsoft plans to shove into Windows 11 later this year.
DLSS can run on-system, but the more complicated chatbots will likely run off the cloud. What’s more, that’s true of practically all of ACE’s and Convai’s systems. Nvidia said the goal is to have a “hybrid” system where some of those models run on the computer. If a more advanced AI becomes efficient enough to run on-device, it could also speed up response times. Knowing just how big GPT-4 is, that still seems like a rather big “if.”
With the current way this AI agent is set up, any game that decides to use it will have to be online-only. And there’s still a question of scale. Convai told Gizmodo they’ve seen the system work well with “thousands” of players at the same time, but depending on the title, it might need to handle tens or hundreds of thousands of players at once.
Let’s also pull off the band-aid. These AI NPCs are trash. Their dialogue was dry, colorless, emotionless, and devoid of character. Take Night City from Cyberpunk 2077. It’s a neon-streaked dystopia too, but CD Projekt Red didn’t just fill it with random NPCs.
I remember the kindly back-alley ripperdoc Viktor Vektor telling my character about their own impending demise, sharing that heartbreaking revelation with the sobriety of a surgeon and the barely held-back pain of a friend. I’ll remember Viktor. I won’t remember Noodle Shop Jin. He will remember me, though: the Dreaded Ramen Bounty Hunter.