Latency Comparison (Processing Speed)
Comparison of reaction time between traditional turn-based systems and new interaction models.
Primary Sources
Mira Murati's Thinking Machines unveils AI models designed for live ...
Synopsis: Thinking Machines Lab has unveiled "interaction models," a new category of multimodal AI designed for real-time communication. These systems process audio and visual input simultaneously, enabling continuous reaction and significantly reducing response latency. This breakthrough aims to foster more natural human-AI collaboration, particularly in time-sensitive enterprise and industrial applications.
Thinking Machines Lab, the artificial intelligence startup founded by former OpenAI CTO Mira Murati, is attempting to solve one of the biggest frustrations with modern AI systems: the awkward pause between asking something and getting a response.
The company has unveiled a research preview of what it calls “interaction models,” a new category of multimodal AI systems designed for real-time communication. Unlike conventional AI models that wait for users to finish typing or speaking before generating a response, Thinking Machines is building systems that can listen, process, see and respond simultaneously.
That shift could fundamentally change how humans interact with AI.
Today’s AI systems still operate in a rigid turn-based format. Users provide a prompt, wait for processing, and then receive an answer. Over time, people have adapted themselves to this limitation by speaking to AI in carefully structured sentences, almost like writing emails or commands. Natural interruptions, pauses, acknowledgements and conversational cues rarely work well because existing systems are not designed to handle them in real time.
Thinking Machines argues that this becomes a major limitation if AI is expected to evolve into a genuine collaborator in environments where timing matters, including healthcare, industrial operations and customer support.
To address this, the company has developed a new architecture based on what it describes as “full-duplex” interaction.
Instead of processing conversations as one long alternating sequence, the system breaks communication into micro-turns of roughly 200 milliseconds. This allows the AI to continuously react to visual and auditory input, even while it is already speaking.
At the center of the system is TML-Interaction-Small, a 276-billion-parameter mixture-of-experts model focused on fast conversational handling, presence and immediate responses. Alongside it is a secondary asynchronous “background” model responsible for more computationally intensive tas...
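
To make the micro-turn idea concrete, here is a minimal sketch in Python. It is not Thinking Machines' implementation: the function names, the 16 kHz sample rate, and the toy reaction-time model (which ignores all compute time) are assumptions made for illustration. Only the figure of roughly 200 milliseconds per micro-turn comes from the article.

```python
from typing import Iterator, Sequence

MICRO_TURN_MS = 200  # from the article: micro-turns of roughly 200 milliseconds


def micro_turns(samples: Sequence[float], sample_rate_hz: int,
                turn_ms: int = MICRO_TURN_MS) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of `samples`, each covering `turn_ms` of audio.

    Hypothetical helper: the last slice may be shorter than the others.
    """
    if sample_rate_hz <= 0 or turn_ms <= 0:
        raise ValueError("sample_rate_hz and turn_ms must be positive")
    chunk = sample_rate_hz * turn_ms // 1000  # samples per micro-turn
    if chunk == 0:
        raise ValueError("turn_ms is too short for this sample rate")
    for start in range(0, len(samples), chunk):
        yield samples[start:start + chunk]


def earliest_reaction_ms(event_ms: int, utterance_ms: int, full_duplex: bool,
                         turn_ms: int = MICRO_TURN_MS) -> int:
    """Earliest time (ms from the start) at which a system could react to an event.

    Toy model, assumed for illustration: a turn-based system waits for the
    whole utterance to end, while a full-duplex system can react at the end
    of the micro-turn that contains the event.
    """
    if full_duplex:
        return (event_ms // turn_ms + 1) * turn_ms
    return utterance_ms


if __name__ == "__main__":
    one_second = [0.0] * 16_000  # 1 s of silence at an assumed 16 kHz
    chunks = list(micro_turns(one_second, 16_000))
    print(len(chunks), len(chunks[0]))  # 5 micro-turns of 3200 samples each
    # The user misspeaks 450 ms into a 3-second utterance:
    print(earliest_reaction_ms(450, 3000, full_duplex=True))   # 600
    print(earliest_reaction_ms(450, 3000, full_duplex=False))  # 3000
```

Under these assumptions, the gap between 600 ms and 3000 ms is the kind of difference the latency comparison describes. A real system would add model inference time to both figures.
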
Why Mira Murati's Interaction Models is the next big thing in AI ...
The way we interact with AI is about to change dramatically. Mira Murati (former CTO of OpenAI) and her new company, Thinking Machines, unveiled Interaction Models, a new class of AI designed from the ground up for natural, real-time collaboration rather than the familiar back-and-forth prompting we’re used to with ChatGPT, Claude, or Gemini.
Note that Thinking Machines, which currently has a valuation of $12 billion post-money, isn’t the first to introduce something similar. Rival Google promotes its Gemini Live as an AI buddy, whereas OpenAI offers its GPT-Realtime-2 model. These versions are tuned for lower latency, aiming to make the voice versions of regular AI chatbots sound as close as possible to natural human conversation. However, Murati’s startup has designed its model to be preemptive, i.e., it can cut in while you are speaking and correct you if you are wrong, as demonstrated in the video. This is what’s drawing attention from the AI world.
Traditional AI chatbots: All about waiting for their turn
Most current AI chatbots, including ChatGPT and Claude, are turn-based systems. They arrive at an answer like this:
– You type or speak your full message.
– The AI waits until you finish.
– It thinks and generates a complete response.
Then it’s your turn again.
While the process seems easy, it creates a narrow “chat window” experience. You have to batch your thoughts, phrase everything clearly upfront, and wait for the model to finish before you can correct, interrupt, or show something on your screen. Even voice modes (like GPT-4o real-time or Gemini Live) still rely on external “harnesses”, add-on systems that detect when you stop speaking, which feels artificial and often laggy. In essence, the AI has no real sense of time, can’t naturally interrupt, struggles with simultaneous input/output, and has limited awareness of what you’re seeing or doing in the moment.
So where do Interaction Models fit in?
Interaction Models, as introduced by Thinking Machines, change this approach. Instead of making the bot react turn by turn, Interaction Models are trained from scratch to handle interaction natively, i.e., the way humans do. Compared to regular AI chatbot models, some key differences include:
– Micro-turns (200ms chunks): The model processes tiny slices of audio, video, and text continuously, allowing near-instant reactions.
– Full-duplex communication: The AI can listen and speak at the same time (e.g., ...
Mira Murati's Thinking Machines: New AI Interaction Models
Discover Thinking Machines, Mira Murati's new AI venture focusing on interaction models for natural human-AI collaboration via continuous audio and video.
Mira Murati's Thinking Machines Unveils "Interaction ... - TechStory
Murati's launch of Interaction Models is a clear philosophical departure from the "Agentic" path taken by OpenAI and Anthropic. While those companies focus on "autonomous agents" that can perform tasks solo for hours, Thinking Machines is betting that the most valuable AI will be the one that stays "in the loop" with humans.



