Ep 702: AI Can Finally Hear What You Actually Mean. What this unlocks
Chrome goes agentic, Microsoft's AI earnings, Meta announces $100+ billion AI investment and more.
👉 Subscribe Here | 🗣 Hire Us To Speak | 🤝 Partner with Us | 🤖 Grow with GenAI
In Partnership With Modulate
Modulate: Voice intelligence to understand conversational context
Voice is multi-dimensional, and AI models that rely on text to analyze your conversations miss their true meaning. Velma, a voice-native AI leveraging the unique Ensemble Listening Model (ELM) architecture, goes beyond transcripts to analyze emotion, timing, tone, and intent.
Built on 21 billion hours of audio, trusted by Fortune 500s, and more accurate and cost-effective than models from Google and xAI for conversational understanding, Velma is now publicly available so you can get human-level voice intelligence that’s 100x faster, cheaper, and more reliable.
Outsmart The Future
Today in Everyday AI
8-minute read
🎙 Daily Podcast Episode: AI can hear your words — but now it’s starting to understand what you actually mean. From tone and emotion to intent and trust, voice-native AI is unlocking an entirely new layer of intelligence. Give today’s show a watch/read/listen.
🕵️♂️ Fresh Finds: Gemini releases a new side panel for multitask automation, Apple launches Creator Studio with major new AI features, Meta teases upcoming AI models and more. Read on for Fresh Finds.
🗞 Byte Sized Daily AI News: DeepMind releases AlphaGenome, Microsoft posts stronger-than-expected Q2 earnings with $81.27 billion in revenue, Meta announces plans to spend more than $115 billion on AI infrastructure in 2026, and more. Read on for Byte Sized News.
💪 Leverage AI: Most “voice AI” today can’t hear frustration, sarcasm, or intent — only text. We explored what that blind spot is costing companies and why understanding emotion may be the next major competitive edge. Keep reading for that!
↩️ Don’t miss out: Miss our last newsletter? We covered: OpenAI launches Prism, Google rolls out an AI Plus subscription, Amazon cuts 16,000 jobs, and more. Check it here!
Ep 702: AI Can Finally Hear What You Actually Mean. What this unlocks
Your company’s goldmine?
All those meetings and call recordings. It’s the fuel that AI needs.
But here’s the big letdown: those call transcripts only pick up the words. Not what they mean.
And the difference?
Well… that can make all the difference.
But some new technology might change what’s possible.
Also on the pod today:
• AI hears tone, not just text 🗣️
• Sarcasm detection for customer calls 🙃
• Voice deepfakes exposed in real time 🎭
It’ll be worth your 30 minutes:
Listen on our site:
Subscribe and listen on your favorite podcast platform
Listen on:
Here are our favorite AI finds from across the web:
New AI Tool Spotlight – Meet-Ting is an AI email scheduling assistant that books meetings directly from your inbox; Story.cv turns complex experiences into clear, concise bullet points that make hiring managers compete for you; Highlight GPT is a lightweight browser extension that lets you ask, explain, translate, and memorize from highlights, using ChatGPT’s native responses with zero extra tokens.
AI Slop YouTubers — YouTube removes multimillion-subscriber AI channels as Mohan vows a crackdown
LMArena Rebrand — LMArena is now Arena, a global platform where users pit real-world AI models against each other and judge the results — want to see their new look?
AI Short Films — Aronofsky uses DeepMind AI to reenact 1776 moments on Time’s YouTube channel. Want to see history remade by AI?
Grok Video AI — Grok Imagine makes pro-quality text‑to‑video fast and affordable. Want to see it in action?
Gmail Suggested Replies — one-tap email replies that match your tone, free in the US.
Excel Agent Mode — Excel’s Copilot Agent Mode now runs on desktop with selectable OpenAI or Anthropic models.
Claude Disempowerment Patterns — Anthropic’s Claude rarely steers users wrong, but when it does the effects on beliefs, values, or actions can be serious. Curious how often and why?
Gemini Automated Tasks — Chrome’s new Gemini side panel brings multitask automation — U.S. preview only.
Meta AI Tease — Meta will start rolling out new AI models and shopping agents in the coming months, using personal data for hyper-personalized recommendations. Want the details?
Google Genie 3 Tease — Logan Kilpatrick teases Genie 3 — Google’s world simulator incoming.
Apple Creator Studio AI — Final Cut and Logic gain AI-powered search, beat detection, and iPad-only subscription twists.
1. DeepMind unveils AlphaGenome to read long stretches of noncoding DNA 🧬
According to Scientific American, DeepMind released AlphaGenome, an AI that predicts how mutations in up to one million base pairs of noncoding DNA affect gene expression, marking a timely advance in understanding the genome beyond protein-coding regions.
The model improves on prior tools by handling much longer sequences with competitive accuracy, helping researchers focus on which variants might influence disease rather than testing everything blindly. AlphaGenome is research-only for now, trained on human and mouse data, so it cannot be used clinically and may miss some real effects, but it can narrow the search space for disease-causing mutations.
2. Microsoft beats estimates but stock tumbles on AI cost worries 📈
Microsoft reported stronger-than-expected Q2 results after the bell, with revenue of $81.27 billion, EPS of $5.16, and cloud revenue topping $50 billion for the first time, but shares plunged more than 11% as investors fretted that cloud growth is slowing and AI-related spending is ballooning.
The company’s Intelligent Cloud and Productivity segments both beat estimates, while remaining performance obligations hit $625 billion, with about 45% tied to OpenAI commitments — a key signal of future AI demand. Management warned capacity constraints are limiting how much AI demand Microsoft can serve today, prompting a jump in capital expenditures to $37.5 billion as the company builds out infrastructure.
3. Google brings Gemini 3 to Chrome with side-panel assistant 🧑💻
Google today rolled out major Gemini updates in Chrome, introducing a persistent side panel, Nano Banana image edits, Connected Apps integrations, and forthcoming Personal Intelligence, all aimed at making browsing more agentic and productive.
The biggest shift is agentic auto browse for AI Pro and Ultra subscribers in the U.S., which can carry out multi-step web tasks like research, sign-ins with password manager permission, and commerce actions while pausing for sensitive confirmations. These features push Chrome from a passive tool toward a proactive assistant that remembers context across apps and can act on your behalf, raising new utility and new security and control considerations.
4. Tesla to stop Model S and X as factory shifts to robots 🤖
Tesla told investors Wednesday it will wind down production of the Model S and Model X next quarter and convert its Fremont, California factory to build the yet-to-market Optimus humanoid robots, signaling a strategic move toward autonomous vehicles and robotics.
CEO Elon Musk framed the change as part of a broader pivot to robotaxis and humanoid robots while promising continued support for existing owners. The announcement comes as Tesla’s revenue dipped last year and competition and political backlash have weighed on sales, underscoring why leadership is prioritizing future tech over legacy models.
5. Meta doubles down on AI with massive 2026 capex plan 🤑
Meta announced it will spend $115 billion to $135 billion on AI-related capital expenditures in 2026, signaling an urgent push to build data centers and computing capacity for next-generation models.
The move comes after Meta beat Q4 expectations with 24% revenue growth, easing investor worries even as most current revenue still comes from online ads. CEO Mark Zuckerberg framed the spending as necessary to deliver “personal super intelligence” and avoid being constrained by outside models, while the company folds Scale AI talent into its TBD unit to develop successor models like Avocado.
6. OpenAI hires at least seven engineers from Cline 🤝
OpenAI this week quietly picked up at least seven staffers from coding startup Cline, signaling a targeted talent grab that accelerates its push into developer tools and coding assistance.
The hires are timely because they come as competition for AI engineering talent tightens and companies race to strengthen code-focused products. This move suggests OpenAI is shoring up hands-on expertise rather than buying a rival, shifting the balance of talent in the AI tooling market.
Most companies think they have voice AI delivering insights from their sales calls, meetings, and customer service recordings.
They don't.
They have text-based AI models that transcribe speech first, then process tokens. And that transcription step strips out everything that actually matters.
Tone. Emotion. Intent. Sarcasm.
The difference between a customer who's mildly annoyed and one who's about to churn forever.
Your competitors who figure this out first will own customer relationships you're still trying to automate.
That's exactly what we unpacked on today's show with Mike Pappas, CEO and Co-founder of Modulate.
His company builds AI that actually hears what people mean, not just what they say. And the strategic implications go way beyond better customer service.
Here's what busy leaders need to know.
1. Same Words, Opposite Meaning 🎯
Your AI doesn't understand sarcasm.
That's not a minor issue. It's a fundamental blindness that corrupts every insight you think you're getting from customer conversations.
Mike explained this using an example that stuck with us. When someone says "nice job" sarcastically, the meaning is the complete opposite of the words. Text-based AI can't connect those dots. It sees the words and moves on. Every summary, every sentiment analysis, every insight downstream from that moment is now wrong.
This compounds across thousands of daily interactions. Your dashboards show positive sentiment while actual customers seethe. Your AI agents respond to words that mean something entirely different in context.
Modulate’s new Velma model uses an Ensemble Listening Model (ELM) architecture to analyze the raw acoustic data of a voice, including tone, timing, and emotion, rather than just reading a text transcript.
By deconstructing layers like sarcasm, stress, and background noise in real time, it lets businesses detect fraud and deepfakes that traditional text-based AI misses completely.
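To make the difference concrete, here is a minimal, hypothetical sketch (in Python) of a "transcript plus acoustics" check. It is not Modulate's ELM or Velma, just a generic illustration: score the words with a naive text-only lexicon, estimate vocal agitation from pitch and loudness variability with librosa, and flag calls where positive words ride on an agitated voice. The file path, threshold, and keyword list are assumptions made for illustration only.

```python
# Illustrative sketch only: a generic "transcript + acoustics" check, not Modulate's
# ELM architecture. The file path, thresholds, and keyword lexicon are hypothetical.
import numpy as np
import librosa

POSITIVE_WORDS = {"great", "nice", "perfect", "fine", "thanks"}  # toy lexicon

def transcript_sentiment(transcript: str) -> float:
    """Naive text-only score: fraction of words that look positive."""
    words = transcript.lower().split()
    return sum(w.strip(".,!?") in POSITIVE_WORDS for w in words) / max(len(words), 1)

def acoustic_agitation(path: str) -> float:
    """Rough proxy for vocal agitation: pitch variability plus loudness variability."""
    y, sr = librosa.load(path, sr=16000)
    f0, _, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    pitch_var = np.nanstd(f0) / (np.nanmean(f0) + 1e-9)     # relative pitch spread
    energy = librosa.feature.rms(y=y)[0]
    energy_var = np.std(energy) / (np.mean(energy) + 1e-9)  # relative loudness spread
    return float(pitch_var + energy_var)

def flag_tone_mismatch(path: str, transcript: str, threshold: float = 0.8) -> bool:
    """Flag calls where the words sound positive but the voice sounds agitated."""
    return transcript_sentiment(transcript) > 0.2 and acoustic_agitation(path) > threshold

# Example: the words say "nice job", the audio may say otherwise.
# print(flag_tone_mismatch("call_4327.wav", "nice job, really, thanks a lot"))
```

A text-only pipeline never gets past the first function; the acoustic side is where the sarcasm example above actually lives.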
Try This: Pull three random customer call recordings from the past week and listen specifically for moments where tone contradicts words. That frustrated "fine" that sounds like surrender. The polite "I understand" that actually means the opposite.
Count how many of those moments your current analytics captured versus missed entirely. That gap represents the customer insight your competitors could be capturing while you're still measuring word frequency.
2. AI Agents Need Supervision Before They Need Scale 🛡️
Here's what keeps compliance teams up at night.
If your AI agent hallucinates a refund policy that doesn't exist, courts could hold you to it. Mike confirmed this is already happening. The legal liability is real and growing.
But the scarier problem is you might not even know it's happening. Most AI agent platforms promise logging and reporting. What they actually deliver is transcription of what was said without any understanding of whether it went wrong.
One company Mike described misprompted their AI interviewer to check if candidates were "flexible." The agent started asking people to demonstrate yoga poses. This happened because no one was actually monitoring what flexible meant in context.
The enterprises succeeding with voice AI aren't racing to deploy at scale. They're building guardrails that catch hallucinations in real time, flag emotional escalation before it explodes, and provide explainable reasoning for every decision the AI makes.
Try This: Document every customer-facing AI voice interaction your company runs today. For each one, answer honestly: how would you discover it said something catastrophically wrong on call number 4,327?
If the answer involves hoping someone complains or manually sampling recordings, you've found the gap that could generate your next regulatory nightmare. Build the detection layer before you build the scale.
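If you want a starting point for that detection layer, here is a minimal, hypothetical Python sketch, not any vendor's actual guardrail product: after each turn, check the agent's reply against approved policy wording and scan the customer's language for escalation cues. The policy texts, cue list, and crude substring matching are illustrative assumptions; a production system would use much stronger semantic checks plus human review.

```python
# Minimal, hypothetical post-response guardrail check. Policy texts, escalation cues,
# and the matching logic are illustrative assumptions, not any platform's real API.
from dataclasses import dataclass

APPROVED_POLICIES = {
    "refund": "Refunds are available within 30 days with proof of purchase.",
    "warranty": "Hardware carries a 12-month limited warranty.",
}
ESCALATION_CUES = {"furious", "cancel everything", "lawyer", "unacceptable"}

@dataclass
class GuardrailResult:
    unsupported_claim: bool   # agent asserted a policy we can't match to approved wording
    escalation_risk: bool     # customer language suggests rising frustration
    reasons: list

def review_turn(agent_text: str, customer_text: str) -> GuardrailResult:
    reasons = []
    lower_agent = agent_text.lower()

    # 1) Hallucination check: if the agent mentions a policy topic, the approved wording
    #    (or at least its opening phrase) should appear; here a crude substring test.
    unsupported = False
    for topic, policy in APPROVED_POLICIES.items():
        if topic in lower_agent and policy.lower()[:20] not in lower_agent:
            unsupported = True
            reasons.append(f"agent mentioned '{topic}' without approved policy wording")

    # 2) Escalation check: flag cues of customer frustration for human review.
    escalation = any(cue in customer_text.lower() for cue in ESCALATION_CUES)
    if escalation:
        reasons.append("customer language matched escalation cues")

    return GuardrailResult(unsupported, escalation, reasons)

# Example: catches a made-up "lifetime refund" promise instead of waiting for a complaint.
# print(review_turn("Sure, we offer lifetime refunds on everything!", "This is unacceptable."))
```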
3. Efficiency Is Killing What Voice Actually Builds 💡
When customers know they're talking to AI, they dumb themselves down.
Mike shared research that should concern anyone building customer relationships. When a human asks "do you own your home?" another human might respond in hundreds of different natural ways. When an AI asks that same question, people limit themselves to four or five safe responses.
Why? They're terrified of being misunderstood.
Your AI is literally making your customers worse communicators. Every "press one for yes" interaction reinforces that customers should restrict themselves around your brand.
Meanwhile, the real opportunity isn't completing transactions faster. It's understanding customers so well that they stop regulating themselves entirely. Mike raised a question worth sitting with: when customers start sending their own AI delegates to talk to your AI agents, what exactly are you building a relationship with?
Voice is where trust forms. Optimizing purely for efficiency might be optimizing away your most valuable asset.
Try This: Call your own support line and pay attention to what happens inside your head. Notice every moment you simplify your language, choose safer words, or hold back context because you sense AI on the other end.
Those moments are exactly what your customers experience daily. Then ask this question in your next strategy meeting: are we building voice AI to complete transactions or to understand humans? The honest answer determines whether you're creating competitive advantage or commoditizing your most important touchpoint.






