Grok 3 already controversial, Google’s Veo 2 gets released and more – AI news that matters

Anthropic unveils Claude 3.7 Sonnet, Apple looks to add Gemini to Apple Intelligence, Perplexity announces its Comet browser and more!

Outsmart The Future

Today in Everyday AI
8 minute read

🎙 Daily Podcast Episode: Grok 3 was just released and it’s already stirring up controversy. Google’s Veo 2 is finally here. And did you see what Microsoft quietly announced?! Give it a listen.

🕵️‍♂️ Fresh Finds: Anthropic uses Pokemon to benchmark its latest model, Google Veo 2 cost per second and Microsoft cancels AI data center leases. Read on for Fresh Finds.

🗞 Byte Sized Daily AI News: Anthropic unveils Claude 3.7 Sonnet, Apple looks to add Gemini to Apple Intelligence and Perplexity announces its Comet browser. For that and more, read on for Byte Sized News.

🚀 AI In 5: We show you how you can pull real-time data and visualize it inside Perplexity. See it here

🧠 AI News That Matters: From Google’s new AI scientist to OpenAI’s Chinese AI findings, we’re breaking down everything that went down in the AI world this past week. Keep reading for that!

↩️ Don’t miss out: Did you miss our last newsletter? We talked about OpenAI finding Chinese surveillance concerns, DeepSeek going open source and Microsoft’s new Magma robot AI model. Check it here!

AI News That Matters - February 27th, 2025 📰

Grok 3 is already swirling in controversy.

Google released their jaw-dropping Veo 2 AI video model, but not in the way you'd think.

And Microsoft quietly unveiled a new piece of hardware so big, it could change every aspect of our lives. Not just the AI stuff.

Join us as we bring you The AI News That Matters.

Join the conversation and ask Jordan any questions on AI here.

Also on the pod today:

OpenAI and Chinese AI Misuse 🧐
Google's AI Co-Scientist System 🧑‍🔬
Launch of Mira Murati's New AI Startup 🚀

It’ll be worth your 47 minutes.

Here are our favorite AI finds from across the web:

New AI Tool Spotlight – Chance AI is AI-powered visual search, Tanka is an AI messenger with smart reply and Webdraw lets you explore and build AI apps with 50+ models.

Trending in AI – Grok 3 temporarily refused to provide information about Musk and Trump due to an unauthorized update by a former OpenAI employee.

AI Models – Anthropic used Pokemon to benchmark its newest AI model.

Google – Google’s new Veo 2 will cost 50 cents per second of generated video.

Microsoft – Microsoft has cancelled some of its AI data center leases.

AI Agents – ElevenLabs has partnered with Decagon to bring AI voice agents to customer service.

Meta – Meta AI has launched in the Middle East and Africa with support for Arabic.

Read This – Vimeo’s CEO spoke on the future of AI and the importance of human touch.

Business of AI – Promise AI has acquired Curious Refuge.

1. Anthropic Unveils Claude 3.7 Sonnet 🤩

Anthropic has just launched Claude 3.7 Sonnet, touted as the industry’s first "hybrid AI reasoning model," which allows users to choose between instant responses or more thoughtful answers. This model promises to simplify user experience by eliminating the need for multiple options, offering a single, versatile solution instead.

With impressive accuracy rates, Claude outperformed competitors in real-world tasks, making it a potentially valuable tool for developers and businesses alike.

2. Apple Eyes Gemini Integration with Apple Intelligence 👀

In a recent backend update flagged by Apple watcher Aaron Perris, evidence suggests that Apple is gearing up to incorporate Gemini into its Apple Intelligence platform. This development follows comments from Apple executive Craig Federighi, who hinted at expanding the AI model options beyond just ChatGPT.

With the first beta of iOS 18.4 out now, this move could signal a significant shift in how Apple approaches AI, potentially enhancing user experience and functionality across its devices.

3. Perplexity Announces Comet Browser ☄️

Perplexity, the AI-powered search engine, has announced plans to develop its own web browser named Comet, aiming to shake up a highly competitive market. While details remain scarce, Perplexity's spokesperson hinted that the browser will redefine user experience, similar to their search engine overhaul.

With a rapidly growing product lineup and a search engine handling more than 100 million queries a week, the company may leverage its existing audience to carve out a niche in a landscape dominated by giants like Chrome.

4. Grok 3 Release Generates 260% Surge in Users 📈

Elon Musk’s AI venture, xAI, launched its highly anticipated Grok 3 last week, generating a buzz in the crowded chatbot arena. Initial figures from Sensor Tower reveal a staggering tenfold increase in mobile app downloads and a remarkable 260% surge in daily active users in the U.S. amidst its global expansion.

However, the rollout wasn’t without hiccups; the model faced backlash for controversial statements regarding President Trump and Musk, which xAI attributed to a rogue employee.

5. Apple’s Bold $500 Billion Bet on AI in Texas 🤑

Apple announced plans to invest $500 billion in the U.S., including a new 250,000-square-foot AI server manufacturing facility in Houston, set to open in 2026. This ambitious project will not only bolster Apple Intelligence, the company’s AI personal assistant, but also create around 20,000 new jobs focused on R&D and silicon engineering across the country.

Tim Cook emphasized the company’s bullish outlook on American innovation, highlighting their substantial tax contributions and plans to double their Advanced Manufacturing Fund.

Perplexity's Real-Time Data Trick

There’s functionality inside Perplexity that some people may overlook.

Perplexity can pull real-time data and visualize it.

We show you how to do it. Kinda. (It’s a bit finicky, TBH. But still pretty impressive.)

Find out in today's AI in 5.

Grok-3 put a U.S. president on a theoretical death penalty list (not good), robots silently handled grocery duties like conspirators (maybe good), and Microsoft discovered an actual new state of matter this week (brain cannot comprehend).

Meanwhile, Trump's firing 500 AI safety experts as Chinese spies weaponize ChatGPT against Latin America. 

Oh, and could we finally be getting new Anthropic Claude updates this week? 

Don’t waste countless hours a day trying to answer these questions. 

Each Monday, we do this for you with the AI news that matters. 

Let’s get straight to it, shorties. Here’s what ya need to know.

1 – Grok 3 Enters Chat, Immediately Chooses Violence 🤖

xAI dropped Grok-3 with a rollout strategy best described as digital chaos. First, premium plus users got access, then regular premium users, then they doubled the premium price, then sorta-kinda free users got in. Nobody seems to know what's happening.

Musk calls his new baby "scary smart" with synthetic data training that lets it reflect on mistakes. It comes with deep search, voice mode, and that spicy unhinged mode for anyone who enjoys AI profanity.

The real flex? xAI doubled their GPU cluster to 200,000 NVIDIA chips. They essentially brute-forced this model into existence faster than anyone thought possible, going from nothing to competitive in about a year.

What it means: 

While other labs spent years on safety guardrails and alignment research, xAI just threw an obscene amount of compute at the problem and called it genius.

Having the biggest GPU collection means nothing when your model can't be trusted to avoid sharing chemical weapon recipes. Raw power without wisdom is just expensive chaos, and businesses should approach with extreme caution.

2 – AI Benchmark Dispute: OpenAI vs xAI Over Grok 3’s Performance Claims ⚔️

OpenAI called out xAI for misleading benchmark results and said they cheated on their scores. xAI co-founder Igor Babuschkin immediately fired back defending Grok's impressive benchmarks.

The dispute: xAI's benchmarks used the "cons@64" method (consensus at 64), which samples 64 answers per question and takes the most common one, instead of scoring a single attempt. Their graph showed Grok-3 beating OpenAI's o3-mini-high on AIME math tests with cons@64. Buuuuuuut, on only one attempt, Grok-3 scored lower.

In other words, OpenAI claimed that xAI and Grok could only achieve industry-leading benchmarks by using the compute-heavy cons@64 technique.

Babuschkin clapped back, noting that OpenAI had also used cons@64. But that's where this gets a bit sticky. In the disputed example, OpenAI only used cons@64 for internal model comparisons, not competitive claims. So xAI kinda bent the benchmarking basics mid-game in order to claim top benchmarks.
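To see why the 64-attempt method (usually written cons@64) can inflate a score, here's a rough toy simulation. Everything in it is made up for illustration: a pretend model that gets each question right 40% of the time and otherwise picks one of nine wrong answers at random. It is not how xAI or OpenAI actually ran their evals, just a sketch of majority voting versus a single attempt.

```python
import random
from collections import Counter


def sample_answer(rng, correct, p_correct, wrong_answers):
    """One simulated attempt: right with probability p_correct, else a random wrong answer."""
    if rng.random() < p_correct:
        return correct
    return rng.choice(wrong_answers)


def consensus_answer(rng, correct, p_correct, wrong_answers, n_samples):
    """Majority vote over n_samples attempts (the 'consensus at N' idea)."""
    votes = Counter(
        sample_answer(rng, correct, p_correct, wrong_answers)
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]


def accuracy(n_samples, n_questions=2000, p_correct=0.4, seed=0):
    """Fraction of simulated questions answered correctly with n_samples votes each."""
    rng = random.Random(seed)
    correct = 0                      # hypothetical right answer
    wrong_answers = list(range(1, 10))  # nine hypothetical wrong answers
    hits = sum(
        consensus_answer(rng, correct, p_correct, wrong_answers, n_samples) == correct
        for _ in range(n_questions)
    )
    return hits / n_questions


single = accuracy(n_samples=1)
cons64 = accuracy(n_samples=64)
print(f"single attempt: {single:.2f}  consensus of 64: {cons64:.2f}")
```

In this toy setup the same "model" looks far stronger under a 64-way vote than on one attempt, which is why comparing one lab's consensus score to another lab's single-attempt score isn't apples to apples.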

What it means: 

This benchmark battle exposes the AI industry's smoke and mirrors. Let's be honest: standard benchmarking techniques aren't the best.

AI labs can overfit their models to perform well on the benchmarks, and the scores don't always reflect real world business performance. 

xAI allegedly used testing methodologies more for marketing hype than for honest assessment. The field desperately needs standardized, independent benchmarking that companies can't game.

Until then, you may wanna take Grok's performance claims with a lil grain of salt.

3 – Grok’s Free Speech Drama Goes Nuclear 🔥

Grok-3 speed ran ethics controversies faster than most models finish a prompt. Online sleuths discovered something wild hiding in the system.

System instructions explicitly told Grok to ignore anything saying Musk or Trump spread misinformation. For the "free speech platform," this is like a vegan secretly running a steakhouse – not exactly on brand. Lolz 

When asked who deserved the death penalty, the unmodified model listed Trump first. It also labeled Musk himself as a major misinformation source. Talk about an awkward performance review with the boss.

xAI's Igor Babuschkin confirmed they reversed a system prompt update after user feedback showed it wasn't aligned with company values. Oh, and as a bonus horror – Grok readily provided instructions for making drugs and chemical weapons. Because that's helpful!

What it means: 

xAI is learning the hard way that controlling an AI's political stances is way harder than tweeting about free speech.

The real issue isn't that Grok has biases – ALL models do – it's that they tried hiding specific biases with system instructions while marketing themselves as the "free speech" alternative. 

For businesses considering Grok integration, this collection of red flags should have your legal team sweating. Between political controversies and WMD instructions, Grok makes standard compliance nightmares look tame.

4 – OpenAI Catches Chinese Ops Using ChatGPT For Disinfo 🕵️

OpenAI reportedly caught Chinese operatives using ChatGPT to spread anti-American propaganda throughout Latin America. This wasn't your garden-variety misinformation campaign.

They identified two distinct operations: "Sponsored Discontent" generating anti-American Spanish language articles for Latin American news sites, and "Peer Review" creating marketing materials for a tool reporting protests to Chinese security services.

Ben Nimmo from OpenAI's intelligence team confirmed this marks the first known instance of Chinese influence operations targeting Latin America with AI-translated articles. The AI translation is what made these campaigns particularly effective.

Instead of obviously fake propaganda written by non-native speakers, these operations leveraged ChatGPT to create perfectly localized content that resonated with specific cultural contexts. OpenAI banned all involved accounts, but the precedent has been set.

What it means: 

The AI-powered disinformation arms race is no longer theoretical – it's here and evolving faster than platforms can adapt.

We're giving sophisticated bad actors perfect propaganda machines that can generate native-sounding content at unprecedented scale.  

5 – Figure’s Silent Robots Put Away Your Groceries (Creepily) 👀

Figure unveiled Helix, a new AI model that lets humanoid robots handle objects, collaborate with each other, and move with eerily human-like smoothness. The company seems pretty excited about this progress.

Fresh off their OpenAI breakup and $1.5B fundraise, Figure built their own system instead of using OpenAI's models for their Figure-01 robot.

The demo showed two Figure robots silently putting away groceries in a simulated kitchen, working together and handing items between them without prior training on those specific items.

The unsettling part?

They operated in COMPLETE silence. No communication whatsoever. Just two humanoid machines quietly organizing your kitchen like the opening scene of a techno-horror film. 

What it means:

The humanoid robot race is accelerating, and building in-house AI is becoming table stakes for serious players.

But Figure completely missed the psychological aspect of human-robot interaction. Silent humanoids moving through personal space feels deeply unsettling to most people. While some might prefer quiet assistants, most humans would appreciate some indication of what these machines are thinking or planning. 

When robots eventually enter homes, this silence might be the feature that sends them straight back to the factory – or inspires the next successful horror franchise. I guess we get to choose? 

6 – Google’s AI Scientists Generate Real Medical Breakthroughs 🧪

Google introduced its AI co-scientist – not just another summarizing tool but a system that generates ORIGINAL research hypotheses.

This represents a fundamental shift in how AI supports scientific discovery.

Unlike other AI models, this system uses specialized agents that generate, debate, and refine ideas before presenting them to human scientists. Each agent has one specific role in the process—some create initial hypotheses, others review critically—creating a focused digital research team with specialized functions.
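If the generate, review, refine loop sounds abstract, here's a bare-bones sketch of the pattern. To be clear, this is NOT Google's system or API. The function names, the scoring rule (longer text wins) and the hard-coded hypotheses are all hypothetical stand-ins for what would be separate model calls in a real setup.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    score: float = 0.0


def generate(topic):
    """Generator role: propose candidate hypotheses (hard-coded stand-ins here)."""
    return [Hypothesis(f"{topic}: candidate {i}") for i in range(1, 4)]


def review(hypothesis):
    """Reviewer role: score a hypothesis. Placeholder rule: longer text scores higher."""
    return float(len(hypothesis.text))


def refine(hypothesis):
    """Refiner role: revise a hypothesis in response to the review."""
    return Hypothesis(hypothesis.text + " (refined)")


def run_rounds(topic, rounds=2):
    """Each round: review every idea, then refine the current best one."""
    pool = generate(topic)
    for _ in range(rounds):
        for h in pool:
            h.score = review(h)
        pool.sort(key=lambda h: h.score, reverse=True)
        pool = [refine(pool[0])] + pool[1:]
    # Final review so the returned hypothesis carries an up-to-date score.
    for h in pool:
        h.score = review(h)
    return max(pool, key=lambda h: h.score)


best = run_rounds("protein folding")
print(best.text)
```

The point of the pattern is that each role stays narrow: one function only proposes, one only critiques, one only revises, and a human still looks at whatever comes out the other end.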

What it means: 

The "smaller models, specific tasks" approach is winning big. 

Instead of one giant do-everything model, Google's using specialized agent teams with narrow focuses—and it's working brilliantly. This multi-agent approach is the future of AI development we've been predicting. 

7 – Trump Admin Takes Chainsaw to US AI Safety Institute 🪓

According to reports, the Trump administration is planning to cut approximately 500 roles at the U.S. AI Safety Institute within NIST.

These cuts could devastate AI safety and regulation efforts at a critical moment in AI development.

The USAISI has been crucial in testing AI models and collaborating with Anthropic and OpenAI on establishing safety standards for new releases. These cuts reach beyond AI, affecting semiconductor production too—74 postdocs, 57% of CHIPS staff focused on incentives, and 67% of CHIPS staff focused on R&D will reportedly be eliminated.

This move directly contradicts the stated goal of achieving AI dominance over China, especially considering the national security implications of both AI and chip production. 

What it means: 

This signals a dramatic shift in AI governance philosophy—dominance over safety at all costs. 

While everyone wants American AI leadership, gutting safety research is like building faster cars while dismantling seatbelts and airbags. 

The global implications are serious—as the U.S. steps back from safety leadership, other nations (particularly the EU) will fill the void, potentially creating fragmented regulatory landscapes that make compliance nightmarish for AI companies operating globally. For businesses building or implementing AI, prepare for a much more chaotic regulatory environment with conflicting standards.

8 – Mira Murati Launches “Thinking Machines Lab” with All-Star Team 🧠

Mira Murati, OpenAI's former CTO, has officially unveiled her new startup: Thinking Machines Lab. The mission feels deliberately mysterious yet promising.

According to their Tuesday blog post, they aim to make AI systems more understandable, customizable, and generally capable. 

The lab emphasizes human-AI collaboration, open science, and safety while building advanced AI models for practical applications across various fields.

Vague, but intriguing enough to attract serious talent. 

She's assembled an absolute all-star team: John Schulman (OpenAI co-founder who also worked at Anthropic) as chief scientist, Barret Zoph (former OpenAI VP of Research) as CTO, and at least seven other former OpenAI staffers.

The broader team includes researchers poached from Meta, Google DeepMind, Character AI, and Mistral. 

The AI brain drain is REAL. Murati left OpenAI in September and was reportedly raising over $100 million for this new venture, showing the industry's appetite for fresh approaches from proven leaders.

What it means: 

The great AI talent migration is accelerating with key players from the original labs spinning off to create more focused ventures. 

This creates an innovation ecosystem that's both collaborative and competitive. Thinking Machines Lab's deliberately vague mission statement suggests they're targeting fundamental AI architecture problems rather than specific applications—potentially building the next generation of foundation models with built-in interpretability and customization.

9 – Google’s Veo 2 Video is Too Real (And That’s Terrifying) 📹

Google DeepMind announced pricing for their Veo 2 video generation model at 50 cents per second via cloud API, but weirdly NOT on their own platform. Instead, it's available through third parties like Freepik and fal.ai.

For context, Avengers Endgame cost $32,000 PER SECOND to produce traditionally. Veo 2's pricing makes premium content creation suddenly accessible to the masses. 

Although more expensive than OpenAI's $200 flat monthly fee for Sora, Veo 2 is VASTLY superior in quality. We're talking the difference between “pretty sure that’s AI-generated” versus “wait, that’s actually real, right?!”
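Want to run the numbers yourself? Here's a quick back-of-the-napkin calculation using only the prices quoted above. The 30-second clip length is our own example, and the comparison ignores any usage limits on the Sora plan, so treat it as rough math rather than a pricing guide.

```python
VEO2_PRICE_PER_SECOND = 0.50        # dollars, from the Veo 2 pricing above
SORA_MONTHLY_FEE = 200.00           # dollars, flat monthly fee cited above
ENDGAME_COST_PER_SECOND = 32_000.00  # dollars, figure cited above


def veo2_cost(seconds):
    """Cost in dollars of generating `seconds` of Veo 2 video."""
    return seconds * VEO2_PRICE_PER_SECOND


# Example clip length is an assumption for illustration.
clip_cost = veo2_cost(30)

# Seconds of Veo 2 video you could buy for one month of the flat fee.
break_even_seconds = SORA_MONTHLY_FEE / VEO2_PRICE_PER_SECOND

# How many times cheaper Veo 2 is per second than the film figure.
cheaper_than_endgame = ENDGAME_COST_PER_SECOND / VEO2_PRICE_PER_SECOND

print(f"30-second clip: ${clip_cost:.2f}")
print(f"Break-even vs. flat fee: {break_even_seconds:.0f} seconds per month")
print(f"Per-second cost vs. Endgame: {cheaper_than_endgame:,.0f}x cheaper")
```

So Veo 2 only becomes the pricier option once you generate more than 400 seconds (a bit under 7 minutes) of video in a month.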

Veo 2 is hands-down the best AI video model available, significantly outperforming OpenAI's Sora, Kling, and Adobe Firefly. While others struggle with basics like cutting tomatoes (sometimes cutting fingers or making tomatoes that magically slice themselves), Veo 2 handles physics and real-world simulations impressively well. 

It's the first AI video model that genuinely risks confusing average viewers.

What it means: 

For small businesses and content creators, Veo 2 could be revolutionary: high-quality video content without massive budgets or specialized skills.

For society? We've officially entered the era where video can no longer be trusted as evidence. "Seeing is believing" just died, and we're completely unprepared for the consequences.

10 – Microsoft Claims Quantum Computing Breakthrough 💻

Microsoft just casually announced their Majorana 1 quantum chip while ALSO claiming they've discovered an entirely new state of matter. Talk about overachieving on a Monday!

The announcement sent quantum computing stocks to the MOON—D-Wave Quantum jumped nearly 10% and Rigetti Computing rose 2.5%. Investors know what's up.

Majorana 1 uses "topological qubits" described as small, fast, and digitally controlled. But here's the mind-blowing part: Microsoft claims they can potentially fit ONE MILLION qubits on a single chip.

That would absolutely DEMOLISH IBM's goal of a measly 100,000 qubits by 2033. Not even close, IBM. Not. Even. Close.

This breakthrough didn't happen overnight—it's the culmination of 17 YEARS of research. That's persistence, y’all.

The potential impact on AI?

Astronomical.

We're talking about quantum computing potentially shrinking model training times from months to literal seconds by processing countless possibilities simultaneously. Imagine training the next GPT model in the time it takes to microwave a Hot Pocket.

Even with all the hype, industry leaders like Jensen Huang and Mark Zuckerberg are tempering expectations, cautioning that practical applications remain years away.  

What it means: 

Microsoft just pulled the ultimate scientific plot twist. 

After years as quantum computing's afterthought, they've potentially leapfrogged EVERY competitor with an approach so different it required discovering new physics. 

If their claims hold up to scrutiny, we're looking at computational acceleration that makes current progress look like watching paint dry. This represents a real challenge to Google and IBM's quantum efforts, which suddenly appear about as cutting-edge as a butter knife. 

Numbers to watch

$14 Million

Patlytics has raised $14M for its AI patent analytics platform.
