Ep 708: Inside the Society of Agents: Why AI Teamwork Beats Bigger Models
Anthropic releases Claude Opus 4.6, OpenAI ships GPT-5.3-Codex, Apple bringing AI chatbots to cars and more.
Outsmart The Future
Today in Everyday AI
8 minute read
Daily Podcast Episode: The future of AI isn't one giant model doing everything. It's multiple agents collaborating, coordinating, and getting real work done faster than any single system ever could. Give today's show a watch/read/listen.
Fresh Finds: Claude begins testing an upgraded voice mode, OpenAI gives trusted users access to advanced cyber AI tools, Nvidia skips gaming GPUs in 2026 to focus on AI chips and more. Read on for Fresh Finds.
Byte Sized Daily AI News: Anthropic releases Claude Opus 4.6, OpenAI ships GPT-5.3-Codex, GPT-5 teams up with Ginkgo Bioworks to lower biotech costs and more. Read on for Byte Sized News.
Leverage AI: Bigger models aren't the future; smarter systems are. The real shift is happening in how AI agents work together, supervise each other, and run full workflows instead of acting like one oversized brain. Keep reading for that!
Don't miss out: Miss our last newsletter? We covered: Alphabet just crossed $400 billion in annual revenue, OpenAI unveiled a new "Frontier" initiative, Sam Altman says AI will run the company after him and more. Check it here!
Ep 708: Inside the Society of Agents: Why AI Teamwork Beats Bigger Models
For better AI agents, we just need bigger models, right?
Nah.
They just need to all work together in a society.
That's Ece Kamar's take. And she should know. She's been working in the AI agent field for two decades, long before the birth of the modern large language model.
So what's the next step for your company to embrace agents? Focus on smaller agents working and sharing, not one jumbo model working in a silo.
Tune in and find out how to make it work.
Also on the pod today:
• Society of agents explained
• Multi-agent teamwork vs. big models
• How agents schedule meetings
It'll be worth your 32 minutes. Listen on our site, or subscribe and listen on your favorite podcast platform.
Here are our favorite AI finds from across the web:
New AI Tool Spotlight: Obi is a voice AI agent for customer onboarding and activation; Overlead finds recent threads where people describe the exact problem you solve, ask for recommendations, and compare options; Clema is your AI co-pilot for federal higher ed data.
AI and Science: GPT-5 just slashed the cost of making proteins by 40 percent, running real lab experiments faster than humans ever could.
Opus 4.6 x v0: Opus 4.6 powers v0, and full-stack builds just got smarter.
Anthropic Upcoming Voice Mode: Claude is testing a new voice mode and smarter knowledge base. Something big could drop any day now.
Gemini 2.5 Multimodal Vision: Gemini 2.5 Pro brings real-time AR magic with object recognition and spatial guidance. The LEGO demo is just the start.
Trusted Access for Cyber: OpenAI is giving trusted users early access to its top cyber AI tools. Big perks and $10M in credits are on the line.
Gemini Super Bowl Ad: See how Google's Super Bowl ad puts Gemini to work, turning home dreams into reality. Want a sneak peek?
Amazon Stocks Slide: Amazon's $200 billion AI spending bombshell sent shares sliding. Is Big Tech's AI gamble getting too risky?
AI Deepfake Risk: A deepfake scammer stole $81,000 and a family home. See how it happened and how to spot the warning signs.
Nvidia Skipping Consumer GPUs: Nvidia is skipping new gaming GPUs in 2026, putting all bets on AI chips and leaving gamers waiting even longer. Curious?
1. Claude Opus 4.6 Sets a New Pace in AI Coding and Work Tools
Anthropic is shaking up the AI landscape with the launch of Claude Opus 4.6, now available with a huge 1 million token context window and major improvements in coding, reasoning, and document handling.
The new model is not only smarter at complex coding and research but also outperformed OpenAI's latest GPT-5.2 by a significant margin in industry benchmarks. With adaptive thinking, expanded office tool integrations, and better safety controls, Opus 4.6 is clearly aimed at both developers and enterprise teams who want a more autonomous, reliable assistant.
2. OpenAI Unleashes GPT-5.3-Codex for Next-Gen Coding Agents
OpenAI has just rolled out GPT-5.3-Codex, a major update aimed at developers using agent-style workflows, offering a significant speed boost and smarter tool use across platforms like the Codex app and IDE extensions.
The new model shines on key benchmarks, including SWE-Bench Pro and Terminal-Bench 2.0, while also stepping up its game in cybersecurity with "High capability" status under OpenAI's Preparedness Framework. Notably, GPT-5.3-Codex even helped debug its own training, marking a first for AI self-improvement during deployment.
3. Google Supercharges Workspace AI Access
Google has just rolled out its new AI Expanded Access add-on for Workspace, letting organizations boost their use of advanced AI features like image and video generation right as promotional access is set to end on March 1, 2026.
This move introduces a middle tier for power users between standard and ultra plans, giving admins more control over who gets extra AI muscle. The update means teams on Business and Enterprise plans must now purchase the add-on to keep their higher usage perks, marking a clear shift as Google adjusts to growing demand for creative and research AI tools.
4. Goodfire Bags $150M to Crack Open the AI Black Box
San Francisco-based Goodfire just scored $150 million in Series B funding at a $1.25 billion valuation, turbocharging its mission to make AI models more transparent and controllable.
The startup stands out by focusing on "interpretability," which means figuring out how AI thinks and using that knowledge to design safer, smarter models, an approach that's already led to new Alzheimer's biomarkers. With backing from investors like B Capital and Salesforce Ventures, Goodfire is positioning itself as a research-first challenger to the typical "black box" AI giants.
5. CarPlay Set to Welcome AI Chatbots Soon
Apple is reportedly preparing to open CarPlay to voice-enabled AI chatbot apps in a software update expected within months, according to Bloomberg.
This change means drivers will soon interact with advanced chatbots like ChatGPT or Gemini directly through CarPlay, making hands-free AI conversations safer and more accessible on the road. However, Siri isn't going anywhere. Apple will keep its own assistant front and center, and users won't be able to swap out the Siri button for third-party bots.
6. Perplexity Unveils Model Council for Smarter AI Answers
Perplexity just launched Model Council, a new feature that lets users tap the power of multiple top AI models at once, promising smarter and more reliable responses. The move comes as model accuracy and task performance become more unpredictable, making it harder to know which AI delivers the best results for any given question.
Now, instead of switching between models, users can compare answers from several heavyweights in one go, with a synthesizer system highlighting agreements and disagreements.
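For a rough feel of what "highlighting agreements and disagreements" could look like, here's a minimal Python sketch. To be clear, this is not Perplexity's implementation: the model names, the canned answers, and the `synthesize` helper are all made up for illustration, and "agreement" here just means an exact match after light normalization.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def synthesize(answers: dict[str, str]) -> dict:
    """Group models by normalized answer and report consensus vs. dissent.

    `answers` maps a (hypothetical) model name to its answer text.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for model, answer in answers.items():
        groups[normalize(answer)].append(model)

    # The largest group is treated as the consensus; ties go to the first seen.
    consensus = max(groups, key=lambda a: len(groups[a]))
    return {
        "consensus": consensus,
        "agree": sorted(groups[consensus]),
        "disagree": {a: sorted(m) for a, m in groups.items() if a != consensus},
    }


# Canned answers standing in for real model calls (assumption: no API is used).
report = synthesize({
    "model_a": "Paris",
    "model_b": "paris ",
    "model_c": "Lyon",
})
print(report)
```

A real synthesizer would likely need something fuzzier than exact matching, but the shape is the same: collect several answers, group them, and surface who disagrees.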
We've spent two years worshipping at the altar of the God Model. Bigger parameters, bigger budgets, bigger promises.
But the idea that one massive AI brain is gonna save your business? That strategy is officially cooked.
(And sorry... you prolly bet the house on it.)
The future ain't one giant model. It's societies of specialized agents forming teams, supervising each other, and running entire business functions together.
Legal, sales, operations. All of it.
And this ain't coming from some LinkedIn prophet. Ece Kamar, Corporate Vice President and Managing Director of AI Frontiers Lab at Microsoft Research, has been building agent systems for 20 years.
We got into all of it on today's Everyday AI and mapped out why teams of agents crush single models, the five-layer stack most leaders can't name, and what your org needs to completely rethink before it's too late.
Time to capitalize shorties.
Ece's team released Fara-7B, a small agentic model with seven billion parameters that runs on your machine and operates your computer like a human.
She tested it with a crossword puzzle. The agent needed her New York Times login, found the password reset link, and tried to reset her password on its own.
Nobody told it to do that.
That's what happens when reasoning models become relentless about finishing a task. It also creates an entirely new risk category.
Think agent-level hallucinations. The agent genuinely thinks it's helping but it's solving problems in ways you never approved.
The fix is multi-agent oversight. You build specialized agents whose only job is watching other agents and flagging when they go sideways.
Agents watching agents. That's your 2026 control mechanism.
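Here's one way to picture "agents watching agents" in code. This is a toy sketch under loud assumptions: the `Action` shape, the `APPROVED_KINDS` policy, and the `supervise` function are hypothetical, and a real oversight agent would probably be a model reviewing the worker's reasoning, not a simple allow-list.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """One step a worker agent wants to take (hypothetical shape)."""
    kind: str    # e.g. "open_page", "fill_form", "reset_password"
    target: str  # e.g. a site or account name


# Action kinds the human approved up front for this task (assumed policy).
APPROVED_KINDS = {"open_page", "fill_form"}


def supervise(plan: list[Action]) -> tuple[list[Action], list[Action]]:
    """Split a worker's plan into allowed steps and steps flagged for a human."""
    allowed, flagged = [], []
    for action in plan:
        (allowed if action.kind in APPROVED_KINDS else flagged).append(action)
    return allowed, flagged


# A plan loosely modeled on the crossword story above.
plan = [
    Action("open_page", "crossword site"),
    Action("reset_password", "user account"),
    Action("fill_form", "crossword grid"),
]
allowed, flagged = supervise(plan)
for action in flagged:
    print(f"FLAGGED for review: {action.kind} on {action.target}")
```

The point of the design is that the watcher runs before anything executes, so the unapproved password reset gets paused for a human instead of discovered after the fact.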
Try This
Pick your highest-stakes AI workflow. Ask one question. If this agent went rogue completing this task, what's the worst it could do?
Document those scenarios. Design your oversight layer around them before you scale.
That's how you build trust without pumping the brakes.
2. Five Stack Layers Most Leaders Miss
Most execs evaluate AI like they're picking a phone plan. Which model, how much, done.
The model is one layer of five.
The full agentic stack has five layers your strategy needs to account for. Model, orchestration, communication protocols like MCP, human-in-the-loop oversight, and memory.
And here's what should keep you up tonight. These layers are collapsing INTO the models through reinforcement learning.
Orchestration, memory, skill execution. All getting baked directly into the training process, not bolted on after.
That means the line between model and agent is dissolving. The thing you're calling a chatbot today is gonna be calling other agents, managing its own memory, and executing multi-step tasks natively by next year.
If your AI vendor only talks about the model, they're selling you 20% of the stack and calling it a strategy. And you're prolly nodding along because nobody told you to ask about the other four layers.
Try This
Sticky note. Five words. Model, orchestration, protocols, oversight, memory.
Bring it to every AI conversation this month. When a vendor goes blank on four of those five, you know exactly how incomplete their pitch is.
Start evaluating the full stack fam.
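If you'd rather keep the sticky note in a script, here's a small Python sketch of the same checklist. The layer names come from the five words above; the `pitch_coverage` helper and the example pitch are assumptions for illustration, not a formal scoring method.

```python
# The five layers from the sticky note.
STACK_LAYERS = ["model", "orchestration", "protocols", "oversight", "memory"]


def pitch_coverage(layers_addressed: set[str]) -> tuple[float, list[str]]:
    """Return the share of the five layers a pitch covers and the ones it skips."""
    missing = [layer for layer in STACK_LAYERS if layer not in layers_addressed]
    covered = len(STACK_LAYERS) - len(missing)
    return covered / len(STACK_LAYERS), missing


# Hypothetical vendor pitch that only talks about the model.
share, missing = pitch_coverage({"model"})
print(f"Covered: {share:.0%}, missing: {', '.join(missing)}")
```

One layer out of five is where the "20% of the stack" figure comes from.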
3. Your Smartest AI Person Ain't the VP
The person closest to the AI frontier at your company prolly isn't in leadership. It's whoever spent Saturday night stress-testing agent frameworks that haven't hit Product Hunt yet.
And that gap between the C-suite and the people actually using these tools is gonna determine whether your org thrives or stalls.
Ece made this point clearly. If you nail the tech but fumble the culture, you get nothing.
This transformation is as much about people as it is about software. Agents are gonna form teams alongside your humans. The orgs that win will be the ones where experimentation flows from everywhere on the org chart, not just the corner office.
And the bigger picture here is even wilder. We're watching the same pattern as the early internet play out in fast forward. Core tech, then apps, then ecosystems, then entirely new business models that reshape industries.
Your spot in that new economy is being decided right now. By whether your people are experimenting or just spectating.
Try This
Launch a weekly demo hour. Anyone shows an AI tool or workflow they found, no slides, no approvals.
When a 24-year-old rewires how you think about your own job, that ain't a threat fam. That's your company staying alive.
Start this Monday.





