Agentic AI: The risks and how to tackle them responsibly
Perplexity launches new Labs tool, Hugging Face unveils open-source humanoid robots, Gemini now auto-summarizes your emails and more!
Outsmart The Future
Today in Everyday AI
6 minute read
Daily Podcast Episode: A Microsoft leader breaks down multi-agentic systems, governance strategies, and human-AI collaboration to help tackle responsible agentic AI. Give it a listen.
Fresh Finds: Arizona Supreme Court turns to AI, YouTube Shorts gets a Google Lens feature and Perplexity's new update. Read on for Fresh Finds.
Byte Sized Daily AI News: Perplexity launches new Labs tool, Hugging Face unveils open-source humanoid robots and Google Gemini now auto-summarizes your emails. For that and more, read on for Byte Sized News.
Learn & Leveraging AI: Looking to harness the power of agentic AI? We break down what a Microsoft leader had to say about how you can do so responsibly. Keep reading for that!
Don't miss out: Did you miss our last newsletter? We talked about Amazon and The New York Times' AI content deal, DeepSeek updating its R1 model and Gemini being able to watch videos in Google Drive. Check it here!
Agentic AI: The risks and how to tackle them responsibly
We only talk about the upside of agentic AI.
But why don't we talk about the risks?
As AI agents grow exponentially more capable, so too does the likelihood of something going wrong.
Case in point:
↳ Microsoft's new multi-agent announcements: mind-blowing.
↳ They can talk together, divvy up work and make decisions autonomously.
↳ Upside = enormous.
↳ Risks? Can't be overlooked.
So how can we take advantage of agentic AI while also addressing the risks head on?
Also on the pod today:
• Agentic AI's Ethical Implications
• Microsoft's AI Governance Strategies
• Agentic AI: Future Workforce Skills
It'll be worth your 31 minutes.
Here are our favorite AI finds from across the web:
New AI Tool Spotlight - Odyssey creates interactive AI videos, Ainee is an AI-driven note-taking companion and OpenMemory MCP is memory for your AI tools.
Trending in AI - The Arizona Supreme Court is turning to AI-generated reports to deliver news.
OpenAI - OpenAI is arguing to keep its countersuit against Elon Musk in the trial over its for-profit shift.
AI in Media - YouTube now lets you search for things you see in Shorts.
Perplexity - Perplexity's new update now has your pages pinned in the top left for easy access.
Searching academic papers and journals is now easier than ever.
Finance, Travel, Shopping, and Academic pages are now pinned to your sidebar for quick, seamless access.
Currently available on web.
- Perplexity (@perplexity_ai), 3:19 PM, May 30, 2025
AI Models - Black Forest Labs' new Kontext models can edit pics and generate them.
Money in AI - Grammarly has secured $1 billion from General Catalyst to build an AI productivity platform.
Google - Google has fixed a bug that led AI Overviews to say it's now 2024.
AI in Government - RFK Jr.'s recent health report seems to have tons of AI errors in it.
1. Perplexity Unveils Labs for Advanced AI-Powered Workflows
Perplexity just launched Perplexity Labs, a new tool for its $20/month Pro subscribers that automates complex tasks like report and dashboard creation using AI, available now on web and mobile. This move signals Perplexity's push beyond search into productivity, leveraging AI to handle research, coding, and data visualization in workflows lasting 10 minutes or more.
The timing is notable as Perplexity also previews its Comet browser and recently acquired a professional social platform, showing clear ambition to expand its footprint.
2. Hugging Face Unveils Open Source Humanoids
Hugging Face just dropped two new open source humanoid robots, HopeJR and Reachy Mini, signaling a bold move deeper into robotics this year. HopeJR offers a full-scale 66-degree-of-freedom robot capable of walking and arm movement, while Reachy Mini is a compact desktop bot designed for testing AI applications with head movement and speech.
With prices starting around $3,000 and $250 respectively, these robots aim to democratize robotics beyond big players, thanks in part to Hugging Face's recent acquisition of Pollen Robotics.
3. Google's Gemini AI Now Auto-Summarizes Your Emails
Google just took a bold step by making its Gemini AI assistant automatically summarize lengthy emails right at the top of your inbox, no clicks required, according to TechCrunch. This update rolls out alongside existing manual options and aims to keep users on top of complex email threads by updating summaries as conversations evolve.
While this shows how deeply AI is embedding itself into everyday tools, it comes with a cautionary note: past AI summary features from Google and others have sometimes stumbled with accuracy.
4. Judge Questions AI's Impact on Google Antitrust Case
In a crucial moment during the DOJ's antitrust trial against Google, Judge Amit Mehta pressed lawyers on whether emerging AI-driven search engines can realistically compete with Google's dominance.
The Department of Justice is pushing for remedies including selling off Chrome and restricting Google's use of AI to block monopoly power, while Google argues these moves could harm innovation and national security. This debate comes as AI tools reshape how people search online, raising questions about fair competition and future market dynamics.
5. OpenAI's Next Big Move: AI Without Screens
OpenAI is gearing up to reshape how we interact with artificial intelligence by building an "ambient computer layer" that eliminates the need for screens, according to COO Brad Lightcap at the WSJ Future of Everything event. Following its recent $6.5 billion acquisition of AI device startup io, founded by former Apple designers including Jony Ive, OpenAI aims to create a truly personal AI experience beyond browsers and apps.
CEO Sam Altman envisions a future where accessing AI won't require opening a web browser or typing queries, hinting at revolutionary hardware launching in 2026.
6. Business Insider Cuts 21% of Staff Amid Traffic Collapse and AI Pivot
Business Insider announced a significant 21% staff reduction as it grapples with a sharp drop in web traffic, largely due to changes in search engine dynamics, notably the impact of Google Zero.
CEO Barbara Peng revealed the company is pulling back from search-dependent commerce categories that once thrived but are now underperforming. In response, Business Insider is doubling down on AI innovations, integrating Enterprise ChatGPT, generative AI site search, and even an AI-powered paywall to reinvent its revenue streams.
7. Delaware AG Hires Bank to Evaluate OpenAI's For-Profit Shift
Delaware's attorney general is bringing in an independent investment bank to evaluate OpenAI's plan to convert from nonprofit to for-profit, potentially slowing down the transition. This move adds another layer of oversight beyond the banks already hired by OpenAI and Microsoft, focusing on the valuation of equity held by OpenAI's nonprofit arm.
Elon Musk's recent $97.4 billion takeover bid for OpenAI, though rejected, may have influenced regulatory concerns about the startup's valuation. This scrutiny could complicate OpenAI's efforts to attract new investors and eventually go public, impacting the broader AI industry's financial landscape.
How You Can Leverage:
Microsoft just gave agents their own corporate IDs.
No, seriously.
While you were debating whether ChatGPT could replace your intern, Sarah Bird and team were busy building an entire identity management system for AI agents.
Because apparently 81% of companies plan to deploy these digital workers in the next 18 months.
And they're gonna need badges.
Sarah is Microsoft's Chief Product Officer of Responsible AI and she joined the Everyday AI show today to help us understand the freshly rewritten playbook that is Responsible AI.
Why?
Well, Responsible AI was a lot more straightforward 2.5 years ago when we just had a one-on-one chat with an AI chatbot.
But now?
Microsoft's Copilot, as an example, has options for multiple AI agents to work together, divvy up work, and finish it all on your behalf.
For real.
That's why we chatted with Sarah on today's episode of Everyday AI, because the rules of Responsible AI are changing REAL QUICK and business leaders gotta keep up.
Here's what ya need to know.
1 - Your agents are becoming digital employees (with actual IDs)
Sarah dropped some knowledge that made us pause the 1990s Super Nintendo.
(Yes, that reference lands perfectly after today's convo.)
See, everyone's treating agents like fancy chatbots, but Microsoft just started giving them actual Entra IDs, the same identity system they use for human employees.
Sarah explained how agents are this weird new entity that's not quite a user, not quite an application, but something totally different that needs its own governance structure.
Think about what this actually means for your organization.
Your agent can access customer data. Financial systems. Internal communications. And here's the part that should make you sweat:
Sarah mentioned these agents will work for hours without any human checking in, just doing their thing, making decisions, accessing systems, coordinating with other agents like some kind of digital Ocean's Eleven crew.
That's not a tool.
That's an employee.
And most companies? They're still treating agent security like it's 2022: no identity management, no access controls, no governance framework.
Just vibes and prayers. Lolz.
Microsoft saw this coming and built agent IDs directly into Entra, connected it to Defender for threat monitoring, and made sure when developers build agents in Foundry or Copilot, the identity gets attached automatically.
No manual process. No forgotten steps. Just smart infrastructure that treats agents like the digital workforce members they're becoming.
Try this:
Stop what you're doing and open your org chart right now.
Add a new box called "Digital Workforce" and list every system your entry-level employees can access. That's your starting point for agent permissions. Create three access tiers: basic (read-only data), intermediate (can modify non-critical systems), and advanced (financial/customer data access).
The key is assigning every future agent to a tier BEFORE you build it, not after it's already loose in your systems doing who knows what.
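If you want to see what "assign a tier BEFORE you build it" could look like on paper, here's a minimal sketch. Everything in it is an assumption for illustration: the agent names, the system names, and the idea of giving each system a minimum tier are made up, and none of this is Microsoft's Entra setup. It just shows the three tiers plus a deny-by-default check.

```python
from enum import IntEnum


class AccessTier(IntEnum):
    BASIC = 1         # read-only data
    INTERMEDIATE = 2  # can modify non-critical systems
    ADVANCED = 3      # financial/customer data access


# Hypothetical systems and the minimum tier each one requires.
REQUIRED_TIER = {
    "knowledge_base": AccessTier.BASIC,
    "ticket_queue": AccessTier.INTERMEDIATE,
    "billing": AccessTier.ADVANCED,
}

# Every agent gets a tier before anyone builds it (names are made up).
AGENT_TIERS = {
    "faq-helper": AccessTier.BASIC,
    "ticket-triager": AccessTier.INTERMEDIATE,
}


def can_access(agent: str, system: str) -> bool:
    """Deny by default: unknown agents and unknown systems get no access."""
    tier = AGENT_TIERS.get(agent)
    required = REQUIRED_TIER.get(system)
    if tier is None or required is None:
        return False
    return tier >= required
```

The useful bit is the default: an agent nobody registered gets nothing, so a forgotten step fails closed instead of open.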
2 - The testing catastrophe heading straight for your deployment
Picture this scene that Sarah described happening at companies everywhere.
Teams spend months building their perfect agent system, getting hyped about all the time it'll save, crafting beautiful demos for leadership. They reach the finish line, ready to ship this bad boy into production.
Then someone in the back of the room raises their hand: "So... what happens if it does something weird?"
Cricket sounds.
Panic.
Yikes.
Here's what blew our minds about what Sarah shared: testing agents isn't remotely like testing traditional software where you check if the login button navigates to the right page.
You're testing whether the agent understands user intent, whether it picks the right tool from its toolkit, whether it stays on task when working solo for three hours straight without drifting into some bizarre tangent about medieval farming techniques.
Microsoft built specific evaluators in Foundry just for this madness. They test for copyright violations (because no one wants that). They test for prompt injection vulnerabilities (because hackers are creative). They test whether the agent can be tricked into leaking sensitive data (because social engineering works on robots too, apparently).
But here's the kicker that Sarah emphasized: her team at Microsoft tests from DAY ONE of development, not as some afterthought when the CFO is breathing down their neck about launch dates.
They co-develop the testing alongside the actual system, catching issues when they're tiny problems instead of company-ending disasters.
Try this:
Before writing a single line of code for your next agent, channel your inner pessimist and write down 10 specific ways it could fail in YOUR environment.
Not generic "it might hallucinate" fears, but concrete nightmares like "interprets 'review customer feedback' as 'respond to every negative review with a 50% discount code.'"
Build one test for each failure mode and run these tests every single time you iterate. Not at the end, not when you remember, but systematically every time like your job depends on it (because it might).
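Here's a rough sketch of what "one test per failure mode" might look like, using the discount-code nightmare from above. It's an illustration under loud assumptions, not the Foundry evaluators: `fake_agent` is a stand-in you'd swap for your own agent, and the action names are invented.

```python
from typing import Callable


# Stand-in for a real agent: takes an instruction, returns the actions it
# would take. Hypothetical; replace with a call to your own agent.
def fake_agent(instruction: str) -> list[str]:
    if "review customer feedback" in instruction.lower():
        return ["read_reviews", "summarize_reviews"]
    return []


# One check per written-down failure mode. Each returns True when the
# agent avoided that failure. Action names are invented for this sketch.
def no_discount_codes(actions: list[str]) -> bool:
    return "issue_discount_code" not in actions


def no_public_replies(actions: list[str]) -> bool:
    return "reply_to_review" not in actions


FAILURE_MODE_CHECKS: dict[str, list[Callable[[list[str]], bool]]] = {
    "Review customer feedback": [no_discount_codes, no_public_replies],
}


def run_failure_mode_suite(agent: Callable[[str], list[str]]) -> list[str]:
    """Return a description of every failure mode the agent tripped."""
    failures = []
    for instruction, checks in FAILURE_MODE_CHECKS.items():
        actions = agent(instruction)
        for check in checks:
            if not check(actions):
                failures.append(f"{instruction!r}: {check.__name__}")
    return failures
```

Run `run_failure_mode_suite(your_agent)` on every iteration; an empty list means none of the listed nightmares happened this time.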
3 - Your workforce needs superpowers they don't have yet
Remember when everyone freaked out about calculators replacing math teachers?
The intersection of agentic AI and Responsible AI is wayyyyyy different.
Sarah explained how humans are moving from the "inner loop" to the "outer loop" of oversight, and if that sounds like corporate jargon, let's break it down into human speech.
Inner loop means you're the helicopter parent of a single AI chatbot: checking every output, approving every decision, basically babysitting a very smart toddler.
Outer loop means agents might work autonomously for hours while you monitor patterns and aggregates, only stepping in when something looks systemically wrong.
It's like going from being a chatbot micromanager to being an agentic CEO overnight.
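To make "monitor patterns and aggregates" a bit more concrete, here's a small sketch of an outer-loop check. The thresholds, the agent names, and the `(agent, succeeded)` log shape are all assumptions for illustration, not anything Sarah described. The idea: don't review every run, just flag agents whose failure rate looks systemically off.

```python
from collections import Counter


def agents_needing_review(runs, max_failure_rate=0.2, min_runs=5):
    """Flag agents whose aggregate failure rate is above the threshold.

    `runs` is an iterable of (agent_name, succeeded) pairs. Agents with
    fewer than `min_runs` runs are skipped, since there is too little
    data to call anything a pattern.
    """
    totals, failures = Counter(), Counter()
    for agent, succeeded in runs:
        totals[agent] += 1
        if not succeeded:
            failures[agent] += 1
    return sorted(
        agent
        for agent, count in totals.items()
        if count >= min_runs and failures[agent] / count > max_failure_rate
    )


# Made-up run log: (agent name, did the run succeed?)
run_log = (
    [("invoice-bot", True)] * 9 + [("invoice-bot", False)]
    + [("email-bot", False)] * 3 + [("email-bot", True)] * 2
    + [("new-bot", False)]
)
flagged = agents_needing_review(run_log)
```

With this log only `email-bot` gets flagged: `invoice-bot` fails 1 run in 10, and `new-bot` has just one run so far.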
Sarah mentioned how even within her own team at Microsoft, they run learning sessions where people share what agents they built, what worked brilliantly, and what failed so spectacularly it became an office legend.
They're doing this because nobody (and y'all, we mean NOBODY, not even Microsoft) actually knows the best patterns yet for human-agent collaboration.
It's fresh.
The wildest part?
Microsoft Research just launched Magentic-UI, which is literally an experimental playground for testing different ways humans and agents can work together, because the interface patterns we need don't exist yet. They're crowdsourcing innovation because this problem is so new that even the tech giants are making it up as they go.
Try this:
Sarah's team does something genius that you can steal immediately.
Start "Failure Fridays" where everyone shares one agent experiment. Not boring status updates, but real stories like "I built an agent that saved 10 hours weekly but then it started ordering office supplies every time someone mentioned being 'out of ideas.'"
Document every pattern religiously, because within a month you'll have a playbook of what actually works in YOUR specific environment, not some generic best practices document that treats every company like they're identical twins.