Ep 789: Tokenmaxxing is over: The New Era of Token Efficiency and how Your Company Should Adapt
Microsoft debuts MAI-Thinking-1 and Autopilot 'Scout' Agent, OpenAI releases Enterprise Codex Plugins, White House signs AI order and more.
Sup y’all 👋
A TON of new updates, from Microsoft Build, OpenAI, Anthropic and more. I’m sitting at the Microsoft Build keynote as I send this, so I’m sure we’ll have more tomorrow.
For our AI at Work Wednesdays that do hands-on live demos, I’m thinking of working in more task-specific demos across different platforms.
Ex: making slides with AI in PowerPoint, ChatGPT, Claude, Gemini, etc.
What type of work do you care most about automating with AI? 🤔
✌️
Jordan
Outsmart The Future
Today in Everyday AI
8 minute read
🎙 Daily Podcast Episode: Think using more tokens means better business outputs? Think again. Give today’s show a watch/read/listen.
🕵️‍♂️ Fresh Finds: OpenAI is bringing a massive new data center to Michigan, Hermes Agent gets released for the Desktop and Microsoft teases its Super App. Read on for Fresh Finds.
🗞 Byte Sized Daily AI News: Microsoft debuts MAI-Thinking-1 and Autopilot 'Scout' Agent, OpenAI releases Enterprise Codex Plugins, White House signs AI order and more. Read on for Byte Sized News.
💪 Leverage AI: Most companies are tracking the wrong AI metric. Here’s how to shift from tokenmaxxing to token efficiency. Keep reading for that!
↩️ Don’t miss out: Miss our last newsletter? We covered: Bernie Sanders proposes 50% AI tax, NVIDIA drops new agents and PC partnerships, Anthropic files IPO paperwork and more. Check it here!
Ep 789: Tokenmaxxing is over: The New Era of Token Efficiency and how Your Company Should Adapt
More tokens = more ROI, right? 🤔
Maybe.
But probably not.
Maybe one of the weirdest AI trends that has oddly stuck in 2026 is tokenmaxxing -- the practice of individuals and companies racing to use as many AI tokens as possible and equating it with business progress.
Reality check: token efficiency is the real rage.
So, how do you measure token efficiency and how can your company avoid the cost pitfalls of tokenmaxxing?
Also on the pod today:
• Tokenmaxxing: internal leaderboards exposed 📊
• $500M AI spend—one company, one month 💸
• What is an AI token? 🔤
Listen on our site:
Subscribe and listen on your favorite podcast platform
Here are our favorite AI finds from across the web:
New AI Tool Spotlight – Fundraisly Optimizes your Investor Matches, Gigacatalyst Delivers features that your sales team promises, without engineers, Branda turns a name and idea into a complete brand identity in minutes
Microsoft Build Begins — Microsoft is making a big play to win back developers at Build, with new AI models, a Copilot super app, and a revamped Windows 11 developer experience.
OpenAI and Michigan — OpenAI is building a massive new data center in Michigan, promising union jobs, community investment, and free AI tools for students.
Hermes Agent — Hermes Agent Desktop is now live.
ChatGPT Full Screen — ChatGPT just got a full-screen mode for long-form writing.
Microsoft Super App — Microsoft CEO Satya Nadella casually confirms the Super App, but no preview.
TwelveLabs Rodeo — TwelveLabs just dropped Rodeo, an AI tool that turns your raw video into polished stories in minutes using simple prompts.
Florida Sues OpenAI — Florida just sued OpenAI and Sam Altman, claiming ChatGPT is marketed as safe for kids but actually poses serious risks.
OpenAI and Politics — OpenAI says it hasn’t funded any political groups or candidates, and wants AI policy to stay out of partisan games. Curious how they’re handling all the pressure?
JetBrains Mellum2 — JetBrains just dropped Mellum2, a super-fast open model for text and code that’s perfect for high-throughput tasks.
1. Microsoft debuts MAI-Thinking-1 and expands its in-house AI lineup 🤔
At Build 2026, Microsoft announced MAI-Thinking-1, a new flagship in-house model that signals a bigger push to build its own AI systems after loosening its reliance on OpenAI.
Microsoft says the “medium-sized” model was trained from scratch on clean data and matches top models on key software engineering tests.
Microsoft also rolled out new models for image generation, transcription, voice, and coding, showing it wants more control over the AI tools powering products like GitHub Copilot and Visual Studio Code.
2. Microsoft’s Autopilot Agent Scout turns OpenClaw into a 365 work agent 🦞
Microsoft is launching Scout through its Frontier program, giving GitHub Copilot subscribers access to a cloud-based assistant built on the OpenClaw framework and designed to work across Microsoft 365, desktop apps, and the web browser.
Scout runs as a persistent agent with its own user-given name and style, meaning it can remember work preferences, absorb feedback, and build “skills” that help it handle tasks like calendar management, meeting agenda drafting, and other repeatable workflows. The bigger bet is that workers will train Scout around their own habits, while Microsoft tries to keep the agent in bounds with continuous policy checks and audit trails that track whether it is following the rules.
3. Anthropic expands Mythos access to 150 more partners 🪽
Anthropic is widening Project Glasswing, giving 150 additional partners in more than 15 countries access to its Mythos model for finding serious software flaws. The move brings sectors like power, water, healthcare, communications and hardware into the testing program, signaling that AI-powered security tools are moving from tech labs into critical infrastructure.
The expansion lands during a busy week for Anthropic, following its EU access announcement and confidential IPO filing, while concerns remain that the same vulnerability-finding power could also speed up attackers.
4. OpenAI turns Codex into a business workflow hub with new plugins 🔌
OpenAI announced a major Codex update with three key additions: Annotations for precise edits inside documents and spreadsheets, Sites for creating hosted internal web apps, and role-specific plugins that connect to tools like Salesforce, Snowflake, Figma, Tableau, and FactSet.
The update means business users can ask Codex to change a selected part of a file without breaking the rest, turn static work materials into shareable interactive pages, and automate multi-step tasks across analytics, sales, design, creative production, and finance.
5. OpenAI brings frontier models and Codex to AWS customers 🤖
OpenAI announced today that its frontier models and Codex are now generally available on AWS, giving enterprises a more direct way to use OpenAI tools inside the cloud systems they already trust.
The move matters because it turns the major AI adoption hurdles (security reviews, procurement, billing, and governance) into a more familiar AWS workflow.
6. Trump’s new AI order delays frontier model rules while boosting cyber reviews 📜
President Trump signed a scaled-back AI and cybersecurity executive order Tuesday, after scrapping a stricter version that raised competitiveness concerns. The order tells national security agencies to strengthen cyber defenses, build a cybersecurity clearinghouse, and create a classified process within 60 days to judge when advanced AI models pose major cyber risks.
For now, the White House is avoiding mandatory licensing or broad pre-release controls, signaling that Washington wants more oversight of powerful AI without slamming on the regulatory brakes just yet.
The next AI loser will look wildly productive.
Agents are running all day, and token charts are climbing.
Leaders point at usage like proof the business finally figured out AI.
Cute.
The scoreboard is broken.
Your competitors won’t win by using more tokens. They’ll win by turning fewer tokens into accepted work, faster cycles, and lower rework while everyone else celebrates the meter spinning.
That’s what we tackled in today’s Everyday AI Start Here Series: tokenmaxxing is the wrong scoreboard, token efficiency is the new operating discipline, and leaders need to know where the spend leaks before the subsidy era gets less friendly.
We’re breaking down why usage creates bad incentives, how tokens really disappear inside modern agents, and how to match the right model to the job before AI spend becomes another unmanaged SaaS swamp.
1. Stop rewarding AI motion 🔥
For the past six-ish months, companies treated heavy token use like proof of productivity, which is a rotten incentive. That’s how you get leaderboards instead of ROI.
Employees run empty loops because low usage suddenly feels like career risk.
Meta reportedly had an internal leaderboard, with one engineer at 281 billion tokens in a month, before anyone asked whether the output mattered. The scarier part is the behavior it trains inside teams.
People optimize for visible AI activity when leadership celebrates consumption instead of outcomes.
Every bad AI metric eventually becomes a management system once executives start managing to it.
Reward token burn and you’ll get longer loops, bigger bills, and prettier dashboards across every team with model access. Most of that activity will attach to almost nothing that customers, finance, or operations would accept.
Try This
On Monday, pull the top 10 token-consuming workflows or users. For each one, require one accepted output: shipped code, approved analysis, a resolved support backlog, a finished RFP, a reduced cycle time, or rework that disappeared.
If the owner can’t name the business artifact, kill the leaderboard energy and move the workflow into review.
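If you want to script that Monday audit, here's a minimal sketch. It assumes you can export per-workflow token totals from your own usage reporting. The workflow names, token counts, and the WorkflowUsage record are hypothetical placeholders, not data from the episode.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowUsage:
    """One row of a hypothetical usage export."""
    name: str
    tokens: int
    # Business artifacts the owner can actually name (shipped code, finished RFP, ...)
    accepted_outputs: list[str] = field(default_factory=list)


def review_queue(usage, top_n=10):
    """Return the names of top-N token consumers with no accepted output."""
    top = sorted(usage, key=lambda w: w.tokens, reverse=True)[:top_n]
    return [w.name for w in top if not w.accepted_outputs]


# Made-up example data, for illustration only.
usage = [
    WorkflowUsage("inbox-triage-agent", 900_000_000, ["resolved support backlog"]),
    WorkflowUsage("nightly-refactor-loop", 2_400_000_000),
    WorkflowUsage("rfp-drafter", 150_000_000, ["finished RFP"]),
]

print(review_queue(usage))  # ['nightly-refactor-loop']
```

Anything that lands in the review queue is burning tokens without a named artifact. That's where the conversation with the owner starts.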
2. Audit the hidden token drains ⚡
Tokens aren’t just typed words. They’re inputs, outputs, reasoning, and tool use behind the scenes.
Every long prompt, pasted doc, chat thread, file search, code run, web call, and scheduled agent can quietly hit the meter.
The economics changed fast. Token prices have fallen hard.
Usage still jumped 100 to 200x as reasoning models and agents started thinking by default, calling tools, reading files, and repeating steps.
One billion tokens in a day can resemble $12,000 to $17,000 of API usage depending on the mix. Now put that inside a daily or hourly agent that triages SharePoint, OneDrive, a team inbox, and a live dashboard while the human checks only the final output and nobody watches the meter.
That gets expensive fast.
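To put rough numbers on that, here's a quick back-of-the-envelope sketch. It uses only the figures quoted above (one billion tokens, $12,000 to $17,000 a day). The 30-day month and the idea that the pace holds every day are assumptions for illustration.

```python
TOKENS_PER_DAY = 1_000_000_000
DAILY_COST_RANGE_USD = (12_000, 17_000)  # range quoted in the text above


def implied_price_per_million(daily_cost_usd, tokens_per_day=TOKENS_PER_DAY):
    """Blended price per one million tokens implied by a daily bill."""
    return daily_cost_usd / (tokens_per_day / 1_000_000)


def monthly_cost(daily_cost_usd, days=30):
    """Bill for a month, assuming the same pace every day (an assumption)."""
    return daily_cost_usd * days


low, high = DAILY_COST_RANGE_USD
print(implied_price_per_million(low), implied_price_per_million(high))  # 12.0 17.0
print(monthly_cost(low), monthly_cost(high))  # 360000 510000
```

So the quoted range works out to a blended $12 to $17 per million tokens, and an agent that holds that pace for a 30-day month would run roughly $360,000 to $510,000.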
Try This
Take one valuable agent workflow and map the four drains separately: source material, final response, reasoning, and tools. Then compare the cost and quality against human-only work, human plus non-agentic AI, and an expert-driven agent loop.
The target: fewer wasted tokens, fewer runaway loops, and a clean view of which steps actually move the work forward.
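One way to map the four drains is a simple per-run breakdown like the sketch below. The field names mirror the four drains above. The token counts and dollar figures are made-up placeholders, and how you actually capture reasoning and tool tokens depends on what your provider reports.

```python
from dataclasses import dataclass


@dataclass
class DrainBreakdown:
    """Tokens for one workflow run, split across the four drains."""
    source_material: int  # prompts, pasted docs, retrieved files
    final_response: int   # what the human actually reads
    reasoning: int        # behind-the-scenes thinking tokens
    tools: int            # tool calls, code runs, web calls

    def total(self):
        return self.source_material + self.final_response + self.reasoning + self.tools

    def shares(self):
        """Each drain's share of the total, rounded to two decimals."""
        total = self.total()
        return {name: round(value / total, 2) for name, value in vars(self).items()}


def cost_per_accepted_output(total_cost_usd, accepted_outputs):
    """Spend divided by accepted work; infinite if nothing was accepted."""
    if accepted_outputs == 0:
        return float("inf")
    return total_cost_usd / accepted_outputs


# Hypothetical run, for illustration only.
run = DrainBreakdown(source_material=600_000, final_response=50_000,
                     reasoning=250_000, tools=100_000)
print(run.shares())
print(cost_per_accepted_output(total_cost_usd=15.0, accepted_outputs=3))  # 5.0
```

Run the same cost-per-accepted-output calculation for human-only work, human plus non-agentic AI, and the expert-driven agent loop, and the comparison gets a lot less hand-wavy.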
3. Buy output per dollar 🚀
Token efficiency means buying the intelligence your workflow needs, not the model your team likes. Use the smallest capable model for boring work, then save the expensive model for jobs where extra reasoning actually changes the outcome.
The model examples were blunt. Artificial Analysis showed Claude Opus 4.7 costing about $5,100 for its test run versus about $3,300 for OpenAI’s latest model, while Gemini 3.1 Pro was framed as dramatically cheaper for a similar intelligence level.
DeepSWE added the same management lesson from the coding side: don’t fall in love with the logo. If GPT-5.5 solves a harder agentic coding task at lower cost than another model, your governance process needs to notice before procurement locks everyone into the wrong default and makes waste look official.
The harness matters more than leaders usually want to admit. A modular setup lets teams swap models when GPT-5.6, Mythos, or whatever comes next breaks a flow, underperforms, or suddenly changes the cost math.
Try This
Build a model routing table for five recurring jobs: PDF parsing, summarization, RFP drafting, code changes, and executive research. For each job, record the default model, accepted output rate, rework rate, tool calls, cost, and fallback model.
Then make the rule painfully simple: no model keeps the job because it’s famous. It keeps the job because it delivers accepted work at the best cost, speed, and reliability.
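Here's a minimal sketch of that rule as code. It scores each candidate on cost per accepted output and ignores fame entirely. The model names, acceptance rates, costs, and the 80% quality floor are hypothetical placeholders. A fuller table would also track rework rate, tool calls, and a fallback model, as described above.

```python
from dataclasses import dataclass


@dataclass
class ModelRecord:
    model: str
    accepted_rate: float     # share of outputs accepted (0.0 to 1.0)
    cost_per_run_usd: float  # average spend per attempt


def cost_per_accepted(record):
    """Expected spend to get one accepted output."""
    if record.accepted_rate <= 0:
        return float("inf")
    return record.cost_per_run_usd / record.accepted_rate


def pick_default(candidates, min_accepted_rate=0.8):
    """Cheapest model per accepted output among those clearing the quality floor."""
    eligible = [c for c in candidates if c.accepted_rate >= min_accepted_rate]
    pool = eligible or candidates  # if nothing clears the floor, pick the least bad
    return min(pool, key=cost_per_accepted).model


# Made-up measurements for two of the five jobs, for illustration only.
candidates = {
    "summarization": [
        ModelRecord("small-model", 0.90, 0.02),
        ModelRecord("frontier-model", 0.95, 0.40),
    ],
    "code changes": [
        ModelRecord("small-model", 0.40, 0.05),
        ModelRecord("frontier-model", 0.85, 0.60),
    ],
}

routing = {job: pick_default(models) for job, models in candidates.items()}
print(routing)  # {'summarization': 'small-model', 'code changes': 'frontier-model'}
```

In this toy data the small model keeps the boring job and the expensive model earns the hard one, which is the whole point of the table.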
AI activity is easy to fake. Accepted output is harder, which is why your AI strategy needs this metric before the token bill finds the corporate AmEx.





