Anthropic Claude 3 : Better than ChatGPT and Google Gemini? 🤔

🧠 Breakdown of Claude 3, Ex-Google Engineer steals AI secrets, new ChatGPT speech feature, and more!

Everyday AI
March 07, 2024

Outsmart The Future

Today in Everyday AI
6 minute read

🎙 Daily Podcast Episode: Anthropic just released Claude 3. How does it stack up against the competition? We put it to the test. Give it a listen.

🕵️‍♂️ Fresh Finds: AI to help you monitor your LLM production, call centers in danger due to AI, and an AI scam using your loved one’s voice. Read on for Fresh Finds.

🗞 Byte Sized Daily AI News: Ex-Google Engineer steals AI secrets, Salesforce’s AI healthcare tools, and AWS’ new AI partnerships. For that and more, read on for Byte Sized News.

🚀 AI In 5: ChatGPT just added a new speech feature! Is it time to get rid of Siri and Alexa? See it here

🧠 Learn & Leverage AI: We’re giving you the lowdown on Claude 3 to see if it really is better than the competition. Keep reading for that!

↩️ Don’t miss out: Did you miss our last newsletter? We talked about the dispersion of AI Jobs in the U.S., Microsoft responds to NYT, and OpenAI claps back at Elon Musk. Check it here!

Anthropic Claude 3 - Better Than ChatGPT and Google Gemini? 🤔

Anthropic just released its new AI model Claude 3 and it’s already giving impressive results.

Boasting even better performance than GPT-4, could this really be the new AI model king?

That got us wondering, is it actually better than other models like ChatGPT and Google Gemini?

We're breaking down Claude 3 and showing you how good it actually is.

Join the conversation and ask Jordan your questions about Anthropic Claude 3 here.

Also on the pod today:

• AI Model Challenges and Tests 🧪
• Discussion on Claude Opus API 🔃
• Breakdown of Anthropic Claude 3 👀

It’ll be worth your 46 minutes:

Listen on our site:

Click to listen

Subscribe and listen on your favorite podcast platform

Listen on: Spotify | Apple Podcasts | Google Podcasts | Amazon Music

Here are our favorite AI finds from across the web:

New AI Tool Spotlight – Athina helps monitor your LLM production, Customerly is AI customer service, and Career Connect gives you personalized AI-crafted emails.

Big Tech – Salesforce’s AI Executive talks about how to lead with AI.

Future of Work - Will AI be the death of call centers? Teleperformance has something to say about that.

Trending in AI – Is Prompt Engineering dead already? Maybe so…

Read This – This AI scam is using your loved one’s voice.

1. Former Google Engineer Accused of Stealing AI Secrets 🤫

A former Google software engineer, Linwei Ding, was indicted on four charges in California for allegedly stealing 500 confidential files related to Google's supercomputing data centers for two Chinese companies. Ding reportedly worked for these companies in secret while employed at Google, and he faces up to 10 years in prison and hefty fines if convicted.

2. Salesforce Revolutionizes Healthcare with New AI Tools 🧑‍⚕️

Salesforce unveils Einstein Copilot: Health Actions, empowering doctors to breeze through appointments and patient info using conversational AI. Assessment Generation digitizes health surveys, eliminating manual input hassles. Physicians facing burnout due to admin tasks can rejoice as Salesforce streamlines medical data from various sources.

3. AWS Unveils Generative AI Competency Partners 🤝

AWS has launched the Generative AI Competency to spotlight top partners excelling in generative AI tech using AWS services like Amazon SageMaker Jumpstart and AWS Inferentia. These partners are pioneers in driving business growth through innovative generative AI solutions globally.

4. Tech Billionaires Jumping on Nuclear Energy for AI ☢️

Silicon Valley billionaires are pouring money into nuclear energy projects while grappling with the energy-intensive AI boom. The rapid growth of generative AI models like GPT could result in environmental costs up to five times higher than standard searches.

5. Microsoft Announces "New Era of Work" AI Device Event 💻

Microsoft is gearing up to revolutionize the workplace with sleek new devices and cutting-edge AI enhancements. Get ready for the Surface Pro 10 with an OLED display and the Surface Laptop 6 featuring a fresh design and powerful specs. Rumors suggest Intel and Snapdragon models are on the horizon, promising a tech showdown in the coming months.

New ChatGPT Speech Feature!

Now this is a ChatGPT feature you have to hear to believe.

Literally. ChatGPT just added a speech feature!

We’re showing you how it works and where to access it.

Check out today's AI in 5.

Or see this related video:

ChatGPT's Data Analysis Showdown

🦾How You Can Leverage:

Gemini, Mistral, Einstein, Claude… It’s like a revolving door of freshly updated GenAI offerings.

Blink twice and you might miss the latest and greatest large language model.

(Don’t worry. Our host Jordan used to win blinking contests in 5th grade. So we don’t miss a dang thing.)

So when juggernaut AI startup Anthropic released Claude 3 this week, we rolled up our collective sleeves at Everyday AI and went to work.

Here’s the headliner: Anthropic said that Claude 3 Opus beats the reigning champ, OpenAI’s GPT-4, in every benchmark. (More on that later. And on the Claude varieties.)

Should you hang up your Gemini subscription?

Get your team off of ChatGPT?

Is Claude 3 even worth it?

We obviously got all those answers and more. Jordan broke down the newest features of Anthropic’s new model.

Ready for the essentials?

Lez get it. 😎

1 – An asterisk on the benchmarks? 🤔

Shortly after we recorded this episode, some news dropped that’s pretty relevant before we dive too deep here.

Anthropic leaned heavily into its benchmark reports, which showed Claude 3 Opus as the most powerful model out right now.

Like, literally beating GPT-4 and Google’s Gemini in every benchmark. 😳

(If you’re new to LLMs, benchmarks are essentially standardized tests across different facets to see how smart and powerful models are. Or aren’t.)

But one eagle-eyed LLM researcher spotted that the comparison benchmarks may have been from OpenAI’s original GPT-4 model, not the squeaky clean GPT-4 Turbo that was released a few months ago.

Since it looks like that’s the case, you’ve gotta take Anthropic’s marketing with a grain of salt.

That’s not a slight at Anthropic, as they’re working with publicly available benchmarks. (It’s not like Google’s sneaky marketing attempt.)

Try this:
The researcher, who’s at UC Berkeley, redid the Claude 3 comparison using the available benchmarks for GPT-4 Turbo.

The results there?
Claude 3 Opus was no longer the leader of the pack.

2 – What’s your Claude 3 flavor? 🫵

Anyone remember that Craig David song or just us? 🤷

Anyways, Claude 3 comes in three varieties, each with its own pros and cons.

For the most part, we’ll be talking about and referencing Opus, Anthropic’s most powerful flavor of Claude 3.

If you’re using Opus via the API, it’s crazy expensive. (But presumably, it’s extremely powerful in its adaptability and robust feature set.)

Sonnet is the middle-of-the-pack model, balancing power and API cost. (According to Anthropic, Sonnet will be good for most day-to-day use cases.)

And the cheapest and least robust flavor is Haiku, which is apparently blazing fast.

Try this:
Here’s a breakdown of Claude 3’s three varieties, including API costs and each flavor’s intended use cases.
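Want to see what "crazy expensive" means for your own workload? Here's a minimal sketch of a cost estimator. The per-million-token prices are the list prices we understand Anthropic published at the Claude 3 launch, so treat them as an assumption and check the current pricing page before budgeting. The function name and the example token counts are ours, purely for illustration.

```python
# Assumed launch-time list prices in USD per million tokens (March 2024).
# These may have changed; verify against Anthropic's pricing page.
PRICES_PER_MILLION = {
    "opus": {"input": 15.00, "output": 75.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 0.25, "output": 1.25},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated API cost in USD for one workload."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly workload: 2M tokens in, 500K tokens out.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, 2_000_000, 500_000)
        print(f"{model:>6}: ${cost:,.2f}")
```

Under those assumed prices, the same workload costs many times more on Opus than on Haiku, which is why matching the flavor to the task matters.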

3 – Results: we weren’t blown away 🤷

Yet.

Maybe because we use GPT-4 Turbo for hours a day, but we weren’t taken aback by the Opus model’s performance.

We did very unofficial and informal use-case testing, including basic writing, logic tests, business planning, and even vision capabilities.

Our (very unofficial) results showed that Claude 3 Opus lacked in areas where GPT-4 Turbo seemed to shine.

As we said on the show — don’t take our word for it.

Our real-time showcase was more infotainment than actual testing, and you should investigate deeply when deciding the right LLM solution for you or your biz.
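If you want to run your own side-by-side, here's a rough sketch of how that could look. It is not how we tested on the show. Each "model" is just a function that takes a prompt and returns text, so you would swap the stand-ins below for real API calls. Every name here (`compare_models`, `model_a`, `model_b`) is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A "model" is any function that takes a prompt and returns a reply.
Model = Callable[[str], str]
# A test case pairs a prompt with a checker that returns True on a pass.
TestCase = Tuple[str, Callable[[str], bool]]


def compare_models(models: Dict[str, Model], cases: List[TestCase]) -> Dict[str, int]:
    """Run every prompt through every model and count passed checks."""
    scores = {name: 0 for name in models}
    for prompt, passed in cases:
        for name, ask in models.items():
            if passed(ask(prompt)):
                scores[name] += 1
    return scores


if __name__ == "__main__":
    # Stand-in models; replace these with calls to the LLMs you are evaluating.
    stand_ins = {
        "model_a": lambda prompt: "4" if "2 + 2" in prompt else "not sure",
        "model_b": lambda prompt: "not sure",
    }
    cases = [
        ("What is 2 + 2? Answer with a number only.",
         lambda reply: reply.strip() == "4"),
        ("Name the capital of France.",
         lambda reply: "paris" in reply.lower()),
    ]
    print(compare_models(stand_ins, cases))
```

The point of the design is that the prompts and pass/fail checks come from your own use cases, not from someone else's benchmark.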

But.

Buuuuuuuut.

We’re pretty hyped for some 3rd-party use cases for Opus.

Anthropic demoed Opus’ functionality, which looks like our first glimpse of brand-name agents. (Or at least a step in that direction.)

Try this:
Check out Anthropic’s demo of Opus below:

Anthropic on LinkedIn: Claude 3 Opus as an economic analyst

“Claude 3 Opus is our most intelligent model, with best-in-market performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen…”

www.linkedin.com/feed/update/urn:li:activity:7171168530111946753

Also, we gave Claude 3 a more in-depth side-by-side against ChatGPT and Google Gemini.

A quick 30-minute video might end up saving you hoooooours down the road.

⌚

Numbers to watch

March 21st, 2024

Microsoft has announced a new AI event for its new devices. The “New Era of Work” event will take place March 21st.

Now This …

Let us know your thoughts!

Will you be using Claude 3?
