
Anthropic Claude 3: Better than ChatGPT and Google Gemini? 🤔

🧠 Breakdown of Claude 3, ex-Google engineer accused of stealing AI secrets, new ChatGPT speech feature, and more!

Outsmart The Future

Today in Everyday AI
6 minute read

🎙 Daily Podcast Episode: Anthropic just released Claude 3. How does it stack up against the competition? We put it to the test. Give it a listen.

🕵️‍♂️ Fresh Finds: AI to help you monitor your LLM production, call centers in danger due to AI, and an AI scam using your loved one’s voice. Read on for Fresh Finds.

🗞 Byte Sized Daily AI News: Ex-Google engineer accused of stealing AI secrets, Salesforce’s AI healthcare tools, and AWS’ new AI partnerships. For that and more, read on for Byte Sized News.

🚀 AI In 5: ChatGPT just added a new speech feature! Is it time to get rid of Siri and Alexa? See it here

🧠 Learn & Leveraging AI: We’re giving you the lowdown on Claude 3 to see if it really is better than the competition. Keep reading for that!

↩️ Don’t miss out: Did you miss our last newsletter? We talked about the dispersion of AI jobs in the U.S., Microsoft’s response to NYT, and OpenAI clapping back at Elon Musk. Check it here!

Anthropic Claude 3 - Better Than ChatGPT and Google Gemini? 🤔

Anthropic just released its new AI model Claude 3 and it’s already giving impressive results.

Boasting even better performance than GPT-4, could this really be the new AI model king?

That got us wondering, is it actually better than other models like ChatGPT and Google Gemini?

We're breaking down Claude 3 and showing you how good it actually is.

Join the conversation and ask Jordan about Anthropic Claude 3 here.

Also on the pod today:

• AI Model Challenges and Tests 🧪
• Discussion on Claude Opus API 🔃
• Breakdown of Anthropic Claude 3 👀

It’ll be worth your 46 minutes:

Listen on our site:

Click to listen

Subscribe and listen on your favorite podcast platform

Here are our favorite AI finds from across the web:

New AI Tool Spotlight – Athina helps monitor your LLM production, Customerly is AI customer service, and Career Connect gives you personalized AI-crafted emails.

Big Tech – Salesforce’s AI Executive talks about how to lead with AI.

Future of Work – Will AI be the death of call centers? Teleperformance has something to say about that.

Trending in AI – Is Prompt Engineering dead already? Maybe so…

Read This – This AI scam is using your loved one’s voice.

1. Former Google Engineer Accused of Stealing AI Secrets 🤫

A former Google software engineer, Linwei Ding, has been indicted on four charges in California for allegedly stealing 500 confidential files related to Google's supercomputing data centers for two Chinese companies. Ding reportedly worked for these companies in secret while employed at Google and faces up to 10 years in prison and hefty fines if convicted.

2. Salesforce Revolutionizes Healthcare with New AI Tools 🧑‍⚕️

Salesforce unveils Einstein Copilot: Health Actions, empowering doctors to breeze through appointments and patient info using AI conversationally. Assessment Generation digitizes health surveys, eliminating manual input hassles. Physicians facing burnout due to admin tasks can rejoice as Salesforce streamlines medical data from various sources.

3. AWS Unveils Generative AI Competency Partners 🤝

AWS has launched the Generative AI Competency to spotlight top partners excelling in generative AI tech using AWS services like Amazon SageMaker JumpStart and AWS Inferentia. These partners are pioneers in driving business growth through innovative generative AI solutions globally.

4. Tech Billionaires Jumping on Nuclear Energy for AI ☢️

Silicon Valley billionaires are pouring money into nuclear energy projects while grappling with the energy-intensive AI boom. The rapid growth of generative AI models like GPT could result in environmental costs up to five times higher than standard searches.

5. Microsoft Announces "New Era of Work" AI Device Event 💻

Microsoft is gearing up to revolutionize the workplace with sleek new devices and cutting-edge AI enhancements. Get ready for the Surface Pro 10 with an OLED display and the Surface Laptop 6 featuring a fresh design and powerful specs. Rumors suggest Intel and Snapdragon models are on the horizon, promising a tech showdown in the coming months.

New ChatGPT Speech Feature!

Now this is a ChatGPT feature you have to hear to believe.

Literally. ChatGPT just added a speech feature!

We’re showing you how it works and where to access it.

Or see this related video:

🦾 How You Can Leverage:

Gemini, Mistral, Einstein, Claude… It’s like a revolving door of freshly updated GenAI offerings.

Blink twice and you might miss the latest and greatest large language model.

(Don’t worry. Our host Jordan used to win blinking contests in 5th grade. So we don’t miss a dang thing.)

So when juggernaut AI startup Anthropic released Claude 3 this week, we rolled up our collective sleeves at Everyday AI and went to work.

Here’s the headliner: Anthropic said that Claude 3 Opus beats the reigning champ, ChatGPT’s GPT-4, in every benchmark. (More on that later. And on the Claude varieties.)

Should you hang up your Gemini subscription? 

Get your team off of ChatGPT? 

Is Claude 3 even worth it? 

We obviously got all those answers and more. Jordan broke down the newest features of Anthropic’s new model.

Ready for the essentials?

Lez get it. 😎

1 – An asterisk on the benchmarks? 🤔

Shortly after we recorded this episode, some news dropped that’s pretty relevant before we dive too deep here.

Anthropic leaned heavily into its benchmark reports, which showed Claude 3 Opus as the most powerful model out right now. 

Like, literally beating GPT-4 and Google’s Gemini in every benchmark. 😳

(If you’re new to LLMs, benchmarks are essentially standardized tests across different tasks to see how smart and powerful models are. Or aren’t.)

But one eagle-eyed LLM researcher spotted that the comparison benchmarks may have been from OpenAI’s original GPT-4 model, not the squeaky clean GPT-4 Turbo that was released a few months ago.

Since it looks like that’s the case, you’ve gotta take Anthropic’s marketing with a grain of salt.

That’s not a slight against Anthropic, as they’re working with publicly available benchmarks. (It’s not like Google’s sneaky marketing attempt.)

Try this:
The UC Berkeley researcher updated the comparison, setting Claude 3’s benchmarks against the available benchmarks for GPT-4 Turbo.

The results there?
Claude 3 Opus was no longer the leader of the pack.

2 – What’s your Claude 3 flavor? 🫵

Anyone remember that Craig David song or just us? 🤷

Anyways, Claude 3 comes in three varieties, each with its own pros and cons.

For the most part, we’ll be talking about and referencing Opus, Anthropic’s most powerful flavor of Claude 3.

If you’re using Opus via the API, it’s crazy expensive. (But presumably, extremely powerful in its adaptability and robust feature set.)

Sonnet is the middle-of-the-pack model, balancing power and API cost. (According to Anthropic, Sonnet will be good for most day-to-day use cases.)

And the cheapest and least robust flavor is Haiku, which is apparently blazing fast. 

Try this:
Here’s a breakdown of Claude 3’s three varieties, including API costs and each flavor’s intended use cases.
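
To put "crazy expensive" in perspective, here is a rough Python sketch of how you might estimate per-request API cost for each flavor. The prices are the per-million-token list prices as we understand them from Anthropic's launch announcement, so treat them as assumptions and check Anthropic's pricing page before budgeting. The function name `estimate_cost` and the sample workload are made up for illustration.

```python
# Assumed launch-time list prices in USD per million tokens.
# These are NOT from this newsletter; verify against Anthropic's pricing page.
PRICES_PER_MILLION = {
    "opus": {"input": 15.00, "output": 75.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 0.25, "output": 1.25},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request for the given flavor."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
    for model in PRICES_PER_MILLION:
        print(f"{model}: ${estimate_cost(model, 2_000, 500):.4f} per request")
```

Under those assumed prices, the same request costs five times as much on Opus as on Sonnet, and sixty times as much as on Haiku. That gap is why the "which flavor?" question matters before you wire anything into production.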

3 – Results: we weren’t blown away 🤷

Yet.

Maybe because we use GPT-4 Turbo for hours a day, but we weren’t taken aback by the Opus model’s performance.

We did very unofficial and informal use-case testing, including basic writing, logic tests, business planning, and even vision capabilities.

Our (very unofficial) results showed that Claude 3 Opus lacked in areas where GPT-4 Turbo seemed to shine. 

As we said on the show: don’t take our word for it.

Our real-time showcase was more infotainment than actual testing, and you should investigate deeply when deciding the right LLM solution for you or your biz. 

But. 

Buuuuuuuut. 

We’re pretty hyped for some 3rd-party use cases for Opus.

Anthropic demoed Opus’ functionality, which looks like our first glimpse of brand-name agents. (Or at least a step there.)

Try this:
Check out Anthropic’s demo of Opus below:

Also, we gave Claude 3 a more in-depth side-by-side against ChatGPT and Google Gemini.

A quick 30-minute video might end up saving you hoooooours down the road.

⌚

Numbers to watch

March 21st, 2024

Microsoft has announced a new AI event for its new devices. The “New Era of Work” event will take place March 21st.

Now This …

Let us know your thoughts!

Will you be using Claude 3?
