13 Comments

I compared how Gemini Ultra and GPT-4 answered the same questions across a range of categories: https://theaidigest.org/gemini-vs-chatgpt

Readers voted for which answer they prefer, and it's pretty mixed - I think the performance is fairly similar for the two models.

Feb 22·edited Feb 22

Why is Google still lagging behind OpenAI? Here's one theory:

1. The tasks you present here are technical challenges. OpenAI is clearly trying to solve some of these challenges by grafting on particular functions that go beyond "pure" LLM-based next-token prediction -- this is why ChatGPT will now make use of Python code to solve math problems, for example. Although OpenAI is technically a nonprofit that claims it is pursuing artificial general intelligence, in reality it is acting like a commercial tech company that's iterating on its product to make it more useful to consumers.

2. In contrast, Google is a for-profit Big Tech company, yet some of the sharpest thinkers on the Google team (many coming from DeepMind) don't seem all that interested in solving these sorts of technical challenges. Francois Chollet, for example, is quite open in his research and public writings about seeing the current functionality of LLMs as quite narrow, and limited to whatever data they've been trained upon, and with very little capacity to generalize to novel situations. He -- and perhaps others at Google? -- appear to be searching for bigger conceptual breakthroughs that could lead to true general intelligence.

Tons of comments from conservatives in my TL about Gemini's guardrails on generating images of people of different races. Any thoughts on that?

Is there a price difference between the systems under test? The "latest version of GPT-4" is subscription-only - while Bard is free to use.

Fascinating!

Excellent analysis and testing setup!

Feb 23·edited Feb 23

I cannot even begin to express just how superior I find Gemini for almost all activities involving writing. I'll continue to subscribe to ChatGPT (GPT-4), but I pretty much haven't touched it in the past 9 days. GPT-4 continues to write everything in its default setting, that being an overly wordy essay that sounds like a pretentious college student trying to show off how intelligent it is. Gemini, on the other hand, sounds almost scarily human on most of the writing tasks I have it do. I have no idea about logic tests and coding, since I just never do any of that stuff. But for anything involving writing, especially in sales or marketing, or things like writing copy and emails, there's absolutely no contest: Gemini basically crushes GPT-4 in those areas.

Ouch!

The fruit slice obsession is hilarious. Gives Gemini a sort of human quality. ("The guy just really loves his sliced fruit segments, what are you gonna do?")

Your observations above appear to align with the tentative consensus out there. My most recent (completely unscientific) poll had 85% saying that ChatGPT (GPT-4) was better than Gemini Ultra in their tests (8% said they're about the same, while 8% preferred Gemini).

But there are also anecdotal indications that Gemini, while trailing behind on reasoning tasks, is actually a better *creative* writer than ChatGPT (less bland, more varied, more imaginative).

I have yet to test Gemini Ultra myself, as I'm holding off on pulling the trigger on the free trial until some of the early quirks are taken care of.

I think Gemini's ultimate claim to fame may come not from potentially nudging out GPT-4 on benchmarks but from the upcoming 1 million token window combined with Gemini's native multimodality.

The fact that Gemini can reliably understand lengthy video input, down to specific frames and individual objects, definitely looks like a leap from what we're currently used to. Take a look at this impressive example Ethan Mollick posted just a few hours ago (he has insider access that we don't):

https://www.linkedin.com/feed/update/urn:li:activity:7166242775103971328/
