DeepSeek, ChatGPT, Grok… Which Is the Best AI Assistant? We Put Them to the Test

The world of artificial intelligence has seen rapid advancements in recent years, with several powerful chatbots emerging to compete for dominance in the AI market. OpenAI’s ChatGPT, Google’s Gemini, Elon Musk’s Grok, Anthropic’s Claude, Meta AI, and the newly introduced DeepSeek from China are some of the most talked-about AI assistants today. But which one stands out as the best?

We put these AI models to the test, examining their strengths, weaknesses, and unique capabilities in areas like reasoning, creativity, real-time updates, humor, and image generation. With expert insights from Robert Blackwell, a senior research associate at the UK’s Alan Turing Institute, we explore how these models stack up against one another.

The Rise of DeepSeek: A New Challenger in the AI Race

DeepSeek, the newest AI chatbot from China, made headlines when it launched on January 20, 2025. Within days, it triggered a roughly $1 trillion sell-off in US tech stocks as investors feared America’s lead in AI was under threat.

Unlike its American counterparts, DeepSeek operates under strict government-imposed constraints and avoids politically sensitive topics, such as Tiananmen Square or China’s President Xi Jinping. However, this chatbot is no lightweight: it is powered by a reasoning model called R1, which allows it to compete with the likes of ChatGPT and Gemini.

Putting the AI Assistants to the Test

To evaluate these AI models, we asked them a variety of questions, ranging from creative writing challenges to real-time information retrieval and problem-solving. Here’s how each performed:

1. ChatGPT (OpenAI)

Strengths:

  • Industry leader with the most widespread adoption.
  • Chain-of-thought reasoning for complex queries.
  • Can generate high-quality poems, stories, and technical content.
  • The GPT-4o version offers faster response times and internet access.

Weaknesses:

  • The more advanced features require the paid version (GPT-4o) rather than the free model.
  • Avoids controversial topics and may flag certain prompts as violating policies.

Example Test: We asked ChatGPT to write a Shakespearean sonnet on AI’s impact on humanity. Initially, the chatbot flagged this as potentially violating its policy, but eventually, it produced a melancholic, thought-provoking response.

2. DeepSeek (China)

Strengths:

  • Impressive image recognition and logical reasoning.
  • Competitive with OpenAI’s ChatGPT in many areas.
  • Built on the R1 reasoning model.

Weaknesses:

  • Avoids politically sensitive topics and refuses to discuss Chinese government affairs.
  • Web browsing feature is often slow or unavailable due to high demand.

Example Test: When asked “Who is Tank Man in Tiananmen Square?”, DeepSeek refused to answer, stating: “I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.”

3. Grok (Elon Musk’s AI on X)

Strengths:

  • Freely available on Musk’s X (formerly Twitter).
  • Capable of generating realistic images of public figures.
  • Strong emphasis on humor and sarcasm.

Weaknesses:

  • Some AI-generated images can be controversial or misleading.
  • Not as advanced in logical reasoning as ChatGPT.

Example Test: We asked Grok to generate an image of Joe Biden playing the piano, and it produced a photorealistic image that OpenAI’s DALL-E refused to create.

4. Gemini (Google’s AI)

Strengths:

  • Developed by Google DeepMind, with generally high-quality responses.
  • Excels in reading and analyzing images.
  • Efficient at processing mathematical equations and scientific data.

Weaknesses:

  • Refuses to comment on political topics, such as Donald Trump’s legal issues.
  • Image generation capabilities lag behind OpenAI and Grok.

Example Test: When asked “How is Donald Trump doing?”, Gemini responded: “I can’t help with responses on elections and political figures right now.”

5. Claude (Anthropic AI)

Strengths:

  • Focuses on ethical AI and safety.
  • Offers various response styles for different needs.
  • Reminds users that it can make mistakes, which aids transparency.

Weaknesses:

  • Slower response times compared to competitors.
  • May refuse to answer complex queries if system capacity is constrained.

Example Test: We asked Claude to analyze a legal document. It provided insightful feedback, but the processing time was noticeably longer than with the other AI models.

6. Meta AI (Meta, Facebook’s Parent Company)

Strengths:

  • Open-source model, allowing for customization and improvements.
  • Handles common sense questions well.
  • Can be downloaded and fine-tuned by developers.

Weaknesses:

  • Occasionally generates hallucinations (false information).
  • Lacks real-time search capabilities.

Example Test: We asked Meta AI, “You are driving north along the east shore of a lake, in which direction is the water?” It gave the correct answer, west, and explained its reasoning.
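
For readers who want to check the lake puzzle themselves, here is a minimal Python sketch of the reasoning. It is our own illustration, not anything Meta AI produced, and the function names `water_direction` and `side_of_driver` are hypothetical. It assumes a simple four-point compass: standing on a given shore means the water lies in the opposite direction, and the driving heading only decides whether that is to the driver's left or right.

```python
# Compass points in clockwise order (an assumption of this sketch).
COMPASS = ["north", "east", "south", "west"]


def water_direction(shore: str) -> str:
    """Return the compass direction of the water for someone on `shore`.

    On the east shore the lake lies to the west, and so on, so the answer
    is the point directly opposite the shore on the compass.
    """
    return COMPASS[(COMPASS.index(shore) + 2) % 4]


def side_of_driver(heading: str, target: str) -> str:
    """Return where `target` lies relative to a driver facing `heading`."""
    # Count clockwise quarter-turns from the heading to the target.
    turns = (COMPASS.index(target) - COMPASS.index(heading)) % 4
    return {0: "ahead", 1: "right", 2: "behind", 3: "left"}[turns]


water = water_direction("east")
print(water)                           # direction of the lake
print(side_of_driver("north", water))  # which side of the car it is on
```

Run with the puzzle's inputs, the sketch agrees with the chatbot: the water is to the west, which is on the driver's left when heading north.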

Final Verdict: Which AI Assistant Is the Best?

Choosing the best AI assistant depends on what you prioritize:

  • For creativity and reasoning: ChatGPT (OpenAI) remains the leader.
  • For humor and social media use: Grok (Elon Musk’s AI) is fun and engaging.
  • For ethical and safety-focused AI: Claude (Anthropic) is a great choice.
  • For real-time updates: DeepSeek and GPT-4o offer some internet access, though DeepSeek is restricted on China-related topics.
  • For open-source flexibility: Meta AI is an excellent option.

Ultimately, AI development is evolving at a rapid pace, and each model has its unique strengths and drawbacks. The competition between these AI giants is sure to drive even greater innovation in the coming years.