Every AI Model Explained in 20 Minutes

Source: YouTube, Matthew Berman, published 2026-03-15
Link: https://www.youtube.com/watch?v=I0me2uEbfuE

Summary

An introductory survey of the AI model landscape as of early 2026. Covers frontier text models, open-source models, image generation, video generation, world models, coding agents, and audio. Low density for an expert audience but valuable as the seed source for model entity pages. Matthew Berman’s opinionated take: each major lab has one thing it does better than anyone else.

Key Opinions and Claims

Per-lab specialties:

  • ChatGPT (OpenAI): Best for ease of use
  • Claude (Anthropic): Best for work tasks and coding
  • Gemini (Google): Best for search and deep research
  • Grok (xAI): Best for Twitter/X research; not at feature parity otherwise

Gemini’s unique capability: Frame-by-frame video ingestion — upload a video and ask questions about any specific moment. No other frontier model does this as of early 2026.

Open-source take: Not as good as frontier models, but good enough for 95% of use cases. Worth trying once you’re comfortable with AI.

Open-source model rankings (early 2026): DeepSeek and Qwen are stronger than Llama. Chinese AI labs have outpaced Meta’s Llama.

Coding agents: Cursor is Matthew’s personal favorite. Claude Code, Codex, Devin, and Factory are all strong options. In his view, software development is the segment of the economy most influenced by AI so far.

Model Categories Covered

Category | Notable Examples
Frontier text | ChatGPT, Claude, Gemini, Grok
Open-source text | Llama, DeepSeek, Qwen, Gemma, MiniMax
Image generation | Midjourney, DALL-E/ChatGPT image, Stable Diffusion, Flux, Ideogram
Video generation | Sora 2, Veo 3, Runway Gen-4, Kling
World models | Genie 2, Marble, Tesla FSD, Nvidia Cosmos
Coding agents | Cursor, Claude Code, Codex, Devin, Factory
Audio/voice | ElevenLabs, OpenAI Voice Mode, music gen tools

Pages Created or Updated

See Also