Models

ChatGPT, Claude, Gemini, and language model developments

14 articles

Karpathy says LLM memory features "trying too hard"

In a pair of posts on X on March 25, Andrej Karpathy noted that the memory feature has an issue: models can treat stale, one-off queries as deep ongoing interests.

1d ago3 min

Research

Moonshot AI proposes new method for how LLM layers share information, claims 1.25x compute advantage

Moonshot AI's Kimi Team has published a technical report proposing a change to a fundamental component of modern AI models called residual connections.

March 16, 20265 min

Models

How ChatGPT and AlphaFold helped an Australian engineer design a personalized cancer vaccine for his dog

A Sydney tech entrepreneur with no background in biology used ChatGPT, Google DeepMind's AlphaFold, and his own machine learning expertise to design a custom mRNA cancer vaccine for his rescue dog.

March 16, 20266 min

Models

Claude's 1M context window is now generally available with no long-context pricing premium

Anthropic has moved the 1 million token context window for Claude Opus 4.6 and Sonnet 4.6 from beta to general availability, and eliminated the long-context pricing premium entirely.

March 14, 20265 min

Models

GPT-5.4 arrives with native computer use, 1M context, and a new tool search capability

OpenAI released GPT-5.4 today, billing it as its most capable model to date for professional work, coding, and automated tasks.

March 5, 20265 min

Models

Alibaba launches Qwen 3.5 small model series with sub-1B edge options

Alibaba's Qwen team has released four new small language models — Qwen3.5-0.8B, 2B, 4B, and 9B — alongside their base model counterparts, extending the Qwen3.5 architecture to compact, resource-efficient deployments.

March 2, 20264 min

Models

Gemini 3.1 Pro claims top-tier reasoning gains

Google has released Gemini 3.1 Pro, a new version of its Pro model line that it says upgrades the core reasoning capabilities behind recent Gemini 3 advances.

February 19, 20264 min

Products

Anthropic releases Claude Sonnet 4.6, bringing near-flagship performance to its mid-tier model

Anthropic released Claude Sonnet 4.6, upgrading its mid-tier AI model with improved coding, computer use, and long-context reasoning capabilities.

February 18, 20269 min

Research

DeepMind reports research-level results from Gemini Deep Think

Gemini Deep Think is being used to assist with research-level problems in mathematics, computer science, physics, and economics.

February 11, 20264 min

Products

OpenAI upgrades ChatGPT deep research with GPT-5.2 and new controls

OpenAI has rolled out a set of upgrades to Deep Research in ChatGPT, including a new backend model a...

February 10, 20263 min

Research

Study suggests LLM leaderboards may be more fragile than they appear

A new study from MIT researchers argues that some popular platforms used to rank large language models can be far mor...

February 10, 20264 min

Models

"Fast mode" for Claude Opus 4.6, 2.5x speed at 6x the price

A new "fast mode" for Claude Opus delivers up to 2.5 times higher output token generation speed at a significant premium: six times the standard.

February 8, 20266 min

Products

OpenAI drops GPT-5.3-Codex right after Anthropic's Opus 4.6

OpenAI on Feb. 5, 2026, released GPT-5.3-Codex<sup referenceindex="1" refer...

February 6, 202613 min

Products

Anthropic releases Claude Opus 4.6 with 1M context window

Anthropic announced Claude Opus 4.6, the latest...

February 5, 202613 min