OpenAI’s GPT-5.1-Codex-Max Redefines Coding Standards | Long-Form AI Video Breaks New Ground & The Agentic Web Builds Trust
Key Takeaways
- OpenAI launched GPT-5.1-Codex-Max, a new agentic coding model that edges out Google’s Gemini 3 Pro on key coding benchmarks such as SWE-Bench Verified and Terminal-Bench 2.0, and that completed a task lasting more than 24 hours in internal testing.
- CraftStory, a startup founded by OpenCV creators, emerged from stealth with Model 2.0, capable of generating coherent, human-centric AI videos up to five minutes long, far beyond the 25-second limit of OpenAI’s Sora 2.
- Fetch AI unveiled a comprehensive suite of products—ASI:One, Fetch Business, and Agentverse—to create foundational infrastructure for the “Agentic Web,” focusing on trusted, interoperable AI agent coordination.
Main Developments
The artificial intelligence landscape continues its rapid evolution, with this week marking significant strides in agentic coding, long-form video generation, and the foundational infrastructure for the “Agentic Web.” OpenAI rolled out GPT-5.1-Codex-Max, an agentic coding model that raises the standard for AI-assisted software engineering. Now the default in OpenAI’s Codex developer environment, the model offers stronger long-horizon reasoning and efficiency, and in OpenAI’s internal testing it completed multi-step tasks lasting over 24 hours. On key coding benchmarks such as SWE-Bench Verified and Terminal-Bench 2.0, it edged out Google’s recently released Gemini 3 Pro, positioning OpenAI at the forefront of the fiercely competitive AI coding arena. Crucially, Codex-Max employs a “compaction” mechanism to manage extended context windows, which lets it keep working across millions of tokens without performance degradation and brings significant cost and latency benefits.
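To make the idea of compaction concrete, here is a minimal sketch of how a context window could be kept under a token budget by folding older history into a short summary. This is not OpenAI’s implementation, whose details are not described in the coverage. The names `CompactingContext`, `naive_summary`, and `count_tokens` are hypothetical, the word-count tokenizer is a stand-in, and a real system would presumably summarize with a model rather than by truncation.

```python
from dataclasses import dataclass, field
from typing import Callable, List

SUMMARY_TAG = "[summary]"  # hypothetical marker for compacted history


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def naive_summary(messages: List[str], max_words: int = 8) -> str:
    """Toy summarizer (an assumption): keep the first max_words words of the history."""
    words = [w for m in messages for w in m.split() if w != SUMMARY_TAG]
    return " ".join([SUMMARY_TAG] + words[:max_words])


@dataclass
class CompactingContext:
    budget: int                       # maximum tokens allowed before compaction
    keep_recent: int = 2              # newest messages are kept verbatim
    summarize: Callable[[List[str]], str] = naive_summary
    messages: List[str] = field(default_factory=list)

    def total_tokens(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        """Append a message, then compact older history if the budget is exceeded."""
        self.messages.append(message)
        split = len(self.messages) - self.keep_recent
        if self.total_tokens() > self.budget and split > 0:
            old, recent = self.messages[:split], self.messages[split:]
            # Replace everything except the newest messages with one summary entry.
            self.messages = [self.summarize(old)] + recent
```

The design choice illustrated here is that recent turns stay intact while older ones are traded for a lossy summary, which is what allows a session to run far longer than its raw window would permit.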
Challenging the prevailing limitations in AI video, a new startup named CraftStory, founded by the creators of the ubiquitous OpenCV library, has burst onto the scene. CraftStory’s Model 2.0 makes a dramatic leap beyond current capabilities, generating realistic, human-centric videos up to five minutes long—a significant improvement over OpenAI’s Sora 2 (25 seconds) and Google’s Veo (typically 10 seconds or less). This breakthrough is attributed to a novel parallelized diffusion architecture and training on proprietary, high-quality footage. With an initial $2 million in funding, CraftStory is targeting the enterprise market, aiming to solve the critical need for longer, coherent video content for training, marketing, and customer education. While currently a video-to-video system, the company has ambitious plans for text-to-video generation and moving-camera scenarios, signaling a specialized focus in a market dominated by generalist models from tech giants.
Meanwhile, the vision for an “Agentic Web,” where AI agents from different organizations can securely collaborate to execute complex tasks, gained substantial momentum with Fetch AI’s launch of three interconnected products: ASI:One, Fetch Business, and Agentverse. ASI:One acts as a personal AI orchestration platform, designed to coordinate multiple verified agents to complete multi-step tasks like trip planning, leveraging user-level preferences stored in private knowledge graphs. Fetch Business provides a crucial layer of trust, allowing organizations to verify their identity and claim official Brand Agent handles, akin to domain registration for websites. This system aims to protect consumers from fraudulent agents and foster confidence in automated interactions. Complementing these, Agentverse serves as an open, cloud-agnostic directory already hosting over two million agents, addressing the critical problem of agent discoverability. Fetch AI, led by Humayun Sheikh, an early investor in DeepMind, aims to build the foundational infrastructure for this new era of non-human web interaction, integrating payment pathways and secure data exchange to enable agents to move beyond recommendations to full transactional capabilities.
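The combination of a searchable directory and verified handles can be pictured with a small sketch. This is purely illustrative and does not reflect Fetch AI’s actual APIs or data model. `AgentRecord`, `AgentDirectory`, the handle format, and the rule that an unverified claim cannot displace a verified handle are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AgentRecord:
    handle: str       # hypothetical handle format, e.g. "@example-airline"
    capability: str   # what the agent can do, e.g. "flights"
    verified: bool    # True once the owning organization has proven its identity


class AgentDirectory:
    """Toy directory: register agents, then discover them by capability."""

    def __init__(self) -> None:
        self._by_handle: Dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        existing = self._by_handle.get(record.handle)
        # Assumed rule: an unverified claim may not overwrite a verified handle.
        if existing is not None and existing.verified and not record.verified:
            raise ValueError(f"handle {record.handle} is held by a verified agent")
        self._by_handle[record.handle] = record

    def find(self, capability: str, verified_only: bool = True) -> List[AgentRecord]:
        """Return agents offering a capability, restricted to verified ones by default."""
        return [
            r for r in self._by_handle.values()
            if r.capability == capability and (r.verified or not verified_only)
        ]
```

An orchestrator in this picture would call `find("flights")` and hand the task only to agents that passed verification, which is the consumer-protection role the article attributes to the trust layer.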
Finally, researchers at Meta, the University of Chicago, and UC Berkeley introduced DreamGym, a framework designed to cut the high cost and complexity of training large language model (LLM) agents with reinforcement learning (RL). DreamGym simulates RL environments and dynamically adjusts task difficulty so that agents keep learning efficiently. This approach delivers performance comparable to traditional RL algorithms using only synthetic interactions, and in “sim-to-real” training it improves results by over 40% while using minimal real-world data. This innovation could make advanced RL agent training feasible for enterprises previously deterred by the infrastructure burden and risks of live environments.
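Dynamic difficulty adjustment of this kind can be sketched with a simple success-rate rule. This is not DreamGym’s actual algorithm, which the coverage does not spell out. The class name `DifficultyController`, the thresholds, and the rolling window are illustrative assumptions.

```python
from collections import deque
from typing import Deque


class DifficultyController:
    """Raise difficulty when the agent succeeds often; lower it when it struggles."""

    def __init__(self, levels: int = 5, window: int = 10,
                 raise_above: float = 0.8, lower_below: float = 0.3) -> None:
        self.levels = levels              # number of difficulty levels (0 is easiest)
        self.level = 0                    # current difficulty level
        self.raise_above = raise_above    # success rate that triggers a harder level
        self.lower_below = lower_below    # success rate that triggers an easier level
        self.results: Deque[bool] = deque(maxlen=window)

    def record(self, success: bool) -> int:
        """Record one episode outcome and return the level for the next task."""
        self.results.append(success)
        if len(self.results) == self.results.maxlen:
            rate = sum(self.results) / len(self.results)
            if rate > self.raise_above and self.level < self.levels - 1:
                self.level += 1
                self.results.clear()      # start a fresh window at the new level
            elif rate < self.lower_below and self.level > 0:
                self.level -= 1
                self.results.clear()
        return self.level
```

The intent is to keep tasks in the range where the agent succeeds some of the time, since tasks that are always solved or never solved provide little learning signal.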
Analyst’s View
This week’s announcements paint a clear picture: the AI industry is rapidly maturing beyond mere generative capability, now focusing on practical application, enhanced autonomy, and robust foundational infrastructure. The head-to-head battle between OpenAI’s Codex-Max and Google’s Gemini 3 Pro in agentic coding signals an intense arms race for developer mindshare, with real-world task completion becoming the ultimate differentiator. CraftStory’s emergence highlights a crucial market dynamic: while generalist models from giants make headlines, specialized AI, especially in B2B contexts, can achieve dramatic breakthroughs in specific dimensions like video duration. The ambitious vision laid out by Fetch AI underscores that the shift to an “Agentic Web” requires not just powerful models but also layers of trust, verification, and interoperability, elements that will define the success or failure of truly autonomous AI systems. Expect to see continued specialization and a heightened focus on secure, reliable, and cost-effective agentic workflows as enterprises seek to operationalize AI.
Source Material
- OpenCV founders launch AI video startup to take on OpenAI and Google (VentureBeat AI)
- OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally (VentureBeat AI)
- Meta’s DreamGym framework trains AI agents in a simulated world to cut reinforcement learning costs (VentureBeat AI)
- The Google Search of AI agents? Fetch launches ASI:One and Business tier for new era of non-human web (VentureBeat AI)
- Building more with GPT-5.1-Codex-Max (Hacker News (AI Search))