GPT-5’s Performance Puzzle: New Benchmarks Flag Regressions and Enterprise Fails | Open Source Agents Rise; OpenAI Accelerates Life Sciences
Key Takeaways
Independent evaluations indicate that GPT-5 regresses on healthcare-specific tasks compared with its predecessor, GPT-4.
A new Salesforce benchmark reports that GPT-5 fails more than half of real-world enterprise orchestration tasks, raising questions about its practical utility in complex scenarios.
The open-source community gains significant ground with OpenCUA, whose computer-use agents are now reported to rival top proprietary models.
OpenAI is using a specialized model, GPT-4b micro, to accelerate protein engineering for stem cell therapy and longevity research.
Japanese digital entertainment leader…