MIT Unveils Self-Evolving AI Models | Salesforce Bets Big on Agents, Digital Twins Threaten Surveys

Key Takeaways
- Researchers at MIT have open-sourced an updated SEAL technique, enabling large language models (LLMs) to autonomously generate and apply their own fine-tuning strategies, ushering in an era of self-improving AI.
- Salesforce launched Agentforce 360, a major strategic pivot betting that AI agents will handle up to 40% of enterprise work across its core services, leveraging Slack as the primary conversational interface.
- A new research paper details a “semantic similarity rating” (SSR) method for LLMs to simulate human consumer behavior with 90% accuracy, creating “digital twin” consumers and potentially disrupting the traditional survey industry.
Main Developments
The artificial intelligence landscape is witnessing fundamental shifts this week, with breakthroughs spanning from how AI models learn and evolve to their impact on enterprise workflows and market research. At the forefront, MIT’s Improbable AI Lab has unveiled a significantly expanded version of its SEAL (Self-Adapting LLMs) technique, now open-sourced and gaining traction among AI power users. SEAL empowers LLMs to improve themselves by autonomously generating synthetic data and fine-tuning strategies, moving beyond the “frozen-weights era” towards continuous self-learning. This dual-loop system, combining inner supervised fine-tuning with outer reinforcement learning, has shown impressive gains, boosting question-answering accuracy by over 13% and few-shot learning success rates by more than 50% compared to models without this self-adaptation. This technical leap suggests a future where AI agents can learn, adapt, and retain knowledge more effectively, addressing critical limitations like catastrophic forgetting that plague static models.
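The dual-loop idea can be sketched as a toy in a few lines of Python. Everything below is illustrative, not MIT's actual SEAL implementation: a scalar "skill" stands in for model weights, a "self-edit" is a candidate fine-tuning recipe the model proposes for itself, and the function names and numbers are invented. The inner step applies one self-generated edit (supervised fine-tuning); the outer step scores candidate edits by a downstream reward and keeps only improvements (the reinforcement signal).

```python
import random

random.seed(0)  # deterministic toy run

def inner_finetune(skill, edit):
    """Inner loop: apply one self-generated edit (supervised fine-tune)."""
    return skill + edit["gain"]

def evaluate(skill):
    """Reward signal: downstream task accuracy, capped at 1.0."""
    return min(skill, 1.0)

def seal_outer_loop(skill, rounds=5, candidates=4):
    """Outer loop: sample self-edits, keep only those that raise the reward."""
    for _ in range(rounds):
        # The model "proposes" several candidate fine-tuning edits
        proposals = [{"gain": random.uniform(-0.05, 0.1)} for _ in range(candidates)]
        # Reinforcement step: pick the edit with the best resulting reward
        best = max(proposals, key=lambda e: evaluate(inner_finetune(skill, e)))
        new_skill = inner_finetune(skill, best)
        if evaluate(new_skill) > evaluate(skill):  # retain only improvements
            skill = new_skill
    return skill

final = seal_outer_loop(0.5)
```

Because bad edits are discarded rather than applied, the toy never regresses below its starting skill, mirroring the retain-what-helps behavior the technique aims for.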
This evolution in AI capability comes as enterprises grapple with the practical deployment of AI. Salesforce, responding to what it calls an industry-wide “pilot purgatory” where 95% of enterprise AI projects fail to reach production, is making its most aggressive bet yet on AI agents. At its Dreamforce conference, the company launched Agentforce 360, a sweeping reimagination of its entire product portfolio designed to create “agentic enterprises.” Salesforce co-founder Parker Harris envisions AI agents handling up to 40% of work across sales, service, marketing, and operations, working collaboratively with humans. Crucially, Salesforce is elevating Slack as the primary interface for its platform, embedding agents directly into conversational channels and offering specialized agents for sales, IT, HR, and analytics. This strategy aims to overcome the “prompt doom loop” by deeply integrating AI with enterprise workflows, data, and governance systems, moving past disconnected tools to a unified, agent-driven experience. Early results from customers like Reddit and OpenTable, showing dramatically faster resolution times and higher case-deflection rates, underscore the potential of this agentic approach.
The growing sophistication of AI is also poised to disrupt established industries, none more so than market research. A new research paper outlines a breakthrough method allowing LLMs to simulate human consumer behavior with startling accuracy. This technique, called semantic similarity rating (SSR), sidesteps the traditional flaw of LLMs producing unrealistic numerical ratings. Instead, it prompts models for rich, textual opinions, which are then converted into numerical vectors and compared against predefined reference statements. Tested against a real-world dataset of 57 product surveys and 9,300 human responses, the SSR method attained roughly 90% of human test-retest reliability, with AI-generated rating distributions statistically indistinguishable from human panels. This development, which arrives as human survey panels face integrity threats from chatbot-generated responses, promises to create armies of “digital twin” consumers, offering scalable, high-fidelity qualitative and quantitative feedback at a fraction of the cost and time of traditional methods.
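The SSR pipeline described above can be illustrated with a toy sketch. The real method elicits free-text opinions from an LLM and embeds them with a sentence encoder; here a simple bag-of-words vector and cosine similarity stand in for both, and the reference anchor statements are invented for illustration, not taken from the paper.

```python
from collections import Counter
import math

# Illustrative anchor statements for a 1-5 purchase-intent scale (invented)
REFERENCES = {
    1: "i would definitely not buy this product",
    2: "i probably would not buy this product",
    3: "i might or might not buy this product",
    4: "i would probably buy this product",
    5: "i would definitely buy this product",
}

def embed(text):
    """Stand-in embedding: word-count vector (real SSR uses a sentence encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def ssr_rating(opinion):
    """Map a free-text opinion to the scale point with the closest reference."""
    vec = embed(opinion)
    sims = {score: cosine(vec, embed(ref)) for score, ref in REFERENCES.items()}
    return max(sims, key=sims.get)

# e.g. an LLM-elicited opinion, converted into a numeric survey rating
rating = ssr_rating("i would definitely buy this product it looks great")
```

The key design point the paper exploits is that LLMs produce far more realistic text than raw numbers, so the numeric rating is recovered indirectly from semantic proximity rather than asked for directly.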
Beneath these advancements, researchers are also making strides in optimizing the very mechanics of AI training. A study from the University of Illinois Urbana-Champaign suggests that retraining only narrow parts of an LLM, specifically the multi-layer perceptron’s up/gating projections, can prevent “catastrophic forgetting” and significantly reduce compute costs. This nuanced approach challenges the notion that forgetting is true memory loss, instead attributing it to bias drift in output distribution. These efficiency gains, alongside strategic moves like OpenAI’s partnership with Broadcom to produce its own AI chips, highlight a holistic effort to make AI more capable, cost-effective, and independent from existing hardware bottlenecks.
Analyst’s View
Today’s news signals a profound acceleration towards truly autonomous and integrated AI. The emergence of self-improving models like MIT’s SEAL is a foundational shift, pushing AI from static tools to evolving entities. This capability, combined with Salesforce’s aggressive pivot to enterprise-wide AI agents, illustrates a future where AI isn’t just assisting, but actively participating and orchestrating work. The “digital twin” consumer concept is equally transformative, highlighting how synthetic data generation is moving beyond niche applications to directly challenge and reshape multi-billion dollar industries. The core challenge for enterprises will now shift from whether to adopt AI to how quickly and effectively they can integrate these rapidly evolving, self-improving agentic systems while maintaining trust and ethical oversight. Expect intense competition in enterprise AI platforms, with success hinging on robust data integration, governance, and the ability to manage increasingly autonomous digital workforces.
Source Material
- This new AI technique creates ‘digital twin’ consumers, and it could kill the traditional survey industry (VentureBeat AI)
- Self-improving language models are becoming reality with MIT’s updated SEAL technique (VentureBeat AI)
- Salesforce bets on AI ‘agents’ to fix what it calls a $7 billion problem in enterprise software (VentureBeat AI)
- Researchers find that retraining only small parts of AI models can cut costs and prevent forgetting (VentureBeat AI)
- OpenAI partners with Broadcom to produce its own AI chips (The Verge AI)