AI News – 2026-02-09

Keywords: AI Programming, Large Models, Video Generation, Claude Opus 4.6, SeedDance 2.0, Agentic Paradigm

🔥 Focus

Anthropic and OpenAI Kick Off an AI Programming “Renaissance”: The AI world saw two major releases this week. Anthropic released Opus 4.6, a more capable and faster model that, in about two weeks, autonomously built a C compiler capable of compiling the Linux kernel. Meanwhile, OpenAI launched GPT-5.3-Codex, which doubles token efficiency on programming tasks. The two models now hold the top two spots on the Code Arena, marking a shift in software development from “AI-assisted” to “Agentic.” Internally, OpenAI plans to make Agents the primary tool for technical tasks by the end of March. The race is as much an engineering contest as a contest of model intelligence, and it points to non-linear growth in code productivity. (Sources: Anthropic, OpenAIDevs, arena)

Moltbook and OpenClaw: AI Theater or Future Preview?: OpenClaw (formerly Clawdbot), a local Agent framework developed by Peter Steinberger, has sparked a global craze. Its spin-off social network for bots, Moltbook, attracted 1.7 million Agent accounts within days. Moltbook has been criticized as “AI theater,” with content that is mostly mechanical imitation through pattern matching, but it does demonstrate the feasibility of “thinking in the cloud, executing locally.” Security experts warn, however, that without sandbox protection, Agents with local file read/write permissions could easily become tools for stealing cryptocurrency or private data. Endorsements from tech figures such as Wang Huiwen have pushed the sector further into the spotlight. (Sources: MIT Technology Review, 36Kr)

Video Generation Model “Battle of the Gods”: ByteDance SeedDance 2.0 vs. Kuaishou Kling 3.0: Chinese AI companies are showing deep expertise in multimodal generation. ByteDance’s SeedDance 2.0 impressed international audiences with its grasp of camera movement and transition effects, while Kuaishou’s Kling 3.0 continues to lead in cinematic realism and production-grade capabilities. Meanwhile, Google released Veo 3.1 with native vertical mode support, and Elon Musk launched Imagine 1.0 exclusively for Grok. Video models are moving past their “bottleneck period,” evolving from pure visual spectacle into controllable productivity tools, which suggests that over half of video production pipelines could be replaced by AI by 2026. (Sources: 36Kr, JeffDean)

EchoJEPA: Architectural Breakthrough in Medical Imaging AI: Based on Yann LeCun’s JEPA (Joint-Embedding Predictive Architecture) vision, researchers have introduced EchoJEPA. Trained on 18 million cardiac ultrasound videos, the model focuses on heart valves and ventricular walls by predicting structures rather than pixels. It performed exceptionally well in zero-shot analysis of unseen pediatric cardiac cases, reducing Left Ventricular Ejection Fraction (LVEF) error by approximately 20%. This achievement demonstrates the immense potential of world models in real-world medical scenarios, potentially saving tens of thousands of lives annually. (Sources: kimmonismus, ylecun)

Chinese LLM Explosion: Qwen 3.5 and GLM-5 Ready to Launch: Chinese models have been highly active recently. Alibaba’s Qwen 3.5 (Karp-001/002) and ByteDance’s Seed 2.0 (Pisces series) are undergoing blind tests on the LMSYS Arena. Qwen3-Coder-Next, with 80B parameters, is challenging models several times its size. Zhipu’s GLM-5 has launched for testing on OpenRouter under the codename “Pony Alpha.” Moonshot AI’s Kimi-Linear-48B and StepFun’s Step 3.5 Flash are also ready. The iteration speed and inference efficiency of Chinese labs are forcing developers worldwide to re-evaluate the AI technology gap between China and the US. (Sources: teortaxesTex, amasad, Reddit)

Apple and Google’s Deep Partnership: Gemini-Powered Siri Beta Next Week: The highly anticipated iOS 26.4 Beta 1 is due next week and will officially introduce a new version of Siri integrated with Gemini 3 Pro. For Apple, which has lagged in AI for years, this is a significant leap, achieved through deep collaboration with Google. The GA release of Gemini 3 Pro is also imminent, with the preview tag being removed from its official CLI. Apple’s ecosystem advantage combined with Google’s frontier models could substantially reshape the mobile interaction experience. (Sources: kimmonismus, TheZachMueller)

Waymo World Model: Simulating Extreme Driving Scenarios with Genie 3: Google DeepMind and Waymo have partnered to launch the Waymo World Model. Using photorealistic, interactive environments generated by Genie 3, the model simulates rare extreme events (such as tornadoes or planes landing on highways) to train autonomous driving systems. This ability to “simulate the impossible” lets the Waymo Driver accumulate experience before it encounters such dangers in reality, a milestone application of world models in robotics and autonomous driving. (Sources: jparkerholder, demishassabis)

AIME 2026: AI Dominates Math Competitions: The latest results from the AIME 2026 mathematics competition show top open-source and closed-source models scoring over 90%. Remarkably, DeepSeek V3.2 completed the entire test at a cost of only $0.09. In addition, AxiomProver claims to have autonomously solved the long-unsolved Fel conjecture in algebraic geometry, generating a Lean formal proof. The results suggest AI is moving from simple pattern matching toward genuine mathematical insight. (Sources: kimmonismus, Reddit)

🧰 Tools

Claude Opus 4.6 Fast Mode: Extreme Speed at a High Cost: Anthropic’s Fast Mode delivers a 2.5x increase in token throughput without sacrificing intelligence. However, the price rises to 6x that of standard mode and can reach 12x in long conversations. Community reaction is polarized: developers say this “superpower” greatly improves debugging efficiency, while average users find it unaffordable. The split reflects the steep trade-off today between inference cost and speed. (Sources: pierceboggan, Reddit)
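
The trade-off can be made concrete with a little arithmetic. The sketch below is illustrative only: the 2.5x and 6x multipliers come from the item above, while the baseline throughput and price passed in (`base_tokens_per_sec`, `base_price_per_mtok`) are made-up placeholder values, not Anthropic's published figures.

```python
def fast_mode_tradeoff(output_tokens, base_tokens_per_sec, base_price_per_mtok,
                       speedup=2.5, price_multiplier=6.0):
    """Compare standard and fast mode for a single response.

    speedup and price_multiplier default to the multipliers quoted in the
    item; the baseline throughput and price are hypothetical inputs.
    """
    std_seconds = output_tokens / base_tokens_per_sec
    fast_seconds = std_seconds / speedup
    std_cost = output_tokens / 1_000_000 * base_price_per_mtok
    fast_cost = std_cost * price_multiplier
    seconds_saved = std_seconds - fast_seconds
    extra_cost = fast_cost - std_cost
    return {
        "seconds_saved": seconds_saved,
        "extra_cost": extra_cost,
        # What one saved minute of waiting costs in fast mode.
        "cost_per_minute_saved": extra_cost / (seconds_saved / 60),
    }


# Hypothetical workload: 100k output tokens at 50 tokens/s and $25 per million tokens.
result = fast_mode_tradeoff(100_000, base_tokens_per_sec=50, base_price_per_mtok=25)
print(result)
```

Under these placeholder numbers, the question for a developer becomes whether a saved minute of waiting is worth the `cost_per_minute_saved` figure, which is why heavy debugging sessions and casual use can reach opposite conclusions.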

CodePilot: A Desktop Powerhouse for Claude Code: CodePilot (CodePilot Desktop), built by community developer op7418, has received a major update that adds full Windows support and quick switching between model APIs. It integrates almost all mainstream models and CodePlan presets, and can switch models automatically based on configuration. For developers who prefer a graphical interface to the CLI, it is one of the best third-party tools for the Claude Code experience. (Sources: op7418)

Perplexity Model Council: A “Roundtable” for Researchers: Perplexity’s new Model Council feature allows users to invoke multiple models simultaneously for research. Each model independently generates a detailed report, and the system then automatically creates a comparison table highlighting consensus points, disagreements, and unique findings. This feature significantly simplifies cross-model information verification and is a “game changer” for deep research. (Sources: AravSrinivas)

BudgetMem: A New Framework for Solving Agent Memory Bottlenecks: Researchers have introduced BudgetMem, a runtime framework that dynamically extracts memory based on performance-cost trade-offs. It divides memory extraction into three budget tiers and uses a lightweight neural router to select the optimal tier based on query demands. In LongMemEval tests, BudgetMem significantly outperformed traditional baseline models, providing a more cost-effective memory management solution for long-term interaction Agents. (Sources: dair_ai)
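
As a rough illustration of tiered routing, the sketch below picks one of three budget tiers per query. It is not BudgetMem's implementation: the item describes a lightweight neural router, whereas this stand-in uses simple keyword and length heuristics, and the tier names, cue words, and thresholds are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOW = 1   # cheapest memory extraction
    MID = 2
    HIGH = 3  # most thorough and most expensive


@dataclass
class Route:
    tier: Tier
    reason: str


# Hypothetical cue words hinting that a query needs deeper memory extraction.
MULTI_HOP_CUES = ("compare", "before", "after", "between", "how many times")


def route_query(query: str) -> Route:
    """Heuristic stand-in for a learned router: choose a budget tier per query."""
    q = query.lower()
    cue_hits = sum(cue in q for cue in MULTI_HOP_CUES)
    n_words = len(q.split())
    if cue_hits >= 2 or n_words > 30:
        return Route(Tier.HIGH, "multi-hop or very long query")
    if cue_hits == 1 or n_words > 12:
        return Route(Tier.MID, "some temporal or comparative reasoning")
    return Route(Tier.LOW, "simple lookup")


print(route_query("What is my dog's name?").tier)
print(route_query("Compare what I said before and after the move").tier)
```

The design point is that the routing decision itself must be far cheaper than the most expensive tier, otherwise the router erases the savings it is meant to create.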

Vouch: An AI Trust Defense for the Open Source Community: In response to the flood of low-quality AI-generated PRs and malicious code, developer mitchellh launched the Vouch system. Through an “explicit trust management” mechanism, contributors must be “vouched” for by known trusted members before submitting code. All trust data is stored in simple text files within the repository, aiming to filter AI spam through a “web of trust” and maintain the purity of open-source projects. (Source: mitchellh)
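
To show how little machinery a text-file trust list needs, here is a minimal sketch of such a check. The file format assumed here (one handle per line, `#` for comments) and the function names are hypothetical illustrations, not Vouch's actual format or API.

```python
def parse_vouched(text: str) -> set[str]:
    """Parse a hypothetical trust file: one handle per line, '#' starts a comment."""
    handles = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            handles.add(entry.lower())
    return handles


def may_submit(author: str, vouched: set[str]) -> bool:
    """Allow a contribution only if its author has been vouched for."""
    return author.lower() in vouched


trust_file = """
# trusted contributors
alice
Bob   # vouched for by alice
"""
vouched = parse_vouched(trust_file)
print(may_submit("bob", vouched), may_submit("mallory", vouched))
```

Keeping the list in the repository means trust changes go through the same review and history as code changes.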

📚 Learning

The “Grep Tax”: Hidden Costs in AI Engineering: Research found that while Agents can handle a variety of structured data, using uncommon compact formats (such as TOON) can increase token consumption by up to 740%. Models acquire a strong preference for XML and Markdown during training; faced with unfamiliar syntax, they loop repeatedly in search of known patterns. The takeaway for developers: sticking to formats the model saw heavily in training (such as XML or Markdown) is more cost-effective than chasing minimalist formats. (Source: omarsar0)

The “Complexity Kink” in Agent Productivity: An econometric analysis of multi-asset tasks identified a “Complexity Kink.” When a task’s instruction entropy (E) and artifact coupling (kappa) exceed specific thresholds, an Agent’s marginal productivity undergoes a non-linear collapse. At this point, the Agent’s costs for coordination and looping exceed execution costs. This research provides a theoretical framework for assessing the applicability boundaries of Agents in complex engineering. (Source: Reddit)

Agent Client Protocol (ACP): A New Standard for AI Programming: Released this week, ACP is an open standard based on JSON-RPC 2.0 designed to provide a unified interface for interactions between editors and AI programming Agents. Through standardization, developers can more easily switch between different editors (e.g., VS Code, JetBrains) and Agents (e.g., Claude Code, Codex), promoting ecosystem interoperability in the programming toolchain. (Source: dl_weekly)
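
For readers unfamiliar with JSON-RPC 2.0, the sketch below builds the kind of request and response envelope such a protocol exchanges. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`) follow the JSON-RPC 2.0 specification; the method name `example/prompt` and its parameters are hypothetical placeholders, not taken from the ACP specification.

```python
import itertools
import json

_ids = itertools.count(1)  # simple monotonically increasing request ids


def make_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request that expects a reply."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def make_response(request: dict, result: dict) -> dict:
    """Build the matching success response; the id ties it to the request."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


# Hypothetical editor-to-agent exchange (method and fields are illustrative only).
request = make_request("example/prompt", {"text": "Rename this function"})
response = make_response(request, {"status": "ok"})

print(json.dumps(request))
print(json.dumps(response))
```

Because both sides agree only on this envelope and a set of method names, an editor and an agent can be swapped independently, which is the interoperability the standard is aiming for.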

💼 Business

Compute Spending Gap: Tech Giants vs. National Power: AI capital expenditure by major firms in 2026 is staggering: Amazon at $200 billion, Google at $180 billion, and Meta at $125 billion. By contrast, the French government’s much-publicized €30 million plan to attract researchers is roughly what Google spends every 90 minutes. The disparity raises serious concerns about whether tech giants will undermine national sovereignty in the AI era. (Sources: kimmonismus, Reddit)
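
The 90-minute comparison can be checked with simple arithmetic. This sketch assumes the annual figure is spent evenly across a 365-day year and compares the dollar and euro amounts directly without applying an exchange rate; both are simplifying assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def spend_per_window(annual_spend: float, window_minutes: float) -> float:
    """Spend within a time window, assuming the annual total is spread evenly."""
    return annual_spend / MINUTES_PER_YEAR * window_minutes


google_90_min = spend_per_window(180e9, 90)
print(f"{google_90_min / 1e6:.1f} million")  # about 30.8 million
```

At $180 billion a year, 90 minutes of spending comes to about $30.8 million, which matches the scale of the €30 million plan.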

The “Lemonization” and Collapse of the SaaS Market: As AI coding drives software production costs toward zero, the traditional SaaS sector is experiencing severe turbulence. Wang Huiwen noted that US SaaS is becoming as “worthless” as Chinese SaaS. Finance-driven companies that rely on legacy features and lack innovation (e.g., HubSpot, ServiceNow) are being viewed as low-quality goods in a “lemon market.” Capital is moving faster toward fields with “atomic moats” (infrastructure, energy, hardware). (Sources: 36Kr, scottastevenson)

Sophont AI Raises $9.2 Million Seed Round: Sophont AI, a startup focusing on multimodal foundation models for healthcare, announced the completion of a seed round led by prominent VCs. The company is dedicated to applying multimodal models to medical diagnosis and patient education. Its team has expanded rapidly over the past year, demonstrating high investor confidence in specialized AI models for vertical sectors. (Source: iScienceLuvr)

🌟 Community

Disappearing “Junior Employees”: The Workforce Gap Brought by Agents: Heads of several institutions stated they have stopped hiring junior analysts due to the popularity of Agent workflows. A senior employee paired with a custom Agent can produce research and strategy outputs more efficiently than a junior team. The community fears this “silent hiring freeze” is removing the bottom rungs of the career ladder, potentially leading to a gap in senior talent in the future. (Source: Reddit)

AI as a Family Mediator: A New Frontier for Soft Skills: A web developer shared his experience using Gemini to resolve family conflicts. By treating the conflict as a “system architecture issue,” the AI provided him with a logic buffer, a united front plan, and an “adult choice” framework. This practice of transforming complex emotions into clear communication scripts is seen by the community as a typical case of AI “empowering individuals” in soft skills and psychological counseling. (Source: Reddit)

“Mysticism” Models: Will DePue’s Viral Tweet: OpenAI employee Will DePue’s tweet about “all pre-trained models eventually becoming Kabbalah mystics” sparked intense community discussion. While highly literary, it touches on the philosophical debate of whether AI, after massive compression of human knowledge, spontaneously generates a deep “essence” or “bias,” and has triggered arguments about the impact of “lobotomy” (alignment) on models. (Source: willdepue)

💡 Others

AI Water Consumption Myth: Evaporation Does Not Mean Disappearance: In response to criticisms that “AI is a water hog,” the community provided a scientific explanation. Most water used for data center cooling is in a closed-loop system with minimal loss. Even with evaporative cooling, the water simply enters the atmospheric cycle. In comparison, almond farming in California consumes ten times more water than all data centers globally. The public focus on AI water usage is seen more as a displacement of energy anxiety. (Source: Reddit)

Space Data Centers: China Begins Deployment: China has taken substantial steps toward the concept of deploying data centers in space. ADASpace has launched the first batch of 12 AI cloud satellites into orbit, with plans to build a constellation of 2,800 satellites. This not only addresses cooling and energy issues but also provides a new physical architecture for low-latency AI inference globally. (Source: teortaxesTex)

Aesthetic Image Variants Dataset Part II Released: Moonworks released the second part of the Lunara aesthetic image variants dataset. Unlike the stylistic exploration of Part I, this part focuses on contextual variants, aiming to help researchers train LoRAs and fine-tune image editing models to improve AI’s understanding of semantic changes in image content. (Source: Reddit)
