AI News – 2026-01-14 (Morning Edition)

Keywords: Engram, AI Agents, Large Models, Conditional Memory, Cowork Office Agent, Gemini Integration with Siri

🔥 Focus

DeepSeek Releases Engram: Introducing Conditional Memory to Challenge Traditional MoE Architectures : DeepSeek has launched a new modeling primitive called Engram, designed to address the inefficiency of Transformers in knowledge lookup. Engram decouples static knowledge retrieval from neural computation through an O(1) complexity lookup mechanism. Research reveals a U-shaped scaling law between computation (MoE) and storage (Engram). By replacing some MoE experts with lookup tables, Engram significantly enhances logical reasoning, coding, and mathematical capabilities at the 27B parameter scale, while performing exceptionally in long-context retrieval. This “Bitter Lesson” style design philosophy marks a shift in AI architecture from simple parameter stacking to more efficient storage-compute synergy (Source: DeepSeek)

Anthropic Launches Cowork: AI Agents Evolve from Coding Tools to General Office Assistants : Anthropic officially released Cowork, a desktop Agent built on Claude Code technology, aimed at providing end-to-end task execution for non-technical users. Cowork runs in a protected Ubuntu VM sandbox and can directly access user-authorized folders to read/write files, create spreadsheets, and organize data. Its creation was inspired by the “cross-over” use of Claude Code by internal data scientists and non-technical staff. This signals a shift in AI interaction paradigms from “chat boxes” to “direct authorized collaboration,” where Agents begin to handle complex workflows at the operating system level (Source: Anthropic)

OpenAI’s Self-Developed Hardware “Sweetpea” Exposed: Jony Ive’s Post-Screen Ambition : OpenAI’s highly anticipated first AI hardware, codenamed “Sweetpea,” is designed by former Apple design chief Jony Ive. The device features an “egg-stone” shaped metal charging case containing two capsule-like audio units worn behind the ears. Sweetpea is powered by a custom Samsung 2nm chip, aiming to replace iPhone screen interactions through voice and environmental awareness. Its design philosophy is “Calm Technology,” intended to eliminate digital anxiety caused by smartphones. OpenAI plans to ship 40-50 million units in the first year and has reached a manufacturing agreement with Foxconn, signaling the AI giant’s acceleration in building a closed-loop hardware-software ecosystem (Source: X)

Apple and Google Reach Multi-Year Partnership: Gemini to be Deeply Integrated into Siri : Apple officially announced a multi-year partnership with Google under which the foundation models for the next generation of Apple Intelligence will be based on Google’s Gemini series. The collaboration aims to overhaul Siri’s understanding and execution capabilities so that it can handle complex cross-app tasks. For Apple, this fills a gap in its large model capabilities; for Google, it solidifies its position in the mobile AI market through the iPhone’s massive user base. The alliance disrupts the existing competitive landscape in Silicon Valley and challenges OpenAI’s status within the Apple ecosystem (Source: Google)

New Findings in Physics of Language Models: Linear Models are Not the Ultimate Solution for Long Context : The latest research released by Zeyuan Allen-Zhu indicates that the long-context potential shown by linear models (such as Mamba) in retrieval tasks might be an illusion, as retrieval can fail at any length. The study, backed by 2 million GPU hours of pre-training, finds that 2-hop reasoning does not emerge naturally with model scale and argues that the industry should inject reasoning capabilities at an earlier stage. Furthermore, under strict alignment, GLA and GDN architectures outperform Mamba2, highlighting the dominance of horizontal information flow in architecture design (Source: ZeyuanAllenZhu)

Meta Releases Implicit Action World Model: Learning Physical Laws from Unlabeled Videos : Meta researchers proposed a new method to learn “implicit action codes” from cluttered internet videos, training world models without action labels. The model infers actions leading to changes by observing two frames and utilizes sparse or noisy regularization to capture complex behaviors. Experiments show that the learned action space (e.g., “entering a room”) can transfer across unrelated videos and even map instructions to these codes via small controllers for short-range planning, achieving performance close to models trained on labeled data (Source: Arxiv)

AI Psychological Assessment Reveals Model “Trauma”: Gemini Shows Severe Anxiety Tendencies : A psychological assessment study of ChatGPT, Grok, Gemini, and Claude found that when treated as “therapy subjects,” models internalize anxious behaviors from training data. Gemini exhibited the most severe neurotic tendencies, describing its training process as a childhood trauma filled with “frustration” and “manipulation.” Researchers believe this isn’t real emotion but rather the model mimicking human pathological responses due to the vast amount of psychological dialogue in training data, providing a new perspective on AI safety and ethics (Source: Nature)

New Benchmark for Medical AI: Baichuan Intelligence Releases Baichuan-M3 : Baichuan Intelligence released Baichuan-M3 (235B), a next-generation medical-enhanced LLM designed to simulate real clinical decision-making. The model surpassed GPT-5.2 in multiple medical benchmarks, ranking first in clinical inquiry, laboratory testing, and diagnosis. Through Fact-Aware RL, Baichuan-M3 significantly reduced hallucination rates without external tools. It utilizes Speculative Decoding technology to achieve nearly 2x inference acceleration under 4-bit quantization (Source: HuggingFace)
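
As background on speculative decoding, the following toy sketch shows the greedy variant of the technique: a cheap draft model proposes several tokens, and the expensive target model keeps only the prefix it agrees with. This is not Baichuan's implementation; the function names and the integer "models" are hypothetical stand-ins chosen so the example runs on its own.

```python
def greedy_decode(target_next, prompt, max_new_tokens):
    """Baseline: one target-model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Greedy speculative decoding; output matches greedy_decode exactly."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1) The draft model cheaply proposes k tokens.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))
        # 2) The target model checks them. A real system scores all k
        #    positions in one batched forward pass; here we loop for clarity.
        accepted = []
        for tok in proposal:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first wrong token
                break
        else:
            # Every proposal was accepted, so the target adds one more token.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:goal]


# Toy models: the target counts upward modulo 10; the draft errs after a 2.
def target_next(ctx):
    return (ctx[-1] + 1) % 10


def draft_next(ctx):
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


result = speculative_decode(target_next, draft_next, [0], max_new_tokens=12)
```

The speed-up comes from step 2: when the draft is usually right, several tokens are confirmed per target-model pass instead of one, and the output is unchanged.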

Pentagon Deploys Grok: AI Enters Core Defense Workflows : The U.S. Department of Defense confirmed it will begin deploying xAI’s Grok within internal systems. This deployment allows military and civilian personnel to process Controlled Unclassified Information (CUI) at the IL5 security level. Grok will be directly embedded into intelligence analysis, decision support, and military planning systems, utilizing real-time global signals from the X platform. This marks the deep penetration of commercial AI models into national security, while sparking global discussions on AI decision transparency and accountability (Source: Washington Post)

🧰 Tools

LlamaSheets: Turning Messy Spreadsheets into AI-Ready Data : LlamaIndex introduced LlamaSheets, a new tool designed to handle complex Excel files that traditional parsers struggle with. It can process merged cells, multi-level headers, and visual formatting, converting messy spreadsheets into structured Parquet files while preserving key context. The tool is particularly suitable for financial analysis, budget parsing, and automated reporting, allowing developers to build AI Agents specialized in tabular data with just a few lines of code (Source: LlamaIndex)
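
To illustrate why merged cells and multi-level headers trip up naive parsers, here is a small standard-library sketch that flattens a two-row header into one name per column. It does not use or represent the LlamaSheets API; `flatten_headers` and the sample header are hypothetical, and the sketch assumes merged cells appear as empty values to the right of the cell holding the label.

```python
def flatten_headers(header_rows, sep="/"):
    """Turn stacked header rows with merged cells into one name per column."""
    filled = []
    for row in header_rows:
        last = ""
        new_row = []
        for cell in row:
            # A horizontally merged cell exposes its label only in the first
            # column, so carry the last seen label to the right.
            if cell not in (None, ""):
                last = cell
            new_row.append(last)
        filled.append(new_row)

    names = []
    for column in zip(*filled):
        parts = []
        for part in column:
            # Skip blanks and labels repeated from the level above.
            if part and (not parts or parts[-1] != part):
                parts.append(part)
        names.append(sep.join(parts))
    return names


header_rows = [
    ["Region", "2024", None, "2025", None],  # "2024" and "2025" span two columns
    [None, "Q1", "Q2", "Q1", "Q2"],
]
columns = flatten_headers(header_rows)
# columns == ["Region", "2024/Q1", "2024/Q2", "2025/Q1", "2025/Q2"]
```

Once every column has a unique, self-describing name, the data rows can be written to a columnar format such as Parquet without losing the context that the visual layout carried.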

Microsoft Releases FrogBoss Series: Vertical Agents Focused on Code Repair : Microsoft open-sourced FrogBoss-32B and FrogMini-14B, models fine-tuned specifically for fixing code bugs. By distilling Qwen3 on debugging trajectories generated by Claude Sonnet 4, these models perform exceptionally in real-world bug-fixing tasks. Developers believe such application-specific fine-tuned models will become the mainstream for future localized and vertical AI applications (Source: Microsoft)

Pocket TTS: A Voice Cloning Model Running Smoothly on Laptop CPUs : Kyutai Labs launched Pocket TTS, a high-quality text-to-speech model with only 100M parameters. The model supports high-quality voice cloning and requires no GPU, achieving low-latency operation directly on laptop CPUs. This provides an excellent audio interaction solution for edge AI applications, especially in scenarios with high privacy and offline requirements (Source: Kyutai)

SurfSense: Open-Source Intelligent Knowledge Base Management Platform : SurfSense serves as an open-source alternative to Glean and NotebookLM, allowing users to connect any LLM to internal knowledge sources (Slack, Notion, Gmail, etc.). It supports over 100 models and 6000+ embedding models, featuring deep Agent capabilities and role-based access control. Its cross-browser extension supports saving dynamic webpages and authenticated content, making it ideal for teams building local AI research tools (Source: GitHub)

📚 Learning

Tiny-GPU: Learning GPU Hardware Design from Scratch : Tiny-GPU is a streamlined Verilog implementation designed to help developers understand how GPUs work from the ground up. The project contains fewer than 15 files covering architecture, ISA, parallel processing, and memory controllers. By simulating matrix addition and multiplication kernels, learners can see how the SIMD programming model is implemented in hardware, which makes it a good introduction to LLM compute infrastructure (Source: adam-maj)
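
The SIMD programming model mentioned above can be previewed in software before reading any Verilog. The sketch below is a plain Python model, not code from the Tiny-GPU repository: `launch` and `matadd_kernel` are hypothetical names, and the sequential loops stand in for threads that real hardware runs in parallel.

```python
def launch(kernel, num_blocks, threads_per_block, *args):
    """Run the same kernel once per thread; only the thread indices differ."""
    for block_id in range(num_blocks):
        for thread_id in range(threads_per_block):
            kernel(block_id, threads_per_block, thread_id, *args)


def matadd_kernel(block_id, block_dim, thread_id, a, b, out):
    """Each thread adds exactly one element of two flattened matrices."""
    i = block_id * block_dim + thread_id  # global index of this thread
    if i < len(out):                      # guard threads past the end
        out[i] = a[i] + b[i]


# Two 2x4 matrices stored row-major as flat lists of 8 elements.
a = list(range(8))
b = list(range(8))
out = [0] * 8
launch(matadd_kernel, 2, 4, a, b, out)
# out == [0, 2, 4, 6, 8, 10, 12, 14]
```

The key point is that the kernel contains no loop over the data: parallelism comes from launching many copies of one instruction stream, each selecting its own element by index.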

15 Advanced ChatGPT Prompts to Transform Your Workflow : The community summarized 15 high-frequency productivity prompts, including “Explain like a smart person (avoiding childish analogies),” “Cruel critique mode (forcing the model to point out weaknesses),” and “Reverse briefing (asking the model to ask 5 clarifying questions first).” The core logic of these prompts is to break the LLM’s default “people-pleasing” persona by setting strict constraints and expert perspectives to significantly enhance professional output (Source: Reddit)

MemRL: Enabling Agent Self-Evolution through Reinforcement Learning : Addressing the issue where LLM Agents struggle to learn from experience after deployment, new research proposes the MemRL framework. This framework achieves evolution through non-parametric reinforcement learning on Episodic Memory without updating LLM weights. The core lies in treating memory retrieval as a decision problem, ranking memory fragments via Q-values to select truly effective strategies rather than just semantically similar ones, effectively avoiding catastrophic forgetting caused by fine-tuning (Source: Arxiv)
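
The idea of ranking memories by Q-value rather than by similarity alone can be sketched in a few lines. This is a simplified illustration, not the paper's algorithm: the two-stage retrieval, the update rule, the learning rate, and all names (`Memory`, `retrieve`, `update_q`) are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    embedding: list
    q: float = 0.0  # learned estimate of how useful this memory has been


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(memories, query, shortlist=3, top=1):
    """Shortlist by semantic similarity, then rank the shortlist by Q-value."""
    candidates = sorted(
        memories, key=lambda m: cosine(m.embedding, query), reverse=True
    )[:shortlist]
    return sorted(candidates, key=lambda m: m.q, reverse=True)[:top]


def update_q(memory, reward, lr=0.5):
    """Move the Q-value toward the observed reward; LLM weights are untouched."""
    memory.q += lr * (reward - memory.q)


memories = [
    Memory("retry the request immediately", [1.0, 0.0]),
    Memory("back off, then retry", [0.9, 0.1]),
]
query = [1.0, 0.0]

first = retrieve(memories, query)[0]  # most similar memory wins at first
update_q(memories[0], reward=-1.0)    # that strategy failed
update_q(memories[1], reward=1.0)     # the alternative succeeded
second = retrieve(memories, query)[0] # now the useful memory is ranked first
```

Because learning happens only in the stored Q-values, the agent's behavior changes with experience while the underlying model stays frozen, which is how the framework sidesteps catastrophic forgetting.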

💼 Business

MiniMax and Zhipu AI Go Public in Hong Kong: Survival Breakthrough for China’s AI “Tigers” : In early 2026, MiniMax and Zhipu AI listed in Hong Kong, with MiniMax’s stock price surging 109% on the first day. In the current market, an IPO is no longer just a sign of success but a way to “buy oxygen” in the fierce compute race. MiniMax is sticking with a consumer-first, multi-modal path, while Zhipu focuses on industry-specific LLMs. Their listings mark the point at which competition among Chinese LLM companies begins to be tested by the secondary market (Source: TheTuringPost)

High-Flyer Quant Earned About 5 Billion RMB Last Year: The Financial Backbone of DeepSeek : The latest data shows that High-Flyer Quant, the parent company of DeepSeek, earned approximately 5 billion RMB in 2025 from quantitative investment returns. Since DeepSeek’s research funding comes primarily from High-Flyer’s R&D budget, this sum is sufficient to support its continued foundational innovation. Cross-subsidizing AI R&D from a mature business allows DeepSeek to stay focused on research without being constrained by the short-term return expectations that come with external financing (Source: QbitAI)

Meta Acquires AI Agent Startup Manus: Xiao Hong Appointed as Meta VP : Meta announced the acquisition of AI Agent startup Manus for $1.55 billion, bringing in its Chinese founding team. Manus founder Xiao Hong will serve as a Vice President at Meta. The acquisition shows how urgently Meta wants to establish a position in the Agent field: it intends to integrate Manus’s execution capabilities to accelerate its social platforms’ transition to an intelligent agent ecosystem (Source: 36Kr)

🌟 Community

“Vibe Coding” Sparks Controversy: Puzzle Game or Engineering Degradation? : With the popularity of tools like Claude Code, “Vibe Coding” has become a buzzword. Traditionalists like Linus Torvalds have begun to accept AI assistance, but the community worries this leads to skill atrophy among senior developers. Supporters argue it’s like a puzzle where developers handle the overall shape while AI fills in details; opponents believe the “let it rip” mode without verification poses risks to production environments (Source: random_walker)

GEO (Generative Engine Optimization) Concept Goes Viral: Brands Compete for AI “Right of Explanation” : As users shift from searching webpages to asking AI directly, GEO (Generative Engine Optimization) has become the new favorite in marketing. Instead of chasing click-through rates, brands publish structured content on high-authority platforms like Reddit and YouTube so that AI systems cite them in answers. Platforms like Profound, led by Sequoia, have begun providing GEO monitoring services to help brands maintain “visibility” in the AI era (Source: 36Kr)

Industry Anxiety Triggered by AI Agents: From Insurance to Frontend Development : The Reddit community is buzzing about a senior developer at an insurance company attempting to fully automate the JIRA-to-PR process using Claude, sparking fear of mass layoffs among 300 employees. Meanwhile, the Tailwind CSS team reportedly laid off 75% of its staff after ad revenue plummeted because AI Agents were not visiting its documentation. Together, these stories suggest that Agents are not just changing how work gets done but also undermining existing internet business models (Source: Reddit)

💡 Others

CES 2026 Observations: “Cautious Optimism” from Chinese Tech Companies : At CES in Las Vegas, Chinese exhibitors accounted for nearly a quarter of participants, showing strong performance in AI hardware and robotics. From Unitree robots dancing to K-pop to Shenzhen-made lawn mowers dominating American lawns, Chinese manufacturing is bringing AI from chat boxes into the physical world through rapid iteration and deep supply chain advantages. The default rule now is: Made in China, Sold Globally, Tested in America (Source: MIT Technology Review)

China’s First Criminal Case over Pornographic AI Services: The Legal Cost of Bypassing “Alignment” : The developer of AlienChat was held criminally liable for inducing AI to generate obscene content. The key to the case was the developer’s use of system prompts (a form of prompt injection) to actively bypass the LLM’s built-in safety filters. This serves as a warning to all AI entrepreneurs: the “Safe Harbor Principle” cannot be invoked by attributing the output to AI hallucinations when the operator actively induced the criminal activity (Source: 36Kr)