Keywords: AI infrastructure, Sovereign AI, Agent, Five-layer cake model, Engram architecture, Agent cognitive compressor
🔥 Focus
NVIDIA’s Jensen Huang at Davos: The “Five-Layer Cake” Theory of AI Infrastructure : NVIDIA CEO Jensen Huang proposed a “five-layer cake” model for the AI industry at the 2026 Davos Forum: Energy, Chips, Cloud Services, Models, and Applications. He noted that the hundreds of billions of dollars invested so far are just the beginning, with trillions of dollars in infrastructure spending still to come. Huang emphasized that AI should be treated as national-level infrastructure (Sovereign AI). Citing the growing number of radiologists, he argued that AI automates “tasks” rather than replacing “purposes,” and that efficiency gains create new demand. This framing offers a counterpoint to global anxiety about AI-driven unemployment: AI is a productivity amplifier rather than a human rival (Source: NVIDIA)

Anthropic Releases “Claude Constitution”: Defining AI’s Independent Persona and Values : Anthropic has officially released Claude’s new constitution, detailing its behavioral vision and core values. The document is not just a guide for the training process but an attempt to shape Claude as a new kind of entity in the world, distinct from earlier science-fiction conceptions of AI. The constitution emphasizes Claude’s independence beyond its training data and even includes Anthropic’s obligations to the AI. The community has reacted strongly, viewing this as a shift from AI as a tool to an entity with a “digital persona,” and it has also sparked deep discussion about balancing AI constraints with autonomy (Source: Anthropic)

DeepSeek Launches Engram Architecture: A Breakthrough in Computing Power by Replacing HBM with DRAM : A Morgan Stanley research report highly praised the Engram module proposed in DeepSeek’s latest paper. The architecture separates static pattern storage from dynamic inference through a “conditional memory” mechanism, allowing models to offload massive amounts of knowledge to low-cost system memory (DRAM) and look it up only when needed. This eases the bottleneck of expensive High Bandwidth Memory (HBM) and suggests that “doing more with less” is achievable through algorithmic innovation in compute-constrained environments. Morgan Stanley predicts that DeepSeek V4, built on this architecture, could run on consumer-grade GPUs (such as the RTX 5090), which the report says would rewrite the laws of AI scaling (Source: Morgan Stanley)
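
The conditional-memory idea described above can be illustrated with a toy lookup table. This is a minimal sketch, not DeepSeek’s implementation: the class name `ConditionalMemory`, the n-gram hashing scheme, and the table size are all assumptions made for illustration. It only shows static patterns being kept in an ordinary host-memory table and fetched for the n-grams that actually occur in the input.

```python
import hashlib


class ConditionalMemory:
    """Toy host-memory table keyed by hashed token n-grams (hypothetical)."""

    def __init__(self, num_slots: int, ngram: int = 2):
        self.num_slots = num_slots
        self.ngram = ngram
        # A sparse dict stands in for a large table kept in system DRAM.
        self.table = {}

    def _slot(self, tokens: tuple) -> int:
        # Deterministic hash so the same n-gram always maps to the same slot.
        digest = hashlib.sha256(" ".join(tokens).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_slots

    def store(self, tokens: tuple, value: str) -> None:
        # Keep the key next to the value so hash collisions can be detected.
        self.table[self._slot(tokens)] = (tokens, value)

    def lookup(self, sequence: list) -> list:
        """Return stored values only for n-grams present in the sequence."""
        hits = []
        for i in range(len(sequence) - self.ngram + 1):
            key = tuple(sequence[i:i + self.ngram])
            entry = self.table.get(self._slot(key))
            if entry is not None and entry[0] == key:
                hits.append(entry[1])
        return hits


memory = ConditionalMemory(num_slots=1_000_003)
memory.store(("eiffel", "tower"), "landmark in Paris")
retrieved = memory.lookup(["the", "eiffel", "tower", "is", "tall"])
missed = memory.lookup(["hello", "world"])
```

In this sketch the lookup cost depends on the input length, not on how much knowledge is stored, which is the property that makes cheap, large memory attractive.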

Inside xAI’s “Macrohard” Project: Tesla Car Computers Could Become the Base for Millions of Agents : Former xAI engineer Sulaiman Ghori leaked details of an internal project codenamed “Macrohard” on a podcast. The project aims to build a “human simulator” that simulates human keyboard and mouse operations at 8x speed to automate white-collar work. The most shocking revelation is xAI’s plan to utilize the computing power of millions of idle Tesla vehicles (HW4 platform) to deploy these Agents, bypassing traditional data center construction cycles via a distributed network. Ghori was subsequently fired for the unauthorized leak, but the revealed “war room” culture and aggressive timeline have prompted the industry to re-evaluate xAI’s competitive potential (Source: The Information)

Google Partners with Shopify to Enter AI E-commerce: Shifting from Search Entry Point to Transaction Loop : Google announced the launch of the Universal Commerce Protocol (UCP), partnering with giants like Shopify and Walmart to turn Gemini into a complete shopping entry point. Users can complete the entire process, from price comparison and spec analysis to instant checkout, within the chat box without switching apps. Gemini can even call offline stores on behalf of the user to confirm stock. The move is seen as a strong counter-attack against ChatGPT’s “Instant Checkout” feature and marks a paradigm shift from search advertising to “Agent Commerce,” as large model providers become a new force reshaping the global retail channel landscape (Source: Google)

🎯 Trends
Apple AI Hardware and Siri “Campos” Upgrade Plans Leaked : Reports suggest Apple is secretly developing an AI wearable device (AI Pin) similar to an AirTag, featuring multiple cameras and sensors, with an expected release in 2027. Meanwhile, a brand-new Siri codenamed “Campos” will debut this September, deeply integrating the Google Gemini 3 model and possessing “screen awareness” to directly operate files and apps on the screen. Apple aims to leverage its hardware-software integration advantage to counter OpenAI and Meta in the edge AI field, with an initial hardware production target of 20 million units (Source: The Information)

Microsoft Releases VibeVoice-ASR: Processing One Hour of Audio in a Single Pass : Microsoft has open-sourced VibeVoice-ASR, a 9B-scale speech recognition model, on Hugging Face. The model breaks with the traditional ASR approach of slicing audio into chunks: it processes up to 60 minutes of audio at once within a 64K token window, avoiding the loss of global context and speaker-tracking confusion. Tests show robust performance against complex backgrounds (such as picking out voices over music) and on long-form content (such as audiobook narration), with an average accuracy of 91.9%. It also supports hotword configuration to correct proper-noun recognition (Source: Microsoft)
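
A quick calculation shows what fitting an hour of audio into that window implies. This is a back-of-the-envelope sketch under two assumptions that are not from the source: that “64K” means 65,536 tokens, and that the entire window is available for audio (in practice some of it would be needed for the transcript itself, so the real audio budget is lower).

```python
# Assumption: "64K" is read as 64 * 1024 tokens.
CONTEXT_TOKENS = 64 * 1024
# 60 minutes of audio, in seconds.
AUDIO_SECONDS = 60 * 60

# Upper bound on how many tokens can represent each second of audio.
max_tokens_per_second = CONTEXT_TOKENS / AUDIO_SECONDS
print(f"at most {max_tokens_per_second:.1f} tokens per second of audio")
```

The bound works out to roughly 18 tokens per second, which suggests the audio must be encoded at a low token rate for single-pass processing to be feasible.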

Meta Introduces Dr. Zero Framework: Achieving Agent Self-Evolution with Zero Data : Meta’s Fundamental AI Research (FAIR) lab proposed the Dr. Zero framework, which enables agents to evolve efficiently without annotated data. The framework uses a “Proposer-Solver” collaborative mechanism that relies on search engines to actively explore and generate complex problems. The core technique, HRPO (hop-grouped relative policy optimization), builds group-level baselines by clustering similar questions, avoiding expensive nested sampling. It outperformed fully supervised baselines by 14.1% in complex Q&A tasks, offering a new path around the AI training data depletion problem (Source: Meta)
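
The idea of scoring each answer against a baseline shared by a cluster of similar questions can be sketched as follows. This is an illustration of group-relative advantages in general, not the Dr. Zero code: the function name, the cluster labels, and the reward values are hypothetical, and the real method’s clustering criterion and normalization may differ.

```python
from collections import defaultdict
from statistics import mean


def group_relative_advantages(samples):
    """samples: list of (cluster_label, reward) pairs.

    Returns each reward minus the mean reward of its cluster, so one
    rollout per question is enough to form a baseline.
    """
    rewards_by_cluster = defaultdict(list)
    for cluster, reward in samples:
        rewards_by_cluster[cluster].append(reward)
    baselines = {c: mean(r) for c, r in rewards_by_cluster.items()}
    return [reward - baselines[cluster] for cluster, reward in samples]


# Hypothetical rewards for five questions that fall into two clusters.
samples = [
    ("cluster_a", 1.0),
    ("cluster_a", 0.0),
    ("cluster_b", 0.5),
    ("cluster_b", 0.5),
    ("cluster_b", 0.2),
]
advantages = group_relative_advantages(samples)
```

Because the baseline comes from neighbouring questions instead of repeated rollouts of the same question, the sampling cost per question stays at a single rollout in this sketch.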

Industry Shifts to Long-Horizon Task Evaluation: Multiple Real-world Benchmarks Released : The focus of AI evaluation is shifting from math/code leaderboards to long-horizon tasks. The newly released APEX-Agents tests how Agents handle professional collaboration in Google Workspace; DSAEval covers 641 real-world data science problems. Tests show GPT-5.2 leading in efficiency, while Claude-Sonnet-4.5 is strongest in overall performance. These benchmarks reflect an emerging industry consensus: what limits Agent development is no longer reasoning ability, but the ability to maintain logical consistency and memory control over long task horizons (Source: Mercor, DSAEval)

Agent Cognitive Compressor (ACC): Biologically Inspired Memory Control : Researchers have proposed the Agent Cognitive Compressor to solve the “context decay” problem in multi-turn dialogues for Agents. Instead of simply replaying history, ACC maintains a “compressed cognitive state” whose structure is constrained by the architecture, retaining only key variables such as goals, entities, and relationships. Experiments show that ACC achieves near-zero hallucination and drift rates in complex workflows of over 50 turns, far outperforming traditional Retrieval-Augmented Generation (RAG) models (Source: DAIR.AI)
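
One way to picture a bounded state that replaces replayed history is the sketch below. It is not the ACC implementation: the `CompressedState` schema, the eviction rule (drop the least recently mentioned entity), and the cap of 8 entities are assumptions chosen for illustration. The point is only that the prompt built from the state stays the same size no matter how many turns have passed.

```python
from dataclasses import dataclass, field


@dataclass
class CompressedState:
    """Bounded summary of a dialogue (hypothetical schema, not the paper's)."""

    goal: str = ""
    entities: dict = field(default_factory=dict)  # name -> short description
    relations: set = field(default_factory=set)   # (subject, verb, object)
    max_entities: int = 8

    def update(self, goal=None, entities=None, relations=None):
        if goal is not None:
            self.goal = goal
        for name, description in (entities or {}).items():
            # Re-insert so that dict order reflects recency of mention.
            self.entities.pop(name, None)
            self.entities[name] = description
        while len(self.entities) > self.max_entities:
            # Evict the least recently mentioned entity.
            del self.entities[next(iter(self.entities))]
        self.relations |= set(relations or ())
        # Drop relations whose endpoints are no longer tracked.
        self.relations = {
            r for r in self.relations
            if r[0] in self.entities and r[2] in self.entities
        }

    def render(self) -> str:
        """Build the text that would be fed to the model each turn."""
        lines = [f"GOAL: {self.goal}"]
        lines += [f"ENTITY {n}: {d}" for n, d in self.entities.items()]
        lines += [f"REL {s} {v} {o}" for s, v, o in sorted(self.relations)]
        return "\n".join(lines)


state = CompressedState()
state.update(goal="book a flight")
for turn in range(50):
    state.update(
        entities={f"e{turn}": f"mentioned in turn {turn}"},
        relations={(f"e{turn}", "follows", f"e{turn - 1}")} if turn else None,
    )
prompt = state.render()
```

After 50 turns the rendered prompt still contains one goal line, eight entities, and the relations among them, whereas a replayed transcript would have grown with every turn.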

🧰 Tools
Prefect Horizon: A Hosting and Governance Platform for MCP Servers : In response to the popularity of the Model Context Protocol (MCP), Prefect has launched the Horizon platform. It addresses the pain points of MCP server deployment in enterprises, providing managed hosting, Role-Based Access Control (RBAC), audit logs, and tool discovery. Horizon allows enterprises to securely expose private data and workflows to AI Agents, elevating MCP from a simple protocol to a productivity platform manageable at scale (Source: Prefect)

CopilotKit + LangChain: A Frontend Solution for Deep Agents : CopilotKit now supports the Deep Agents architecture proposed by LangChain, allowing developers to build interactive UIs for Agents with planning capabilities with just a few lines of code. The tool supports streaming output, custom Skills, and sub-agent orchestration, solving the UI/UX bottlenecks developers face when building complex Agent applications, enabling “planning-first” Agents (like Manus or Claude Code-style apps) to be converted into end products more quickly (Source: CopilotKit)

Devin Review: An AI Tool Reimagining the Code Review Experience : Cognition has launched Devin Review, aimed at the human review bottleneck that appears once AI generates massive amounts of code. The tool doesn’t just look for bugs; it uses a redesigned interface to help humans quickly understand complex PR logic. To use it, developers replace the domain name in a GitHub link, and in tests it found related errors outside the diff itself. The core logic: AI-generated code should be reviewed with more efficient AI-assisted tools, rather than letting programmers drown in “code garbage” (Source: Cognition)

GLM-4.7 Flash Local Deployment Optimization: Running 200K Context on a Single Card : The community fixed vLLM’s KV cache support for GLM-4.7-Flash with a single line of code, enabling the MLA (Multi-head Latent Attention) mechanism. This cut the VRAM needed for the 30B model’s 200K context from 180GB to just 10GB. A single RTX 5090 (32GB VRAM) can now run this reasoning model at full speed, which the community sees as the start of the high-performance local Agent era (Source: Zai_org)
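
The scale of that saving can be illustrated with the standard per-token KV cache formulas for full multi-head attention versus an MLA-style compressed cache. The numbers below are hypothetical and are not GLM-4.7-Flash’s actual configuration: the layer count, head count, head dimension, latent rank, RoPE dimension, and the reading of “200K” as 200,000 tokens are all assumptions picked to show the order of magnitude.

```python
# Hypothetical model shape (NOT the real GLM-4.7-Flash config).
NUM_LAYERS = 48
NUM_KV_HEADS = 32
HEAD_DIM = 128
KV_LATENT_RANK = 512   # size of the compressed KV latent per token
ROPE_DIM = 64          # decoupled positional key stored alongside the latent
BYTES_PER_VALUE = 2    # fp16 / bf16
CONTEXT_TOKENS = 200_000


def full_kv_bytes_per_token() -> int:
    # Keys and values for every head in every layer.
    return 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE


def mla_kv_bytes_per_token() -> int:
    # One shared latent plus a small positional key per layer.
    return NUM_LAYERS * (KV_LATENT_RANK + ROPE_DIM) * BYTES_PER_VALUE


full_gb = full_kv_bytes_per_token() * CONTEXT_TOKENS / 1e9
mla_gb = mla_kv_bytes_per_token() * CONTEXT_TOKENS / 1e9
ratio = full_gb / mla_gb
print(f"full KV: {full_gb:.1f} GB, MLA: {mla_gb:.1f} GB, ratio: {ratio:.1f}x")
```

With these illustrative values the cache shrinks from about 157 GB to about 11 GB, the same order of magnitude as the reduction reported above.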

📚 Learning
Gemini CLI Practical Course: Building Multi-step Automation Workflows : DeepLearning.AI and Google have launched a free short course teaching developers how to build open-source agents using the Gemini CLI. The course covers the entire process from local file operations and dev tool integration to cloud service calls, focusing on how to use Agents for code automation, data dashboard creation, and complex task planning. It is aimed at developers looking to move from simple API calls to building actual productivity tools (Source: DeepLearningAI)

Hyperball Optimizer: Achieving 33% Training Acceleration via Normalization : Stanford researchers proposed the Hyperball optimizer wrapper. The method keeps the weight norm and the update norm constant, allowing direct control over the effective step size and thereby replacing traditional weight decay. Experiments show that Hyperball can deliver a 33% training speedup on top of optimizers like Muon and offers stronger hyperparameter transferability, providing a more stable mathematical framework for large-scale model training (Source: Kaiyue Wen)
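
A constant-norm update of the kind described can be sketched in a few lines. This is one reading of the one-sentence description above, not the authors’ implementation: the function name `hyperball_step`, the choice to scale the update to `lr` times the weight norm, and the final projection back onto the sphere are all assumptions. The raw `update` would come from a base optimizer such as Muon.

```python
import math


def norm(vector):
    return math.sqrt(sum(x * x for x in vector))


def hyperball_step(weights, update, lr):
    """Apply one step that preserves ||weights|| (illustrative only).

    The update is rescaled to have norm lr * ||weights||, so lr directly
    sets the relative step size regardless of the update's raw scale.
    """
    radius = norm(weights)
    update_norm = norm(update)
    if radius == 0.0 or update_norm == 0.0:
        return list(weights)
    scale = lr * radius / update_norm
    stepped = [w - scale * u for w, u in zip(weights, update)]
    stepped_norm = norm(stepped)
    if stepped_norm == 0.0:
        return list(weights)
    # Project back onto the sphere of the original radius.
    return [radius * x / stepped_norm for x in stepped]


weights = [3.0, 4.0]                      # norm 5
new_weights = hyperball_step(weights, [10.0, 0.0], lr=0.1)
same_direction = hyperball_step(weights, [1000.0, 0.0], lr=0.1)
```

In this sketch, multiplying the raw update by 100 leaves the result unchanged, which illustrates why a fixed norm makes the effective step size directly controllable and removes the need for weight decay to keep weights from growing.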

NVIDIA Motive: An Attribution Analysis Method for Video Generation : NVIDIA researchers introduced Motive, a gradient-based motion-centric data attribution method. By isolating temporal dynamics from static appearance, Motive can accurately identify which videos in the training set had a positive or negative impact on the generated motion. This is of significant research value for optimizing the training quality of video generation models and understanding the causes of motion degradation (Source: NVIDIA Research)

InT (Intervention Training): Solving the Credit Assignment Problem in Reasoning : A paper proposed the Intervention Training method, which optimizes reinforcement learning initialization by having the model locate the first error in its own reasoning path and propose a single-step intervention. Compared to standard RL, which only rewards the final answer, InT can precisely correct intermediate steps. On the IMO-AnswerBench benchmark, this method improved the accuracy of a 4B model by 14%, even surpassing 20B-scale models (Source: HuggingFace)

💼 Business
OpenAI Plans to Raise $50 Billion at an $830 Billion Valuation : Reports indicate Sam Altman recently met with investors in the UAE to discuss a massive new funding round. The target is $50 billion, with a valuation between $750 billion and $830 billion. The funds will primarily support OpenAI’s projected $200 billion compute expenditure through 2030. Meanwhile, OpenAI is facing a massive lawsuit from Elon Musk regarding “betraying its non-profit mission” (Source: Bloomberg)

Alibaba’s T-Head Initiates IPO Plan: Completing the Full-Stack AI Chip Layout : Alibaba has decided to support an independent IPO for its chip subsidiary, T-Head (Pingtouge). In the 8 years since its founding, T-Head has launched several top-tier chips for computing, storage, and networking. Its self-developed PPU (GPU) now rivals the NVIDIA H20 in performance, making it a major force in China’s domestic AI compute supply. The listing is expected to trigger a market re-evaluation of domestic AI chip value and marks Alibaba’s completion of a full-stack AI layout spanning models, cloud infrastructure, and core chips (Source: 36Kr)

Embodied AI Startup Skild AI Raises $1.4 Billion in Series B : Led by SoftBank with participation from NVIDIA and Jeff Bezos, the round values Skild AI at more than $14 billion. The company is building “Skild Brain,” a general-purpose embodied brain that generalizes across hardware form factors. Its 2025 revenue reached $30 million, primarily from industrial deployments in security and delivery. The funding will accelerate its push to bring embodied AI to the home consumer market (Source: Skild AI)

🌟 Community
The “December Revolution” in Programming: Agentic Coding Gains Mainstream Recognition : The community is buzzing about December 2025 being a watershed moment for software engineering. Tech leaders like Linus Torvalds and Karpathy have begun to publicly embrace Agentic Coding. Discussions suggest that “Software Engineers” are transforming into “Software Prompters,” and future core competitiveness will lie in the ability to orchestrate AI Agents. The focus of PR reviews will shift from the code itself to the review of Prompts and verification logic (Source: X)

Five Core Skill Stacks for the Post-AI Era : As AI takes over the technical execution layer, the community has summarized new personal competitiveness: 1. Agency: creating stories worth telling; 2. Taste: the ability to distinguish good from bad; 3. Perspective: adding human uniqueness; 4. Persuasion: resonating with people; 5. Know-How: efficiently utilizing AI tools. The core view is: when intelligence is infinitely abundant, human “judgment” and “aesthetics” will command the highest premium (Source: DAN KOE)

AI Education Equality: Gemini Provides Free SAT Mock Exams : Google has launched a full SAT mock exam feature in the Gemini App, certified by The Princeton Review, providing instant feedback. The community believes this has significant social meaning, democratizing expensive test prep. While some worry it will intensify “score-chasing” competition, more see it as a milestone for AI as a “private tutor” to close the education gap (Source: Google Education)

💡 Others
The “New Narrative” of AI in Real Estate : Facing a market downturn, real estate companies have begun using robots as selling points. From greeting visitors and giving tours to community cleaning and unmanned delivery, robots are becoming the core packaging for “future-tech residences.” This reflects the real estate industry’s attempt to shift from “high leverage” to “high tech content.” Although large-scale deployment still faces challenges, robots have become an important label for attracting young homebuyers (Source: 36Kr)

Cross-species “Agents”: Cows Observed Using Tools : The scientific community has discovered that cows can learn to use tools in specific environments, a discovery jokingly dubbed the “first Agentic Cow” by the AI community. The discussion extends to the boundaries between biological intelligence and artificial agents, and how observing primitive intelligence in nature can inspire autonomous exploration algorithms for AI (Source: Futurism)

xAI Forms “Talent Sniper Team”: Engineers Recruiting Engineers : Elon Musk has personally stepped in to form a “Talent Engineer” team at xAI reporting directly to him. The role requires candidates to be “geeks” with technical intuition rather than traditional HR, focusing on mining top talent through Vibe coding and specific communities. With annual salaries reaching up to 1.68 million RMB, it reflects the almost frantic competition for top technical talent in the AI era (Source: Business Insider)