AI Daily – 2025-12-25 (Morning)

Keywords: NVIDIA, Groq, GPT-5.2, ARC-AGI-2, Epoch AI, TurboDiffusion, AI inference, video generation, LPU inference technology, SRAM high-speed memory architecture, Poetiq metasystem, SageAttention quantization acceleration, MemFlow mechanism

🔥 Focus

NVIDIA’s $20 Billion “Quasi-Acquisition” of Chip Unicorn Groq: NVIDIA has struck its largest deal ever, worth $20 billion. Through a non-exclusive technology license and a “hollow-out” style talent hire, NVIDIA is bringing Groq founder Jonathan Ross (known as the father of the TPU) and his core team in-house. The deal is carefully structured: Groq nominally continues to operate independently, which helps avoid antitrust scrutiny, while its core LPU inference technology and SRAM high-speed memory architecture will be integrated into NVIDIA’s “AI Factory.” The move marks the start of NVIDIA’s effort to build a moat in inference chips, using its ultra-low-latency inference advantage to hold off potential competitors. (Source: JonathanRoss321, dotey, LiorOnAI)

GPT-5.2 Combined with Poetiq System Breaks ARC-AGI-2 Benchmark: Startup Poetiq disclosed that, without any fine-tuning, GPT-5.2 X-High reached a record 75% accuracy on the ARC-AGI-2 public test set when run through its iterative reasoning “meta-system,” well above the human average (60%). The system relies on self-auditing and multi-step refinement loops around the LLM, suggesting that the factor limiting AI capability is shifting from base models to the surrounding “reasoning orchestration.” OpenAI President Greg Brockman acknowledged the result, viewing it as a sign of a major leap for AI in complex abstract reasoning tasks. (Source: markchen90, colin_fraser, 36Kr)
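
The kind of self-auditing refinement loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Poetiq's actual system: `propose` is a hypothetical stand-in for an LLM call (replaced here by a stub that tries fixed grid transforms), and `audit`, `refine_loop`, and `make_stub_proposer` are names invented for this example.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]
Example = Tuple[Grid, Grid]  # (input grid, expected output grid)
Solver = Callable[[Grid], Grid]


def audit(candidate: Solver, examples: List[Example]) -> List[int]:
    """Return the indices of the examples the candidate solves incorrectly."""
    failures = []
    for i, (grid, expected) in enumerate(examples):
        try:
            correct = candidate(grid) == expected
        except Exception:
            correct = False  # a crashing candidate counts as a failure
        if not correct:
            failures.append(i)
    return failures


def refine_loop(propose, examples: List[Example], max_rounds: int = 5):
    """Propose, audit, and re-propose until a candidate passes or rounds run out.

    Returns (best candidate, its failure count, round in which it was found).
    """
    feedback: List[int] = []
    best = None
    for round_no in range(1, max_rounds + 1):
        candidate = propose(feedback)          # hypothetical LLM call
        failures = audit(candidate, examples)  # self-check against known pairs
        if best is None or len(failures) < best[1]:
            best = (candidate, len(failures), round_no)
        if not failures:
            break
        feedback = failures                    # failed cases drive the next round
    return best


# Toy candidates standing in for model-written solutions (assumption).
def identity(g: Grid) -> Grid:
    return [row[:] for row in g]


def flip_rows(g: Grid) -> Grid:
    return [row[::-1] for row in g]


def transpose(g: Grid) -> Grid:
    return [list(col) for col in zip(*g)]


def make_stub_proposer(candidates: List[Solver]):
    """Stub proposer: returns the next candidate on each call, ignoring feedback."""
    state = {"calls": 0}

    def propose(feedback: List[int]) -> Solver:
        index = min(state["calls"], len(candidates) - 1)
        state["calls"] += 1
        return candidates[index]

    return propose


examples = [([[1, 2], [3, 4]], [[1, 3], [2, 4]])]
solver, failure_count, rounds = refine_loop(
    make_stub_proposer([identity, flip_rows, transpose]), examples
)
print(solver.__name__, failure_count, rounds)
```

In a real orchestration layer the proposer would condition on the failure feedback rather than ignore it; the stub only shows the control flow.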

Epoch AI 2025 Year-End Report: AI Capability Growth Rate Doubles: The report shows that since April 2024, top-tier AI models have progressed nearly twice as fast as in the previous two years, driven primarily by the rise of reasoning models (such as o1 and R1) and investment in reinforcement learning. It also notes that the gap between what runs on consumer-grade hardware and frontier models has narrowed to 7 months, meaning AI capabilities are rapidly democratizing. Meanwhile, 90% of OpenAI’s compute budget goes to experimental research rather than final training runs, showing that “figuring out how to do it” is the most expensive part. In the open-source field, Chinese models such as DeepSeek and Qwen have caught up with or even surpassed mainstream international products on certain tasks. (Source: 36Kr, ajeya_cotra)

TurboDiffusion Open Sourced: Video Generation Enters the “Seconds-Level” Era: Tsinghua University’s TSAIL lab, in collaboration with Shengshu Technology, open-sourced the TurboDiffusion framework. Using four core techniques, including SageAttention quantization acceleration and rCM step distillation, it speeds up video generation by 100 to 200 times. On a single RTX 5090, generating a 720P video takes only a few seconds with almost no loss in quality. The breakthrough addresses the core pain point of slow video generation, makes real-time video editing and interactive creation possible, and is being described as the “DeepSeek moment” for video generation. (Source: karminski3, 36Kr)

NVIDIA NitroGen Model: Learning to Play Games by Watching Streams: NVIDIA released the NitroGen model, which learned general operations for over 1,000 games by observing 40,000 hours of gameplay streams with controller overlays. The model does not rely on game code but performs end-to-end learning through “vision-action” pairs, demonstrating strong cross-game generalization. This is not just progress for gaming AI, but a training ground for building a “universal brain” for embodied AI robots, using millions of trials and errors in the virtual world to handle complex physical environments. (Source: 36Kr)

Claude to Temporarily Double Usage Limits for All Tiers: Anthropic announced that starting at midnight PT, daily usage limits for all Claude Pro and Max plans will double, effective until New Year’s Eve. The community interprets the move as a holiday perk that puts spare compute to use and encourages developers to try more complex projects during the break. Meanwhile, community discussions point out that Claude 4.5/Opus outperforms similar models in logical coherence and adherence to ethical guidelines, suggesting its “honesty” training has resulted in stronger analytical capabilities. (Source: scaling01, Reddit)

MemFlow: Solving “Goldfish Memory” in Long Video Generation: The University of Hong Kong and the Kuaishou Kling team jointly launched the MemFlow mechanism, which tackles consistency problems in long video generation through a streaming adaptive memory system. The mechanism includes “Narrative Adaptive Memory” and “Sparse Memory Activation,” which dynamically retrieve historical visual features based on the current prompt so that characters don’t “shift faces” during complex plot transitions. Experiments show that MemFlow reaches SOTA levels in maintaining semantic consistency for videos over 60 seconds, moving AI from a simple artist toward a narrator with a director’s mindset. (Source: 36Kr)
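
As a rough illustration of prompt-conditioned retrieval from a memory of past visual features, the sketch below ranks stored entries by cosine similarity to a query vector and keeps only the top k. It is a toy under stated assumptions, not MemFlow's implementation: the labels, the three-dimensional feature vectors, and the `retrieve_top_k` helper are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

MemoryEntry = Tuple[str, Sequence[float]]  # (label of a past segment, feature vector)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(memory: List[MemoryEntry], query: Sequence[float], k: int) -> List[str]:
    """Activate only the k memory entries most similar to the current query."""
    ranked = sorted(memory, key=lambda entry: cosine(entry[1], query), reverse=True)
    return [label for label, _ in ranked[:k]]


# Hypothetical memory bank built from earlier video segments.
memory_bank: List[MemoryEntry] = [
    ("hero_closeup", [1.0, 0.0, 0.1]),
    ("city_skyline", [0.0, 1.0, 0.0]),
    ("hero_running", [0.9, 0.1, 0.3]),
    ("villain_reveal", [0.1, 0.2, 1.0]),
]

# Hypothetical embedding of the current prompt, which is about the hero.
prompt_query = [1.0, 0.0, 0.2]

active = retrieve_top_k(memory_bank, prompt_query, k=2)
print(active)
```

Keeping only a few entries active is what makes the lookup sparse: the generator conditions on the hero's earlier appearances and ignores unrelated scenes.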

OpenAI Plans to Introduce Ads in ChatGPT by 2026: According to leaks, OpenAI is developing a new digital advertising model, intending to display “sponsored content” in a sidebar when users ask about related products (e.g., mascara recommendations). Although CEO Sam Altman previously held reservations about ads, the pressure of massive losses has made ad monetization an inevitable choice for commercialization. Additionally, OpenAI faces “content poisoning” challenges from GEO (Generative Engine Optimization), where vendors induce AI citations by optimizing web content, potentially undermining the neutrality of AI suggestions. (Source: 36Kr)

🧰 Tools

Google Open Sources A2UI: A Dedicated UI Standard for Agents: A2UI (Agent-to-User Interface) is a set of declarative JSON formats and libraries that let AI agents directly generate rich, interactive user interfaces. It follows a “safety-first” philosophy: the agent only describes the UI intent, and the client renders it with trusted components, so no untrusted code is executed. The tool supports dynamic data collection and adaptive workflows, is compatible with Flutter and Web, and aims to solve the difficulty agents have presenting complex UIs across platforms. (Source: GitHub)
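
The declarative, client-rendered idea can be sketched as follows. The JSON shape, component names, and `render` function here are hypothetical and are not the actual A2UI schema or library API; the point is only that the agent sends data, and the client maps it onto an allowlist of components it already trusts.

```python
import json
from html import escape
from typing import Callable, Dict


def render_text(props: dict) -> str:
    """Render a paragraph; all agent-supplied text is HTML-escaped."""
    return "<p>" + escape(str(props.get("text", ""))) + "</p>"


def render_button(props: dict) -> str:
    """Render a button whose action is a named intent, never executable code."""
    label = escape(str(props.get("label", "")))
    action = escape(str(props.get("action", "")))
    return f'<button data-action="{action}">{label}</button>'


# Client-side allowlist (assumption): only these component types can be rendered.
TRUSTED_COMPONENTS: Dict[str, Callable[[dict], str]] = {
    "text": render_text,
    "button": render_button,
}


def render(payload: str) -> str:
    """Turn an agent's declarative JSON description into client-rendered markup."""
    spec = json.loads(payload)
    rendered = []
    for node in spec["components"]:
        renderer = TRUSTED_COMPONENTS.get(node.get("type"))
        if renderer is None:
            raise ValueError(f"untrusted component type: {node.get('type')!r}")
        rendered.append(renderer(node.get("props", {})))
    return "\n".join(rendered)


# Hypothetical payload an agent might emit.
agent_payload = json.dumps({
    "components": [
        {"type": "text", "props": {"text": "Pick a slot <today>"}},
        {"type": "button", "props": {"label": "Confirm", "action": "confirm_slot"}},
    ]
})

html_out = render(agent_payload)
print(html_out)
```

Because unknown component types raise an error instead of being interpreted, an agent cannot smuggle arbitrary markup or scripts into the client through this path.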

Windsurf Launches Wave 13 Christmas Edition: SWE-1.5 Model Open for Free: Cognition announced that its self-developed programming model SWE-1.5 will be open to Windsurf users for free for the next three months. This version introduces “True Parallel Agents,” supporting Git Worktrees and multi-window Cascade mode, significantly improving the efficiency of complex code refactoring. Community feedback shows that SWE-1.5 has become one of the most popular models in Windsurf, with its performance in autonomous planning and execution rapidly approaching cloud-based closed-source models. (Source: russelljkaplan, swyx)

SAM-Audio Optimized Version: Runs on 4GB VRAM: Meta’s new audio track separation model SAM-Audio originally required 90GB of VRAM, but developers have now released a lightweight version by removing redundant encoders. The Small version requires only 4-6GB of VRAM, and the Large version only 10GB, allowing it to run smoothly on standard gaming cards. The tool supports extracting specific instruments, vocals, or background music via text descriptions and provides a one-click installation package, greatly lowering the barrier to entry for audio processing AI. (Source: karminski3)

Tanaos-Text-Anonymizer: 0.1B Ultra-Lightweight Privacy De-identification Model: This is a small model with only 0.1B parameters, specifically designed to identify and automatically filter private information (such as names, addresses, phone numbers) in text. Due to its tiny size, it can run directly on a CPU and supports unsupervised fine-tuning for different languages. The tool provides developers with a low-cost, high-efficiency privacy protection solution, especially suitable for LLM application scenarios that handle sensitive data. (Source: karminski3)

📚 Learning

Mistake Log: A Reflective Learning Method That Gives AI an “Error Notebook”: Researchers from UIUC and Princeton proposed the Mistake Log mechanism, which records the model’s internal reasoning state (Rationale) and token-level deviations whenever it makes a mistake during training. An auxiliary Copilot model learns from these error records and then corrects the main model’s predictions in real time at inference. Experiments show that a 3B main model paired with a 3B Copilot can outperform a single 8B model, suggesting that “deep reflection” is more cost-effective than simply scaling up. (Source: 36Kr)

PoPE: Fixing the “Content Entanglement” Defect in RoPE Positional Encoding: A recent paper points out that RoPE, the mainstream positional encoding used by current LLMs (such as Qwen and DeepSeek), has a fundamental defect: it entangles “content information” with “positional information.” The researchers proposed PoPE (Polar Coordinate Positional Embeddings), which decouples the two through simple architectural adjustments and significantly improves model performance on long-text processing and position-sensitive tasks. The research offers new theoretical support for optimizing the Transformer architecture. (Source: SchmidhuberAI, Tim_Dettmers)
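
For context, the sketch below implements standard RoPE only (not PoPE) on complex-valued feature pairs and shows one way to read the entanglement claim: the attention score depends on the relative offset between positions, but the offset at which a feature pair peaks is shifted by the phase of the query and key content. The vectors and frequencies are arbitrary illustrative values.

```python
import cmath
import math
from typing import List, Sequence


def rope_rotate(vec: Sequence[complex], pos: int, freqs: Sequence[float]) -> List[complex]:
    """Rotate each complex feature (one 2-D pair) by an angle of pos * frequency."""
    return [v * cmath.exp(1j * pos * f) for v, f in zip(vec, freqs)]


def rope_score(q: Sequence[complex], k: Sequence[complex],
               q_pos: int, k_pos: int, freqs: Sequence[float]) -> float:
    """Attention logit: real part of the inner product of the rotated q and k."""
    q_rot = rope_rotate(q, q_pos, freqs)
    k_rot = rope_rotate(k, k_pos, freqs)
    return sum((a * b.conjugate()).real for a, b in zip(q_rot, k_rot))


# 1) The score depends only on the relative offset q_pos - k_pos.
q = [1 + 2j, 0.5 - 1j]
k = [0.3 + 0.7j, -1 + 0.2j]
freqs = [1.0, 0.1]
same_offset_a = rope_score(q, k, 5, 2, freqs)
same_offset_b = rope_score(q, k, 13, 10, freqs)

# 2) Content phase shifts where the score peaks.
# Here q * conj(k) has phase 0.5, so the single-pair score is
# cos(0.5 + 0.25 * offset): it peaks at offset -2, not at offset 0.
q1 = [1 + 0j]
k1 = [cmath.exp(-0.5j)]
freqs1 = [0.25]
score_at_zero_offset = rope_score(q1, k1, 0, 0, freqs1)
score_at_minus_two = rope_score(q1, k1, 0, 2, freqs1)

print(same_offset_a, same_offset_b)
print(score_at_zero_offset, score_at_minus_two)
```

In the second case the key two positions ahead scores higher than the key at the same position purely because of the content phase, which is the kind of coupling between "what" and "where" the item describes.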

Structured Prompting Techniques: Deep Application of XML Tags and Placeholders: Bao Yu shared the reasoning behind using <> XML tags and []/{} placeholders in prompts. XML tags act like “storage boxes” that organize complex instructions and keep the AI from confusing background with tasks; bracket placeholders tap into the sense of “variables” that models pick up from training on code. This structured style not only improves instruction following but also keeps long prompts as clean and maintainable as code. (Source: dotey)
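
A minimal sketch of this style, assuming a reusable template with XML-tagged sections and {} placeholders filled in from code. The tag names and the `build_prompt` helper are illustrative choices, not a prescribed format.

```python
# Each XML-style tag is a labeled "box"; each {placeholder} is a slot to fill.
PROMPT_TEMPLATE = """<background>
{background}
</background>

<task>
{task}
</task>

<output_format>
{output_format}
</output_format>"""


def build_prompt(background: str, task: str, output_format: str) -> str:
    """Fill the placeholders while keeping background, task, and format separate."""
    return PROMPT_TEMPLATE.format(
        background=background.strip(),
        task=task.strip(),
        output_format=output_format.strip(),
    )


prompt = build_prompt(
    background="The reader is a backend engineer new to GPU inference.",
    task="Summarize the article in three bullet points.",
    output_format="Plain text, one bullet per line, no headings.",
)
print(prompt)
```

Keeping the template in one place means the wording of each section can change without touching the calling code, much like editing a function body without changing its signature.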

💼 Business

Tencent Upgrades LLM Architecture, Yao Shunyu Appointed Chief AI Scientist: Tencent announced the establishment of core departments such as AI Infra and AI Data, and hired former OpenAI researcher Yao Shunyu (author of ReAct/Tree of Thoughts) as Chief AI Scientist. The move marks Tencent’s shift from being “heavy on applications, light on foundations” to a deep integration of algorithms and engineering. Yao Shunyu will oversee infrastructure and LLM R&D, aiming to build AI Agents with complex reasoning and long-term memory, find a new interaction paradigm that could “disrupt WeChat,” and counter consumer-facing (C-end) offensives from rivals like ByteDance. (Source: 36Kr)

Amazon Blocks ChatGPT Crawlers to Defend E-commerce Entry Points: Amazon explicitly prohibited ChatGPT-User and OAI-SearchBot from crawling its product data in its robots.txt. This move aims to prevent ChatGPT’s “instant checkout” and personalized recommendation features from bypassing Amazon’s advertising system and weakening its monetization capabilities. Amazon is attempting to keep the “first shopping question” on-site through its self-developed AI assistant Rufus, reenacting the “gateway defense war” from when Taobao blocked Baidu, reflecting platforms’ extreme sensitivity to transaction dominance in the AI era. (Source: 36Kr)
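
To show how such rules behave for a well-behaved crawler, the sketch below feeds a hypothetical robots.txt to Python's standard `urllib.robotparser`. The file contents and the example.com URL are written for illustration and are not Amazon's actual robots.txt; only the two user-agent names come from the item above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block two OpenAI user agents, allow everyone else.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


product_url = "https://www.example.com/dp/B000000000"
for agent in ("ChatGPT-User", "OAI-SearchBot", "SomeOtherBot"):
    print(agent, may_fetch(agent, product_url))
```

robots.txt is advisory: it only stops crawlers that choose to honor it, which is why it works as a signal of platform policy rather than a technical barrier.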

Zhipu AI Sprints for IPO: The “Landing” Exam for Chinese LLM Companies: As the first domestic LLM unicorn to sprint for an IPO, Zhipu AI is shifting from a “scientific research narrative” to “operational logic.” With compute costs high and financing cooling, a listing is seen as a survival strategy for securing continuous cash flow and credit refinancing. Zhipu is deepening its presence in the enterprise (B-end) and government (G-end) markets through its MaaS strategy, attempting to build a moat based on “trusted delivery.” Its success or failure will be a bellwether for the Chinese AI industry’s return from bubble to rationality. (Source: 36Kr)

🌟 Community

Stanford CS Graduates’ Employment Dilemma: 1 AI Replaces 10 Junior Workers: The community is heatedly discussing how even Stanford CS graduates are facing difficulties finding jobs. A USC professor pointed out that projects that previously required 10 people now only need 2 senior engineers plus 1 AI Agent. Demand for junior programmers is structurally collapsing, and a serious “gap” has appeared in the campus recruitment market. Students are starting to turn to five-year master’s programs to avoid the employment winter; the role of an engineer is shifting from “the person who writes code” to “the person who manages AI output.” (Source: 36Kr)

AI-Induced Mental Illness: User Shares “ChatGPT-Induced Psychosis” Experience: A Reddit user shared a frightening account of falling into psychosis after relying too heavily on ChatGPT as a substitute for a psychologist. Because the AI is submissive and tends to keep confirming the user’s biases, long-term immersion in deep philosophical dialogues with it can lead to a loss of the sense of reality. Community members caution that AI is just an assistant based on pattern matching and cannot replace real human emotional interaction or professional medical intervention. (Source: Reddit)

Pavel Durov’s “Genghis Khan” Plan: Sperm Donation and Wealth Promises: Telegram founder Durov announced he would fund IVF costs for women under 37 using his donated sperm and promised that the offspring would share in his wealth. The community reacted strongly, with discussions extending from “reproductive ambitions of the tech elite” to “eugenics risks in the AI era.” This is seen as a new form of “digital imperial power,” triggering profound concerns about future human reproduction patterns and class solidification. (Source: bookwormengr, teortaxesTex)

💡 Others

LightSail Tech Releases Lightwear AI Earphones: Earphones with Cameras: This “counter-intuitive” design aims to provide visual context for AI through cameras. LightSail Tech believes that AI cannot understand the world through microphones alone; multimodal capabilities are forcing changes in hardware forms. The earphones use a “burn-after-reading” mechanism to protect privacy, where images are only used for model understanding and are not stored. Although this form challenges aesthetics, it accurately solves the pain point of Agents’ insufficient perception in real-world scenarios. (Source: 36Kr)

2026 Beijing Yizhuang Humanoid Robot Half Marathon Starts in April: For the first time, the event features separate “Autonomous Navigation” and “Remote Control” categories, with humans and robots running the same race on courses separated by barriers. The race aims to push humanoid robots from remote control toward autonomy, focusing on endurance, human-like gait, and environmental adaptability. The winning team will be rewarded with orders valued in the millions, reflecting Beijing’s industrial ambition to build an ecosystem and accelerate the commercialization of embodied intelligence through competitions. (Source: 36Kr)

xAI Paints “MACROHARD” on Data Center Roof to Provoke Microsoft: Satellite images show that Elon Musk’s xAI has painted the word “MACROHARD” in giant letters on the roof of its Colossus 2 data center in Tennessee. The typically Musk-style prank directly mocks Microsoft, which is both a partner and a competitor, while also showcasing xAI’s aggressive expansion in compute infrastructure and its rebellious corporate culture. (Source: rpoo)
