AI Daily - 2025-12-26 (Morning)

Keywords: Large Language Models, AI Unicorn, OpenAI, NVIDIA, Meta, AI Inference, AI Computing Power, AI Music, GLM-4.7 Model, GPT-5.2-Codex-XMas, Groq LPU Architecture, Self-play SWE-RL, Nemotron 3 Series

🔥 Focus

Zhipu AI and MiniMax Compete for “World’s First AI Model IPO”: Zhipu AI and MiniMax (Xiyu Technology) have successively passed hearings at the Hong Kong Stock Exchange and disclosed their prospectuses, marking the start of the capitalization process for Chinese AI unicorns. Zhipu’s 2024 revenue reached 312 million RMB, with a compound annual growth rate exceeding 130%. However, due to surging computing costs, its loss in the first half of 2025 reached 2.358 billion RMB. Meanwhile, Zhipu released and open-sourced the GLM-4.7 model, which ranked first among open-source models in the Code Arena, surpassing GPT-5.2 and demonstrating strong technical iteration capabilities. This IPO is not just a financing round; it is a “benchmark” event that lets the market anchor the value of pure-play AI model companies. (Source: 36Kr, Market Value Crystal)

OpenAI Releases Christmas Edition Codex, Deepening “Agentic Programming”: During the Christmas season, OpenAI launched GPT-5.2-Codex-XMas. This model features personalized upgrades while maintaining GPT-5.2 performance and offers double usage limits for subscribers. This update is more than holiday marketing; it reflects OpenAI’s strategic shift to position Codex as an “Engineering Agent,” emphasizing long-context understanding, cross-file task processing, and Windows-native toolchain optimization. Developers have found its completion level in complex projects superior to most competing models, signaling an evolution from “AI writing code” to “AI managing engineering” by 2026. (Source: Xinzhiyuan, op7418)

Axiom Math: Defining New “Acceptance” Standards for AI Reasoning: Axiom Math, founded by 24-year-old Stanford dropout Carina Hong, raised $64 million at a $300 million valuation. The company is dedicated to developing “AI Mathematicians” capable of autonomously verifying logical correctness. The core breakthrough lies in introducing the Lean programming language, enabling formal proof for every step of AI reasoning and solving the trust issue of “accepting” AI model results. In the Putnam Mathematical Competition, its system autonomously solved 9 difficult problems, all of which passed verification. This progress means AI is shifting from vague “answer generation” to rigorous “logical self-proof,” becoming a reliable collaborator in scientific research and industry. (Source: AI Deep Researcher)
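To illustrate what “formal proof for every step” means in practice, here is a minimal Lean 4 sketch. It is not taken from Axiom Math’s system; the theorem names are hypothetical and the statements are deliberately trivial. The point is only that each claim is accepted when Lean’s checker verifies the proof term, not because the text looks plausible.

```lean
-- A statement paired with a proof that Lean checks mechanically.
-- `rfl` succeeds because `n + 0` reduces to `n` by definition.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl

-- A concrete arithmetic claim, settled by a decision procedure.
theorem two_plus_two : 2 + 2 = 4 := by decide
```

If either proof were wrong, the file would fail to compile, which is the sense in which a result either “passes verification” or does not.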

NVIDIA Reaches Technology Licensing Agreement with Groq to Address Compute and Memory Bottlenecks: Facing skyrocketing HBM prices and capacity shortages, NVIDIA has entered into a non-exclusive technology licensing agreement with Groq. Groq’s founder and core team will join NVIDIA to assist in integrating its inference technology. Groq’s LPU architecture utilizes SRAM as primary memory, offering bandwidth several times higher than traditional HBM, significantly alleviating memory access bottlenecks during inference. This move is seen as NVIDIA opening a second front amid the “memory famine,” aiming to explore new memory technology paths to hedge against DRAM supply chain risks and consolidate its dominance in the AI inference market. (Source: Machine Heart, op7418)

🎯 Trends

Meta Introduces Self-play SWE-RL for Agent Self-Evolution: Meta’s research team released the SSR framework, allowing software engineering agents to undergo self-training through a “Bug Injection” and “Bug Fixing” self-play mechanism without human labeling. This method only requires access to a source code sandbox, where the agent continuously generates high-quality problems and solves them autonomously in a self-driven evolution loop. Experiments show that SSR performance improves continuously during training and outperforms baseline Reinforcement Learning (RL) methods. This marks a step toward “Superintelligence” for AI Agents, with the potential to surpass human capabilities in system understanding and autonomous software creation. (Source: Academic Headlines)

Liquid AI Releases Strongest 3B-Class Model with Significant RL Results: Liquid AI launched the experimental model LFM2-2.6B-Exp, built entirely through Reinforcement Learning, showing excellent performance in instruction following, knowledge, and math benchmarks. Its IFBench score even surpassed that of DeepSeek R1, a model 263 times larger. Community feedback indicates the model possesses “PhD-level knowledge” and runs smoothly on edge devices like iPhones. This progress shows once again that, through efficient algorithm design and RL optimization, small-parameter models can match top-tier models in specific domains. (Source: maximelabonne, huggingface)

ChatGPT Android Code Leak Reveals Ad Placement Plans: Developers discovered ad-related strings such as “ads feature” and “search ad” in the ChatGPT Android beta code. Although Sam Altman previously stated that ads were a “last resort,” OpenAI is clearly preparing to monetize free users under the financial pressure of $2.5 billion in spending in the first half of 2025. Planned ad formats may include sidebar sponsored messages or “conversational recommendations,” aiming to achieve intent-oriented monetization without interrupting natural dialogue. This suggests that AI search will move away from the “absolutely pure” era. (Source: Facing AI)

NVIDIA Releases Nemotron 3 Series, Focusing on Long Context and Agent Capabilities: NVIDIA introduced the Nemotron 3 family (Nano, Super, Ultra), utilizing a hybrid Mamba-Transformer architecture and Mixture-of-Experts (MoE) technology. This series supports context lengths up to 1M tokens and features post-training optimization for Agent reasoning and multi-step tool calling. The Nano version leads in accuracy among similar small models with extremely low inference costs, while the Ultra version pursues SOTA-level reasoning performance. NVIDIA committed to open-sourcing model weights, training software, and recipes to further enrich the open-source ecosystem. (Source: Reddit)

SAM 3: Evolving from Clicking Pixels to “Naming Concepts”: Meta released SAM 3 (Segment Anything with Concepts), upgrading video segmentation from “point-and-click” to “concept recognition.” Users can simply input “person wearing glasses,” and the model will automatically locate all matching objects in the image or video. Through automated training on 4 million unique concepts, SAM 3’s accuracy on the complex video benchmark MOSEv2 increased from 47.9% to 60.3%. This breakthrough significantly enhances AI’s semantic understanding of the visual world, solving occlusion and consistency challenges in video segmentation. (Source: ylecun)

🧰 Tools

GAIT and GaitHub: A “Git” Version Control System for AI Reasoning: Addressing the pain points of untraceable and irreproducible AI decisions, developers launched GAIT. This system treats AI interactions as content-addressed objects, covering user intent, model responses, reasoning branches, and memory states. With GAIT, developers can version control, branch experiments, and merge decisions for the AI reasoning process just like managing code. The accompanying GaitHub cloud platform supports collaboration and auditing, providing necessary engineering infrastructure for enterprise-level AI workflows and solving the black-box problem of “why the AI decided this.” (Source: Reddit)

DeepFabric: A Tool-Calling Fine-Tuning Framework for Specific MCP Services: DeepFabric is an open-source tool that allows developers to automatically generate domain-specific reasoning datasets for any MCP server or toolset. By executing real tool trajectories in an isolated WebAssembly environment, the framework can fine-tune small models like Qwen3-4B to outperform Claude 4.5 and Gemini 2.5 in specific tasks (such as Blender control). This provides a clear path for building high-performance, low-cost vertical domain expert Agents. (Source: Reddit)

Quint: Saying Goodbye to CLI, Introducing Interactive UI for Chatbots: Quint is a React library designed to shift LLM-driven interactions from pure text to structured, deterministic UI. It allows developers to define explicit options that users can click to trigger specific information displays or structured inputs. The core concept is to separate model reception, user visuals, and output rendering, making interactions in scenarios like MCQs and role-playing branches more controllable. Quint is independent of specific AI providers, signaling a future where LLMs directly render dynamic UI components to enhance user experience. (Source: Reddit)

📚 Learning

Hugging Face Releases Series of Free AI Courses: Hugging Face launched a matrix of free courses covering the latest AI technologies during the holidays. Content includes: a Robotics course for building robots using LeRobot, an MCP course for learning Model Context Protocol, an Agents course for building and deploying agents, as well as in-depth technical tutorials on LLMs, Deep Reinforcement Learning, and Diffusion Models. These courses leverage the HF ecosystem to help developers quickly master practical skills from foundation models to cutting-edge Agent architectures. (Source: huggingface)

WildVideo: First Benchmark Systematically Categorizing Video QA Hallucinations: A team from the National University of Defense Technology and Sun Yat-sen University released the WildVideo benchmark, which targets “hallucination” issues in multimodal video interaction and defines 9 task types spanning perception, cognition, and contextual understanding. Experiments show that even GPT-4o reaches only 52.7% accuracy in multi-turn tasks and performs poorly on first-person perspective videos. This benchmark provides precise tools for diagnosing model defects in dynamic perception, deep reasoning, and long-dialogue consistency, pushing video understanding evaluation toward real interaction. (Source: Xinzhiyuan)

PhononBench: A New Yardstick for Evaluating AI-Generated Crystal Stability: PhononBench is the first large-scale benchmark for the dynamical stability of AI-generated crystals. Using the MatterSim interatomic potential, it efficiently evaluated over 100,000 structures produced by six leading generative models. The results reveal a common limitation in current models: on average, only 25.83% of generated structures are stable. This work not only highlights the shortcomings of generative models in physical feasibility but also filters out 28,000 phonon-stable crystal structures, providing a reliable candidate pool for future new material exploration. (Source: HuggingFace)

💼 Business

AI Giants’ $120 Billion “Ghost Debt” Raises Concerns: Tech giants like Meta, xAI, and Oracle are moving over $120 billion in data center spending off their balance sheets through Special Purpose Vehicles (SPVs). While this off-balance-sheet financing model protects corporate credit ratings, it also hides significant financial risks. If AI demand falls short of expectations, the massive debt could trigger a chain reaction on Wall Street. UBS data shows that about $125 billion flowed into such “project financing” this year, reflecting that the AI arms race has entered a high-risk stage of capital gambling. (Source: Cailianshe)

Indian “AI Meme Stock” That Surged 550x Is Revealed to Have No Chip Business: India’s RRP Semiconductor Ltd saw its stock price skyrocket 55,000% over the past 20 months, with its market value soaring to $1.7 billion, a growth rate that outpaced even NVIDIA’s. However, investigations revealed that the company has only 2 full-time employees, has conducted no semiconductor manufacturing, and has even reported negative revenue. This absurd phenomenon reflects both the blind pursuit of AI concepts by Indian retail investors and regulatory loopholes, making it a typical cautionary tale of speculation in the 2025 AI bubble. (Source: Xinzhiyuan)

AI Compute Demand Pushes 256GB RAM Price Above RTX 5090: As giants like OpenAI lock up 40% of the global DRAM supply, the memory market is experiencing structural shortages. The market price for a single 256GB DDR5 memory stick has surged to $3,500-$5,000, far exceeding top-tier graphics cards. This phenomenon reflects how high bids from AI servers for HBM and high-performance memory are “hijacking” consumer-grade capacity. The AI PC concept’s inflexible demand for large memory raises the bar further, leaving average consumers facing sharply higher hardware costs because of the AI premium. (Source: Machine Heart)

🌟 Community

2025 AI Buzzwords: From “Vibe Coding” to “Slop”: MIT Technology Review selected the AI words of the year, with “Vibe Coding” topping the list, emphasizing that humans only need to express goals while AI handles implementation. “Reasoning Models” and “World Models” reflect growing technical depth, while “Slop” (AI-generated junk content) and “Bubble” capture the community’s unease about content flooding and capital overheating. Additionally, “GEO” (Generative Engine Optimization) is replacing SEO as the new battlefield for brands to acquire traffic in the AI era. (Source: Tencent Tech, Silicon Star GenAI)

Yann LeCun Retweets: “Seven Rifts” in Judgment Between Humans and LLMs: A paper compared the judgment differences between humans and LLMs across seven cognitive stages, pointing out fundamental flaws in LLMs regarding perceptual anchoring, motivational guidance, causal reasoning, and metacognition. Although LLM-generated language is fluent and deceptive, its essence is probabilistic prediction rather than “mind.” Community discussions suggest that this “sense of AI intelligence” is highly misleading without verification, as humans often over-trust AI outputs due to “credibility bias,” constituting a structural challenge in the AI era. (Source: ylecun)

Reddit Discussion: Using ChatGPT as a Cognitive Rehabilitation Tool: A user with a history of PTSD shared their experience using ChatGPT for structured cognitive support. Through long-term interactive dialogue, the user made significant progress in emotional regulation, logical organization, and self-advocacy, progress that their clinicians recognized. The community responded strongly, focusing on how AI serves as a “consistency mirror” to assist psychological recovery, while also warning against over-reliance and the potential “echo” effect of misleading AI outputs. (Source: Reddit)

💡 Others

Alzheimer’s Disease Animal Experiment Achieves Complete Reversal: A research team at Case Western Reserve University published a breakthrough in Cell Reports Medicine, achieving complete recovery of neurological function in late-stage Alzheimer’s mice by repairing NAD+ balance in the brain with the compound P7C3-A20. Unlike blind NAD+ supplementation, this therapy focuses on precise regulation, not only repairing pathological damage but also restoring memory. While human application is still some time away, it opens the door to a “complete cure” for Alzheimer’s. (Source: dotey)

Stardust Intelligence Cable-Driven Robot Starts Selling Blind Boxes: On Christmas Day, the S1 cable-driven humanoid robot developed by Stardust Intelligence officially “started work” in commercial districts across Beijing, Shanghai, and Guangzhou, responsible for voice reception, grabbing blind boxes, and delivering goods. Cable-driven technology gives the robot flexibility and fine force control similar to human muscles, making it safer and more responsive in human-robot interaction. The company’s “Avatar Intelligence” concept aims to let robots enter toxic labs or remote service scenarios through teleoperation first. (Source: Intelligent Emergence)

AI Music Hit “Seven-Day Lover” Triggers Copyright and Attention Battle: “Seven-Day Lover,” generated by a programmer using DeepSeek and AI music tools, surpassed 2 million plays on NetEase Cloud Music, with copyrights sold for tens of thousands of RMB. This event proves that AI music already possesses real monetization capabilities and is impacting the traditional copyright system. ByteDance’s Qishui Music defines hit paths through the Douyin ecosystem, while Tencent and NetEase maintain strict control over auditing and revenue distribution. The “infinite supply” brought by AI is forcing platforms to shift from copyright races to battles over attention distribution efficiency. (Source: Shixiang)
