AI Daily - 2026-01-09 (Morning)

Keywords: DeepSeek R1, AI training, Reinforcement Learning (RL), Process Reward Model (PRM)

🔥 Focus

DeepSeek R1 technical report expanded to 86 pages, revealing training details: DeepSeek has quietly updated its R1 technical report, expanding it from 22 pages to 86 and turning it into something close to a reproducible “textbook.” For the first time, the report discloses how the model evolved across three training stages (Dev1/2/3), breaks down its extremely low training cost of $294,000, and reviews failed attempts such as MCTS and Process Reward Models (PRM). Beyond showcasing DeepSeek’s depth in Reinforcement Learning (RL), the detailed appendix parameters give the open-source community evidence that pure RL-driven reasoning models are both feasible and highly cost-efficient. This “transparency” strategy is forcing closed-source giants to re-examine their technical barriers. (Source: _akhaliq, karminski3, QbitAI)

MiniMax and Zhipu AI Hong Kong IPOs kick off the “Shanghai/Beijing Moment” for Large Models: China’s leading AI companies, MiniMax and Zhipu AI, have listed one after the other on the Hong Kong Stock Exchange, marking the point at which China’s AGI industry begins to be tested by the secondary market. MiniMax’s stock price surged over 100% on its first day of trading, taking its market value above HK$100 billion; investors favor its global footprint, with over 70% of revenue coming from overseas. Zhipu AI reported 25x growth in its MaaS business over 10 months. The two listings bring generous returns to early investors and, through the Chapter 18C regime, give later AI unicorns a financing path they can follow. They also demonstrate the value, in global competition, of Chinese companies that build their own base models. (Source: Zai_org, 36Kr)

CES 2026 Physical AI Explosion: From Screens to the Real World: This year’s CES has pivoted fully to the theme of “Physical AI,” which NVIDIA’s Jensen Huang called the “ChatGPT moment for Physical AI.” Boston Dynamics’ Atlas made its first public stage appearance, along with the announcement that it will go to work in Hyundai factories; LG released CLOiD, a household robot that can fold clothes; and Lenovo introduced Qira, a personal AI super-agent. The Chinese supply chain stood out, with over 20 robotics companies exhibiting mass-production capabilities that range from dexterous hands to full-sized humanoid robots. AI is no longer just a chat box: through sensors and actuators it is acting directly on the physical world and restructuring traditional industry chains from home appliances and PCs to automobiles. (Source: TheRundownAI, LeiTech)

OpenAI launches Healthcare segment to enter medical vertical: OpenAI has officially launched the ChatGPT Health experience, which supports HIPAA compliance and comes with partnerships with top medical institutions such as the Mayo Clinic and Boston Children’s Hospital. The feature lets users connect Electronic Health Records (EHR) and Apple Health data, then uses AI to help analyze lab reports and draw up health plans. Jokingly called the “American version of Ant Health,” it reflects the trend of large models moving from general-purpose use into specialized verticals. Medical AI is evolving from simple Q&A into professional assistants that integrate multi-source data and support clinical decisions, though safety and misdiagnosis risks remain the community’s main concerns. (Source: _samirism, openai)

🎯 Trends

Google DeepMind proposes “Nested Learning” (NL) framework: To address Transformers’ lack of continual learning ability and their tendency toward “catastrophic forgetting,” the DeepMind team drew on human associative memory to propose the Nested Learning (NL) framework. The framework treats the optimizer as part of the “context” of the model architecture. Nested modules that update at different frequencies let the model build abstract structures while it runs and consolidate short-term experience into long-term knowledge. This is seen as a key step toward AGI, one that could let models evolve on their own in dynamic environments, as humans do, rather than relying on expensive retraining. (Source: hardmaru, QbitAI)

Alibaba releases Qwen3-VL-Embedding and Reranker models: Alibaba’s Qwen team has launched a pair of multimodal retrieval models that aim to place text, images, videos, and mixed-modality inputs in a single vector space. Qwen3-VL-Embedding supports over 30 languages and achieves SOTA performance on multimodal retrieval benchmarks; the Reranker then improves retrieval accuracy through fine-grained relevance scoring. The release brings RAG (Retrieval-Augmented Generation) into the omni-modal era and provides core infrastructure for more complex visual Q&A, video search, and multimodal Agents. (Source: huggingface, _akhaliq)
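
The two-stage flow described above (embed and retrieve, then rerank) can be sketched roughly as follows. This is a minimal illustration, not the Qwen API: the toy vectors, the document IDs, and the `toy_score` reranker are hypothetical stand-ins for real model outputs.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, top_k):
    """Stage 1: rank every indexed item by similarity and keep the top_k."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

def rerank(query, candidates, score_fn):
    """Stage 2: re-order the shortlist with a finer-grained relevance score."""
    return sorted(candidates,
                  key=lambda doc_id: score_fn(query, doc_id),
                  reverse=True)

# Hypothetical index: an embedding model would map text, images, and video
# into the same space; these 3-d vectors are made up for illustration.
index = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_video": [0.7, 0.3, 0.1],
    "tax_form": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.0]

# Hypothetical reranker scores, standing in for a cross-modal reranker model.
toy_scores = {"cat_photo": 0.4, "dog_video": 0.9, "tax_form": 0.1}

def toy_score(query, doc_id):
    return toy_scores[doc_id]

shortlist = retrieve(query_vec, index, top_k=2)
final = rerank("a dog playing", shortlist, toy_score)
print(shortlist, final)
```

The cheap vector comparison narrows the whole index to a shortlist, and the more expensive scoring step only has to look at that shortlist.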

a16z founder looks ahead to 2026: Intelligence cost deflation will drive demand explosion: Marc Andreessen pointed out that the unit cost of AI is falling faster than Moore’s Law, and that intelligence is turning from a luxury into a utility like water and electricity. He predicts the future market will take a “pyramid structure”: a few super models at the top and ubiquitous small edge-side models at the bottom. He also believes startups are shaking off the “wrapper” label by “backward integrating” into self-developed models, and that AI business models will shift from pay-per-token to pricing based on the value created. (Source: nvidia, WallStreetCN)

Smart cockpit voice large models accelerate “onboarding”: At CES, StepFun and Geely Galaxy demonstrated a cockpit built on an end-to-end voice large model, with emotion recognition and long-term memory. Industry observers expect 2026 to be the first year that entry-level Agents reach mass production in automotive cockpits. Cockpits are shifting from simple voice control to a “third space” that acts proactively and offers personalized services. Cloud-edge collaborative AI architectures will become the core battleground for car companies, which aim to integrate AI deeply into the OS layer and merge the experience across vehicle domains. (Source: dotey, CLS.cn)

🧰 Tools

Claude Code and code-simplifier plugin released: Anthropic’s command-line tool, Claude Code, has gone viral in the developer community thanks to its well-engineered feel. The official team recently released the code-simplifier agent plugin, which supports one-click simplification of complex codebases. Its core philosophy is “file system as context”: it loads the files it needs on demand instead of stacking tokens into the prompt, which makes large repositories much more efficient to handle. Community feedback suggests it outperforms GPT-4o at logical understanding and at reducing “code verbosity.” (Source: dotey, natolambert)

Ralph Mode: Continuous loops and memory enhancement for Agents: LangChain OSS has introduced native Skills and Memory support for the DeepAgents library via Ralph Mode. The mode lets Agents run tasks in an open-ended loop, backed by the file system and Git, while continuously updating their knowledge base through a “skill-based” learning process. With this design, Agents can correct themselves and accumulate experience, offering a new paradigm for autonomous software development and complex long-horizon tasks. (Source: Vtrivedy10, hwchase17)

Pico AI Server: Local private ChatGPT on Mac: For privacy-sensitive users, Pico AI Server runs GPT-oss entirely locally on Apple Silicon. Optimized with the MLX framework, the tool gives Mac users with 24GB+ RAM a smooth local inference experience. It reflects the trend of AI computing power migrating to the edge: users no longer need to upload sensitive data to the cloud to get high-performance dialogue and programming assistance. (Source: awnihannun)

LFM2.5 1.2B: An exceptional small model for Agents: LiquidAI released the LFM2.5 1.2B Instruct model, which performs remarkably well for its size class and is specifically optimized for Agent tasks, data extraction, and RAG. It is not recommended for knowledge-heavy tasks, but its inference speed in local environments such as LM Studio is very fast (up to 41 tokens per second), making it a strong choice for lightweight AI assistants and tool-calling workflows. (Source: Reddit r/LocalLLaMA)

📚 Learning

Tsinghua team’s DrugCLIP featured in Science: AI speeds up drug screening by millions of times: A joint research team from Tsinghua University proposed the DrugCLIP framework, which recasts virtual screening as a dense retrieval task. By mapping protein binding pockets and small molecules into a shared vector space, the framework can complete 10 trillion calculations in just 24 hours on 8 A100 GPUs, a screening speed 10 million times faster than traditional methods. The breakthrough opens a new paradigm for drug R&D in the post-AlphaFold era and significantly lowers the threshold for ultra-large-scale drug discovery. (Source: 36Kr)
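
To put the figures quoted above in perspective, the arithmetic below converts them into a throughput. It assumes each of the 10 trillion “calculations” is a single pocket-molecule score and that the work is spread evenly across the 8 GPUs; neither detail is stated in the summary. The `score` function and its toy vectors are hypothetical and only illustrate why dense retrieval is cheap per pair.

```python
# Figures taken from the summary above.
total_scores = 10 * 10**12      # 10 trillion calculations
hours = 24
gpus = 8

seconds = hours * 3600                          # 86,400 seconds in 24 hours
scores_per_second = total_scores / seconds      # across all 8 GPUs
scores_per_gpu_second = scores_per_second / gpus

print(f"{scores_per_second:,.0f} scores/s overall")
print(f"{scores_per_gpu_second:,.0f} scores/s per GPU")

def score(pocket_vec, molecule_vec):
    """In dense retrieval, scoring one pair is just a dot product of two
    precomputed embeddings (hypothetical vectors here)."""
    return sum(p * m for p, m in zip(pocket_vec, molecule_vec))

example = score([0.5, 0.5], [1.0, 0.0])
```

Under these assumptions the setup would sustain on the order of 10^8 scores per second overall, which is only plausible because each score reduces to a vector operation on embeddings computed once in advance.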

Sakana AI releases Digital Red Queen (DRQ) research: This study simulates LLM-driven adversarial evolution within a Core War programming game sandbox. By having LLM-written Redcode programs compete continuously, researchers observed “convergent evolution” similar to that in the biological world: programs under different initial conditions eventually evolved similar efficient survival strategies (such as self-replication and data bombs). This work provides a safe and controlled experimental environment for studying adversarial dynamics and cybersecurity evolution in artificial systems. (Source: hardmaru, SakanaAILabs)

MAMF Explorer: Insights into real GPU matrix multiplication performance: Developer Aflah launched the MAMF Explorer tool, which gives researchers data on the peak matmul FLOPS actually achievable on various hardware, rather than the theoretical peaks advertised by manufacturers. The data is highly practical for planning compute allocation in large-scale model training and inference, and helps developers identify real performance bottlenecks on chips such as Blackwell and H100. (Source: StasBekman, charles_irl)
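
The gap between achieved and advertised FLOPS can be illustrated with a small timing sketch. This is not how MAMF Explorer itself works: it is a pure-Python stand-in that times a naive matrix multiply on the CPU, and the function names are hypothetical. On real hardware the same formula would be applied to a timed GPU kernel.

```python
import time

def matmul(a, b):
    """Naive pure-Python matrix multiply, used only as a stand-in workload."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]

def matmul_flops(m, n, k):
    """An (m x k) @ (k x n) product is counted, by convention, as
    2*m*n*k floating-point operations (one multiply and one add each)."""
    return 2 * m * n * k

def achieved_flops_per_second(m, n, k, seconds):
    return matmul_flops(m, n, k) / seconds

def utilization(achieved, theoretical_peak):
    """Fraction of the advertised peak that was actually reached."""
    return achieved / theoretical_peak

m = n = k = 64
a = [[1.0] * k for _ in range(m)]
b = [[1.0] * n for _ in range(k)]

# Take the best of several runs to reduce timing noise.
best = float("inf")
for _ in range(3):
    start = time.perf_counter()
    c = matmul(a, b)
    best = min(best, time.perf_counter() - start)

measured = achieved_flops_per_second(m, n, k, best)
print(f"measured: {measured:,.0f} FLOP/s")
```

Dividing `measured` by a vendor’s quoted peak via `utilization` gives the efficiency figure that matters when budgeting compute.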

💼 Business

Anthropic valuation may reach $350 billion, ARR growing rapidly: Anthropic is reportedly planning to raise $10 billion, with its valuation doubling within six months. Its 2025 revenue has reached $900 million, with a goal to exceed $2 billion in 2026. In contrast to OpenAI and its internal friction, Anthropic is becoming the preferred choice for the enterprise market thanks to its stable team and its standout products in the developer market (such as Claude Code), and some even think it could overtake OpenAI on the road to an IPO. (Source: 36Kr, srimuppidi)

Tailwind layoffs spark reflection on AI’s impact on traditional SaaS models: The well-known CSS framework Tailwind announced a 75% layoff, citing the collapse of its business model as AI programming tools became popular. Tailwind’s usage is still rising, but users who generate code with AI rely less on its paid components. The event is a warning to every software company whose value rests on “labor/template” products: when AI can generate an implementation with one click, the barriers that protected paid knowledge products are crumbling. (Source: jon_stokes, imjaredz)

JD.com establishes “Chameleon Business Department” to accelerate Embodied AI implementation: JD.com has upgraded its original Chameleon project into a business department that takes full charge of the JoyAI App and the embodied intelligence brand JoyInside. The department focuses on combining AI hardware and software and has already connected with over 40 robotics and AI toy brands. The move shows the e-commerce giant using its deep supply chain advantages to build a closed commercial loop, from R&D to sales, in AI toys and industrial robots. (Source: 36Kr)

🌟 Community

Linus Torvalds slams debate over “AI junk code” norms: Responding to discussions in the Linux kernel community about whether to set norms for AI-generated code, Linus bluntly called the debate “stupid.” He argues that documentation only constrains people who follow the rules, while those submitting “AI junk code” won’t bother to label it. He insists on treating AI as a tool and says the kernel’s immunity should come from code review and community culture, not from documentation that amounts to posturing. (Source: 36Kr)

“Karpathy Effect” triggers collective anxiety among programmers: Andrej Karpathy lamented that the programming profession is being drastically restructured, with the bits contributed by developers becoming increasingly sparse. The community has dubbed this the “Karpathy Effect”: even senior engineers feel an unprecedented sense of falling behind. Discussions suggest that core competitiveness will shift from “writing code” to “understanding system complexity”; vibe coding is turning 10x engineers into 100x engineers, but it is also raising the barrier for beginners. (Source: dejavucoder, arohan)

MTurk data quality faces “existential crisis” due to AI participation: The latest research shows that data quality on crowdsourcing platforms such as Amazon Mechanical Turk has plummeted: 96% of contradictory items show a positive correlation in their labels, indicating that many workers are using LLMs to rush through tasks. This is fatal for behavioral science and model fine-tuning, both of which depend on high-quality human labeling; the community is calling for data collection networks built on identity verification. (Source: random_walker)
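
The consistency check described above can be sketched as follows. If two survey statements contradict each other, attentive respondents who agree with one should disagree with the other, so their ratings should correlate negatively. The ratings below are made-up 1-5 agreement scores, and the helper names are hypothetical; the study’s actual procedure may differ.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def share_positive(pairs):
    """Fraction of contradictory item pairs whose ratings correlate positively."""
    positive = sum(1 for a, b in pairs if pearson(a, b) > 0)
    return positive / len(pairs)

# Hypothetical ratings from five respondents on two contradictory statements.
attentive_a = [5, 4, 2, 1, 3]   # e.g. agreement with one statement
attentive_b = [1, 2, 4, 5, 3]   # agreement with its opposite
careless_a = [4, 5, 3, 4, 2]
careless_b = [4, 5, 3, 5, 2]    # agrees with both statements alike

r_attentive = pearson(attentive_a, attentive_b)
r_careless = pearson(careless_a, careless_b)
flagged = share_positive([(attentive_a, attentive_b),
                          (careless_a, careless_b)])
print(r_attentive, r_careless, flagged)
```

A high share of positively correlated contradictory pairs, like the 96% reported above, is the signal that responses are not coming from attentive humans.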

💡 Others

NO FAKES Act legal terms spark concern in open-source community: The bill’s definition of liability for “digital replica rights” is said to contain traps. If a developer releases a TTS or voice cloning model and others use it to create fake celebrity videos, the developer could face massive joint liability. The community fears that publishing audio models on platforms like Hugging Face will become “legal suicide” for developers, stifling innovation in open-source audio technology. (Source: Reddit r/LocalLLaMA)

ICML 2026 introduces “collective responsibility” rules to combat academic cheating: To crack down on “salami slicing” submissions and AI-generated spam, ICML announced that if a paper is found to involve fraud, every submission bearing the name of any of its co-authors may be rejected outright. This “guilt by association” mechanism pushes heads of research groups to oversee quality personally. The conference also allows conditional use of AI in peer review, but only with the authors’ consent. (Source: 36Kr)

Stanford paper confirms LLMs extensively memorize copyrighted data: The research shows that Claude 3.7 Sonnet can reproduce 95.8% of Harry Potter content verbatim, with Gemini and Grok following closely. The result strongly refutes the claim that “models do not store training data” and shows that existing safety filters remain fragile against specific prompts. The finding will provide key evidence for future AI copyright litigation. (Source: stanfordnlp, andykonwinski)
