AI Daily - 2026-01-03 (Morning)

Keywords: Meta Llama 4, DeepSeek mHC, OpenAI Gumdrop, Llama 4 benchmark manipulation, Manifold-Constrained Hyper-Connections, AI stylus hardware design

🔥 Focus

Meta Llama 4 Benchmark Manipulation Confirmed: LeCun’s Departure Reveals Inner Workings: Turing Award winner Yann LeCun, on leaving Meta, publicly admitted that the base model test results for Llama 4 were “fudged”: differently optimized models were used for different test tasks to achieve high scores. The revelation has caused a major stir in the open-source community and exposes the “benchmark anxiety” among tech giants in the base model race. LeCun also said that Mark Zuckerberg, disappointed with Llama 4’s performance, sidelined the original generative AI team and instead invested heavily in Scale AI. This marks a significant shift in Meta’s AI roadmap, from a research-driven approach to a more aggressive commercial and engineering-driven one. (Sources: Financial Times, Slashdot)

DeepSeek Releases mHC Architecture: Challenging the Decade-Old Residual Connection Tradition: DeepSeek has proposed the “Manifold-Constrained Hyper-Connections” (mHC) architecture, an attempt to move beyond the residual connection paradigm that has dominated deep learning since ResNet in 2015. Hyper-Connections generalize the residual path with learnable mixing matrices, but left unconstrained they amplify signals and destabilize training in deep networks. mHC constrains these matrices to be doubly stochastic (non-negative, with every row and column summing to 1), reducing signal gain from 3000x to 1.6x and significantly improving training stability and model performance. The work shows the ambition of Chinese AI labs to innovate at the architecture level, moving beyond mere scaling to exploit the optimization potential of macro-architecture design. (Sources: arXiv, Reddit)
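To make the doubly stochastic constraint concrete, here is a minimal pure-Python sketch. It assumes Sinkhorn-Knopp normalization (alternately rescaling rows and columns) as the way to reach a doubly stochastic matrix, and it compares the worst-case gain of a stack of random positive mixing matrices before and after that normalization. The function names, matrix size, and depth are all hypothetical choices for illustration; this is not DeepSeek's implementation and does not reproduce the figures quoted above.

```python
import random


def sinkhorn_knopp(matrix, iterations=100):
    """Alternately normalize rows and columns of a positive square matrix."""
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        # Make every row sum to 1.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [v / row_sum for v in m[i]]
        # Make every column sum to 1.
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
    return m


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def stacked_gain(layers):
    """Largest absolute row sum of the product of all layer matrices."""
    product = layers[0]
    for layer in layers[1:]:
        product = matmul(layer, product)
    return max(sum(abs(v) for v in row) for row in product)


random.seed(0)
n, depth = 4, 30  # hypothetical stream width and number of layers
raw = [[[random.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
       for _ in range(depth)]
constrained = [sinkhorn_knopp(m) for m in raw]

unconstrained_gain = stacked_gain(raw)        # grows rapidly with depth
constrained_gain = stacked_gain(constrained)  # stays close to 1
print(f"unconstrained: {unconstrained_gain:.3e}, constrained: {constrained_gain:.6f}")
```

The design point is that a product of doubly stochastic matrices is itself doubly stochastic, so the gain of the stacked mapping cannot compound with depth the way it does for unconstrained matrices.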

OpenAI Hardware Project “Gumdrop” Exposed: Jony Ive Crafting an AI Pen: Supply chain sources reveal that OpenAI’s hardware project with former Apple design chief Jony Ive is codenamed “Gumdrop” and is confirmed to be an AI pen with ambient sensing capabilities. The device abandons traditional screen interaction in favor of voice and haptics. Its design philosophy pursues minimalism and “focus,” targeting deep-work scenarios that phones and computers do not cover. The move reflects OpenAI’s attempt to establish a native AI interaction entry point through hardware, using next-generation audio models to deliver a more natural “intelligent companion” experience. (Sources: APPSO, The Information)

Andrew Ng Proposes “Turing-AGI Test”: Replacing Conversational Deception with Economic Value: Addressing the over-hype of the AGI concept, Andrew Ng proposed the “Turing-AGI Test” in his 2026 New Year special issue. This test no longer focuses on whether AI can deceive humans, but rather evaluates whether it can complete multi-day work tasks with economic value using computers and the internet, much like a skilled remote employee. The core of this view is to pull AGI back from illusory intelligence metrics to a pragmatic productivity dimension, aiming to calibrate social expectations of AI through more rigorous and practical standards to avoid investment bubbles. (Source: DeepLearning.AI)

🎯 Trends

Rise of Recursive Language Models (RLM): A New Trend for 2026: Researchers including Alex Zhang from Stanford University have proposed Recursive Language Models, suggesting that 2026 will see a leap from reasoning models to recursive models. The core idea is to have the model treat its own prompt as an object in an external environment, which it inspects and manipulates by writing code that can recursively call the model itself. This method can increase the context processing capability of LLMs by several orders of magnitude, giving models stronger long-range task planning and self-correction abilities. The community generally believes that this “Bitter Lesson” style of inference-side scaling will be one of the key paths to achieving AGI. (Sources: arXiv, Stanford NLP)
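The recursive pattern can be illustrated with a toy sketch. Here the long input is held as an ordinary Python variable, code checks its size and splits it, and a sub-call is made on each piece that fits the window. Everything in it is an assumption made for illustration: `toy_model` is a stand-in for a real LLM call, `CONTEXT_LIMIT` is an arbitrary window size, and the splitting logic is hard-coded, whereas in an RLM the model itself would write that code.

```python
CONTEXT_LIMIT = 200  # characters the toy model can read at once (hypothetical)


def toy_model(query, context):
    """Stand-in for an LLM call: returns the lines of `context` mentioning `query`."""
    if len(context) > CONTEXT_LIMIT:
        raise ValueError("context exceeds the model's window")
    return [line for line in context.splitlines() if query in line]


def recursive_call(query, context):
    """Answer over an arbitrarily long context by splitting it and recursing."""
    if len(context) <= CONTEXT_LIMIT:
        return toy_model(query, context)
    lines = context.splitlines()
    if len(lines) < 2:
        # A single oversized line: fall back to cutting it in half by characters.
        mid = len(context) // 2
        halves = [context[:mid], context[mid:]]
    else:
        mid = len(lines) // 2
        halves = ["\n".join(lines[:mid]), "\n".join(lines[mid:])]
    results = []
    for half in halves:
        results.extend(recursive_call(query, half))
    return results


# A document far larger than the window, with one relevant line at the end.
document = "\n".join(f"log entry {i}: nothing to report" for i in range(500))
document += "\nlog entry 500: the SECRET code is 7421"

answer = recursive_call("SECRET", document)
print(answer)
```

A direct `toy_model("SECRET", document)` call fails because the document exceeds the window, while the recursive version only ever passes the model pieces it can read.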

Claude Code Explosive Growth: Nearly $1 Billion ARR in 6 Months: Anthropic disclosed that its programming assistant Claude Code reached an Annual Recurring Revenue (ARR) of nearly $1 billion within six months of launch, setting a record for AI programming tools. Claude Code creator Boris Cherny revealed that 100% of his personal code is now written by AI. The key to Claude Code’s success lies in its evolution from “code completion” to a “digital coder,” achieving autonomous development loops through plugins like Ralph Wiggum. This marks the entry of AI programming into the mid-to-back-end infrastructure era, with a significant increase in corporate willingness to pay. (Sources: Xinzhiyuan, Boris Cherny)

Embodied AI Talent War Escalates: Starting Salaries for Fresh Graduates Hit 3 Million RMB: As giants like ByteDance and Huawei push deeper into Embodied AI, top algorithm talent has become a scarce resource. A motion control lead who graduated with a Master’s in 2024 has been offered a 3 million RMB annual salary plus options, while senior experts’ monthly salaries have exceeded 120,000 RMB. Companies are entering “early lock-in” mode, even offering full-time benefits to third-year PhD interns. This irrational exuberance reflects collective industry anxiety on the eve of a technical explosion; talent competition is expected to remain white-hot until the mass production milestone in 2027. (Source: Touzhong.com)

🧰 Tools

Ralph Wiggum Plugin: Enabling Claude to “Work Overtime All Night”: Anthropic officially released the Ralph Wiggum plugin for Claude Code, which uses a Stop hook to intercept the model’s attempt to exit and feeds the prompt back to it. This “self-dialogue” loop lets Claude keep improving code, running tests, and fixing bugs without human intervention until it outputs a “DONE” signal. The autonomous loop greatly improves efficiency in TDD workflows and greenfield projects, shifting the human role from “writer” to “spec definer.” (Sources: GitHub, Jintao Zhang)
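The loop described above can be sketched in a few lines. This is a toy illustration of the control flow only, not the plugin’s code or the Claude Code hook API: `run_until_done`, `make_toy_agent`, and the `max_iterations` safety cap are hypothetical names and assumptions, and the “agent” is a stub that pretends to fix one failing test per round.

```python
def run_until_done(agent_step, prompt, done_signal="DONE", max_iterations=20):
    """Re-feed the same prompt each time the agent stops without the done signal."""
    transcript = []
    for iteration in range(1, max_iterations + 1):
        output = agent_step(prompt, iteration)
        transcript.append(output)
        if done_signal in output:
            return iteration, transcript
    # Safety cap (an assumption of this sketch) so a stuck agent cannot loop forever.
    raise RuntimeError(f"no {done_signal!r} signal after {max_iterations} iterations")


def make_toy_agent(failing_tests):
    """Stub agent: fixes one failing test per call, then reports completion."""
    state = {"failing": failing_tests}

    def agent_step(prompt, iteration):
        if state["failing"] == 0:
            return "All tests pass. DONE"
        state["failing"] -= 1
        return f"Fixed one test, {state['failing']} still failing."

    return agent_step


iterations, transcript = run_until_done(
    make_toy_agent(3),
    "Make the test suite pass; print DONE when finished.",
)
print(iterations)
for line in transcript:
    print(line)
```

With three failing tests the stub needs three fixing rounds plus one final round to report completion, so the loop ends on the fourth iteration.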

LlamaIndex Releases LlamaSheets: The Nemesis of Messy Tables: LlamaIndex has launched LlamaSheets into Beta testing, specifically designed to handle real-world spreadsheets with chaotic layouts, merged cells, and complex headers. The tool automatically identifies regions and extracts them into clean Parquet files, directly interfacing with pandas or DuckDB. It also provides over 40 cell-level metadata features, offering strong support for automated financial statement analysis and complex data cleaning, serving as an important supplement for RAG systems processing unstructured tables. (Source: LlamaIndex)

OpenCode Open-Source Programming Agent: A Strong Competitor to Claude Code: The trending GitHub project OpenCode provides a 100% open-source, vendor-agnostic AI programming agent. It supports Claude, OpenAI, and local models, and its client/server architecture lets users drive development on remote computers from mobile devices. With a TUI optimized for Neovim users and built-in LSP support, it has become a top choice for developers seeking freedom and a premium terminal experience. The project has already garnered over 45,000 stars. (Source: GitHub)

UltraShape-1.0: A New Benchmark for Open-Source 3D Model Generation: Professor Yuan Li’s team at Peking University released UltraShape-1.0, claimed to be the strongest open-source 3D model generator currently available, surpassing Trellis 2 in performance. The project not only open-sources the inference code but also makes data preprocessing and training code public, significantly lowering the barrier to high-quality 3D asset generation. This is of great significance for game development, virtual reality, and the construction of simulation environments for Embodied AI. (Source: GitHub)

📚 Learning

Physics of Language Models Tutorial: Extracting Architecture Principles from Synthetic Data: Dr. Zeyuan Allen-Zhu from FAIR released the “Physics of Language Models” tutorial series. By conducting experiments in a controlled synthetic data “playground,” he derived over 20 architectural principles, explaining why Canon layers are effective and why linear models are weaker than Transformers in reasoning depth. These accessible videos reveal the underlying logic masked by noise during model scaling and are must-watch resources for AI researchers to understand internal model mechanisms. (Source: Zeyuan Allen-Zhu)

OpenAI Grove Program: A Technical “Whampoa Academy” for Early Founders: OpenAI has opened applications for the new phase of the Grove program, a 5-week technical program for early-stage founders. Participants will receive direct guidance from OpenAI’s research and applied teams, hands-on workshops, and early product access. The program aims to help developers explore the frontiers of AI applications in the most talent-dense hardware and software environment, serving as a core channel for developers to enter the OpenAI ecosystem. (Source: OpenAI)

Survey on Self-Evolving Agents: The Path Toward Artificial Super Intelligence: The paper “A Survey on Self-Evolving Agents” is trending in the community, providing a comprehensive overview of how AI agents achieve capability improvements through self-evolution mechanisms. It covers the timing, methods, and challenges of evolution. In the current context of the Agent explosion, understanding how models achieve performance beyond human presets through environmental feedback and self-iteration is crucial for building next-generation autonomous systems. (Source: TheTuringPost)

💼 Business

Zhipu AI and MiniMax Kick Off Hong Kong IPO Wave: China’s “Six Little Dragons” of large models are showing clear divergence, with Zhipu AI and MiniMax being the first to pass the Hong Kong listing hearing. Zhipu focuses on enterprise-facing (B-end) MaaS business, which accounts for over 80% of its revenue, emphasizing technical foundations and industrial empowerment; MiniMax expands globally through consumer (C-end) applications like Talkie/Xingye, with overseas revenue exceeding 70%. The listing of these two companies will provide an important template for domestic large models to transition from “technical narrative” to “commercial monetization.” (Source: XiaGuangShe)

Meta Invests $14 Billion in Scale AI: 28-Year-Old CEO Takes the Reins: Meta announced a massive $14 billion injection into data labeling giant Scale AI and hired its 28-year-old CEO Alexandr Wang to lead Meta’s new AI initiatives. This move directly led to the marginalization and departure of veteran scientists like LeCun. Zuckerberg is attempting to quickly acquire high-quality data resources to reverse the slump in Llama 4’s development, showing that Meta is accelerating to catch up with OpenAI at any cost. (Source: Financial Times)

🌟 Community

OpenAI President Greg Brockman Becomes Trump’s Largest Donor: The community is buzzing over Greg Brockman’s massive donation to a Trump Super PAC. Reddit users reacted strongly, arguing this contradicts OpenAI’s stated values of “benefiting humanity” and “democratic governance,” fearing it will lead to AI regulatory policies favoring specific interest groups. Some users have even launched a boycott by canceling ChatGPT subscriptions, reflecting the significant impact of tech leaders’ political stances on brand credibility. (Source: Reddit r/ChatGPT)

Growing US Public Hostility Toward AI: Anxiety Over Energy, Jobs, and Privacy: The New York Times analyzed why Americans generally harbor hostility toward AI. Reddit discussions pointed out that the core issues are: AI infrastructure (like data centers) driving up local electricity bills and noise; AI resume screening leading to repeated rejections for job seekers; and the lack of universal healthcare, where unemployment means a survival crisis. The public believes the benefits of AI are monopolized by Silicon Valley elites, while the consequences are borne by ordinary people. This cultural resistance has become a major obstacle to technology implementation. (Source: Reddit r/artificial)

Hardware Shortages and Price Hikes: “Austerity” Signals for 2026: Supermicro announced it would stop selling standalone motherboards, selling only full-system servers; ASUS also announced across-the-board price increases ahead of CES 2026. The community is generally concerned, viewing this as hardware manufacturers monopolizing resources to stifle the development of Local Inference, forcing developers toward expensive cloud services. Coupled with skyrocketing RAM prices, 2026 may become the most expensive year for individual developers and SMEs in terms of hardware costs. (Source: Reddit r/LocalLLaMA)

AI Response “Stupidity” Mystery: Users Question Throttling by Vendors: Numerous complaints have surfaced in the Reddit community regarding the decline in response quality of ChatGPT and Gemini. Users suspect that after acquiring a large number of subscriptions, vendors are “throttling” models to save computing costs, resulting in perfunctory, conservative, and uncreative answers. While this might be due to stricter guardrails or system prompt changes, this “bait and switch” experience has triggered collective dissatisfaction among paying users. (Source: Reddit r/ArtificialInteligence)

💡 Others

Macy’s Use of AI-Generated Clothing Ads Sparks Controversy: Social media exposed Macy’s using AI to generate models and clothing displays, drawing collective mockery from netizens. Critics argue the AI-generated clothing textures look fake and even feature anatomical deformities, a practice that not only cheapens the brand but also deprives photographers and models of job opportunities. This reflects the aesthetic deficiencies and socio-ethical challenges traditional retail faces when embracing AI for cost reduction and efficiency. (Source: Reddit r/artificial)

Google SynthID Watermark Successfully Bypassed: Researchers released a report stating that invisible image watermarks developed by Google DeepMind’s SynthID can be completely erased using post-processing techniques with Diffusion models. The study aims to drive the industry toward developing more resilient AI content identification technologies through responsible disclosure. This once again proves that current pixel-perturbation-based watermarking schemes remain vulnerable to adversarial attacks, and AI safety regulation still has a long way to go. (Source: GitHub)

Future Job Prospect: Head Transplant Surgeon: MIT Technology Review surveyed future professions, mentioning the “head transplant surgery” being prepared by Italian neurosurgeon Sergio Canavero. Although the idea is highly controversial and was once dismissed as a hoax, it is gaining new attention with support from Silicon Valley longevity enthusiasts and AI-driven precision surgical robots. This is not just a medical challenge, but the ultimate intersection of AI, robotics, and bioethics. (Source: MIT Technology Review)
