AI Daily - 2026-01-19 (Evening)

Keywords: AGI, AI large models, intelligent agents, world models, Claude permanent memory, AI research paradox

🔥 Focus

Nobel Laureate Hassabis Predicts AGI Within Five Years: Key Lies in World Models and Agent Breakthroughs: Google DeepMind chief Demis Hassabis has laid out a timeline for AGI, suggesting that humanity is only one or two key technical breakthroughs away and that AGI could arrive within five years. He noted that while current large models are powerful, they possess “patchy intelligence,” lacking a true understanding of physical laws and long-term planning capabilities. In his view, further progress requires “World Models” that understand physical rules and an evolution into “Agent systems” with “cognitive error-correction” abilities. He expects the impact of this transformation to be 10 times that of the Industrial Revolution, positioning AI as the ultimate tool for scientific discovery and ushering in a golden age for drug discovery and clean energy. (Source: QbitAI)

Claude’s Shocking Upgrade to “Permanent Memory”: Knowledge Bases and Cowork Mode Reshape AI Productivity: Anthropic is reportedly adding “Permanent Memory” to Claude Cowork through “Knowledge Bases” to achieve persistent knowledge storage. Claude would no longer have “goldfish memory” but would automatically record user preferences, decision processes, and experience summaries, becoming more intuitive with use. Furthermore, Cowork is expected to become the primary entry point for Claude, integrating the Artifacts sidebar and stronger MCP (Model Context Protocol) automation connectors. This upgrade marks the evolution of AI assistants from simple dialogue tools into “AI colleagues” capable of long-term collaboration and complex task execution, changing the AI productivity paradigm. (Source: QbitAI)

ICML 2026 Introduces “Author Self-Rating” Mechanism: Using Game Theory to Combat Academic Review Crisis: Facing a review system strained by surging submissions to top conferences such as NeurIPS, ICML 2026 has launched an “Author Self-Rating” policy. The mechanism pairs a game-theoretic incentive design with “isotonic regression,” a statistical fitting technique: authors submitting multiple papers must rank their own work, and that ranking serves as a calibration signal for review scores. Experiments show that authors’ rankings of their own papers predict long-term impact more accurately than random reviewers. The move aims to convert authors’ “winning ambitions” into calibration signals, though it has raised fairness concerns about “academic whales” exploiting algorithms while “retail scholars” are left exposed in the noise. (Source: QbitAI)
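
The calibration idea can be sketched with a small isotonic regression. This is a minimal illustration, not ICML's actual implementation: it assumes each paper has one raw review score, that the author ranks papers from best to worst, and that the adjusted scores are the least-squares fit to the raw scores that respects the author's ordering. The function names `pava_non_increasing` and `adjust_scores` are hypothetical.

```python
from typing import List


def pava_non_increasing(values: List[float]) -> List[float]:
    """Least-squares fit of a non-increasing sequence (pool adjacent violators)."""
    blocks: List[List[float]] = []  # each block holds [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while an earlier block's mean is lower than the later block's mean,
        # because that would violate the non-increasing constraint.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted: List[float] = []
    for s, c in blocks:
        fitted.extend([s / c] * int(c))
    return fitted


def adjust_scores(raw_scores: List[float], author_ranking: List[int]) -> List[float]:
    """author_ranking lists paper indices from best to worst (an assumed format)."""
    ordered = [raw_scores[i] for i in author_ranking]
    fitted = pava_non_increasing(ordered)
    adjusted = [0.0] * len(raw_scores)
    for position, paper in enumerate(author_ranking):
        adjusted[paper] = fitted[position]
    return adjusted
```

For example, with raw scores `[6.0, 8.0, 4.0]` and an author ranking of `[0, 1, 2]`, the first two scores contradict the author's ordering and are pooled to their mean, giving `[7.0, 7.0, 4.0]`. Scores that already agree with the ranking are left unchanged.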

Tsinghua Nature Paper Reveals AI Research Paradox: Productivity Surge Leads to “Locked” Scientific Boundaries: A team led by Professors Xu Fengli and Li Yong from Tsinghua University published a study in Nature analyzing 41 million papers over 45 years. They found that while AI significantly boosts individual output (3x increase in paper volume, 4.8x in citations), it has led to a 4.63% decline in collective knowledge breadth. AI guidance causes researchers to flock to “data-rich, well-defined” popular fields, leading to homogenized innovation and reduced cross-disciplinary cooperation. The team proposed a “Full-process Scientific Research Agent System” to push AI from an auxiliary tool to an “AI Scientist” capable of actively proposing hypotheses and expanding unknown territories. (Source: QbitAI)

Google Research Discovers Prompt Trick: Repeating Questions Boosts Accuracy from 21% to 97%: Google researchers have found that in non-reasoning tasks, simply repeating the input question (copy-pasting) can significantly improve LLM performance with almost no added latency. This “repeater” trick boosted accuracy from 21.33% to 97.33% in the NameIndex test for Gemini 2.0 Flash-Lite. The explanation lies in the “causal blind spots” of the Transformer architecture: under causal attention, each token sees only the tokens before it, so tokens in the second copy of the prompt can attend to the entire first copy, giving the model something like bidirectional attention over the question. The finding suggests developers may be able to use cheaper small models to approach the retrieval and extraction capabilities of top-tier models. (Source: QbitAI)
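
A minimal sketch of the repetition trick follows. The exact template used in the research is not given here, so the separator and the helper name `repeat_prompt` are assumptions; the only idea carried over is that the same input appears more than once before the model answers.

```python
def repeat_prompt(question: str, times: int = 2, separator: str = "\n\n") -> str:
    """Return the question repeated `times` times, joined by `separator`."""
    if times < 1:
        raise ValueError("times must be at least 1")
    cleaned = question.strip()
    return separator.join([cleaned] * times)


# The repeated string is what would be sent as the user message.
prompt = repeat_prompt("Which name is 25th in the list above?")
```

Because the prompt is only duplicated on the input side, the number of generated tokens stays the same, which is consistent with the near-zero latency cost described above.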

🎯 Trends

GLM-4.7-Flash Officially Released: Zhipu AI Launches 30B-Class All-Purpose Lightweight Model: Zhipu AI (Z.ai) has officially released GLM-4.7-Flash, positioned as a local coding and Agent assistant. The model features 30B total parameters with approximately 4B active parameters and introduces the MLA architecture to balance high performance and efficiency. In benchmarks, it performs on par with or better than GLM-4.5-Air, particularly excelling in tool calling and creative writing. As a 30B-class lightweight option, it is ideal for local deployment in 2x24GB VRAM environments, providing developers with a cost-effective foundation for Agent development. (Source: scaling01; Reddit)
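
As a rough check on the 2x24GB figure, the weight footprint can be estimated from the parameter count alone. This is back-of-the-envelope arithmetic under stated assumptions: it counts weights only (no KV cache, activations, or runtime overhead), uses decimal gigabytes, and assumes all 30B parameters must stay resident even though only about 4B are active per token. The quantization levels are illustrative, not published deployment settings.

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in decimal GB."""
    total_bytes = total_params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def fits_in_vram(total_params_billion: float, bits_per_param: int, vram_gb: float) -> bool:
    """True if the weights alone fit; real deployments need extra headroom."""
    return weight_memory_gb(total_params_billion, bits_per_param) <= vram_gb


VRAM_GB = 2 * 24  # two 24 GB cards
for bits in (16, 8, 4):
    size = weight_memory_gb(30, bits)
    print(f"{bits}-bit weights: {size:.0f} GB, fits in {VRAM_GB} GB: {fits_in_vram(30, bits, VRAM_GB)}")
```

Under these assumptions, 16-bit weights (about 60 GB) exceed 48 GB, while 8-bit (about 30 GB) and 4-bit (about 15 GB) weights fit with room left for context.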

DeepSeek Launches Engram Primitive: Ushering in a New Era of Context-Aware Conditional Memory: DeepSeek has released a conditional memory primitive called “Engram,” upgrading static N-gram lookup tables to context-aware dynamic memory. This technology supports ultra-large-scale memory retrieval with O(1) complexity, activating only the parts relevant to the current hidden state. Engram allows memory to be stored in CPU RAM instead of expensive GPU HBM, significantly reducing costs. This breakthrough proves that “memory scaling” can partially replace “parameter scaling,” providing new system-level support for long-range context and continuous learning. (Source: ZhihuFrontier)
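
To make the idea concrete, here is a toy sketch of a hashed N-gram memory whose output is gated by the current hidden state. It is not DeepSeek's implementation: the class `ToyNgramMemory`, the single hash function, the sigmoid gate, and the plain Python lists standing in for CPU RAM are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


class ToyNgramMemory:
    """Fixed-size table indexed by a hash of the last n token IDs."""

    def __init__(self, num_slots: int, dim: int, n: int = 2) -> None:
        self.num_slots = num_slots
        self.n = n
        # The table is an ordinary host-memory structure, not GPU memory.
        self.table: List[List[float]] = [[0.0] * dim for _ in range(num_slots)]

    def _slot(self, tokens: Sequence[int]) -> int:
        # One hash and one modulo: lookup cost does not grow with table size.
        key = tuple(tokens[-self.n:])
        return hash(key) % self.num_slots

    def write(self, tokens: Sequence[int], vector: Sequence[float]) -> None:
        self.table[self._slot(tokens)] = list(vector)

    def read(self, tokens: Sequence[int], hidden: Sequence[float]) -> List[float]:
        memory = self.table[self._slot(tokens)]
        # Context-aware gate: the memory contributes more when it aligns
        # with the current hidden state.
        score = sum(h * m for h, m in zip(hidden, memory))
        gate = 1.0 / (1.0 + math.exp(-score))
        return [gate * m for m in memory]
```

In this toy version, two contexts that end in the same N-gram hit the same slot, and the hidden state decides how strongly the stored vector is used. A hidden state orthogonal to the stored vector yields a gate of 0.5, and a strongly opposed one drives the contribution toward zero.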

Huawei Releases Five Major 2025 Flagship Inference Systems: Breaking HBM Capacity Walls and Resource Islands: Huawei researcher Zuo Pengfei reviewed five major breakthroughs in 2025 inference systems: SparseServe uses DRAM offloading for cold KV caches to break the memory wall; Adrenaline enables dynamic flow between Decode and Prefill nodes through resource pooling; TaiChi architecture adaptively balances TTFT and TPOT; DualMap balances cache affinity and load balancing; and MemArt stores Agent memory as reusable KV blocks. These full-stack system redesigns mark a shift in inference from single-kernel optimization to complex SLO-aware scheduling, laying the foundation for large-scale multimodal streams and long-range Agents. (Source: ZhihuFrontier)

Baichuan Intelligence Releases Baichuan-M3: Going All-In on Serious Medical AI: Baichuan Intelligence has released its next-generation medical model, Baichuan-M3, claiming it comprehensively surpasses GPT-5.2 in the medical field for the first time. The model utilizes a Fact-Aware Reinforcement Learning (RL) architecture, reducing hallucination rates to 3.5% without relying on external search. Baichuan-M3 features SCAN active inquiry capabilities, simulating a doctor’s follow-up history taking. Wang Xiaochuan stated the company has fully pivoted to healthcare, employing professional medical teams for large-scale data labeling to address the shortage of primary medical resources in China, with plans for an IPO in 2027. (Source: 36Kr)

OpenAI’s Hardware Bet: Screenless AI Pen Attempts to Escape Graphical Interfaces: Rumors suggest OpenAI is about to launch its first hardware product, codenamed “Gumdrop”—a minimalist AI pen. Designed by Jony Ive’s team, the device has no screen or camera, weighs only 10-15 grams, and emphasizes being “available on call, hidden when done.” Its core is not writing, but using voice and handwriting as intent-capture channels. This design reflects OpenAI’s attempt to bypass traditional graphical interfaces, allowing AI to integrate into physical world recording and interaction in the most natural, seamless way, marking a shift in human-computer interaction from command-driven to intent-driven. (Source: 36Kr)

🧰 Tools

Claude Skills Open Source Library Goes Viral: 48 Production-Grade Expert Skill Packs Power Agent Development: Developer alirezarezvani has open-sourced the claude-skills library, containing 48 production-grade skill packs covering marketing, engineering, product, legal, and more. These packs integrate Python analysis tools, best practice frameworks, and templates, supporting over 9 AI Agents including Claude Code, Cursor, and VS Code. Users can quickly install them via the /plugin command for tasks like “Marketing Requirement Gathering” or “System Architecture Design,” significantly boosting Agents’ practical capabilities in non-programming fields and achieving a leap from “tool application” to “expert-level collaboration.” (Source: GitHub; dotey)

Ollama Compatible with Anthropic API: Local Models Can Directly Drive Claude Code: Ollama now officially offers compatibility with the Anthropic API, meaning users can run Ollama-hosted local models (such as Llama 3 and Qwen) inside Claude Code, Anthropic’s terminal Agent. Currently, this feature primarily supports models with context windows over 64K. This update removes the restriction that Claude Code must rely on closed-source APIs, providing developers with a lower-cost, higher-privacy local Vibe Coding environment and further expanding the application boundaries of local LLMs. (Source: op7418)
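
One way to wire this up is to point Claude Code at a local endpoint through environment variables. The sketch below only builds the environment and command; it does not launch anything. It assumes Claude Code reads `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN`, that Ollama is listening on its default port 11434, and that a placeholder token is accepted locally. The helper names and the model tag `my-local-model` are hypothetical, so check the Ollama and Claude Code documentation before relying on any of these values.

```python
import os
from typing import Dict, List


def claude_code_env(
    base_url: str = "http://localhost:11434",  # Ollama's default local address
    token: str = "ollama",  # assumed placeholder; a local server needs no real key
) -> Dict[str, str]:
    """Copy the current environment and redirect Anthropic API calls locally."""
    env = dict(os.environ)
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = token
    return env


def claude_code_command(model: str) -> List[str]:
    """Build the CLI invocation; `model` should be a locally pulled Ollama tag."""
    return ["claude", "--model", model]
```

A caller could then run `subprocess.run(claude_code_command("my-local-model"), env=claude_code_env())` after pulling a model whose context window meets the 64K guidance mentioned above.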

Coze 2.0 Major Update: Seamless Skills Creation, Distribution, and Monetization: ByteDance’s Coze has released version 2.0, with the core upgrade being the complete integration of the Skills lifecycle. Users can now create Skills using natural language within the Coze programming environment and directly distribute and monetize them. This update greatly lowers the barrier for non-technical users to develop AI plugins and automated workflows. Combined with natural language orchestration, Coze is attempting to build an AI skill ecosystem similar to the App Store, allowing every complex automated task to be realized through simple Skill calls. (Source: op7418)

Manus AI Launches App Publishing Feature: List on Google/Apple Stores Without Environment Configuration: Manus AI has updated its App publishing workflow, allowing users to package and publish developed Apps directly to Google Play (Internal Testing) and Apple App Store (TestFlight). The entire process requires no installation of Xcode or Android Studio and no handling of complex build configurations; Manus automatically manages certificates and uploads. This feature significantly shortens the distance from AI prototype to real mobile testing, allowing non-developers to easily experience the full chain from “Prompt” to “Listed App.” (Source: hidecloud)

Eigent Open Source Project: An Open Source Alternative to Anthropic Cowork: In response to Anthropic’s release of Claude Cowork, startup team Eigent has chosen to open-source its product. Eigent aims to provide a non-technical task collaboration experience similar to Cowork, allowing users to manage workflows, files, and automated tasks via AI. The project’s open-source nature provides a reference for teams wishing to implement Cowork-like functionality in private environments or custom platforms, reflecting the intense competition in the AI tool space where “closed-source innovation leads, and open-source quickly follows.” (Source: ClementDelangue)

📚 Learning

Stanford AI Lab Launches “AI Bites”: Fragmented Audio Learning for Core AI Courses: Stanford AI Lab (SAIL) has launched a podcast series called “AI Bites,” designed to bridge the gap between dense academic materials and fragmented learning. Content for core courses like CS124 (NLP/LLM) and CS221 (Principles of AI) is already online. By transforming academic lectures into digestible audio, the project provides a convenient path for practitioners and students to systematically learn top-tier AI theory during commutes or leisure time. (Source: stanfordnlp)

Free Textbook Shared: Linear Algebra for Computer Vision, Robotics, and Machine Learning: The community shared a comprehensive free textbook covering core mathematical theories from basic vector spaces, matrices, and norms to eigenvalues and SVD. The book focuses on the practical application of these mathematical tools in PCA, graph theory, waveform analysis, and 3D rotation, making it an excellent resource for practitioners in CV and robotics to solidify their mathematical foundation. (Source: TheTuringPost)

MIMIC Benchmark Released: In-depth Analysis of Multi-Image Understanding Flaws in LVLMs: Researchers have introduced the MIMIC (Multi-Image Model Insights and Challenges) benchmark, specifically evaluating LVLMs’ capabilities in multi-image understanding and reasoning. The study found widespread flaws in existing models regarding cross-image information aggregation and simultaneous tracking of multiple concepts. The team proposed a programmatic data generation strategy and an attention mask scheme for multi-image input, significantly improving model performance in complex multi-image tasks. (Source: HuggingFace)

Hugging Face Releases Smol Training Playbook: Sharing World-Class Small Model Training Experience: The Hugging Face team shared a video of the “Smol Training Playbook” presentation, detailing how to train world-class small-parameter models. Content covers data filtering, training strategy optimization, and practical tips for squeezing performance out of models under limited compute. This is highly valuable for developers looking to deploy efficient AI models on edge devices or in specific vertical domains. (Source: _lewtun)

💼 Business

Anthropic Donates $1.5 Million to Python Software Foundation: Anthropic announced a $1.5 million donation to the Python Software Foundation (PSF) to support the continuous development of the Python ecosystem. Since the vast majority of AI R&D and toolchains (like Claude Code) rely deeply on Python, this move is seen as a long-term investment by an AI giant into the underlying open-source community. The community generally views this as a sign of corporate respect for technical foundations rather than mere PR. (Source: Reddit)

Canadian Defense AI Startup Dominion Dynamics Raises $21 Million: Dominion Dynamics, a Canadian company focused on Arctic sensor networks and defense capabilities, has completed a $21 million seed round. The company is dedicated to using AI technology to enhance territorial defense and monitoring. Industry leaders like Aidan Gomez congratulated the team, viewing it as a significant step for Canada in critical sovereign technology, especially amid increasingly complex Arctic geopolitics. (Source: aidangomez)

Synthesia Named in Sunday Times 100, Remains UK’s Fastest-Growing Unicorn: AI video generation platform Synthesia has been included in the Sunday Times 100 and maintains its position as the UK’s fastest-growing unicorn. Its core product has revolutionized corporate video production through digital human technology. The company teased major news coming soon, hinting at further breakthroughs in multimodal generation or enterprise applications. (Source: synthesiaIO)

🌟 Community

ChatGPT “What would you do with me if you took over the world” Trend Goes Viral: A trend has emerged on social media where users ask ChatGPT to “generate an image of how you would treat me after taking over the world based on my recent performance.” Users shared various dramatic AI-generated images, ranging from being treated as an honored guest to being placed in a “cognitive breathing room.” This phenomenon reflects the public’s complex attitude toward AI anthropomorphism and future human-AI relations, while also showcasing DALL-E 3’s ability to translate long-term dialogue context into visual narratives. (Source: Yuchenj_UW; Reddit)

Vibe Coding Debate: Productivity Revolution or “Code Slop”?: The community is engaged in a heated debate over “Vibe Coding.” Supporters like levelsio argue that running multiple Agents in parallel via tools like Claude Code enables rapid development and million-dollar revenues. Opponents like swyx criticize it as “tech bro vanity,” arguing that stacking AI output without reviewing code produces unmaintainable “Slop.” The core of the debate is whether developers in the AI era are focusing on problem-solving and customer value or becoming addicted to the illusion of efficiency that the tools provide. (Source: swyx; seo_leaders)

The End of AI is Electricians: Data Centers Trigger Blue-Collar Labor Shortage in the US: As the AI scaling war intensifies, the battle for energy has become central. The US Bureau of Labor Statistics predicts a massive shortage of electricians over the next decade, with annual demand growth far exceeding the average for all occupations. Tech giants like Amazon and Google have seen a surge in energy-related hiring, even poaching nuclear energy executives from each other. Elon Musk stated that “the currency of the future is Watts,” and that power supply will be more critical than GPUs. This trend is boosting blue-collar roles like electricians and plumbers, turning “go be a plumber” from a joke into forward-looking career advice. (Source: 36Kr)

AI Disinformation Discussions Triggered by Greenland Territorial Dispute: Recent discussions regarding US attempts to purchase or control Greenland have sparked significant political controversy on social media. This includes forged letters and diplomatic threats, some of which are suspected to be AI-assisted disinformation or political propaganda. Community discussions point out that AI’s potential to amplify political polarization and create diplomatic chaos is concerning, while also sparking fierce debates over the stability of current governments. This event highlights that in the AI era, discerning information authenticity and preventing algorithm-driven cognitive warfare has become a global challenge. (Source: teortaxesTex; halvarflake)

Developer Feedback on AI Hallucinations and “Causal Blind Spots”: Developers have been testing Google’s research on “prompt repetition.” Some users found that when retrieving from long documents, repeating instructions does significantly reduce omission rates. However, others pointed out that for reasoning tasks requiring strict logic, excessive repetition can turn the model into a “repeater” or cause logical breaks. The consensus is that the current Transformer architecture has an inherent “one-way reading” defect; until architecture-level solutions appear, these prompt engineering “hacks” remain key to boosting small model productivity. (Source: Reddit; Gemini)

💡 Others

TaskExplorer: A Powerful Windows Task Management Tool: The GitHub trending project TaskExplorer is a system monitoring tool that goes beyond listing running apps to provide deep insights into their behavior. It offers real-time thread stack tracing, memory editing, handle viewing, and socket connection monitoring. Compared to the built-in Task Manager, it displays more detailed I/O data and GPU performance curves. The project is built with Qt and plans a Linux port, which could make it a cross-platform graphical tool for advanced system management. (Source: GitHub)

Privacy Concerns Over AI Recording Cards: Data Risks Under a Stylish Shell: A large number of thin recording cards featuring “one-click recording and AI summarization” have emerged on the market. While compact and convenient, the lack of local compute means all recordings must be uploaded to the cloud for recognition, raising serious privacy concerns. Additionally, the thin design limits microphone specifications, often resulting in lower recording quality than high-end phones and limiting AI transcription accuracy. Experts warn that these hardware devices are essentially “selling services,” and users must weigh the risk of sensitive data leakage against convenience. (Source: 36Kr)

Japan Develops Giant Humanoid Robot for Railway Maintenance: To address labor shortages caused by an aging population, Japan has developed a massive humanoid robot for track maintenance. Mounted on a construction vehicle crane arm, the robot is controlled by a human operator via a VR headset. It can handle heavy cables and components, significantly reducing risks for workers in high-voltage environments. This application demonstrates the huge potential of AI and robotics in traditional infrastructure maintenance, serving as a typical case of “heavy-duty embodied intelligence.” (Source: Ronald_vanLoon)
