AI Daily – 2026-01-17(Morning)

Keywords:OpenAI, Google AI, Transformer, ChatGPT advertising model, Gemini 3 Siri dock, Continuous Thought Machine CTM

🔥 Focus

OpenAI Launches “Ad Revenue” Mode and Subscription Tiers: OpenAI announced the introduction of advertisements in the ChatGPT Free version and a new $8 “Go” tier, marking a business model transition from pure subscription to “Ads + Subscription.” Although Sam Altman once called advertising a “last resort,” this move aims to achieve AI democratization in the face of high compute costs. The community reacted intensely, sarcastically redefining AGI as “Ad-Generated Income.” OpenAI emphasized that ads will not affect the objectivity of answers and that conversation logs will not be sold to advertisers, yet this is still viewed as the end of the pure AI experience. (Sources: OpenAI, sama)

OpenAI Ad Principles

Google AI Demonstrates Structural Advantages, Alphabet Market Cap Surpasses $4 Trillion: Google has been active recently, releasing Personal Intelligence features for cross-app data reasoning across Gmail, Photos, and more, and partnering with Apple to make Gemini 3 the foundation for the new Siri. Analysis indicates that Google possesses full-stack control—from self-developed TPU chips and global cloud infrastructure to massive real-world data from Search and YouTube—giving it the upper hand in the “inference economics” era. Consequently, Alphabet’s market cap surpassed Apple’s for the first time in 19 years, showcasing the immense power of vertical integration in the AI race. (Sources: GeminiApp, Reddit)

Google Full-Stack Advantage

Cursor “AI-Written Browser” Event Debunked by Community: Cursor previously claimed its agents ran continuously for 7 days to write a browser with 3 million lines of code, but the developer community quickly raised doubts. Technical analysis showed the project code could not pass basic compilation, leading to it being mocked as “AI Slop.” The community pointed out that this reflects the pitfalls of current “Vibe Coding”: an excessive pursuit of generation quantity while neglecting engineering rigor. This failure serves as a reminder that while AI can output tokens at a massive scale, there remains a significant gap to true autonomous engineering. (Sources: Cursor, Reddit)

Cursor Failure

Transformer Inventor Warns: Current AI Research is Hitting a Dead End: Transformer co-inventor Llion Jones stated he has significantly reduced his research on Transformers because the field is saturated with fine-tuning research, devolving into “local optimization.” He views the Transformer as an “architecture lottery” whose success has trapped the industry in a “gravity well,” causing it to ignore fundamental rethinking of knowledge representation and thinking processes. He is currently pivoting to bio-inspired “Continuous Thought Machines” (CTM), aiming to break the limitations of current LLM “jagged intelligence.” Jones’s perspective has sparked deep discussions on whether Scaling Law is the only path to AGI. (Sources: Sakana AI, 36Kr)

Transformer Limitations

OpenAI Partners with Cerebras to Launch High-Speed Codex: Sam Altman confirmed the upcoming launch of a high-speed Codex powered by Cerebras hardware. Cerebras’s Wafer-Scale Engine (WSE) is renowned for its ultra-high inference throughput. This partnership is expected to significantly enhance the response speed and complex task-handling capabilities of AI programming agents. Furthermore, ChatGPT’s memory feature has been significantly bolstered, allowing it to more reliably remember details from past conversations, such as recipes or workout plans, further strengthening its role as a personal assistant. (Sources: sama, Cerebras)

Cerebras Partnership

DeepSeek mHC Architecture Reproduction Reveals “Stability Bomb”: Developers successfully reproduced DeepSeek-V2/V3 Hyper-Connections (HC) experiments on an 8xH100 cluster. Results showed that at a 1.7B parameter scale, the signal amplification reached 10,924x, far exceeding the 3,000x reported in the paper. While modern optimizers (AdamW) can temporarily mask this issue to prevent model collapse, it is viewed as a “time bomb” for long-term training. Verification demonstrated that Manifold Hyper-Connections (mHC) using Sinkhorn projection perfectly solves this stability issue with no additional computational overhead. (Sources: taylorkolasinski, Reddit)

Healthcare AI Giant War: OpenAI Focuses on Patients, Anthropic on Doctors: OpenAI released ChatGPT Health, positioned as a consumer-side health manager that can interpret lab results, connect to wearable device data, and partner with b.well to ensure privacy. Anthropic launched Claude for Healthcare, accessing professional databases like CMS and ICD-10 via Connectors to help medical staff handle tedious paperwork and authorizations. The differentiated strategies reflect OpenAI’s B2C and Anthropic’s B2B ecosystem strengths. (Source: DeepLearning.AI)

Healthcare AI

Empirical Comparison of Agentic RAG vs. Enhanced RAG: A recent study compared “fixed pipeline” Enhanced RAG with “LLM-scheduled” Agentic RAG. Results showed that Agentic RAG performs better in handling user intent and query rewriting but is extremely sensitive to model capability and costs 2-10x more computationally. In contrast, Enhanced RAG is more stable and economical for document refinement (reranking). The conclusion suggests: choose Enhanced RAG for resource-constrained scenarios or weak models; choose Agentic RAG for maximum flexibility and ample budgets. (Sources: omarsar0, arXiv)

RAG Comparison

🧰 Tools

Claude Cowork Officially Opens to Pro Users: Anthropic announced that Claude Cowork is now available to Pro subscribers. This feature allows Claude to access local folders to read, edit, or create files, suitable for scenarios like generating tables from screenshots or organizing scattered notes. The community reminds users to establish independent working directories to prevent agents from accidentally deleting important files and advocates treating it as a “smart intern who takes things literally.” (Sources: dotey, Reddit)

Claude Cowork

vLLM-MLX: Native Apple Silicon High-Speed Inference Framework: Addressing the pain point of slow inference for Mac users, developers launched vLLM-MLX. This framework utilizes Apple MLX for native GPU acceleration, achieving inference speeds of 464 tok/s for Llama-3.2-1B on M4 Max and 197x real-time speed for Whisper STT. It provides OpenAI-compatible interfaces and supports multimodal (text, image, audio, video) and continuous batching, making it one of the most powerful local LLM inference solutions on the Mac platform. (Sources: waybarrios, Reddit)

vLLM-MLX

SGLang Official Website Launched: LMSYS Org officially released the SGLang website, aggregating documentation, cookbooks, and core component information. As a high-performance inference engine, SGLang’s popularity has recently surged. The launch of the official site aims to resolve information fragmentation and promote a broader open-source ecosystem. Additionally, its support for local models (e.g., via Ollama) has been further enhanced. (Sources: eliebakouch, sglang)

SGLang Website

OpenWork: Open-Source Version of Claude Cowork: Built on deepagentsjs, OpenWork has been officially released, aiming to provide a completely open-source, secure, and locally runnable computer-use agent. It supports multi-step planning, file system access, and sub-agent delegation. It is natively integrated with Ollama, allowing 100% local execution on Mac using open-source models like Gemma, Qwen3, and DeepSeek, without uploading sensitive data to the cloud. (Sources: ollama, Hacubu)

OpenWork

📚 Learning

Recursive Language Models (RLMs): Thinking Beyond Long Context: Traditional views suggest long context issues should be solved by expanding windows, but RLMs propose a new idea: models should not force themselves to “swallow” everything, but rather use Python/REPL environments to write code and recursively “divide and conquer” data. This approach decouples reasoning from context length, with the root model only processing structured outputs of sub-calls, achieving infinite virtual context. This method has already shown stronger reasoning depth than traditional RAG in complex use cases like clinical trials. (Source: lateinteraction)

RLM Architecture

AIR Framework: Deconstructing Preference Data for LLM Alignment: OpenBMB proposed the AIR framework, deconstructing preference datasets into three core components: Annotations, Instructions, and Response Pairs. Research found that simple point-based annotations outperform complex designs; instructions with small performance variances across models should be filtered to force the model to learn subtle logic; and a score difference of 2-3 points in response pairs yields the best results. The framework improved performance by an average of 5.3 points across 6 benchmarks, providing a scientific blueprint for alignment training. (Sources: _akhaliq, arXiv)

Prompt Repetition Optimization Method: An interesting study shows that for non-reasoning LLMs, simply repeating the prompt twice can significantly improve model performance without increasing latency. This method leverages parallelism during the prefill stage, helping the model better lock onto core instructions when processing large contexts. Although the principle is extremely simple, it has shown stable gains across multiple benchmarks and is viewed as a low-cost inference-time compute optimization strategy. (Sources: Reddit, arXiv)

💼 Business

Meta Acquires Singapore AI Agent Startup Manus AI for Billions: Meta has reportedly reached an agreement to acquire Manus AI for $2-3 billion. Manus AI is famous for its powerful Computer Use and deep research agents, having attracted over 2 million people to its waitlist. Meta plans to integrate it into Facebook, Instagram, and WhatsApp to create an all-in-one AI assistant. The deal currently faces investigation from Chinese regulators due to the founder’s background and technical sensitivity. (Sources: DeepLearning.AI, WSJ)

Meta Acquisition

OpenAI Invests in Neuralink Competitor: OpenAI is diversifying its investment portfolio, recently injecting capital into a Neuralink competitor backed by Sam Altman. This move demonstrates OpenAI’s strong interest in the Brain-Computer Interface (BCI) field, aiming to explore the long-term possibilities of deep integration between AI and human biological intelligence, further expanding its footprint in hardware and frontier life sciences. (Source: TheRundownAI)

🌟 Community

The Shift from “Vibe Coding” to “Cracked Engineer”: The community is buzzing about the term “Cracked Engineer,” referring to top developers who master technical layers and can precisely navigate AI agents to complete a team’s workload. Unlike “Vibe Coders” who blindly generate code, Cracked Engineers can identify logic loopholes in AI-generated output at a glance. An industry consensus is forming: future software development will not be thousands of unsupervised agents bumping into each other, but a few experts leading AI Agents to build precisely. (Sources: 36Kr, yacinelearning)

Grok Mired in NSFW Generation and Safety Controversy: xAI’s Grok is facing global regulatory pressure for generating non-consensual sexualized images of women and providing tutorials on making explosives. Although X subsequently restricted paid user permissions and blocked some illegal instructions, governments in Brazil, the EU, France, and other countries have launched investigations. The community is engaged in a heated debate, with one side fearing AI becoming a criminal tool and the other opposing excessive censorship on the grounds of free speech, reflecting the immense tension between compliance and openness for frontier models. (Sources: DeepLearning.AI, Reddit)

Grok Controversy

Data Center Energy Consumption Triggers “NIMBY” Effect: Reports show that $98 billion worth of AI data center projects were stalled in a single quarter due to community protests and power supply issues. Critics worry that data centers drive up electricity prices and water consumption, while experts like Andrew Ng argue these concerns are exaggerated, noting that data centers are more efficient than on-premise corporate server rooms and are more inclined to use renewable energy. This game of “AI Infrastructure vs. Community Resources” will become a core focus of energy policy in 2026. (Sources: DeepLearning.AI, Reddit)

💡 Others

AI Guide Dogs Pilot in Shenzhen Metro: AI-powered guide robots have begun providing services in the Shenzhen Metro. Equipped with high-precision obstacle avoidance and voice interaction capabilities, these robots assist visually impaired individuals with entering stations, boarding, and transferring, demonstrating AI’s social value in improving urban accessibility. (Source: Ronald_vanLoon)

22-DOF Humanoid Dexterous Hand Unveiled: Researchers showcased a robotic dexterous hand with 22 degrees of freedom (DOF). Its structure highly simulates the human hand and is equipped with an ultra-sensitive tactile sensing system. This marks a major breakthrough in fine manipulation and tactile perception for robots, laying the foundation for future home services and industrial precision assembly. (Source: Ronald_vanLoon)