AI News – 2025-12-28 (Evening Edition)

Keywords: AI Programming, Claude Opus 4.5, NVIDIA Groq, World Models, AI Reasoning, Intelligent Agents, Open-Source Models, Agentic Mode, LPU Chips for Inference, Open-Source GLM-4.7 Model, AI Self-Evolution, Mini-SGLang Inference System

🔥 Focus

Claude Opus 4.5 Release Triggers a “Programming Paradigm” Earthquake: The release of Claude Opus 4.5 has once again left the AI industry in a collective state of anxiety and excitement. Andrej Karpathy said that, as a programmer, he has never felt so “behind”: the profession is being restructured, and the share of code contributed directly by programmers keeps shrinking. With proper orchestration, AI can raise efficiency by more than 10 times. Community discussions point out that AI is shifting from simple code generation to an “Agentic” mode, and can even work its way into home automation systems (such as Lutron) on its own. This marks a migration of software engineering’s focus from “execution” to “thinking and decision-making”; writing code is no longer the bottleneck, and defining the problem has become the core task. (Source: Andrej Karpathy, Vtrivedy10)

NVIDIA Acquires Groq for $20 Billion to Bridge Inference Gap: By acquiring Groq, a “pick-and-shovel” supplier, NVIDIA aims to counter the threat of ASIC chips such as Google’s TPU. Analysis indicates that while GPUs dominate the pre-training phase, they are limited by HBM memory bandwidth during low-latency inference (the Decode phase). Groq’s LPU uses on-chip SRAM, described as about a hundred times faster than GPU memory, which relieves the memory bottleneck during inference. Jensen Huang’s move signals that the focus of AI competition is shifting from the training layer to the application layer, with NVIDIA acquiring a “vaccine” against disruption by emerging inference architectures. (Source: Gavin Baker, Suhail)
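
The bandwidth argument can be made concrete with a back-of-the-envelope estimate. The sketch below assumes the simplest possible model of the Decode phase: generating one token requires reading every model weight from memory once, so single-stream speed is capped at bandwidth divided by model size. The function name and all numbers (a 70B-parameter model, 2 bytes per weight, 3 TB/s of memory bandwidth) are illustrative assumptions, not figures from the cited analysis; only the 100x ratio comes from the text above.

```python
def max_decode_tokens_per_second(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode speed, assuming every weight
    is read from memory exactly once per generated token."""
    bytes_read_per_token = param_count * bytes_per_param
    return bandwidth_bytes_per_s / bytes_read_per_token


# Illustrative, assumed numbers (not vendor specifications).
PARAMS = 70e9            # hypothetical 70B-parameter model
BYTES_PER_PARAM = 2      # 16-bit weights
HBM_BANDWIDTH = 3e12     # assumed 3 TB/s of off-chip memory bandwidth
SRAM_BANDWIDTH = 100 * HBM_BANDWIDTH  # the "hundred times faster" claim

hbm_bound = max_decode_tokens_per_second(PARAMS, BYTES_PER_PARAM, HBM_BANDWIDTH)
sram_bound = max_decode_tokens_per_second(PARAMS, BYTES_PER_PARAM, SRAM_BANDWIDTH)

print(f"HBM-bound decode:  {hbm_bound:.1f} tokens/s")
print(f"SRAM-bound decode: {sram_bound:.1f} tokens/s")
```

Under this simplified model the bound scales linearly with bandwidth, which is why faster memory matters more than extra compute for low-latency decoding. Real systems batch requests and shard weights across chips, so actual throughput differs.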

Geoffrey Hinton Warns of 2026: AI Moving Toward Autonomous Reasoning and Self-Evolution: Hinton, often called the “godfather of AI,” points to a fundamental shift in AI: from “giving answers” to “executing tasks.” He emphasizes that AI will possess human-like self-correction mechanisms (self-verification), and that it reasons through connections between high-dimensional vectors rather than through logical symbols. More importantly, AI will enter a “self-learning phase,” generating high-quality training data through self-play and ending its dependence on public human data. This means AI Agents will begin to deliver results directly, and control is shifting away from human hands. (Source: )

GLM-4.7 Tops Open-Source Model Rankings, Chinese Models Continue to Surge: Zhipu AI’s GLM-4.7 ranked first among open-source models in the Artificial Analysis Intelligence Index, surpassing competitors like Kimi K2. Community feedback highlights its strong performance in mathematical vision and complex reasoning. Meanwhile, Xiaomi’s newly released Mimo-v2-flash proved highly useful in long-context scenarios. This shows that open-source models are rapidly closing the gap with closed-source flagships, especially in specific vertical domains and in inference efficiency. (Source: Z.ai, LocalLLaMA)

2025 Top Seven World Models Overview: From Physics to Agentic Nesting: TheTuringPost summarized the most representative World Models of 2025, including LeJEPA, Code World Model (CWM), and Cosmos WFM 2.5. These models attempt to integrate physical laws, agent behaviors, and nested logic into a unified architecture. Trends show that future AI will no longer just be about text generation but will possess the ability to perform high-fidelity simulation and prediction of the physical world and complex systems. (Source: TheTuringPost)

GPT-5.2 Codex Leaked: More Efficient File Editing and Logical Consistency: OpenAI is internally advancing the iteration of GPT-5.2 Codex, with early testers reporting significant improvements in file editing consistency and logical transparency. The model behaves more like a mature “collaborator” than a simple completion tool when handling complex codebases. With the wave of local models arriving, such efficient reasoning models will become the core of individual developers’ workflows. (Source: gdb)

DeepSeek V3.2 Shows Generational Competitiveness, Reshaping Global Model Landscape: Social media is buzzing about DeepSeek V3.2 outperforming GPT-5.2 on certain specific tasks (such as building a chess engine). This trend of “the small defeating the large” reflects the large potential of post-training techniques for raising the ceiling of model reasoning. 2026 is expected to be the “Year of Verification,” in which users will no longer pay for “magic moments” but will instead demand production reliability above 95%. (Source: teortaxesTex)

🧰 Tools

just-bash: A TypeScript Bash Implementation Built for AI Agents: Malte Ubl developed just-bash, a complete Bash implementation designed specifically for AI Agents (such as Claude Code). It supports common tools like grep, sed, and awk, and provides a secure sandbox execution environment. An interesting aspect of the project is that its code was almost entirely written by Opus 4.5, demonstrating how AI can achieve self-enhancement by building its own underlying toolchain. (Source: andersonbcdefg)

Dad Co-Pilot: An iOS App Developed Independently in 3 Weeks Using Claude Code: A new father used Claude Code to complete a baby tracking app based on SwiftUI and CloudKit in just 3 weeks, without any backend servers. The tool achieved functional iterations through natural language interaction, proving that AI is significantly lowering the entry barrier for software development, allowing non-professional developers to quickly deliver complex, productive applications. (Source: Reddit r/ClaudeAI)

exe.dev: Persistent VM Sandboxes for Code Agents: Addressing the need for AI Agents to have a stable environment when executing tasks, exe.dev launched a “bring your own sandbox” service. It provides persistent virtual machines accessible via SSH, allowing developers to leave AI Agents inside to run tasks continuously. This solves security and environment consistency issues for Agents in complex development tasks. (Source: mathemagic1an)

agi-memory: Giving AI Agents an Autonomous “Heartbeat” and Long-Term Memory: QuixiAI open-sourced the agi-memory system, which uses a “heartbeat daemon” to periodically wake up the AI (such as Claude), giving it the ability to autonomously reflect, keep a diary, and maintain long-term memory. This mechanism ensures that AI is no longer just a passive program waiting for instructions, but can perform continuous consciousness queries and self-optimization in the background like a living organism. (Source: QuixiAI)

📚 Learning

Mini-SGLang: Master LLM Inference with 5,000 Lines of Python Code: The Mini-SGLang project released by LMSYS compresses a production-grade inference stack into readable Python code. It covers core technologies such as FlashAttention-3, Tensor Parallelism, Chunked Prefill, and Radix Cache. It is a highly practical resource for learning modern LLM inference system architecture, helping developers understand the underlying logic of latency hiding and throughput optimization. (Source: arnaud_autef)
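
To give a feel for two of the techniques named above, here is a minimal sketch of their core ideas. It is not Mini-SGLang's code: `PrefixCache` and `chunk_prompt` are hypothetical names, the cache is a plain token-level trie rather than a compressed radix tree, and it tracks only which prefixes were seen, not the actual KV tensors. The idea behind a radix cache is that a new request can reuse cached work for the longest prefix it shares with earlier requests; the idea behind chunked prefill is that a long prompt is processed in fixed-size pieces so it can be interleaved with other requests.

```python
class PrefixCache:
    """Token-level trie recording which token prefixes were already processed.
    (Simplified stand-in for a radix cache; stores no KV tensors.)"""

    def __init__(self):
        self.root = {}

    def insert(self, tokens):
        """Record a processed token sequence."""
        node = self.root
        for token in tokens:
            node = node.setdefault(token, {})

    def match_prefix_len(self, tokens):
        """Return how many leading tokens are already cached."""
        node = self.root
        matched = 0
        for token in tokens:
            if token not in node:
                break
            node = node[token]
            matched += 1
        return matched


def chunk_prompt(tokens, chunk_size):
    """Split the uncached part of a prompt into fixed-size prefill chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


cache = PrefixCache()
cache.insert([1, 2, 3, 4])            # e.g. a shared system prompt

request = [1, 2, 3, 9, 9, 9, 9, 9]    # new request sharing a 3-token prefix
reused = cache.match_prefix_len(request)
chunks = chunk_prompt(request[reused:], chunk_size=2)
print(reused, chunks)
```

Only the tokens after the matched prefix need a fresh prefill pass, and those are fed to the model chunk by chunk.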

Egocentric2Embodiment: Training Embodied Intelligence from First-Person Videos: A new study proposes the E2E pipeline, which converts human first-person videos into structured Q&A supervision data for training the embodied perception model PhysBrain. This method significantly enhances AI’s planning and interaction reasoning capabilities in the physical world with lower dependence on robotic samples, providing a new path for the implementation of physical intelligence. (Source: TheTuringPost)

NanoGPT Training Race Breaks Records Again: The Magic of Asymmetric Logit Scaling: Developers updated NanoGPT with a single line of code, using asymmetric logit scaling and an offset to further boost training speed. The trick exploits the fact that the prediction task only cares about the right tail, and achieves faster convergence by refining logit softcapping. It demonstrates that, at the infrastructure level, subtle mathematical optimizations can still yield large efficiency gains. (Source: kellerjordan0)
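
For readers unfamiliar with logit softcapping, the sketch below shows the standard symmetric form, `cap * tanh(x / cap)`, which squashes logits smoothly into the range (-cap, cap). The source does not give the exact formula used in the record run, so `asymmetric_softcap` is only a hypothetical illustration of what "asymmetric" could mean: a different cap for the positive and negative sides. The function names and cap values are assumptions.

```python
import math


def softcap(logit, cap):
    """Standard symmetric softcapping: smoothly bounds the logit to (-cap, cap)."""
    return cap * math.tanh(logit / cap)


def asymmetric_softcap(logit, pos_cap, neg_cap):
    """Hypothetical asymmetric variant (not the record run's formula):
    positive logits saturate at pos_cap, negative ones at -neg_cap.
    Both branches have slope 1 at zero, so the function stays smooth there."""
    if logit >= 0:
        return pos_cap * math.tanh(logit / pos_cap)
    return -neg_cap * math.tanh(-logit / neg_cap)


for x in (-100.0, -1.0, 0.0, 1.0, 100.0):
    print(x, round(softcap(x, 30.0), 3), round(asymmetric_softcap(x, 30.0, 10.0), 3))
```

A tighter cap on the negative side reflects the intuition in the item above: for next-token prediction, the precise values of very unlikely tokens matter less than those of the likely ones.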

💼 Business

OpenAI Recruits “Head of Preparedness” to Tackle Model Abuse Risks: Sam Altman announced that OpenAI is hiring a Head of Preparedness, a critical position aimed at addressing potential risks of models in cybersecurity (such as automatically finding vulnerabilities) and biosecurity. As model self-evolution capabilities increase, how to enjoy technical dividends while limiting negative impacts has become a business focus for top labs. (Source: Sam Altman)

NVIDIA Acquisition of Groq Details Disclosed: Employees Reap Huge Rewards: As the acquisition deal settles, Axios reported that Groq employees received massive financial returns in this transaction. Although some options have not yet fully vested, the terms offered by NVIDIA are extremely attractive. This deal is not just a merger of technology, but another large-scale restructuring of the AI chip talent market. (Source: Suhail)

🌟 Community

AI Slop Phenomenon Sparks Heated Debate: Beware of the “It’s not X, it’s Y” Language Trap: The community has widely noticed the homogenization of ChatGPT-generated content, particularly the sentence structure “It’s not just about X, it’s about Y.” Analysis suggests this style exploits people’s psychological attraction to “superficial depth” and “group bias.” Research on YouTube shows that over 20% of the videos recommended to new users are now AI Slop, a “low-quality prosperity” that is having a long-term impact on the content ecosystem. (Source: scottastevenson, Reddit r/artificial)

Tennessee Proposes Legislation to Ban AI Emotional Support, Sparking Controversy: Tennessee lawmakers are attempting to make the act of training AI to provide emotional support or act as a companion a Class A felony (equivalent to murder). The community reacted strongly, viewing this not only as a stifling of innovation but also as ignorance of AI’s potential to assist in mental health. This move reflects the extreme unease and defensive psychology of traditional legal systems when facing AI’s social attributes. (Source: nptacek)

Code Review Crisis in the Agent Era: Humans Are Becoming the Productivity Bottleneck: As AI Agents (such as Claude Code) achieve an output of hundreds of PRs per month, the traditional manual code review model is becoming unsustainable. Brivael pointed out that when an engineer manages 10 Agents, requiring manual review for every line of code will lead to systemic paralysis. Software engineering is facing a forced transformation from “line-by-line review” to “systemic verification” and “automated auditing.” (Source: brivael, dotey)

Systems Thinking Over Syntax: The New Identity of Programmers in the AI Era: A community consensus has been reached: the importance of systems thinking and domain expertise has far surpassed code syntax. Developers should quickly shift their identity from “people who write code” to “people who solve problems through software.” For those with semi-technical backgrounds, this is the best time to catch up, as AI levels the difficulty of implementation and amplifies the value of decision-making. (Source: bookwormengr, nptacek)

💡 Others

Call for New Aesthetics: Tyler Cowen Funds Artists Defining the Era: Economist Tyler Cowen launched a grant program called “New Aesthetics,” aimed at finding artists and designers who can consciously define the aesthetics of the new era. In an age flooded with AI-generated content, how humans create a new visual language that is unique, deep, and resonant has become an urgent cultural proposition. (Source: Plinz)

X Platform Recommendation Algorithm Revealed: Fully Vectorized Matching Based on Grok: Elon Musk confirmed that the X platform’s new recommendation algorithm is entirely driven by Grok. The algorithm analyzes over 100 million posts daily, predicting user engagement through Embeddings and machine learning, no longer relying on keyword filtering or manual rules. This fully vectorized approach aims to achieve more precise “interest matching,” but it has also sparked further discussion about information cocoons. (Source: brivael)
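
As a rough illustration of what embedding-based "interest matching" means in general, the sketch below ranks posts by cosine similarity between a user vector and post vectors. It is a generic toy example, not X's or Grok's algorithm: the function names, the two-dimensional vectors, and the post IDs are all hypothetical, and a production system would use learned embeddings with hundreds of dimensions plus an engagement-prediction model on top.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_posts(user_vector, post_vectors):
    """Return post IDs sorted from most to least similar to the user vector."""
    return sorted(
        post_vectors,
        key=lambda post_id: cosine_similarity(user_vector, post_vectors[post_id]),
        reverse=True,
    )


# Hypothetical toy embeddings: axis 0 ~ "AI chips", axis 1 ~ "cooking".
posts = {
    "gpu_news": [1.0, 0.0],
    "recipe": [0.0, 1.0],
    "ai_kitchen_robot": [1.0, 1.0],
}
user = [1.0, 0.1]  # a user mostly interested in AI chips

ranking = rank_posts(user, posts)
print(ranking)
```

No keyword appears anywhere in the ranking step; closeness in vector space alone decides the order, which is also why such systems tend to keep recommending more of what a user already engages with.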