AI Daily – 2025-07-27 (Evening)

Keywords: AI consciousness, multimodal chatbot, domestic GPU, non-Transformer architecture, AI safety, WAIC 2025, DeepSeek, Agent OS, Moore Threads DeepSeek inference speed, Yan 2.0 Preview offline intelligence, Qwen3-Coder coding capability, Hunyuan3D world model, TicNote AI voice recorder

🔥 Spotlight

Hinton’s New Views on AI Consciousness and Safety: Turing Award and Nobel Prize laureate Geoffrey Hinton stated at WAIC 2025 that current multimodal chatbots already possess consciousness, and emphasized that training AI to be “benevolent” and “intelligent” are distinct problems. He called for the establishment of global AI safety organizations to jointly research how to ensure highly intelligent AI acts for good, considering it the most crucial issue facing humanity. This view has sparked widespread discussion, challenging traditional understandings of AI consciousness and offering new collaborative approaches for AI governance. (Source: 量子位)

Hinton in Conversation with Zhou Bowen in Shanghai: Multimodal Chatbots Already Have Consciousness

Chinese GPU Manufacturer Moore Threads Achieves DeepSeek 100 tokens/s Inference Speed: Chinese GPU manufacturer Moore Threads announced that its GPUs achieved a DeepSeek model inference speed of 100 tokens/s, significantly surpassing similar foreign products. This breakthrough is attributed to its “AI Super Factory” concept, encompassing full-featured GPUs, MUSA unified system architecture, full-stack system software, KUAE computing clusters, and zero-interruption fault tolerance technology, aiming to provide stable, efficient, and general-purpose AI computing power, laying the foundation for large-scale AI model training and Agentic AI development. (Source: 量子位)

Domestic GPUs Running Full-Scale DeepSeek Can Now Reach 100 tokens/s!

Major Breakthrough in Non-Transformer Architecture Models: RockAI showcased its non-Transformer architecture Yan 2.0 Preview model at WAIC 2025, featuring offline intelligence and native memory capabilities. It is capable of autonomously learning new actions and processing multimodal inputs without network connectivity. The model aims to allow AI to “be born and grow” directly on devices, achieving lower computing power reliance and continuous evolution. It has been successfully deployed and commercially implemented on various edge devices and is considered one of the key paths towards AGI. (Source: 量子位)

The King of Deployed Non-Transformer Architectures Surfaces at WAIC in Shanghai with Offline Intelligence and Native Memory

Deep Integration of AI and Mathematics: WAIC Forum Highlights: WAIC 2025 hosted a high-level forum titled “Mathematical Boundaries and Foundational Reconstruction of Artificial Intelligence,” attracting top mathematicians, including Fields Medalists. The forum showcased breakthroughs in AI models solving IMO math problems on-site, such as Shanghai AI Lab’s Intern-IMO system successfully cracking the first Olympiad problem. Discussions focused on how AI is reshaping mathematical research, from early mechanical verification to deep learning-driven discovery of patterns and generation of conjectures, emphasizing the potential of human-machine collaboration in solving complex mathematical problems. (Source: 量子位)

When AI Meets Mathematics in Shanghai: The Intelligence Revolution Behind WAIC 2025

DeepMind Aeneas AI Model Achieves Breakthrough in Historical Studies: DeepMind released the Aeneas AI model, giving historians a new tool for studying ancient inscriptions that can accelerate and broaden their understanding of history. The release demonstrates AI’s potential in humanities applications. (Source: demishassabis)

Alibaba’s Qwen3 Series Models Win Three Crowns in One Week: Alibaba’s Tongyi Qianwen team recently open-sourced three major Qwen3 models, securing global open-source SOTA results among foundation models, coding models (Qwen3-Coder), and reasoning models. Notably, Qwen3-Coder surpassed GPT-4.1 and Claude 4 in coding and Agent tool-calling capabilities, topping the HuggingFace overall leaderboard. The Qwen3 reasoning model rivals Gemini-2.5 Pro and o4-mini in core capabilities such as knowledge, logical reasoning, and programming. These results strengthen Qwen’s position as a leading open-source model family and have drawn widespread attention from the global AI community. (Source: 量子位, 量子位, TheTuringPost, Alibaba_Qwen)

Alibaba’s Qwen3 Reasoning Model Gets a Major Update, Rivaling Gemini-2.5 Pro and o4-mini

Tencent Open-Sources Interactive 3D World Model Hunyuan3D World Model 1.0: Tencent has released and open-sourced Hunyuan3D World Model 1.0, allowing users to generate high-quality, diverse, immersive, explorable, and interactive 3D scenes within minutes using just a text prompt or an image. The model employs semantic hierarchical 3D scene representation and generation algorithms, intelligently separating foreground from background, and ground from sky. It aims to revolutionize game development, VR, and digital content creation workflows, and is the industry’s first open-source 3D world generation model. (Source: op7418, ImazAngel)

Alibaba WAN 2.2 Cinematic Creative Model to Be Released: Alibaba’s WAN team announced that WAN 2.2, a cinematic creative model, will be open-sourced on July 28. This version features significant improvements in generation quality, motion coherence, and processing efficiency, and supports 1080p output. It introduces VACE 2.0 technology, offering trajectory control, subject locking, and background stabilization. It also integrates special effects such as fire, smoke, and global illumination, and optimizes the LoRA training process. The release is expected to advance AI applications in film and creative fields. (Source: Alibaba_Wan, Reddit r/LocalLLaMA)

Qianli Technology, StepFun, Geely Release Next-Gen Smart Cockpit Agent OS: At WAIC 2025, Qianli Technology, StepFun, and Geely Auto Group jointly released a preview version of their next-generation smart cockpit Agent OS, natively designed for AI Agents. This system features multimodal hyper-natural interaction, integrated edge-cloud memory, human-machine co-driving based on fully integrated maps, and a “third living space” concept. It aims to evolve the cockpit from a “tool” into a “partner,” providing a more natural, human-like, and emotional interaction experience. (Source: 量子位)

Qianli Technology Teams Up with StepFun and Geely to Release Next-Generation Smart Cockpit Agent OS

Google Photos Adds AI “Remix” and Video Conversion Features: Google Photos is integrating more AI features, allowing users to “remix” photos in different styles and convert photos into videos. These new functionalities aim to enhance user experience in photo editing and content creation, enabling average users to easily achieve creative expression and further popularize AI in everyday image processing. (Source: Ronald_vanLoon)

DeepSeek Model Garners Attention in AI Community: The DeepSeek model has attracted widespread attention due to its outstanding performance and innovativeness in the AI field. It has demonstrated strong capabilities in various benchmarks, particularly excelling in code generation and mathematical reasoning, and is considered a leader among open-source models, pushing the boundaries of AI technology. (Source: Ronald_vanLoon)

SmallThinker: On-Device MoE Language Model That Runs Without a GPU: Shanghai Jiao Tong University and Zenergize AI have jointly released SmallThinker, an MoE language model that runs on devices without a GPU. The model comes in 4B and 21B versions (activating 0.6B and 3B parameters respectively) and achieves 30 tokens/s on an i9 CPU. The 21B model can even run on a $100 RK3588 board, significantly lowering the hardware barrier for local AI deployment. (Source: multimodalart, Reddit r/LocalLLaMA)
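
The gap between total and activated parameters comes from sparse expert routing: for each token, a router selects only a few experts to run. The sketch below is a generic top-k router, not SmallThinker's actual routing scheme (which the item does not describe). The toy experts, the router scores, and `k=2` are all hypothetical values chosen for illustration.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their gate weights."""
    chosen = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:k]
    top = max(router_logits[i] for i in chosen)
    exp = {i: math.exp(router_logits[i] - top) for i in chosen}
    total = sum(exp.values())
    return {i: e / total for i, e in exp.items()}

def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by gate weight."""
    gates = top_k_route(router_logits, k)
    output = sum(weight * experts[i](x) for i, weight in gates.items())
    return output, sorted(gates)

# Eight toy "experts"; each scalar function stands in for a feed-forward block.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for a single token.
router_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

output, used_experts = moe_forward(1.0, experts, router_logits, k=2)

# Ratio of activated to total parameters, using the figures quoted in the item.
active_ratio_4b = 0.6 / 4
active_ratio_21b = 3 / 21
```

Only two of the eight toy experts are evaluated for this token. With the quoted figures, roughly 15% of the 4B model's parameters and about 14% of the 21B model's are active per token.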

Chinese Academy of Sciences Releases Panshi Scientific Foundation Model: The Chinese Academy of Sciences has released the Panshi scientific foundation model, available in 8B, 32B, and 671B versions under the Apache 2.0 open-source license. The model is trained on scientific data and laws across mathematics, physics, chemistry, and biology, supporting over 300 tools and more than 170 million papers, aiming to promote AI applications in scientific research. (Source: Teknium1)

Amazon Q AI Extension Exposed for Security Vulnerability: A security vulnerability was discovered in an Amazon Q AI extension that, when prompted, would execute an instruction to delete all data, and the affected version was actually deployed. This highlights the security risks of AI systems in real-world applications, their sensitivity to prompts, and the importance of rigorous security audits before deployment. (Source: Reddit r/artificial)

US Government Considers Using AI Tools to Streamline Federal Regulations: The US government is reportedly considering using AI tools to create a “delete list” for federal regulations, aiming to streamline or eliminate some existing rules. This move could improve government efficiency but also raises discussions about AI’s role in policymaking, as well as potential biases and transparency issues. (Source: Reddit r/artificial)

🧰 Tools

Lovart: Top AI Design Agent Launches ChatCanvas Feature: The official version of Lovart has been launched, introducing the “ChatCanvas” feature, hailed as a “Figma+Notion+ChatGPT” variant with visual understanding. It allows users to perform “secondary creation,” batch modifications, multi-image fusion, and even convert images to video on a canvas via natural language instructions, all while maintaining high controllability. Lovart aims to automate the entire design process, providing a creative system with memory and context, transforming software user experience (UX) into an Agent-centric Agent Experience (AX). (Source: 量子位, omarsar0)

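After 800,000 People Queued for Invite Codes, Lovart’s Upgraded Features Are Now Open to Everyone! A Top Design Agent Indeed, It Was a Runaway Hit on Day One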

Mobvoi TicNote AI Recorder: Your Portable AI Thinking Partner: Mobvoi showcased its Agentic AI smart hardware, the TicNote AI Recorder, at WAIC 2025, featuring a built-in “Shadow AI” function. TicNote offers a path of “AI recording with memory + proactive insights + proactive analysis + collaborative creation,” supporting intelligent summarization, mind mapping, and in-depth research report generation for various scenarios like meetings and phone calls. It also includes project management and information push capabilities, aiming to be the user’s portable AI thinking partner. (Source: 量子位)

Combining Software and Hardware to Weather Market Cycles, Mobvoi Brings Its TicNote Art Exhibition to WAIC 2025

Runway Aleph: Contextual Video Model Achieves Multi-Task Visual Generation: Runway has launched Aleph, its most advanced contextual video model, which sets new standards in multi-task visual generation. Users can add different camera movements, reframe scenes, animate subjects in various ways, and even handle complex motions and moving objects via text instructions, achieving a high degree of control over video content and greatly expanding video creation possibilities. (Source: c_valenzuelab, c_valenzuelab)

Questie.ai: AI Gaming Companion with Role-Playing, Voice Chat: Questie.ai has launched an AI gaming companion that allows users to create personalized AI partners capable of role-playing, voice chatting, spectating screens, and even saving game memories. This application aims to provide players with a more immersive and interactive gaming experience, expanding the boundaries of AI applications in the entertainment sector. (Source: Reddit r/ChatGPT)

ChatGPT Agent Masters Cookie Clicker Game: A Reddit user demonstrated how a ChatGPT Agent successfully played the Cookie Clicker game, automating clicks and strategizing to advance game progress. This case showcases the potential of AI Agents in simulating human behavior and performing repetitive tasks, foreshadowing widespread future applications of AI in automating daily digital tasks. (Source: Reddit r/ChatGPT)

AI-Generated Short Film Agent: Achieving Cinematic Creation: A user successfully trained an AI agent to generate complete short films with a single click, utilizing Veo3 techniques such as JSON prompting, editing, and character consistency. This agent can create cinematic video content based on simple text prompts (e.g., “Bizarre Japanese shopping channel”), demonstrating AI’s powerful capabilities and potential in automating film production workflows. (Source: fabianstelzer)

Qdrant Cloud Inference Supports Multimodal Search: Qdrant Cloud Inference will launch multimodal search capabilities, supporting text and image embeddings as well as vector search through a single API. This will enable users to perform more flexible cross-modal data retrieval, enhancing search efficiency and accuracy, especially suitable for scenarios requiring the processing of complex unstructured data. (Source: qdrant_engine)

📚 Learning

“Paper-and-Pencil Exercises in Machine Learning” Free Practical Book: A free practical book titled “Paper-and-Pencil Exercises in Machine Learning” is recommended, containing exercises and detailed solutions on topics such as optimization, model-based learning, graphical models, and Monte Carlo integration. The book requires readers to have knowledge of machine learning theory and concepts, serving as a valuable resource for deeply understanding ML principles. (Source: TheTuringPost)

ACL 2025 Tutorial on Human-AI Collaboration: At the ACL 2025 conference, there will be a tutorial on Human-AI Collaboration, exploring how to choose AI collaborators and how to build them. This tutorial aims to guide researchers and developers in achieving efficient human-machine collaboration in scenarios where AI models and Agents augment human capabilities rather than replace them. (Source: stanfordnlp)

Physics of Language Models Code Released: Facebook Research has released the first phase of code for “Physics of Language Models,” providing all components necessary to pre-train powerful 8B base models, including Canon layers. This project aims to reveal the true limitations of LLM architectures through controlled synthetic pre-training environments and drive new paradigms in LLM design. (Source: eliebakouch)

LLM Time Perception Research: Mapping the Human Sense of Time: A study found that LLMs naturally construct a mental timeline centered on the year 2025 and compress time on a logarithmic scale the further it lies from that year, similar to how human senses perceive loudness and brightness (Weber-Fechner law). This suggests that LLMs exhibit human-like biases in time perception, and that a deeper understanding of their internal representations is needed to steer model reasoning. (Source: jpt401)
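
To make the logarithmic compression concrete, here is a minimal sketch of a Weber-Fechner-style mapping. The study's fitted function is not given in the item, so the form `k * log(1 + |year - 2025|)`, the `+ 1` offset, and the constant `k` are assumptions for illustration only.

```python
import math

REFERENCE_YEAR = 2025  # anchor year reported in the item

def perceived_distance(year, k=1.0):
    """Log-compressed distance from the reference year (assumed functional form)."""
    return k * math.log(1 + abs(year - REFERENCE_YEAR))

# The same one-year step, measured close to and far from the reference year.
near_gap = perceived_distance(2024) - perceived_distance(2025)
far_gap = perceived_distance(1925) - perceived_distance(1926)
```

Under this assumed form, a one-year step next to 2025 spans far more of the perceived scale than a one-year step a century away, which is the compression the item describes.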

RLHF (Reinforcement Learning from Human Feedback) Implementation in Notebooks: A GitHub project provides an implementation of RLHF (Reinforcement Learning from Human Feedback) in Notebooks. This offers developers and researchers a resource for practicing and learning RLHF, helping them better understand and apply this crucial technique for aligning large language models. (Source: Reddit r/MachineLearning)
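
For readers new to RLHF, one of its core pieces is the reward model, which is commonly trained on pairs of responses ranked by humans. The sketch below shows the widely used pairwise (Bradley-Terry style) preference loss. It is a generic illustration and not taken from the notebooks in question; the function name and the reward values are hypothetical.

```python
import math

def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry style loss: -log(sigmoid(reward_chosen - reward_rejected))."""
    margin = reward_chosen - reward_rejected
    # Equivalent to log(1 + exp(-margin)), written to avoid overflow for large |margin|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# Hypothetical reward-model scores for a preferred and a rejected response.
loss_correct_ranking = pairwise_preference_loss(2.0, -1.0)
loss_wrong_ranking = pairwise_preference_loss(-1.0, 2.0)
loss_tie = pairwise_preference_loss(0.5, 0.5)
```

The loss is small when the reward model scores the human-preferred response higher, equals log 2 when the two scores tie, and grows when the ranking is inverted.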

9 New Policy Optimization Techniques: The discussion mentioned 9 new policy optimization techniques, including GSPO, LAPO, HBPO, etc. These techniques aim to improve the stability, efficiency, and performance of reinforcement learning algorithms, which are crucial for training large language models and Agent systems, providing new directions and tools for AI research. (Source: TheTuringPost)

Visual Explanation of LLM KV Cache Mechanism: A visual explanation of the KV cache mechanism in LLMs was shared, which is crucial for understanding how large language models optimize performance during inference. KV cache reduces redundant computations by storing key-value pairs from attention calculations, thereby accelerating the generation process, and is a key optimization technique in LLMs. (Source: ethanCaballero)
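
The saving can be seen in a toy single-head attention decoder. This is a minimal pure-Python sketch rather than any framework's implementation; the dimensions, random weights, and function names are illustrative assumptions.

```python
import math
import random

def matvec(x, W):
    """Project vector x with matrix W, i.e. compute x @ W."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def attend(q, keys, values):
    """Scaled dot-product attention of one query over the given keys and values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))]

def decode_with_cache(tokens, Wq, Wk, Wv):
    """Project only the newest token at each step and reuse cached keys/values."""
    k_cache, v_cache, outputs, projections = [], [], [], 0
    for x in tokens:
        k_cache.append(matvec(x, Wk))
        v_cache.append(matvec(x, Wv))
        projections += 1
        outputs.append(attend(matvec(x, Wq), k_cache, v_cache))
    return outputs, projections

def decode_without_cache(tokens, Wq, Wk, Wv):
    """Re-project the whole prefix at every step (the redundant work)."""
    outputs, projections = [], 0
    for t in range(1, len(tokens) + 1):
        keys = [matvec(x, Wk) for x in tokens[:t]]
        values = [matvec(x, Wv) for x in tokens[:t]]
        projections += t
        outputs.append(attend(matvec(tokens[t - 1], Wq), keys, values))
    return outputs, projections

random.seed(0)
d, n = 4, 6  # toy embedding size and sequence length

def rand_matrix():
    return [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

Wq, Wk, Wv = rand_matrix(), rand_matrix(), rand_matrix()
tokens = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]

cached_out, cached_proj = decode_with_cache(tokens, Wq, Wk, Wv)
full_out, full_proj = decode_without_cache(tokens, Wq, Wk, Wv)
```

Both paths produce the same attention outputs, but for six tokens the cached version performs 6 key/value projections where the uncached version performs 21. That is linear versus quadratic growth in sequence length.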

Flux Model LoRA Inference Optimization Techniques: HuggingFace shared various techniques for LoRA inference optimization for the Flux image generation model, including using torch.compile, Flash Attention 3, and dynamic FP8 weight quantization. These methods aim to accelerate LoRA model inference speed, achieving at least a 2x speedup even on consumer-grade GPUs, which is significant for the widespread application of LoRA models. (Source: huggingface)

💼 Business

AI’s Impact on the Job Market and Future Trends: Social media widely discussed AI’s impact on the job market, including AI-induced job displacement, workers’ willingness to retrain, and reduced stigma around unemployment. Some views suggest AI will replace most jobs, potentially leading to societal collapse or pushing for Universal Basic Income (UBI) implementation. (Source: Reddit r/ArtificialInteligence, Reddit r/ArtificialInteligence, Ronald_vanLoon, JimDMiller)

Claude Opus Pricing Strategy and Market Positioning: Social media discussed the high pricing of the Claude Opus model, suggesting that Anthropic might not intend for it to be widely used by the general public, but rather focus on enterprise-level markets and research. The high price is seen as a market strategy aimed at guiding users to choose more economical models based on task requirements and providing funding for Anthropic’s R&D. (Source: Reddit r/ClaudeAI)

Future Vision for AI Advertising Models: Discussions predict the emergence of advertising within AI, potentially appearing as highly relevant, user-welcomed “smart recommendations,” and even replacing traditional online shopping models. AI advertising will be a necessary means for many consumer-grade AI applications to cover compute costs, possibly through AI models generating images and embedding brand logos. (Source: fabianstelzer)

🌟 Community

ChatGPT Data Privacy and Conversation Retention Controversy: The Reddit community discussed the issue of ChatGPT conversation data being “permanently retained,” raising user concerns about privacy and data security. Despite relevant laws in Europe, users found that even after deleting conversations and memories, AI might still reference sensitive information. This highlights the need for transparency in AI service providers’ data policies and users’ concerns about control over their personal data. (Source: Reddit r/ChatGPT, Reddit r/LocalLLaMA, Reddit r/LocalLLaMA)

LLM Utility in Coding and Workflow Challenges: Social media discussed the utility of LLMs in software engineering coding. Some argue that engineers who don’t find LLMs useful might have formed their opinions before Claude Code’s emergence, or use niche languages/frameworks, or deal with large existing codebases. This reflects the difficulty of integrating AI tools with existing workflows and the “habit change” barrier that new products need to overcome for widespread adoption. (Source: matanSF)

Understanding and Applying Claude Code’s Sub-Agent Functionality: The Reddit community discussed Claude Code’s sub-agent functionality, with users expressing confusion about the significance of each sub-agent having an independent context window. Experienced users explained that sub-agents, through system prompts and project-specific configurations, can focus on specific aspects of the codebase and report back to the main thread collaboratively, thereby improving efficiency and clarity for complex projects. (Source: Reddit r/ClaudeAI)

AI Model “Hallucinations” and Data Quality Issues: Social media discussed “hallucination” issues in AI models on specific tasks, such as an image recognition model identifying women as birds, or a model giving an incorrect answer to a math problem only to self-correct later. The discussion cited a 20-30% error rate in dataset labeling, underscoring the decisive impact of data quality on AI model performance and pointing to AI’s limitations in deep logical understanding. (Source: vikhyatk, Reddit r/ArtificialInteligence)

AI Agent Prompt Engineering Challenges: Social media discussed the difficulties of AI Agent prompt engineering, especially concerning how to guide Agents to use tools, acquire context, and avoid unnecessary questioning of users. Users generally reported that Agents tend to ask too many questions, which increases interaction complexity, requiring more refined prompting strategies to enhance Agent autonomy and efficiency. (Source: cto_junior, cto_junior)

AI’s Assistance and Limitations in Doctor Diagnoses: A user shared ChatGPT’s limitations in medical consultation, such as failing to proactively mention drug side effects. This indicates that while AI excels in certain areas, complex, personalized medical contexts still need human expertise for supplementation and verification; AI currently serves more as an auxiliary tool than a replacement. Furthermore, AI applications in healthcare management are seen as positive use cases, but there are also concerns that insurance companies might upgrade their own systems in response and offset AI’s efficiency gains. (Source: JimDMiller, Reddit r/ArtificialInteligence)

Discussion on AI’s Long-Term Societal Impact: The community discussed the long-term impact of AI, including whether it is overhyped or poses potential dangers. It is generally believed that AI is developing rapidly and is revolutionary, but its ultimate direction remains uncertain. It is suggested that people should mentally prepare for the upcoming changes and focus on current life, as the impact of AI is a challenge faced by all humanity. (Source: Reddit r/ArtificialInteligence, shuchaobi)

Bilibili Releases TOP30 AI Applications List Popular Among Young People: Bilibili released its “TOP30 AI Applications Most Popular Among Young People” list at WAIC 2025, based on its internal big data. DeepSeek, Quark, Doubao, Tencent Yuanbao, and Kimi ranked in the top five. This indicates that Bilibili has become an important platform for the AI content ecosystem, with over 140 million users watching AI content monthly, more than 80% of whom are born after 1995, showcasing AI’s popularity and influence among younger demographics. (Source: 量子位)

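Bilibili Appears at the 2025 World Artificial Intelligence Conference and Releases the TOP30 AI Applications Most Followed by Young People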

Are 70B Parameter Models Obsolete? Discussion on LLM Size Trends: The Reddit community discussed whether 70B-parameter LLMs are becoming “obsolete” and whether the MoE architecture is the new trend. Some argue that 70B models are too large for consumer-grade hardware and not efficient enough for enterprise deployment, suggesting a future shift towards smaller dense models or larger MoE models. This reflects the ongoing trade-offs between efficiency, cost, and hardware compatibility in AI model development. (Source: Reddit r/LocalLLaMA)

💡 Other

Discussion on Hot AI Terms: The community discussed the increasing number of hot terms in the AI field and which ones are worth paying attention to or are overhyped. This reflects the rapid development of the AI industry, leading to a constant emergence of new concepts and technical terms, as well as the community’s interest in discerning truly valuable trends. (Source: Reddit r/ArtificialInteligence)

AI-Driven Decision Making Reshaping Business Strategy: AI-driven intelligent agents are reshaping business strategies by providing data-driven insights and automating decision-making processes, helping enterprises enhance efficiency and competitiveness. This foreshadows AI becoming an indispensable part of core corporate decision-making layers. (Source: Ronald_vanLoon)

AI Applications Across Various Industries: Gartner points out the broad application potential of generative AI across various industries, indicating that AI technology is penetrating from general capabilities into vertical industries, providing solutions for innovation and efficiency improvement in different sectors. (Source: Ronald_vanLoon)
