AI Daily – 2026-01-09 (Morning)
AI training · DeepSeek R1 · Process Reward Model (PRM) · Reinforcement Learning (RL)