<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Training on Peng Tan's AI Blog</title><link>https://c44db530.hobbytp-github-io.pages.dev/zh/tags/training/</link><description>A topical blog covering all areas of AI</description><atom:link href="https://c44db530.hobbytp-github-io.pages.dev/zh/tags/training/index.xml" rel="self" type="application/rss+xml"/><item><title>Reflect, Retry, Reward: A New Paradigm for Self-Evolution in Large Language Models</title><link>https://c44db530.hobbytp-github-io.pages.dev/zh/papers/reflect_retry_reward_rl_finetunning/</link><pubDate>Fri, 04 Jul 2025 22:30:00 +0800</pubDate><guid>https://c44db530.hobbytp-github-io.pages.dev/zh/papers/reflect_retry_reward_rl_finetunning/</guid><description>Reflect, Retry, Reward: A New Paradigm for Self-Evolution in Large Language Models</description></item><item><title>Fine-Tuning</title><link>https://c44db530.hobbytp-github-io.pages.dev/zh/training/finetuning/</link><pubDate>Wed, 26 Feb 2025 22:14:00 +0800</pubDate><guid>https://c44db530.hobbytp-github-io.pages.dev/zh/training/finetuning/</guid><description>This article covers common fine-tuning challenges and how to overcome them, and explains in detail how to fine-tune DeepSeek-R1 on a consumer-grade GPU using Unsloth.</description></item></channel></rss>