微调 DeepSeek R1 (推理模型)
DeepSeek 颠覆了 AI 领域,通过推出一系列新的高级推理模型挑战了 OpenAI 的主导地位。最好的部分是什么?这些模型完全免费使用,没有任何限制,每个人都可以使用它们。您可以在下方观看我们的视频教程,了解如何微调 DeepSeek。
在本教程中,我们将在 Hugging Face 的 Medical Chain-of-Thought Dataset 上微调模型。这个提炼的 DeepSeek-R1 模型是通过对 DeepSeek-R1 生成的数据微调 Llama 3.1 8B 模型而创建的。它展示了与原始模型类似的推理能力。DeepSeek-R1-Distill-Llama-8B
如果您不熟悉 LLM 和微调,我强烈建议您参加 Python 中的 LLM 简介课程。
DeepSeek R1 简介
中国 AI 公司 DeepSeek AI 开源了其第一代推理模型 DeepSeek-R1 和 DeepSeek-R1-Zero,它们在数学、编码和逻辑等推理任务上的性能可与 OpenAI 的 o1 相媲美。您可以阅读我们的 DeepSeek R1 完整指南以了解更多信息。
深度搜索-R1-Zero
DeepSeek-R1-Zero 是第一个仅使用大规模强化学习 (RL) 而不是监督微调 (SFT) 作为初始步骤进行训练的开源模型。这种方法使模型能够独立探索思维链 (CoT) 推理,解决复杂问题,并迭代优化其输出。但是,它带来了重复推理步骤、可读性差和语言混合等挑战,这些挑战可能会影响其清晰度和可用性。
深度搜索-R1
DeepSeek-R1 的推出是为了克服 DeepSeek-R1-Zero 的局限性,通过在强化学习之前整合冷启动数据,为推理和非推理任务提供坚实的基础。
这种多阶段训练使模型能够在数学、代码和推理基准方面实现与 OpenAI-o1 相当的最新性能,同时提高其输出的可读性和连贯性。
DeepSeek 蒸馏
除了需要大量计算能力和内存才能运行的大型语言模型外,DeepSeek 还引入了蒸馏模型。这些更小、更高效的模型已经证明,它们仍然可以实现卓越的推理性能。
这些模型从 1.5B 到 70B 参数不等,保留了强大的推理能力,DeepSeek-R1-Distill-Qwen-32B 在多个基准测试中都优于 OpenAI-o1-mini。
较小的模型继承了较大模型的推理模式,展示了蒸馏过程的有效性。
资料来源: deepseek-ai/DeepSeek-R1
阅读 DeepSeek-R1:功能、o1比较、蒸馏模型等博客,了解其关键功能、开发过程、蒸馏模型、访问、定价以及与OpenAI o1的比较。
微调 DeepSeek R1:分步指南
要微调 DeepSeek R1 模型,您可以按照以下步骤作:
1. 设置
在这个项目中,我们使用 Kaggle 作为我们的 Cloud IDE,因为它提供了对 GPU 的免费访问,这些 GPU 通常比 Google Colab 中提供的 GPU 更强大。首先,启动一个新的Kaggle笔记本,并将你的Hugging Face代币和Weights & Biases代币添加为秘密。
您可以通过导航到 Kaggle 笔记本界面中的选项卡并选择选项来添加密钥。Add-onsSecrets
设置 secret 后,安装 Python 包。Unsloth 是一个开源框架,旨在使微调大型语言模型 (LLM) 的速度提高 2 倍,并且内存效率更高。unsloth
阅读我们的 Unsloth 指南:优化和加速 LLM 微调,了解 Unsloth 的主要特性、各种功能以及如何优化您的微调工作流程。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> %%capture !pip install unsloth !pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
使用我们从 Kaggle Secrets 中安全提取的 Hugging Face API 登录到 Hugging Face CLI。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> from huggingface_hub import login from kaggle_secrets import UserSecretsClient user_secrets = UserSecretsClient() hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN") login(hf_token)<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
使用您的API密钥登录Weights & Biases(),并创建一个新项目来跟踪实验和微调进度。wandb
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> import wandb wb_token = user_secrets.get_secret("wandb") wandb.login(key=wb_token) run = wandb.init( project='Fine-tune-DeepSeek-R1-Distill-Llama-8B on Medical COT Dataset', job_type="training", anonymous="allow" )<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
2. 加载模型和分词器
对于此项目,我们将加载 DeepSeek-R1-Distill-Llama-8B 的 Unsloth 版本。此外,我们将以 4 位量化加载模型,以优化内存使用和性能。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> from unsloth import FastLanguageModel max_seq_length = 2048 dtype = None load_in_4bit = True model, tokenizer = FastLanguageModel.from_pretrained( model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B", max_seq_length = max_seq_length, dtype = dtype, load_in_4bit = load_in_4bit, token = hf_token, )<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
3. 微调前的模型推理
要为模型创建提示样式,我们将定义系统提示并包含用于问题和响应生成的占位符。该提示将指导模型逐步思考并提供合乎逻辑、准确的响应。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response. ### Instruction: You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. Please answer the following medical question. ### Question: {} ### Response: <think>{}"""<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
在此示例中,我们将向 提供一个医学问题,将其转换为标记,然后将标记传递给模型以生成响应。prompt_style
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?" FastLanguageModel.for_inference(model) inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda") outputs = model.generate( input_ids=inputs.input_ids, attention_mask=inputs.attention_mask, max_new_tokens=1200, use_cache=True, ) response = tokenizer.batch_decode(outputs) print(response[0].split("### Response:")[1])<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
即使没有微调,我们的模型也成功地生成了一条思路链,并在提供最终答案之前提供了推理。推理过程封装在 <think></think> 标签中。
那么,为什么我们仍然需要微调呢?推理过程虽然详细,但冗长且不简洁。此外,最终答案以项目符号格式呈现,这与我们要微调的数据集的结构和样式不同。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> <think> Okay, so I have this medical question to answer. Let me try to break it down. The patient is a 61-year-old woman with a history of involuntary urine loss during activities like coughing or sneezing, but she doesn't leak at night. She's had a gynecological exam and a Q-tip test. I need to figure out what cystometry would show regarding her residual volume and detrusor contractions. First, I should recall what I know about urinary incontinence. Involuntary urine loss during activities like coughing or sneezing makes me think of stress urinary incontinence. Stress incontinence typically happens when the urethral sphincter isn't strong enough to resist increased abdominal pressure from activities like coughing, laughing, or sneezing. This usually affects women, especially after childbirth when the pelvic muscles and ligaments are weakened. The Q-tip test is a common diagnostic tool for stress urinary incontinence. The test involves inserting a Q-tip catheter, which is a small balloon catheter, into the urethra. The catheter is connected to a pressure gauge. The patient is asked to cough, and the pressure reading is taken. If the pressure is above normal (like above 100 mmHg), it suggests that the urethral sphincter isn't closing properly, which is a sign of stress incontinence. So, based on the history and the Q-tip test, the diagnosis is likely stress urinary incontinence. Now, moving on to what cystometry would show. Cystometry, also known as a filling cystometry, is a diagnostic procedure where a catheter is inserted into the bladder, and the bladder is filled with a liquid to measure how much it can hold (residual volume) and how it responds to being filled (like during a cough or sneeze). This helps in assessing the capacity and compliance of the bladder. In a patient with stress incontinence, the bladder's capacity might be normal, but the sphincter's function is impaired. So, during the cystometry, the residual volume might be within normal limits because the bladder isn't overfilled. However, when the patient is asked to cough or perform a Valsalva maneuver, the detrusor muscle (the smooth muscle layer of the bladder) might not contract effectively, leading to an increase in intra-abdominal pressure, which might cause leakage. Wait, but detrusor contractions are usually associated with voiding. In stress incontinence, the issue isn't with the detrusor contractions but with the sphincter's inability to prevent leakage. So, during cystometry, the detrusor contractions would be normal because they are part of the normal voiding process. However, the problem is that the sphincter doesn't close properly, leading to leakage. So, putting it all together, the residual volume might be normal, but the detrusor contractions would be normal as well. The key finding would be the impaired sphincter function leading to incontinence, which is typically demonstrated during the Q-tip test and clinical history. Therefore, the cystometry would likely show normal residual volume and normal detrusor contractions, but the underlying issue is the sphincter's inability to prevent leakage. </think> Based on the provided information, the cystometry findings in this 61-year-old woman with stress urinary incontinence would likely demonstrate the following: 1. **Residual Volume**: The residual volume would be within normal limits. This is because the bladder's capacity is typically normal in cases of stress incontinence, where the primary issue lies with the sphincter function rather than the bladder's capacity. 2. **Detrusor Contractions**: The detrusor contractions would also be normal. These contractions are part of the normal voiding process and are not impaired in stress urinary incontinence. The issue is not with the detrusor muscle but with the sphincter's inability to prevent leakage. In summary, the key findings of the cystometry would be normal residual volume and normal detrusor contractions, highlighting the sphincteric defect as the underlying cause of the incontinence.<|end▁of▁sentence|><button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
4. 加载和处理数据集
我们将通过为 complex chain of thought 列添加第三个占位符来略微更改处理数据集的提示样式。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response. ### Instruction: You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. Please answer the following medical question. ### Question: {} ### Response: <think> {} </think> {}"""<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
编写 Python 函数,该函数将在数据集中创建一个 “text” 列,该列由训练提示样式组成。在占位符中填写问题、文本链和答案。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN def formatting_prompts_func(examples): inputs = examples["Question"] cots = examples["Complex_CoT"] outputs = examples["Response"] texts = [] for input, cot, output in zip(inputs, cots, outputs): text = train_prompt_style.format(input, cot, output) + EOS_TOKEN texts.append(text) return { "text": texts, }<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
我们将从 FreedomIntelligence/medical-o1-reasoning-SFT 加载前 500 个样本数据集,可在 Hugging Face 中心获取。之后,我们将使用该函数映射列。textformatting_prompts_func
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> from datasets import load_dataset dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", split = "train[0:500]",trust_remote_code=True) dataset = dataset.map(formatting_prompts_func, batched = True,) dataset["text"][0]<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
正如我们所看到的,文本列有系统提示符、说明、思路链和答案。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> "Below is an instruction that describes a task, paired with an input that provides further context. \nWrite a response that appropriately completes the request. \nBefore answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n### Instruction:\nYou are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. \nPlease answer the following medical question. \n\n### Question:\nA 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?\n\n### Response:\n<think>\nOkay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her abdominal pressure like coughing or sneezing. This sounds a lot like stress urinary incontinence to me. Now, it's interesting that she doesn't have any issues at night; she isn't experiencing leakage while sleeping. This likely means her bladder's ability to hold urine is fine when she isn't under physical stress. Hmm, that's a clue that we're dealing with something related to pressure rather than a bladder muscle problem. \n\nThe fact that she underwent a Q-tip test is intriguing too. This test is usually done to assess urethral mobility. In stress incontinence, a Q-tip might move significantly, showing urethral hypermobility. This kind of movement often means there's a weakness in the support structures that should help keep the urethra closed during increases in abdominal pressure. So, that's aligning well with stress incontinence.\n\nNow, let's think about what would happen during cystometry. Since stress incontinence isn't usually about sudden bladder contractions, I wouldn't expect to see involuntary detrusor contractions during this test. Her bladder isn't spasming or anything; it's more about the support structure failing under stress. Plus, she likely empties her bladder completely because stress incontinence doesn't typically involve incomplete emptying. So, her residual volume should be pretty normal. \n\nAll in all, it seems like if they do a cystometry on her, it will likely show a normal residual volume and no involuntary contractions. Yup, I think that makes sense given her symptoms and the typical presentations of stress urinary incontinence.\n</think>\nCystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying. Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder, you would not expect to see any involuntary detrusor contractions during the test.<|end▁of▁sentence|>"<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
5. 设置模型
使用目标模块,我们将通过将低排名的采用者添加到模型来设置模型。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> model = FastLanguageModel.get_peft_model( model, r=16, target_modules=[ "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", ], lora_alpha=16, lora_dropout=0, bias="none", use_gradient_checkpointing="unsloth", # True or "unsloth" for very long context random_state=3407, use_rslora=False, loftq_config=None, )<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
接下来,我们将通过提供模型、分词器、数据集和其他重要的训练参数来设置训练参数和训练器,这些参数将优化我们的微调过程。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> from trl import SFTTrainer from transformers import TrainingArguments from unsloth import is_bfloat16_supported trainer = SFTTrainer( model=model, tokenizer=tokenizer, train_dataset=dataset, dataset_text_field="text", max_seq_length=max_seq_length, dataset_num_proc=2, args=TrainingArguments( per_device_train_batch_size=2, gradient_accumulation_steps=4, # Use num_train_epochs = 1, warmup_ratio for full training runs! warmup_steps=5, max_steps=60, learning_rate=2e-4, fp16=not is_bfloat16_supported(), bf16=is_bfloat16_supported(), logging_steps=10, optim="adamw_8bit", weight_decay=0.01, lr_scheduler_type="linear", seed=3407, output_dir="outputs", ), )<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
6. 模型训练
执行以下命令开始训练。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> trainer_stats = trainer.train()<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
培训过程耗时 44 分钟。训练损失逐渐减少,这是模型性能更好的好兆头。
您可以通过登录网站并查看项目,在 Weights and bais 仪表板上查看填充模型评估报告。
如果您在运行上述代码时遇到问题,请参阅 Fine-tuning DeepSeek R1 (Reascccccccccccconing Model) Kaggle 笔记本。
7. 微调后的模型推理
为了比较结果,我们将向微调模型提出与之前相同的问题,以查看发生了什么变化。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?" FastLanguageModel.for_inference(model) # Unsloth has 2x faster inference! inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda") outputs = model.generate( input_ids=inputs.input_ids, attention_mask=inputs.attention_mask, max_new_tokens=1200, use_cache=True, ) response = tokenizer.batch_decode(outputs) print(response[0].split("### Response:")[1])<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
这要好得多,也更准确。思路链很直接,答案很简单,而且在一个段落中。微调成功了。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> <think> Okay, so let's think about this. We have a 61-year-old woman who's been dealing with involuntary urine loss during things like coughing or sneezing, but she's not leaking at night. That suggests she might have some kind of problem with her pelvic floor muscles or maybe her bladder. Now, she's got a gynecological exam and a Q-tip test. Let's break that down. The Q-tip test is usually used to check for urethral obstruction. If it's positive, that means there's something blocking the urethra, like a urethral stricture or something else. Given that she's had a positive Q-tip test, it's likely there's a urethral obstruction. That would mean her urethra is narrow, maybe due to a stricture or some kind of narrowing. So, her bladder can't empty properly during activities like coughing because the urethral obstruction is making it hard. Now, let's think about what happens when her bladder can't empty. If there's a urethral obstruction, the bladder is forced to hold more urine, increasing the residual volume. That's because her bladder doesn't empty completely. So, her residual volume is probably increased. Also, if her bladder can't empty properly, she might have increased detrusor contractions. These contractions are usually stronger to push the urine out. So, we expect her detrusor contractions to be increased. Putting it all together, if she has a urethral obstruction and a positive Q-tip test, we'd expect her cystometry results to show increased residual volume and increased detrusor contractions. That makes sense because of the obstruction and how her bladder is trying to compensate by contracting more. </think> Based on the findings of the gynecological exam and the positive Q-tip test, it is most likely that the cystometry would reveal increased residual volume and increased detrusor contractions. The positive Q-tip test indicates urethral obstruction, which would force the bladder to retain more urine, thereby increasing the residual volume. Additionally, the obstruction can lead to increased detrusor contractions as the bladder tries to compensate by contracting more to expel the urine.<|end▁of▁sentence|><button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
8. 在本地保存模型
现在,让我们在本地保存 adopter、full model 和 tokenizer,以便我们可以在其他项目中使用它们。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> new_model_local = "DeepSeek-R1-Medical-COT" model.save_pretrained(new_model_local) tokenizer.save_pretrained(new_model_local) model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
9. 将模型推送到 Hugging Face Hub
我们还会将 adopter、tokenizer 和 model 推送到 Hugging Face Hub,以便 AI 社区可以通过将其集成到他们的系统中来利用此模型。
<button class="code-copy-button" data-trackid="copy-code" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-width: 0px; border-style: initial; border-color: initial; border-radius: 50%; box-shadow: none; display: flex; height: 32px; -webkit-box-pack: center; justify-content: center; line-height: 1.5; padding: 0px; position: absolute; right: 16px; top: 16px; width: 32px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button> new_model_online = "kingabzpro/DeepSeek-R1-Medical-COT" model.push_to_hub(new_model_online) tokenizer.push_to_hub(new_model_online) model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")<button data-value="Explain code" class="ai-explain-code-button" data-trackid="ai-explain-code-button" data-testid="ai-explain-code-button" style="background-repeat: no-repeat; -webkit-box-align: center; align-items: center; border-style: solid; border-color: rgb(5, 25, 45); border-radius: 4px; display: flex; height: 32px; padding: 0px 8px;"><svg aria-hidden="true" height="16" width="16" viewBox="0 0 18 18" class="css-6su6fj"></svg></button><svg viewBox="0 0 152 36" xml:space="preserve" xmlns="http://www.w3.org/2000/svg" class="css-0"> </svg>
资料来源: kingabzpro/DeepSeek-R1-Medical-COT ·拥抱脸
学习之旅的下一步是将模型提供并部署到云中。您可以遵循如何使用 BentoML 部署 LLM 指南,该指南提供了使用 BentoML 和 vLLM 等工具高效且经济高效地部署大型语言模型的分步过程。
或者,如果您更喜欢在本地使用模型,则可以将其转换为 GGUF 格式并在您的计算机上运行它。为此,请查看 微调 Llama 3.2 并在本地使用它 指南,其中提供了本地使用的详细说明。
结论
AI 领域正在迅速变化。开源社区现在正在接管,挑战过去三年来统治 AI 领域的专有模型的主导地位。
开源大型语言模型 (LLM) 变得越来越好、更快、更高效,这使得在较低的计算和内存资源上对其进行微调变得比以往任何时候都更容易。
在本教程中,我们探索了 DeepSeek R1 推理模型,并学习了如何针对医疗问答任务微调其提炼版本。微调的推理模型不仅可以提高性能,还可以将其应用于医学、紧急服务和医疗保健等关键领域。
为了应对 DeepSeek R1 的推出,OpenAI 推出了两个强大的工具:OpenAI 的 o3,一种更高级的推理模型,以及 OpenAI 的 Operator AI 代理,由新的计算机使用代理 (CUA) 模型提供支持,它可以自主导航网站并执行任务。