Training the DeepSeek R1 Reasoning Model on Free Resources

Published 2025-02-26 14:40


DeepSeek has shaken up the AI field by releasing a series of advanced reasoning models that challenge OpenAI's dominance. Best of all, the models are openly released and free for anyone to use.


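In this tutorial, we fine-tune DeepSeek-R1-Distill-Llama-8B on a medical chain-of-thought dataset from Hugging Face. This distilled DeepSeek-R1 model was created by fine-tuning Llama 3.1 8B on data generated with DeepSeek-R1, and it shows reasoning abilities similar to those of the original model.

If you are new to LLMs and fine-tuning, I strongly recommend the Introduction to LLMs in Python course.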


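Introduction to DeepSeek R1

The Chinese AI company DeepSeek AI (深度求索) has open-sourced its first-generation reasoning models, DeepSeek-R1 and DeepSeek-R1-Zero, which rival OpenAI's o1 on reasoning tasks such as math, coding, and logic. See DeepSeek's official website for more details.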


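DeepSeek-R1-Zero

DeepSeek-R1-Zero is the first open-source model trained entirely with large-scale reinforcement learning rather than supervised fine-tuning. This lets the model explore chain-of-thought reasoning on its own, solve complex problems, and iteratively refine its outputs. It has drawbacks, though: it can repeat reasoning steps, produce output that is hard to read, and mix languages, all of which reduce its clarity and usefulness.

DeepSeek-R1

DeepSeek-R1 was introduced to address the shortcomings of DeepSeek-R1-Zero. It adds initial (cold-start) data before reinforcement learning, which gives the model a stronger foundation for both reasoning and non-reasoning tasks. This multi-stage training brings the model to a level comparable to OpenAI-o1 on math, code, and reasoning benchmarks while improving the readability and coherence of its output.

DeepSeek distillation

Besides the large models that need substantial compute and memory, DeepSeek also released a series of distilled models. These smaller, more efficient models range from 1.5B to 70B parameters while retaining strong reasoning performance. Notably, DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini on several benchmarks. The smaller models inherit the reasoning patterns of the larger model, which shows how effective distillation can be.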

Source: deepseek-ai/DeepSeek-R1

Read the blog post DeepSeek-R1: Features, o1 Comparison, Distilled Models & More to learn about its key features, development process, distilled models, access, pricing, and comparison with OpenAI o1.

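Resources required for fine-tuning

Model	GPU	CPU	RAM	Disk	Time
DeepSeek-R1-Distill-Llama-8B	2 x T4 (15 GB each)	4 cores	32 GB	200 GB	23 minutes

What? You think that spec is too high? Fine, follow along and I'll show you how to get it all for free!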

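Fine-tuning DeepSeek R1: step-by-step guide

To fine-tune the DeepSeek R1 model, follow these steps:

1. Setup

For this project we use Kaggle as our cloud IDE because it offers free GPU access, and those GPUs are often more powerful than the ones available in Google Colab. To start, launch a new Kaggle notebook and add your Hugging Face token and Weights & Biases token as secrets. See the Q&A at the end of this article for how to obtain the tokens.

You can add secrets by opening the Add-ons tab in the Kaggle notebook interface and selecting the Secrets option.

After setting up the secrets, install the unsloth Python package. Unsloth is an open-source framework designed to make fine-tuning large language models (LLMs) 2x faster and more memory-efficient.

Read the Unsloth Guide: Optimize and Speed Up LLM Fine-Tuning to learn about Unsloth's key features, its various functions, and how to optimize your fine-tuning workflow.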

!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

Log in to the Hugging Face CLI using the Hugging Face API token that we securely retrieve from Kaggle Secrets.

from huggingface_hub import login
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()


hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN")
login(hf_token)

Log in to Weights & Biases (wandb) with your API key and create a new project to track the experiment and the fine-tuning progress.

import wandb


wb_token = user_secrets.get_secret("wandb")


wandb.login(key=wb_token)
run = wandb.init(
    project='Fine-tune-DeepSeek-R1-Distill-Llama-8B on Medical COT Dataset', 
    job_type="training", 
    anonymous="allow"
)

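2. Loading the model and tokenizer

For this project, we load the Unsloth version of DeepSeek-R1-Distill-Llama-8B. We load the model with 4-bit quantization to reduce memory usage and improve performance.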

from unsloth import FastLanguageModel


max_seq_length = 2048 
dtype = None 
load_in_4bit = True




model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = hf_token, 
)
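
As a rough illustration of why 4-bit loading matters on a T4, the sketch below estimates the memory taken by the weights alone at different precisions. The parameter count (about 8 billion) is an approximation, and weight_memory_gib is a helper name made up for this example. Activations, the LoRA adapters, optimizer state, and quantization overhead are ignored, so real usage will be higher.

```python
# Back-of-the-envelope estimate of weight memory at different precisions.
# Assumption: ~8e9 parameters (approximate size of an "8B" model).

def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30

NUM_PARAMS = 8e9  # approximate, for illustration only

for label, bits in [("fp32", 32), ("fp16/bf16", 16), ("4-bit", 4)]:
    print(f"{label:>9}: {weight_memory_gib(NUM_PARAMS, bits):5.1f} GiB")
```

Under these assumptions the 16-bit weights alone come to roughly 15 GiB, about the whole memory of one T4, while the 4-bit weights need under 4 GiB.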


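3. Model inference before fine-tuning

To create a prompt style for the model, we define a system prompt with placeholders for the question and the generated response. The prompt guides the model to think step by step and give a logical, accurate answer.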

prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.


### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 


### Question:
{}


### Response:
<think>{}"""




# ===== Chinese version of the prompt (for the Chinese-language run; it replaces prompt_style if executed) =====


prompt_style = """以下是一條描述任務的指令，以及為其提供更多背景信息的輸入內容。請給出一個能恰當完成該請求的回復。在回答之前，仔細思考問題，并創建一個逐步的思路鏈，以確保回復符合邏輯且準確。


### 指令：
你是一位在臨床推理、診斷和治療計劃方面擁有高級知識的醫學專家。請回答以下醫學問題。


### 問題：
{}


### 回復：
<think>{}"""

In this example, we pass a medical question into prompt_style, convert the result to tokens, and then pass the tokens to the model to generate a response.
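
To see how the two {} placeholders are filled, here is a small stand-alone sketch. It uses a shortened template (demo_style, a name made up for this example) rather than the full prompt: the first placeholder receives the question and the second is left empty at inference time, so the prompt ends right after the opening <think> tag and the model continues from there.

```python
# Shortened stand-in for the prompt template; only the placeholders matter.
demo_style = """### Question:
{}

### Response:
<think>{}"""

question = "What does a Q-tip test assess?"  # illustrative question

# str.format fills the placeholders in order: the question first, then the
# (empty) start of the response.
prompt = demo_style.format(question, "")
print(prompt)
```

The actual inference code does the same thing with the full prompt_style and the real question.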

question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"




FastLanguageModel.for_inference(model) 
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")


outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])




# Chinese version of the question (for the Chinese-language run):
# 一位 61 歲的女性，有長期在咳嗽或打噴嚏等活動時不自主漏尿但夜間無漏尿的病史，進行了婦科檢查和棉簽試驗。基于這些發現，膀胱測壓最有可能揭示她的殘余尿量和逼尿肌收縮情況如何？

[Screenshots: English output; Chinese output]

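Even without fine-tuning, the model generated a chain of thought and reasoned before giving its final answer. The reasoning is enclosed in <think></think> tags.

So why do we still need fine-tuning? The reasoning is detailed but long-winded rather than concise. In addition, the final answer is presented as a list, which deviates from the structure and style of the dataset we want to fine-tune on.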

<think>
Okay, so I have this medical question to answer. Let me try to break it down. The patient is a 61-year-old woman with a history of involuntary urine loss during activities like coughing or sneezing, but she doesn't leak at night. She's had a gynecological exam and a Q-tip test. I need to figure out what cystometry would show regarding her residual volume and detrusor contractions.


First, I should recall what I know about urinary incontinence. Involuntary urine loss during activities like coughing or sneezing makes me think of stress urinary incontinence. Stress incontinence typically happens when the urethral sphincter isn't strong enough to resist increased abdominal pressure from activities like coughing, laughing, or sneezing. This usually affects women, especially after childbirth when the pelvic muscles and ligaments are weakened.


The Q-tip test is a common diagnostic tool for stress urinary incontinence. The test involves inserting a Q-tip catheter, which is a small balloon catheter, into the urethra. The catheter is connected to a pressure gauge. The patient is asked to cough, and the pressure reading is taken. If the pressure is above normal (like above 100 mmHg), it suggests that the urethral sphincter isn't closing properly, which is a sign of stress incontinence.


So, based on the history and the Q-tip test, the diagnosis is likely stress urinary incontinence. Now, moving on to what cystometry would show. Cystometry, also known as a filling cystometry, is a diagnostic procedure where a catheter is inserted into the bladder, and the bladder is filled with a liquid to measure how much it can hold (residual volume) and how it responds to being filled (like during a cough or sneeze). This helps in assessing the capacity and compliance of the bladder.


In a patient with stress incontinence, the bladder's capacity might be normal, but the sphincter's function is impaired. So, during the cystometry, the residual volume might be within normal limits because the bladder isn't overfilled. However, when the patient is asked to cough or perform a Valsalva maneuver, the detrusor muscle (the smooth muscle layer of the bladder) might not contract effectively, leading to an increase in intra-abdominal pressure, which might cause leakage.


Wait, but detrusor contractions are usually associated with voiding. In stress incontinence, the issue isn't with the detrusor contractions but with the sphincter's inability to prevent leakage. So, during cystometry, the detrusor contractions would be normal because they are part of the normal voiding process. However, the problem is that the sphincter doesn't close properly, leading to leakage.


So, putting it all together, the residual volume might be normal, but the detrusor contractions would be normal as well. The key finding would be the impaired sphincter function leading to incontinence, which is typically demonstrated during the Q-tip test and clinical history. Therefore, the cystometry would likely show normal residual volume and normal detrusor contractions, but the underlying issue is the sphincter's inability to prevent leakage.
</think>


Based on the provided information, the cystometry findings in this 61-year-old woman with stress urinary incontinence would likely demonstrate the following:


1. **Residual Volume**: The residual volume would be within normal limits. This is because the bladder's capacity is typically normal in cases of stress incontinence, where the primary issue lies with the sphincter function rather than the bladder's capacity.


2. **Detrusor Contractions**: The detrusor contractions would also be normal. These contractions are part of the normal voiding process and are not impaired in stress urinary incontinence. The issue is not with the detrusor muscle but with the sphincter's inability to prevent leakage.


In summary, the key findings of the cystometry would be normal residual volume and normal detrusor contractions, highlighting the sphincteric defect as the underlying cause of the incontinence.<｜end▁of▁sentence｜>
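
Because the reasoning sits inside <think></think> tags, it can be separated from the final answer with ordinary string handling. The sketch below is one way to do that. split_reasoning is a hypothetical helper (not part of Unsloth or transformers), and it assumes the decoded text contains at most one <think> block and may end with the end-of-sentence marker shown above.

```python
import re

EOS_MARKER = "<｜end▁of▁sentence｜>"  # end-of-sentence token as printed above

def split_reasoning(text: str) -> tuple:
    """Split decoded model output into (reasoning, answer).

    If no complete <think>...</think> block is found, the reasoning is
    returned empty and the whole text is treated as the answer.
    """
    text = text.replace(EOS_MARKER, "")
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>\nStep 1. Step 2.\n</think>\nFinal answer." + EOS_MARKER
reasoning, answer = split_reasoning(sample)
print(reasoning)  # Step 1. Step 2.
print(answer)     # Final answer.
```

Applied to response[0].split("### Response:")[1] from the inference code, this would give the reasoning and the answer as separate strings.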

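4. Loading and processing the dataset

We slightly change the prompt style for processing the dataset by adding a third placeholder for the complex chain-of-thought column.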

train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.


### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 


### Question:
{}


### Response:
<think>
{}
</think>
{}"""




# ===== Chinese version of the training prompt (for the Chinese-language run) =====
train_prompt_style = """以下是一個描述任務的指令，與提供進一步上下文的輸入相配對。寫出一個適當完成請求的回應。在回答之前，仔細思考問題并創建一個逐步的思維鏈，以確保邏輯準確的回應。


### 指令：
您是醫學專家，在臨床推理、診斷和治療計劃方面擁有先進的知識。
請回答以下醫學問題。


### 問題：


{}
### 響應：
<think>
{}
</think>
{}"""

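Write a Python function that creates a "text" column in the dataset, built from the training prompt style. It fills the placeholders with the question, the chain of thought, and the answer.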
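
The function itself is not shown in the text, so here is a minimal sketch of what it could look like. It assumes the dataset columns are named Question, Complex_CoT, and Response, and that an end-of-sequence token is appended to every sample so the model learns where to stop. To stay runnable on its own, the sketch uses a shortened template and a placeholder EOS string; in the notebook you would use train_prompt_style and tokenizer.eos_token instead.

```python
# Stand-ins so the sketch runs on its own (assumptions, see the text above).
demo_train_style = """### Question:
{}

### Response:
<think>
{}
</think>
{}"""
EOS_TOKEN = "<eos>"  # in the notebook: tokenizer.eos_token

def formatting_prompts_func(examples: dict) -> dict:
    """Build the "text" column for a batch of examples (batched=True)."""
    texts = []
    for question, cot, response in zip(
        examples["Question"], examples["Complex_CoT"], examples["Response"]
    ):
        # Fill in question, chain of thought, and answer, then mark the end.
        texts.append(demo_train_style.format(question, cot, response) + EOS_TOKEN)
    return {"text": texts}

# Tiny made-up batch in the column layout that dataset.map passes in.
batch = {
    "Question": ["Q1"],
    "Complex_CoT": ["thinking about Q1"],
    "Response": ["A1"],
}
print(formatting_prompts_func(batch)["text"][0])
```

With batched=True, dataset.map passes each column as a list, which is why the function zips over the three columns.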

We load the first 500 samples of the medical chain-of-thought dataset from Hugging Face, then build the text column by mapping formatting_prompts_func over the dataset.

from datasets import load_dataset
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", split = "train[0:500]",trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched = True,)
dataset["text"][0]
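
Before training, it can be worth checking that the formatted samples have the expected shape. The following sketch is an optional check and not part of the original notebook. check_sample is a hypothetical helper, and the EOS string is passed in explicitly.

```python
def check_sample(text: str, eos_token: str) -> list:
    """Return a list of problems found in one formatted training sample."""
    problems = []
    if text.count("<think>") != 1 or text.count("</think>") != 1:
        problems.append("expected exactly one <think>...</think> pair")
    elif text.index("<think>") > text.index("</think>"):
        problems.append("</think> appears before <think>")
    if "### Response:" not in text:
        problems.append("missing '### Response:' header")
    if not text.rstrip().endswith(eos_token):
        problems.append("sample does not end with the EOS token")
    return problems

good = "### Response:\n<think>\nreasoning\n</think>\nanswer<eos>"
bad = "### Response:\n<think>\nreasoning\nanswer"
print(check_sample(good, "<eos>"))
print(check_sample(bad, "<eos>"))
```

In the notebook this could be called as check_sample(dataset["text"][0], tokenizer.eos_token).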


As we can see, the text column contains the system prompt, the instruction, the chain of thought, and the answer.

"Below is an instruction that describes a task, paired with an input that provides further context. \n
Write a response that appropriately completes the request. \n
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n
### Instruction:\n
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. \n
Please answer the following medical question. \n\n
### Question:\n
A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?\n\n
### Response:\n
<think>\n
Okay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her abdominal pressure like coughing or sneezing. This sounds a lot like stress urinary incontinence to me. Now, it's interesting that she doesn't have any issues at night; she isn't experiencing leakage while sleeping. This likely means her bladder's ability to hold urine is fine when she isn't under physical stress. Hmm, that's a clue that we're dealing with something related to pressure rather than a bladder muscle problem. \n\nThe fact that she underwent a Q-tip test is intriguing too. This test is usually done to assess urethral mobility. In stress incontinence, a Q-tip might move significantly, showing urethral hypermobility. This kind of movement often means there's a weakness in the support structures that should help keep the urethra closed during increases in abdominal pressure. So, that's aligning well with stress incontinence.\n\nNow, let's think about what would happen during cystometry. Since stress incontinence isn't usually about sudden bladder contractions, I wouldn't expect to see involuntary detrusor contractions during this test. Her bladder isn't spasming or anything; it's more about the support structure failing under stress. Plus, she likely empties her bladder completely because stress incontinence doesn't typically involve incomplete emptying. So, her residual volume should be pretty normal. \n\n
All in all, it seems like if they do a cystometry on her, it will likely show a normal residual volume and no involuntary contractions. Yup, I think that makes sense given her symptoms and the typical presentations of stress urinary incontinence.\n
</think>\n
Cystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying. Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder, you would not expect to see any involuntary detrusor contractions during the test.
<｜end▁of▁sentence｜>"

5. Setting up the model

Using the target modules, we set up the model by adding low-rank adapters (LoRA) to it.

model = FastLanguageModel.get_peft_model(
    model,
    r=16,  
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,  
    bias="none",  
    use_gradient_checkpointing="unsloth",  # True or "unsloth" for very long context
    random_state=3407,
    use_rslora=False,  
    loftq_config=None,
)
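
To get a feel for how small the adapters are, the sketch below counts the LoRA parameters added by this configuration. Each adapted linear layer with input size d_in and output size d_out gains two matrices, of shape r x d_in and d_out x r, which is r * (d_in + d_out) parameters. The layer sizes used here (hidden size 4096, key/value projection size 1024, MLP size 14336, 32 layers) are assumptions based on the published Llama 3.1 8B configuration; check model.config if in doubt.

```python
# LoRA adds r * (d_in + d_out) trainable parameters per adapted linear layer.
R = 16

# Assumed Llama 3.1 8B shapes as (d_in, d_out); verify against model.config.
HIDDEN, KV_DIM, MLP_DIM, NUM_LAYERS = 4096, 1024, 14336, 32
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}

def lora_params(r: int, shapes: dict, num_layers: int) -> int:
    """Count trainable LoRA parameters across all layers."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers

total = lora_params(R, MODULE_SHAPES, NUM_LAYERS)
print(f"{total:,} trainable LoRA parameters")
```

Under these assumptions the count is about 42 million, around half a percent of the roughly 8 billion base parameters.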

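Next, we set the training arguments and create the trainer, passing in the model, tokenizer, dataset, and the other training parameters that shape the fine-tuning run.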

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported


trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        # Use num_train_epochs = 1, warmup_ratio for full training runs!
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
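
The arithmetic behind these settings can be made explicit. The sketch below works out the effective batch size and how much of the 500-sample subset the 60 steps cover, and approximates a linear schedule with warmup. It assumes a single GPU, and linear_lr is a hand-written approximation of the scheduler's behaviour, not the library function.

```python
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 4
MAX_STEPS = 60
WARMUP_STEPS = 5
BASE_LR = 2e-4
NUM_SAMPLES = 500
NUM_GPUS = 1  # assumption: training runs on one GPU

# Samples consumed per optimizer step and over the whole run.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS
samples_seen = effective_batch * MAX_STEPS
print(effective_batch, samples_seen, samples_seen / NUM_SAMPLES)

def linear_lr(step: int) -> float:
    """Linear warmup to BASE_LR, then linear decay to zero at MAX_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, MAX_STEPS - step)
    return BASE_LR * remaining / (MAX_STEPS - WARMUP_STEPS)

for step in (0, 5, 30, 60):
    print(step, f"{linear_lr(step):.2e}")
```

With these values the run processes 480 samples, slightly less than one pass over the 500-sample subset.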


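If you hit the error AttributeError: _unwrapped_old_generate, update the libraries:

# Upgrade the libraries to the latest versions
!pip install --upgrade unsloth transformers


# Or roll back to specific versions (fill in the versions you need)
!pip install unsloth==x.y.z transformers==a.b.c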
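
Before upgrading or pinning, it helps to know which versions are actually installed. Here is a small sketch using only the standard library. The package list simply names the libraries used in this tutorial, and installed_versions is a made-up helper name.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_versions(packages: list) -> dict:
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

for name, ver in installed_versions(["unsloth", "transformers", "trl"]).items():
    print(f"{name}: {ver or 'not installed'}")
```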

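6. Training the model

Run the following command to start training.

trainer_stats = trainer.train()

Now wait while it trains. For some reason only one GPU was used, probably because multi-GPU training was not enabled; the script could be modified later to try that.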

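Training took 23 minutes. The training loss decreased steadily, which is a good sign that the model is learning.

Log in to wandb.ai and open the project to view the report for the run.

If you have trouble running the code above, refer to the Fine-Tuning DeepSeek R1 (Reasoning Model) Kaggle notebook.

7. Model inference after fine-tuning

To compare results, we ask the fine-tuned model the same question as before and see what has changed.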

question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"




FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")


outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])

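The output is now much closer to the dataset's style: the reasoning is shorter and more coherent, and the answer is direct and fits in a single paragraph. In that sense the fine-tuning worked. Note, however, that the content of this particular answer (increased residual volume and increased detrusor contractions) disagrees with the dataset's reference answer for the same question shown in section 4 (normal residual volume and no involuntary detrusor contractions), so the new output should not be read as more accurate.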

<think>
Okay, so let's think about this. We have a 61-year-old woman who's been dealing with involuntary urine loss during things like coughing or sneezing, but she's not leaking at night. That suggests she might have some kind of problem with her pelvic floor muscles or maybe her bladder.


Now, she's got a gynecological exam and a Q-tip test. Let's break that down. The Q-tip test is usually used to check for urethral obstruction. If it's positive, that means there's something blocking the urethra, like a urethral stricture or something else.


Given that she's had a positive Q-tip test, it's likely there's a urethral obstruction. That would mean her urethra is narrow, maybe due to a stricture or some kind of narrowing. So, her bladder can't empty properly during activities like coughing because the urethral obstruction is making it hard.


Now, let's think about what happens when her bladder can't empty. If there's a urethral obstruction, the bladder is forced to hold more urine, increasing the residual volume. That's because her bladder doesn't empty completely. So, her residual volume is probably increased.


Also, if her bladder can't empty properly, she might have increased detrusor contractions. These contractions are usually stronger to push the urine out. So, we expect her detrusor contractions to be increased.


Putting it all together, if she has a urethral obstruction and a positive Q-tip test, we'd expect her cystometry results to show increased residual volume and increased detrusor contractions. That makes sense because of the obstruction and how her bladder is trying to compensate by contracting more.
</think>
Based on the findings of the gynecological exam and the positive Q-tip test, it is most likely that the cystometry would reveal increased residual volume and increased detrusor contractions. The positive Q-tip test indicates urethral obstruction, which would force the bladder to retain more urine, thereby increasing the residual volume. Additionally, the obstruction can lead to increased detrusor contractions as the bladder tries to compensate by contracting more to expel the urine.<｜end▁of▁sentence｜>

8. Saving the model locally

Now let's save the adapter, the full model, and the tokenizer locally so that we can use them in other projects.

new_model_local = "DeepSeek-R1-Medical-COT"
model.save_pretrained(new_model_local) 
tokenizer.save_pretrained(new_model_local)


model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)

9. Pushing the model to the Hugging Face Hub

We can also push the adapter, tokenizer, and model to the Hugging Face Hub so that the AI community can integrate this model into their own systems.

new_model_online = "skyxiaowang/DeepSeek-R1-Medical-COT"
model.push_to_hub(new_model_online)
tokenizer.push_to_hub(new_model_online)


model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")

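Note: to push to your own namespace, the HF token you provide must have write permission.

Wait for the upload to finish.

OK, the upload is complete. Log in to HF and you will see that the model is there.

The next step in your learning journey is to deploy the model to the cloud. You can follow the How to Deploy LLMs with BentoML guide, which provides a step-by-step process for deploying large language models efficiently and cost-effectively with tools such as BentoML and vLLM.

Alternatively, if you prefer to use the model locally, you can convert it to GGUF format and run it on your machine. For that, see Fine-Tuning Llama 3.2 and Using It Locally: A Step-by-Step Guide, which gives detailed instructions for local use.

When fine-tuning is done, remember to shut down the Kaggle session manually to save GPU quota.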

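Conclusion

Things are changing fast in AI. The open-source community is on the rise, challenging the dominance of the proprietary models that have ruled the field for the past three years. Open-source large language models (LLMs) are getting better, faster, and more efficient, which makes fine-tuning them on modest compute and memory easier than ever. In this tutorial we explored the DeepSeek R1 reasoning models and learned how to fine-tune a distilled version for medical question answering. A fine-tuned reasoning model can not only perform better but also be applied in critical fields such as medicine, emergency services, and healthcare.

In response to the launch of DeepSeek R1, OpenAI introduced two powerful tools: OpenAI's o3, a more advanced reasoning model, and OpenAI's Operator AI agent, powered by the new Computer-Using Agent (CUA) model, which can browse websites and carry out tasks autonomously. xAI launched Grok 3 with deep thinking, a large model trained on 200,000 GPUs and billed as outperforming all comparable open and closed models, but in my own tests it was underwhelming. The free tier allows only two questions a day, and the paid plan is steep at 30 USD per month. I felt my wallet and went right back to DeepSeek R1: free and good, what's not to love?

If writing the code step by step feels too slow, don't worry, I've prepared a ready-made script for you:

https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model

Aren't I good to you?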

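Q&A for beginners

1. How to get an HF token

Go to the Hugging Face website and log in to your account.

Click your avatar in the top right corner and select "Settings".

In the left-hand menu, select "Access Tokens".

Click "New token", give the token a name, and choose the permission. "read" is usually enough, but choose "write" if you plan to push the model to the Hub as in step 9. Then click "Generate a token" and copy the generated token.

2. How to get a Weights & Biases token

Go to the Weights & Biases website and log in to your account.

Click your avatar in the top right corner and select "Settings".

In the "API Keys" section, click "Generate" and copy the generated API key.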

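3. Using Kaggle

Adding secrets: in the notebook, open Add-ons and select Secrets, then add the Hugging Face and Weights & Biases tokens (see step 1).

Enabling the free GPU: select a GPU accelerator for the notebook; this walkthrough used the T4 x 2 option.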


[1] Introduction to LLMs in Python: https://www.datacamp.com/courses/introduction-to-llms-in-python

[2] Reinforcement Learning: An Introduction With Python Examples: https://www.datacamp.com/tutorial/reinforcement-learning-python-introduction

[3] Chain-of-Thought Prompting: https://www.datacamp.com/tutorial/chain-of-thought-prompting

[4] DeepSeek-R1: https://github.com/deepseek-ai/DeepSeek-R1

[5] DeepSeek-R1: Features, o1 Comparison, Distilled Models & More: https://www.datacamp.com/blog/deepseek-r1

[6] Weights & Biases (wandb) website: https://wandb.ai/home

[7] Kaggle: https://www.kaggle.com/

[8] Original article: https://www.datacamp.com/tutorial/fine-tuning-deepseek-r1-reasoning-model?utm_source=chatgpt.com

[9] Unsloth Guide: Optimize and Speed Up LLM Fine-Tuning: https://www.datacamp.com/tutorial/unsloth-guide-optimize-and-speed-up-llm-fine-tuning

[10] Base model on Hugging Face: https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B

[11] Kaggle usage guide: https://blog.csdn.net/weixin_42426841/article/details/143591586

[12] Medical chain-of-thought dataset: https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT?row=46

[13] Fine-Tuning DeepSeek R1 (Reasoning Model) Kaggle notebook: https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model

[14] How to Deploy LLMs with BentoML: https://www.datacamp.com/tutorial/deploy-llms-with-bentoml

[15] Fine-Tuning Llama 3.2 and Using It Locally: A Step-by-Step Guide: https://www.datacamp.com/tutorial/fine-tuning-llama-3-2

[16] Hugging Face website: https://huggingface.co/

[17] OpenAI's O3: Features, O1 Comparison, Release Date & More: https://www.datacamp.com/blog/o3-openai

[18] OpenAI's Operator: Examples, Use Cases, Competition & More: https://www.datacamp.com/blog/operator

[19] Ready-made script: https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model

[20] DeepSeek official website: https://www.deepseek.com/

This article is reposted from AIGC前沿技術追蹤. Author: 喜歡學習的小仙女
