Add new example to fine-tune Llama-2-70B with LoRA #80
base: master
Conversation
es94129 left a comment:
Thanks for adding this, looks very cool!
Wondering why deepspeed is required, is it for the memory optimization?
# MAGIC
# MAGIC # Fine tune llama-2-70b with deepspeed
# MAGIC
# MAGIC [Llama 2](https://huggingface.co/meta-llama) is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. It is trained with 2T tokens and supports context length window upto 4K tokens. [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf) is the 7B pretrained model, converted for the Hugging Face Transformers format.
nit
Suggested change:
- # MAGIC [Llama 2](https://huggingface.co/meta-llama) is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. It is trained with 2T tokens and supports context length window upto 4K tokens. [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf) is the 7B pretrained model, converted for the Hugging Face Transformers format.
+ # MAGIC [Llama 2](https://huggingface.co/meta-llama) is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. It is trained with 2T tokens and supports context length window upto 4K tokens. [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf) is the 70B pretrained model, converted for the Hugging Face Transformers format.
deepspeed is used for multi-GPU training with LoRA.
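To make the deepspeed + LoRA interplay concrete, here is a minimal sketch of how a ZeRO config typically plugs into a Hugging Face Trainer run with a PEFT LoRA model. Only the model path, config path, and output dir are taken from this PR; the LoRA hyperparameters and training arguments below are illustrative assumptions, not the actual values in scripts/fine_tune_lora.py.

```python
# Illustrative sketch only -- not the PR's fine_tune_lora.py. DeepSpeed is enabled by
# pointing TrainingArguments at a ZeRO config; the deepspeed launcher then shards
# optimizer state and gradients across GPUs, while PEFT keeps only the small LoRA
# adapter matrices trainable on top of the frozen 70B base model.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

MODEL_PATH = "meta-llama/Llama-2-70b-hf"
CONFIG_PATH = "../../config/a10_config_zero2.json"   # ZeRO stage-2 config from the PR

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)

# Hypothetical LoRA settings; the script's actual values may differ.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="/local_disk0/output",
    per_device_train_batch_size=1,
    num_train_epochs=1,
    deepspeed=CONFIG_PATH,   # this flag is what hands the run over to deepspeed
)
# trainer = Trainer(model=model, args=training_args, train_dataset=..., tokenizer=tokenizer)
# trainer.train()
```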
Since 07 is used for AI gateway, maybe use other indices?
Sure. Let's design a proper ordering afterwards.
# MAGIC %sh
# MAGIC deepspeed \
# MAGIC --num_gpus 2 \
--num_gpus is probably not needed because deepspeed can use all the GPUs on the machine
Good point. Let me remove it.
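With the flag dropped, the launch cell would presumably reduce to something like this (the deepspeed launcher falls back to all GPUs visible on the node; the script path and output dir are copied from the hunk further down in this diff):

```
# MAGIC %sh
# MAGIC deepspeed \
# MAGIC   scripts/fine_tune_lora.py \
# MAGIC   --output_dir="/local_disk0/output"
```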
MODEL_PATH = 'meta-llama/Llama-2-70b-hf'
TOKENIZER_PATH = 'meta-llama/Llama-2-70b-hf'
DEFAULT_TRAINING_DATASET = "mosaicml/dolly_hhrlhf"
CONFIG_PATH = "../../config/a10_config_zero2.json"
Maybe rename the file to a100_...?
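For readers who have not opened the config file: a ZeRO stage-2 config of the kind the a10_config_zero2.json (or a100_..., per the comment above) name suggests usually looks roughly like the sketch below. The values are illustrative assumptions, not the actual contents of the file in this PR.

```python
# Rough shape of a typical DeepSpeed ZeRO stage-2 config when used through the
# Hugging Face Trainer ("auto" values are filled in from TrainingArguments).
# Illustrative only; the PR's actual JSON may differ.
import json

zero2_config = {
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 2,                                  # shard optimizer state + gradients
        "offload_optimizer": {"device": "cpu"},      # spill optimizer state to CPU RAM
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

with open("a100_config_zero2.json", "w") as f:       # renamed per the review comment
    json.dump(zero2_config, f, indent=2)
```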
# MAGIC deepspeed \
# MAGIC --num_gpus 2 \
# MAGIC scripts/fine_tune_lora.py \
# MAGIC --output_dir="/local_disk0/output"
Q: What is the difference between --output_dir and /local_disk0/final_model, is the latter just the LoRA weights?
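If the script follows the common PEFT pattern, --output_dir would hold the Trainer's checkpoints and training state, while /local_disk0/final_model would hold only the saved LoRA adapter. A sketch of that split, assuming a `trainer` wrapping a PEFT model and a `tokenizer` as in the earlier sketch:

```python
# Assumed pattern (not the actual script): the Trainer writes checkpoints, optimizer
# state and logs under --output_dir, while save_pretrained on a PEFT-wrapped model
# writes only the adapter weights plus adapter_config.json -- a small artifact even
# for a 70B base model.
trainer.train()                                              # checkpoints -> /local_disk0/output
trainer.model.save_pretrained("/local_disk0/final_model")    # LoRA adapter only
tokenizer.save_pretrained("/local_disk0/final_model")
```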
# COMMAND ----------

# MAGIC %sh
# MAGIC ls /local_disk0/final_model
Could you also add instructions or code for how to load this for inference?
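Not to answer for the author, but if the adapter was saved with PEFT's save_pretrained, loading it for inference would typically look something like the sketch below (paths, dtype, and generation settings are illustrative assumptions):

```python
# Load the frozen base model, then attach the saved LoRA adapter on top of it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-70b-hf"
ADAPTER_PATH = "/local_disk0/final_model"    # assumed location of the saved adapter

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",                       # spread the 70B weights across available GPUs
)
model = PeftModel.from_pretrained(base_model, ADAPTER_PATH)
model.eval()

inputs = tokenizer("What is machine learning?", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If serving latency matters, calling model.merge_and_unload() folds the adapter into the base weights so inference runs on a plain transformers model.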
Co-authored-by: Ying Chen <ying.chen@databricks.com>
Tested on: https://adb-7064161269814046.2.staging.azuredatabricks.net/?o=7064161269814046#notebook/94670986903573/command/94670986903574