New Alpaca 7B. The Stanford Institute for Human-Centered AI has released a new LLM: Alpaca 7B, based on Meta's LLaMA 7B. The elegant part: it uses OpenAI's API to generate a synthetic data set for supervised fine-tuning of a small LLM (7 to 11B parameters). Instruct-tune LLaMA w/ ChatGPT = Alpaca LLM.
LLaMA is available in the main version of Hugging Face Transformers; only the model weights have to be requested from Meta (w/ a specific form, smile).
The weight-conversion script is available from Hugging Face, the fine-tuning code from Stanford. Create your own Alpaca with your corporate-specific data set!
For minimal costs: a small model (around 10 billion parameters, instead of 175B or 540B) is much cheaper to fine-tune, and yes, here we use the "classical" fine-tuning method and update all weights of our LLM for superior performance.
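To make "alter all weights" concrete, here is a toy NumPy sketch of classical full fine-tuning on a single linear layer: every entry of the weight matrix receives a gradient update, nothing is frozen. Sizes, learning rate, and the one-example squared-error loss are purely illustrative, not Alpaca's actual training setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a "pretrained" layer: all weights are trainable.
W = rng.standard_normal((4, 8)) * 0.1
x = rng.standard_normal(8)          # one training input (illustrative)
target = np.ones(4)                 # desired output for that input
lr = 0.01

initial_loss = 0.5 * np.sum((W @ x - target) ** 2)
for _ in range(500):
    error = W @ x - target              # dL/dy for L = 0.5 * ||y - target||^2
    W -= lr * np.outer(error, x)        # dL/dW = outer(error, x): every weight moves

final_loss = 0.5 * np.sum((W @ x - target) ** 2)
assert final_loss < initial_loss        # full-parameter updates fit the example
```

The point of the sketch: in classical fine-tuning the optimizer state and gradients cover the entire weight matrix, which is exactly what drives GPU memory cost at the 7–11B scale.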
An alternative is to use Hugging Face PEFT or AdapterHub to apply adapter-tuning with frozen base weights, which further reduces GPU memory usage significantly (see my new videos on PEFT - LoRA).
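To see why adapter-tuning with frozen weights saves so much memory, here is a minimal NumPy sketch of the LoRA idea: the pretrained matrix W stays frozen, and only two small low-rank factors A and B are trained. Dimensions and rank are illustrative, and this is a from-scratch sketch of the concept, not the PEFT API.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 8

# Frozen pretrained weight matrix: never updated during adapter-tuning.
W = rng.standard_normal((d_out, d_in))

# LoRA adapters: B starts at zero, so the adapted layer initially
# behaves exactly like the frozen layer; only A and B are trained.
A = rng.standard_normal((rank, d_in)) * 0.01
B = np.zeros((d_out, rank))

def adapted_forward(x):
    """y = (W + B @ A) @ x, computed without materializing W + B @ A."""
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(adapted_forward(x), W @ x)   # identical at initialization

print(W.size, A.size + B.size)   # 262144 frozen vs. 8192 trainable parameters
```

With rank 8 on a 512x512 layer, the trainable parameter count drops from 262,144 to 8,192 — roughly 3% — and gradients plus optimizer state are only needed for A and B, which is where the GPU memory savings come from.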
Instruct-tune LLaMA on consumer hardware w/ LoRA:
My recommendation: great GitHub repo to explore:
https://github.com/tloen/alpaca-lora
The fine-tuning code for LLaMA-LoRA:
https://github.com/tloen/alpaca-lora/blob/main/finetune.py
Details on fine-tuning official Stanford Alpaca 7B:
https://github.com/huggingface/transformers/pull/21955
Literature:
Self-Instruct: Aligning Language Models with Self-Generated Instructions
https://arxiv.org/pdf/2212.10560.pdf
Alpaca: A Strong, Replicable Instruction-Following Model
https://crfm.stanford.edu/2023/03/13/alpaca.html
Official Stanford Alpaca GitHub for fine-tuning code:
https://github.com/tatsu-lab/stanford_alpaca#fine-tuning
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
@tatsu-lab/alpaca
Alpaca is a data set of 52,000 instruction-following examples:
The data fields are as follows:
A. instruction: describes the task the model should perform. Each of the 52K instructions is unique.
B. input: optional context or input for the task. For example, when the instruction is "Summarize the following article", the input is the article. Around 40% of the examples have an input.
C. output: the answer to the instruction as generated by text-davinci-003.
D. text: the instruction, input and output formatted with the prompt template used by the authors for fine-tuning their models.
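The text field above can be assembled from the other three with a small helper. The sketch below follows the Alpaca-style "### Instruction / ### Input / ### Response" layout with a separate no-input variant; treat the exact preamble wording as an approximation of the authors' template, not a verbatim copy.

```python
def format_example(instruction: str, inp: str, output: str) -> str:
    """Build the 'text' training field Alpaca-style.

    Examples with an input use the three-section template; examples
    without one drop the '### Input:' section entirely.
    """
    if inp:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{inp}\n\n"
            f"### Response:\n{output}"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{output}"
    )

# Roughly 40% of the 52K examples carry an input, like this one:
sample = format_example(
    "Summarize the following article", "Some article text here.", "A short summary."
)
```

During fine-tuning the model is trained on such formatted strings, so at inference time you prompt it with the same template up to "### Response:\n" and let it generate the rest.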
#ai
#naturallanguageprocessing
#finetune
#finetuning