New Alpaca 7B. The Stanford Institute for Human-Centered AI has released a new LLM: Alpaca 7B, based on Meta's LLaMA 7B. The elegant part: it uses OpenAI's API to generate a synthetic data set for supervised fine-tuning of a small LLM (7 to 11B parameters). Instruct-tune LLaMA w/ ChatGPT = Alpaca LLM.
LLaMA is available in the main version of Hugging Face Transformers; only the model weights have to be requested from Meta (w/ a specific form, smile).
The weight-conversion script is available from Hugging Face, the fine-tuning code from Stanford. Create your own Alpaca with your corporate-specific data set!
For minimal costs: a small model (around 10 billion parameters, instead of 175B or 540B) is much cheaper to fine-tune, and yes, here we use the "classical" fine-tuning method and update all weights of our LLM for superior performance.
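To make "alter all weights" concrete, here is a toy NumPy sketch of classical full fine-tuning on a single linear layer: every entry of the weight matrix receives a gradient update, nothing is frozen. Sizes, learning rate, and the one-example squared-error loss are purely illustrative, not Alpaca's actual training setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a "pretrained" layer: all weights are trainable.
W = rng.standard_normal((4, 8)) * 0.1
x = rng.standard_normal(8)          # one training input (illustrative)
target = np.ones(4)                 # desired output for that input
lr = 0.01

initial_loss = 0.5 * np.sum((W @ x - target) ** 2)
for _ in range(500):
    error = W @ x - target              # dL/dy for L = 0.5 * ||y - target||^2
    W -= lr * np.outer(error, x)        # dL/dW = outer(error, x): every weight moves

final_loss = 0.5 * np.sum((W @ x - target) ** 2)
assert final_loss < initial_loss        # full-parameter updates fit the example
```

The point of the sketch: in classical fine-tuning the optimizer state and gradients cover the entire weight matrix, which is exactly what drives GPU memory cost at the 7–11B scale.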
An alternative is to use Hugging Face PEFT or AdapterHub to apply adapter-tuning with frozen base weights, which further reduces GPU memory usage significantly (see my new videos on PEFT - LoRA).
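To see why adapter-tuning with frozen weights saves so much memory, here is a minimal NumPy sketch of the LoRA idea: the pretrained matrix W stays frozen, and only two small low-rank factors A and B are trained. Dimensions and rank are illustrative, and this is a from-scratch sketch of the concept, not the PEFT API.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 8

# Frozen pretrained weight matrix: never updated during adapter-tuning.
W = rng.standard_normal((d_out, d_in))

# LoRA adapters: B starts at zero, so the adapted layer initially
# behaves exactly like the frozen layer; only A and B are trained.
A = rng.standard_normal((rank, d_in)) * 0.01
B = np.zeros((d_out, rank))

def adapted_forward(x):
    """y = (W + B @ A) @ x, computed without materializing W + B @ A."""
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(adapted_forward(x), W @ x)   # identical at initialization

print(W.size, A.size + B.size)   # 262144 frozen vs. 8192 trainable parameters
```

With rank 8 on a 512x512 layer, the trainable parameter count drops from 262,144 to 8,192 — roughly 3% — and gradients plus optimizer state are only needed for A and B, which is where the GPU memory savings come from.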
Instruct-tune LLaMA on consumer hardware w/ LoRA:
My recommendation: great GitHub repo to explore:
https://github.com/tloen/alpaca-lora
The fine-tuning code for LLaMA-LoRA:
https://github.com/tloen/alpaca-lora/blob/main/finetune.py
Details on fine-tuning official Stanford Alpaca 7B:
https://github.com/huggingface/transformers/pull/21955
Literature:
Self-Instruct: Aligning Language Models with Self-Generated Instructions
https://arxiv.org/pdf/2212.10560.pdf
Alpaca: A Strong, Replicable Instruction-Following Model
https://crfm.stanford.edu/2023/03/13/alpaca.html
Official Stanford Alpaca GitHub for fine-tuning code:
https://github.com/tatsu-lab/stanford_alpaca#fine-tuning
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
@tatsu-lab/alpaca
Alpaca is a data set of 52,000 instruction-following examples:
The data fields are as follows:
A. instruction: describes the task the model should perform. Each of the 52K instructions is unique.
B. input: optional context or input for the task. For example, when the instruction is "Summarize the following article", the input is the article. Around 40% of the examples have an input.
C. output: the answer to the instruction as generated by text-davinci-003.
D. text: the instruction, input and output formatted with the prompt template used by the authors for fine-tuning their models.
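The text field above can be assembled from the other three with a small helper. The sketch below follows the Alpaca-style "### Instruction / ### Input / ### Response" layout with a separate no-input variant; treat the exact preamble wording as an approximation of the authors' template, not a verbatim copy.

```python
def format_example(instruction: str, inp: str, output: str) -> str:
    """Build the 'text' training field Alpaca-style.

    Examples with an input use the three-section template; examples
    without one drop the '### Input:' section entirely.
    """
    if inp:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{inp}\n\n"
            f"### Response:\n{output}"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{output}"
    )

# Roughly 40% of the 52K examples carry an input, like this one:
sample = format_example(
    "Summarize the following article", "Some article text here.", "A short summary."
)
```

During fine-tuning the model is trained on such formatted strings, so at inference time you prompt it with the same template up to "### Response:\n" and let it generate the rest.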
#ai
#naturallanguageprocessing
#finetune
#finetuning