Hi!
The official LLM fine-tuning example is based on LLaMA2 and uses the Alpaca template.
Do I need to redefine the get_tokenizer_and_data_collator_and_propt_formatting function in dataset.py if I want to use a different base model?
I think you need to update the following line:
response_template_with_context = "\n### Response:"  # alpaca response tag
and also update pyproject.toml to specify the dataset name; Hugging Face will do the rest for you!
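For reference, here is a rough sketch of how that function fits together and what you would swap out for a different base model. It mirrors the structure of the example's dataset.py, but treat it as a sketch, not the official code: the dataset field names ("instruction"/"response") and the token-offset slice are assumptions you should verify against your own tokenizer and data.

```python
# Sketch of dataset.py's setup function, with the response tag called out
# as the piece to change for a non-Alpaca template.
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM


def formatting_prompts_func(example):
    # Alpaca-style single-turn prompt; swap this out for your model's template.
    # The "instruction"/"response" field names are assumptions about your dataset.
    return [
        f"### Instruction:\n{inst}\n### Response: {resp}"
        for inst, resp in zip(example["instruction"], example["response"])
    ]


def get_tokenizer_and_data_collator_and_propt_formatting(model_name: str):
    tokenizer = AutoTokenizer.from_pretrained(
        model_name, use_fast=True, padding_side="right"
    )
    tokenizer.pad_token = tokenizer.eos_token

    # This is the tag you change for another template: it must match the
    # string that precedes the answer in formatting_prompts_func above.
    response_template_with_context = "\n### Response:"  # alpaca response tag

    # Drop the leading tokens the tokenizer emits for the "\n" context; the
    # [2:] offset suits LLaMA-style tokenizers but may differ for others.
    response_template_ids = tokenizer.encode(
        response_template_with_context, add_special_tokens=False
    )[2:]

    # Masks everything before the response tag so the loss is computed
    # only on the answer tokens.
    data_collator = DataCollatorForCompletionOnlyLM(
        response_template_ids, tokenizer=tokenizer
    )
    return tokenizer, data_collator, formatting_prompts_func
```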
Thanks for your reply! I noticed that the chat_template differs from model to model; what should I refer to when modifying the template in the code? Also, the example is single-turn. If I want to implement multi-turn dialogue, does the code logic conflict with that?
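For instance, this is the kind of difference I mean (a quick way to inspect it; the two model names here are just illustrative):

```python
# Print how different models render the same conversation via their
# built-in chat_template. Model names are illustrative; Llama-2 is a
# gated repo and needs Hugging Face authentication.
from transformers import AutoTokenizer

messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help?"},
]

for name in ["meta-llama/Llama-2-7b-chat-hf", "mistralai/Mistral-7B-Instruct-v0.2"]:
    tok = AutoTokenizer.from_pretrained(name)
    print(name)
    print(tok.apply_chat_template(messages, tokenize=False))
```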