Tags: large-language-model, huggingface, huggingface-tokenizers, llama-index, llama3

How to set eos_token_id in llama3 in HuggingFaceLLM?


I want to set my eos_token_id and pad_token_id. I googled a lot, and most answers suggest using e.g. tokenizer.pad_token_id (like here: https://huggingface.co/meta-llama/Meta-Llama-3-8B/discussions/36). But the problem is that my code never instantiates a tokenizer directly.

I checked the official Llama 3 page (https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/), but it does not show any code for this.

My code looks like this:

import os
from llama_index.core import StorageContext, load_index_from_storage
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM
import torch

# Define the LLM
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,  # Reduce max new tokens for faster inference
    generate_kwargs={
        "temperature": 0.1,
        "do_sample": True,
        "pad_token_id": 128001 , 
        "eos_token_id": 128001   
    },
    tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct",
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16}
)

So my question is: what is the proper setting for the pad and eos token IDs? I am fairly sure it is not just 128001. Would anyone please help?


Solution

  • For the eos_token, this is what worked for me:

    "eos_token_id": [128001, 128009]
    

    Found at the bottom of this thread: https://github.com/vllm-project/vllm/issues/4180

    For the pad_token, I guess you can ignore it (or the warning about it), as suggested here: https://github.com/meta-llama/llama3/issues/42
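
    If you'd rather not hardcode magic numbers, here is a minimal sketch (assuming you have access to the gated meta-llama repo on the Hub) that looks the IDs up from the tokenizer instead. For Llama 3 Instruct, <|end_of_text|> maps to 128001 and <|eot_id|> to 128009, which is where the pair in the answer above comes from:

    import torch
    from transformers import AutoTokenizer
    from llama_index.llms.huggingface import HuggingFaceLLM

    model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Llama 3 Instruct uses two terminators: <|end_of_text|> (128001)
    # and <|eot_id|> (128009); generation should stop on either one.
    terminators = [
        tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
        tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ]

    llm = HuggingFaceLLM(
        context_window=4096,
        max_new_tokens=256,
        generate_kwargs={
            "temperature": 0.1,
            "do_sample": True,
            "eos_token_id": terminators,
            # Padding doesn't matter for single-sequence generation;
            # reusing a terminator ID here just silences the warning.
            "pad_token_id": terminators[0],
        },
        tokenizer_name=model_id,
        model_name=model_id,
        device_map="auto",
        model_kwargs={"torch_dtype": torch.float16},
    )

    Looking the IDs up this way also keeps the code working if the tokenizer's special tokens ever change between model revisions.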