I've defined a pipeline using the Hugging Face Transformers library:
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model=myllm,
    tokenizer=tokenizer,
    max_new_tokens=512,
)
I'd like to test it:
result = pipe("Some input prompt for the LLM")
How can I debug the prompt actually sent to the LLM?
I expect the pipeline to apply the chat template (tokenizer.default_chat_template), but how can I verify what the prompt looks like after the template has been applied?
You can use the pipeline's preprocess method and check the generated token_ids. I would also suggest looking more closely at the code of that method; it shows what happens to the prompt before the model's forward pass.
params = pipe._preprocess_params
pipe.preprocess("I can't believe you did such a ", **params)
# Returns:
# {'input_ids': tensor([[ 40, 460, 470, 1975, 345, 750, 884, 257, 220]]),
# 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1]]),
# 'prompt_text': "I can't believe you did such a "}
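To see that as text rather than ids, you can decode the tensor returned by preprocess. A small sketch reusing the pipe and params from above:
out = pipe.preprocess("Some input prompt for the LLM", **params)
# Decode the ids preprocess produced; this is the exact prompt text the model will see,
# including anything a chat template may have added
print(pipe.tokenizer.decode(out['input_ids'][0]))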
Internally, preprocess calls either tokenizer.apply_chat_template (when the input is a list of chat messages) or tokenizer(prompt_text). For example, for the "gpt2" model the default tokenizer outputs the following token_ids and attention_mask:
pipe.tokenizer("I can't believe you did such a ")
# Returns:
# {'input_ids': [40, 460, 470, 1975, 345, 750, 884, 257, 220],
# 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1]}
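Since the question is specifically about verifying the templated prompt, you can also call tokenizer.apply_chat_template yourself with tokenize=False to see the rendered prompt as plain text. A sketch, assuming your tokenizer actually defines a chat template (the plain gpt2 tokenizer does not):
messages = [{"role": "user", "content": "Some input prompt for the LLM"}]
# Render the chat template without tokenizing to inspect the final prompt string
print(pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))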
Another thing to check when debugging prompts is what you get if you convert the token ids back into tokens:
pipe = pipeline(
    "text-generation",
    model="openai-community/gpt2",
)
inputs = pipe.tokenizer("I can't believe you did such a ")
pipe.tokenizer.convert_ids_to_tokens(inputs['input_ids'])
# ['I', 'Ġcan', "'t", 'Ġbelieve', 'Ġyou', 'Ġdid', 'Ġsuch', 'Ġa', 'Ġ']
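To go the other way and rebuild the string from those tokens (Ġ marks a leading space in GPT-2's byte-level BPE), convert_tokens_to_string reverses the mapping:
tokens = pipe.tokenizer.convert_ids_to_tokens(inputs['input_ids'])
# Join the tokens back into the original text; Ġ becomes a space
print(pipe.tokenizer.convert_tokens_to_string(tokens))
# "I can't believe you did such a "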