
How to use `llama-cpp-python` to output a list of candidate tokens and their probabilities?


I want to choose tokens manually, instead of letting llama-cpp-python automatically choose one for me.

This requires me to see a list of candidate next tokens, along with their probabilities, so that I can pick the right one according to my own criteria.

How can I do this?


Solution

  • You need to create the model with logits_all=True:

    from llama_cpp import Llama
    model = Llama(model_path="your model here", logits_all=True)

    Then request a completion with max_tokens=1 and set logprobs to the number of candidate tokens you need:

    out = model.create_completion("The capital of France is", max_tokens=1, logprobs=10)

    Then out["choices"][0]["logprobs"]["top_logprobs"][0] looks like this:

    {' Paris': np.float32(-0.531455),
     ' not': np.float32(-2.7322779),
     ' located': np.float32(-3.029975),
     ' the': np.float32(-3.4100742),
     ' a': np.float32(-3.6376095),
     ' also': np.float32(-4.1634436),
     ' actually': np.float32(-4.2124586),
     '...': np.float32(-4.279561),
     ' in': np.float32(-4.5441475),
     ' officially': np.float32(-4.6838427)}
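
To pick tokens yourself over several steps, one option is to repeat the single-token request in a loop and append the chosen token text to the prompt. The following is only a rough sketch under that assumption: `generate_manually`, `choose`, and `most_likely` are hypothetical names, not part of llama-cpp-python, and re-sending the growing text on each step is a simple approach, not necessarily the most efficient one.

```python
def generate_manually(model, prompt, choose, steps=5, n_candidates=10):
    """Extend `prompt` one token at a time; `choose` picks each token.

    `model` is assumed to be a Llama instance created with logits_all=True.
    `choose` is a hypothetical callback: it receives the dict of
    {token_text: logprob} candidates and returns the token text to append.
    """
    text = prompt
    for _ in range(steps):
        out = model.create_completion(text, max_tokens=1, logprobs=n_candidates)
        top = out["choices"][0]["logprobs"]["top_logprobs"]
        if not top:
            # No candidates were returned for this step, so stop early.
            break
        text += choose(top[0])
    return text


def most_likely(candidates):
    """Example criterion: take the candidate with the highest logprob."""
    return max(candidates, key=candidates.get)


# Usage (requires a loaded model):
# print(generate_manually(model, "The capital of France is", most_likely))
```

Replace `most_likely` with any function that implements your own selection criteria.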
    

    You can convert the log-probabilities into probabilities with np.exp().
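
As a minimal sketch of that conversion, the snippet below applies np.exp() to each value and sorts the candidates from most to least likely. The dict holds three of the sample values shown above; in practice you would pass out["choices"][0]["logprobs"]["top_logprobs"][0] instead. The helper name `to_probabilities` is made up for illustration.

```python
import numpy as np

# Sample values copied from the output above.
top_logprobs = {
    " Paris": np.float32(-0.531455),
    " not": np.float32(-2.7322779),
    " located": np.float32(-3.029975),
}


def to_probabilities(logprobs):
    """Map each token to exp(logprob), sorted from most to least likely."""
    probs = {tok: float(np.exp(lp)) for tok, lp in logprobs.items()}
    return dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True))


probs = to_probabilities(top_logprobs)
for tok, p in probs.items():
    print(f"{tok!r}: {p:.4f}")
```

Note that these probabilities cover only the returned top candidates, so they generally sum to less than 1.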