I have this piece of code working to access AWS Bedrock models with a knowledge base:
import boto3

def query_knowledge_base(input_data, config, model_arn):
    # Credentials are resolved via the default provider chain
    aws_session = boto3.Session()
    bedrock_agent_client = aws_session.client(
        service_name="bedrock-agent-runtime", region_name="us-west-2"
    )
    response = bedrock_agent_client.retrieve_and_generate(
        input={"text": input_data},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": config.bedrock.kb_id,
                "modelArn": model_arn,
            },
        },
    )
    return response
However, it uses the default max_token_to_sample parameter, which is rather small. The boto3 client's retrieve_and_generate function does not seem to have a parameter or relevant config to specify it. Does anybody know how I can pass in this parameter? Thanks!
It's not 100% clear what you mean by max_token_to_sample, but I think you're referring to inference settings. If so, textInferenceConfig is what you're looking for.
Beware: more tokens == more cost.
response = bedrock_agent_runtime_client.retrieve_and_generate(
    input={"text": prompt},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "generationConfiguration": {
                "inferenceConfig": {
                    "textInferenceConfig": {"maxTokens": 123}
                }
            },
            "knowledgeBaseId": kbId,
            "modelArn": model_arn,
        },
    },
)
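For completeness: per the same boto3 docs, textInferenceConfig also accepts temperature, topP, and stopSequences alongside maxTokens, and the generated answer comes back under response["output"]["text"]. Here is a minimal sketch reusing the prompt, kbId, and model_arn names from above; the specific values are illustrative, not recommendations:

response = bedrock_agent_runtime_client.retrieve_and_generate(
    input={"text": prompt},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "generationConfiguration": {
                "inferenceConfig": {
                    "textInferenceConfig": {
                        "maxTokens": 2048,        # raise the completion cap (illustrative value)
                        "temperature": 0.2,       # lower == more deterministic output
                        "topP": 0.9,              # nucleus sampling cutoff
                        "stopSequences": ["\n\nHuman:"],  # optional early-stop marker (example only)
                    }
                }
            },
            "knowledgeBaseId": kbId,
            "modelArn": model_arn,
        },
    },
)
print(response["output"]["text"])  # the generated answer text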
Reference: AgentsforBedrockRuntime / Client / retrieve_and_generate