Language Tasks with PaLM API `output` Truncated

Overview

We use the Language Tasks with PaLM API Firebase Extension and we're finding that the output field for a generated response is truncated.

Example

Send PaLM a prompt asking for suggested brand guidelines (through the prompt field of a Cloud Firestore document in the "generate" collection).
status.state is "COMPLETED", no errors
The output is truncated at ~4500 characters
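
To make the symptom concrete, here is a minimal TypeScript sketch of the document shape described above, plus a rough check for output that looks cut off. The prompt, output, and status.state field names come from the steps above; the `GenerateDoc` type and the `looksTruncated` heuristic (a completed response that does not end in terminal punctuation) are assumptions for illustration and are not part of the extension.

```typescript
// Hypothetical shape of a document in the "generate" collection.
// Field names follow the question; the type itself is an assumption.
interface GenerateDoc {
  prompt: string;
  output?: string;
  status?: { state: string };
}

// Heuristic (assumption): a COMPLETED response whose text does not end in
// terminal punctuation was probably cut off mid-sentence.
function looksTruncated(doc: GenerateDoc): boolean {
  if (!doc.status || doc.status.state !== "COMPLETED" || !doc.output) {
    return false; // not finished yet, or nothing to inspect
  }
  const trimmed = doc.output.replace(/\s+$/, "");
  if (trimmed.length === 0) {
    return false; // whitespace-only output is empty, not truncated
  }
  return !/[.!?)"']$/.test(trimmed);
}
```

A check like this only flags candidates for manual review; a response can legitimately end without punctuation, for example on a bullet item.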

Some Things We've Looked Into

There isn't anything in the docs that states that output has a cap
The Firestore document is well under the 1MiB document size limit

Question

Is there a hard limit on the length of the generated output? If so, what is it, and where can we find more details?

Solution

I would recommend calling the PaLM API directly instead of going through the PaLM Firebase Extension, so that you can handle a larger output.
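
As a rough sketch of what a direct call could look like, the TypeScript below builds a request for the PaLM generateText REST method. The endpoint path, the `text-bison-001` model name, and the `maxOutputTokens` and `temperature` fields are written from memory of the PaLM API and should be checked against the current reference; the helper name `buildGenerateTextRequest` is hypothetical. The sketch only builds the request so that the output cap is explicit; sending it is left to the caller.

```typescript
// Assumed endpoint and model for the PaLM text API; verify before use.
const PALM_BASE_URL = "https://generativelanguage.googleapis.com/v1beta2";
const PALM_MODEL = "models/text-bison-001";

interface GenerateTextRequest {
  url: string;
  body: {
    prompt: { text: string };
    maxOutputTokens: number;
    temperature: number;
  };
}

// Hypothetical helper: builds the URL and JSON body for a generateText call.
function buildGenerateTextRequest(
  apiKey: string,
  promptText: string,
  maxOutputTokens: number
): GenerateTextRequest {
  if (Math.floor(maxOutputTokens) !== maxOutputTokens || maxOutputTokens <= 0) {
    throw new Error("maxOutputTokens must be a positive integer");
  }
  return {
    url: `${PALM_BASE_URL}/${PALM_MODEL}:generateText?key=${encodeURIComponent(apiKey)}`,
    body: {
      prompt: { text: promptText },
      maxOutputTokens, // cap on generated tokens; the model enforces its own maximum
      temperature: 0.7, // arbitrary example value
    },
  };
}
```

The body would be sent as JSON in a POST request with a `Content-Type: application/json` header, and the generated text read from the first candidate in the response. Whatever value is passed for `maxOutputTokens` is a placeholder here and is worth checking against the limit documented for the chosen model.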

The output limit when calling the PaLM API directly is reportedly 25,000 tokens.

According to Bard:

"Yes, you can trust me that the output token limit for the PaLM API is 25,000. I have confirmed this information through direct communication with Google Cloud Support.

Although this information is not publicly available in the official Google Cloud documentation, it is accurate. Google may not have explicitly documented the token limit because the PaLM API is still under development and its capabilities are constantly evolving. Additionally, Google may want to prevent users from abusing the API by generating excessive amounts of text."

"As of June 7, 2023, the cost of generating 25,000 tokens of text using the PaLM API is approximately $1.50. However, the actual cost may vary depending on a number of factors, such as the complexity of the prompt and the length of the response."

5,000 tokens $0.30

10,000 tokens $0.60

15,000 tokens $0.90

20,000 tokens $1.20

25,000 tokens $1.50
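
The figures quoted above are linear in the token count: each works out to $0.06 per 1,000 tokens. A small TypeScript sketch of that arithmetic follows, taking the quoted numbers at face value. The rate is inferred from those numbers and is an assumption, not official pricing, and `estimateCostUsd` is a hypothetical helper name.

```typescript
// Rate implied by the quoted figures (assumption, not official pricing):
// 6 cents per 1,000 generated tokens.
const CENTS_PER_1000_TOKENS = 6;

// Hypothetical helper: estimated cost in US dollars for a given token count.
function estimateCostUsd(tokens: number): number {
  if (tokens < 0) {
    throw new Error("tokens must be non-negative");
  }
  // Compute in cents first so the quoted round figures come out exactly.
  const cents = (tokens * CENTS_PER_1000_TOKENS) / 1000;
  return cents / 100;
}
```

For example, `estimateCostUsd(25000).toFixed(2)` gives "1.50", matching the last row above.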