I'm developing a Streamlit app to create an AI chatbot using the "complete" function in snowflake. When a user starts the app, I need to pass a sequence of initial messages into the function. Here's an example:
messages = [
{'system': 'act like a database professional'},
{'user': 'context: I have a database containing sales data. I need you to answer user questions based on the input data between <data> </data> tags'},
{'user': '<data>
actual data ...
</data>'},
{'user': 'provide the answer in markdown format'}
]
These messages are hard-coded in the app.
User input and assistant messages will be appended to the messages list like this:
messages.append({'role': 'user', 'content': question})
The option parameter is set as follows:
option = {
'temperature': 0,
'max_tokens': 128,
'guardrails': True
}
Here's how I call the complete function:
cmd = """
select snowflake.cortex.complete(?, ?, ?) as response
"""
df_response = session.sql(cmd, params=[st.session_state.model_name, messages, option]).collect()
Currently, I'm consistently receiving this error message:
SnowparkSQLException: (1304): 252011: Python data type [dict] cannot be automatically mapped to Snowflake data type. Specify the snowflake data type explicitly.
I've completed this quickstart, but its solution summarizes all past conversations into one prompt. I'd prefer to pass it as an array of objects.
I've attempted several solutions, including using Snowpark types such as StructType, StructField, StringType, ArrayType, MapType, and the UDF library, but without success.
Can anyone offer assistance?
As snowflake.cortex is not available in my region (can not import) and I prefer the 'passing direct SQL to Snowflake' solution so I can run the app apart from the snowflake infrastructure, I have made the following changes to my code:
messages = [
{'role': 'system', 'content': 'act like a database professional'},
{'role': 'user', 'content': """context: I have a database containing sales data. I need you to answer user questions based on the input data between <data> </data> tags.
<data>
actual data ...
</data>"""},
{'role': 'assistant', 'content': ''},
{'role': 'user', 'content': 'question 2 ....'},
]
prompts_str = (
json.dumps(messages, indent=3)
.replace('\'', '\\\'')
.replace('"', '\'')
)
OPTIONS = (
json.dumps({
'temperature': 0,
'max_tokens': 128000,
'guardrails' : True
}, indent=3)
.replace('\'', '\\\'')
.replace('"', '\'')
)
sql = f"""
select snowflake.cortex.complete('llama3.1-70b', {prompts_str}, {OPTIONS}):"choices"[0]:"messages"::string as response
"""
# read the answer
response = session.sql(sql).collect()[0].as_dict()["RESPONSE"].replace("'", "")