This is my code so far:
vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(
            name="myHnsw",
            kind="hnsw",
            parameters={
                "m": 4,
                "efConstruction": 400,
                "efSearch": 500,
                "metric": "cosine"
            }
        )
    ],
    profiles=[
        VectorSearchProfile(
            name="myHnswProfile",
            algorithm_configuration_name="myHnsw",
            vectorizer="myVectorizer"
        )
    ],
    vectorizers=[
        AzureOpenAIVectorizer(
            name="myVectorizer",
            azure_open_ai_parameters=AzureOpenAIParameters(
                resource_uri=azure_openai_endpoint,
                deployment_id=azure_openai_embedding_deployment,
                # model_name=embedding_model_name,
                api_key=azure_openai_key
            )
        )
    ]
)
Note that the model is commented out in the vectorizer; otherwise I was getting an error that the model attribute does not exist.
I then create the search index like this:
# Define the index fields
client = SearchIndexClient(endpoint, credential)
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True, sortable=True,
                filterable=True, facetable=True),
    SimpleField(name="originalbalancedue", type=SearchFieldDataType.Double,
                sortable=True, filterable=True, facetable=True),
    SimpleField(name="adjustedbalancedue", type=SearchFieldDataType.Double,
                sortable=True, filterable=True, facetable=True),
    SimpleField(name="feeamount", type=SearchFieldDataType.Double, sortable=True,
                filterable=True, facetable=True),
    SearchableField(name="result", type=SearchFieldDataType.String, sortable=True,
                    filterable=True, vector_search_dimensions=1536,
                    vector_search_profile_name="myHnswProfile")
]
index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)
result = client.create_or_update_index(index)
print(f'{result.name} created')
The search index was created successfully.
Next, I insert the text and embeddings into the vector store like this:
with open('worthiness_with_result_small.json', 'r') as f:
    content = f.read().strip()
    if content:
        documents = json.loads(content)
        print(f"loaded {len(documents)} documents")
    else:
        print("The file is empty.")
search_client = SearchClient(endpoint=endpoint, index_name=index_name,
                             credential=credential)
result = search_client.upload_documents(documents)
print(f"Uploaded {len(documents)} documents")
The data in the worthiness_with_result_small.json file looks like this:
[{"id":"425001","originalbalancedue":1684269.59,"adjustedbalancedue":1683369.59,"feeamount":6659.1199999999998900,"result":"5759.1199999999998900"}]
The document was uploaded.
Now I am trying to do the vector search like this:
# Define the context and query
context = """
You are a bot to assist in finding information from bills that cause result to be the highest.
Result is the total money that we make off of the bill.
The main fields to look at when determining if a claim is worth working are originalbalancedue,adjustedbalancedue, result.
Users will ask if certain bills are worth it to work.
When they ask if it is worth it to work, analyze the existing bill data to see if
bills have higher results.
"""
query = context + " Using the bills provided, which bills worth working"
embedding = client.embeddings.create(input=query, model=embedding_model_name,
                                     dimensions=azure_openai_embedding_dimensions).data[0].embedding
# Perform the vector search
results = search_client.search(
    search_text=None,
    vectors=[vector_query],
    select=["id", "result"]
)
# Process and print the results
for result in results:
    print(f"ID: {result['id']}, Revcodes: {result['result']}")
In the last section (the vector search), I get this error:
10 query = context + " Using the claims provided, which codes denote claims worth working"
---> 11 embedding = client.embeddings.create(input=query, model=embedding_model_name, dimensions=azure_openai_embedding_dimensions).data[0].embedding
15 # Create the vector query
16 vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="revcodes,revcodeamounts,revcodecount,drgcode,primarydxcode,admitdxcode")
AttributeError: 'SearchIndexClient' object has no attribute 'embeddings'
What is the right way to create the vector search in this case?
Note that the model is commented out in the vectorizer (when creating the config), so I am not sure how to use the model name when it is not defined anywhere.
You are getting this error because you are calling embeddings.create on a SearchIndexClient instead of an AzureOpenAI client. In your code, client is the SearchIndexClient you created for the index, and it has no embeddings attribute. You need to create an Azure OpenAI client and generate the embedding with that client.
Your code would be something like:
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
import json

openai_credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(openai_credential, "https://cognitiveservices.azure.com/.default")
client = AzureOpenAI(
    azure_deployment=azure_openai_embedding_deployment,
    api_version=azure_openai_api_version,
    azure_endpoint=azure_openai_endpoint,
    api_key=azure_openai_key,
    azure_ad_token_provider=token_provider if not azure_openai_key else None
)
Code reference: https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/basic-vector-workflow/azure-search-vector-python-sample.ipynb
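For completeness, here is a minimal sketch of how the query side could look once you have the AzureOpenAI client. It is an illustration under assumptions, not code taken from the linked notebook. The helper names embed_text and run_vector_query and the field name resultVector are hypothetical. The sketch assumes your index has a vector field of type Collection(Edm.Single) to query against, and that your version of azure-search-documents accepts vector_queries in search() (11.4.0 or later, to my knowledge; older previews used other parameter names such as vectors). With Azure OpenAI, the model argument of embeddings.create is your deployment name, so the commented-out model name in the vectorizer should not matter for this call.

```python
def embed_text(openai_client, text, deployment, dimensions=None):
    """Return the embedding of `text` as a list of floats.

    `openai_client` is expected to be an openai.AzureOpenAI instance;
    `deployment` is the name of the embedding deployment.
    """
    kwargs = {"input": text, "model": deployment}
    if dimensions is not None:
        # Assumption: the deployed model supports `dimensions`
        # (the text-embedding-3 models do).
        kwargs["dimensions"] = dimensions
    response = openai_client.embeddings.create(**kwargs)
    return response.data[0].embedding


def run_vector_query(search_client, openai_client, text, deployment,
                     vector_field, k=3, select=("id", "result"),
                     dimensions=None):
    """Embed `text` and run a pure vector query against `vector_field`."""
    # Imported lazily so embed_text works without the search SDK installed.
    from azure.search.documents.models import VectorizedQuery

    vector_query = VectorizedQuery(
        vector=embed_text(openai_client, text, deployment, dimensions),
        k_nearest_neighbors=k,
        fields=vector_field,  # must name a vector field in the index
    )
    return search_client.search(
        search_text=None,
        vector_queries=[vector_query],
        select=list(select),
    )


# Example call (hypothetical field name "resultVector"):
# results = run_vector_query(search_client, client,
#                            "Which bills are worth working?",
#                            azure_openai_embedding_deployment,
#                            vector_field="resultVector")
# for doc in results:
#     print(f"ID: {doc['id']}, Result: {doc['result']}")
```

Keeping the OpenAI client and the search clients under different variable names (for example openai_client, index_client, search_client) also avoids accidentally calling embeddings.create on the wrong object.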