
Elixir Nx: converting an Nx.Tensor to JSON


I am using Elixir Livebook to compute an embedding, which I then use to search my Qdrant database:

{:ok, model_info} = Bumblebee.load_model({:hf, "BAAI/bge-base-en-v1.5"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "BAAI/bge-base-en-v1.5"})
# {:ok, model_info} = Bumblebee.load_model({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
# {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-MiniLM-L6-v2"})

serving = Bumblebee.Text.text_embedding(model_info, tokenizer)

text = "where do i change my password?"
emb = Nx.Serving.run(serving, text)

collection_name = "demo_v1"
Qdrant.search_points(collection_name, %{vector: emb.embedding, limit: 3})

I am getting an error:

{:error,
 {Tesla.Middleware.JSON, :encode,
  %Protocol.UndefinedError{
    protocol: Jason.Encoder,
    value: #Nx.Tensor<
      f32[768]
      [-0.8432725071907043, -0.5420605540275574, ...]
    >,
    description: "Jason.Encoder protocol must always be explicitly implemented.\n\nIf you own the struct, you can derive the implementation specifying which fields should be encoded to JSON:\n\n    @derive {Jason.Encoder, only: [....]}\n    defstruct ...\n\nIt is also possible to encode all fields, although this should be used carefully to avoid accidentally leaking private information when new fields are added:\n\n    @derive Jason.Encoder\n    defstruct ...\n\nFinally, if you don't own the struct you want to encode to JSON, you may use Protocol.derive/3 placed outside of any module:\n\n    Protocol.derive(Jason.Encoder, NameOfTheStruct, only: [...])\n    Protocol.derive(Jason.Encoder, NameOfTheStruct)\n"
  }}}

From the error message, it seems that I still need to convert the Nx.Tensor embedding into something JSON-encodable before I can pass it to Qdrant's search_points/2 function.

How can I resolve this?


Solution

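  • Jason has no encoder for the Nx.Tensor struct, so the tensor has to be converted to plain data first. It seems (although the docs are not explicit) that you are supposed to call Nx.to_list/1, which returns an ordinary list of numbers that is JSON-serializable: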

    Qdrant.search_points(collection_name, %{vector: Nx.to_list(emb.embedding), limit: 3})
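
The conversion can be checked in isolation, without Bumblebee or Qdrant. This is only a minimal sketch: the three-element tensor is a made-up stand-in for emb.embedding, and the dependency versions passed to Mix.install are assumptions that you should adjust to your own setup.

```elixir
# Assumed dependency versions; adjust to match your project or Livebook setup.
Mix.install([{:nx, "~> 0.7"}, {:jason, "~> 1.4"}])

# Hypothetical stand-in for emb.embedding: a small rank-1 f32 tensor.
embedding = Nx.tensor([0.5, -0.25, 1.0], type: :f32)

# Nx.to_list/1 turns the tensor into a plain Elixir list of floats.
vector = Nx.to_list(embedding)

# A plain list of numbers needs no custom Jason.Encoder implementation,
# so the same payload shape used in the search call now encodes cleanly.
json = Jason.encode!(%{vector: vector, limit: 3})
IO.puts(json)
```

If this encodes without raising, the same Nx.to_list/1 call should be enough for the Tesla JSON middleware that the Qdrant client uses. Note that Nx.to_list/1 returns nested lists for tensors of higher rank, so a batch of embeddings would come out as a list of lists.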