artificial-intelligence chatbot

Train LLM on internal docs


I have internal company documentation covering leave policies and similar topics. Is there a way, or a service, that lets me upload these docs and get a ChatGPT-like AI that answers questions about them? I don't mind if it is a paid service. Any ideas?


Solution

  • Sounds like you're looking for something like OSSChat.

    There are two ways to build a ChatGPT-like assistant for your own internal docs: 1) fine-tune an LLM on them, or 2) pair a vector database with an existing LLM, so that relevant passages are retrieved and handed to the model at question time. I recently made a multi-document Q/A app using LlamaIndex, LangChain, and Milvus. Here's the Colab Notebook.

    Basically what you can do is:

    1. vectorize your documents and store them in a vector database like Milvus

    2. generate some summaries or titles for each of your docs

    3. store those summaries or titles as keywords in a dict, with each value pointing to the matching vector store entries

    4. use LlamaIndex to hook up the keyword and vector store indices

    5. use LlamaIndex to make decomposable queries, i.e. break a complex question into simpler sub-queries that each target one document

    That should pretty much be it from a high-level point of view.
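
    To make the five steps concrete, here is a minimal sketch of the flow in plain Python. It is not the notebook's code and it does not call Milvus, LlamaIndex, LangChain, or any LLM: a list stands in for the vector database, a bag-of-words counter stands in for a real embedding model, and the query "decomposition" simply routes the full question to each document whose keyword it mentions (a real setup would typically have an LLM rewrite it into one sub-question per document). All names (`embed`, `cosine`, `vector_store`, `keyword_index`, `search`, `decompose_and_retrieve`) and the sample policy text are made up for illustration.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical documents, invented for illustration; the key doubles as the title.
docs = {
    "leave": [
        "Employees receive 20 days of annual leave per year.",
        "Sick leave requires a doctor's note after three days.",
    ],
    "expense": [
        "Expense reports are due by the fifth of each month.",
        "Travel expenses need manager approval before booking.",
    ],
}

# Step 1: vectorize every chunk and store it (a plain list stands in for Milvus).
vector_store = [
    {"doc": title, "text": chunk, "vector": embed(chunk)}
    for title, chunks in docs.items()
    for chunk in chunks
]

# Steps 2-3: use each title as a keyword and map it to its vector store entries.
keyword_index: dict[str, list[int]] = {}
for i, entry in enumerate(vector_store):
    keyword_index.setdefault(entry["doc"], []).append(i)


def search(question: str, entry_ids: list[int], k: int = 1) -> list[str]:
    """Rank the given store entries by similarity to the question, keep the top k."""
    q = embed(question)
    ranked = sorted(entry_ids, key=lambda i: cosine(q, vector_store[i]["vector"]), reverse=True)
    return [vector_store[i]["text"] for i in ranked[:k]]


def decompose_and_retrieve(question: str) -> dict[str, list[str]]:
    """Steps 4-5 (crude stand-in): run one search per document whose keyword
    appears in the question, restricted to that document's entries."""
    words = set(embed(question))
    return {kw: search(question, ids) for kw, ids in keyword_index.items() if kw in words}


question = "How many days of annual leave do I get, and when is my expense report due?"
context = decompose_and_retrieve(question)

# The retrieved passages plus the question form the prompt an LLM would receive.
prompt = (
    "Answer from this context:\n"
    + "\n".join(text for texts in context.values() for text in texts)
    + f"\n\nQuestion: {question}"
)
```

    Keeping the keyword dict separate from the vector store means a question is only compared against chunks from the documents it actually mentions, which is the point of hooking the two indices together.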