Let’s build a Bot – RBot V1
“Let’s build a Bot to help people solve their legal problems,” cried the JDD Team, and so they built RBot Version 1.
The bot uses Retrieval Augmented Generation (RAG), a technique designed to counteract a key limitation of large language models (LLMs) such as ChatGPT: you may have heard of ChatGPT “hallucinating” and confidently providing completely made-up responses to people’s queries. RAG grounds the model’s answers in a curated body of trusted content.
The bot is envisioned as something that will go where people are already asking for help with legal problems (spoiler: they go to Reddit!) and answer their questions with accurate information from BC’s legal support websites.
The top two places people turn for help are family and friends, and the internet. Our research showed that most British Columbians look online for help on the social media site Reddit.
The JDD Lab built a Reddit bot – “RBot” – to meet people where they are looking for help and offer the help they are asking for. We built RBot V1 so that it draws on the high-quality content provided by People’s Law School and Courthouse Libraries’ Clicklaw site. By pairing the LLM with a RAG pipeline, we know the bot has access to this highly reliable, well-curated, and regularly updated content.
We break the RAG workflow into the following six steps:
- Data ingestion: This is where we acquire the high-quality content (data) from our collaborators.
- Chunking: Next we break the content down into digestible pieces for the LLM to consume. Tools with chunking functionality include LangChain, LlamaIndex, and Haystack.
- Embedding: After chunking, we convert each chunk of text into a numerical vector that captures its meaning, so semantically similar text can be found later. This is known as “vector embedding”. OpenAI, Cohere, and Hugging Face all offer embedding models.
- Choosing the vector database: We had to choose where to store our embedded chunks. There are many vector databases to choose from, including Pinecone, Milvus, and Chroma.
- Running the queries for retrieval: Now we start working with our database by running example queries that describe people’s everyday legal problems, and having the retriever return the most relevant chunks from the vector database via similarity search.
- Using the LLM to finish: With the relevant chunks in hand, we pass both the original query and the retrieved context to the LLM to generate an accurate, grounded response. (The two code sketches below illustrate these steps.)
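
To make steps 2–4 concrete, here is a minimal ingestion sketch in Python. It uses Chroma (one of the vector databases mentioned above) and the OpenAI embeddings API; the model name, collection name, chunk sizes, and sample text are illustrative placeholders, not the Lab’s actual configuration.

```python
# Minimal ingestion sketch: chunk -> embed -> store (steps 2-4).
# Assumes the `openai` and `chromadb` packages and OPENAI_API_KEY set
# in the environment. Names and sizes below are illustrative only.
import chromadb
from openai import OpenAI

client = OpenAI()
db = chromadb.PersistentClient(path="rbot_db")
collection = db.get_or_create_collection("clicklaw_chunks")

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    # Naive fixed-size chunking with overlap; real pipelines often split
    # on headings or sentences (e.g. with LangChain or LlamaIndex splitters).
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def ingest(doc_id: str, text: str) -> None:
    chunks = chunk_text(text)
    # One embedding vector per chunk; the model name is a placeholder.
    response = client.embeddings.create(
        model="text-embedding-3-small", input=chunks
    )
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        embeddings=[item.embedding for item in response.data],
        documents=chunks,
    )

ingest("tenancy-guide", "If your landlord wants to end your tenancy ...")
```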
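
And here, in the same hedged spirit, is the query side (steps 5–6), continuing from the ingestion sketch above: embed the question, retrieve the nearest chunks by similarity search, and pass both the question and the retrieved context to the LLM. The prompt wording, chat model name, and `n_results` value are again illustrative choices.

```python
# Minimal retrieval + generation sketch (steps 5-6), reusing `client`
# and `collection` from the ingestion sketch above.
def answer(question: str, n_results: int = 3) -> str:
    # Embed the query with the same model used at ingestion time.
    q_embedding = client.embeddings.create(
        model="text-embedding-3-small", input=[question]
    ).data[0].embedding

    # Similarity search: Chroma returns the chunks nearest the query vector.
    results = collection.query(
        query_embeddings=[q_embedding], n_results=n_results
    )
    context = "\n\n".join(results["documents"][0])

    # Hand both the original question and the retrieved context to the LLM.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": (
                "Answer using only the provided context. "
                "If the context does not answer the question, say so."
            )},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content

print(answer("How much notice must my landlord give to end my tenancy?"))
```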
At the JDD Lab, we are testing different techniques for each of these six steps, looking for the combination that produces the best responses. A technical blog on this work is posted in the Research Engine.