Retrieval-Augmented Generation in SQLite


This is the second in a two-part series on using SQLite for machine learning. In my last article, I dove into how SQLite is rapidly becoming a production-ready database for web applications. In this article, I’ll discuss how to perform retrieval-augmented generation using SQLite.

If you’d like a custom web application with generative AI integration, visit losangelesaiapps.com

The code referenced in this article can be found here.


When I first learned how to perform retrieval-augmented generation (RAG) as a budding data scientist, I followed the conventional path. This usually looks something like:

  • Google retrieval-augmented generation and look for tutorials
  • Find the most popular framework, usually LangChain or LlamaIndex
  • Find the most popular cloud vector database, usually Pinecone or Weaviate
  • Read a bunch of docs, put all the pieces together, and success!

In fact, I actually wrote an article about my experience building a RAG system in LangChain with Pinecone.

There’s nothing terribly wrong with using a RAG framework with a cloud vector database. However, I’d argue that for first-time learners it overcomplicates the situation. Do we really need an entire framework to learn how to do RAG? Is it necessary to make API calls to cloud vector databases? These databases act as black boxes, which is never good for learners (or frankly for anyone).

In this article, I’ll walk you through how to perform RAG on the simplest stack possible. In fact, this ‘stack’ is just SQLite with the sqlite-vec extension and the OpenAI API for its embedding and chat models. I recommend you read part 1 of this series to get a deep dive on SQLite and how it’s rapidly becoming production ready for web applications. For our purposes here, it is enough to understand that SQLite is the simplest kind of database possible: a single file in your repository.

So ditch your cloud vector databases and your bloated frameworks, and let’s do some RAG.


SQLite-Vec

One of the powers of the SQLite database is the use of extensions. For those of us familiar with Python, extensions are a lot like libraries. They are modular pieces of code written in C to extend the functionality of SQLite, making things that were once impossible possible. One popular example of a SQLite extension is the Full-Text Search (FTS) extension. This extension allows SQLite to perform efficient searches across large volumes of textual data. Because the extension is written purely in C, we can run it anywhere a SQLite database can be run, including Raspberry Pis and browsers.

In this article I will be going over the extension called sqlite-vec. This gives SQLite the power of performing vector search. Vector search is similar to full-text search in that it allows for efficient search across textual data. However, rather than searching for an exact word or phrase in the text, vector search has a semantic understanding. In other words, searching for “horses” will find matches of “equestrian”, “pony”, “Clydesdale”, etc. Full-text search is incapable of this.
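To make the idea of semantic closeness concrete, here is a minimal sketch that ranks a few terms by cosine similarity. The three-dimensional vectors are hypothetical values invented for illustration (real embedding models return far more dimensions), and the helper names `cosine_similarity` and `rank_by_similarity` are assumptions, not part of sqlite-vec.

```python
import math

# Hypothetical 3-dimensional "embeddings", invented for illustration only.
# Real embedding models produce hundreds or thousands of dimensions.
toy_embeddings = {
    "horses": [0.9, 0.1, 0.0],
    "pony": [0.8, 0.2, 0.1],
    "equestrian": [0.7, 0.3, 0.0],
    "tax return": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, embeddings):
    """Return the other terms sorted from most to least similar to `query`."""
    query_vec = embeddings[query]
    scores = {
        term: cosine_similarity(query_vec, vec)
        for term, vec in embeddings.items()
        if term != query
    }
    return sorted(scores, key=scores.get, reverse=True)

ranking = rank_by_similarity("horses", toy_embeddings)
print(ranking)
```

With these made-up vectors, the horse-related terms rank ahead of the unrelated one even though none of them share a word with the query, which is the behavior keyword matching cannot provide.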

sqlite-vec uses virtual tables, as do most extensions in SQLite. A virtual table is similar to a regular table, but with additional powers:

  • Custom Data Sources: The data for a standard table in SQLite is housed in a single db file. For a virtual table, the data can be housed in external sources, for example a CSV file or an API call.
  • Flexible Functionality: Virtual tables can add specialized indexing or querying capabilities and support complex data types like JSON or XML.
  • Integration with SQLite Query Engine: Virtual tables integrate seamlessly with SQLite’s standard query syntax, e.g. SELECT, INSERT, UPDATE, and DELETE operations. Ultimately it is up to the writers of the extensions to support these operations.
  • Use of Modules: The backend logic for how the virtual table will work is implemented by a module (written in C or another language).

The typical syntax for creating a virtual table looks like the following:

CREATE VIRTUAL TABLE my_table USING my_extension_module();

The important part of this statement is my_extension_module(). This specifies the module that will be powering the backend of the my_table virtual table. In sqlite-vec we’ll use the vec0 module.
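The same statement shape can be tried without installing anything by using the fts5 module, which ships with many SQLite builds. This is a sketch, not the sqlite-vec workflow: the table name `notes`, its rows, and the helper `search_notes` are made up for illustration, and the function returns None when the local SQLite build was compiled without fts5.

```python
import sqlite3

def search_notes(term):
    """Build a tiny FTS5 virtual table in memory and search it for `term`.

    Returns a list of matching titles, or None if this SQLite build
    was compiled without the fts5 module.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Same shape as the generic statement: table name, USING, module(args)
        conn.execute("CREATE VIRTUAL TABLE notes USING fts5(title, body)")
    except sqlite3.OperationalError:
        conn.close()
        return None
    conn.executemany(
        "INSERT INTO notes (title, body) VALUES (?, ?)",
        [
            ("stable", "The horses were fed at dawn."),
            ("field", "A pony grazed near the fence."),
        ],
    )
    rows = conn.execute(
        "SELECT title FROM notes WHERE notes MATCH ? ORDER BY rank", (term,)
    ).fetchall()
    conn.close()
    return [title for (title,) in rows]

print(search_notes("horses"))
print(search_notes("equestrian"))
```

Where fts5 is available, the first search finds only the row that literally contains "horses" and the second finds nothing, which is the keyword-only behavior that vector search is meant to improve on.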

Code Walkthrough

The code for this article can be found here. It is a simple directory with the majority of files being .txt files that we will be using as our dummy data. Because I’m a physics nerd, the majority of the files pertain to physics, with just a few files relating to other random fields. I will not present the full code in this walkthrough, but instead will highlight the important pieces. Clone my repo and play around with it to investigate the full code. Below is a tree view of the repo. Note that my_docs.db is the single-file database used by SQLite to manage all of our data.

.
├── data
│   ├── cooking.txt
│   ├── gardening.txt
│   ├── general_relativity.txt
│   ├── newton.txt
│   ├── personal_finance.txt
│   ├── quantum.txt
│   ├── thermodynamics.txt
│   └── travel.txt
├── my_docs.db
├── requirements.txt
└── sqlite_rag_tutorial.py

Step 1 is to install the necessary libraries. Below is our requirements.txt file. As you can see, it has only three libraries. I recommend creating a virtual environment with the latest Python version (3.13.1 was used for this article) and then running pip install -r requirements.txt to install the libraries.

# requirements.txt
sqlite-vec==0.1.6
openai==1.63.0
python-dotenv==1.0.1

Step 2 is to create an OpenAI API key if you don’t already have one. We will be using OpenAI to generate embeddings for the text files so that we can perform our vector search.

# sqlite_rag_tutorial.py
import sqlite3
from sqlite_vec import serialize_float32
import sqlite_vec
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load OPENAI_API_KEY from a local .env file, if one exists
load_dotenv()

# Set up OpenAI client
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
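Because the client depends on an environment variable, a missing key tends to surface later as a confusing authentication error. One way to fail early is a small check like the sketch below; `require_env` is a hypothetical helper introduced here for illustration, not part of the openai or python-dotenv libraries.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell or add it to a .env file."
        )
    return value

# Hypothetical usage before constructing the client:
# api_key = require_env("OPENAI_API_KEY")
```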

Step 3 is to load the sqlite-vec extension into SQLite. We will be using Python and SQL for our examples in this article. Disabling the ability to load extensions immediately after loading your extension is a good security practice.

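# Path to the database file
db_path = "my_docs.db"

# Delete the database file if it exists, so each run starts from a clean slate
if os.path.exists(db_path):
    os.remove(db_path)

# Connect, load the sqlite-vec extension, then disable extension loading again
db = sqlite3.connect(db_path)
db.enable_load_extension(True)
sqlite_vec.load(db)
db.enable_load_extension(False)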
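Some Python builds ship a sqlite3 module compiled without extension loading (this is commonly reported for the system Python on macOS), in which case the calls above fail before sqlite-vec is ever reached. The following sketch probes for that ahead of time using only the standard library; `supports_extension_loading` is a hypothetical helper name.

```python
import sqlite3

def supports_extension_loading():
    """Return True if this Python's sqlite3 module can load extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        if not hasattr(conn, "enable_load_extension"):
            return False
        # Toggle on and straight back off, mirroring the enable/disable pattern
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.NotSupportedError):
        return False
    finally:
        conn.close()

print("extension loading supported:", supports_extension_loading())
print("SQLite library version:", sqlite3.sqlite_version)
```

If the check reports False, switching to a Python build with loadable extension support is usually the simplest fix.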

Next we'll go ahead and create our virtual table:

db.execute('''
    CREATE VIRTUAL TABLE documents USING vec0(
        embedding float[1536],
        +file_name TEXT,
        +content TEXT
    )
''')

documents is a virtual table with three columns:

  • embedding : 1536-dimension float vector that will store the embeddings of our sample documents.
  • file_name : Text that will house the name of each file we store in the database. Note that this column and the following one have a + symbol in front of them. This means that they are auxiliary fields. Previously in sqlite-vec only embedding data could be stored in the virtual table. However, recently an update was pushed that allows us to add fields to our table that we don’t really want embedded. In this case we are adding the content and name of the file in the same table as our embeddings. This will allow us to easily see which embeddings correspond to which content while sparing us the need for extra tables and JOIN statements.
  • content : Text that will store the content of each file.

Now that we have our virtual table set up in our SQLite database, we can begin converting our text files into embeddings and storing them in our table:

# Function to get embeddings using the OpenAI API
def get_openai_embedding(text):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Iterate over .txt files in the /data directory
for file_name in os.listdir("data"):
    file_path = os.path.join("data", file_name)
    with open(file_path, 'r', encoding='utf-8') as file:
        content = file.read()
        # Generate embedding for the content
        embedding = get_openai_embedding(content)
        if embedding:
            # Insert file content and embedding into the vec0 table
            db.execute(
                'INSERT INTO documents (embedding, file_name, content) VALUES (?, ?, ?)',
                (serialize_float32(embedding), file_name, content)
            )

# Commit changes
db.commit()

We essentially loop through each of our .txt files, embedding the content from each file, and then using an INSERT INTO statement to insert the embedding, file_name, and content into the documents virtual table. A commit statement at the end ensures the changes are persisted. Note that we are using serialize_float32 here from the sqlite-vec library. SQLite itself does not have a built-in vector type, so vectors are stored as binary large objects (BLOBs) to save space and allow fast operations. Internally, serialize_float32 uses Python’s struct.pack() function, which converts Python data into C-style binary representations.
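The packing step can be reproduced with the standard library alone. The sketch below is an approximation of what such a serializer does, assuming 32-bit floats in native byte order; `pack_float32` and `unpack_float32` are hypothetical names, not the sqlite-vec API.

```python
import struct

def pack_float32(vector):
    """Pack a sequence of floats into raw 32-bit floats (native byte order)."""
    return struct.pack(f"{len(vector)}f", *vector)

def unpack_float32(blob):
    """Inverse of pack_float32: turn the raw bytes back into a list of floats."""
    count = len(blob) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"{count}f", blob))

blob = pack_float32([0.5, -1.25, 2.0])
print(len(blob))             # 3 values * 4 bytes each
print(unpack_float32(blob))  # these values are exactly representable in float32
```

At four bytes per value, a 1536-dimension embedding occupies 6144 bytes as a BLOB, which is far more compact than storing the same numbers as text.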

Finally, to perform RAG, you then use the following code to do a K-Nearest-Neighbors (KNN) style operation. This is the heart of vector search.

# Perform a sample KNN query
query_text = "What is general relativity?"
query_embedding = get_openai_embedding(query_text)
if query_embedding:
    rows = db.execute(
        """
        SELECT
            file_name,
            content,
            distance
        FROM documents
        WHERE embedding MATCH ?
        ORDER BY distance
        LIMIT 3
        """,
        [serialize_float32(query_embedding)]
    ).fetchall()

    print("Top 3 most similar documents:")
    top_contexts = []
    for row in rows:
        print(row)
        top_contexts.append(row[1])  # Append the 'content' column

We begin by taking in a query from the user, in this case “What is general relativity?”, and embedding that query using the same embedding model as before. We then perform a SQL operation. Let’s break this down:

  • The SELECT statement means the retrieved data will have three columns: file_name, content, and distance. The first two we have already discussed. Distance will be calculated during the SQL operation, more on this in a moment.
  • The FROM statement ensures you are pulling data from the documents table.
  • The WHERE embedding MATCH ? statement performs a similarity search between all of the vectors in your database and the query vector. The returned data will include a distance column. This distance is just a floating point number measuring how far apart the query and database vectors are. The lower the number, the closer the vectors are. sqlite-vec offers a few options for how to calculate this distance.
  • The ORDER BY distance clause sorts the retrieved vectors by ascending distance, which is descending order of similarity (most similar first).
  • LIMIT 3 ensures we only get the top three documents that are nearest to our query embedding vector. You can tweak this number to see how retrieving more or fewer vectors affects your results.
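The query described above can be mimicked in plain Python to see what the database is doing conceptually. This is a brute-force sketch assuming Euclidean (L2) distance; the two-dimensional vectors are hypothetical, and `l2_distance` and `knn` are illustrative names rather than sqlite-vec functions.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, rows, k=3):
    """Return the k rows closest to `query` as (name, distance), nearest first.

    `rows` is a list of (name, vector) pairs, standing in for table rows.
    """
    scored = [(name, l2_distance(query, vec)) for name, vec in rows]
    scored.sort(key=lambda pair: pair[1])  # ascending: smallest distance first
    return scored[:k]

# Hypothetical 2-dimensional vectors, invented for illustration
rows = [
    ("general_relativity.txt", [1.0, 0.0]),
    ("newton.txt", [0.8, 0.6]),
    ("quantum.txt", [0.6, 0.8]),
    ("cooking.txt", [-1.0, 0.0]),
]
result = knn([1.0, 0.0], rows, k=3)
for name, distance in result:
    print(name, round(distance, 3))
```

Sorting ascending and slicing to k plays the role of ORDER BY distance and LIMIT 3: the smallest distances come first, and everything past the third row is dropped.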

Given our query of “What is general relativity?”, the following documents were pulled. It did a pretty good job!

Top 3 most similar documents:

('general_relativity.txt', "Einstein's theory of general relativity redefined our understanding of gravity. Instead of viewing gravity as a force acting at a distance, it interprets it as the curvature of spacetime around massive objects. Light passing near a massive star bends slightly, galaxies deflect beams traveling millions of light-years, and clocks tick at different rates depending on their gravitational potential. This groundbreaking theory led to predictions like gravitational lensing and black holes, phenomena later confirmed by observational evidence, and it continues to guide our understanding of the cosmos.", 0.8316285610198975)

('newton.txt', "In classical mechanics, Newton's laws of motion form the foundation of how we understand the movement of objects. Newton's first law, often known as the law of inertia, states that an object at rest stays at rest and an object in motion continues in motion unless acted upon by an external force. This concept extends into more complex physics problems, where analyzing net forces on objects allows us to predict their future trajectories and behaviors. Over time, applying Newton's laws has enabled engineers and scientists to design safer vehicles, more efficient machines, and even guide spacecraft through intricate gravitational fields.", 1.2036118507385254)

('quantum.txt', 'Quantum mechanics revolutionized our understanding of the microscopic world. Unlike classical particles, quantum entities such as electrons can exhibit both wave-like and particle-like behaviors. Phenomena like quantum superposition suggest that particles can exist in multiple states at once, and the act of measurement often "collapses" these states into one observed outcome. This strange and counterintuitive theory underpins modern technologies like semiconductors and lasers, and it provides a conceptual framework for emerging fields like quantum computing and cryptography.', 1.251380205154419)

We can then stuff the context of the model with these three documents and have it attempt to answer our question.

# Prepare the context for the query
context = "\n\n".join(top_contexts)
system_message = "You are a helpful assistant. Use the following context to answer the query."

# Send query and context to OpenAI
try:
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": f"Context: {context}\n\nQuery: {query_text}"}
        ]
    )
    print("Response:")
    print(completion.choices[0].message.content)
except Exception as e:
    print(f"Error generating response: {e}")

Here is what the model said:

General relativity is a theory developed by Albert Einstein that redefines our understanding of gravity. Instead of viewing gravity as a force acting at a distance, general relativity interprets it as the curvature of spacetime caused by the presence of mass. According to this theory, massive objects like stars and planets create a distortion in the fabric of spacetime, and this curvature affects the motion of other objects, including light, which can bend when it passes near a massive body.

This is faithfully sticking to the documents we gave the model. Great job 4o-mini!
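When an answer drifts from the retrieved documents, it helps to inspect exactly what was sent to the model. One option is to factor the prompt assembly into a pure function that can be printed or tested without any API call. The sketch below mirrors the message layout used in this walkthrough; `build_messages` is a hypothetical helper name and the sample strings are placeholders.

```python
def build_messages(contexts, query):
    """Assemble chat messages from retrieved document texts and a user query."""
    context = "\n\n".join(contexts)
    system_message = (
        "You are a helpful assistant. "
        "Use the following context to answer the query."
    )
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": f"Context: {context}\n\nQuery: {query}"},
    ]

messages = build_messages(
    ["First retrieved document.", "Second retrieved document."],
    "What is general relativity?",
)
print(messages[1]["content"])
```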

Conclusion

sqlite-vec is a project sponsored by the Mozilla Builders Accelerator program, so it has some significant backing behind it. Have to give a big thanks to Alex Garcia, the creator of sqlite-vec, for helping to push the SQLite ecosystem and making ML possible with this simple database. This is a well maintained library, with updates coming down the pipeline on a regular basis. As of November 20th, they even added filtering by metadata! Perhaps I should re-do my aforementioned RAG article using SQLite 🤔.

The extension also offers bindings for several popular programming languages, including Ruby, Go, Rust, and more.

The fact that we are able to radically simplify our RAG pipeline to the bare essentials is remarkable. To recap, there is no need for a database service to be spun up and spun down, like Postgres, MySQL, etc. There is no need for API calls to cloud vector database vendors. If you deploy to a server directly via Digital Ocean or Hetzner, you can even avoid the costly and unnecessary complexity associated with managed cloud services like AWS, Azure, or Vercel.

I believe this simple architecture can work for a variety of applications. It is cheaper to use, easier to maintain, and faster to iterate on. Once you reach a certain scale it will likely make sense to migrate to a more robust database such as Postgres with the pgvector extension for RAG capabilities. For more advanced capabilities such as chunking and document cleaning, a framework might be the right choice. But for startups and smaller players, it’s SQLite to the moon.

Have fun trying out sqlite-vec for yourself!

Simple RAG architecture. Image by author.