Using Vector Search
Vector search is a foundational concept for systems that use machine learning and artificial intelligence. After using an embedding model to transform unstructured data, such as text or images, into embedding vectors, users can perform operations on these vectors to process input and return the vectors that are the most semantically similar.
InterSystems IRIS SQL supports the storage of vectors in the compact and performant VECTOR and EMBEDDING types, enabling you to efficiently store vectorized data as a part of your traditional relational schemas. By leveraging the EMBEDDING type, you can convert text into embedding vectors through familiar SQL syntax without interacting with an embedding model directly.
Vectors and Embeddings
Vectors can be used to represent semantic meaning in the form of embeddings. These embeddings are determined by an embedding model, which is a machine learning model that maps text, images, or audio to a geometric space with high dimensionality. This page focuses specifically on using an embedding model with text, but the procedures described here can be applied to many other types of data.
Modern embedding vectors typically range between hundreds and thousands of dimensions. Words that share similar semantic meaning occupy nearby positions in that space, while words with disparate semantic meaning occupy positions distant from each other. These spatial positions allow an application to algorithmically determine the similarity between two words or sentences by computing the distance between their embedding vectors.
In vector search, a user compares an input vector with the vectors stored in the database using a distance function, such as the dot product. A vector search enables you to algorithmically determine which pieces of text are semantically most similar to an input. As a result, vector search is a great fit for tasks that involve information retrieval.
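As a minimal illustration of this idea (independent of InterSystems IRIS), the following Python sketch uses numpy to compute the cosine similarity between a query embedding and two stored embeddings and ranks the stored texts by similarity. The four-dimensional vectors here are invented for brevity; real embedding models produce hundreds or thousands of dimensions.
import numpy as np

def cosine_similarity(a, b):
    # dot product normalized by the magnitudes of the two vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# a query embedding and two stored embeddings (toy values)
query = np.array([0.12, -0.45, 0.33, 0.08])
stored = {
    "The cat sat on the mat": np.array([0.10, -0.40, 0.35, 0.10]),
    "Stock prices fell sharply": np.array([-0.50, 0.22, -0.11, 0.43]),
}

# rank the stored texts by similarity to the query, most similar first
ranked = sorted(stored.items(), key=lambda item: cosine_similarity(query, item[1]), reverse=True)
for text, vector in ranked:
    print(text, cosine_similarity(query, vector))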
InterSystems IRIS SQL supports a dedicated VECTOR type that leverages SIMD CPU instructions to efficiently perform distance computations and an EMBEDDING type that uses all the optimizations of the VECTOR type, but also simplifies the conversion of a string value from a source field in the same table (typically of type VARCHAR) into an embedding vector. There are four numeric vector types: float (the default), double, decimal, and integer.
As VECTOR and EMBEDDING are standard SQL data types, you can store them alongside other data in a relational table, converting a SQL database transparently into a hybrid vector database. To insert VECTOR-typed data with an INSERT statement, use the TO_VECTOR function; note that this requires you to manually create embeddings before you can include them in the INSERT statement. To insert EMBEDDING-typed data, define an embedding configuration, then use an INSERT statement to insert the string into the EMBEDDING column’s source field. InterSystems recommends using the EMBEDDING type for ease of use.
Inserting VECTOR-typed Data
Translate Text to Embeddings
Before you can store embeddings as vectors in InterSystems IRIS, you first need to create them from a source. In general, you can transform a piece of text into an embedding in three steps.
-
Import a package to use to turn your text into a series of embeddings.
-
Pre-process your text to best fit your chosen embedding model’s input specifications.
-
Instantiate the model and convert your text to the embeddings, using your chosen package’s workflow.
With Embedded Python, you can execute Python code that converts text to embeddings (using whichever package you like) alongside ObjectScript code that can insert those embeddings directly into your database. For information on importing a Python package into InterSystems IRIS, see Install and Import Python Packages.
Example: Create Embeddings with Embedded Python
The following example takes an input list of sentences, passed as a built-in Python list, converts them to embeddings using the sentence_transformers package, and returns a list of the embeddings for later insertion into a table. See Perform an INSERT for more information on inserting the embeddings. This example assumes that the input sentences have already been pre-processed to match the embedding model’s specifications.
ClassMethod GetEmbeddingPy(sentences) [ Language = python ]
{
    # import the package
    import sentence_transformers
    # perform any preprocessing of the sentences here
    # create the model and form the embeddings
    model = sentence_transformers.SentenceTransformer('all-MiniLM-L6-v2')
    embeddings = model.encode(sentences)
    # return the embeddings as comma-separated strings, ready to pass to TO_VECTOR
    return [",".join(str(value) for value in embedding) for embedding in embeddings]
}
Perform an INSERT
Once you have a list of strings that represent embeddings, you can insert them into your table as VECTORs either with an INSERT statement or by creating an object and storing the embedding as a property of the object. The following example demonstrates how to insert data with an INSERT.
Example
For each embedding, execute an INSERT statement that adds the embedding to the desired table. Use TO_VECTOR to convert the string representation of the embedding to a VECTOR.
In the following command, one embedding is inserted into a table called Sample.Descriptions that has two columns: one for the embedding that represents a textual description and the other for a unique identifier that can be used to link the embedding with the text it is derived from (implicitly stored in a separate table with the same unique identifier). (Note that the example uses ?s as placeholders for the embedding and the unique identifier, as these are typically supplied programmatically as parameters to the statement.)
INSERT INTO Sample.Descriptions (DescriptionEmbedding, UID)
VALUES (TO_VECTOR(?,FLOAT), ?)
The following code samples use this query to insert a single embedding into the table.
ClassMethod InsertEmbeddings(embedding As %String, uid As %Integer)
{
    set sc = $$$OK
    try {
        set myquery = "INSERT INTO Sample.Descriptions (DescriptionEmbedding, UID) "
        set myquery = myquery_"VALUES (TO_VECTOR(?,FLOAT), ?)"
        set tStatement = ##class(%SQL.Statement).%New()
        $$$ThrowOnError(tStatement.%Prepare(myquery))
        set rset = tStatement.%Execute(embedding, uid)
        if (rset.%SQLCODE < 0) {
            throw ##class(%Exception.SQL).CreateFromSQLCODE(rset.%SQLCODE,rset.%Message)
        }
    }
    catch e {
        set sc = e.AsStatus()
    }
    return sc
}
public void InsertEmbeddings(String embedding, int uid) {
    try {
        // set connection parameters
        IRISDataSource ds = new IRISDataSource();
        ds.setServerName("127.0.0.1");
        ds.setPortNumber(51776);
        ds.setDatabaseName("USER");
        ds.setUser("_SYSTEM");
        ds.setPassword("SYS");
        IRISConnection conn = (IRISConnection) ds.getConnection();
        String sql = "INSERT INTO Sample.Descriptions (DescriptionEmbedding, UID) " +
            "VALUES (TO_VECTOR(?,FLOAT), ?)";
        PreparedStatement pstmt = conn.prepareStatement(sql);
        pstmt.setString(1, embedding);
        pstmt.setInt(2, uid);
        pstmt.executeUpdate();
        pstmt.close();
        conn.close();
    } catch (Exception ex) {
        System.out.println("caught exception: "
            + ex.getClass().getName() + ": " + ex.getMessage());
    }
}
import iris

def insertEmbeddings(embedding, uid):
    # set the connection parameters
    conn_string = "localhost:1972/USER"
    username = "_system"
    password = "SYS"
    connection = iris.connect(conn_string, username, password)
    cursor = connection.cursor()
    try:
        sql = ("INSERT INTO Sample.Descriptions (DescriptionEmbedding, UID) "
               "VALUES (TO_VECTOR(?,FLOAT), ?)")
        params = [embedding, uid]
        cursor.execute(sql, params)
    except Exception as ex:
        print(ex)
    finally:
        if cursor:
            cursor.close()
        if connection:
            connection.close()
static void InsertEmbeddings(string emb, int uid)
{
    // set the connection parameters
    string host = "127.0.0.1";
    string port = "51776";
    string username = "_SYSTEM";
    string password = "SYS";
    string ns = "USER";
    IRISConnection conn = new IRISConnection();
    conn.ConnectionString = "Server = " + host
        + "; Port = " + port + "; Namespace = " + ns
        + "; Password = " + password + "; User ID = " + username;
    conn.Open();
    string sql = "INSERT INTO Sample.Descriptions (DescriptionEmbedding, UID) " +
        "VALUES (TO_VECTOR(?,FLOAT), ?)";
    IRISCommand cmd = new IRISCommand(sql, conn);
    // bind the two placeholders using the standard ADO.NET parameter API
    var embeddingParam = cmd.CreateParameter();
    embeddingParam.Value = emb;
    cmd.Parameters.Add(embeddingParam);
    var uidParam = cmd.CreateParameter();
    uidParam.Value = uid;
    cmd.Parameters.Add(uidParam);
    cmd.ExecuteNonQuery();
    cmd.Dispose();
    conn.Close();
}
Inserting EMBEDDING-typed Data
Create an Embedding Configuration
Before you can insert embeddings into a table, you must decide on an embedding model you wish to use to convert text into an embedding. Once you have selected a model, you can insert metadata for that model into the %Embedding.Config table to create an embedding configuration. The embedding configuration stores the information needed to make API calls to your chosen embedding model.
The %Embedding.Config table has five columns:
-
Name: The unique and valid identifier used to refer to the model.
-
Configuration: A JSON-formatted string that contains the configuration settings required by the particular embedding source. See OpenAI Configuration Settings and SentenceTransformers Configuration Settings for more information on using this parameter with the default embedding classes.
-
EmbeddingClass: The name of the ObjectScript class that extends %Embedding.Interface and defines logic for retrieving embeddings from the endpoint. InterSystems IRIS provides two such classes out of the box: %Embedding.OpenAI and %Embedding.SentenceTransformers. To use an embedding model from a different source, you must manually define a new class that extends the %Embedding.Interface class.
-
VectorLength: The length of the vector (number of dimensions) the embedding model returns.
-
Description: An optional description of this endpoint.
The following example inserts an embedding configuration for an OpenAI embedding model into the %Embedding.Config table.
INSERT INTO %Embedding.Config (Name, Configuration, EmbeddingClass, VectorLength, Description)
VALUES ('my-openai-config',
'{"apiKey":"<api key>",
"sslConfig": "llm_ssl",
"modelName": "text-embedding-3-small"}',
'%Embedding.OpenAI',
1536,
'a small embedding model provided by OpenAI')
The following example inserts an embedding configuration for a SentenceTransformers model into the %Embedding.Config table. Note that the VectorLength column is omitted; it is populated automatically, based on the model.
INSERT INTO %Embedding.Config (Name, Configuration, EmbeddingClass, Description)
VALUES ('sentence-transformers-config',
'{"modelName":"sentence-transformers/all-MiniLM-L6-v2",
"hfCachePath":"/Users/InterSystems/VEC147/hfCache",
"maxTokens": 256,
"checkTokenCount": true}',
'%Embedding.SentenceTransformers',
'a small SentenceTransformers embedding model')
OpenAI Configuration Settings
| Setting Name | Description | Required? |
| --- | --- | --- |
| modelName | The name of the OpenAI model. | Yes |
| sslConfig | The SSL Configuration used to make the API request to OpenAI. | Yes |
| apiKey | The OpenAI API key. | Yes |
| maxTokens | The maximum number of input tokens. | No (default value: 8191) |
| checkTokenCount | A boolean. If true, then inputs are checked against the maximum number of tokens. | No (default value: false) |
SentenceTransformers Configuration Settings
| Setting Name | Description | Required? |
| --- | --- | --- |
| modelName | The name of the SentenceTransformers model. | Yes |
| hfCachePath | The path on your machine to the embedding model. | Yes |
| maxTokens | The maximum number of input tokens. | No (automatically populated by your chosen model) |
| checkTokenCount | A boolean. If true, then inputs are checked against the maximum number of tokens. | No (default value: false) |
Define a Table with EMBEDDING-type Columns
Once you have stored an embedding configuration in the %Embedding.Config table, you can create a table that uses this configuration to populate EMBEDDING-typed columns.
When defining a column with the EMBEDDING type, you must specify two arguments: model and source. The model argument is the name of the embedding configuration used to convert text into an embedding. The source argument is a comma-separated list of properties in the class that are used to calculate the embedding.
The following example creates a table with four columns, two of which store strings and two of which store embeddings of those strings. Both of the EMBEDDING-typed fields use an embedding configuration called my-openai-config.
CREATE TABLE Embedding.Example (
Description VARCHAR(200),
Name VARCHAR(30),
DescriptionEmbedding EMBEDDING('my-openai-config','Description'),
NameEmbedding EMBEDDING('my-openai-config','Name')
)
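Once the table is defined, you populate the EMBEDDING-typed columns simply by inserting text into their source fields; InterSystems IRIS converts the text using the configured embedding model. The following minimal sketch reuses the Python DB-API connection pattern from the earlier insert examples; the connection details and the sample values are placeholders.
import iris

# connection details are placeholders; adjust them for your instance
connection = iris.connect("localhost:1972/USER", "_system", "SYS")
cursor = connection.cursor()
try:
    # insert only the source (VARCHAR) columns; DescriptionEmbedding and
    # NameEmbedding are populated automatically using 'my-openai-config'
    sql = "INSERT INTO Embedding.Example (Description, Name) VALUES (?, ?)"
    cursor.execute(sql, ["A lightweight laptop with all-day battery life", "UltraBook 13"])
finally:
    cursor.close()
    connection.close()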
Define a Vector Index
After storing data in InterSystems IRIS in the VECTOR type, you may define a vector index to improve the efficiency of searches issued against your stored vectors. Vectorized data is most often indexed in an approximate nearest neighbor (ANN) vector index.
In a standard vector search, comparisons against an input vector are made against every individual vector in the database. While this approach guarantees that your searches are completely accurate, it is computationally inefficient. An ANN vector index approximates the exact result by leveraging nearest neighbor algorithms that search a specialized graph data structure to limit the number of comparison operations performed between an input vector and the stored vectors. As a result, when a search is performed, the system does not make comparisons with each stored vector, but instead uses the graph to eliminate vectors that are not close to the input vector. This approach dramatically improves the performance of searches on a vector database, particularly when dealing with large amounts of high-dimensional data. This performance benefit outweighs the small loss in precision due to approximation.
As with standard indexes, the query optimizer may decide that the most efficient query plan does not use the vector index you have defined. To see whether a query uses the vector index, examine the query plan with the EXPLAIN command. Note that an ANN vector index can only be used in queries that include both a TOP clause and an ORDER BY ... DESC clause involving a vector distance function (such as VECTOR_DOT_PRODUCT) to find the best matching records, whereas other index types are routinely used to handle other clauses, including WHERE, GROUP BY, and JOIN conditions.
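For example, the following sketch issues an EXPLAIN statement through the Python DB-API connection pattern used in the earlier examples and prints the resulting plan, which you can inspect for the name of your vector index. The query text and connection details are placeholders, and this assumes that EXPLAIN can be executed from your client like any other statement.
import iris

# connection details are placeholders; adjust them for your instance
connection = iris.connect("localhost:1972/USER", "_system", "SYS")
cursor = connection.cursor()
try:
    # a query shaped to be eligible for an ANN index: a TOP clause plus an
    # ORDER BY on a vector distance function in descending order
    cursor.execute(
        "EXPLAIN SELECT TOP 5 Description FROM Embedding.Example "
        "ORDER BY VECTOR_DOT_PRODUCT(DescriptionEmbedding, EMBEDDING('laptop','my-openai-config')) DESC")
    for row in cursor.fetchall():
        print(row[0])   # look for your vector index name in the plan
finally:
    cursor.close()
    connection.close()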
Hierarchical Navigable Small World Index
InterSystems SQL allows you to define a Hierarchical Navigable Small World (HNSW) index, which uses the HNSW algorithm to create a vector index. HNSW is an efficient ANN algorithm implemented natively in InterSystems IRIS for use in vector search.
InterSystems intends to make improvements to the efficiency of the HNSW index that may impact its execution and structure, potentially requiring you to rebuild the index in a future version.
If, after upgrading to a later version, a query that attempts to use your HNSW index fails with the error message “HNSW index <IndexName> was built using an unsupported HNSW storage version (versions <old version number>); current HNSW code supports storage version <current version number>.,” simply rebuild the index.
You can define an HNSW index using a CREATE INDEX statement. To define an HNSW index, the following requirements must be met:
-
The HNSW index is defined on a VECTOR-typed or EMBEDDING-typed field that has a fixed length and an element type of double or decimal.
-
The table the index is defined on must have IDs that are bitmap supported.
-
The table the index is defined on must use default storage.
When defining an HNSW index, you must specify the distance function used by the index in the Distance index parameter. There are two possible values for this parameter: Cosine and DotProduct. This parameter is case insensitive and must be surrounded by single quotes.
There are two additional parameters, specific to the HNSW algorithm, that you can optionally specify when defining an HNSW index:
-
M (optional): The number of bi-directional links created for every new element during construction. This value should be a positive integer larger than 1, typically between 2 and 100. Higher M values work better on datasets with high dimensionality or high recall requirements, while lower M values work better with low dimensionality or recall requirements. The default value is 16.
-
efConstruction (optional): The size of the dynamic list for the nearest neighbors. This value should be a positive integer that is larger than M. Larger efConstruction values generally lead to better index quality, but longer construction time. There is a maximum value past which efConstruction does not improve the quality of the index. The default value is 64.
The following examples define HNSW indexes with various values in the parameters.
CREATE INDEX HNSWIndex ON TABLE Company.People (Biography)
AS HNSW(Distance='Cosine')
CREATE INDEX HNSWIndex ON TABLE Company.People (Biography)
AS HNSW(M=24, Distance='DotProduct')
CREATE INDEX HNSWIndex ON TABLE Company.People (Biography)
AS HNSW(M=32, efConstruction=100, Distance='Cosine')
Special Cases
If you build an HNSW index using Cosine as the Distance parameter, then all-zero vectors stored in a table are silently omitted from the index, so searches that use the HNSW indexes do not include all-zero vectors. Similarly, if you query a table with an HNSW index that uses Cosine as the Distance parameter to find vectors that are similar to an all-zero vector, the result set will contain zero rows.
Perform Vector Search
Vector search enables you to use one vector to search for other similar vectors stored within the database. In InterSystems SQL, you can perform such a search with a single SELECT query.
InterSystems SQL currently supports two functions that determine the similarity between two vectors: VECTOR_DOT_PRODUCT and VECTOR_COSINE. The larger the value of these functions, the more similar the vectors are. Use an ORDER BY clause with the DESC option to return a sorted set of vectors, with the most similar vectors returned at the top of the result set.
If you have defined a vector index, be sure to specify that results should be returned in descending order by using the DESC option in your ORDER BY clause.
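To get a feel for these functions, the following sketch (again using the Python DB-API connection pattern from the earlier examples) compares two small literal vectors with VECTOR_DOT_PRODUCT and VECTOR_COSINE; the vector values and connection details are placeholders.
import iris

# connection details are placeholders; adjust them for your instance
connection = iris.connect("localhost:1972/USER", "_system", "SYS")
cursor = connection.cursor()
try:
    # both functions return larger values for more similar vectors
    cursor.execute(
        "SELECT VECTOR_DOT_PRODUCT(TO_VECTOR(?,FLOAT), TO_VECTOR(?,FLOAT)), "
        "VECTOR_COSINE(TO_VECTOR(?,FLOAT), TO_VECTOR(?,FLOAT))",
        [".1,.2,.3", ".1,.25,.3", ".1,.2,.3", ".1,.25,.3"])
    print(cursor.fetchone())
finally:
    cursor.close()
    connection.close()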
Example
The following example demonstrates how to use SQL to issue a query that uses VECTOR_DOT_PRODUCT to find the most semantically similar descriptions to an input sentence. Convert an input search term to an embedding with the EMBEDDING function and use either VECTOR_DOT_PRODUCT or VECTOR_COSINE within an ORDER BY clause to return the most similar pieces of text. In this example, note that you do not need to specify the name of the embedding model in the EMBEDDING function because the function is used on an EMBEDDING-typed field in a table and the system automatically uses the embedding model associated with the field. To select only the most similar results, use a TOP clause. This example shows a SQL statement that selects the five descriptions that are the most similar to the input user query. (Note that the example uses ? as a placeholder for the embedding of the search term, as this value is typically provided as a parameter, not as a literal.)
SELECT TOP 5 Description FROM Embedding.Example
ORDER BY VECTOR_DOT_PRODUCT(DescriptionEmbedding,
EMBEDDING(?)) DESC
The following demonstrates an ObjectScript method that executes this query in Dynamic SQL (here specifying the embedding configuration explicitly in the EMBEDDING function), iterating through the result set to append the descriptions to a single string:
ClassMethod GetSimilarDesc(searchTerm As %String)
{
    set sc = $$$OK
    try {
        set query = "SELECT TOP 5 Description FROM Embedding.Example "
        set query = query_"ORDER BY VECTOR_DOT_PRODUCT(DescriptionEmbedding, "
        set query = query_"EMBEDDING(?,'my-openai-config')) DESC"
        set tStatement = ##class(%SQL.Statement).%New()
        $$$ThrowOnError(tStatement.%Prepare(query))
        set rset = tStatement.%Execute(searchTerm)
        if (rset.%SQLCODE < 0) {
            throw ##class(%Exception.SQL).CreateFromSQLCODE(rset.%SQLCODE,rset.%Message)
        }
        // process retrieved descriptions here and return the result
        set retrievedInfo = ""
        while rset.%Next() {
            set retrievedInfo = retrievedInfo_" Description: "_rset.%Get("Description")
        }
        return retrievedInfo
    }
    catch e {
        set sc = e.AsStatus()
        return sc
    }
}
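Client applications can issue the same search through the drivers shown earlier. The following is a minimal Python DB-API sketch, assuming the same placeholder connection details as the insert examples:
import iris

def get_similar_descriptions(search_term):
    # connection details are placeholders; adjust them for your instance
    connection = iris.connect("localhost:1972/USER", "_system", "SYS")
    cursor = connection.cursor()
    try:
        sql = ("SELECT TOP 5 Description FROM Embedding.Example "
               "ORDER BY VECTOR_DOT_PRODUCT(DescriptionEmbedding, "
               "EMBEDDING(?,'my-openai-config')) DESC")
        cursor.execute(sql, [search_term])
        # return the five most similar descriptions, best match first
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()
        connection.close()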