Amazon Bedrock


Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. Each model is accessible through a common API which implements a broad set of features to help build generative AI applications with security, privacy, and responsible AI in mind.

This guide walks you through an example that uses the Amazon Bedrock SDK with vecs. We will create embeddings using the Amazon Titan Embeddings G1 – Text v1.2 (amazon.titan-embed-text-v1) model, insert these embeddings into a PostgreSQL database using vecs, and then query the collection to find the sentences most similar to a given query sentence.

Create an Environment

First, you need to set up your environment. You will need Python 3.7+ with the vecs and boto3 libraries installed.

You can install the necessary Python libraries using pip:


pip install vecs boto3

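You'll also need credentials for an AWS account with access to Amazon Bedrock, and a PostgreSQL database with the pgvector extension enabled.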

Create Embeddings

Next, we will use the Amazon Titan Embeddings G1 – Text v1.2 model to create embeddings for a set of sentences.


import boto3
import vecs
import json

client = boto3.client(
    'bedrock-runtime',
    region_name='us-east-1',
    # Credentials from your AWS account
    aws_access_key_id='<replace_your_own_credentials>',
    aws_secret_access_key='<replace_your_own_credentials>',
    aws_session_token='<replace_your_own_credentials>',
)

dataset = [
    "The cat sat on the mat.",
    "The quick brown fox jumps over the lazy dog.",
    "Friends, Romans, countrymen, lend me your ears",
    "To be or not to be, that is the question.",
]

embeddings = []

for sentence in dataset:
    # invoke the embeddings model for each sentence
    response = client.invoke_model(
        body=json.dumps({"inputText": sentence}),
        modelId="amazon.titan-embed-text-v1",
        accept="application/json",
        contentType="application/json"
    )
    # collect the embedding from the response
    response_body = json.loads(response["body"].read())
    # add the embedding to the embedding list
    embeddings.append((sentence, response_body.get("embedding"), {}))
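
Each pass through the loop repeats the same request-and-parse steps. As a minimal sketch, those steps could be factored into a helper. Note that `embed_text` and `FakeBedrockClient` are hypothetical names introduced here for illustration, not part of boto3 or vecs, and the helper assumes the response body carries the `embedding` key used in this guide. The fake client only exists so the sketch can be tried without AWS access.

```python
import io
import json


def embed_text(client, text, model_id="amazon.titan-embed-text-v1"):
    """Return the embedding for `text` as a list of floats.

    `client` is anything with a boto3-style `invoke_model` method
    (hypothetical helper, not part of boto3 or vecs).
    """
    response = client.invoke_model(
        body=json.dumps({"inputText": text}),
        modelId=model_id,
        accept="application/json",
        contentType="application/json",
    )
    payload = json.loads(response["body"].read())
    embedding = payload.get("embedding")
    if embedding is None:
        raise ValueError(f"no 'embedding' key in response for model {model_id}")
    return embedding


class FakeBedrockClient:
    """Stand-in client for trying the helper offline (illustration only)."""

    def invoke_model(self, body, modelId, accept, contentType):
        text = json.loads(body)["inputText"]
        # Fake 3-dimensional "embedding" derived from the text length.
        fake = {"embedding": [float(len(text)), 0.0, 1.0]}
        return {"body": io.BytesIO(json.dumps(fake).encode("utf-8"))}


# Build (id, vector, metadata) records in the same shape the guide uses.
records = [(s, embed_text(FakeBedrockClient(), s), {}) for s in ["hello", "hi"]]
```

With a real `bedrock-runtime` client in place of the fake one, the same list comprehension would produce the `embeddings` list built above. Raising on a missing key, rather than storing `None`, keeps a malformed response from reaching the database.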

Store the Embeddings with vecs

Now that we have our embeddings, we can insert them into a PostgreSQL database using vecs.


import vecs

DB_CONNECTION = "postgresql://<user>:<password>@<host>:<port>/<db_name>"

# create vector store client
vx = vecs.Client(DB_CONNECTION)

# create a collection named 'sentences' with 1536 dimensional vectors
# to match the default dimension of the Titan Embeddings G1 - Text model
sentences = vx.get_or_create_collection(name="sentences", dimension=1536)

# upsert the embeddings into the 'sentences' collection
sentences.upsert(records=embeddings)

# create an index for the 'sentences' collection
sentences.create_index()
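
A collection created with `dimension=1536` expects every vector to have exactly that many elements. A small pre-flight check, sketched here in plain Python, can surface a malformed record before the upsert is attempted. `check_records` is a hypothetical helper, not part of vecs, and it assumes records are the `(id, vector, metadata)` tuples built earlier in this guide.

```python
def check_records(records, dimension):
    """Raise ValueError if any (id, vector, metadata) record is malformed.

    Returns the number of records checked. Hypothetical helper for
    illustration; it is not part of the vecs library.
    """
    for record_id, vector, metadata in records:
        if vector is None:
            raise ValueError(f"record {record_id!r} has no embedding")
        if len(vector) != dimension:
            raise ValueError(
                f"record {record_id!r} has {len(vector)} dimensions, "
                f"expected {dimension}"
            )
        if not isinstance(metadata, dict):
            raise ValueError(f"record {record_id!r} metadata must be a dict")
    return len(records)


# Illustration with tiny 3-dimensional vectors instead of 1536.
sample = [("a", [0.1, 0.2, 0.3], {}), ("b", [0.4, 0.5, 0.6], {})]
checked = check_records(sample, dimension=3)
```

In this guide the call would be `check_records(embeddings, dimension=1536)`, placed just before the upsert.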

Querying for Most Similar Sentences

Now we query the sentences collection to find the sentences most similar to a sample query sentence. First, we create an embedding for the query sentence using the same model. Next, we query the collection we created earlier with that embedding.


query_sentence = "A quick animal jumps over a lazy one."

# create vector store client
vx = vecs.Client(DB_CONNECTION)

# create an embedding for the query sentence
response = client.invoke_model(
    body=json.dumps({"inputText": query_sentence}),
    modelId="amazon.titan-embed-text-v1",
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response["body"].read())

query_embedding = response_body.get("embedding")

# query the 'sentences' collection for the most similar sentences
results = sentences.query(
    data=query_embedding,
    limit=3,
    include_value=True
)

# print the results
for result in results:
    print(result)

This returns the 3 most similar records along with their distances to the query vector. A smaller distance means a closer match.


('The quick brown fox jumps over the lazy dog.', 0.27600620558852)
('The cat sat on the mat.', 0.609986272479202)
('To be or not to be, that is the question.', 0.744849503688346)
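
The second element of each tuple is a distance between the stored vector and the query vector. Assuming the query uses cosine distance (believed to be the vecs default, but worth confirming in the vecs documentation for your version), the value is one minus the cosine of the angle between the two vectors. The sketch below computes that quantity in plain Python on toy vectors to show what the numbers mean; `cosine_distance` is an illustrative function, not the code vecs or pgvector actually runs.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - cos(angle between a and b)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Vectors pointing the same way: distance is approximately 0.
same_direction = cosine_distance([1.0, 2.0], [2.0, 4.0])
# Perpendicular vectors: distance is 1.
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
# Vectors pointing in opposite directions: distance is 2.
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])
```

Under that assumption, the result list above is ordered from most to least similar: the sentence about the fox and the dog has the smallest distance to the query.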

Resources