Vector Search: The New Frontier in Personalized Recommendations

Afshan Khan

Artificial Intelligence / Machine Learning

Introduction

Imagine you are a modern-day treasure hunter, not in search of hidden gold, but rather the wealth of knowledge and entertainment hidden within the vast digital ocean of content. In this realm, where every conceivable topic has its own sea of content, discovering what will truly captivate you is like finding a needle in an expansive haystack.

This challenge leads us to the marvels of recommendation services, acting as your compass in this digital expanse. These services are the unsung heroes behind the scenes of your favorite platforms, from e-commerce sites that suggest enticing products to streaming services that understand your movie preferences better than you might yourself. They sift through immense datasets of user interactions and content features, striving to tailor your online experience to be more personalized, engaging, and enriching.

But what if I told you that there is a cutting-edge technology that can take personalized recommendations to the next level? Today, I will take you on a journey to build a blog recommendation service that understands the contextual similarities between different pieces of content, going beyond basic keyword matching. We'll harness the power of vector search, a technology that's revolutionizing personalized recommendations. We'll explore how recommendation services are traditionally implemented, and then briefly discuss how vector search enhances them.

Finally, we'll put this knowledge to work, using OpenAI's embedding API and Elasticsearch to create a recommendation service that not only finds content but also understands and aligns it with your unique interests.

Exploring the Landscape: Traditional Recommendation Systems and Their Limits

Traditionally, these digital compasses, or recommendation systems, employ methods like collaborative and content-based filtering. Imagine sitting in a café where the barista suggests a coffee based on what others with similar tastes enjoyed (collaborative filtering) or based on your past coffee choices (content-based filtering). While these methods have been effective in many scenarios, they come with some limitations. They often stumble when faced with the vast and unstructured wilderness of web data, struggling to make sense of the diverse and ever-expanding content landscape. Additionally, when user preferences are ambiguous or when you want to recommend content by truly understanding it on a semantic level, traditional methods may fall short.

Enhancing Recommendation with Vector Search and Vector Databases

Our journey now takes an exciting turn with vector search and vector databases, the modern tools that help us navigate this unstructured data. These technologies transform our café into a futuristic spot where your coffee preference is understood on a deeper, more nuanced level.

Vector Search: The Art of Finding Similarities

Vector search operates like a seasoned traveler who understands the essence of every place visited. Text, images, or sounds can be transformed into numerical vectors, like unique coordinates on a map. The magic happens when these vectors are compared, revealing hidden similarities and connections, much like discovering that two seemingly different cities share a similar vibe.

Vector Databases: Navigating Complex Data Landscapes

Imagine a vast library of books where each book captures different aspects of a place along with its coordinates. Vector databases are akin to this library, designed to store and navigate these complex data points. They easily handle intricate queries over large datasets, making them perfect for our recommendation service, ensuring no blog worth reading remains undiscovered.

Embeddings: Semantic Representation

In our journey, embeddings are akin to a skilled artist who captures not just the visuals but the soul of a landscape. They map items like words or entire documents into real-number vectors, encapsulating their deeper meaning. This helps in understanding and comparing different pieces of content on a semantic level, letting the recommendation service show you things that really match your interests.
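To make the idea concrete, here is a small sketch of the core mechanic: map texts to vectors, then score their closeness with cosine similarity. The toy three-dimensional vectors below stand in for real embeddings, which have hundreds or thousands of dimensions:

import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors; 1.0 means identical direction
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" for illustration only
coffee_blog = [0.9, 0.1, 0.2]
espresso_blog = [0.8, 0.2, 0.3]
travel_blog = [0.1, 0.9, 0.7]

print(cosine_similarity(coffee_blog, espresso_blog))  # ~0.98: similar topics
print(cosine_similarity(coffee_blog, travel_blog))    # ~0.30: different topics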

Sample Project: Blog Recommendation Service

Project Overview

Now, let’s craft a simple blog recommendation service using OpenAI's embedding APIs and Elasticsearch as a vector database. The goal is to recommend blogs similar to the one the user is currently reading, which can be shown in the “read more” or recommendations section.

Our blog service will be responsible for indexing the blogs, finding similar ones, and interacting with the UI Service.

Tools and Setup

We will need the following tools to build our service:

  • OpenAI Account: We will be using OpenAI’s embedding API to generate the embeddings for our blog content. You will need an OpenAI account to use the APIs. Once you have created an account, please create an API key and store it in a secure location.
  • Elasticsearch: A popular database renowned for its full-text search capabilities, which can also be used as a vector database, adept at storing and querying complex embeddings with its dense_vector field type.
  • Docker: A tool that allows developers to package their applications and all the necessary dependencies into containers, ensuring that the application runs smoothly and consistently across different computing environments.
  • Python: A versatile programming language for developers across diverse fields, from web development to data science.

The APIs will be created using the FastAPI framework, but you can choose any framework.
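The remaining snippets were originally separate gists, so to keep each one focused I'll assume a small amount of shared scaffolding: the FastAPI application object and the common imports. A minimal sketch of that setup (the module layout here is my assumption, not something the original service prescribes):

import logging
from typing import Dict, List, Optional, Union

from fastapi import FastAPI, Response, status
from pydantic import BaseModel

app = FastAPI()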

Steps

First, we'll create a BlogItem class to represent each blog. It has only three fields, which will be enough for this demonstration, but real-world entities would have more details to accommodate a wider range of properties and functionalities.

class BlogItem:
    blog_id: int
    title: str
    content: str

    def __init__(self, blog_id: int, title: str, content: str):
        self.blog_id = blog_id
        self.title = title
        self.content = content

Elasticsearch Setup:

  • To store the blog data along with its embedding in Elasticsearch, we need to set up a local Elasticsearch cluster and then create an index for our blogs. You can also use a cloud-hosted cluster if you have already procured one.
  • Install Docker or Docker Desktop on your machine and create the Elasticsearch and Kibana containers using the docker compose file below. Run the following command to create and start the services in the background:
  • docker compose -f /path/to/your/docker-compose/file up -d
  • You can exclude the -f flag and file path if you are in the same directory as your docker-compose.yml file. The advantage of using docker compose is that it allows you to clean up these resources with just one command:
  • docker compose -f /path/to/your/docker-compose/file down

version: '3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:<version>
    container_name: elasticsearch
    environment:
      - node.name=docker-cluster
      - discovery.type=single-node
      - cluster.routing.allocation.disk.threshold_enabled=false
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - xpack.security.enabled=true
      - ELASTIC_PASSWORD=YourElasticPassword
      - "ELASTICSEARCH_USERNAME=elastic"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    networks:
      - esnet
  kibana:
    image: docker.elastic.co/kibana/kibana:<version>
    container_name: kibana
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
      ELASTICSEARCH_USERNAME: elastic
      ELASTICSEARCH_PASSWORD: YourElasticPassword
    ports:
      - "5601:5601"
    networks:
      - esnet
    depends_on:
      - elasticsearch
networks:
  esnet:
    driver: bridge
volumes:
  esdata:
    driver: local
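Once the containers are up, you can confirm the cluster is reachable before moving on (this assumes the elastic password from the compose file above):

curl -u elastic:YourElasticPassword http://localhost:9200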

  • Connect to the local ES instance and create an index. Our “blogs” index will have a unique blog ID, the blog title, the blog content, and an embedding field to store the vector representation of the blog content. The text-embedding-ada-002 model we use here produces vectors with 1536 dimensions, so it’s important to set the same number of dimensions on the embedding field of the blogs index.

import logging

from elasticsearch import Elasticsearch

# Set up logging
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s [%(levelname)s] %(filename)s:%(lineno)d - %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S')

# Elasticsearch client setup
es = Elasticsearch(hosts="http://localhost:9200",
                   basic_auth=("elastic", "YourElasticPassword"))

# Create an index with a mapping for embeddings
def create_index(index_name: str, embedding_dimensions: int):
    try:
        es.indices.create(
            index=index_name,
            body={
                'mappings': {
                    'properties': {
                        'blog_id': {'type': 'long'},
                        'title': {'type': 'text'},
                        'content': {'type': 'text'},
                        'embedding': {'type': 'dense_vector', 'dims': embedding_dimensions}
                    }
                }
            }
        )
    except Exception as e:
        logging.error(f"An error occurred while creating index {index_name} : {e}")

# Sample usage
create_index("blogs", 1536)
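As an optional sanity check, you can fetch the mapping back and confirm the dense_vector field was created with the right dimensions:

# Optional: verify the index mapping
print(es.indices.get_mapping(index="blogs"))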

Create Embeddings and Index Blogs:

  • We use OpenAI's Embedding API to get a vector representation of our blog title and content. I am using the text-embedding-ada-002 model here, which OpenAI recommends for most use cases. The input to text-embedding-ada-002 must not exceed 8191 tokens (1,000 tokens are roughly equal to 750 words) and cannot be empty.

from openai import OpenAI, OpenAIError

api_key = 'YourApiKey'
client = OpenAI(api_key=api_key)

def create_embeddings(text: str, model: str = "text-embedding-ada-002") -> list[float]:
    try:
        text = text.replace("\n", " ")
        response = client.embeddings.create(input=[text], model=model)
        logging.info("Embedding created successfully")
        return response.data[0].embedding
    except OpenAIError as e:
        logging.error(f"An OpenAI error occurred while creating embedding : {e}")
        raise
    except Exception as e:
        logging.exception(f"An unexpected error occurred while creating embedding : {e}")
        raise
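Because the model rejects inputs over the token limit, it can be worth guarding against oversized blogs before calling the API. Below is a minimal sketch using the tiktoken library; truncating to the first 8191 tokens is a simplification I'm assuming here, and chunking the text and averaging the chunk embeddings would preserve more of the content:

import tiktoken

MAX_TOKENS = 8191  # input limit for text-embedding-ada-002

def truncate_to_token_limit(text: str, model: str = "text-embedding-ada-002") -> str:
    # Encode with the model's tokenizer, then keep only the tokens that fit
    encoding = tiktoken.encoding_for_model(model)
    tokens = encoding.encode(text)
    if len(tokens) <= MAX_TOKENS:
        return text
    return encoding.decode(tokens[:MAX_TOKENS])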

  • When a blog gets created or its content gets updated, we call the create_embeddings function to get the text embedding and store it in our blogs index.

# Define the index name as a global constant
ELASTICSEARCH_INDEX = 'blogs'

def index_blog(blog_item: BlogItem):
    try:
        es.index(index=ELASTICSEARCH_INDEX, body={
            'blog_id': blog_item.blog_id,
            'title': blog_item.title,
            'content': blog_item.content,
            'embedding': create_embeddings(blog_item.title + "\n" + blog_item.content)
        })
    except Exception as e:
        logging.error(f"Failed to index blog with blog id {blog_item.blog_id} : {e}")
        raise
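One detail worth flagging: as written, es.index generates a fresh Elasticsearch document ID on every call, so re-indexing an updated blog would leave a stale duplicate behind. Passing the blog ID as the document ID turns the call into an overwrite; a possible variant:

# Use the blog ID as the document ID so updates overwrite rather than duplicate
es.index(index=ELASTICSEARCH_INDEX, id=str(blog_item.blog_id), body={
    'blog_id': blog_item.blog_id,
    'title': blog_item.title,
    'content': blog_item.content,
    'embedding': create_embeddings(blog_item.title + "\n" + blog_item.content)
})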

  • Create a Pydantic model for the request body:

class BlogItemRequest(BaseModel):
    title: str
    content: str

  • Create an API to save blogs to Elasticsearch. The UI Service would call this API when a new blog post gets created.

@app.post("/blogs/")
def save_blog(response: Response, blog_item: BlogItemRequest) -> dict[str, str]:
# Create a BlogItem instance from the request data
try:
blog_id = get_blog_id()
blog_item_obj = BlogItem(
blog_id=blog_id,
title=blog_item.title,
content=blog_item.content,
)
# Call the index_blog method to index the blog
index_blog(blog_item_obj)
return {"message": "Blog indexed successfully", "blog_id": str(blog_id)}
except Exception as e:
logging.error(f"An error occurred while indexing blog with blog_id {blog_id} : {e}")
response.status_code = status.HTTP_500_INTERNAL_SERVER_ERROR
return {"error": "Failed to index blog"}
view raw .py hosted with ❤ by GitHub
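With the service running (say, via uvicorn on the default port 8000, which is an assumption on my part), indexing a blog is a single request:

import requests

# Hypothetical local test using a sample from the test dataset
resp = requests.post("http://localhost:8000/blogs/", json={
    "title": "Breakthrough in Heart Disease Treatment",
    "content": "Researchers have developed a new treatment for heart disease..."
})
print(resp.json())  # e.g. {"message": "Blog indexed successfully", "blog_id": "1"}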

Finding Relevant Blogs:

  • To find blogs similar to the current one, we compare the current blog’s vector representation with those of the other blogs in the Elasticsearch index using the cosine similarity function.
  • Cosine similarity measures the cosine of the angle between two vectors in a multi-dimensional space, and is often used to assess the similarity between two documents or data points. For vectors A and B, it is computed as (A · B) / (||A|| ||B||).
  • The cosine similarity score ranges from -1 to 1, with higher values indicating greater similarity between the vectors.
  • Create a custom exception to handle the scenario where a blog with a given ID is not present in Elasticsearch.

class BlogNotFoundException(Exception):
    def __init__(self, message="Blog not found"):
        self.message = message
        super().__init__(self.message)

  • First, we check whether the current blog is present in the blogs index and fetch its stored embedding. This avoids unnecessary calls to the OpenAI API, which consume tokens. Then, we construct an Elasticsearch DSL query to find the nearest neighbors and return their blog content.

def get_blog_embedding(blog_id: int) -> Optional[Dict]:
    try:
        response = es.search(index=ELASTICSEARCH_INDEX, body={
            'query': {
                'term': {
                    'blog_id': blog_id
                }
            },
            '_source': ['title', 'content', 'embedding']  # Fetch title, content, and embedding
        })
        if response['hits']['hits']:
            logging.info(f"Blog found with blog_id {blog_id}")
            return response['hits']['hits'][0]['_source']
        else:
            logging.info(f"No blog found with blog_id {blog_id}")
            return None
    except Exception as e:
        logging.error(f"Error occurred while searching for blog with blog_id {blog_id}: {e}")
        raise

def find_similar_blog(current_blog_id: int, num_neighbors=2) -> list[dict[str, str]]:
    try:
        blog_data = get_blog_embedding(current_blog_id)
        if not blog_data:
            raise BlogNotFoundException(f"Blog not found for id:{current_blog_id}")
        blog_embedding = blog_data['embedding']
        if not blog_embedding:
            blog_embedding = create_embeddings(blog_data['title'] + '\n' + blog_data['content'])
        # Find similar blogs using the embedding. Note that script_score requires
        # non-negative scores; OpenAI embeddings yield positive cosines in practice,
        # but "cosineSimilarity(...) + 1.0" is the safer idiom in general.
        response = es.search(index=ELASTICSEARCH_INDEX, body={
            'size': num_neighbors + 1,  # Retrieve an extra result as we'll exclude the current blog
            '_source': ['title', 'content', 'blog_id'],
            'query': {
                'bool': {
                    'must': {
                        'script_score': {
                            'query': {'match_all': {}},
                            'script': {
                                'source': "cosineSimilarity(params.query_vector, 'embedding')",
                                'params': {'query_vector': blog_embedding}
                            }
                        }
                    },
                    'must_not': {
                        'term': {
                            'blog_id': current_blog_id  # Exclude the current blog
                        }
                    }
                }
            }
        })
        # Extract and return the hits
        hits = [
            {
                'title': hit['_source']['title'],
                'content': hit['_source']['content'],
                'blog_id': hit['_source']['blog_id'],
                'score': f"{hit['_score'] * 100:.2f}%"
            }
            for hit in response['hits']['hits']
            if hit['_source']['blog_id'] != current_blog_id
        ]
        return hits
    except Exception as e:
        logging.error(f"An error occurred while finding similar blogs: {e}")
        raise

  • Define a Pydantic model for the response:

class BlogRecommendation(BaseModel):
    blog_id: int
    title: str
    content: str
    score: str

  • Create an API that the UI Service would use to find blogs similar to the one the user is currently reading:

@app.get("/recommend-blogs/{current_blog_id}")
def recommend_blogs(
response: Response,
current_blog_id: int,
num_neighbors: Optional[int] = 2) -> Union[Dict[str, str], List[BlogRecommendation]]:
try:
# Call the find_similar_blog function to get recommended blogs
recommended_blogs = find_similar_blog(current_blog_id, num_neighbors)
return recommended_blogs
except BlogNotFoundException as e:
response.status_code = status.HTTP_400_BAD_REQUEST
return {"error": f"Blog not found for id:{current_blog_id}"}
except Exception as e:
response.status_code = status.HTTP_500_INTERNAL_SERVER_ERROR
return {"error": "Unable to process the request"}
view raw .py hosted with ❤ by GitHub
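Fetching recommendations is then a simple GET, again assuming the service is served locally on port 8000:

import requests

# Hypothetical local test: ask for the 3 nearest neighbors of blog 1
resp = requests.get("http://localhost:8000/recommend-blogs/1", params={"num_neighbors": 3})
print(resp.json())  # list of {"blog_id", "title", "content", "score"} objects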

  • The flow diagram below summarizes all the steps we have discussed so far:

Testing the Recommendation Service

  • Ideally, we would receive the blog ID from the UI Service and pass the recommendations back, but for illustration purposes, we’ll call the recommend-blogs API with some test inputs from my test dataset. The blogs in this sample dataset have concise titles and content, which is sufficient for testing, but real-world blogs will be much more detailed and carry a significant amount of data. The test dataset has around 1,000 blogs across categories like healthcare, tech, travel, entertainment, and so on.
  • A sample from the test dataset:


  • Test Result 1: Medical Research Blog

    Input Blog: Blog_Id: 1, Title: Breakthrough in Heart Disease Treatment, Content: Researchers have developed a new treatment for heart disease that promises to be more effective and less invasive. This breakthrough could save millions of lives every year.


  • Test Result 2: Travel Blog

    Input Blog: Blog_Id: 4, Title: Travel Tips for Sustainable Tourism, Content: How to travel responsibly and sustainably.

I manually tested multiple blogs from the test dataset of 1,000 blogs, representing distinct topics and content, and assessed the quality and relevance of the recommendations. The recommended blogs had scores in the range of 87% to 95%, and upon examination, the blogs often appeared very similar in content and style.

Based on the test results, it's evident that utilizing vector search enables us to effectively recommend blogs to users that are semantically similar. This approach ensures that the recommendations are contextually relevant, even when the blogs don't share identical keywords, enhancing the user's experience by connecting them with content that aligns more closely with their interests and search intent.

Limitations

This approach for finding similar blogs is good enough for our simple recommendation service, but it might have certain limitations in real-world applications.

  • Our similarity search returns the nearest k neighbors as recommendations, but there may be scenarios where no genuinely similar blog exists, or where the neighbors have significantly different scores. To deal with this, you can set a threshold to filter out recommendations below a certain score, as sketched after this list. Experiment with different threshold values and observe their impact on recommendation quality.
  • If your use case involves a small dataset and the relationships between user preferences and item features are straightforward and well-defined, traditional methods like content-based or collaborative filtering might be more efficient and effective than vector search.
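A minimal way to apply such a threshold on top of the find_similar_blog function above; the 0.90 cutoff is an arbitrary starting point to tune, not a recommendation:

MIN_SIMILARITY = 0.90  # assumed starting point; tune against your own data

def find_similar_blogs_with_threshold(current_blog_id: int, num_neighbors: int = 2):
    hits = find_similar_blog(current_blog_id, num_neighbors)
    # Scores were formatted as percentage strings, e.g. "92.35%"
    return [hit for hit in hits if float(hit['score'].rstrip('%')) / 100 >= MIN_SIMILARITY]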

Further Improvements

  • Using LLM for Content Validation: Implement a verification step using large language models (LLMs) to assess the relevance and validity of recommended content. This approach can ensure that the suggestions are not only similar in context but also meaningful and appropriate for your audience.
  • Metadata-based Embeddings: Instead of generating embeddings from the entire blog content, utilize LLMs to extract key metadata such as themes, intent, tone, or key points. Create embeddings based on this extracted metadata, which can lead to more efficient and targeted recommendations, focusing on the core essence of the content rather than its entirety. One possible shape of this idea is sketched below.
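As a rough illustration of the metadata idea, you could ask a chat model to distill a blog before embedding it. The prompt and model choice below are assumptions on my part, a sketch rather than a tested recipe:

def extract_metadata(blog_text: str) -> str:
    # Ask an LLM to distill themes, intent, tone, and key points
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Summarize the themes, intent, tone, "
                                          "and key points of this blog in a few sentences."},
            {"role": "user", "content": blog_text}
        ]
    )
    return completion.choices[0].message.content

# Embed the distilled metadata instead of the full content
metadata_embedding = create_embeddings(extract_metadata("<blog text>"))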

Conclusion

Our journey concludes here, but yours is just beginning. Armed with the knowledge of vector search, vector databases, and embeddings, you're now ready to build a recommendation service that doesn't just guide users to content but connects them to the stories, insights, and experiences they seek. It's not just about building a service; it's about enriching the digital exploration experience, one recommendation at a time.
