Vector Search: The New Frontier in Personalized Recommendations

Afshan Khan

Artificial Intelligence / Machine Learning

Introduction

Imagine you are a modern-day treasure hunter, not in search of hidden gold, but rather the wealth of knowledge and entertainment hidden within the vast digital ocean of content. In this realm, where every conceivable topic has its own sea of content, discovering what will truly captivate you is like finding a needle in an expansive haystack.

This challenge leads us to the marvels of recommendation services, acting as your compass in this digital expanse. These services are the unsung heroes behind the scenes of your favorite platforms, from e-commerce sites that suggest enticing products to streaming services that understand your movie preferences better than you might yourself. They sift through immense datasets of user interactions and content features, striving to tailor your online experience to be more personalized, engaging, and enriching.

But what if I told you that there is a cutting-edge technology that can take personalized recommendations to the next level? Today, I will take you on a journey to build a blog recommendation service that understands the contextual similarities between different pieces of content, going beyond basic keyword matching. We'll harness the power of vector search, a technology that's revolutionizing personalized recommendations. We'll explore how recommendation services are traditionally implemented, and then briefly discuss how vector search enhances them.

Finally, we'll put this knowledge to work, using OpenAI's embedding API and Elasticsearch to create a recommendation service that not only finds content but also understands and aligns it with your unique interests.

Exploring the Landscape: Traditional Recommendation Systems and Their Limits

Traditionally, these digital compasses, or recommendation systems, employ methods like collaborative and content-based filtering. Imagine sitting in a café where the barista suggests a coffee based on what others with similar tastes enjoyed (collaborative filtering) or based on your past coffee choices (content-based filtering). While these methods have been effective in many scenarios, they come with some limitations. They often stumble when faced with the vast and unstructured wilderness of web data, struggling to make sense of the diverse and ever-expanding content landscape. Additionally, when user preferences are ambiguous or when you want to recommend content by truly understanding it on a semantic level, traditional methods may fall short.

Enhancing Recommendation with Vector Search and Vector Databases

Our journey now takes an exciting turn with vector search and vector databases, the modern tools that help us navigate this unstructured data. These technologies transform our café into a futuristic spot where your coffee preference is understood on a deeper, more nuanced level.

Vector Search: The Art of Finding Similarities

Vector search operates like a seasoned traveler who understands the essence of every place visited. Text, images, or sounds can be transformed into numerical vectors, like unique coordinates on a map. The magic happens when these vectors are compared, revealing hidden similarities and connections, much like discovering that two seemingly different cities share a similar vibe.

Vector Databases: Navigating Complex Data Landscapes

Imagine a vast library of books where each book captures different aspects of a place along with its coordinates. Vector databases are akin to this library, designed to store and navigate these complex data points. They easily handle intricate queries over large datasets, making them perfect for our recommendation service, ensuring no blog worth reading remains undiscovered.

Embeddings: Semantic Representation

In our journey, embeddings are akin to a skilled artist who captures not just the visuals but the soul of a landscape. They map items like words or entire documents into real-number vectors, encapsulating their deeper meaning. This helps in understanding and comparing different pieces of content on a semantic level, letting the recommendation service show you things that really match your interests.
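To make the idea concrete, here is a small sketch of the core mechanic: map texts to vectors, then score their closeness with cosine similarity. The toy three-dimensional vectors below stand in for real embeddings, which have hundreds or thousands of dimensions:

import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors; 1.0 means identical direction
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" for illustration only
coffee_blog = [0.9, 0.1, 0.2]
espresso_blog = [0.8, 0.2, 0.3]
travel_blog = [0.1, 0.9, 0.7]

print(cosine_similarity(coffee_blog, espresso_blog))  # ~0.98: similar topics
print(cosine_similarity(coffee_blog, travel_blog))    # ~0.30: different topics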

Sample Project: Blog Recommendation Service

Project Overview

Now, let’s craft a simple blog recommendation service using OpenAI's embedding APIs and Elasticsearch as a vector database. The goal is to recommend blogs similar to the one the user is currently reading, which can be shown in the “read more” or recommendations section.

Our blog service will be responsible for indexing the blogs, finding similar ones, and interacting with the UI Service.

Tools and Setup

We will need the following tools to build our service:

  • OpenAI Account: We will be using OpenAI’s embedding API to generate the embeddings for our blog content. You will need an OpenAI account to use the APIs. Once you have created an account, please create an API key and store it in a secure location.
  • Elasticsearch: A popular database renowned for its full-text search capabilities, which can also be used as a vector database, adept at storing and querying complex embeddings with its dense_vector field type.
  • Docker: A tool that allows developers to package their applications and all the necessary dependencies into containers, ensuring that the application runs smoothly and consistently across different computing environments.
  • Python: A versatile programming language for developers across diverse fields, from web development to data science.

The APIs will be created using the FastAPI framework, but you can choose any framework.
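The remaining snippets were originally separate gists, so to keep each one focused I'll assume a small amount of shared scaffolding: the FastAPI application object and the common imports. A minimal sketch of that setup (the module layout here is my assumption, not something the original service prescribes):

import logging
from typing import Dict, List, Optional, Union

from fastapi import FastAPI, Response, status
from pydantic import BaseModel

app = FastAPI()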

Steps

First, we'll create a BlogItem class to represent each blog. It has only three fields, which will be enough for this demonstration, but real-world entities would have more details to accommodate a wider range of properties and functionalities.

class BlogItem:
    blog_id: int
    title: str
    content: str

    def __init__(self, blog_id: int, title: str, content: str):
        self.blog_id = blog_id
        self.title = title
        self.content = content

Elasticsearch Setup:

  • To store the blog data along with its embedding in Elasticsearch, we need to set up a local Elasticsearch cluster and then create an index for our blogs. You can also use a cloud-hosted cluster if you have already procured one.
  • Install Docker or Docker Desktop on your machine and create the Elasticsearch and Kibana containers using the docker compose file below. Run the following command to create and start the services in the background:
  • docker compose -f /path/to/your/docker-compose/file up -d
  • You can exclude the -f flag and file path if you are in the same directory as your docker-compose.yml file. The advantage of using docker compose is that it allows you to clean up these resources with just one command:
  • docker compose -f /path/to/your/docker-compose/file down

version: '3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:<version>
    container_name: elasticsearch
    environment:
      - node.name=docker-cluster
      - discovery.type=single-node
      - cluster.routing.allocation.disk.threshold_enabled=false
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - xpack.security.enabled=true
      - ELASTIC_PASSWORD=YourElasticPassword
      - "ELASTICSEARCH_USERNAME=elastic"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    networks:
      - esnet
  kibana:
    image: docker.elastic.co/kibana/kibana:<version>
    container_name: kibana
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
      ELASTICSEARCH_USERNAME: elastic
      ELASTICSEARCH_PASSWORD: YourElasticPassword
    ports:
      - "5601:5601"
    networks:
      - esnet
    depends_on:
      - elasticsearch
networks:
  esnet:
    driver: bridge
volumes:
  esdata:
    driver: local
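Once the containers are up, you can confirm the cluster is reachable before moving on (this assumes the elastic password from the compose file above):

curl -u elastic:YourElasticPassword http://localhost:9200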

  • Connect to the local ES instance and create an index. Our “blogs” index will have a unique blog ID, the blog title, the blog content, and an embedding field to store the vector representation of the blog content. The text-embedding-ada-002 model we use here produces vectors with 1536 dimensions, so it’s important to set the same number of dimensions on the embedding field of the blogs index.

import logging

from elasticsearch import Elasticsearch

# Set up logging
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s [%(levelname)s] %(filename)s:%(lineno)d - %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S')

# Elasticsearch client setup
es = Elasticsearch(hosts="http://localhost:9200",
                   basic_auth=("elastic", "YourElasticPassword"))

# Create an index with a mapping for embeddings
def create_index(index_name: str, embedding_dimensions: int):
    try:
        es.indices.create(
            index=index_name,
            body={
                'mappings': {
                    'properties': {
                        'blog_id': {'type': 'long'},
                        'title': {'type': 'text'},
                        'content': {'type': 'text'},
                        'embedding': {'type': 'dense_vector', 'dims': embedding_dimensions}
                    }
                }
            }
        )
    except Exception as e:
        logging.error(f"An error occurred while creating index {index_name} : {e}")

# Sample usage
create_index("blogs", 1536)
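As an optional sanity check, you can fetch the mapping back and confirm the dense_vector field was created with the right dimensions:

# Optional: verify the index mapping
print(es.indices.get_mapping(index="blogs"))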

Create Embeddings and Index Blogs:

  • We use OpenAI's Embedding API to get a vector representation of our blog title and content. I am using the text-embedding-ada-002 model here, which OpenAI recommends for most use cases. The input to text-embedding-ada-002 must not exceed 8191 tokens (1,000 tokens are roughly equal to 750 words) and cannot be empty.

from openai import OpenAI, OpenAIError

api_key = 'YourApiKey'
client = OpenAI(api_key=api_key)

def create_embeddings(text: str, model: str = "text-embedding-ada-002") -> list[float]:
    try:
        text = text.replace("\n", " ")
        response = client.embeddings.create(input=[text], model=model)
        logging.info("Embedding created successfully")
        return response.data[0].embedding
    except OpenAIError as e:
        logging.error(f"An OpenAI error occurred while creating embedding : {e}")
        raise
    except Exception as e:
        logging.exception(f"An unexpected error occurred while creating embedding : {e}")
        raise
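Because the model rejects inputs over the token limit, it can be worth guarding against oversized blogs before calling the API. Below is a minimal sketch using the tiktoken library; truncating to the first 8191 tokens is a simplification I'm assuming here, and chunking the text and averaging the chunk embeddings would preserve more of the content:

import tiktoken

MAX_TOKENS = 8191  # input limit for text-embedding-ada-002

def truncate_to_token_limit(text: str, model: str = "text-embedding-ada-002") -> str:
    # Encode with the model's tokenizer, then keep only the tokens that fit
    encoding = tiktoken.encoding_for_model(model)
    tokens = encoding.encode(text)
    if len(tokens) <= MAX_TOKENS:
        return text
    return encoding.decode(tokens[:MAX_TOKENS])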

  • When a blog gets created or its content gets updated, we call the create_embeddings function to get the text embedding and store it in our blogs index.

# Define the index name as a global constant
ELASTICSEARCH_INDEX = 'blogs'

def index_blog(blog_item: BlogItem):
    try:
        es.index(index=ELASTICSEARCH_INDEX, body={
            'blog_id': blog_item.blog_id,
            'title': blog_item.title,
            'content': blog_item.content,
            'embedding': create_embeddings(blog_item.title + "\n" + blog_item.content)
        })
    except Exception as e:
        logging.error(f"Failed to index blog with blog id {blog_item.blog_id} : {e}")
        raise
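One detail worth flagging: as written, es.index generates a fresh Elasticsearch document ID on every call, so re-indexing an updated blog would leave a stale duplicate behind. Passing the blog ID as the document ID turns the call into an overwrite; a possible variant:

# Use the blog ID as the document ID so updates overwrite rather than duplicate
es.index(index=ELASTICSEARCH_INDEX, id=str(blog_item.blog_id), body={
    'blog_id': blog_item.blog_id,
    'title': blog_item.title,
    'content': blog_item.content,
    'embedding': create_embeddings(blog_item.title + "\n" + blog_item.content)
})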

  • Create a Pydantic model for the request body:

class BlogItemRequest(BaseModel):
    title: str
    content: str

  • Create an API to save blogs to Elasticsearch. The UI Service would call this API when a new blog post gets created.

@app.post("/blogs/")
def save_blog(response: Response, blog_item: BlogItemRequest) -> dict[str, str]:
# Create a BlogItem instance from the request data
try:
blog_id = get_blog_id()
blog_item_obj = BlogItem(
blog_id=blog_id,
title=blog_item.title,
content=blog_item.content,
)
# Call the index_blog method to index the blog
index_blog(blog_item_obj)
return {"message": "Blog indexed successfully", "blog_id": str(blog_id)}
except Exception as e:
logging.error(f"An error occurred while indexing blog with blog_id {blog_id} : {e}")
response.status_code = status.HTTP_500_INTERNAL_SERVER_ERROR
return {"error": "Failed to index blog"}
view raw .py hosted with ❤ by GitHub
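With the service running (say, via uvicorn on the default port 8000, which is an assumption on my part), indexing a blog is a single request:

import requests

# Hypothetical local test using a sample from the test dataset
resp = requests.post("http://localhost:8000/blogs/", json={
    "title": "Breakthrough in Heart Disease Treatment",
    "content": "Researchers have developed a new treatment for heart disease..."
})
print(resp.json())  # e.g. {"message": "Blog indexed successfully", "blog_id": "1"}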

Finding Relevant Blogs:

  • To find blogs similar to the current one, we compare the current blog’s vector representation with those of the other blogs in the Elasticsearch index using the cosine similarity function.
  • Cosine similarity measures the cosine of the angle between two vectors in a multi-dimensional space, and is often used to assess the similarity between two documents or data points. For vectors A and B, it is computed as (A · B) / (||A|| ||B||).
  • The cosine similarity score ranges from -1 to 1, with higher values indicating greater similarity between the vectors.
  • Create a custom exception to handle the scenario where a blog with a given ID is not present in Elasticsearch.

class BlogNotFoundException(Exception):
    def __init__(self, message="Blog not found"):
        self.message = message
        super().__init__(self.message)

  • First, we check whether the current blog is present in the blogs index and fetch its stored embedding. This avoids unnecessary calls to the OpenAI API, which consume tokens. Then, we construct an Elasticsearch DSL query to find the nearest neighbors and return their blog content.

def get_blog_embedding(blog_id: int) -> Optional[Dict]:
    try:
        response = es.search(index=ELASTICSEARCH_INDEX, body={
            'query': {
                'term': {
                    'blog_id': blog_id
                }
            },
            '_source': ['title', 'content', 'embedding']  # Fetch title, content, and embedding
        })
        if response['hits']['hits']:
            logging.info(f"Blog found with blog_id {blog_id}")
            return response['hits']['hits'][0]['_source']
        else:
            logging.info(f"No blog found with blog_id {blog_id}")
            return None
    except Exception as e:
        logging.error(f"Error occurred while searching for blog with blog_id {blog_id}: {e}")
        raise

def find_similar_blog(current_blog_id: int, num_neighbors=2) -> list[dict[str, str]]:
    try:
        blog_data = get_blog_embedding(current_blog_id)
        if not blog_data:
            raise BlogNotFoundException(f"Blog not found for id:{current_blog_id}")
        blog_embedding = blog_data['embedding']
        if not blog_embedding:
            blog_embedding = create_embeddings(blog_data['title'] + '\n' + blog_data['content'])
        # Find similar blogs using the embedding. Note that script_score requires
        # non-negative scores; OpenAI embeddings yield positive cosines in practice,
        # but "cosineSimilarity(...) + 1.0" is the safer idiom in general.
        response = es.search(index=ELASTICSEARCH_INDEX, body={
            'size': num_neighbors + 1,  # Retrieve an extra result as we'll exclude the current blog
            '_source': ['title', 'content', 'blog_id'],
            'query': {
                'bool': {
                    'must': {
                        'script_score': {
                            'query': {'match_all': {}},
                            'script': {
                                'source': "cosineSimilarity(params.query_vector, 'embedding')",
                                'params': {'query_vector': blog_embedding}
                            }
                        }
                    },
                    'must_not': {
                        'term': {
                            'blog_id': current_blog_id  # Exclude the current blog
                        }
                    }
                }
            }
        })
        # Extract and return the hits
        hits = [
            {
                'title': hit['_source']['title'],
                'content': hit['_source']['content'],
                'blog_id': hit['_source']['blog_id'],
                'score': f"{hit['_score'] * 100:.2f}%"
            }
            for hit in response['hits']['hits']
            if hit['_source']['blog_id'] != current_blog_id
        ]
        return hits
    except Exception as e:
        logging.error(f"An error occurred while finding similar blogs: {e}")
        raise

  • Define a Pydantic model for the response:

class BlogRecommendation(BaseModel):
    blog_id: int
    title: str
    content: str
    score: str

  • Create an API that the UI Service would use to find blogs similar to the one the user is currently reading:

@app.get("/recommend-blogs/{current_blog_id}")
def recommend_blogs(
response: Response,
current_blog_id: int,
num_neighbors: Optional[int] = 2) -> Union[Dict[str, str], List[BlogRecommendation]]:
try:
# Call the find_similar_blog function to get recommended blogs
recommended_blogs = find_similar_blog(current_blog_id, num_neighbors)
return recommended_blogs
except BlogNotFoundException as e:
response.status_code = status.HTTP_400_BAD_REQUEST
return {"error": f"Blog not found for id:{current_blog_id}"}
except Exception as e:
response.status_code = status.HTTP_500_INTERNAL_SERVER_ERROR
return {"error": "Unable to process the request"}
view raw .py hosted with ❤ by GitHub
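Fetching recommendations is then a simple GET, again assuming the service is served locally on port 8000:

import requests

# Hypothetical local test: ask for the 3 nearest neighbors of blog 1
resp = requests.get("http://localhost:8000/recommend-blogs/1", params={"num_neighbors": 3})
print(resp.json())  # list of {"blog_id", "title", "content", "score"} objects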

  • The flow diagram below summarizes all the steps we have discussed so far:

Testing the Recommendation Service

  • Ideally, we would receive the blog ID from the UI Service and pass the recommendations back, but for illustration purposes, we’ll call the recommend-blogs API with some test inputs from my test dataset. The blogs in this sample dataset have concise titles and content, which is sufficient for testing, but real-world blogs will be much more detailed and carry a significant amount of data. The test dataset has around 1,000 blogs across categories like healthcare, tech, travel, entertainment, and so on.
  • A sample from the test dataset:


  • Test Result 1: Medical Research Blog

    Input Blog: Blog_Id: 1, Title: Breakthrough in Heart Disease Treatment, Content: Researchers have developed a new treatment for heart disease that promises to be more effective and less invasive. This breakthrough could save millions of lives every year.


  • Test Result 2: Travel Blog

    Input Blog: Blog_Id: 4, Title: Travel Tips for Sustainable Tourism, Content: How to travel responsibly and sustainably.

I manually tested multiple blogs from the test dataset of 1,000 blogs, representing distinct topics and content, and assessed the quality and relevance of the recommendations. The recommended blogs had scores in the range of 87% to 95%, and upon examination, the blogs often appeared very similar in content and style.

Based on the test results, it's evident that utilizing vector search enables us to effectively recommend blogs to users that are semantically similar. This approach ensures that the recommendations are contextually relevant, even when the blogs don't share identical keywords, enhancing the user's experience by connecting them with content that aligns more closely with their interests and search intent.

Limitations

This approach for finding similar blogs is good enough for our simple recommendation service, but it might have certain limitations in real-world applications.

  • Our similarity search returns the nearest k neighbors as recommendations, but there may be scenarios where no genuinely similar blog exists, or where the neighbors have significantly different scores. To deal with this, you can set a threshold to filter out recommendations below a certain score, as sketched after this list. Experiment with different threshold values and observe their impact on recommendation quality.
  • If your use case involves a small dataset and the relationships between user preferences and item features are straightforward and well-defined, traditional methods like content-based or collaborative filtering might be more efficient and effective than vector search.
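A minimal way to apply such a threshold on top of the find_similar_blog function above; the 0.90 cutoff is an arbitrary starting point to tune, not a recommendation:

MIN_SIMILARITY = 0.90  # assumed starting point; tune against your own data

def find_similar_blogs_with_threshold(current_blog_id: int, num_neighbors: int = 2):
    hits = find_similar_blog(current_blog_id, num_neighbors)
    # Scores were formatted as percentage strings, e.g. "92.35%"
    return [hit for hit in hits if float(hit['score'].rstrip('%')) / 100 >= MIN_SIMILARITY]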

Further Improvements

  • Using LLM for Content Validation: Implement a verification step using large language models (LLMs) to assess the relevance and validity of recommended content. This approach can ensure that the suggestions are not only similar in context but also meaningful and appropriate for your audience.
  • Metadata-based Embeddings: Instead of generating embeddings from the entire blog content, utilize LLMs to extract key metadata such as themes, intent, tone, or key points. Create embeddings based on this extracted metadata, which can lead to more efficient and targeted recommendations, focusing on the core essence of the content rather than its entirety. One possible shape of this idea is sketched below.
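As a rough illustration of the metadata idea, you could ask a chat model to distill a blog before embedding it. The prompt and model choice below are assumptions on my part, a sketch rather than a tested recipe:

def extract_metadata(blog_text: str) -> str:
    # Ask an LLM to distill themes, intent, tone, and key points
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Summarize the themes, intent, tone, "
                                          "and key points of this blog in a few sentences."},
            {"role": "user", "content": blog_text}
        ]
    )
    return completion.choices[0].message.content

# Embed the distilled metadata instead of the full content
metadata_embedding = create_embeddings(extract_metadata("<blog text>"))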

Conclusion

Our journey concludes here, but yours is just beginning. Armed with the knowledge of vector search, vector databases, and embeddings, you're now ready to build a recommendation service that doesn't just guide users to content but connects them to the stories, insights, and experiences they seek. It's not just about building a service; it's about enriching the digital exploration experience, one recommendation at a time.
