Oops! Something went wrong while submitting the form.
We use cookies to improve your browsing experience on our website, to show you personalised content and to analize our website traffic. By browsing our website, you consent to our use of cookies. Read privacy policy.
Searching has become a prominent feature of any web application, and a relevant search feature requires a robust search engine. The search engine should be capable of performing a full-text search, auto completion, providing suggestions, spelling corrections, fuzzy search, and analytics.
Elasticsearch, a distributed, fast, and scalable search and analytic engine, takes care of all these basic search requirements.
The focus of this post is using a few approaches with Elasticsearch in our Rails application to reduce time latency for web requests. Let’s review one of the best ways to improve the Elasticsearch indexing in Rails models by moving them to background jobs.
In a Rails application, Elasticsearch can be integrated with any of the following popular gems:
We can continue with any of these gems mentioned above. But for this post, we will be moving forward with the Searchkick gem, which is a much more Rails-friendly gem.
The default Searchkick gem option uses the object callbacks to sync the data in the respective Elasticsearch index. Being in the callbacks, it costs the request, which has the creation and updation of a resource to take additional time to process the web request.
The below image shows logs from a Rails application, captured for an update request of a user record. We have added a print statement before Elasticsearch tries to sync in the Rails model so that it helps identify from the logs where the indexing has started. These logs show that the last two queries were executed for indexing the data in the Elasticsearch index.
Since the Elasticsearch sync is happening while updating a user record, we can conclude that the user update request will take additional time to cover up the Elasticsearch sync.
Below is the request flow diagram:
From the request flow diagram, we can say that the end-user must wait for step 3 and 4 to be completed. Step 3 is to fetch the children object details from the database.
To tackle the problem, we can move the Elasticsearch indexing to the background jobs. Usually, for Rails apps in production, there are separate app servers, database servers, background job processing servers, and Elasticsearch servers (in this scenario).
This is how the request flow looks when we move Elasticsearch indexing:
Let’s get to coding!
For demo purposes, we will have a Rails app with models: `User` and `Blogpost`. The stack used here:
This approach does not require any specific version of Rails, Elasticsearch or Mysql. Moreover, this approach is database agnostic. You can go through the code from this Github repo for reference.
Let’s take a look at the user model with Elasticsearch index.
Now, we will need to define a callback that can sync the data to the Elasticsearch index. Typically, this callback must be written in all the models that have the Elasticsearch index. Instead, we can write a common concern and include it to required models.
In the above active support concern, we have called the Sidekiq worker named ElasticsearchWorker. After adding this concern, don’t forget to include the Elasticsearch indexer concern in the user model, like so:
That’s it, we’ve done it. Cool, huh? Now, whenever a user creates, updates, or deletes web request, a background job will be created. The background job can be seen in the Sidekiq web UI at localhost:3000/sidekiq
Now, there is little problem in the Elasticsearch indexer concern. To reproduce this, go to your user edit page, click save, and look at localhost:3000/sidekiq—a job will be queued.
We can handle this case by tracking the dirty attributes.
Furthermore, there are few more areas of improvement. Suppose you are trying to update the field of user model that is not part of the Elasticsearch index, the Elasticsearch worker Sidekiq job will still get created and reindex the associated model object. This can be modified to create the Elasticsearch indexing worker Sidekiq job only if the Elasticsearch index fields are updated.
Moving the Elasticsearch indexing to background jobs is a great way to boost the performance of the web app by reducing the response time of any web request. Implementing this approach for every model would not be ideal. I would recommend this approach only if the Elasticsearch index data are not needed in real-time.
Since the execution of background jobs depends on the number of jobs it must perform, it might take time to reflect the changes in the Elasticsearch index if there are lots of jobs queued up. To solve this problem to some extent, the Elasticsearch indexing jobs can be added in a queue with high priority. Also, make sure you have a different app server and background job processing server. This approach works best if the app server is different than the background job processing server.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Improving Elasticsearch Indexing in the Rails Model using Searchkick
Searching has become a prominent feature of any web application, and a relevant search feature requires a robust search engine. The search engine should be capable of performing a full-text search, auto completion, providing suggestions, spelling corrections, fuzzy search, and analytics.
Elasticsearch, a distributed, fast, and scalable search and analytic engine, takes care of all these basic search requirements.
The focus of this post is using a few approaches with Elasticsearch in our Rails application to reduce time latency for web requests. Let’s review one of the best ways to improve the Elasticsearch indexing in Rails models by moving them to background jobs.
In a Rails application, Elasticsearch can be integrated with any of the following popular gems:
We can continue with any of these gems mentioned above. But for this post, we will be moving forward with the Searchkick gem, which is a much more Rails-friendly gem.
The default Searchkick gem option uses the object callbacks to sync the data in the respective Elasticsearch index. Being in the callbacks, it costs the request, which has the creation and updation of a resource to take additional time to process the web request.
The below image shows logs from a Rails application, captured for an update request of a user record. We have added a print statement before Elasticsearch tries to sync in the Rails model so that it helps identify from the logs where the indexing has started. These logs show that the last two queries were executed for indexing the data in the Elasticsearch index.
Since the Elasticsearch sync is happening while updating a user record, we can conclude that the user update request will take additional time to cover up the Elasticsearch sync.
Below is the request flow diagram:
From the request flow diagram, we can say that the end-user must wait for step 3 and 4 to be completed. Step 3 is to fetch the children object details from the database.
To tackle the problem, we can move the Elasticsearch indexing to the background jobs. Usually, for Rails apps in production, there are separate app servers, database servers, background job processing servers, and Elasticsearch servers (in this scenario).
This is how the request flow looks when we move Elasticsearch indexing:
Let’s get to coding!
For demo purposes, we will have a Rails app with models: `User` and `Blogpost`. The stack used here:
This approach does not require any specific version of Rails, Elasticsearch or Mysql. Moreover, this approach is database agnostic. You can go through the code from this Github repo for reference.
Let’s take a look at the user model with Elasticsearch index.
Now, we will need to define a callback that can sync the data to the Elasticsearch index. Typically, this callback must be written in all the models that have the Elasticsearch index. Instead, we can write a common concern and include it to required models.
In the above active support concern, we have called the Sidekiq worker named ElasticsearchWorker. After adding this concern, don’t forget to include the Elasticsearch indexer concern in the user model, like so:
That’s it, we’ve done it. Cool, huh? Now, whenever a user creates, updates, or deletes web request, a background job will be created. The background job can be seen in the Sidekiq web UI at localhost:3000/sidekiq
Now, there is little problem in the Elasticsearch indexer concern. To reproduce this, go to your user edit page, click save, and look at localhost:3000/sidekiq—a job will be queued.
We can handle this case by tracking the dirty attributes.
Furthermore, there are few more areas of improvement. Suppose you are trying to update the field of user model that is not part of the Elasticsearch index, the Elasticsearch worker Sidekiq job will still get created and reindex the associated model object. This can be modified to create the Elasticsearch indexing worker Sidekiq job only if the Elasticsearch index fields are updated.
Moving the Elasticsearch indexing to background jobs is a great way to boost the performance of the web app by reducing the response time of any web request. Implementing this approach for every model would not be ideal. I would recommend this approach only if the Elasticsearch index data are not needed in real-time.
Since the execution of background jobs depends on the number of jobs it must perform, it might take time to reflect the changes in the Elasticsearch index if there are lots of jobs queued up. To solve this problem to some extent, the Elasticsearch indexing jobs can be added in a queue with high priority. Also, make sure you have a different app server and background job processing server. This approach works best if the app server is different than the background job processing server.
Velotio Technologies is an outsourced software product development partner for top technology startups and enterprises. We partner with companies to design, develop, and scale their products. Our work has been featured on TechCrunch, Product Hunt and more.
We have partnered with our customers to built 90+ transformational products in areas of edge computing, customer data platforms, exascale storage, cloud-native platforms, chatbots, clinical trials, healthcare and investment banking.
Since our founding in 2016, our team has completed more than 90 projects with 220+ employees across the following areas:
Building web/mobile applications
Architecting Cloud infrastructure and Data analytics platforms