
Swapping Elasticsearch for Meilisearch in Rails feat. Docker

A wise move for apps with simple search needs

Elasticsearch is a comprehensive and highly configurable search engine and storage system for a multitude of app concerns. In this article, we're only going to compare its search engine capabilities within the context of a Dockerized Ruby on Rails app. If your app needs specifically weighted attribute boosting, results that improve with machine learning, mature highly available sharding, or multi-index search, Elasticsearch is still what you want.

If your search needs fall somewhere between pg_search/ransack and Elasticsearch, Meilisearch is a new contender that is blazing fast (<50ms), much more resource efficient, and ships with a sensible default configuration, a first-party Ruby library and Rails gem, and an admin panel for trying out searches before fully integrating with your app. With full-text search, synonyms, typo tolerance, stop words, and customizable relevancy rules, Meilisearch has enough features to satisfy most applications, and that's before its v1.0 release. Multi-index search is also on the roadmap.
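To give you a taste before we dive in, tuning those features looks roughly like this with the first-party meilisearch-ruby client. This is just a sketch; the movies index and the synonym/stop-word values are illustrative, not something from the app we'll migrate below.

require 'meilisearch'

client = MeiliSearch::Client.new('http://127.0.0.1:7700')
index = client.index('movies')

# Illustrative values; each call maps to a feature mentioned above.
index.update_synonyms('film' => ['movie'], 'movie' => ['film'])
index.update_stop_words(['the', 'a', 'an'])
index.update_ranking_rules(['words', 'typo', 'proximity', 'attribute', 'sort', 'exactness'])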

Part Zero: But Why?

Why go through the pain of switching? Performance and resource efficiency!

First let's compare Elasticsearch and Meilisearch on the item you're probably here to learn about: resource usage. Memory on the cloud is expensive and Elasticsearch is a known memory hog. On my Rails app, which has fairly low usage, it's using 3.5GB. That's 2.7GB more than the next-highest container, which is Rails web workers running malloc instead of jemalloc (a topic for a different article!).

So how much more efficient is Meilisearch? Let's get a baseline with Elasticsearch first. We'll be using this movie database with ~32k rows.

I have to note here that Elasticsearch took a lot more time to set up. It initially refused to start because it needed more memory than the OS would allow it to map just to start up. That limit needed to be raised with sysctl -w vm.max_map_count=262144. Then the JSON file needed a fair amount of manipulation because the bulk JSON API expects you to specify the index for every row. This wasn't evident in the documentation and an ancient StackOverflow answer came to my rescue.

docker network create elastic
docker run --name es01 --net elastic -p 9200:9200 -p 9300:9300 -it docker.elastic.co/elasticsearch/elasticsearch:8.2.3
curl --location --request POST 'https://localhost:9200/movies/_bulk/' \
--header 'Content-Type: application/x-ndjson' \
--header 'Authorization: Basic ---' \
--data-binary '@movies.json'
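If you need to do the same massaging, here's a minimal Ruby sketch of the idea. It assumes movies.json is a flat JSON array of objects with an id field, which may not match your dump exactly.

require 'json'

movies = JSON.parse(File.read('movies.json'))

# The bulk API wants NDJSON: an action/metadata line before every document.
File.open('movies.bulk.json', 'w') do |f|
  movies.each do |movie|
    f.puts({ index: { _index: 'movies', _id: movie['id'] } }.to_json)
    f.puts movie.to_json
  end
end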

docker stats reports that Elasticsearch is using 5.2GB of memory. Adding the movies to the index did not increase this; it uses 5.2GB by default with no data. You can of course set ES_JAVA_OPTS and get that down. Even small apps, however, risk container evictions due to memory pressure when doing that. This was the main motivator for me to check out Meilisearch.

Now let's do the same thing with Meilisearch. It was quite a bit easier to set up and the bulk import was a breeze.

docker run --rm -p 7700:7700 -v "$(pwd)/meili_data:/meili_data" getmeili/meilisearch
curl -i -X POST 'http://127.0.0.1:7700/indexes/movies/documents' \
  --header 'content-type: application/json' \
  --data-binary @movies.json

After letting Meilisearch run for a few minutes, its memory usage actually halved, down to 96.7MB.

Now let's run a simple comparison benchmark. We'll run 100 iterations of q=batman&limit=10 for Meilisearch and q=batman&size=10 for Elasticsearch.
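The harness was nothing fancy. A rough Ruby sketch of the idea (illustrative, not the exact script), shown here against the local Meilisearch container from earlier:

require 'benchmark'
require 'net/http'

url = URI('http://127.0.0.1:7700/indexes/movies/search?q=batman&limit=10')

# Time 100 search requests; report the average and worst case in ms.
times = 100.times.map do
  Benchmark.realtime { Net::HTTP.get_response(url) } * 1000
end

puts "average: #{(times.sum / times.size).round(2)}ms, peak: #{times.max.round(2)}ms"

Swap in the Elasticsearch URL (with size=10 and its auth header) for the other half of the comparison.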

Elasticsearch: 9.68ms average, 15ms peak.
Meilisearch: 5.17ms average, 11ms peak.

Meilisearch used 54.8x less memory and was 46.6% faster than Elasticsearch with the same data and the same queries.

That's a lot faster and a lot easier to host.

The image is also 36MB instead of 1.2GB. Nice. Note that this is specifically a comparison of default configurations. What's more, Meilisearch has an interface at localhost:7700, so we don't even need to open Postman to poke around (sorry, no filtering or sorting on the admin interface at the moment).

Convinced? OK, read on and I'll show you what switching from Elasticsearch to Meilisearch looked like for a real production app: ScribeHub. We also moved from Ankane's excellent Searchkick gem to the first-party meilisearch-rails gem, and I'll show you the changes there as well.

Part One: DevOps

Begin by replacing your Elasticsearch container with a Meilisearch container in your docker-compose.yml:

meilisearch:
  image: getmeili/meilisearch:v0.27.0
  user: root
  ports:
    - "7700:7700"
  volumes:
    - "meili:/meili_data/"
  env_file:
    - .msenv
...
volumes:
  meili:

The first big difference is authentication. Meilisearch supports a direct front-end integration which doesn't even touch Rails (neat!). That means that if a master key is set, it will generate default keys with specific permissions on start-up. If you're just trying MS out locally, I recommend not setting the master key so that it will allow unauthenticated requests. If you intend to ship to production, I'd recommend setting the master key to ensure you understand how that works before you launch. We won't be going into front-end-only implementations in this article; we're just going to focus on the ES-to-MS migration.

Something that almost made me give up right at the beginning was that the MS service will roll the keys if there is any change to its environment file. I kept dropping the default admin key into a common .env file, which would roll the keys again, and I would get auth errors when trying to reindex. It's supposed to roll the keys only if there's a change to the master key, but rolling the keys on any change to the env file means you should have a separate env file for the MS service. I called it .msenv, as you can see above. I've seen it roll the keys even when there was no change to its own env file, but that was a result of not mounting to the /meili_data directory.

If you're setting a master key, run SecureRandom.hex 32 from a Rails console and drop that into MEILI_MASTER_KEY in your .msenv file. You can also set the host and turn off anonymous analytics while you're at it, which I personally think should default to disabled. Here's my example .msenv:

# WARNING
# Every time any change is made to this file, Meilisearch will regenerate keys.
# That will invalidate current keys and make you sad.
MEILISEARCH_HOST=http://meilisearch:7700
MEILI_MASTER_KEY=<YOUR MASTER KEY>
MEILI_NO_ANALYTICS=true

Run docker-compose up and you should see this in the MS start-up output:

A Master Key has been set. Requests to Meilisearch won't be authorized unless you provide an authentication key.

Now we'll need to fetch the default admin API key. Here's the curl request to fetch keys. I recommend saving the query in Postman or Insomnia so you don't have to keep looking it up in the future.

curl --location --request GET 'http://localhost:7700/keys' \
--header 'Authorization: Bearer <YOUR MASTER KEY>'
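If you'd rather stay in Ruby, the meilisearch-ruby client (which the Rails gem wraps) can do the same thing from a console. A sketch:

require 'meilisearch'

client = MeiliSearch::Client.new('http://localhost:7700', '<YOUR MASTER KEY>')
client.keys # find the key described as the default admin API key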

Drop the default admin API key into MEILISEARCH_API_KEY in your Rails .env file, and set MEILISEARCH_HOST to the same thing you set it to in .msenv so that's available on the Rails side as well. Time to write your Meilisearch initializer file! You can tune timeouts and retries while you're at it.

MeiliSearch::Rails.configuration = {
  meilisearch_host: ENV['MEILISEARCH_HOST'],
  meilisearch_api_key: ENV['MEILISEARCH_API_KEY'],
  timeout: 1,
  max_retries: 2
}

Restart everything to pick up the environment changes, and you should now have the permissions needed to reindex a model. But first we need a model to reindex.

Part Deux: Rails Integration

This is where my path and yours differ, but I'll provide an example model integration. Because ScribeHub has many searchable resources, I wrote a concern, schema_searchable.rb:

module SchemaSearchable
  extend ActiveSupport::Concern

  included do
    include MeiliSearch::Rails
    extend Pagy::Meilisearch
  end

  module ClassMethods
    def trigger_sidekiq_job(record, remove)
      MeilisearchEnqueueWorker.perform_async(record.class.name, record.id, remove)
    end
  end
end

This DRYed things up more when we were on Elasticsearch, but I'll take all the code reduction I can get. Now you can drop include SchemaSearchable into any searchable model. Here's an example of the additions to our GlossaryTerm model:

include SchemaSearchable
after_touch :index!

meilisearch enqueue: :trigger_sidekiq_job, per_environment: true, primary_id: :ms_id do
  attributes [:account_id, :id, :term, :definition, :updated]
  attribute :updated do
    updated_at.to_i
  end
  filterable_attributes [:account_id]
end

def ms_id
  "gt_#{account_id}_#{id}"
end

Note that Meilisearch does not have a data type for Ruby or Rails datetime objects, so we're converting updated_at to Unix epoch with to_i. after_touch :index! keeps your index up to date when the model changes. per_environment: true ensures you're not polluting your development indexes with test data. enqueue will run index updates in the background via the method defined in schema_searchable.rb, but we still need that worker. Here is meilisearch_enqueue_worker.rb:

class MeilisearchEnqueueWorker
  include Sidekiq::Worker

  def perform(klass, record_id, remove)
    if remove
      klass.constantize.index.delete_document(record_id)
    else
      klass.constantize.find(record_id).index!
    end
  end
end

If you're able to start a fresh Rails console and run Model.reindex! without error, then you're ready to edit your index action in the controller. Right now, using the pagy search method without creating an N+1 query means we need both pagy_meilisearch and pagy_search, like so:

def index
  @pagy, @glossary_terms = pagy_meilisearch(
    GlossaryTerm.includes(GlossaryTerm.search_includes).pagy_search(
      params[:q],
      **{
        filter: "account_id = #{current_account.id}"
      }
    )
  )
end

The search_includes method on GlossaryTerm is just a list of associations needed to avoid N+1 queries. I like keeping that in the model:

def self.search_includes
  %i(
    user
  )
end

Assembling the filter string can get tricky compared to Elasticsearch, due to it being a string instead of a hash, but it lets you assemble the logic with as many ANDs and ORs as your heart desires. For things like filtering by tags with AND logic, you'll need to do something like this:

filter = "discarded=false"if @conditions.key?(:tags)  @conditions[:tags].each do |tag|    filter += " AND tags='#{tag}'"  endend

In this case, @conditions is a hash which is populated by processing the query to extract things like tags and sort for ordering. The documentation has some helpful notes about combining logic.
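For example, if you wanted OR logic across tags instead of AND, the same assembly might look like this (a sketch, not what ships in ScribeHub):

filter = "discarded=false"
if @conditions.key?(:tags)
  # Match documents carrying any of the requested tags.
  any_tag = @conditions[:tags].map { |tag| "tags='#{tag}'" }.join(' OR ')
  filter += " AND (#{any_tag})" unless any_tag.empty?
end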

Fixing up the tests should be all that remains, and it's pretty much just changing index for index! and search_index.delete for clear_index!. It was very cool seeing the tests pass again after such minimal test fixing.
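In other words, test setup ends up along these lines (a sketch; your helper names will differ):

# Before, with Searchkick:
# GlossaryTerm.search_index.delete
# GlossaryTerm.reindex

# After, with meilisearch-rails:
GlossaryTerm.clear_index!
GlossaryTerm.reindex!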

Hope you enjoyed! We certainly did here at ScribeHub, and we eagerly await multi-index searching.

