The Elasticsearch sideshow and why Algolia is the better bet
Elastic and AWS continue to make headlines over their licensing dispute, but the bickering avoids an important question: Is Elasticsearch still the right tool for the job?
The spat between AWS and Elastic continues to raise existential questions in the open source community. But on a practical level, the debate may soon become moot as software developers move to simpler tools to meet their search needs.
AWS and Elastic have been at odds for years, with the dispute spilling into the public in 2019. The latest salvo began this year when Elastic changed its licensing to prevent AWS from using new releases of the previously open source Elasticsearch that underpins Amazon Elasticsearch Service.
Elastic CEO Shay Banon said the company did this because Amazon violated the "norms and values" of the open source ecosystem. However, the net result is that AWS will now maintain open source Elasticsearch, and Elastic -- the originator of the project -- will build a version of Elasticsearch under a more restrictive license.
Plenty of industry observers have criticized Amazon for hurting open source software companies over the years. But Elastic is ultimately a venture-backed company that made some bets on its business model, and Banon recently sold nearly $60 million in company stock, so it's hard to see this as a human-interest story.
I've used Elasticsearch and its predecessors and successors for years, and I can tell you that all the sound and fury of Elastic vs. AWS signifies nothing. Both companies are fighting for enterprise dollars here, and enterprises are always behind the curve with their tech investments.
In truth, the future looks bleak for both Elastic and Amazon Elasticsearch Service. Software development today is about highly usable, effective and scalable managed services. Unfortunately, Elasticsearch is not the best way to achieve any of those goals, which is why the better choice today is another vendor altogether.
Algolia, an Elasticsearch competitor, is poised to be the real winner of this tiff. For the purposes of this article, let's set the licensing bickering aside, and examine why a different approach to search will ultimately leave Elasticsearch behind.
AWS and the move to managed services
When AWS introduced its cloud platform, it changed software development forever. Business leaders and software developers could deploy software quicker and avoid the painfully slow process of procuring, provisioning and deploying hardware.
AWS pioneered the delivery of computer infrastructure via API, and nearly every company on the planet now understands the advantage of deploying software on a cloud. But infrastructure as a service, or IaaS, was just the first step on the path of building faster, better and cheaper software.
Many pundits envisioned a framework of SaaS on top of PaaS on top of IaaS, where perhaps a web application would run on Heroku, running on AWS. But what's played out in reality has been much vaster and more complex.
Today, software developers might use IaaS from a public cloud. But then they also use services like Twilio to handle text messaging and telephony, and Stripe to handle everything involving billing, including storing payment card information.
AWS laid the groundwork for a model where organizations offload certain tasks to fully managed cloud services and focus instead on building only the things that are specific to their businesses.
Elasticsearch fills a void
About 20 years ago, it became clear that there was a need to have separate systems to handle search. That's because relational databases, which were the primary transactional databases in software systems at the time, were slow and produced poor results.
Relational databases aren't built for searching gigabytes of text data and returning quick results, or for doing fuzzy searches. There were a handful of expensive commercial products in the market, but Doug Cutting built a great open source search engine called Lucene that brought fast text search to everyone.
However, Lucene -- and its derivative Solr -- didn't easily integrate into a web application. After a brief rise, Lucene eventually fell behind Elasticsearch, which became the preferred search tool to integrate into web applications in the early 2010s.
Elasticsearch has continued to improve since then, and it's significantly better today than it or its competitors were five-plus years ago. But it remains a product with a 2010 sensibility.
The limitations of Elasticsearch
Elasticsearch is built for systems architects who like to configure and run servers -- it has an incredible amount of configuration available, and the defaults are almost certainly not what you want. If you don't configure it to properly handle inputs, it won't index all your records.
Also, real data has a way of introducing odd characters, such as vertical tabs or emojis, and Elasticsearch is designed to reject input it doesn't recognize. Elasticsearch was also originally built without much consideration for security, similar to most open source software written 10 or more years ago.
Even though it has improved in the security dimension, almost no practitioners will run an Elasticsearch server on the open internet. For example, you can't allow public/anonymous queries without giving access to essentially everything in an index. If you want to impose restrictions, such as rate limits or caps on the number of results returned, you have to run a custom-coded proxy server to take requests and enforce security rules.
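To make that concrete, here is a minimal sketch of the rules such a proxy might enforce before forwarding a request to the cluster. It is an illustration only: the cap of 20 results, the allowlist of body keys, the rate settings and the names `TokenBucket`, `sanitize_search_body` and `handle_request` are all assumptions, and the actual forwarding to Elasticsearch is left out.

```python
import time
from dataclasses import dataclass, field

MAX_RESULTS = 20                          # hypothetical cap on hits per request
ALLOWED_KEYS = {"query", "size", "from"}  # hypothetical allowlist of top-level body keys


@dataclass
class TokenBucket:
    """Per-client limiter: refills `rate` tokens per second, up to `capacity`."""
    rate: float
    capacity: float
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self):
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def sanitize_search_body(body):
    """Drop keys outside the allowlist and clamp the requested page size."""
    clean = {k: v for k, v in body.items() if k in ALLOWED_KEYS}
    requested = clean.get("size", MAX_RESULTS)
    if isinstance(requested, bool) or not isinstance(requested, int) or requested < 0:
        requested = MAX_RESULTS
    clean["size"] = min(requested, MAX_RESULTS)
    return clean


_buckets = {}  # client id -> TokenBucket


def handle_request(client_id, body):
    """Return (status, body_to_forward); forwarding itself is omitted here."""
    if client_id not in _buckets:
        _buckets[client_id] = TokenBucket(rate=1.0, capacity=5)
    if not _buckets[client_id].allow():
        return 429, None
    return 200, sanitize_search_body(body)
```

Even this toy version has to be written, deployed, monitored and kept in step with the cluster behind it, and that is work the search engine itself doesn't do for you.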
There's also the query language, which is complex and requires a fair amount of tinkering and specification to nail down for a given use case. Elasticsearch does give you the ability to search everything, but the results are rarely the most desirable outcomes. The architects, systems administrators and developers must tweak, probe and experiment to get things working properly.
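For a sense of what that tinkering looks like, the sketch below builds a request body for a fairly ordinary case: a typo-tolerant text search across two fields, restricted to published records. The constructs (`bool`, `multi_match`, `fuzziness`, `term`) are standard parts of the Elasticsearch query DSL, but the field names `title`, `body` and `status`, the boost on `title` and the helper name `build_search_body` are assumptions made for illustration.

```python
import json


def build_search_body(text, status="published", size=10):
    """Build a search request body as a plain dict (hypothetical index fields)."""
    return {
        "query": {
            "bool": {
                # Scored part: match the text in either field, weighting title 3x.
                "must": [{
                    "multi_match": {
                        "query": text,
                        "fields": ["title^3", "body"],
                        "fuzziness": "AUTO",  # tolerate small typos
                    }
                }],
                # Unscored part: exact match, assumes `status` is a keyword field.
                "filter": [{"term": {"status": status}}],
            }
        },
        "size": size,
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("wireless headphnes"), indent=2))
```

Each of those choices, such as which fields to search, how much to boost them, how fuzzy to be and what belongs in a filter rather than the scored query, is something a team typically has to settle by experiment.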
Managed Elasticsearch and the problems it doesn't fix
Seeing a need to address all these configuration requirements, AWS and Elastic -- with Elastic Cloud -- each came out with hosted Elasticsearch offerings. But these services don't make things much better.
They do handle installation and upgrades, though you still need to step in to handle failures and scaling. And Elastic Cloud provides a bit more than Amazon Elasticsearch Service. It has a better management console -- although in both cases, it's essentially a GUI on top of the configuration files you'd edit in a text editor otherwise -- as well as better monitoring and more useful authentication options.
In short, the "management" that both AWS and Elastic provide does little to eliminate the pain points of running Elasticsearch -- the finicky indexing, the complex configuration, the learning curve of the query language and even the administrative burden.
This is where Algolia comes in. The managed service indexes anything you throw at it with very little configuration. You can choose which fields are searched and you can filter results by some other fields, but it's built around the same concept as Google, which is: Just type what you want into a single field.
Algolia automatically scales up and down, and you only pay for data stored and queries made. It also has security built-in, so your front-end applications can connect directly to it and still maintain strong security rules, e.g., limitations on what data comes back, how much data and rate limits per client.
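One general way a service can let browsers talk to it directly while still enforcing per-client limits is to hand each client a key with the restrictions signed into it. The sketch below shows that idea using only the Python standard library. It is not Algolia's actual key format or API: the function names, the restriction fields (`filters`, `max_hits`, `valid_until`) and the token layout are all hypothetical.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode


def issue_restricted_key(parent_key, restrictions):
    """Server side: sign the restrictions with a secret the browser never sees."""
    params = urlencode(sorted(restrictions.items()))
    sig = hmac.new(parent_key.encode(), params.encode(), hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode((sig + params).encode()).decode()


def verify_restricted_key(parent_key, token, now=None):
    """Service side: return the restrictions if the token is genuine and unexpired."""
    try:
        raw = base64.urlsafe_b64decode(token.encode()).decode()
    except ValueError:
        return None
    sig, params = raw[:64], raw[64:]  # a SHA-256 hex digest is 64 characters
    expected = hmac.new(parent_key.encode(), params.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # restrictions were altered or signed with another key
    restrictions = {k: v[0] for k, v in parse_qs(params).items()}
    now = time.time() if now is None else now
    if float(restrictions.get("valid_until", "inf")) < now:
        return None  # key has expired
    return restrictions
```

Because the restrictions are covered by the signature, a client can't widen its own access by editing the token, so no proxy is needed in between to enforce them.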
Algolia has been around as long as Elastic.co -- originally Elastic NV -- and has raised more money than Elastic. Algolia powers search for a large number of fast-growing startups with strong engineering teams, like Stripe and Slack.
And while many teams will pay more for Algolia than they would to run Elasticsearch's cloud infrastructure, it will be dramatically less than the total cost of ownership of Elasticsearch -- even the managed versions. That's because with Elasticsearch, you have to take into account all of the additional expertise you must have, including the DevOps skills needed to keep it running.
Algolia vs. Elasticsearch and what comes next
To evaluate the future of cloud-based search, look at what all three vendors are doing today.
On a strategic level, Elastic is trying to stop the bleeding as AWS wins the managed Elasticsearch market with a product that's not quite as good but noticeably cheaper. AWS is trying to keep that gravy train going with an admittedly great PR move of taking over open source development of Elasticsearch.
As for actual feature development, Elastic's release notes show a continued focus on query syntax and the complexities of how Elastic stores and searches documents.
To get a sense of where Algolia is headed, look at its recent acquisition of MorphL. The AI company's software will be used to personalize and improve search performance, much in the same way Google analyzes your data and the search results that are getting the most clicks.
And Algolia isn't alone in trying to deliver simplified search capabilities. For example, Microsoft is also using machine learning to build its Azure Cognitive Search offering, while MeiliSearch is an emerging open source search engine.
The most effective software developers today embrace managed services because they reduce the reliance on skilled and available DevOps staff in-house. They also reduce the amount of knowledge needed to build and deploy great software. When it comes to search, they're going to increasingly pick Algolia and tools like it over finicky Elasticsearch.
When we look back at this fight between Elastic and AWS, we'll wonder why we even cared about something that was going the way of the mainframe.