What are some use cases for using Elasticsearch versus standard SQL queries?
To add to the other answer: logging is still a major use case, as is search, but metrics and analytics are becoming more important.
I believe this post summarizes the changes in the market that are driving new use cases for Big Data: All you really need to know about Open Source Databases
With the advent of Web 2.0, static web pages have become dynamic and social media is all around us. Everyone is tweeting, posting, blogging, vlogging, sharing photos, chatting and commenting. The Internet of Things (IoT) is emerging — a rapidly growing network of connected devices that collect and exchange data, such as sensors and smart devices. There are some great examples here.
Altogether, this generates huge amounts of new data that businesses want to absorb and use to stay ahead, to provide features such as product recommendations and a better customer experience. The data can be analyzed in search of patterns for applications such as fraud detection and behavior analytics. Much of the new data is unstructured, which means that it can’t be neatly stored in a tabular database.
Imagine trying to design a database to hold data on your grocery shopping — what you like, how often you buy it, whether you prefer milk or cream with your coffee. New types of databases are needed to store the new data, and they need to be non-relational and ideally low cost. Ring any bells? Not relational as in NoSQL and low cost as in open source.
One of the Elasticsearch architects I spoke with said that 80% of the data Elasticsearch works with in companies is unstructured, while 20% is structured. It's the unstructured data that companies are looking at to discover rare or unusual data patterns. They are also using Elasticsearch to monitor data patterns. For example, a major retailer does real-time tracking with Elasticsearch to ensure its stores have enough cash on hand for people to cash checks on paydays.
In my own experience with our search use case, we started with fuzzy searches, and that evolved into auto-complete and quick searches. From what I've seen, once you start working with Elasticsearch, you grow into other use cases that complement what you already have in place. Now that we have established Elasticsearch as a fuzzy search engine at our company, other teams are looking into analytics and metrics for logging.
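To make the fuzzy search and auto-complete cases a bit more concrete, here is a minimal sketch of what the request bodies might look like. It only builds the JSON you would send to the `_search` endpoint and does not talk to a cluster. The field name `product_name` and the sample search terms are made up for illustration. `match` with `fuzziness` and `match_phrase_prefix` are standard Query DSL queries, but this is not necessarily how our own setup is built.

```python
import json


def fuzzy_match_query(field, text, fuzziness="AUTO"):
    """Build a match query that tolerates typos via edit-distance fuzziness."""
    return {
        "query": {
            "match": {
                field: {"query": text, "fuzziness": fuzziness}
            }
        }
    }


def autocomplete_query(field, prefix, max_expansions=10):
    """Build a match_phrase_prefix query; the last term is treated as a prefix."""
    return {
        "query": {
            "match_phrase_prefix": {
                field: {"query": prefix, "max_expansions": max_expansions}
            }
        }
    }


if __name__ == "__main__":
    # "product_name" is a hypothetical field; the first term contains a typo on purpose.
    print(json.dumps(fuzzy_match_query("product_name", "elastcsearch"), indent=2))
    print(json.dumps(autocomplete_query("product_name", "elastic se"), indent=2))
```

In plain SQL the closest equivalent is usually a `LIKE '%...%'` filter, which does not tolerate typos and does not rank results by relevance. That gap is a large part of why search tends to be the first use case.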
Here are some additional resources that go more in-depth on this topic:
- Elasticsearch Use Cases, Stories from Users
- Uses of Elasticsearch
- Elasticsearch Analytics Use Cases
- Elasticsearch Use Cases for Document Storage
- Graphs with Elasticsearch
- Forensic analysis: Panama Papers and the Wisdom of Crowds
- Introduction to Machine Learning with Elasticsearch
There are two primary Elasticsearch use cases:
- Text search
You want Elasticsearch when you're doing a lot of text search, where a traditional RDBMS does not perform well (poor configuration, acts as a black box, poor performance). Elasticsearch is highly customizable and extendable through plugins. You can build a robust search quite fast without much prior knowledge.
- Logging and analysis
Another edge case is that a lot of people use Elasticsearch to store logs from various sources (to centralize them), so they can analyze them and make sense of them. In this case, Kibana comes in handy. It lets you connect to an Elasticsearch cluster and create visualisations straight away. For instance, Loggly is built using Elasticsearch and Kibana.
Keep in mind that you wouldn't want to use Elasticsearch as your primary data store. The reasons are covered here: How reliable is ElasticSearch as a primary datastore against factors like write loss, data availability
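As a rough illustration of the log analysis case, the sketch below builds a search body that counts error logs per hour and breaks each hour down by service. It is roughly the kind of aggregation a Kibana visualisation runs for you. The field names `level`, `@timestamp` and `service.keyword` are assumptions about how the logs are mapped, and `fixed_interval` assumes Elasticsearch 7.2 or later (older versions use `interval`).

```python
import json


def errors_per_hour_by_service(level="error", since="now-24h", top_services=5):
    """Build a search body: filter logs by level and time, then bucket by hour and service."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the matching documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"level": level}},  # assumes "level" is a keyword field
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {
            "per_hour": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": "1h"},
                "aggs": {
                    "by_service": {
                        "terms": {"field": "service.keyword", "size": top_services}
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    # The printed JSON could be sent as the body of a _search request on a log index.
    print(json.dumps(errors_per_hour_by_service(), indent=2))
```

The SQL counterpart would be a `GROUP BY` over an hour bucket and a service column. That works, but on large, fast-growing log tables it is usually where an RDBMS starts to struggle.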
Update
I feel the second part is no longer an edge case; it's actually what Elastic as a company has been doing really well over the past year. With the current DevOps movement, CI/CD pipelines, and the increasing amount of metrics from various sources, ELK has become a de facto choice for infrastructure monitoring. It's no longer just a distributed RESTful text-search engine. It has an amazing set of products:
- Logstash (tons of data inputs)
- Beats
  - Filebeat
  - Metricbeat
  - Packetbeat
  - Winlogbeat
- Kibana
  - Graph
  - Timelion
- X-Pack (premium)
  - Alerts
  - Reporting
  - Security
  - Machine Learning
  - Cross data center metrics
An ecosystem built by the community is growing around the ELK stack and expands its current features. A few projects worth mentioning:
- ElastAlert
- Search Guard