What do Netflix, eBay and Walmart have in common? They all use Elasticsearch.
What is Elasticsearch?
Elasticsearch is a real-time, open-source, distributed search and analytics engine developed in Java and built on top of Apache Lucene™, a full-text search-engine library. It started as a scalable version of the Lucene open-source search framework. It uses a structure based on documents instead of tables and schemas, and comes with extensive REST APIs for storing and searching the data.
Elasticsearch is much more than just full-text search. It can be better described as a distributed real-time document store where every field is indexed and searchable. It is a distributed search engine with real-time analytics that is capable of scaling to hundreds of servers and petabytes of structured and unstructured data. And it packages up all this functionality into a standalone server that your application can talk to via a simple RESTful API, using a web client or from the command line.
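As a rough illustration of what talking to that RESTful API can look like, the sketch below builds (but does not send) an HTTP request that would store one JSON document. The address `http://localhost:9200`, the index name `products`, the document ID and the helper `build_index_request` are assumptions made for this example, not details from this article.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build a PUT request that would store `document` at /<index>/_doc/<doc_id>."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local node and sample document, for illustration only.
request = build_index_request(
    "http://localhost:9200",
    "products",
    "1",
    {"name": "espresso machine", "price": 249.0},
)
print(request.get_method(), request.full_url)
```

Actually sending the request, for example with `urllib.request.urlopen(request)`, needs a running cluster, and a secured cluster will also expect credentials.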
How does Elasticsearch work?
Data is more than just random bits and bytes. It is the relationships between data elements that allow us to represent entities (things) that exist in the real world.
Data structures as objects
An object is a language-specific, in-memory data structure. One of the reasons why object-oriented programming languages are so popular is that objects help us represent and manipulate real-world entities with potentially complex data structures.
JSON formatted
To send objects across the network or store them, we need to be able to represent them in some standard format. JSON is a way of representing objects in human-readable text. It has become the de facto standard for exchanging data in the NoSQL world.
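The step from an in-memory object to JSON text can be sketched in a few lines. The `User` class and its fields are hypothetical, chosen only to show the round trip.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class User:
    # A hypothetical in-memory object; the field names are illustrative.
    name: str
    age: int
    interests: list = field(default_factory=list)


user = User(name="John Smith", age=42, interests=["search", "analytics"])

# Serialize the object to JSON text that can be stored or sent over the network.
user_json = json.dumps(asdict(user))
print(user_json)

# Deserialize the text back into an equivalent object.
restored = User(**json.loads(user_json))
```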
An index for every field
Data is sent to Elasticsearch in the form of JSON documents, using the API or ingestion tools such as Logstash. In Elasticsearch, all data in every field is indexed by default. That is, every field has a dedicated inverted index for fast retrieval. And, unlike most other databases, it can use all of those inverted indices in the same query to return results at breathtaking speed. Documents can then be retrieved through the Elasticsearch API.
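To make the idea of one inverted index per field more concrete, here is a toy model in plain Python. It is a simplified sketch, not how Elasticsearch or Lucene is implemented: the analysis step is just lowercasing and splitting on whitespace, and the function names and sample documents are invented for the example.

```python
from collections import defaultdict


def build_inverted_indices(documents):
    """Map field -> term -> set of IDs of documents containing that term in that field."""
    indices = defaultdict(lambda: defaultdict(set))
    for doc_id, doc in documents.items():
        for field_name, value in doc.items():
            # Naive analysis: lowercase the value and split it on whitespace.
            for term in str(value).lower().split():
                indices[field_name][term].add(doc_id)
    return indices


def search(indices, **criteria):
    """Return the IDs of documents that match every field=term criterion."""
    matches = [indices[f].get(t.lower(), set()) for f, t in criteria.items()]
    return set.intersection(*matches) if matches else set()


docs = {
    1: {"title": "Quick brown fox", "tag": "animals"},
    2: {"title": "Quick start guide", "tag": "docs"},
    3: {"title": "Lazy brown dog", "tag": "animals"},
}
indices = build_inverted_indices(docs)

# One query consults the "title" index and the "tag" index together.
result = search(indices, title="brown", tag="animals")
print(result)
```

Each lookup is a dictionary access rather than a scan over all documents, which is the property that makes combining several field indices in one query cheap.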
Enterprise use cases of Elasticsearch
Search and analytics are key features of modern software applications. Scalability and the capability to handle large volumes of data in near real time are a must for many applications, such as mobile apps, web applications, and data analytics applications.
Enterprises are using Elasticsearch as a search platform for accessing, retrieving, and reporting on data, for logging and log analytics, for analyzing infrastructure metrics, and for security and business analytics.
Scaling real-time analytics
The analytical use case is one of the most popular Elasticsearch enterprise use cases. Elasticsearch is often used for log analytics and for slicing and dicing numerical data such as application and infrastructure performance metrics. It is also used to monitor and analyze customer service operations and security logs. Enterprises are using Elasticsearch to reveal the hidden potential of their customer data: to gain insights into customer purchasing patterns, track store performance metrics, and run holiday analytics, all in near real time.
Performant data queries
Enterprises are also using Elasticsearch successfully in their company intranets to provide full-text search with highlighted search snippets, as well as search-as-you-type and did-you-mean suggestions. One of the leading online communities for developers uses Elasticsearch for full-text search with geolocation queries to find related questions and answers. A leading English newspaper uses Elasticsearch to combine visitor logs with social-network data to provide real-time feedback to its editors about the public’s response to new articles. A well-known provider of Internet hosting for software development and version control uses Elasticsearch to query 130 billion lines of code.
But Elasticsearch is not just for mega-corporations. It has enabled many startups to prototype ideas and to turn them into scalable solutions. Elasticsearch can run on your laptop, or scale out to hundreds of servers and petabytes of data.
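For a feel of what the full-text search with highlighted snippets mentioned above involves, the sketch below builds a search request body using a `match` query and a `highlight` section. The field name `content`, the search text and the helper `build_search_body` are assumptions for illustration; such a body would typically be sent to an index's `_search` endpoint.

```python
import json


def build_search_body(field_name, text):
    """Build a request body for a full-text match query with highlighted snippets."""
    return {
        "query": {"match": {field_name: text}},
        "highlight": {"fields": {field_name: {}}},
    }


# Hypothetical field name and search text.
search_body = build_search_body("content", "vacation policy")
print(json.dumps(search_body, indent=2))
```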
Why managed Elasticsearch?
Distributed systems are complex. Scaling and managing Elasticsearch production workloads can be difficult and often requires expertise in Elasticsearch setup and configuration. Offload the complexity of managing the day-to-day operations of Elasticsearch clusters and gain the freedom to focus on driving the business value of your Elasticsearch use case implementation.
Feature-rich, managed Elasticsearch at any scale from Canonical
Whether you’re deploying your first Elasticsearch cluster or scaling up your existing deployment, Canonical’s SLA-based Elasticsearch managed app service makes it easy to deploy, secure, scale and manage your open source Elasticsearch cluster. Canonical engineers will ensure smooth Elasticsearch operations and you can benefit from optimisation of your Elasticsearch clusters, proactive maintenance and production support.
Get in touch for Elasticsearch deployment assessment
Photos by Markus Winkler and Michael Walter on Unsplash