⚡ Real-World Impact: Elasticsearch can search through millions of documents in milliseconds. Companies like Netflix, Uber, and LinkedIn use it to power search across massive datasets. You can implement basic search functionality in under an hour.
What Is Elasticsearch and Why Should You Care?
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. Think of it as a specialized database optimized for one thing: finding stuff blazingly fast.
Here's what makes it special:
- Full-text search: Find documents by content, not just exact matches
- Near real-time: Documents become searchable about one second after indexing by default
- RESTful API: Everything happens via HTTP requests
- Scalable: Start with one server, scale to hundreds
- Flexible: Works with structured and unstructured data
💡 Use Cases: Product catalogs, log analysis, autocomplete suggestions, user search, document management, monitoring dashboards, geospatial queries, and real-time analytics.
Core Concepts You Need to Know
Document
The basic unit of data. Think of it as a JSON object that represents a single record (a product, user, log entry, etc.).
Index
A collection of documents with similar characteristics. Similar to a database table, but more flexible.
Mapping
Defines how documents and their fields are stored and indexed. Like a schema, but dynamic: fields you don't declare are mapped automatically from the first value indexed.
Query DSL
Domain Specific Language for building complex search queries using JSON.
Shard
A subset of an index. Elasticsearch divides indexes into shards for distribution and parallelization.
Getting Started: Your First Index
Let's build a simple product search. First, create an index and add some documents:
Create an Index
PUT /products
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
},
"mappings": {
"properties": {
"name": { "type": "text" },
"description": { "type": "text" },
"price": { "type": "float" },
"category": { "type": "keyword" },
"in_stock": { "type": "boolean" },
"created_at": { "type": "date" }
}
}
}
Add Documents
POST /products/_doc
{
"name": "Wireless Bluetooth Headphones",
"description": "High-quality noise cancelling headphones with 30hr battery",
"price": 79.99,
"category": "Electronics",
"in_stock": true,
"created_at": "2024-01-15"
}
POST /products/_doc
{
"name": "Gaming Mouse RGB",
"description": "Ergonomic gaming mouse with customizable RGB lighting",
"price": 49.99,
"category": "Electronics",
"in_stock": true,
"created_at": "2024-01-20"
}
💡 Field Types Matter: text is analyzed for full-text search, while keyword is for exact matching and aggregations. Choose wisely!
Basic Search Queries
1. Match All (Get Everything)
GET /products/_search
{
"query": {
"match_all": {}
}
}
2. Full-Text Search
GET /products/_search
{
"query": {
"match": {
"description": "gaming mouse"
}
}
}
This finds documents where "gaming" OR "mouse" appear in the description. Elasticsearch ranks results by relevance score.
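If OR matching returns too many loosely related results, the match query also accepts an operator option. Here is a minimal sketch in JavaScript that builds such a request body. The helper name buildMatchQuery is made up for illustration; operator itself is a standard match option.

```javascript
// Hypothetical helper: builds a match query body for a single field.
// `operator` is "or" (the default) or "and" (every term must be present).
function buildMatchQuery(field, text, operator = 'or') {
  return {
    query: {
      match: {
        [field]: { query: text, operator }
      }
    }
  };
}

// Body that only matches descriptions containing both "gaming" and "mouse".
const strictBody = buildMatchQuery('description', 'gaming mouse', 'and');
console.log(JSON.stringify(strictBody, null, 2));
```

The resulting object can be sent as the body of GET /products/_search, just like the JSON examples above.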
3. Multi-Field Search
GET /products/_search
{
"query": {
"multi_match": {
"query": "bluetooth headphones",
"fields": ["name^2", "description"]
}
}
}
The ^2 boosts the name field's importance by 2x. Matches in name will rank higher.
4. Exact Match (Term Query)
GET /products/_search
{
"query": {
"term": {
"category": "Electronics"
}
}
}
⚠️ Important: A term query does not analyze its input, so use it on keyword, numeric, boolean, or date fields. For text fields, use match instead.
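To see why term and text fields don't mix, it helps to look at which terms end up in the index. The sketch below is a toy model in JavaScript, not Elasticsearch's real analysis chain. It assumes the default analyzer roughly lowercases a value and splits it into words, and all three function names are invented for this example.

```javascript
// Rough approximation of the default analyzer for text fields:
// lowercase the value and split it into word tokens. Illustration only.
function approximateStandardAnalyzer(value) {
  return value.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// A keyword field stores the whole value as one unmodified term.
function keywordTerms(value) {
  return [value];
}

// A term query looks up its input verbatim among the indexed terms.
function termQueryMatches(indexedTerms, searchValue) {
  return indexedTerms.includes(searchValue);
}

const nameTerms = approximateStandardAnalyzer('Wireless Bluetooth Headphones');
console.log(nameTerms);
console.log(termQueryMatches(nameTerms, 'Wireless')); // capitalized input vs lowercased terms
console.log(termQueryMatches(nameTerms, 'wireless'));
console.log(termQueryMatches(keywordTerms('Electronics'), 'Electronics'));
```

In this model the capitalized "Wireless" finds nothing in the text field because only the lowercased token was indexed, while the keyword field matches its exact original value.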
Advanced Queries You'll Actually Use
Bool Query (Combining Conditions)
GET /products/_search
{
"query": {
"bool": {
"must": [
{ "match": { "description": "wireless" } }
],
"filter": [
{ "term": { "in_stock": true } },
{ "range": { "price": { "lte": 100 } } }
],
"should": [
{ "match": { "category": "Electronics" } }
],
"must_not": [
{ "match": { "name": "refurbished" } }
]
}
}
}
What's happening:
- must: Documents MUST match (affects score)
- filter: Documents MUST match (doesn't affect score, faster)
- should: Documents SHOULD match (boosts score if they do)
- must_not: Documents must NOT match
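In application code, this usually means deciding per condition whether it should influence ranking. Here is a small sketch of that split, assuming the products index from earlier. The buildProductQuery helper and its option names are hypothetical.

```javascript
// Hypothetical helper: scoring clauses go in `must`,
// yes/no conditions go in `filter`.
function buildProductQuery({ text, maxPrice, inStockOnly = false } = {}) {
  const filter = [];
  if (inStockOnly) {
    filter.push({ term: { in_stock: true } });
  }
  if (typeof maxPrice === 'number') {
    filter.push({ range: { price: { lte: maxPrice } } });
  }
  return {
    query: {
      bool: {
        // Free text affects relevance, so it belongs in `must`.
        must: text ? [{ match: { description: text } }] : [{ match_all: {} }],
        filter
      }
    }
  };
}

console.log(JSON.stringify(
  buildProductQuery({ text: 'wireless', maxPrice: 100, inStockOnly: true }),
  null,
  2
));
```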
Range Queries
GET /products/_search
{
"query": {
"range": {
"price": {
"gte": 20,
"lte": 80
}
}
}
}
Range operators: gt (greater than), gte (greater than or equal), lt, lte. Works with numbers, dates, and strings.
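For date fields, range values can also use date math expressions such as now-30d/d (30 days ago, rounded down to the start of that day). A minimal sketch against the created_at field from the mapping above; the helper name createdWithinDays is an assumption for illustration.

```javascript
// Hypothetical helper: range query for documents created in the last `days` days.
function createdWithinDays(days) {
  if (!Number.isInteger(days) || days < 0) {
    throw new RangeError('days must be a non-negative integer');
  }
  return {
    query: {
      range: {
        // Date math: "now" minus N days, rounded down to the day.
        created_at: { gte: `now-${days}d/d` }
      }
    }
  };
}

console.log(JSON.stringify(createdWithinDays(30)));
```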
Fuzzy Search (Typo Tolerance)
GET /products/_search
{
"query": {
"fuzzy": {
"name": {
"value": "headphonez",
"fuzziness": "AUTO"
}
}
}
}
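This matches "headphones" even though the search term is misspelled. With AUTO, the allowed edit distance depends on term length: 0 edits for 1-2 characters, 1 edit for 3-5 characters, and 2 edits for longer terms. Note that fuzzy is a term-level query, so its value is not analyzed; pass lowercase input when searching text fields.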
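To make the AUTO rule concrete, here is a small self-contained sketch. autoFuzziness mirrors the default length thresholds, and editDistance is a plain Levenshtein implementation. Both names are invented for this example, and Elasticsearch's own matching additionally treats a swap of two adjacent characters as a single edit by default.

```javascript
// Maximum number of edits fuzziness "AUTO" allows, using the default thresholds.
function autoFuzziness(term) {
  if (term.length <= 2) return 0; // 1-2 characters: must match exactly
  if (term.length <= 5) return 1; // 3-5 characters: one edit allowed
  return 2;                       // 6+ characters: two edits allowed
}

// Plain Levenshtein distance: insertions, deletions, and substitutions.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

const typed = 'headphonez';
const indexed = 'headphones';
console.log(editDistance(typed, indexed) <= autoFuzziness(typed));
```

Here the misspelled term is one substitution away from the indexed term, which is within the two edits allowed for a ten-character input.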
Aggregations: Analytics on Steroids
Aggregations let you calculate metrics and group data. Think SQL GROUP BY, but more powerful.
Count Products by Category
GET /products/_search
{
"size": 0,
"aggs": {
"categories": {
"terms": {
"field": "category",
"size": 10
}
}
}
}
Calculate Average Price
GET /products/_search
{
"size": 0,
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
},
"price_stats": {
"stats": {
"field": "price"
}
}
}
}
The stats aggregation gives you count, min, max, avg, and sum in one query.
Nested Aggregations
GET /products/_search
{
"size": 0,
"aggs": {
"categories": {
"terms": { "field": "category" },
"aggs": {
"avg_price_per_category": {
"avg": { "field": "price" }
}
}
}
}
}
This groups by category and calculates average price for each category. Perfect for building dashboards!
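On the application side, the buckets come back under the aggregation names used in the request. Here is a sketch of flattening them into dashboard rows. The summarizeCategories helper is hypothetical and the sample response values are made up, but the nesting follows the shape of a terms aggregation response.

```javascript
// Flattens the "categories" terms aggregation (with its nested average)
// into simple rows.
function summarizeCategories(response) {
  return response.aggregations.categories.buckets.map(bucket => ({
    category: bucket.key,
    count: bucket.doc_count,
    averagePrice: bucket.avg_price_per_category.value
  }));
}

// Made-up response fragment for illustration.
const sampleResponse = {
  aggregations: {
    categories: {
      buckets: [
        { key: 'Electronics', doc_count: 2, avg_price_per_category: { value: 64.99 } }
      ]
    }
  }
};

console.log(summarizeCategories(sampleResponse));
```

If your client version wraps responses in a body property, pass that property to the helper instead of the whole response.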
Real-World Implementation in Node.js
Installation
npm install @elastic/elasticsearch
Complete Search API
const { Client } = require('@elastic/elasticsearch');
// Create one client and reuse it for every request
const client = new Client({
  node: 'http://localhost:9200'
});
// Search function
async function searchProducts(query, options = {}) {
  // Resolve defaults first so the offset is correct even when size is omitted
  const size = options.size || 10;
  const page = options.page || 0;
  try {
    // Client v8+ returns the response body directly.
    // On v7, wrap these fields in `body` and read the result from `response.body`.
    const response = await client.search({
      index: 'products',
      query: {
        bool: {
          must: [
            {
              multi_match: {
                query: query,
                fields: ['name^2', 'description'],
                fuzziness: 'AUTO'
              }
            }
          ],
          filter: options.filters || []
        }
      },
      from: page * size,
      size: size,
      sort: options.sort || [{ _score: 'desc' }]
    });
    return {
      total: response.hits.total.value,
      results: response.hits.hits.map(hit => ({
        id: hit._id,
        score: hit._score,
        ...hit._source
      }))
    };
  } catch (error) {
    console.error('Search error:', error);
    throw error;
  }
}
// Usage example
async function main() {
  const results = await searchProducts('wireless headphones', {
    filters: [
      { range: { price: { lte: 100 } } },
      { term: { in_stock: true } }
    ],
    page: 0,
    size: 20
  });
  console.log(`Found ${results.total} products`);
  results.results.forEach(product => {
    console.log(`${product.name} - ${product.price}`);
  });
}
// searchProducts already logs the error; just set a failing exit code
main().catch(() => {
  process.exitCode = 1;
});
🎯 Production Tips:
- Always use pagination (from/size) for large result sets
- Cache frequently-used queries with Redis
- Create one client instance and reuse it across requests so its connection pool is shared
- Monitor query performance with slowlog
Performance Optimization Tips
1. Use Filters Instead of Queries When Possible
Clauses in filter context skip relevance scoring, and Elasticsearch can cache frequently used filters, so they are usually faster for yes/no conditions.
2. Limit the Number of Shards
For small indexes (under 50GB), use 1-2 shards. Too many shards hurt performance because each one adds memory and coordination overhead.
3. Use _source Filtering
Only return the fields you need to reduce network overhead.
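A minimal sketch of adding _source filtering to any search body. The withSourceFields helper is a made-up name; _source is the standard search option.

```javascript
// Hypothetical helper: returns a copy of a search body that only
// asks for the listed fields in each hit's _source.
function withSourceFields(searchBody, fields) {
  return { ...searchBody, _source: fields };
}

const listingBody = withSourceFields(
  { query: { match: { description: 'wireless' } } },
  ['name', 'price']
);
console.log(JSON.stringify(listingBody, null, 2));
```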
4. Bulk Indexing
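When indexing multiple documents, use the bulk API. Sending many documents in one request removes per-request overhead and is typically far faster than indexing them one at a time.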
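The bulk API expects pairs of lines: an action line followed by the document itself. Here is a sketch of building that payload in JavaScript. Both helper names are hypothetical, and how you hand the array to the client depends on its version, so check the client docs before relying on it.

```javascript
// Hypothetical helper: interleaves an "index" action with each document.
function buildBulkOperations(indexName, documents) {
  return documents.flatMap(doc => [{ index: { _index: indexName } }, doc]);
}

// For raw HTTP, the _bulk endpoint takes newline-delimited JSON
// and the payload must end with a newline.
function toNdjson(operations) {
  return operations.map(op => JSON.stringify(op)).join('\n') + '\n';
}

const operations = buildBulkOperations('products', [
  { name: 'Gaming Mouse RGB', price: 49.99 },
  { name: 'Wireless Bluetooth Headphones', price: 79.99 }
]);
console.log(toNdjson(operations));
```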
Common Pitfalls to Avoid
Pitfall #1: Using Wildcards at the Beginning
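Queries like *search* cannot use the index efficiently: with a leading wildcard, Elasticsearch has to check every term in the field. Index ngrams instead so substrings become ordinary term lookups.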
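The idea behind ngrams can be shown without a cluster. The sketch below is a conceptual illustration in JavaScript, not Elasticsearch's ngram tokenizer, and the ngrams helper is invented for this example. It splits terms into overlapping three-character pieces, so a substring search becomes a check that every piece of the query was indexed.

```javascript
// Splits a term into overlapping n-character pieces.
function ngrams(term, n = 3) {
  const grams = [];
  for (let i = 0; i + n <= term.length; i++) {
    grams.push(term.slice(i, i + n));
  }
  return grams;
}

// Pieces stored at index time for the word "research".
const indexedGrams = new Set(ngrams('research'));

// Pieces of the query "search": each one is a direct lookup, no scan needed.
const queryGrams = ngrams('search');
console.log(queryGrams);
console.log(queryGrams.every(gram => indexedGrams.has(gram)));
```

In a real index the same effect comes from an analyzer with an ngram tokenizer or token filter, at the cost of a larger index.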
Pitfall #2: Not Setting Refresh Interval
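The default 1s refresh is more frequent than most workloads need. Raise index.refresh_interval to something like 30s for write-heavy indexes, or set it to -1 during a large bulk load and restore it afterwards.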
Pitfall #3: Deep Pagination
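Don't paginate beyond 10,000 results with from/size; by default Elasticsearch rejects requests where from + size exceeds that limit. Use search_after (ideally with a point in time) for deep pagination, and keep the scroll API for bulk exports rather than user-facing paging.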
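With search_after, each request carries the sort values of the last hit from the previous page instead of an offset. A minimal sketch of building the follow-up request; nextPageBody is a hypothetical helper, and product_id is an assumed unique keyword field used as a tiebreaker (the mapping earlier in this article does not define one).

```javascript
// Hypothetical helper: turns a sorted search body into the request for the next page.
function nextPageBody(baseBody, lastHit) {
  if (!Array.isArray(baseBody.sort) || baseBody.sort.length === 0) {
    throw new Error('search_after needs an explicit sort');
  }
  // Drop any `from` offset; search_after replaces it.
  const { from, ...body } = baseBody;
  // Every hit carries its sort values in `hit.sort`; pass the last hit's values along.
  return lastHit ? { ...body, search_after: lastHit.sort } : body;
}

const firstPageBody = {
  size: 20,
  query: { match: { description: 'wireless' } },
  // `product_id` is an assumed unique field so that ties on price are stable.
  sort: [{ price: 'asc' }, { product_id: 'asc' }]
};

// Made-up last hit from the first page of results.
const lastHitOfFirstPage = { _id: 'abc', sort: [49.99, 'p-2'] };
console.log(JSON.stringify(nextPageBody(firstPageBody, lastHitOfFirstPage), null, 2));
```

Repeat the call with the last hit of each page until a page comes back with fewer hits than size.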
Start Searching Faster Today
Elasticsearch transforms how users find information in your application. Full-text searches that crawl when written as SQL LIKE scans over large tables can come back in milliseconds from an inverted index.
Don't let slow search frustrate your users. Implement Elasticsearch and watch engagement soar.
Essential Resources
- Elasticsearch Official Docs - Comprehensive documentation and guides
- Elastic Stack - Learn Kibana, Logstash, and Beats integration
- Elasticsearch JavaScript Client - Official Node.js library docs
- Elastic Blog - Best practices and case studies
- Elasticsearch Community Forum - Get help from experts
Questions about Elasticsearch? Drop them in the comments below!
