Introduction
The ELK Stack is a collection of three open-source tools — Elasticsearch, Logstash, and Kibana — that provide a robust solution for searching, analyzing, and visualizing log data in real time. This documentation covers installation, configuration, and core concepts for each component:
- Elasticsearch: A distributed, RESTful search and analytics engine
- Logstash: A server-side data processing pipeline for ingesting, transforming, and shipping data
- Kibana: A visualization layer that works on top of Elasticsearch
Elasticsearch Core Concepts
Documents
- The basic unit of information in Elasticsearch
- Expressed in JSON format
- Self-contained and can contain various fields with values
- Example:
{
  "id": "1",
  "title": "Understanding Elasticsearch",
  "content": "Elasticsearch is a powerful search engine...",
  "tags": ["search", "database", "elastic"],
  "created_at": "2025-05-04T10:00:00Z"
}
Indices
- Collections of documents with similar characteristics
- Similar to a table in a relational database
- Each index is optimized for specific search operations
- Example: A logs-2025.05.04 index might contain all logs from May 4, 2025
Shards
- Horizontal divisions of an index that distribute data across nodes
- Each shard is a fully-functional, independent Lucene index
- Benefits:
- Distributes processing across multiple nodes
- Allows horizontal scaling
- Improves parallel operations
Configuration example:
# Static index setting, defined when an index is created (via the API or an index template)
index.number_of_shards: 5   # Example value; the default for new indices is 1
Replicas
- Copies of shards for redundancy and increased query throughput
- Provides high availability in case of node failures
- A replica shard is never allocated on the same node as its primary shard
Configuration example:
# Dynamic index setting, can be changed at any time via the API
index.number_of_replicas: 1   # Default for new indices
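For example, both settings can be applied when an index is created, and the replica count can be raised later on a live index. A minimal sketch (the index name logs-2025.05.04 is only an illustration):
# Create an index with explicit shard and replica counts
curl -X PUT "localhost:9200/logs-2025.05.04" -H "Content-Type: application/json" -d '{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}' -u elastic:password

# Increase the replica count on the existing index (number_of_shards cannot be changed this way)
curl -X PUT "localhost:9200/logs-2025.05.04/_settings" -H "Content-Type: application/json" -d '{
  "index": { "number_of_replicas": 2 }
}' -u elastic:password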
Elasticsearch REST API
Elasticsearch provides a comprehensive REST API for interacting with the cluster. The main HTTP methods are:
GET
Used to retrieve data or information about the cluster, indices, or documents:
# Get information about the cluster
curl -X GET "localhost:9200" -u elastic:password
# Get all indices
curl -X GET "localhost:9200/_cat/indices?v" -u elastic:password
# Get a specific document
curl -X GET "localhost:9200/my_index/_doc/1" -u elastic:password
POST
Used to create resources without specifying an ID, or to search documents:
# Create a document without specifying ID
curl -X POST "localhost:9200/my_index/_doc" -H "Content-Type: application/json" -d '{
  "title": "New Document",
  "content": "This is a new document"
}' -u elastic:password
# Search for documents
curl -X POST "localhost:9200/my_index/_search" -H "Content-Type: application/json" -d '{
  "query": {
    "match": {
      "title": "document"
    }
  }
}' -u elastic:password
PUT
Used to create or update resources with a specified ID:
# Create an index
curl -X PUT "localhost:9200/my_index" -u elastic:password
# Create or replace a document with a specific ID (the entire document is overwritten)
curl -X PUT "localhost:9200/my_index/_doc/1" -H "Content-Type: application/json" -d '{
  "title": "Updated Document",
  "content": "This document has been updated"
}' -u elastic:password
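Note that a PUT to an existing document ID replaces the whole document. To change only selected fields, the _update API can be used instead; a minimal sketch:
# Partially update a document: only the fields listed under "doc" are changed
curl -X POST "localhost:9200/my_index/_update/1" -H "Content-Type: application/json" -d '{
  "doc": {
    "content": "Only this field is modified"
  }
}' -u elastic:password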
DELETE
Used to remove resources:
# Delete a document
curl -X DELETE "localhost:9200/my_index/_doc/1" -u elastic:password
# Delete an index
curl -X DELETE "localhost:9200/my_index" -u elastic:password
Elasticsearch Common Operations
Creating an Index with Mappings
curl -X PUT "localhost:9200/logs" -H "Content-Type: application/json" -d '{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "message": { "type": "text" },
      "level": { "type": "keyword" },
      "source": { "type": "keyword" }
    }
  }
}' -u elastic:password
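The applied mapping can be verified afterwards with the _mapping endpoint:
# Inspect the mapping of the logs index
curl -X GET "localhost:9200/logs/_mapping?pretty" -u elastic:password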
Bulk Indexing Documents
curl -X POST "localhost:9200/_bulk" -H "Content-Type: application/json" -d '
{"index": {"_index": "logs", "_id": "1"}}
{"timestamp": "2025-05-04T10:00:00Z", "message": "Server started", "level": "INFO", "source": "app-server"}
{"index": {"_index": "logs", "_id": "2"}}
{"timestamp": "2025-05-04T10:01:15Z", "message": "Connection established", "level": "INFO", "source": "app-server"}
' -u elastic:password
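The bulk body is newline-delimited JSON (an action line followed by its document line) and must end with a newline. A quick way to confirm the documents were indexed, assuming the logs index above:
# Make newly indexed documents visible to search, then count them
curl -X POST "localhost:9200/logs/_refresh" -u elastic:password
curl -X GET "localhost:9200/logs/_count" -u elastic:password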
Searching Documents
curl -X GET "localhost:9200/logs/_search" -H "Content-Type: application/json" -d '{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "INFO" } }
      ],
      "filter": [
        { "range": { "timestamp": { "gte": "2025-05-04T00:00:00Z" } } }
      ]
    }
  },
  "sort": [
    { "timestamp": "desc" }
  ],
  "size": 20
}' -u elastic:password
Logstash
Logstash Architecture
Logstash processes events through a pipeline consisting of three stages:
- Inputs: Collect data from various sources
- Filters: Process and transform the data
- Outputs: Send the processed data to destinations
Logstash Configuration Structure
Logstash configuration files use a simple format:
input {
  # Input plugins
}

filter {
  # Filter plugins
}

output {
  # Output plugins
}
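For a quick smoke test, a pipeline can also be supplied on the command line with the -e flag instead of a config file. A minimal sketch (assuming a package install under /usr/share/logstash):
# Read lines from stdin and print the resulting events to stdout
/usr/share/logstash/bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'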
Logstash Common Plugins
Input Plugins
- stdin:
- Reads from standard input
- Useful for testing
input {
  stdin { }
}
- file:
- Reads from files on the filesystem
- Supports file rotation and tracking
input {
  file {
    path => "/var/log/apache/access.log"
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb"
  }
}
- beats:
- Receives events from the Beats framework (Filebeat, Metricbeat, etc.)
input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}
Filter Plugins
- grok:
- Parses unstructured log data into structured fields
- Uses pattern matching
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
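Besides the bundled composite patterns, a match can be assembled from individual pattern names with named captures. A sketch for a hypothetical "client method path" log line (the field names client_ip, http_method, and request_path are illustrative):
filter {
  grok {
    # Extract the client IP, HTTP method, and request path into named fields
    match => { "message" => "%{IP:client_ip} %{WORD:http_method} %{URIPATHPARAM:request_path}" }
  }
}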
- date:
- Parses dates from fields and uses them as the event timestamp
- Supports various date formats
filter {
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "@timestamp"
  }
}
- mutate:
- Performs general transformations on fields
- Operations: rename, remove, replace, convert, etc.
filter {
  mutate {
    convert => { "bytes" => "integer" }
    rename => { "source" => "source_host" }
    remove_field => [ "temp_field" ]
  }
}
Output Plugins
- elasticsearch:
- Sends events to Elasticsearch
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "YOUR_PASSWORD"
    index => "logs-%{+YYYY.MM.dd}"
  }
}
- stdout:
- Outputs events to standard output
- Useful for debugging
output {
  stdout {
    codec => rubydebug
  }
}
- file:
- Writes events to files
output {
  file {
    path => "/var/log/logstash/processed_events.log"
    codec => json_lines
  }
}
Logstash Example Configurations
System Logs Processing
input {
  file {
    path => "/var/log/syslog"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{SYSLOGBASE} %{GREEDYDATA:syslog_message}" }
  }
  date {
    match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
  }
  mutate {
    remove_field => [ "timestamp" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "YOUR_PASSWORD"
    index => "syslog-%{+YYYY.MM.dd}"
  }
}
Apache Access Logs Processing
input {
  file {
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip {
    source => "clientip"
  }
  useragent {
    source => "agent"
    target => "user_agent"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "YOUR_PASSWORD"
    index => "apache-access-%{+YYYY.MM.dd}"
  }
}
Note: recent Logstash releases enable ECS compatibility by default, in which case %{COMBINEDAPACHELOG} produces ECS field names (for example [source][address] and [user_agent][original]) rather than clientip and agent; adjust the geoip and useragent source fields accordingly, or disable ECS compatibility for the pipeline.
Kibana
Connecting to Elasticsearch
- Initial setup:
  - Access Kibana through your Nginx reverse proxy: https://kibana.example.com
  - Log in with Elasticsearch credentials (elastic user)
  - Follow the on-screen setup wizard
- Configure data source:
  - Navigate to Stack Management → Data Views
  - Create data views (formerly known as “index patterns”) for your Elasticsearch indices
  - Example pattern: logs-* (matches all indices starting with “logs-”)
  - Select the timestamp field (typically @timestamp)
Creating Dashboards and Visualizations
- Create visualizations:
  - Navigate to Visualize
  - Click “Create visualization”
  - Choose a visualization type:
    - Line, bar, area charts
    - Pie charts
    - Data tables
    - Metrics
    - Heat maps
    - Maps
    - Many more
- Example: Create a line chart:
  - Select “Line” visualization
  - Choose your data view
  - Configure metrics (Y-axis):
    - Add metric: count(), avg(), sum(), etc.
  - Configure buckets (X-axis):
    - Add a date histogram for time-based data
  - Save the visualization
- Create a dashboard:
  - Navigate to Dashboard
  - Click “Create dashboard”
  - Add visualizations using “Add from library”
  - Arrange and resize visualizations
  - Add filters to refine the data
  - Save the dashboard
Query Syntax
Kibana supports two query languages:
Lucene Query Syntax
- Field search: field:value
  - Example: status:404
- Wildcard: * (multiple characters), ? (single character)
  - Example: user:j*
- Range search: field:[value1 TO value2]
  - Example: bytes:[1000 TO 5000]
- Boolean operators: AND, OR, NOT
  - Example: status:404 AND path:"/admin"
- Grouping: ()
  - Example: (status:404 OR status:500) AND path:"/api"
Kibana Query Language (KQL)
- Basic queries: field:value
  - Example: status:404
- Wildcard: * (same as Lucene)
  - Example: user:j*
- Boolean operators: and, or, not (lowercase)
  - Example: status:404 and path:"/admin"
- Nested field access: parent.child:value
  - Example: user.name:john
- Value lists: field:(value1 or value2)
  - Example: status:(404 or 500)
- Range operators: >, >=, <, <=
  - Example: bytes > 1000
Troubleshooting
Elasticsearch Issues
- Not Starting:
  - Check logs: sudo journalctl -u elasticsearch
  - Verify Java version: java -version
  - Check file permissions: ls -la /etc/elasticsearch/
  - Check memory settings: grep -A 20 "heap" /etc/elasticsearch/jvm.options
- Connection Refused:
  - Verify service status: sudo systemctl status elasticsearch
  - Check listening addresses: netstat -tlpn | grep 9200
  - Confirm network settings in elasticsearch.yml
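If Elasticsearch is bound only to the loopback interface (the default), the relevant lines in elasticsearch.yml look roughly like this sketch; adjust the bind address to your environment and keep the port firewalled:
# /etc/elasticsearch/elasticsearch.yml
network.host: 0.0.0.0   # or a specific interface address; the default binds to loopback only
http.port: 9200         # default HTTP port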
Logstash Issues
- Configuration Problems:
  - Validate config: sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash -t -f /etc/logstash/conf.d/your-config.conf
  - Check logs: sudo journalctl -u logstash
- Performance Issues:
  - Check JVM settings: grep -A 20 "heap" /etc/logstash/jvm.options
  - Adjust worker settings in logstash.yml
Kibana Issues
- Cannot Connect to Elasticsearch:
  - Verify Elasticsearch is running: curl -u elastic:password localhost:9200
  - Check Kibana logs: sudo journalctl -u kibana
  - Verify connection settings in kibana.yml
- Nginx Proxy Issues:
  - Check Nginx logs: sudo tail -f /var/log/nginx/error.log
  - Verify Nginx config: sudo nginx -t
  - Check Kibana is listening: netstat -tlpn | grep 5601
Security Considerations
Network Security
- Firewall Configuration:
  - Only expose necessary ports (80/443 for Nginx)
  - Block direct access to Elasticsearch (9200) and Kibana (5601)
sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
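Since ufw denies inbound traffic by default once enabled, ports 9200 and 5601 remain unreachable from outside; explicit deny rules can make that intent visible:
sudo ufw deny 9200/tcp
sudo ufw deny 5601/tcp
sudo ufw status verbose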
Elasticsearch Security
- Enable X-Pack Security:
  - Set xpack.security.enabled: true in elasticsearch.yml
  - Use strong passwords for all built-in users
  - Configure proper role-based access control
- Encrypt Communications:
  - Set xpack.security.http.ssl.enabled: true in elasticsearch.yml
  - Configure certificates
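Taken together, the relevant elasticsearch.yml lines might look like the following sketch (the keystore path certs/http.p12 is an assumption for illustration):
# /etc/elasticsearch/elasticsearch.yml
xpack.security.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/http.p12   # assumed location of the HTTP certificate keystore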
Logstash Security
- Secure Connections:
  - Use SSL/TLS for input and output plugins
  - Store sensitive information in the keystore instead of plain text
sudo -u logstash /usr/share/logstash/bin/logstash-keystore create
sudo -u logstash /usr/share/logstash/bin/logstash-keystore add ES_PWD
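A value stored in the keystore can then be referenced from a pipeline instead of a literal password, for example reusing the ES_PWD key created above:
output {
  elasticsearch {
    hosts    => ["localhost:9200"]
    user     => "elastic"
    password => "${ES_PWD}"   # resolved from the Logstash keystore at runtime
  }
}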
Kibana Security
- Authentication:
  - Use Nginx basic auth or X-Pack security
  - Configure SSL/TLS
- Role-Based Access Control:
  - Create specific roles for different user groups
  - Apply the least-privilege principle
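If TLS is terminated at Kibana itself rather than at the Nginx proxy, the corresponding kibana.yml settings look roughly like this sketch (certificate paths are assumptions):
# /etc/kibana/kibana.yml
server.ssl.enabled: true
server.ssl.certificate: /etc/kibana/certs/kibana.crt   # assumed certificate path
server.ssl.key: /etc/kibana/certs/kibana.key           # assumed private key path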