Introduction

The ELK Stack is a collection of three open-source tools — Elasticsearch, Logstash, and Kibana — that provide a robust solution for searching, analyzing, and visualizing log data in real time. This documentation covers installation, configuration, and core concepts for each component:

  • Elasticsearch: A distributed, RESTful search and analytics engine
  • Logstash: A server-side data processing pipeline for ingesting, transforming, and shipping data
  • Kibana: A visualization layer that works on top of Elasticsearch

Elasticsearch Core Concepts

Documents

  • The basic unit of information in Elasticsearch
  • Expressed in JSON format
  • Self-contained and can contain various fields with values
  • Example:
{
  "id": "1",
  "title": "Understanding Elasticsearch",
  "content": "Elasticsearch is a powerful search engine...",
  "tags": ["search", "database", "elastic"],
  "created_at": "2025-05-04T10:00:00Z"
}

Indices

  • Collections of documents with similar characteristics
  • Analogous to a table in a relational database
  • Each index is optimized for specific search operations
  • Example: A logs-2025.05.04 index might contain all logs from May 4, 2025
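
Date-suffixed indices can be queried together by using a wildcard in the index name; the pattern below is illustrative, matching the daily naming scheme above:

# Search across all daily log indices for May 2025
curl -X GET "localhost:9200/logs-2025.05.*/_search" -u elastic:password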

Shards

  • Horizontal divisions of an index that distribute data across nodes
  • Each shard is a fully functional, independent Lucene index
  • Benefits:
    • Distributes processing across multiple nodes
    • Allows horizontal scaling
    • Improves parallel operations

Configuration example:

# Shard count is fixed at index creation and cannot be changed afterwards
# (the default is 1 since Elasticsearch 7.0); index-level settings can no
# longer be placed in elasticsearch.yml
curl -X PUT "localhost:9200/my_index" -H "Content-Type: application/json" -d '{
  "settings": { "index.number_of_shards": 5 }
}' -u elastic:password
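
To see how shards are actually allocated, the _cat API is useful:

# Show each shard of my_index, its state, and the node it lives on
curl -X GET "localhost:9200/_cat/shards/my_index?v" -u elastic:password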

Replicas

  • Copies of shards for redundancy and increased query throughput
  • Provides high availability in case of node failures
  • A replica shard is never allocated on the same node as its primary shard

Configuration example:

# Replica count can be changed at any time on a live index (the default is 1)
curl -X PUT "localhost:9200/my_index/_settings" -H "Content-Type: application/json" -d '{
  "index.number_of_replicas": 1
}' -u elastic:password
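
Unassigned replicas show up in cluster health. On a single-node cluster a replica can never be assigned (it cannot share a node with its primary), so the status stays yellow:

# green = all shards assigned, yellow = unassigned replicas, red = unassigned primaries
curl -X GET "localhost:9200/_cluster/health?pretty" -u elastic:password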

Elasticsearch REST API

Elasticsearch provides a comprehensive REST API for interacting with the cluster. The main HTTP methods are:

GET

Used to retrieve data or information about the cluster, indices, or documents:

# Get information about the cluster
curl -X GET "localhost:9200" -u elastic:password
 
# Get all indices
curl -X GET "localhost:9200/_cat/indices?v" -u elastic:password
 
# Get a specific document
curl -X GET "localhost:9200/my_index/_doc/1" -u elastic:password

POST

Used to create resources without specifying an ID, or to search documents:

# Create a document without specifying ID
curl -X POST "localhost:9200/my_index/_doc" -H "Content-Type: application/json" -d '{
  "title": "New Document",
  "content": "This is a new document"
}' -u elastic:password
 
# Search for documents
curl -X POST "localhost:9200/my_index/_search" -H "Content-Type: application/json" -d '{
  "query": {
    "match": {
      "title": "document"
    }
  }
}' -u elastic:password

PUT

Used to create or replace resources at a specified ID:

# Create an index
curl -X PUT "localhost:9200/my_index" -u elastic:password
 
# Create or replace a document with a specific ID (the whole document is overwritten)
curl -X PUT "localhost:9200/my_index/_doc/1" -H "Content-Type: application/json" -d '{
  "title": "Updated Document",
  "content": "This document has been updated"
}' -u elastic:password
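
Note that PUT replaces the entire document. To change individual fields while leaving the rest intact, use the _update endpoint:

# Partially update document 1: only the fields under "doc" are merged in
curl -X POST "localhost:9200/my_index/_update/1" -H "Content-Type: application/json" -d '{
  "doc": {
    "content": "Only this field changes"
  }
}' -u elastic:password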

DELETE

Used to remove resources:

# Delete a document
curl -X DELETE "localhost:9200/my_index/_doc/1" -u elastic:password
 
# Delete an index
curl -X DELETE "localhost:9200/my_index" -u elastic:password

Elasticsearch Common Operations

Creating an Index with Mappings

curl -X PUT "localhost:9200/logs" -H "Content-Type: application/json" -d '{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "message": { "type": "text" },
      "level": { "type": "keyword" },
      "source": { "type": "keyword" }
    }
  }
}' -u elastic:password
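
To confirm the mapping was applied:

# Returns the field types configured for the logs index
curl -X GET "localhost:9200/logs/_mapping?pretty" -u elastic:password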

Bulk Indexing Documents

curl -X POST "localhost:9200/_bulk" -H "Content-Type: application/json" -d '
{"index": {"_index": "logs", "_id": "1"}}
{"timestamp": "2025-05-04T10:00:00Z", "message": "Server started", "level": "INFO", "source": "app-server"}
{"index": {"_index": "logs", "_id": "2"}}
{"timestamp": "2025-05-04T10:01:15Z", "message": "Connection established", "level": "INFO", "source": "app-server"}
' -u elastic:password

Searching Documents

curl -X GET "localhost:9200/logs/_search" -H "Content-Type: application/json" -d '{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "INFO" } }
      ],
      "filter": [
        { "range": { "timestamp": { "gte": "2025-05-04T00:00:00Z" } } }
      ]
    }
  },
  "sort": [
    { "timestamp": "desc" }
  ],
  "size": 20
}' -u elastic:password

Logstash

Logstash Architecture

Logstash processes events through a pipeline consisting of three stages:

  1. Inputs: Collect data from various sources
  2. Filters: Process and transform the data
  3. Outputs: Send the processed data to destinations

Logstash Configuration Structure

Logstash configuration files use a simple format:

input {
  # Input plugins
}

filter {
  # Filter plugins
}

output {
  # Output plugins
}
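
A pipeline can be smoke-tested straight from the command line with the -e flag before writing a config file (the path assumes a package install):

# Read lines from stdin and print structured events to stdout
/usr/share/logstash/bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'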

Logstash Common Plugins

Input Plugins

  1. stdin:
    • Reads from standard input
    • Useful for testing
input {
  stdin { }
}
  2. file:
    • Reads from files on the filesystem
    • Supports file rotation and tracking
input {
  file {
    path => "/var/log/apache/access.log"
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb"
  }
}
  3. beats:
    • Receives events from Beats framework (Filebeat, Metricbeat, etc.)
input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}

Filter Plugins

  1. grok:
    • Parses unstructured log data into structured fields
    • Uses pattern matching
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
  2. date:
    • Parses dates from fields and uses them for timestamp
    • Supports various date formats
filter {
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "@timestamp"
  }
}
  3. mutate:
    • Performs general transformations on fields
    • Operations: rename, remove, replace, convert, etc.
filter {
  mutate {
    convert => { "bytes" => "integer" }
    rename => { "source" => "source_host" }
    remove_field => [ "temp_field" ]
  }
}

Output Plugins

  1. elasticsearch:
    • Sends events to Elasticsearch
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "YOUR_PASSWORD"
    index => "logs-%{+YYYY.MM.dd}"
  }
}
  2. stdout:
    • Outputs events to standard output
    • Useful for debugging
output {
  stdout {
    codec => rubydebug
  }
}
  3. file:
    • Writes events to files
output {
  file {
    path => "/var/log/logstash/processed_events.log"
    codec => json_lines
  }
}

Logstash Example Configurations

System Logs Processing

input {
  file {
    path => "/var/log/syslog"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{SYSLOGBASE} %{GREEDYDATA:syslog_message}" }
  }
  date {
    match => [ "timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
  }
  mutate {
    remove_field => [ "timestamp" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "YOUR_PASSWORD"
    index => "syslog-%{+YYYY.MM.dd}"
  }
}

Apache Access Logs Processing

input {
  file {
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip {
    source => "clientip"
  }
  useragent {
    source => "agent"
    target => "user_agent"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "YOUR_PASSWORD"
    index => "apache-access-%{+YYYY.MM.dd}"
  }
}

Kibana

Connecting to Elasticsearch

  1. Initial setup:

    • Access Kibana through your Nginx reverse proxy: https://kibana.example.com
    • Log in with Elasticsearch credentials (elastic user)
    • Follow the on-screen setup wizard
  2. Configure data source:

    • Navigate to Stack Management → Data Views
    • Create data views (formerly known as “index patterns”) for your Elasticsearch indices
    • Example pattern: logs-* (matches all indices starting with “logs-”)
    • Select the timestamp field (typically @timestamp)
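
Data views can also be created programmatically with the Kibana 8.x data views API; the URL assumes the Nginx proxy setup above:

# Create a logs-* data view with @timestamp as its time field
curl -X POST "https://kibana.example.com/api/data_views/data_view" -H "kbn-xsrf: true" -H "Content-Type: application/json" -d '{
  "data_view": { "title": "logs-*", "timeFieldName": "@timestamp" }
}' -u elastic:password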

Creating Dashboards and Visualizations

  1. Create visualizations:

    • Navigate to Visualize Library
    • Click “Create visualization”
    • Choose a visualization type:
      • Line, bar, area charts
      • Pie charts
      • Data tables
      • Metrics
      • Heat maps
      • Maps
      • Many more
  2. Example: Create a line chart:

    • Select “Line” visualization
    • Choose your data view
    • Configure metrics (Y-axis):
      • Add metric: count(), avg(), sum(), etc.
    • Configure buckets (X-axis):
      • Add date histogram for time-based data
    • Save the visualization
  3. Create a dashboard:

    • Navigate to Dashboard
    • Click “Create dashboard”
    • Add visualizations using “Add from library”
    • Arrange and resize visualizations
    • Add filters to refine the data
    • Save the dashboard

Query Syntax

Kibana supports two query languages:

Lucene Query Syntax

  • Field search: field:value
    • Example: status:404
  • Wildcard: * (multiple characters), ? (single character)
    • Example: user:j*
  • Range search: field:[value1 TO value2]
    • Example: bytes:[1000 TO 5000]
  • Boolean operators: AND, OR, NOT
    • Example: status:404 AND path:"/admin"
  • Grouping: ()
    • Example: (status:404 OR status:500) AND path:"/api"

Kibana Query Language (KQL)

  • Basic queries: field:value
    • Example: status:404
  • Wildcard: * (same as Lucene)
    • Example: user:j*
  • Boolean operators: and, or, not (case-insensitive; lowercase by convention)
    • Example: status:404 and path:"/admin"
  • Nested field access: parent.child:value
    • Example: user.name:john
  • Value lists: field:(value1 or value2)
    • Example: status:(404 or 500)
  • Range operators: >, >=, <, <=
    • Example: bytes > 1000

Troubleshooting

Elasticsearch Issues

  1. Not Starting:
    • Check logs: sudo journalctl -u elasticsearch
    • Verify Java version: java -version
    • Check file permissions: ls -la /etc/elasticsearch/
    • Check memory settings: grep -A 20 "heap" /etc/elasticsearch/jvm.options
  2. Connection Refused:
    • Verify service status: sudo systemctl status elasticsearch
    • Check listening addresses: netstat -tlpn | grep 9200
    • Confirm network settings in elasticsearch.yml
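
For reference, the listen address and port are controlled by these elasticsearch.yml settings (values shown are examples for a single-node setup):

# elasticsearch.yml
network.host: 0.0.0.0        # interfaces to bind; the default binds to localhost only
http.port: 9200              # REST API port
discovery.type: single-node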

Logstash Issues

  1. Configuration Problems:
    • Validate config: sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash -t -f /etc/logstash/conf.d/your-config.conf
    • Check logs: sudo journalctl -u logstash
  2. Performance Issues:
    • Check JVM settings: grep -A 20 "heap" /etc/logstash/jvm.options
    • Adjust worker settings in logstash.yml

Kibana Issues

  1. Cannot Connect to Elasticsearch:
    • Verify Elasticsearch is running: curl -u elastic:password localhost:9200
    • Check Kibana logs: sudo journalctl -u kibana
    • Verify connection settings in kibana.yml
  2. Nginx Proxy Issues:
    • Check Nginx logs: sudo tail -f /var/log/nginx/error.log
    • Verify Nginx config: sudo nginx -t
    • Check Kibana is listening: netstat -tlpn | grep 5601

Security Considerations

Network Security

  1. Firewall Configuration:
    • Only expose necessary ports (80/443 for Nginx)
    • Block direct access to Elasticsearch (9200) and Kibana (5601)
# Allow SSH plus HTTP/HTTPS for the Nginx proxy; 9200 and 5601 stay blocked
sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
# Enable the firewall with the rules above (default policy denies incoming)
sudo ufw enable

Elasticsearch Security

  1. Enable X-Pack Security:
    • Set xpack.security.enabled: true in elasticsearch.yml
    • Use strong passwords for all built-in users
    • Configure proper role-based access control
  2. Encrypt Communications:
    • Set xpack.security.http.ssl.enabled: true in elasticsearch.yml
    • Configure certificates
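
A minimal sketch of the relevant elasticsearch.yml settings (the keystore path is an example; Elasticsearch 8.x security auto-configuration generates one under config/certs):

xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/http.p12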

Logstash Security

  1. Secure Connections:
    • Use SSL/TLS for input and output plugins
    • Store sensitive information in keystore instead of plain text
sudo -u logstash /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash create
sudo -u logstash /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash add ES_PWD
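
Values stored in the keystore can then be referenced with ${...} syntax in pipeline configurations instead of a plain-text password:

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "${ES_PWD}"  # resolved from the Logstash keystore at startup
  }
}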

Kibana Security

  1. Authentication:
    • Use Nginx basic auth or X-Pack security
    • Configure SSL/TLS
  2. Role-Based Access Control:
    • Create specific roles for different user groups
    • Apply least privilege principle
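
Roles can be created via the Elasticsearch security API; the role name and index pattern below are examples:

# A read-only role limited to the logs-* indices
curl -X POST "localhost:9200/_security/role/logs_reader" -H "Content-Type: application/json" -d '{
  "indices": [
    { "names": [ "logs-*" ], "privileges": [ "read", "view_index_metadata" ] }
  ]
}' -u elastic:password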
