Brief Steps

  1. Install Elasticsearch: Set up and run Elasticsearch on your server(s). This will be your central data store and search engine.
  2. Install Logstash: Set up and run Logstash on a server (often the same as Elasticsearch or separate). This will be used for processing logs before they go into Elasticsearch.
  3. Install Kibana: Set up and run Kibana, usually on the same server as Elasticsearch. This is your visualization and exploration tool.
  4. Configure Logstash Pipeline: Create a configuration for Logstash that defines:
    • An input to listen for data coming from Filebeat (e.g., using the Beats input plugin).
    • Optional filters to parse or transform your log data if needed (e.g., grok for unstructured logs, date parsing).
    • An output to send the processed data to your Elasticsearch instance.
  5. Configure Kibana Connection: Point Kibana to your running Elasticsearch instance so it knows where to fetch data from. Start Kibana.
  6. Install Filebeat: On the server(s) where your application is running and writing .log files, install the Filebeat agent.
  7. Configure Filebeat: Modify the Filebeat configuration file to:
    • Specify the path(s) to your application’s .log files under the input section.
    • Specify the output destination, pointing it to your Logstash instance’s address and port (where the Beats input is listening).
  8. Start Services: Ensure Elasticsearch, Logstash, Kibana, and Filebeat services are running.
  9. Create Kibana Index Pattern: Create an “Index Pattern”. This tells Kibana which Elasticsearch indices to look at (e.g., logstash-* or filebeat-*, depending on the setup) and identifies the timestamp field.
  10. Explore Logs in Kibana: Go to the “Discover” section in Kibana. Select your newly created index pattern. You should now see your log events flowing in. You can search and filter them here.
  11. Build Kibana Dashboard: Create visualizations (charts, tables, metrics) from your indexed log data and combine them into a dashboard. (A quick health-check sketch for the whole stack follows this list.)
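Once all the services below are running, a quick way to confirm each component is up is to hit its status endpoint. This is only a sketch, assuming default ports; substitute your own elastic password and adjust the certificate path to wherever you extracted Elasticsearch.

# Elasticsearch (security is on by default in 9.x, so use HTTPS with the generated CA cert)
curl --cacert /home/bigdata/elasticsearch/config/certs/http_ca.crt -u elastic https://localhost:9200

# Kibana status API
curl http://localhost:5601/api/status

# Logstash monitoring API (listens on port 9600 by default)
curl http://localhost:9600/?pretty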

Preparation

Let's say we use an app.log file as the source, located at /home/bigdata/log/app.log, with the following sample content:

2023-10-27 10:00:01 INFO User 'admin' logged in successfully. session_id=abc123
2023-10-27 10:00:05 INFO Processing request for resource '/api/users'. request_id=xyz789
2023-10-27 10:00:10 WARN Database connection pool nearing capacity. usage=85%
2023-10-27 10:00:15 ERROR Failed to process payment for order 'ORD998'. reason=Insufficient funds. transaction_id=pay556
2023-10-27 10:00:20 INFO User 'guest' accessed public page '/home'.
2023-10-27 10:00:22 ERROR Uncaught exception in background task 'Cleanup'. error=NullPointerException trace=...
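If you want to follow along exactly, you can create the sample file yourself. This is just a convenience sketch; adjust the path to match your setup.

mkdir -p /home/bigdata/log
cat >> /home/bigdata/log/app.log << 'EOF'
2023-10-27 10:00:01 INFO User 'admin' logged in successfully. session_id=abc123
2023-10-27 10:00:10 WARN Database connection pool nearing capacity. usage=85%
2023-10-27 10:00:15 ERROR Failed to process payment for order 'ORD998'. reason=Insufficient funds. transaction_id=pay556
EOF

Appending more lines later is also a handy way to watch new events arrive in Kibana in near real time.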

Setup Elasticsearch

Download and extract Elasticsearch

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-9.0.0-linux-x86_64.tar.gz
tar xvf elasticsearch-9.0.0-linux-x86_64.tar.gz
mv elasticsearch-9.0.0 elasticsearch
cd elasticsearch

Run Elasticsearch and note the generated password for the elastic user, as well as the Kibana enrollment token.

./bin/elasticsearch

Example output

✅ Elasticsearch security features have been automatically configured!
✅ Authentication is enabled and cluster connections are encrypted.
 
ℹ️  Password for the elastic user (reset with `bin/elasticsearch-reset-password -u elastic`):
  f=7NMyONNetJX_OzIt6l
 
ℹ️  HTTP CA certificate SHA-256 fingerprint:
  40c8fed1a681c36d2265be4e99a348cf41be9a54d53b36448058428c3963f3f7
 
ℹ️  Configure Kibana to use this cluster:
• Run Kibana and click the configuration link in the terminal when Kibana starts.
• Copy the following enrollment token and paste it into Kibana in your browser (valid for the next 30 minutes):
  eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTAuMjU1LjI1NS4yNTQ6OTIwMCJdLCJmZ3IiOiI0MGM4ZmVkMWE2ODFjMzZkMjI2NWJlNGU5OWEzNDhjZjQxYmU5YTU0ZDUzYjM2NDQ4MDU4NDI4YzM5NjNmM2Y3Iiwia2V5Ijoia216cW5KWUJCaEdna1Vyd3VZTWo6NmFwX2tBOC1sZFpaVjFHbnZoemtnQSJ9
 
ℹ️  Configure other nodes to join this cluster:
• On this node:
  ⁃ Create an enrollment token with `bin/elasticsearch-create-enrollment-token -s node`.
  ⁃ Uncomment the transport.host setting at the end of config/elasticsearch.yml.
  ⁃ Restart Elasticsearch.
• On other nodes:
  ⁃ Start Elasticsearch with `bin/elasticsearch --enrollment-token <token>`, using the enrollment token that you generated.
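Before moving on, you can check that Elasticsearch is reachable. A minimal sketch, run from the directory where you extracted Elasticsearch, using the elastic password printed above (curl will prompt for it):

curl --cacert config/certs/http_ca.crt -u elastic https://localhost:9200

A JSON response with the cluster name and version number means the node is up.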

Setup Kibana

Download and extract Kibana

wget https://artifacts.elastic.co/downloads/kibana/kibana-9.0.0-linux-x86_64.tar.gz
tar xvf kibana-9.0.0-linux-x86_64.tar.gz
mv kibana-9.0.0 kibana
cd kibana

Start the Kibana dashboard

./bin/kibana

Open http://localhost:5601/ in your browser. Kibana will ask for the enrollment token from the Elasticsearch output above, and then for a verification code, which you can find in the Kibana console output.
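If the enrollment token has expired (it is only valid for 30 minutes) or you missed the verification code, both can be regenerated. A short sketch, assuming the extraction directories used above:

# From the elasticsearch directory: create a fresh enrollment token for Kibana
./bin/elasticsearch-create-enrollment-token -s kibana

# From the kibana directory: print the current verification code
./bin/kibana-verification-code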

Setup Logstash

Download and extract Logstash

wget https://artifacts.elastic.co/downloads/logstash/logstash-9.0.0-linux-x86_64.tar.gz
tar xvf logstash-9.0.0-linux-x86_64.tar.gz
mv logstash-9.0.0 logstash
cd logstash

Configure Logstash YAML

Edit these lines in config/logstash.yml and config/pipelines.yml so Logstash can locate the .conf pipeline files.

# logstash.yml
path.config: /home/bigdata/logstash/config/*.conf
# pipelines.yml
- pipeline.id: main
  path.config: "/home/bigdata/logstash/config/*.conf"

Configure Logstash Pipeline

# /home/bigdata/logstash/config/01-beats-input.conf
 
input {
  beats {
    # The port where Filebeat is sending data
    port => 5044
    # You can add an ID for clarity, but it's optional
    # id => "filebeat_input"
  }
}
 
filter {
  # Attempt to parse logs matching the dummy format
  grok {
    # Match pattern: Timestamp, LogLevel, Rest of the message
    # Example line: 2023-10-27 10:00:01 INFO User 'admin' logged in...
    match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{LOGLEVEL:log.level} %{GREEDYDATA:log_message}" }
 
    # Overwrite the main 'message' field with the parsed log message part
    # Keep the original full line in '[event][original]' if needed (often added by Filebeat/Agent anyway)
    overwrite => [ "message" ]
  }
 
  # If the timestamp was successfully parsed by grok, use it as the event's main timestamp
  if [log_timestamp] {
    date {
      match => [ "log_timestamp", "yyyy-MM-dd HH:mm:ss" ]
      # Set the main @timestamp field for Elasticsearch/Kibana
      target => "@timestamp"
      # Optional: Remove the temporary field after parsing
      # remove_field => [ "log_timestamp" ]
    }
  }
}
 
output {
  elasticsearch {
    # Address of your local Elasticsearch instance
    hosts => ["http://localhost:9200"]
 
    # The index name pattern to use in Elasticsearch.
    # %{+YYYY.MM.dd} creates daily indices (e.g., logstash-2023.10.27)
    index => "logstash-%{+YYYY.MM.dd}"
 
    # Optional: If your Elasticsearch requires authentication
    user => "elastic"
    password => "bjHv0UypohOnCt+fJe3R"
 
    # Optional: Add an ID for clarity
    # id => "elasticsearch_output"
  }
 
  # Optional: Output to console for debugging (remove in production)
  # stdout { codec => rubydebug }
}
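To sanity-check the grok pattern before wiring everything together, you can run a throwaway pipeline that reads a sample line from stdin and prints the parsed event. This is only a sketch: when -e is used and no input block is given, Logstash falls back to a stdin input, and rubydebug prints the resulting fields. If your main Logstash instance is already running, add --path.data /tmp/logstash-grok-test to avoid a data-directory conflict.

echo "2023-10-27 10:00:15 ERROR Failed to process payment for order 'ORD998'. reason=Insufficient funds. transaction_id=pay556" | \
./bin/logstash -e '
  filter {
    grok { match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{LOGLEVEL:log.level} %{GREEDYDATA:log_message}" } }
  }
  output { stdout { codec => rubydebug } }
'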

You can optionally run just this .conf file (instead of relying on pipelines.yml) via

./bin/logstash -f /home/bigdata/logstash/config/01-beats-input.conf
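It is also worth validating the configuration syntax first, and you can enable automatic reload so edits to the .conf file are picked up without a restart. Both flags are standard Logstash options:

# Check the configuration for syntax errors and exit
./bin/logstash -f /home/bigdata/logstash/config/01-beats-input.conf --config.test_and_exit

# Run the pipeline and reload it automatically whenever the config file changes
./bin/logstash -f /home/bigdata/logstash/config/01-beats-input.conf --config.reload.automatic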

Other examples of Logstash configs. The first reads from Beats and writes to a dated index; the second reads the log file directly (without Filebeat) and applies the same kind of grok filter.

input {
  beats {
    port => "5044"
  }
}
 
output {
  elasticsearch {
    hosts => [ "127.0.0.1:9200" ]
    user => 'elastic'
    password => 'bjHv0UypohOnCt+fJe3R'
    index => 'beat-2-index-%{+YYYY.MM.dd}'
  }
}
 
input {
  file {
    path => "/path/to/sample.log"
    start_position => "beginning"  # Starts reading from the beginning of the file
    sincedb_path => "/dev/null"  # Prevents Logstash from remembering the last read position (for testing)
  }
}
 
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
 
  # Convert timestamp to a date field in Elasticsearch
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
}
 
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "logs"
  }
 
  stdout { codec => rubydebug }  # For debugging and viewing the output in the console
}
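After the pipeline has run, you can confirm that documents are actually arriving in Elasticsearch. A small sketch, assuming the elastic credentials from above and plain HTTP as in the Logstash outputs (curl will prompt for the password; switch to HTTPS plus --cacert if your cluster enforces TLS):

# List the indices written by the pipelines above
curl -u elastic 'http://localhost:9200/_cat/indices/logs*,logstash-*,beat-2-index-*?v'

# Fetch one parsed document to check the grok fields
curl -u elastic 'http://localhost:9200/logs/_search?size=1&pretty'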
 

Install Filebeat

Download and extract Filebeat

wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-9.0.0-linux-x86_64.tar.gz
tar xvf filebeat-9.0.0-linux-x86_64.tar.gz
mv filebeat-9.0.0-linux-x86_64 filebeat
cd filebeat

Configure Filebeat

filebeat.inputs:
  # Specifies the input type (use 'log' for log files; 'filestream' is newer, but 'log' is the classic choice)
  - type: log
    enabled: true # Enable this input configuration

    # Paths that should be crawled and fetched. Glob based paths.
    paths:
      - /home/bigdata/log/app.log
      # Could add more paths here, e.g.:
      # - /var/log/another_app/*.log
      # - /opt/my_custom_app/logs/*.log

    # Optional: Add custom fields to identify the source
    # fields:
    #   log_source: myapplication
    #   environment: production
    # fields_under_root: true # Place these fields at the top level of the event
 
# Configure Filebeat to send events to Logstash
output.logstash:
  hosts: ["localhost:5044"] # Logstash host
  # Optional: Load balancing and connection settings
  # loadbalance: true
  # worker: 1 # Number of workers sending events to Logstash
  
# ============================== Other settings =================================
# Optional logging for Filebeat itself
# logging.level: info
# logging.to_files: true
# logging.files:
#   path: /var/log/filebeat
#   name: filebeat
#   keepfiles: 7
#   permissions: 0644

You can run Filebeat and watch it publish events from the configured input via

./filebeat -e -c filebeat.yml -d "publish"

• The -e flag tells Filebeat to log to standard error (stderr) instead of its configured log output.
• The -c flag specifies the configuration file Filebeat should use.
• The -d "publish" flag enables debug logging for the publish selector, so you can see events as they are sent.
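Filebeat can also check its own configuration and its connectivity to Logstash before you leave it running. These are standard Filebeat subcommands:

# Validate filebeat.yml
./filebeat test config -c filebeat.yml

# Verify that the Logstash output (localhost:5044) is reachable
./filebeat test output -c filebeat.yml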

(Optional) Create Elasticsearch Mapping

PUT /logs
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "level": {
        "type": "keyword"
      },
      "msg": {
        "type": "text"
      }
    }
  }
}
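The request above is written in Kibana Dev Tools syntax. From the command line, the equivalent call would look roughly like this (assuming the elastic user and the plain-HTTP endpoint used in the Logstash outputs; curl will prompt for the password):

curl -u elastic -X PUT 'http://localhost:9200/logs' \
  -H 'Content-Type: application/json' \
  -d '{
    "mappings": {
      "properties": {
        "timestamp": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
        "level": { "type": "keyword" },
        "msg": { "type": "text" }
      }
    }
  }'

Create the mapping before the index receives its first document; otherwise Elasticsearch will have already inferred its own mapping for these fields.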