In distributed systems like Kubernetes, logging is critical for monitoring and for providing insight into an application's operations. With the ever-increasing complexity of distributed systems and the proliferation of cloud-native solutions, monitoring and observability have become essential to understanding how these systems are functioning.
Logs don’t lie! They have been one of our greatest companions when investigating a production incident.
How is logging in Kubernetes different?
Log aggregation in Kubernetes differs greatly from logging on traditional servers or virtual machines, owing to the way it manages its applications (pods).
When an app crashes on a virtual machine, its logs remain accessible until they are deleted. When pods are evicted, crashed, deleted, or scheduled on a different node in Kubernetes, the container logs are lost. The system is self-cleaning. As a result, you are left with no knowledge of why the anomaly occurred. Because default logging in Kubernetes is transient, a centralized log management solution is essential.
Kubernetes is highly distributed and dynamic in nature; hence, in production, you’ll most certainly be working with multiple machines that have multiple containers each, which can crash at any time. Kubernetes clusters add to the complexity by introducing new layers that must be monitored, each of which generates its own type of log.
We’ve curated some of the best tools to help you centralize and manage your Kubernetes logs, alongside a simple guide on how to get started with each of them, as well as a comparison of these tools to match your use case.
PLG Stack
Introduction:
Promtail is an agent that ships the logs from the local system to the Loki cluster.
Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It indexes only metadata and doesn’t index the content of the log. This design decision makes it very cost-effective and easy to operate.
Grafana is the visualisation tool, which consumes data from the Loki data source.
Loki is like Prometheus, but for logs: it takes a multidimensional, label-based approach to indexing and ships as a single binary that is easy to operate, with no external dependencies. Loki differs from Prometheus by focusing on logs instead of metrics, and by delivering logs via push instead of pull.
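The stack can be installed with Helm. Here is a minimal sketch using the community loki-stack chart (repo URL, chart name, and flags follow the public Grafana Helm repository; adjust values to your environment):

```bash
# Add the Grafana chart repository and refresh the index
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

# Install Loki, Promtail, and Grafana together in a "logging" namespace
helm upgrade --install loki grafana/loki-stack \
  --namespace logging --create-namespace \
  --set grafana.enabled=true \
  --set promtail.enabled=true
```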
Click the Explore tab on the left side and select Loki from the data source dropdown.
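The same LogQL you run in Explore can also be issued against Loki's HTTP API directly. A minimal sketch (the in-cluster service address and the namespace label are assumptions):

```bash
# Ask Loki for the last 100 log lines in the "app" namespace containing "error"
curl -G -s "http://loki.logging.svc:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={namespace="app"} |= "error"' \
  --data-urlencode 'limit=100'
```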
EFK Stack
Introduction :
The Elastic Stack contains most of the tools required for log management:
Elasticsearch is an open-source, distributed, RESTful, and scalable search engine. It is a NoSQL database, used primarily to store logs received from Fluentd and retrieve them.
Log shippers such as Logstash, Fluentd, and Fluent Bit. Each is an open-source log collection agent that supports multiple data sources and output formats, and can forward logs to solutions like Stackdriver, CloudWatch, Splunk, BigQuery, etc.
Kibana is the UI tool for querying, data visualisation, and dashboards; virtually any type of dashboard can be built with it. Kibana Query Language (KQL) is used for querying Elasticsearch data.
Fluentd ➖ Deployed as a DaemonSet, as it needs to collect the container logs from all the nodes. It connects to the Elasticsearch service endpoint to forward the logs.
Elasticsearch ➖ Deployed as a StatefulSet, as it holds the log data. A service endpoint is also exposed for Fluentd and Kibana to connect to.
Kibana ➖ Deployed as a Deployment; it connects to the Elasticsearch service endpoint.
Configuration Options :
The stack can be installed through a Helm chart, either as a whole or as individual components.
More information related to deploying these Helm Charts can be found here
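As an illustrative sketch, the individual components can be installed from the public Elastic and Fluent Helm repositories (repo URLs and chart names assumed; pin chart versions in practice):

```bash
# Add the Elastic and Fluent chart repositories
helm repo add elastic https://helm.elastic.co
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update

# Elasticsearch (StatefulSet), Kibana (Deployment), Fluentd (DaemonSet)
helm install elasticsearch elastic/elasticsearch --namespace logging --create-namespace
helm install kibana elastic/kibana --namespace logging
helm install fluentd fluent/fluentd --namespace logging
```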
After installation is complete and the Kibana dashboard is accessible, we need to define index patterns to be able to see logs in Kibana.
From the homepage, type "Index Patterns" into the search bar. Go to the Index Patterns page and click Create index pattern in the top-right corner. You will see the list of existing index patterns there.
Add the required patterns for your indices, then, from the left-side menu, click Discover and check your logs :)
Query Methods :
Elasticsearch can also be queried directly against any index.
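For example, a direct search over the REST API might look like the sketch below (the service address, index pattern, and field name are assumptions that depend on how your log shipper names things):

```bash
# Fetch the 10 most recent documents matching a namespace from "logstash-*" indices
curl -s "http://elasticsearch.logging.svc:9200/logstash-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
        "query": { "match": { "kubernetes.namespace_name": "app" } },
        "size": 10,
        "sort": [ { "@timestamp": "desc" } ]
      }'
```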
Graylog Stack
Introduction :
Graylog is a leading centralised log management solution, built to open standards for capturing, storing, and enabling real-time analysis of terabytes of machine data. It supports a master-slave architecture. The Graylog stack consists of Graylog v3 and Elasticsearch v6, along with MongoDB v3.
Graylog is an open-source log management tool, using Elasticsearch as its storage. Unlike the ELK stack, which is built from individual components (Elasticsearch, Logstash, Kibana), Graylog is built as a complete package that can do everything.
Pros :
One package with all the essentials of log processing: collect, parse, buffer, index, search, and analyze
Additional features that you don’t get with the open-source ELK stack, such as role-based access control and alerts
Fits the needs of most centralized log management use-cases in one package
Easily scale both the storage (Elasticsearch) and the ingestion pipeline
Graylog's extractors make it possible to pull fields out of log messages using a variety of methods, such as Grok expressions, regex, and JSON.
Cons :
Visualization capabilities are limited, at least compared to ELK’s Kibana
You can’t use the whole ELK ecosystem, because tools would not directly access the Elasticsearch API; Graylog exposes its own API instead.
It is not implemented for Kubernetes directly; rather, it supports logging via Fluent Bit/Logstash/Fluentd.
Configuration Options :
Graylog is very flexible in that it supports multiple inputs (data sources), among which we can mention:
GELF TCP.
GELF Kafka.
AWS Logs.
as well as outputs (how Graylog nodes forward messages), among which we can mention:
GELF Output.
STDOUT.
Connecting an External Graylog Stack:
Configure a GELF TCP input with the Graylog host and IP (port 12201) to push logs to the Graylog stack directly.
Logs can then be queried via the HTTP / REST API, using parameters such as:
range=3600 - replace 3600 with the time range (in seconds)
limit=100 - replace 100 with the number of returned results
sort=timestamp:desc - replace timestamp:desc with the field you want to sort by
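Two quick sketches tie this together: pushing a message into a GELF TCP input, and querying the REST API with the parameters above (hostnames, credentials, and the query string are assumptions):

```bash
# Push a single GELF message to a TCP input on port 12201
# (GELF TCP frames are null-terminated JSON)
echo -n -e '{"version":"1.1","host":"demo","short_message":"hello graylog"}\0' \
  | nc -w 1 graylog.example.com 12201

# Search the last hour of logs for "error" via the REST API
curl -u admin:password -H 'Accept: application/json' \
  "http://graylog.example.com:9000/api/search/universal/relative?query=error&range=3600&limit=100&sort=timestamp%3Adesc"
```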
Using the Dashboard:
One can easily navigate the filter section and perform searches with the help of labels generated by the log collectors.
Splunk Stack
Introduction
Splunk is used for monitoring and searching through big data. It indexes and correlates information in a searchable container, and makes it possible to generate alerts, reports, and visualisations.
Configuration Options
1. Helm-based installation as well as Operator-based installation is supported.
2. Splunk Connect for Kubernetes provides a way to import and search your Kubernetes logging, object, and metrics data in your Splunk platform deployment. It supports importing and searching your container logs on ECS, EKS, AKS, GKE, and OpenShift.
3. Splunk Connect for Kubernetes supports installation using Helm (a sketch follows this list).
4. Splunk Connect for Kubernetes deploys a DaemonSet on each node; in the DaemonSet, a Fluentd container runs and does the collecting job. It collects three types of data: logs, objects, and metrics.
5. We need a minimum of two Splunk platform indexes:
One events index, which will handle logs and objects (you may also create two separate indexes for logs and objects).
One metrics index. If you do not configure these indexes, Splunk Connect for Kubernetes uses the defaults created in your HTTP Event Collector (HEC) token.
6. An HEC token will be required before moving on to the installation.
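A minimal sketch of the Helm install (repo URL, chart name, and value keys follow the public splunk-connect-for-kubernetes chart; the HEC host and token are placeholders):

```bash
# Add the Splunk Connect for Kubernetes chart repository
helm repo add splunk https://splunk.github.io/splunk-connect-for-kubernetes/
helm repo update

# Install, pointing the connector at your Splunk HEC endpoint
helm install splunk-connect splunk/splunk-connect-for-kubernetes \
  --set global.splunk.hec.host=splunk.example.com \
  --set global.splunk.hec.port=8088 \
  --set global.splunk.hec.token=REPLACE_WITH_HEC_TOKEN
```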
Some of the other tools worth covering are not open source, but they are too good not to talk about, and they offer end-to-end functionality for all your logging needs:
Sumo Logic :
This log management tool can store logs as well as metrics. It has a powerful search syntax, where you can define operations similarly to UNIX pipes.
Pros :
Powerful query language
Capability to detect common log patterns and trends
Centralized management of agents
Supports Log Archival & Retention
Ability to perform Audit Trails and Compliance Tracking
Configuration Options :
A subscription to Sumo Logic will be required
Helm installation
Provides options to install side by side with an existing Prometheus Operator (see the sketch below)
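A minimal sketch of that Helm install (repo URL and value keys follow the public sumologic-kubernetes-collection chart; the access ID/key and cluster name are placeholders):

```bash
# Add the Sumo Logic collection chart repository
helm repo add sumologic https://sumologic.github.io/sumologic-kubernetes-collection
helm repo update

# Install the collection, authenticated with your Sumo Logic credentials
helm install collection sumologic/sumologic \
  --namespace sumologic --create-namespace \
  --set sumologic.accessId=SUMO_ACCESS_ID \
  --set sumologic.accessKey=SUMO_ACCESS_KEY \
  --set sumologic.clusterName=my-cluster
```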
Cons :
Performance can be bad for searches over large data sets or long timeframes.
Deployment is only available as cloud/SaaS (web-based).
Expensive - Pricing is per ingested byte, so it forces you to pick and choose what you log, rather than ingesting everything and figuring it out later
Datadog:
Datadog is a SaaS that started up as a monitoring (APM) tool and later added log management capabilities as well.
You can send logs via HTTP(S) or syslog, either through existing log shippers (rsyslog, syslog-ng, Logstash, etc.) or through Datadog’s own agent. With Live Tail, you can observe your logs in real time without indexing them. You can also ingest all of the logs from your applications and infrastructure, decide dynamically what to index with filters, and then store them in an archive.
It features Logging without Limits™, which is a double-edged sword: costs are harder to predict and manage, but you get pay-as-you-use pricing combined with the ability to archive logs and restore them from the archive later.
Pros :
Log processing pipelines can process millions of logs per minute, or petabytes per month, seamlessly.
Automatically detects common log patterns
Can archive logs to AWS/Azure/Google Cloud storage and rehydrate them later
Easy search with good autocomplete (based on facets)
Integration with Datadog metrics and traces
Affordable, especially for short retention and/or if you rely on the archive for a few searches going back
Configuration options :
Datadog Agent installation is done using Helm (see the sketch below). A Datadog account will be required to get an API key and an app key. Installation can get a bit complex; you can find more information at https://docs.datadoghq.com/getting_started/agent/
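A minimal sketch of the Agent install with log collection turned on (chart and value keys follow the public datadog/datadog chart; the API key is a placeholder):

```bash
# Add the Datadog chart repository
helm repo add datadog https://helm.datadoghq.com
helm repo update

# Install the Agent as a DaemonSet and enable log collection for all containers
helm install datadog-agent datadog/datadog \
  --set datadog.apiKey=REPLACE_WITH_API_KEY \
  --set datadog.logs.enabled=true \
  --set datadog.logs.containerCollectAll=true
```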
Cons :
It is a bit complicated to set up the first time, and it is not easy at first to discover all the features Datadog offers. The interface is tricky and can be a hindrance at times. Additionally, if application fields are not mapped correctly, filters are not that useful.
Datadog’s per-host pricing can be very expensive.
Conclusion :
As one can see, each tool has its own benefits and downsides. Grafana’s Loki is more lightweight than the Elastic Stack in overall performance, while still supporting persistent storage options.
That being said, the right solution platform really depends on each administrator's needs.
That’s all! Thank you.
If you enjoyed this article, please like it.
Feel free to drop a comment too.