
How To Get Started With Logging On Kubernetes?

Shruti Anekar

Cloud & DevOps

In distributed systems like Kubernetes, logging is critical for monitoring and providing observability and insight into an application's operations. With the ever-increasing complexity of distributed systems and the proliferation of cloud-native solutions, monitoring and observability have become critical components in knowing how the systems are functioning.

Logs don’t lie! They have been one of our greatest companions when investigating a production incident.

How is logging in Kubernetes different?

Log aggregation in Kubernetes differs greatly from logging on traditional servers or virtual machines, owing to the way it manages its applications (pods).

When an app crashes on a virtual machine, its logs remain accessible until they are deleted. When pods are evicted, crashed, deleted, or scheduled on a different node in Kubernetes, the container logs are lost. The system is self-cleaning. As a result, you are left with no knowledge of why the anomaly occurred. Because default logging in Kubernetes is transient, a centralized log management solution is essential.

Kubernetes is highly distributed and dynamic in nature; hence, in production, you’ll most certainly be working with multiple machines that have multiple containers each, which can crash at any time. Kubernetes clusters add to the complexity by introducing new layers that must be monitored, each of which generates its own type of log.

We’ve curated some of the best tools to help you achieve this, alongside a simple guide on how to get started with each of them, as well as a comparison of these tools to match your use case.

PLG Stack

Introduction:

Promtail is an agent that ships the logs from the local system to the Loki cluster.

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It indexes only metadata and doesn’t index the content of the log. This design decision makes it very cost-effective and easy to operate.

Grafana is the visualisation tool that consumes data from the Loki data source.

Loki is like Prometheus, but for logs: it takes a multidimensional, label-based approach to indexing and ships as a single binary that is easy to operate, with no external dependencies. Loki differs from Prometheus by focusing on logs instead of metrics, and by receiving logs via push instead of pull.

Configuration Options:

Installation with Helm chart -

CODE: https://gist.github.com/velotiotech/6b6dbff1396595fc763dfc69a3548100.js
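The embedded gist holds the exact steps; as a rough, illustrative sketch (assuming the grafana/loki-stack chart with the bundled Grafana enabled and a release named loki), the installation and password retrieval could look like this:

# Add the Grafana chart repository and install Loki, Promtail, and Grafana together
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm upgrade --install loki grafana/loki-stack \
  --namespace monitoring --create-namespace \
  --set grafana.enabled=true,promtail.enabled=true

# Retrieve the auto-generated Grafana admin password (secret name follows the "loki" release name)
kubectl get secret loki-grafana -n monitoring \
  -o jsonpath="{.data.admin-password}" | base64 --decode; echo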

Log in with the username “admin” and the password you retrieved earlier.

Query Methods:

Using CLI :

Curl command to fetch logs directly from Loki

CODE: https://gist.github.com/velotiotech/4af229644a02c09146eed6aef1b2a731.js
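The gist contains the author’s exact command; a minimal sketch of such a query against Loki’s HTTP API (the host, port, and label selector are placeholders, and a port-forward to the Loki service is assumed) might be:

# Fetch up to 100 recent log lines for a label selector via the query_range endpoint
curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={namespace="default"}' \
  --data-urlencode 'limit=100'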

Using LogQL :

  • LogQL provides the functionality to filter logs through operators.

For example :

CODE: https://gist.github.com/velotiotech/1c0ce269f61e52a289d521e70ae4f86e.js
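For illustration (the label names and filter strings here are placeholders), typical LogQL filter expressions look like this:

# Keep lines containing "error" but drop those containing "timeout"
{app="nginx", namespace="prod"} |= "error" != "timeout"

# Regex line filter, e.g. matching 5xx HTTP statuses
{job="varlogs"} |~ "status=5\\d\\d"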

  • LogCLI is the command-line interface to Grafana Loki. It facilitates running LogQL queries against a Loki instance.

For example :

CODE: https://gist.github.com/velotiotech/4b5029d3ff735952b66c4cf04d4713f9.js
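As a small, illustrative example (the Loki address and label selector are placeholders):

# Point LogCLI at the Loki endpoint and fetch the last hour of matching lines
export LOKI_ADDR=http://localhost:3100
logcli query '{namespace="prod"} |= "error"' --since=1h --limit=50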

Using Dashboard :

Click the Explore tab on the left side and select Loki from the data source dropdown.

EFK Stack

Introduction :

The Elastic Stack contains most of the tools required for log management:

  • Elasticsearch is an open-source, distributed, RESTful, and scalable search engine. It is a NoSQL database, used primarily to store the logs forwarded by Fluentd and make them retrievable.
  • Log shippers such as Logstash, Fluentd, and Fluent Bit are open-source log collection agents that support multiple data sources and output formats. They can forward logs to solutions like Stackdriver, CloudWatch, Splunk, BigQuery, etc.
  • Kibana is the UI tool for querying, data visualisation, and dashboards. Virtually any type of dashboard can be built in Kibana, and the Kibana Query Language (KQL) is used for querying Elasticsearch data.
  • Fluentd ➖ Deployed as a DaemonSet, as it needs to collect the container logs from all the nodes. It connects to the Elasticsearch service endpoint to forward the logs.
  • Elasticsearch ➖ Deployed as a StatefulSet, as it holds the log data. A service endpoint is also exposed for Fluentd and Kibana to connect to.
  • Kibana ➖ Deployed as a Deployment and connects to the Elasticsearch service endpoint.

Configuration Options :

The stack can be installed through Helm charts, either as a whole or as individual components.

  • Add the Elastic Helm charts repo (a sketch of the repo-add and component install commands follows after this list):

CODE: https://gist.github.com/velotiotech/9bcd6864d75fd4144f8a5c598a967ae1.js

  • More information on deploying these Helm charts can be found in the official Elastic Helm charts documentation.
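For reference, the repo-add and per-component installs might look roughly like the following (the release names and namespace are illustrative; Fluentd itself is not part of the Elastic charts, so a separate chart is typically used):

helm repo add elastic https://helm.elastic.co
helm repo update

# Install Elasticsearch and Kibana as individual components
helm install elasticsearch elastic/elasticsearch -n logging --create-namespace
helm install kibana elastic/kibana -n logging

# Fluentd/Fluent Bit come from a separate repo, e.g. https://fluent.github.io/helm-charts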

After the installation is complete and the Kibana dashboard is accessible, we need to define index patterns to be able to see logs in Kibana.

From the homepage, type “Index Patterns” into the search bar. Go to the Index patterns page and click Create index pattern in the top-right corner. You will see the list of index patterns here.

Add the required patterns for your indices, then click Discover in the left-side menu and check your logs :)

Query Methods :

  • Elasticsearch can be queried directly on any of its indices.
  • For example:

CODE: https://gist.github.com/velotiotech/20adb5d42ed4ee97ee5f8d377edff299.js
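A request of that kind typically targets the search API directly; as a sketch (the index pattern and field name are illustrative and depend on how Fluentd is configured):

# Full-text match on a log field across all matching indices
curl -X GET "http://localhost:9200/logstash-*/_search?pretty" \
  -H 'Content-Type: application/json' \
  -d '{ "query": { "match": { "log": "error" } }, "size": 10 }'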

More information can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html.

Using Dashboard :

Graylog Stack

Introduction

Graylog is a leading centralised log management solution built to open standards for capturing, storing, and enabling real-time analysis of terabytes of machine data. It supports a master-slave architecture. The Graylog stack consists of Graylog v3 and Elasticsearch v6, along with MongoDB v3.

Graylog is an open-source log management tool, using Elasticsearch as its storage. Unlike the ELK stack, which is built from individual components (Elasticsearch, Logstash, Kibana), Graylog is built as a complete package that can do everything.

  • One package with all the essentials of log processing: collect, parse, buffer, index, search, analyze
  • Additional features that you don’t get with the open-source ELK stack, such as role-based access control and alerts
  • Fits the needs of most centralized log management use-cases in one package
  • Easily scale both the storage (Elasticsearch) and the ingestion pipeline
  • Graylog’s extractors allow you to extract fields out of log messages using a number of methods, such as Grok expressions, regex, and JSON

Cons :

  • Visualization capabilities are limited, at least compared to ELK’s Kibana
  • Can’t use the whole ELK ecosystem, because those tools wouldn’t directly access the Elasticsearch API. Instead, Graylog has its own API
  • It is not implemented for Kubernetes distributions directly; rather, it supports logging via Fluent Bit, Logstash, or Fluentd

Configuration Options

Graylog is very flexible in that it supports multiple inputs (data sources), among which we can mention:

  • GELF TCP.
  • GELF Kafka.
  • AWS Logs.

as well as outputs (how Graylog nodes can forward messages), among which we can mention:

  • GELF Output.
  • STDOUT (query via the HTTP / REST API).

Connecting an External Graylog Stack:

  • Configure a GELF TCP input on the Graylog host (port 12201) and point your shipper at that host and IP to push logs to the Graylog stack directly, as shown in the sketch below.
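As a sketch of what that wiring can look like on the shipper side (assuming Fluent Bit’s gelf output plugin; the host name is a placeholder):

[OUTPUT]
    Name                   gelf
    Match                  kube.*
    Host                   graylog.example.com
    Port                   12201
    Mode                   tcp
    Gelf_Short_Message_Key log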

Query Methods

Using CLI:

CODE: https://gist.github.com/velotiotech/82ae5aef0e23a1e3e2b2302a18e0fb2a.js

Where:

  • query=* - replace * with your desired search string
  • range=3600 - replace 3600 with the time range (in seconds)
  • limit=100 - replace 100 with the number of results to return
  • sort=timestamp:desc - replace timestamp:desc with the field you want to sort by
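Putting those parameters together, a request against Graylog’s relative-time search API might look like this (the host, port, and credentials are placeholders):

curl -u admin:yourpassword -H 'Accept: application/json' \
  "http://graylog.example.com:9000/api/search/universal/relative?query=error&range=3600&limit=100&sort=timestamp:desc"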

Using Dashboard:

One can easily navigate the filter section and perform searches with the help of the labels generated by the log collectors.

Splunk Stack

Introduction

Splunk is used for monitoring and searching through big data. It indexes and correlates information in a searchable container and makes it possible to generate alerts, reports, and visualisations.

Configuration Options

1. Both Helm-based and Operator-based installations are supported.

2. Splunk Connect for Kubernetes provides a way to import and search your Kubernetes logging, object, and metrics data in your Splunk platform deployment. Splunk Connect for Kubernetes supports importing and searching your container logs on ECS, EKS, AKS, GKE, and OpenShift.

3. Splunk Connect for Kubernetes supports installation using Helm.

4. Splunk Connect for Kubernetes deploys a DaemonSet on each node, in which a Fluentd container runs and does the collecting. Splunk Connect for Kubernetes collects three types of data: logs, objects, and metrics.

5. We need a minimum of two Splunk platform indexes:

One events index, which will handle logs and objects (you may also create two separate indexes for logs and objects).

One metrics index. If you do not configure these indexes, Splunk Connect for Kubernetes uses the defaults created in your HTTP Event Collector (HEC) token.

6. An HEC token is required before moving on to the installation.

7. To install and configure the defaults with Helm:

Add the Splunk chart repo:

CODE: https://gist.github.com/velotiotech/a8f44f6962298e45e15e450dea12214d.js

Get the values file into your working directory and prepare it:

CODE: https://gist.github.com/velotiotech/bbb1f5bc46dc3bac4d4ce5c535bfd368.js
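The part you will almost always edit is the HEC section; a minimal, illustrative excerpt (keys follow the chart’s global.splunk.hec block, values are placeholders) looks like:

global:
  splunk:
    hec:
      host: splunk.example.com          # your Splunk HEC endpoint
      port: 8088
      token: 00000000-0000-0000-0000-000000000000
      indexName: k8s_logs               # the events index created earlier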

Once you have a values file, you can simply install the chart by running:

CODE: https://gist.github.com/velotiotech/06e4973582ea3769b7efb0d90634f220.js
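The embedded gists cover the exact steps; roughly, the whole flow with Helm looks like this (the release name and values file name are illustrative):

# Add the Splunk Connect for Kubernetes chart repo
helm repo add splunk https://splunk.github.io/splunk-connect-for-kubernetes/
helm repo update

# Pull the default values into a local file, edit it, then install
helm show values splunk/splunk-connect-for-kubernetes > my_values.yaml
helm install my-splunk-connect -f my_values.yaml splunk/splunk-connect-for-kubernetes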

To learn more about using and modifying charts, see: https://github.com/splunk/splunk-connect-for-kubernetes/tree/main/helm-chart

The values file for logging

Query Methods

Using CLI :

CODE: https://gist.github.com/velotiotech/c81c04c1f23c49e24eb02644bba2e409.js
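The gist shows the author’s exact query; as a rough example of searching the same data from the Splunk CLI (the index name and search string are placeholders, and earliest is a standard SPL time modifier):

# Search the Kubernetes events index for recent error messages
splunk search 'index=k8s_logs "error" earliest=-1h'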

Using Dashboard :

Logging Stack

Comparison of Tools

Some other tools are interesting but aren’t open source; they are too good not to talk about, though, and offer end-to-end functionality for all your logging needs:

Sumo Logic :

This log management tool can store logs as well as metrics. It has a powerful search syntax, where you can define operations similarly to UNIX pipes.

  • Powerful query language
  • Capability to detect common log patterns and trends
  • Centralized management of agents
  • Supports Log Archival & Retention
  • Ability to perform Audit Trails and Compliance Tracking

Configuration Options :

  • A subscription to Sumo Logic will be required
  • Helm installation
  • Provides options to install side-by-side existing Prometheus Operator

More information can be found in the Sumo Logic documentation.

Cons :

  • Performance can be bad for searches over large data sets or long timeframes.
  • Deployment is only available as a cloud-based, SaaS / web-based offering
  • Expensive - Pricing is per ingested byte, so it forces you to pick and choose what you log, rather than ingesting everything and figuring it out later

Datadog:

Datadog is a SaaS that started up as a monitoring (APM) tool and later added log management capabilities as well.

You can send logs via HTTP(S) or syslog, either via existing log shippers (rsyslog, syslog-ng, Logstash, etc.) or through Datadog’s own agent. With it, you can observe your logs in real time using the Live Tail, without indexing them. You can also ingest all of the logs from your applications and infrastructure, decide what to index dynamically with filters, and then store them in an archive.

It features Logging without Limits™, which is a double-edged sword: it’s harder to predict and manage costs, but you get pay-as-you-use pricing combined with the fact that you can archive logs and restore them from the archive later.

  • Log processing pipelines have the ability to process millions of logs per minute or petabytes per month seamlessly.
  • Automatically detects common log patterns
  • Can archive logs to AWS/Azure/Google Cloud storage and rehydrate them later
  • Easy search with good autocomplete (based on facets)
  • Integration with Datadog metrics and traces
  • Affordable, especially for short retention and/or if you rely on the archive for a few searches going back

Configuration options :

Using CLI :

CODE: https://gist.github.com/velotiotech/fdf57799eba54752f99c13f64f37e178.js
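The gist carries the exact setup; as a sketch, enabling log collection when installing the Datadog Agent with Helm might look like this (the API key is a placeholder):

helm repo add datadog https://helm.datadoghq.com
helm repo update
helm install datadog-agent datadog/datadog \
  --set datadog.apiKey=<DATADOG_API_KEY> \
  --set datadog.logs.enabled=true \
  --set datadog.logs.containerCollectAll=true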

Cons :

  • Not available on premises
  • It is a bit complicated to set up for the first time, and it is not easy at first to discover all the features Datadog has. The interface is tricky and can be a hindrance at times. On top of that, if application fields are not mapped in the right way, filters are not that useful.
  • Datadog’s per-host pricing can be very expensive.

Conclusion :

As one can see, each tool has its own benefits and downsides. Grafana’s Loki is more lightweight than the Elastic Stack overall, while still supporting persistent storage options.

That being said, the right solution platform really depends on each administrator's needs.

That’s all! Thank you.

If you enjoyed this article, please like it.

Feel free to drop a comment too.


Did you like the blog? If yes, we're sure you'll also like to work with the people who write them - our best-in-class engineering team.

We're looking for talented developers who are passionate about new emerging technologies. If that's you, get in touch with us.

Explore current openings
