29 January 2020
Posted by: Alexander Goida
Deployment of a StatsD-Grafana bundle to Docker and Kubernetes

At the beginning of my study of Docker and Kubernetes, I experienced a lack of simple and clear examples which I could “play around” with while studying the features. I would like to close this gap with this article. Here I’ll talk about an easy way to deploy StatsD, InfluxDB and Grafana to your local environment using Docker and Kubernetes. I’ll focus more on getting a result than on full theoretical coverage of the material. I’ll also cover, though not deeply, how to send StatsD metrics from a .NET Core application. All examples in the article are intended for those who are beginning to study this topic, but it’s desirable to understand the basic concepts of Docker and Kubernetes in order to fully understand the article.

A note about containerization

The examples in this article were run on the Kubernetes distribution that ships with Docker for Windows. Note that you won’t be able to install it on Windows Home edition; with Windows Professional there should be no problems. There are a couple of tips for installing Kubernetes on Windows. Everything should also work fine on Linux; it was tested on Ubuntu 18.04 LTS.

Real Example

First of all, let’s see in action what I’ll talk about in this article. In this example you’ll start two applications: one sends requests to the other, which performs a heavy CPU-bound task; both send StatsD metrics, which you’ll see in Grafana.

Before performing the steps described here, make sure you have Kubernetes installed on your machine. Just follow the instructions below and you should see some results, though some configuration you’ll need to do on your own.

Now you can load the URL http://localhost:3003 in your browser and use the credentials root/root to access the Grafana interface. Statistics are already being captured, and you can try to add your own dashboard. After playing with the dashboard configuration you can get something like this.

When you decide to stop and clear the resources, just close the two windows with the loader and worker processes. The following commands will clear the objects in Kubernetes:
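A sketch of the cleanup, assuming the object names used in this walkthrough (a deployment named stats and the services exposed from it; adjust the names if you chose others):

```shell
# Remove the services and the deployment created for the bundle
kubectl delete service stats stats-udp
kubectl delete deployment stats
```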

Now let’s talk about what has happened.

You deployed StatsD, InfluxDB and Grafana locally to Kubernetes

My recent discovery is that DockerHub has a lot of useful, already prepared bundles. The one I’ll talk about here is one of them. You can find the GitHub repo with the image here. This image contains InfluxDB, Telegraf (StatsD) and Grafana, already configured to work together. There are two common ways to deploy images: #1 using docker-compose and #2 using Kubernetes. Both ways let you deploy several images at once and control network parameters, such as port mappings. I’ll cover docker-compose in short and tell more about deployment to Kubernetes. Recently it became possible to deploy docker-compose files to Kubernetes, so this would be the most beneficial.

Deployment with Docker-Compose

Docker-compose is distributed with Docker for Windows, but you need to check which file version you can use in your YAML. You need to know the version of your installed Docker and check the compatibility matrix on this page. I have Docker version 19.03.5 installed, so I could use file version 3.x, but I’ll use version 2 for compatibility reasons. All the information we need is already described on the bundle’s page: the image name and the ports.
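A minimal compose file for the bundle may look as follows. The image name and the port list should be taken from the bundle’s page; here I use samuelebistoletti/docker-statsd-influxdb-grafana as a stand-in reference, and only a few of the bundle’s ports:

```yaml
version: '2'
services:
  stats:
    image: samuelebistoletti/docker-statsd-influxdb-grafana
    ports:
      - "3003:3003"       # Grafana UI
      - "8086:8086"       # InfluxDB HTTP API
      - "8125:8125/udp"   # StatsD (Telegraf) input, UDP
```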

In the ports section I’m exposing ports from the container to host ports. If I don’t do this, I won’t be able to access resources in the container from the host system, because they will be visible only inside Docker. You can find more about port mapping here. Now I’ll deploy the system to Docker using docker-compose. I’ll use additional parameters to specify the file explicitly and to run the container in detached mode so it runs in the background. By default, docker-compose looks for the file docker-compose.yaml in the current folder, so you can run it with minimal parameters as docker-compose up. This will deploy the container to Docker. Deployment to Kubernetes is described in the next section.
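Assuming the compose file is saved as statsd-bundle.yml (a file name I chose for this example), the explicit form of the command is:

```shell
# -f points at the compose file explicitly, -d runs it detached
docker-compose -f statsd-bundle.yml up -d

# or, with the default file name docker-compose.yaml in the current folder:
docker-compose up -d
```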

Deployment to Kubernetes

The deployment to Kubernetes looks a bit more complicated at first glance, since you need to define deployments, services and other objects. I found a small hack which saved me some time writing YAML files for Kubernetes: first I deploy everything with the minimum required configuration to the cluster using the kubectl utility, and then I extract the objects as YAML configuration and tweak them to my needs.

Note about private Kubernetes on VMs in GCP

I was trying to use a Kubernetes cluster deployed on Compute Engine VMs in GCP and faced a problem with deploying a LoadBalancer service: it stays in the pending state and doesn’t acquire an external IP address. This prevents access to the service from the internet even if you configure your network. There is a solution for this, which requires deploying an ingress service and using NodePort, as per this answer on Stack Overflow.

Deployment with kubectl

So let’s create a deployment from the image. The name stats is the name I gave to this deployment object; you can use another name.
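A sketch of the command, assuming the bundle image mentioned earlier (substitute the image name from the bundle’s page):

```shell
kubectl create deployment stats --image=samuelebistoletti/docker-statsd-influxdb-grafana
```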

This command creates a deployment, which in its turn creates a Pod and a ReplicaSet in k8s. In order to access Grafana I need to create a service and expose ports. This may be done using a NodePort or a LoadBalancer service. In most cases you’ll be creating LoadBalancer services.
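For example, exposing the Grafana port of the stats deployment as a LoadBalancer service:

```shell
kubectl expose deployment stats --type=LoadBalancer --port=3003 --target-port=3003
```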

This command also maps the service port 3003 (--port) to the TCP port in the container (--target-port). Now you should be able to access Grafana using the URL http://localhost:3003. You can check the created objects in k8s with this command:
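```shell
kubectl get all
```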

You should see the deployment, its ReplicaSet and Pod, and the service listed in the output.

Extracting YAML configuration

At this moment the deployment is not exactly what I need, but I can use it as a draft for my real one. Extract the YAML configuration:
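For example, pulling the live definitions of both objects into a file (the file name stats.yaml is my choice):

```shell
# -o yaml prints the current object definitions; redirect them to a file
kubectl get deployment,service stats -o yaml > stats.yaml
```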

The exported file will have the definitions of the deployment and the service with their current configuration. You don’t need some of the settings, so I’ll remove the unnecessary ones and add the port mappings. The final minimalist version may look as follows:
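A sketch of such a minimal manifest; the names, labels and image reference are the ones I assumed for this walkthrough, so adjust them to the bundle you deploy:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stats
spec:
  replicas: 1
  selector:
    matchLabels:
      app: stats
  template:
    metadata:
      labels:
        app: stats
    spec:
      containers:
        - name: stats
          image: samuelebistoletti/docker-statsd-influxdb-grafana
          ports:
            - containerPort: 3003          # Grafana UI
            - containerPort: 8125          # StatsD input
              protocol: UDP
---
apiVersion: v1
kind: Service
metadata:
  name: stats
spec:
  type: LoadBalancer
  selector:
    app: stats
  ports:
    - name: grafana
      port: 3003
      targetPort: 3003
```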

Note about mixing TCP/UDP protocols

You won’t be able to create a single service of type LoadBalancer which supports both TCP and UDP protocols. This is a known limitation, and the community is trying to find a solution. Meanwhile, you can create two separate services, one for each protocol type.
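For example, a second, UDP-only service for the StatsD port can sit next to the TCP service for Grafana (the name stats-udp and the app: stats selector are my choices for this walkthrough):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: stats-udp
spec:
  type: LoadBalancer
  selector:
    app: stats
  ports:
    - name: statsd
      port: 8125
      targetPort: 8125
      protocol: UDP
```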

Before applying the new file to Kubernetes, clear the existing resources and then use the command kubectl apply.
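Assuming the manifest was saved as stats.yaml and the objects carry the names used in this walkthrough:

```shell
# Remove the draft objects, then apply the edited manifest
kubectl delete deployment stats
kubectl delete service stats
kubectl apply -f stats.yaml
```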

You’ve just deployed the bundle to Kubernetes and can access Grafana using the same URL stated above, but now it can also receive StatsD metrics. In the next section I’ll explain a bit about the metrics and how they are sent from the application.

StatsD protocol

The StatsD protocol is very simple, and you can even build your own client library if you really need to. Below you’ll find a summary, and here you can read more about StatsD datagrams. StatsD supports metrics such as counters, timings and gauges.

Counters are used to count the number of occurrences of some event. Usually you just always increment some bucket (aka a StatsD variable), and the underlying mechanism does all the math for you. Later, when it’s integrated with Grafana, you’ll see the number per second, per minute and so on.

Timing is used to measure the duration of some process. For example, this metric is ideally suited to measuring the duration of a web request.

A gauge is used to take a snapshot of some resource state, for example, available memory or threads.
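The datagrams themselves are plain text in the form bucket:value|type, sent over UDP. A few examples of the three metric types described above, with made-up bucket names (here just printed rather than sent):

```shell
# StatsD datagrams: <bucket>:<value>|<type>
counter='myapp.requests:1|c'        # counter: increment the bucket by 1
timing='myapp.request_time:355|ms'  # timing: a 355 ms sample
gauge='myapp.free_threads:42|g'     # gauge: snapshot of a resource state
printf '%s\n' "$counter" "$timing" "$gauge"
```

In practice a client library serializes these strings for you and sends them to the StatsD UDP port (8125 in the bundle used here).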

Metrics in .NET Core Service

You’ll need the NuGet package JustEat.StatsD. Its description on GitHub is complete and simple, so just follow it to make your own configuration and registration in the IoC container.
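A sketch of the registration, following the package’s README; the host name and the metric prefix are placeholders for your environment, and the exact overloads are described in the README:

```csharp
// In Startup.ConfigureServices — registers IStatsDPublisher in the IoC container.
public void ConfigureServices(IServiceCollection services)
{
    // "localhost" and "myapp" are placeholders; StatsD listens on UDP 8125 by default
    services.AddStatsD("localhost", "myapp");
    services.AddControllers();
}
```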

For the purpose of the example, let’s take an API where some method, when executed, enqueues a work item to the ThreadPool. The logic of the API allows only a certain number of executions in parallel. And let’s say you want to know the following about your service:

  • How many requests are coming in?
  • How many requests are waiting for the ThreadPool to give them a thread?
  • How long does the operation take?
  • How fast is the service exhausted?

This is how the metrics capturing may look in the code:
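A sketch of such a handler, assuming the IStatsDPublisher interface from JustEat.StatsD; the bucket names, the semaphore limit and the worker logic are made up for illustration, so check the method signatures against the package’s README:

```csharp
public class Worker
{
    // Allow at most 4 parallel executions (made-up limit)
    private static readonly SemaphoreSlim Semaphore = new SemaphoreSlim(4);
    private readonly IStatsDPublisher _stats;

    public Worker(IStatsDPublisher stats) => _stats = stats;

    public async Task HandleRequestAsync()
    {
        _stats.Increment("requests.incoming");                      // incoming requests
        _stats.Gauge(Semaphore.CurrentCount, "threads.available");  // how exhausted we are

        _stats.Increment("requests.waiting");
        await Semaphore.WaitAsync();            // wait until a parallel slot frees up
        _stats.Decrement("requests.waiting");

        try
        {
            // Time the heavy CPU-bound work
            await _stats.Time("work.duration", () => DoHeavyWorkAsync());
        }
        finally
        {
            Semaphore.Release();
        }
    }

    private Task DoHeavyWorkAsync() => Task.Run(() => { /* CPU-bound work here */ });
}
```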

In the code, the number of simultaneous calculations is controlled by the SemaphoreSlim class. If the number of parallel executions exceeds the maximum, it stops execution and waits until some other execution finishes.
