Streaming Telemetry & Grafana

Intro

Today I’ll give you a quick overview of Streaming Telemetry. After that overview, we’ll look at how to install InfluxDB and Grafana to get started with streaming telemetry. It’s just a short introduction to a very deep topic; on InfluxDB or Grafana alone you can read tons of material.

Streaming Telemetry

To export data or logs from a system, we have different protocols. In the old days we used SNMP and Syslog. Syslog is for logging only. SNMP supports configuration and logging; for logging we can poll the system or receive traps. Both protocols have problems like scaling and retransmission. With the C9800, Cisco uses Streaming Telemetry, which uses the gRPC protocol.

gRPC is a modern open source high performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking and authentication. It is also applicable in last mile of distributed computing to connect devices, mobile applications and browsers to backend services.

https://grpc.io/

The gRPC Dial-Out function pushes information via TCP/HTTP2. This is a very fast, efficient and secure way to provide information. The controller can push real-time information in very short intervals, which isn’t possible with SNMP or Syslog. In addition, the data is model based (YANG).

YANG Model

The data pushed by the C9800 is based on YANG models. YANG is a data modeling language that defines the data structure. With OpenConfig there is a vendor-neutral way to model data. In addition to the OpenConfig YANG models, Cisco provides a lot of vendor-specific data models.

Why use big data?

Streaming Telemetry is a way to export tons of data. That data needs to be saved and analysed. Why should we do this?
There are a lot of reasons to analyse data, and also a lot of ways. If you look at the vendors, each vendor has its own tools: AirWave, DNA-C, NSight. All of these tools collect and analyse data. Each vendor has its own way to work with the data, and most vendors show some nice colors. But most vendors don’t tell you what you are actually looking at. On top of that, they decide what might be interesting for you.
If you have some knowledge about wireless networks, you may want to take a closer look at the metrics. Maybe the vendors don’t show you the data you need. This is the point where you can start with big data. In my case, I use Grafana and InfluxDB as a solution.

Depending on the vendor, you can export some or tons of data, at short or long time intervals. With Streaming Telemetry, for example, you can export data at intervals of a few seconds. That puts you in a position to query the signal strength of all clients every second and write it to a database (InfluxDB). With a visualisation tool (Grafana) you can then analyse and visualise that data.
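Once that data is in InfluxDB, such a per-client query could look like the following sketch. The measurement, field and tag names (client_stats, rssi, client_mac) are invented for illustration; the real names depend on how you map the data later:

```sql
-- Hypothetical schema: measurement "client_stats" with field "rssi" and tag "client_mac"
SELECT mean("rssi")
FROM "client_stats"
WHERE time > now() - 1h
GROUP BY time(1s), "client_mac"
```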

Let me give you an example. You are running a warehouse and have some trouble with the Wi-Fi signal. Now you change the TX settings on some access points. How do you validate that this solved the problem? You can do a validation survey or talk to the workers. But talking only gives you assumptions. The validation survey shows real data, but is it what your devices see? You need to compare it. A validation survey is essential, but in a live system you can also compare other metrics. One metric, for example, is the client data rate: how was that rate before and after the change?

What if your problem was a firmware bug on the device? You change the driver on 10% or 20% of the devices and request feedback. What if you could monitor metrics like data rate, roaming time and RSSI at a one-second interval? Then you can compare the data from last week with the data from this week. Same devices, same workers, different firmware version. That data shows you real Wi-Fi-relevant information and not assumptions from a worker.

  • You are introducing a new generation of devices: how do they perform compared to the old devices?
  • You make changes to your system, configuration or firmware: how do those changes affect your infrastructure?
  • You want to compare backbone throughput with Wi-Fi throughput over time, across different vendors?

There are a lot of examples where big data analysis can be very useful. This tutorial should give you an easy way to start with big data.

During the installation you’ll get a quick intro to all the tools used, like Grafana and InfluxDB.

Install your Logging System

I use Ubuntu 18.04 with a static IP as the logging system. It’s a clean install with SSH enabled. Nothing more is needed.

Pipeline

What is Pipeline?

I’d describe Pipeline as middleware between the C9800 and InfluxDB. It receives gRPC data from the C9800 and converts/writes it into InfluxDB.

Install Pipeline

Pipeline is available on GitHub and can easily be cloned:

git clone https://github.com/cisco/bigmuddy-network-telemetry-pipeline.git

Change into the cloned directory and create a new config file:

cd bigmuddy-network-telemetry-pipeline/
nano C9800.conf

I use the following config as a sample:

[default]
id = pipeline

[gRPCDialout]
stage = xport_input
type = grpc
encap = gpb
listen = :58000
tls = false
logdata = on

[inspector]
stage = xport_output
type = tap
file = dump_script.json
encoding = json_events
datachanneldepth = 1000
countonly = false

[metrics_influx]
stage = xport_output
type = metrics                      # file type for parsing
file = /home/timo/bigmuddy-network-telemetry-pipeline/metrics.json    # config path
datachanneldepth = 10000            # optional, specify a buffer for the data
output = influx                     # InfluxDB output
influx = http://<Server IP>:8086    # InfluxDB URL
database = mdt_db                   # InfluxDB database
dump = metricsdump.txt              # local InfluxDB dump file (remove after testing)
workers = 15

A description of the different options can be found in the default configuration file, pipeline.conf.

Pipeline includes a default metrics file, which I move aside to create my own metrics.json. This file describes how Pipeline modifies the data and forwards it into InfluxDB.

mv metrics.json metrics.json.backup
touch metrics.json

We’ll add the content for the metrics file later. For a first check, we can now start Pipeline in debug mode:

./bin/pipeline -config=C9800.conf -log= -debug

At the end you should see that it starts but prints two error messages, as we haven’t set up our metrics.json yet.

With CTRL + C you can terminate Pipeline for now.

InfluxDB

What is InfluxDB?

InfluxDB is an open source time series database. It is optimized for large amounts of time series data, which is exactly what we need to monitor a huge amount of data over time.

Install InfluxDB

The following commands are needed to install InfluxDB:

curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt-get update
sudo apt-get install influxdb

To start InfluxDB and check the status use the following commands:

sudo service influxdb start
sudo service influxdb status

Now we need to connect to the local DB and create the database:

influx
precision rfc3339
CREATE DATABASE mdt_db
SHOW DATABASES
exit
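Later, once Pipeline writes data, you can check the content from the same influx shell. A quick sketch; the measurement name is an assumption based on the basepath we configure later in metrics.json:

```sql
USE mdt_db
SHOW MEASUREMENTS
SELECT * FROM "Cisco-IOS-XE-wireless-client-oper:client-oper-data/traffic-stats" LIMIT 5
```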

Grafana

What is Grafana?

Grafana accesses the data in InfluxDB and presents it. You can easily create anything from simple to complex dashboards. In addition, there is an easy way to add an alarm with push notification per dashboard/panel.

The push notification can be sent, for example, via good old e-mail or in a modern way like a Slack message including a screenshot of the latest data.

Install Grafana

The following commands are needed to install Grafana:

curl https://packages.grafana.com/gpg.key | sudo apt-key add -
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt-get update
sudo apt-get install grafana

To start Grafana and check the status use the following commands:

sudo systemctl start grafana-server
sudo systemctl status grafana-server

Enable autostart with this command:

sudo systemctl enable grafana-server

Access Grafana

Grafana can be accessed via http://<IP-address>:3000/

The default login and password are:

User: admin
Pass: admin

After the first login you need to change the password.

Configure your Logging System

We now have all components up and running. The next step is to configure the C9800 to send streaming telemetry and Pipeline to store the needed data in InfluxDB. Finally, we create a dashboard inside Grafana to view the data.

Get XPath

To push information from the C9800 to Pipeline/InfluxDB we need XPaths. We can get the XPath from the YANG model. Cisco provides the YANG Explorer, which makes it very easy to see the XPath for each parameter we require.

For a first example, we’ll push information about the traffic statistics of wireless clients.

How to find the XPath? I searched for “clients” and found this entry:

Cisco-IOS-XE-wireless-client-oper -> client-oper-data -> traffic-stats

That sounds interesting to me. There are a lot of options and data you can export. For me it’s mostly about clicking around or searching for what could be interesting.

I select the XPath; it includes all elements like “ms-mac-address”, “bytes-rx” and “bytes-tx”…

Configuration C9800

For each XPath we need to add a subscription on our C9800. The configuration is self-explanatory; note that the periodic update interval is given in centiseconds, so 500 means an update every 5 seconds.

telemetry ietf subscription 300
    encoding encode-kvgpb
    filter xpath /wireless-client-oper:client-oper-data/traffic-stats
    source-address 10.10.10.10
    stream yang-push
    update-policy periodic 500
    receiver ip address 10.10.10.18 58000 protocol grpc-tcp
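To verify the subscription on the C9800 side, IOS-XE provides some show commands. A sketch of what I would check:

```text
show telemetry ietf subscription 300 detail
show telemetry ietf subscription 300 receiver
```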

Configure Pipeline

After pushing information from the C9800 to Pipeline, we need to tell Pipeline which information to translate into InfluxDB.

nano /home/timo/bigmuddy-network-telemetry-pipeline/metrics.json

We get the basepath with YANG Explorer. The fields define what information we write into the database. All other data we receive from the controller will be dropped.

[
    {
        "basepath" : "Cisco-IOS-XE-wireless-client-oper:client-oper-data/traffic-stats",
        "spec" : {
            "fields" : [
                {"name":"name", "tag": true},
                {"name":"pkts-rx"},
                {"name":"pkts-tx"}
            ]
        }
    }
]

Start Pipeline

If we now start Pipeline and load our metrics.json via the configuration, you should see the gRPC session:

./bin/pipeline -config=C9800.conf -log= -debug

During the start it asks for a user and password; it’s admin/admin for InfluxDB, not your Grafana password!

Open a second SSH session to your server to check whether Pipeline receives data and what data it receives. You can check that with the following command:

cd bigmuddy-network-telemetry-pipeline/
tail -f dump_script.json
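The inspector tap writes one JSON event per message. If the file grows quickly, it can help to pull out single fields instead of reading the raw dump. A small sketch; the sample line here is fabricated for illustration, real events from Pipeline carry more fields:

```shell
# Fabricated sample event, shaped like a json_events entry (for illustration only)
echo '{"Source":"10.10.10.10:57344","Telemetry":{"encoding_path":"Cisco-IOS-XE-wireless-client-oper:client-oper-data/traffic-stats"}}' > sample_event.json

# Extract a single field with python3 (no extra tools needed on Ubuntu)
python3 -c 'import json; print(json.load(open("sample_event.json"))["Telemetry"]["encoding_path"])'
```

The same one-liner works on the real dump_script.json if you feed it one event at a time.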

Configure Grafana

We now have the basics configured: the C9800 pushes data to Pipeline, and Pipeline writes data to InfluxDB. Now it’s time to configure Grafana to access and present the data from InfluxDB.

Add datasource to Grafana

During our first login we changed the password for the admin user. Please log in now with the new password to add a datasource. The datasource is our InfluxDB, so please add it.

You can get the details of what and how to configure from the following screenshots. It’s straightforward.

Create our first Grafana dashboard

The Grafana dashboard is what we use in our daily business. It should include all information we need at first sight. The dashboard consists of panels of different sizes. As we only include client RX/TX packets for now, we can create one panel to show both values, or one panel for RX and one for TX.

As it’s our first panel, we’ll create one that includes both values. If you use a demo C9800, it’s important to connect a client first and generate some traffic.

You can select all information via drop-downs. If your drop-down shows no information, the database is empty. Generate some data and check that Pipeline is working correctly.
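Behind those drop-downs Grafana builds an InfluxQL query. For our traffic stats it could look roughly like this sketch; the measurement name is an assumption based on our basepath, and $timeFilter/$__interval are Grafana template variables:

```sql
SELECT mean("pkts-rx")
FROM "Cisco-IOS-XE-wireless-client-oper:client-oper-data/traffic-stats"
WHERE $timeFilter
GROUP BY time($__interval) fill(null)
```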

We add a second query to show both RX and TX. While making your changes, you can already see the data in the panel at the top. With ESC or the arrow in the top left you can go back.

Now we have our first panel. You can change, for example, the time range in the top right.

To change the panel name, you can click on the drop-down arrow or just hit “e”. Under the General icon we can change the panel name and go back.

For a more detailed view, we create a second panel. The second panel shows the client RX, but per client and not in total. You can add as many views as you need. Just make sure that the C9800 sends the data and Pipeline writes it to InfluxDB.

Summary

With streaming telemetry and Grafana we can create custom dashboards that show what we need in almost real time. It’s a great solution to monitor and analyse what’s important from a customer view, not a vendor view. If you look into the YANG data models, you can find a lot of useful information.

You can log basic information like throughput or client count, but you can also log information like client RSSI and SNR. If you also collect that information on the client side, you are able to compare and validate the information.

I personally like to collect some KPIs from my network before and after changes or updates. This makes it possible to validate your changes. Grafana is a great tool to document and validate those KPIs.

Grafana and InfluxDB are tools that provide a massive amount of functionality. It’s not limited to the easy stuff I show here to get started. It’s also easy to integrate other vendors and devices, even if your device only supports Syslog or SNMP.

In addition to monitoring, you can create alarm rules and forward events to your systems. I use that with a mail and Slack integration. It’s pretty easy and quickly configured.
