Taming troubleshooting of the cloud-native ‘connectivity layer’



Diagnosing the health of connections between modern API-driven applications is a beast. Isovalent and Grafana Labs are working to give platform teams simpler options.

Image: ShpilbergStudios/Adobe Stock

KubeCon — underway this week in Detroit — is always a bellwether of where the pain points still exist around Kubernetes adoption, as platform teams evolve from the so-called “Day 1” challenges to the “Day 2” requirements needed to make K8s infrastructure easier to scale and operate.

A clear focus this year at KubeCon is how platform teams troubleshoot what’s increasingly being called the cloud-native “connectivity layer.” An integration between open source Grafana and Cilium brings heightened observability to this layer.

Working in the dark

“The shift toward building modern applications as a collection of API-driven services has many benefits, but let’s be honest, simplified monitoring and troubleshooting is not one of them,” said Dan Wendlandt, CEO at Isovalent. “In a world where a single click by a user may result in dozens, or even hundreds, of API calls under the hood, any fault, over-capacity or latency in the underlying connectivity can and often will negatively impact application behavior in ways that can be devilishly difficult to detect and root cause.”

SEE: Hiring Kit: Cloud Engineer (TechRepublic Premium)

And these devilish details are many. For one, the container replicas that Kubernetes creates of each service across multi-tenant Linux clusters make it very difficult to pinpoint where connectivity issues occur. Between the application layer and the underlying network, cloud-native connectivity is abstractions on top of abstractions — endless layers to troubleshoot. And because K8s clusters often run hundreds of different services as containerized workloads that are constantly being created and destroyed, there’s a ton of noise and ephemerality to deal with.
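To make that ephemerality concrete, here is a hypothetical illustration (the "checkout" deployment name is invented) of what a platform engineer sees when listing the replicas behind a single service:

    # List the replicas of one (hypothetical) service across the cluster.
    # Each pod gets its own IP, and every restart or reschedule assigns a
    # new one, so an IP captured in yesterday's logs may point at nothing today.
    kubectl get pods -l app=checkout -o wide

    # Watch the churn directly: replicas are created and destroyed constantly.
    kubectl get pods -l app=checkout --watch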

It’s a very different architecture than legacy VM environments, where direct access to low-level network counters and tools like netstat and tcpdump was once common fare for troubleshooting connectivity, and where IPs were instructive about the sources and destinations of connections.

“In the ‘olden days’ of static applications, servers ran as physical nodes or VMs on dedicated VLANs and subnets, and the IP address or subnet of a workload was often a long-term meaningful way to identify a specific application,” said Wendlandt. “This meant that IP-based network logs or counters could be analyzed to make meaningful statements about the behavior of an application.… Outside the Kubernetes cluster, when application developers use external APIs from cloud providers or other third parties, the IP addresses associated with these destinations often vary from one connection attempt to another, making it hard to interpret using IP-based logs.”
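For contrast, here is a rough sketch of that VM-era workflow. The classic tools still run fine on a Kubernetes node; the problem is that the IPs they print no longer map durably to applications:

    # VM-era connectivity troubleshooting on a single host:
    netstat -s                             # per-protocol counters (retransmits, resets)
    ss -tn state established               # current TCP connections by IP:port
    tcpdump -i eth0 'tcp port 443' -c 20   # sample live traffic on the wire

    # On a multi-tenant Kubernetes node these commands still work, but the
    # source and destination IPs belong to short-lived pods, not to stable,
    # identifiable applications.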

All is not lost, however. Relief may be ahead for platform teams, made possible by eBPF-based Cilium.

Enhancing observability via Cilium and Grafana

Cilium — a CNCF incubating project that’s becoming a de facto container networking interface for all the major cloud service providers’ Kubernetes engines — builds on top of eBPF’s ability to inject kernel-level observability into a new connectivity layer.

“Cilium leverages eBPF to ensure that all connectivity observability data is associated not only with the IP addresses, but also with the higher-level service identity of applications on both sides of a network connection,” said Wendlandt. “Because eBPF operates at the Linux kernel layer, this added observability does not require any changes to applications themselves or the use of heavyweight and complex sidecar proxies. Instead, Cilium inserts transparently beneath existing workloads, scaling horizontally within a Kubernetes cluster as it grows.”
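In practice, that data is surfaced through Hubble, Cilium’s observability component. As a hedged sketch (the namespace and pod names are invented for illustration), querying flows by workload identity rather than by IP looks roughly like this:

    # Ask Hubble for recent flows to a service, identified by namespace and
    # pod labels rather than by ephemeral pod IPs.
    hubble observe --namespace shop --to-pod shop/checkout

    # Narrow to faults: flows the datapath dropped, plus the HTTP-level
    # detail eBPF captures transparently (no sidecars, no app changes).
    hubble observe --verdict DROPPED
    hubble observe --protocol http --namespace shop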

Today at KubeCon, Grafana Labs and Isovalent, the company whose founders include the creator of Cilium and the eBPF Linux kernel maintainer, announced a new Cilium-Grafana integration. Integrating Cilium into the Grafana stack means platform teams that want a consistent observability experience for service connectivity across their Kubernetes environments can start using the same Grafana visualization tools to roll up logging, tracing and metrics across the cloud-native connectivity layer.
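The announcement doesn’t spell out the exact wiring, but the typical path, an assumption on my part rather than a description of the new integration itself, is to have Hubble export Prometheus metrics that Grafana then visualizes:

    # Enable Hubble and its Prometheus metrics on an existing Cilium install.
    # The metric groups (dns, drop, tcp, flow, http) are standard Hubble
    # options; the release name and namespace here are illustrative.
    helm upgrade cilium cilium/cilium --namespace kube-system \
      --set hubble.enabled=true \
      --set hubble.metrics.enabled="{dns,drop,tcp,flow,http}"

    # Scrape the hubble-metrics endpoint (port 9965 by default) with
    # Prometheus, then add that Prometheus as a Grafana datasource and
    # build or import Cilium/Hubble dashboards on top of it.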

The integration of the two open source technologies marks the start of the joint engineering initiatives launched after Grafana Labs’ strategic investment in Isovalent’s Series B funding round last month.

I previously argued that “observability” seems to have risen as the cool new term for much the same metrics, logs and traces we’d been analyzing long before the term was coined. But clearly this cloud-native connectivity challenge is an especially confounding problem area for platform teams to troubleshoot, and with this new eBPF-driven, kernel-level data being exposed as a consistent connectivity datasource, there appears to be a very high ceiling for the new observability use cases being discussed at KubeCon this week.

Disclosure: I work for MongoDB, but the views expressed herein are mine.
