Thursday 21 March 2024

Microsoft open sources Retina: A cloud-native container networking observability platform

The Microsoft Azure Container Networking team is excited to announce Retina, a cloud-native container networking observability platform that enables Kubernetes users, admins, and developers to visualize, observe, debug, and analyze Kubernetes workload traffic irrespective of Container Network Interface (CNI), operating system (OS), and cloud. We are excited to release Retina as an open-source repository that helps with DevOps- and SecOps-related networking use cases for your Kubernetes clusters, and we invite the open-source community to innovate along with us.

Embracing and advancing open-source software


Cloud-native technologies like Kubernetes have made it easier to build applications that can run anywhere. At the same time, many applications have become more complex, and managing them in the cloud is increasingly difficult. As companies build cloud-native applications composed of interconnected services and then deploy them to multiple public clouds as well as their private infrastructure, network-related observability, troubleshooting, and debugging have become increasingly difficult.

With the power of extended Berkeley Packet Filter (eBPF), it is now possible to offer actionable network insights, including how containerized microservices interact, and to do so non-intrusively, without any change to the applications themselves. That is exactly what Retina sets out to achieve. Retina will help democratize network observability and troubleshooting by bringing new focus to the experience of application developers. Retina provides developers with simple ways to observe and troubleshoot their applications for issues such as packet drops and latency without worrying about the complexities of the underlying network infrastructure and transformations.

Based on our positive experience in the community with eBPF and Cilium, we are excited to build on this relationship and engage both more closely and with more communities. We believe that by opening Retina to the community, we can benefit from informed feedback, innovative ideas, and collaborative efforts that will help enhance and expand Retina’s capabilities.

Retina solutions and capabilities


Drawing from our extensive experience managing multiple container networking services for the Azure Kubernetes Service (AKS), we identified critical gaps in network monitoring and in the collection of network metrics and traces from Kubernetes clusters. Retina is a cutting-edge solution that closes these gaps and is designed to tackle the complex challenges of managing and supporting Kubernetes networks, providing infrastructure and site-reliability engineers with comprehensive insights into cluster networking. Retina also provides deep traffic analysis with Kubernetes-specific context, exporting the data as either industry-standard Prometheus metrics or network flow logs.

Existing open-source solutions are often tightly coupled with specific CNIs, operating systems, or data planes, which limits their versatility and use. For this reason, Retina has been designed and developed as a highly versatile, adaptable, and extensible framework of plugins capable of working seamlessly with any CNI, OS, or cloud provider, making it a valuable addition to any existing toolset. Retina supports both Linux and Windows data planes, ensuring it meets the diverse needs of infrastructure and site-reliability engineers while maintaining a minimal memory and CPU footprint on the cluster, even at scale. Retina's pluggable design helps us easily extend and adapt it to address new use cases without depending on any specific CNI, OS, or data plane.

Figure 1: Architecture overview of Retina

Among Retina's key features are deep network traffic insights, including Layer 4 (L4) metrics, Domain Name System (DNS) metrics, and distributed packet captures. It integrates with the Kubernetes app model, offering pod-level metrics with detailed context. It emits actionable networking observability data as industry-standard Prometheus metrics: node-level metrics (for example, forward, drop, Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Linux utility) and pod-level metrics (such as basic metrics, DNS, and API server latency).
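
To make the idea of pod-level Prometheus metrics concrete, here is a minimal sketch of how a consumer might total packet drops per pod from text in the Prometheus exposition format. The metric names (example_drop_count, example_forward_count) and the label names (pod, reason) are hypothetical placeholders chosen for illustration, not Retina's documented metric names, and the parser is deliberately simplified: it assumes label values contain no "}" character and ignores timestamps.

```python
import re
from collections import defaultdict

# Hypothetical sample in the Prometheus text exposition format.
# Metric and label names are illustrative assumptions, not Retina's real names.
SAMPLE = """\
# HELP example_drop_count Dropped packets (hypothetical metric)
# TYPE example_drop_count counter
example_drop_count{pod="web-1",reason="policy"} 12
example_drop_count{pod="web-1",reason="conntrack"} 3
example_drop_count{pod="db-0",reason="policy"} 5
example_forward_count{pod="web-1"} 900
"""

# name, optional {labels}, then the sample value (simplified grammar).
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def totals_by_pod(text, metric="example_drop_count"):
    """Sum all samples of `metric`, grouped by the value of the `pod` label."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = LINE_RE.match(line)
        if m is None or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        totals[labels.get("pod", "<unknown>")] += float(m.group("value"))
    return dict(totals)


if __name__ == "__main__":
    for pod, drops in sorted(totals_by_pod(SAMPLE).items()):
        print(pod, drops)
```

In practice this kind of aggregation would more likely be written as a query in Prometheus itself; the sketch only shows what pod-level context on a metric makes possible.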

Retina’s distributed packet captures are label-driven—allowing users to specify what, where, and who to capture packets from. Additionally, it provides historical context of network flow logs and advanced debugging capabilities that enhance network troubleshooting and performance optimization.
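
As an illustration of what "label-driven" selection means, the following sketch applies Kubernetes-style equality-based label matching to choose which pods a capture would target. It does not use Retina's capture API; the pod records, the function names, and the field names are all hypothetical and exist only to show the selection logic.

```python
def matches(selector, labels):
    """Equality-based match: every selector key must be present with the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_capture_targets(pods, namespace, selector):
    """Return the sorted names of pods in `namespace` whose labels satisfy `selector`."""
    return sorted(
        pod["name"]
        for pod in pods
        if pod["namespace"] == namespace and matches(selector, pod["labels"])
    )


# Hypothetical inventory of pods; in a real cluster this would come from the API server.
PODS = [
    {"name": "web-1", "namespace": "shop", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "namespace": "shop", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-0", "namespace": "shop", "labels": {"app": "db", "tier": "backend"}},
    {"name": "web-9", "namespace": "test", "labels": {"app": "web", "tier": "frontend"}},
]

if __name__ == "__main__":
    print(select_capture_targets(PODS, "shop", {"app": "web"}))
```

Selecting by label rather than by pod name means the same capture request keeps working as pods are rescheduled or scaled, since replacement pods carry the same labels.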

Our vision for Retina


Many enterprises are multi-cloud and want solutions that work well not just on Microsoft Azure, but on other clouds as well as on-premises. Retina is open-source and multi-cloud from day one. By open-sourcing Retina, we aim to share our knowledge and vision for Kubernetes networking observability with the broader cloud-native ecosystem. Our hope is that Retina will evolve and grow through collaboration with other developers and organizations who share similar experiences and goals in this field.

In terms of architecture, extensibility was key from the outset and will remain so going forward. Retina offers extensibility in data collection, allowing users to easily add new metrics and insights. It also offers extensibility in exporters, enabling users to integrate with other monitoring systems and tools. This flexibility ensures that Retina can adapt to different use cases and environments, making it a versatile and powerful platform for Kubernetes networking observability. In conclusion, we envision Retina as a platform that anyone can contribute to, extend, and innovate on, ultimately creating a robust, purpose-built, and comprehensive solution for Kubernetes networking observability.

Source: microsoft.com
