
Thursday, 21 March 2024

Microsoft open sources Retina: A cloud-native container networking observability platform


The Microsoft Azure Container Networking team is excited to announce Retina, a cloud-native container networking observability platform that enables Kubernetes users, admins, and developers to visualize, observe, debug, and analyze Kubernetes workload traffic irrespective of Container Network Interface (CNI), operating system (OS), and cloud. We are excited to release Retina as an open-source repository that helps with DevOps- and SecOps-related networking use cases for your Kubernetes clusters, and we invite the open-source community to innovate along with us.

Embracing and advancing open-source software


Cloud-native technologies like Kubernetes have made it easier to build applications that can run anywhere. At the same time, many applications have become more complex, and managing them in the cloud is increasingly difficult. As companies build cloud-native applications composed of interconnected services and then deploy them to multiple public clouds as well as their private infrastructure, network-related observability, troubleshooting, and debugging have become increasingly difficult.

With the power of extended Berkeley Packet Filter (eBPF), it is now possible to offer actionable network insights, including how containerized microservices interact, and to do so in non-intrusive ways without any change to the application itself. That is exactly what Retina sets out to achieve. Retina will help democratize network observability and troubleshooting by bringing new focus to the experience of application developers. Retina provides developers with simple ways to observe and troubleshoot their applications for issues such as packet drops and latency without worrying about the complexities of the underlying network infrastructure and transformations.

Based on our positive experience in the community with eBPF and Cilium, we are excited to build on this relationship and engage both more closely and with more communities. We believe that by opening Retina to the community, we can benefit from informed feedback, innovative ideas, and collaborative efforts that will help enhance and expand Retina’s capabilities.

Retina solutions and capabilities


Drawing from our extensive experience managing multiple container networking services for the Azure Kubernetes Service (AKS), we identified critical gaps in network monitoring and in the collection of network metrics and traces from Kubernetes clusters. Retina is a cutting-edge solution that closes these gaps and is designed to tackle the complex challenges of managing and supporting Kubernetes networks, providing infrastructure and site-reliability engineers with comprehensive insights into cluster networking. Retina also provides deep traffic analysis with Kubernetes-specific context, translating metrics into either industry-standard Prometheus metrics or network flow logs.

Existing open-source solutions are often tightly coupled to specific CNIs, operating systems, or data planes, which limits their versatility and use. For this reason, Retina has been designed and developed as a highly versatile, adaptable, and extensible framework of plugins capable of working seamlessly with any CNI, OS, or cloud provider, making it a valuable addition to any existing toolset. Retina supports both Linux and Windows data planes, ensuring it meets the diverse needs of infrastructure and site-reliability engineers while maintaining a minimal memory and CPU footprint on the cluster, even at scale. Retina’s pluggable design helps us easily extend and adapt it to address new use cases without depending on any specific CNI, OS, or data plane.

Figure 1: Architecture overview of Retina

One of Retina’s key features is deep network traffic insight, including Layer 4 (L4) metrics, Domain Name System (DNS) metrics, and distributed packet captures. It integrates with the Kubernetes app model, offering pod-level metrics with detailed context. It emits actionable networking observability data as industry-standard Prometheus metrics: node-level metrics (for example, forward, drop, Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Linux utility) and pod-level metrics (such as basic metrics, DNS, and API server latency).
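
To make the idea of pod-level Prometheus metrics more concrete, here is a minimal sketch of how a consumer might read scraped metrics in the Prometheus text format and total a counter per pod. The metric names (example_pod_drop_count, example_node_forward_count) and label names are hypothetical placeholders, not Retina's actual metric names, and the parser is deliberately simplified.

```typescript
// Simplified reader for Prometheus text exposition lines such as:
//   metric_name{label="value",other="value"} 123
// All metric and label names in the sample data are hypothetical.

interface Sample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

function parseSamples(text: string): Sample[] {
  const samples: Sample[] = [];
  const lineRe = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/;
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    // Skip blank lines and "# HELP" / "# TYPE" comment lines.
    if (line === "" || line.charAt(0) === "#") continue;
    const match = lineRe.exec(line);
    if (match === null) continue;
    const labels: Record<string, string> = {};
    // Simplification: assumes label values contain no escaped quotes.
    const labelRe = /([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g;
    const labelText = match[2] || "";
    let pair = labelRe.exec(labelText);
    while (pair !== null) {
      labels[pair[1]] = pair[2];
      pair = labelRe.exec(labelText);
    }
    samples.push({ name: match[1], labels: labels, value: Number(match[3]) });
  }
  return samples;
}

// Sum one metric's values grouped by a single label (for example, per pod).
function sumByLabel(samples: Sample[], metric: string, label: string): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const s of samples) {
    const key = s.labels[label];
    if (s.name !== metric || key === undefined) continue;
    totals[key] = (totals[key] || 0) + s.value;
  }
  return totals;
}

const scrape = `
# TYPE example_pod_drop_count counter
example_pod_drop_count{pod="web-1",reason="policy"} 3
example_pod_drop_count{pod="web-1",reason="conntrack"} 2
example_pod_drop_count{pod="api-0",reason="policy"} 7
example_node_forward_count{direction="egress"} 1200
`;

const dropsPerPod = sumByLabel(parseSamples(scrape), "example_pod_drop_count", "pod");
console.log(dropsPerPod); // web-1 totals 5, api-0 totals 7
```

In practice a Prometheus server would do this aggregation with a query; the sketch only illustrates what per-pod context in the metric labels makes possible.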

Retina’s distributed packet captures are label-driven—allowing users to specify what, where, and who to capture packets from. Additionally, it provides historical context of network flow logs and advanced debugging capabilities that enhance network troubleshooting and performance optimization.
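
As an illustration of what "label-driven" selection means, the sketch below picks capture targets by matching pod labels against a selector using standard Kubernetes equality-based matching. It is not Retina's implementation; the Pod shape, function names, and sample data are hypothetical.

```typescript
type Labels = Record<string, string>;

interface Pod {
  name: string;
  namespace: string;
  labels: Labels;
}

// True when every key/value pair in the selector is present on the pod
// (equality-based matching; an empty selector matches every pod).
function matchesSelector(podLabels: Labels, selector: Labels): boolean {
  for (const key of Object.keys(selector)) {
    if (podLabels[key] !== selector[key]) return false;
  }
  return true;
}

// Return "namespace/name" for each pod a capture would target.
function selectCaptureTargets(pods: Pod[], selector: Labels): string[] {
  return pods
    .filter((p) => matchesSelector(p.labels, selector))
    .map((p) => p.namespace + "/" + p.name);
}

// Hypothetical cluster contents.
const pods: Pod[] = [
  { name: "web-1", namespace: "shop", labels: { app: "web", tier: "frontend" } },
  { name: "web-2", namespace: "shop", labels: { app: "web", tier: "frontend" } },
  { name: "db-0", namespace: "shop", labels: { app: "db", tier: "backend" } },
];

console.log(selectCaptureTargets(pods, { app: "web" }));
```

With this sample data, the selector { app: "web" } targets shop/web-1 and shop/web-2 and leaves shop/db-0 untouched.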

Our vision for Retina


Many enterprises are multi-cloud and want solutions that work well not just on Microsoft Azure, but on other clouds as well as on-premises. Retina is open-source and multi-cloud from day one. By open-sourcing Retina, we aim to share our knowledge and vision for Kubernetes networking observability with the broader cloud-native ecosystem. Our hope is that Retina will evolve and grow through collaboration with other developers and organizations who share similar experiences and goals in this field.

In terms of architecture, extensibility was key from the outset and will remain so going forward. Retina offers extensibility in data collection, allowing users to easily add new metrics and insights. It also offers extensibility in exporters, enabling users to integrate with other monitoring systems and tools. This flexibility ensures that Retina can adapt to different use cases and environments, making it a versatile and powerful platform for Kubernetes networking observability. In conclusion, we envision Retina as a platform that anyone can contribute to, extend, and innovate on, ultimately creating a robust, purpose-built, and comprehensive solution for Kubernetes networking observability.

Source: microsoft.com

Thursday, 5 October 2023

Announcing Microsoft Playwright Testing: Scalable end-to-end testing for modern web apps

We are excited to announce the preview of Microsoft Playwright Testing, a new service for running Playwright tests easily at scale. Playwright, a fast-growing, open-source framework, enables reliable end-to-end testing and automation for modern web apps. Microsoft Playwright Testing is a fully managed service that uses the cloud to enable you to run Playwright tests with much higher parallelization across different operating system-browser combinations simultaneously. This means faster test runs with broader scenario coverage, which helps speed up delivery of features without sacrificing quality.

Get test suite results faster


Adding Playwright tests to your continuous integration (CI) workflow helps ensure that as the app evolves, your web app experiences continue to work the way you expect. But as the app becomes more complex, the test suite required for comprehensive testing across multiple browser and operating system combinations also increases in size. This leads to longer test suite completion times, potentially delaying your feature delivery. Development teams are already under pressure to quickly deploy app enhancements. To work around long wait times for test completion, it is common practice for development teams to selectively run only a small subset of tests. In a more detrimental scenario, a team may choose to execute tests less frequently, such as only a few times a week in an integration environment instead of with every pull request. This approach can potentially delay catching issues, complicate the process of pinpointing the cause of problems, and adversely affect the overall productivity of the development team.

With the @playwright/test runner, your tests run in independent, parallel worker processes, with each process starting its own browser. Increasing the number of parallel workers can reduce the time it takes to complete the full test suite. You can set the number of workers using the command line:

npx playwright test --workers=4

However, when you run tests locally or in your CI pipeline, you’re limited to the number of central processing unit (CPU) cores on your local machine or CI agent machine. At some point adding more workers will lead to resource contention, slowing down each worker and introducing test flakiness.

By using the Microsoft Playwright Testing service, you can increase the number of workers to much larger, cloud-scale numbers. The worker processes orchestrated by @playwright/test continue to run locally, but the resource-intensive browser instances now run in the cloud. The announcement's demo video shows thousands of tests running on 50 parallel browsers in the cloud managed by Microsoft Playwright Testing, significantly reducing the wait time for test results.
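
A rough back-of-the-envelope model shows why the worker count matters so much. The sketch below assumes every test takes the same time and ignores startup overhead and resource contention, so treat it as an idealized estimate only; the suite size and per-test duration are hypothetical numbers.

```typescript
// Idealized estimate: equal-length tests, no startup overhead, no contention.
function estimateSuiteSeconds(testCount: number, workers: number, secondsPerTest: number): number {
  if (workers < 1) throw new Error("workers must be at least 1");
  // Each worker runs its share of tests one after another, so the suite
  // takes as many "rounds" as the busiest worker has tests.
  const rounds = Math.ceil(testCount / workers);
  return rounds * secondsPerTest;
}

const testCount = 2000;    // hypothetical suite size
const secondsPerTest = 6;  // hypothetical average test duration

console.log(estimateSuiteSeconds(testCount, 4, secondsPerTest));  // 4 local workers
console.log(estimateSuiteSeconds(testCount, 50, secondsPerTest)); // 50 parallel cloud browsers
```

Under these assumptions, 2,000 six-second tests take about 3,000 seconds (50 minutes) on 4 workers and about 240 seconds (4 minutes) on 50. Real suites will deviate from this because test durations vary and each run carries fixed overhead.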


Consistent test results across multiple operating systems and browser combinations


App complexity isn’t the only factor in increasing test suite size. Modern web apps need to work flawlessly across numerous browsers, operating systems, and devices. Testing across all these variables increases the amount of time it takes to run your test suite. With Microsoft Playwright Testing, you’ll use the scalable parallelism provided by the service to run these tests simultaneously across all modern rendering engines. This includes Chromium, WebKit, and Firefox on Windows and Linux, as well as mobile emulation of Google Chrome for Android and Mobile Safari. Also, the service-managed browsers ensure consistent and reliable results for both functional and visual regression testing, whether tests are run from your CI pipeline or development machine. This extensive cross-compatibility testing helps ensure your web app delivers consistent performance and functionality across all platforms, optimizing the experience for any user, regardless of their browser or operating system.
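
In a Playwright test suite, browser and device coverage is typically declared as projects in the configuration file. The sketch below is a standard playwright.config.ts using the open-source @playwright/test package, with the desktop engines and mobile emulation mentioned above. The project names are arbitrary choices, and any service-specific settings (such as how the hosted browsers' operating system is chosen) are not shown here.

```typescript
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Desktop rendering engines.
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
    // Mobile emulation of Chrome for Android and Mobile Safari.
    { name: 'mobile-chrome', use: { ...devices['Pixel 5'] } },
    { name: 'mobile-safari', use: { ...devices['iPhone 12'] } },
  ],
});
```

Every test then runs once per project, which is why the total number of test executions, and the benefit of additional parallel browsers, grows with each browser or device you add.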


Figure 1: Use the Microsoft Playwright Testing service from your CI pipelines and code editors

No test code changes required


If you’re using Playwright today, getting started with Microsoft Playwright Testing is easy! The service is designed to integrate seamlessly with your Playwright test suite, with no changes to existing test code required. In just a few steps, you can connect your test suite to the service and unlock the full potential of cloud-powered parallel testing. Plus, the service supports multiple versions of Playwright and updates with each new Playwright release, ensuring your tests run against the latest browser versions and technologies while helping to keep your app current, robust, and secure. Now you can focus on thorough application testing without the worry of managing a complex test infrastructure.

Source: microsoft.com