Microsoft Online Guide: Azure Machine Learning

Showing posts with label Azure Machine Learning. Show all posts

Saturday, 9 December 2023

AI + Machine Learning Azure Machine Learning Azure OpenAI Service

Infuse responsible AI tools and practices in your LLMOps

As we embrace advancements in generative AI, it’s crucial to acknowledge the challenges and potential harms associated with these technologies. Common concerns include data security and privacy, low quality or ungrounded outputs, misuse of and overreliance on AI, generation of harmful content, and AI systems that are susceptible to adversarial attacks, such as jailbreaks. These risks are critical to identify, measure, mitigate, and monitor when building a generative AI application.

Note that some of the challenges around building generative AI applications are not unique to AI applications; they are essentially traditional software challenges that might apply to any number of applications. Common best practices to address these concerns include role-based access (RBAC), network isolation and monitoring, data encryption, and application monitoring and logging for security. Microsoft provides numerous tools and controls to help IT and development teams address these challenges, which you can think of as being deterministic in nature. In this blog, I’ll focus on the challenges unique to building generative AI applications—challenges that address the probabilistic nature of AI.

First, let’s acknowledge that putting responsible AI principles like transparency and safety into practice in a production application is a major effort. Few companies have the research, policy, and engineering resources to operationalize responsible AI without pre-built tools and controls. That’s why Microsoft takes the best in cutting edge ideas from research, combines that with thinking about policy and customer feedback, and then builds and integrates practical responsible AI tools and methodologies directly into our AI portfolio. In this post, we’ll focus on capabilities in Azure AI Studio, including the model catalog, prompt flow, and Azure AI Content Safety. We’re dedicated to documenting and sharing our learnings and best practices with the developer community so they can make responsible AI implementation practical for their organizations.

Mapping mitigations and evaluations to the LLMOps lifecycle

We find that mitigating potential harms presented by generative AI models requires an iterative, layered approach that includes experimentation and measurement. In most production applications, that includes four layers of technical mitigations: (1) the model, (2) safety system, (3) metaprompt and grounding, and (4) user experience layers. The model and safety system layers are typically platform layers, where built-in mitigations would be common across many applications. The next two layers depend on the application’s purpose and design, meaning the implementation of mitigations can vary a lot from one application to the next.

Fig 1. Enterprise LLMOps development lifecycle.

Ideating and exploring loop: Add model layer and safety system mitigations

The first iterative loop in LLMOps typically involves a single developer exploring and evaluating models in a model catalog to see if it’s a good fit for their use case. From a responsible AI perspective, it’s crucial to understand each model’s capabilities and limitations when it comes to potential harms. To investigate this, developers can read model cards provided by the model developer and work data and prompts to stress-test the model.

Model

The Azure AI model catalog offers a wide selection of models from providers like OpenAI, Meta, Hugging Face, Cohere, NVIDIA, and Azure OpenAI Service, all categorized by collection and task. Model cards provide detailed descriptions and offer the option for sample inferences or testing with custom data. Some model providers build safety mitigations directly into their model through fine-tuning. You can learn about these mitigations in the model cards, which provide detailed descriptions and offer the option for sample inferences or testing with custom data. At Microsoft Ignite 2023, we also announced the model benchmark feature in Azure AI Studio, which provides helpful metrics to evaluate and compare the performance of various models in the catalog.

Safety system

For most applications, it’s not enough to rely on the safety fine-tuning built into the model itself. large language models can make mistakes and are susceptible to attacks like jailbreaks. In many applications at Microsoft, we use another AI-based safety system, Azure AI Content Safety, to provide an independent layer of protection to block the output of harmful content. Customers like South Australia’s Department of Education and Shell are demonstrating how Azure AI Content Safety helps protect users from the classroom to the chatroom.

This safety runs both the prompt and completion for your model through classification models aimed at detecting and preventing the output of harmful content across a range of categories (hate, sexual, violence, and self-harm) and configurable severity levels (safe, low, medium, and high). At Ignite, we also announced the public preview of jailbreak risk detection and protected material detection in Azure AI Content Safety. When you deploy your model through the Azure AI Studio model catalog or deploy your large language model applications to an endpoint, you can use Azure AI Content Safety.

Building and augmenting loop: Add metaprompt and grounding mitigations

Once a developer identifies and evaluates the core capabilities of their preferred large language model, they advance to the next loop, which focuses on guiding and enhancing the large language model to better meet their specific needs. This is where organizations can differentiate their applications.

Metaprompt and grounding

Proper grounding and metaprompt design are crucial for every generative AI application. Retrieval augmented generation (RAG), or the process of grounding your model on relevant context, can significantly improve overall accuracy and relevance of model outputs. With Azure AI Studio, you can quickly and securely ground models on your structured, unstructured, and real-time data, including data within Microsoft Fabric.

Once you have the right data flowing into your application, the next step is building a metaprompt. A metaprompt, or system message, is a set of natural language instructions used to guide an AI system’s behavior (do this, not that). Ideally, a metaprompt will enable a model to use the grounding data effectively and enforce rules that mitigate harmful content generation or user manipulations like jailbreaks or prompt injections. We continually update our prompt engineering guidance and metaprompt templates with the latest best practices from the industry and Microsoft research to help you get started. Customers like Siemens, Gunnebo, and PwC are building custom experiences using generative AI and their own data on Azure.

Fig 2. Summary of responsible AI best practices for a metaprompt.

Evaluate your mitigations

It’s not enough to adopt the best practice mitigations. To know that they are working effectively for your application, you will need to test them before deploying an application in production. Prompt flow offers a comprehensive evaluation experience, where developers can use pre-built or custom evaluation flows to assess their applications using performance metrics like accuracy as well as safety metrics like groundedness. A developer can even build and compare different variations of their metaprompts to assess which may result in the higher quality outputs aligned to their business goals and responsible AI principles.

Fig 3. Summary of evaluation results for a prompt flow built in Azure AI Studio.

Fig 4. Details for evaluation results for a prompt flow built in Azure AI Studio.

Operationalizing loop: Add monitoring and UX design mitigations

The third loop captures the transition from development to production. This loop primarily involves deployment, monitoring, and integrating with continuous integration and continuous deployment (CI/CD) processes. It also requires collaboration with the user experience (UX) design team to help ensure human-AI interactions are safe and responsible.

User experience

In this layer, the focus shifts to how end users interact with large language model applications. You’ll want to create an interface that helps users understand and effectively use AI technology while avoiding common pitfalls. We document and share best practices in the HAX Toolkit and Azure AI documentation, including examples of how to reinforce user responsibility, highlight the limitations of AI to mitigate overreliance, and to ensure users are aware that they are interacting with AI as appropriate.

Monitor your application

Continuous model monitoring is a pivotal step of LLMOps to prevent AI systems from becoming outdated due to changes in societal behaviors and data over time. Azure AI offers robust tools to monitor the safety and quality of your application in production. You can quickly set up monitoring for pre-built metrics like groundedness, relevance, coherence, fluency, and similarity, or build your own metrics.

Looking ahead with Azure AI

Microsoft’s infusion of responsible AI tools and practices into LLMOps is a testament to our belief that technological innovation and governance are not just compatible, but mutually reinforcing. Azure AI integrates years of AI policy, research, and engineering expertise from Microsoft so your teams can build safe, secure, and reliable AI solutions from the start, and leverage enterprise controls for data privacy, compliance, and security on infrastructure that is built for AI at scale. We look forward to innovating on behalf of our customers, to help every organization realize the short- and long-term benefits of building applications built on trust.

Source: microsoft.com

Thursday, 1 June 2023

Artificial Intelligence Azure DevOps Azure Machine Learning Machine Learning

Build next-generation, AI-powered applications on Microsoft Azure

16:08 By Kristen Waston 0 Comment

Microsoft Azure, Azure Exam Prep, Azure Exam Prep, Azure Preparation, Azure Tutorial and Materials, Azure Learning, Azure Guides

The potential of generative AI is much bigger than any of us can imagine today. From healthcare to manufacturing to retail to education, AI is transforming entire industries and fundamentally changing the way we live and work. At the heart of all that innovation are developers, pushing the boundaries of possibility and creating new business and societal value even faster than many thought possible. Trusted by organizations around the world with mission-critical application workloads, Azure is the place where developers can build with generative AI securely, responsibly, and with confidence.

Welcome to Microsoft Build 2023—the event where we celebrate the developer community. This year, we’ll dive deep into the latest technologies across application development and AI that are enabling the next wave of innovation. First, it’s about bringing you state-of-the-art, comprehensive AI capabilities and empowering you with the tools and resources to build with AI securely and responsibly. Second, it’s about giving you the best cloud-native app platform to harness the power of AI in your own business-critical apps. Third, it’s about the AI-assisted developer tooling to help you securely ship the code only you can build.

We’ve made announcements in all key areas to empower you and help your organizations lead in this new era of AI.

Bring your data to life with generative AI

Generative AI has quickly become the generation-defining technology shaping how we search and consume information every day, and it’s been wonderful to see customers across industries embrace Microsoft Azure OpenAI Service. In March, we announced the preview of OpenAI’s GPT-4 in Azure OpenAI Service, making it possible for developers to integrate custom AI-powered experiences directly into their own applications. Today, OpenAI’s GPT-4 is generally available in Azure OpenAI Service, and we’re building on that announcement with several new capabilities you can use to apply generative AI to your data and to orchestrate AI with your own systems.

We’re excited to share our new Azure AI Studio. With just a few clicks, developers can now ground powerful conversational AI models, such as OpenAI’s ChatGPT and GPT-4, on their own data. With Azure OpenAI Service on your data, coming to public preview, and Azure Cognitive Search, employees, customers, and partners can discover information buried in the volumes of data, text, and images using natural language-based app interfaces. Create richer experiences and help users find organization-specific insights, such as inventory levels or healthcare benefits, and more.

To further extend the capabilities of large language models, we are excited to announce that Azure Cognitive Search will power vectors in Azure (in private preview), with the ability to store, index, and deliver search applications over vector embeddings of organizational data including text, images, audio, video, and graphs. Furthermore, support for plugins with Azure OpenAI Service, in private preview, will simplify integrating external data sources and streamline the process of building and consuming APIs. Available plugins include plugins for Azure Cognitive Search, Azure SQL, Azure Cosmos DB, Microsoft Translator, and Bing Search. We are also enabling a Provisioned Throughput Model, which will soon be generally available in limited access to offer dedicated capacity.

Customers are already benefitting from Azure OpenAI Service today, including DocuSign, Volvo, Ikea, Crayon, and 4,500 others.

We continue to innovate across our AI portfolio, including new capabilities in Azure Machine Learning, so developers and data scientists can use the power of generative AI with their data. Foundation models in Azure Machine Learning, now in preview, empower data scientists to fine-tune, evaluate, and deploy open-source models curated by Azure Machine Learning, models from Hugging Face Hub, as well as models from Azure OpenAI Service, all in a unified model catalog. This will provide data scientists with a comprehensive repository of popular models directly within the Azure Machine Learning registry.

We are also excited to announce the upcoming preview of Azure Machine Learning prompt flow that will provide a streamlined experience for prompting, evaluating, tuning, and operationalizing large language models. With prompt flow, you can quickly create prompt workflows that connect to various language models and data sources. This allows for building intelligent applications and assessing the quality of your workflows to choose the best prompt for your case. See all the announcements for Azure Machine Learning.

It’s great to see momentum for machine learning with customers like Swift, a member-owned cooperative that provides a secure global financial messaging network, who is using Azure Machine Learning to develop an anomaly detection model with federated learning techniques, enhancing global financial security without compromising data privacy. We cannot wait to see what our customers build next.

Run and scale AI-powered, intelligent apps on Azure

Azure’s cloud-native platform is the best place to run and scale applications while seamlessly embedding Azure’s native AI services. Azure gives you the choice between control and flexibility, with complete focus on productivity regardless of what option you choose.

Azure Kubernetes Service (AKS) offers you complete control and the quickest way to start developing and deploying intelligent, cloud-native apps in Azure, datacenters, or at the edge with built-in code-to-cloud pipelines and guardrails. We’re excited to share some of the most highly anticipated innovations for AKS that support the scale and criticality of applications running on it.

To give enterprises more control over their environment, we are announcing long-term support for Kubernetes that will enable customers to stay on the same release for two years—twice as long as what’s possible today. We are also excited to share that starting today, Azure Linux is available as a container host operating system platform optimized for AKS. Additionally, we are now enabling Azure customers to access a vibrant ecosystem of first-party and third-party solutions with easy click-through deployments from Azure Marketplace. Lastly, confidential containers are coming soon to AKS, as a first-party supported offering. Aligned with Kata Confidential Containers, this feature enables teams to run their applications in a way that supports zero-trust operator deployments on AKS.

Azure lets you choose from a range of serverless execution environments to build, deploy, and scale dynamically on Azure without the need to manage infrastructure. Azure Container Apps is a fully managed service that enables microservices and containerized applications to run on a serverless platform. We announced, in preview, several new capabilities for teams to simplify serverless application development. Developers can now run Azure Container Apps jobs on demand and schedule applications and event-driven ad hoc tasks to asynchronously execute them to completion. This new capability enables smaller executables within complex jobs to run in parallel, making it easier to run unattended batch jobs right along with your core business logic. With these advancements to our container and serverless products, we are making it seamless and natural to build intelligent cloud-native apps on Azure.

Integrated, AI-based tools to help developers thrive

Making it easier to build intelligent, AI-embedded apps on Azure is just one part of the innovation equation. The other, equally important part is about empowering developers to focus more time on strategic, meaningful work, which means less toiling on tasks like debugging and infrastructure management. We’re making investments in GitHub Copilot, Microsoft Dev Box, and Azure Deployment Environments to simplify processes and increase developer velocity and scale.

GitHub Copilot is the world’s first at-scale AI developer tool, helping millions of developers code up to 55 percent faster. Today, we announced new Copilot experiences built into Visual Studio, eliminating wasted time when getting started with a new project. We’re also announcing several new capabilities for Microsoft Dev Box, including new starter developer images and elevated integration of Visual Studio in Microsoft Dev Box, that accelerates setup time and improves performance. Lastly, we’re announcing the general availability of Azure Deployment Environments and support for HashiCorp Terraform in addition to Azure Resource Manager.

Enable secure and trusted experiences in the era of AI

When it comes to building, deploying, and running intelligent applications, security cannot be an afterthought—developer-first tooling and workflow integration are critical. We’re investing in new features and capabilities to enable you to implement security earlier in your software development lifecycle, find and fix security issues before code is deployed, and pair with tools to deploy trusted containers to Azure.

We’re pleased to announce GitHub Advanced Security for Azure DevOps in preview soon. This new solution provides the three core features of GitHub Advanced Security into the Azure DevOps platform, so you can integrate automated security checks into your workflow. It includes code scanning powered by CodeQL to detect vulnerabilities, secret scanning to prevent the inclusion of sensitive information in code repositories, and dependency scanning to identify vulnerabilities in open-source dependencies and provide update alerts.

While security is at the top of the list for any developer, using AI responsibly is no less important. For almost seven years, we have invested in a cross-company program to ensure our AI systems are responsible by design. Our work on privacy and the General Data Protection Regulation (GDPR) has taught us that policies aren’t enough; we need tools and engineering systems that help make it easy to build with AI responsibly. We’re pleased to announce new products and features to help organizations improve accuracy, safety, fairness, and explainability across the AI development lifecycle.

Azure AI Content Safety, now in preview, enables developers to build safer online environments by detecting and assigning severity scores to unsafe images and text across languages, helping businesses prioritize what content moderators review. It can also be customized to address an organization’s regulations and policies. As part of Microsoft’s commitment to responsible AI, we’re integrating Azure AI Content Safety across our products, including Azure OpenAI Service and Azure Machine Learning, to help users evaluate and moderate content in prompts and generated content.

Additionally, the responsible AI dashboard in Azure Machine Learning now supports text and image data in preview. This means users can more easily identify model errors, understand performance and fairness issues, and provide explanations for a wider range of machine learning model types, including text and image classification and object detection scenarios. In production, users can continue to monitor their model and production data for model and data drift, perform data integrity tests, and make interventions with the help of model monitoring, now in preview.

We are committed to helping developers and machine learning engineers apply AI responsibly, through shared learning, resources, and purpose-built tools and systems.

Let’s write this history, together

AI is a massive shift in computing. Whether it is part of your workflow or part of cloud development, powering your next-generation, intelligent apps, this community of developers is leading this shift.

We are excited to bring Microsoft Build to you, especially this year as we go deep into the latest AI technologies, connect you with experts from within and outside of Microsoft, and showcase real-world solutions powered by AI.

Source: microsoft.com

Thursday, 25 May 2023

Artificial Intelligence Azure Machine Learning

The Net Zero journey: Why digital twins are a powerful ally

16:03 By Kristen Waston 0 Comment

Climate impacts raise stakes for Net Zero transition

Following weeks of vital discussions at COP27 in Egypt, the urgency to bring the world to a more sustainable path has never been greater. Scientists have warned that the world needs to cut global emissions by 5 percent to 7 percent per year to limit the damage caused by climate change. At present, however, emissions are rising by 1 percent to 2 percent per year. Discovering new routes to a Net Zero economy is critical if we are to limit the economic and social damage of a rapidly changing climate. And that means we all have a part to play in ensuring we strike the optimal balance between greenhouse gas production and the amount of greenhouse gas that gets removed from the atmosphere.

Azure Exam, Azure Prep, Azure Tutorial and Materials, Azure Guides, Azure Skills, Azure Job

A Microsoft and PWC blueprint for the transition to Net Zero highlights the importance of innovation and the harnessing of new technologies that enable organizations to deliver on their Net Zero ambitions, at pace. A key innovation that aims to accelerate organizations’ journey to Net Zero is digital twin technology supported by AI Infrastructure capabilities. A digital twin can be considered as a virtual working representation of assets, products, and production plants. Powered by Microsoft Azure AI-optimized infrastructure that leverages NVIDIA accelerated computing and networking technologies, digital twins allow organizations to visualize, simulate, and predict operations, whether those are at a manufacturing plant, a wind farm, a mining operation, or any other type of operation.

Adoption of digital twin technology offers early adopters the potential of truly accelerated and differentiated business value realization. Innovative companies can leverage this potent toolset to accelerate their innovation journeys and drive strategic business outcomes powered by technology innovation at scale. A recent study by Microsoft and Intel found that globally, only 28 percent of manufacturers have started rolling out a digital twin solution, and of those, only one in seven have fully deployed it at their manufacturing plants. One of the key findings of this study highlighted that when digital twins are utilized effectively, they can realize huge efficiency, optimization, and cost-saving gains while unlocking mission-critical insights that can drive innovation and improve decision-making for those who adopt the technology.

Maximizing wind energy production with digital twins

Digital twins have emerged as a powerful tool for renewable energy producers seeking optimization gains in their production processes too. Take South Korea’s Doosan Heavy Industries & Construction as an example. As a leader in engineering, procurement, heavy manufacturing, power generation and desalination services, Doosan Heavy Industries & Construction was appointed by the South Korean government to help it meet the goals of its Green New Deal plan, which includes a target of generating 20 percent of the country’s electricity needs through renewables by 2030.

Seeking improvements in the efficiency of their wind turbines, Doosan Heavy Industries & Construction partnered with Microsoft and Bentley Systems to develop a digital twin of its wind farms that helps it maximize energy production and reduce maintenance costs. The company currently has 16 South Korean wind farms in operation, which generate enough electricity to power as many as 35,000 homes per year. Its innovative digital controls and operations enables Doosan to remotely monitor wind farm operations, predict maintenance before failures occur, and limit the need for maintenance teams to physically inspect the wind turbines.

Leveraging Azure Digital Twins and Azure IoT Hub powered by NVIDIA-accelerated Azure AI Infrastructure capabilities, Doosan can simulate, visualize, and optimize every aspect of its infrastructure planning, deployment, and ongoing monitoring. This has led to greater energy efficiency, boosted employee safety, and improved asset resilience. And with Bentley seeing their Azure-powered digital twin technology reduce operational and maintenance costs by 15 percent at other facilities, Doosan is well-positioned to continue benefiting from their digital twin solution and unlocking new efficiency gains by leveraging the power of cloud-based AI infrastructure capabilities.

Leveraging digital twins to power Net Zero transition

In the oil and gas sector, digital twin technology is helping one of the world’s leading carbon-emitting industries to identify opportunities for optimization and carbon reduction. A noteworthy showcase can be found with Tata Consulting Services who delivered a Clever Energy solution to a global consumer goods giant. Using digital twins, real-time data and cognitive intelligence to improve energy savings at this consumer goods customer’s production plants, the solution helped reduce energy use by up to 15 percent as well as an equivalent CO2 emissions reduction. Considering that buildings consume nearly 40 percent of the world’s energy and emit one third of greenhouse gasses, this solution also helps the customer alleviate some of the pressures of significant energy cost increases in Europe.

In another example, a large multinational supplier that aims to achieve Net Zero carbon status by no later than 2050 is today leveraging the power of digital twins to support its sustainability goals.

From the vast global network of complex assets this company manages, a digital twin of one of their facilities was developed to calculate real-time carbon intensity and energy efficiency. Microsoft Azure provided the perfect platform: the IoT Hub receives more than 250 billion data signals per month from the company’s global operating assets, with AI providing key insights into how they could become a safer and more efficient business and Azure AI Infrastructure and High-Performance Computing enabling the seamless processing of huge volumes of data.

With long-term plans in place to scale the digital twin solution to all of the company’s global facilities, Microsoft Azure’s security, scalability, and powerful high-performance computing capabilities will be key supporting factors in how successfully they could transition to more carbon-aware operations.

Powering the Next Era of Industrial Digitalization

At NVIDIA GTC, a global AI conference, NVIDIA and Microsoft announced a collaboration to connect the NVIDIA Omniverse platform for developing and operating industrial metaverse applications with Azure Cloud Services. Enterprises of every scale will soon be able to use the Omniverse Cloud platform-as-a-service on Microsoft Azure to fast-track development and deployment of physically accurate, connected, secure, AI-enabled digital twin simulations.

Key takeaways about a Net Zero economy and digital twins

Shifting to a Net Zero economy is one of the defining challenges of our time. As the devastating impact of climate change continues to disrupt global economies, businesses will need novel ways of reducing their carbon footprint and help bring the world to a more sustainable path.

Considering the vast complexity of modern businesses—especially resource-intensive industries such as oil and gas, and manufacturing—finding ways to optimize processes, reduce waste, and accelerate time to value can be extremely cumbersome unless novel technology solutions are found to help provide differentiated strategic capabilities.

Digital twin technology offers organizations a powerful option to run detailed simulations generating vast amounts of data. By integrating that data to the power and scalability of Azure high performance computing (HPC) and leveraging the visualization power of Nvidia’s GPU-accelerated virtual computing capabilities, organizations can discover new opportunities for greater efficiency, optimization, and carbon-neutrality gains.

Source: microsoft.com

Saturday, 19 November 2022

Artificial Intelligence Azure Machine Learning

AI and the need for purpose-built cloud infrastructure

13:48 By Kristen Waston 0 Comment

Azure Exam Prep, Azure Certification, Azure Tutorial and Materials, Azure Learning, Azure Skills, Azure Jobs

The progress of AI has been astounding with solutions pushing the envelope by augmenting human understanding, preferences, intent, and even spoken language. AI is improving our knowledge and understanding by helping us provide faster, more insightful solutions that fuel transformation beyond our imagination. However, with this rapid growth and transformation, AI’s demand for compute power has grown by leaps and bounds, outpacing Moore’s Law’s ability to keep up. With AI powering a wide array of important applications that include natural language processing, robot-powered process automation, and machine learning and deep learning, AI silicon manufacturers are finding new, innovative ways to get more out of each piece of silicon such as integration of advanced, mixed-precision capabilities, to enable AI innovators to do more with less. At Microsoft, our mission is to empower every person and every organization on the planet to achieve more, and with Azure’s purpose-built AI infrastructure we intend to deliver on that promise.

Azure high-performance computing provides scalable solutions

The need for purpose-built infrastructure for AI is evident—one that can not only scale up to take advantage of multiple accelerators within a single server but also scale out to combine many servers (with multi-accelerators) distributed across a high-performance network. High-performance computing (HPC) technologies have significantly advanced multi-disciplinary science and engineering simulations—including innovations in hardware, software, and the modernization and acceleration of applications by exposing parallelism and advancements in communications to advance AI infrastructure. Scale-up AI computing infrastructure combines memory from individual graphics processing units (GPUs) into a large, shared pool to tackle larger and more complex models. When combined with the incredible vector-processing capabilities of the GPUs, high-speed memory pools have proven to be extremely effective at processing large multidimensional arrays of data to enhance insights and accelerate innovations.

With the added capability of a high-bandwidth, low-latency interconnect fabric, scale-out AI-first infrastructure can significantly accelerate time to solution via advanced parallel communication methods, interleaving computation and communication across a vast number of compute nodes. Azure scale-up-and scale-out AI-first infrastructure combines the attributes of both vertical and horizontal system scaling to address the most demanding AI workloads. Azure’s AI-first infrastructure delivers leadership-class price, compute, and energy-efficient performance today.

Cloud infrastructure purpose-built for AI

Microsoft Azure, in partnership with NVIDIA, delivers purpose-built AI supercomputers in the cloud to meet the most demanding real-world workloads at scale while meeting price/performance and time-to-solution requirements. And with available advanced machine learning tools, you can accelerate incorporating AI into your workloads to drive smarter simulations and accelerate intelligent decision-making.

Microsoft Azure is the only global public cloud service provider that offers purpose-built AI supercomputers with massively scalable scale-up-and-scale-out IT infrastructure comprised of NVIDIA InfiniBand interconnected NVIDIA Ampere A100 Tensor Core GPUs. Optional and available Azure Machine Learning tools facilitate the uptake of Azure’s AI-first infrastructure—from early development stages through enterprise-grade production deployments.

Scale-up-and-scale-out infrastructures powered by NVIDIA GPUs and NVIDIA Quantum InfiniBand networking rank amongst the most powerful supercomputers on the planet. Microsoft Azure placed in the top 15 of the Top500 supercomputers worldwide and currently, five systems in the top 50 use Azure infrastructure with NVIDIA A100 Tensor Core GPUs. Twelve of the top twenty ranked supercomputers in the Green500 list use NVIDIA A100 Tensor Core GPUs.

Source: Top 500 The List: Top500 November 2022, Green500 November 2022.

With a total solution approach that combines the latest GPU architectures, designed for the most compute-intensive AI training and inference workloads, and optimized software to leverage the power of the GPUs, Azure is paving the way to beyond exascale AI supercomputing. And this supercomputer-class AI infrastructure is made broadly accessible to researchers and developers in organizations of any size around the world in support of Microsoft’s stated mission. Organizations that need to augment their existing on-premises HPC or AI infrastructure can take advantage of Azure’s dynamically scalable cloud infrastructure.

In fact, Microsoft Azure works closely with customers across industry segments. Their increasing need for AI technology, research, and applications is fulfilled, augmented, and/or accelerated with Azure’s AI-first infrastructure. Some of these collaborations and applications are explained below:

Retail and AI

AI-first cloud infrastructure and toolchain from Microsoft Azure featuring NVIDIA are having a significant impact in retail. With a GPU-accelerated computing platform, customers can churn through models quickly and determine the best-performing model. Benefits include:

◉ Deliver 50x performance improvements for classical data analytics and machine learning (ML) processes at scale with AI-first cloud infrastructure.

◉ Leveraging RAPIDS with NVIDIA GPUs, retailers can accelerate the training of their machine learning algorithms up to 20x. This means they can use larger data sets and process them faster with more accuracy, allowing them to react in real-time to shopping trends and realize inventory cost savings at scale.

◉ Reduce the total cost of ownership (TCO) for large data science operations.

◉ Increase ROI for forecasting, resulting in cost savings from reduced out-of-stock and poorly placed inventory.

With autonomous checkout, retailers can provide customers with frictionless and faster shopping experiences while increasing revenue and margins. Benefits include:

◉ Deliver better and faster customer checkout experience and reduce queue wait time.

◉ Increase revenue and margins.

◉ Reduce shrinkage—the loss of inventory due to theft such as shoplifting or ticket switching at self-checkout lanes, which costs retailers $62 billion annually, according to the National Retail Federation.

In both cases, these data-driven solutions require sophisticated deep learning models—models that are much more sophisticated than those offered by machine learning alone. In turn, this level of sophistication requires AI-first infrastructure and an optimized AI toolchain.

Customer story (video): Everseen and NVIDIA create a seamless shopping experience that benefits the bottom line.

Manufacturing

In manufacturing, compared to routine-based or time-based preventative maintenance, proactive predictive maintenance can get ahead of the problem before it happens and save businesses from costly downtime. Benefits of Azure and NVIDIA cloud infrastructure purpose-built for AI include:

◉ GPU-accelerated compute enables AI at an industrial scale, taking advantage of unprecedented amounts of sensor and operational data to optimize operations, improve time-to-insight, and reduce costs.

◉ Process more data faster with higher accuracy, allowing faster reaction time to potential equipment failures before they even happen.

◉ Achieve a 50 percent reduction in false positives and a 300 percent reduction in false negatives.

Traditional computer vision methods that are typically used in automated optical inspection (AOI) machines in production environments require intensive human and capital investment. Benefits of GPU-accelerated infrastructure include:

◉ Consistent performance with guaranteed quality of service, whether on-premises or in the cloud.

◉ GPU-accelerated compute enables AI at an industrial scale, taking advantage of unprecedented amounts of sensor and operational data to optimize operations, improve quality, time to insight, and reduce costs.

◉ Leveraging RAPIDS with NVIDIA GPUs, manufacturers can accelerate the training of their machine-learning algorithms up to 20x.

Each of these examples require an AI-first infrastructure and toolchain to significantly reduce false positives and negatives in predictive maintenance and to account for subtle nuances in ensuring overall product quality.

Customer story (video): Microsoft Azure and NVIDIA gives BMW the computing power for automated quality control.

As we have seen, AI is everywhere, and its application is growing rapidly. The reason is simple. AI enables organizations of any size to gain greater insights and apply those insights to accelerating innovations and business results. Optimized AI-first infrastructure is critical in the development and deployment of AI applications.

Azure is the only cloud service provider that has a purpose-built, AI-optimized infrastructure comprised of Mellanox InfiniBand interconnected NVIDIA Ampere A100 Tensor Core GPUs for AI applications of any scale for organizations of any size. At Azure, we have a purpose-built AI-first infrastructure that empowers every person and every organization on the planet to achieve more. Come and do more with Azure!

Source: microsoft.com

Saturday, 8 June 2019

Artificial Intelligence Azure Machine Learning Big Data

Build more accurate forecasts with new capabilities in automated machine learning

09:39 By Kristen Waston 1 Comment

We are excited to announce new capabilities which are apart of time-series forecasting in Azure Machine Learning service. We launched preview of forecasting in December 2018, and we have been excited with the strong customer interest. We listened to our customers and appreciate all the feedback. Your responses helped us reach this milestone.

Big Data, Artificial Intelligence, Azure Machine Learning, Azure Study Materials, Azure Certifications

Building forecasts is an integral part of any business, whether it’s revenue, inventory, sales, or customer demand. Building machine learning models is time-consuming and complex with many factors to consider, such as iterating through algorithms, tuning your hyperparameters and feature engineering. These choices multiply with time series data, with additional considerations of trends, seasonality, holidays and effectively splitting training data.

Forecasting within automated machine learning (ML) now includes new capabilities that improve the accuracy and performance of our recommended models:

◈ New forecast function

◈ Rolling-origin cross validation

◈ Configurable Lags

◈ Rolling window aggregate features

◈ Holiday detection and featurization

Expanded forecast function

We are introducing a new way to retrieve prediction values for the forecast task type. When dealing with time series data, several distinct scenarios arise at prediction time that require more careful consideration. For example, are you able to re-train the model for each forecast? Do you have the forecast drivers for the future? How can you forecast when you have a gap in historical data? The new forecast function can handle all these scenarios.

Let’s take a closer look at common configurations of train and prediction data scenarios, when using the new forecasting function. For automated ML the forecast origin is defined as the point when the prediction of forecast values should begin. The forecast horizon is how far out the prediction should go into the future.

In many cases training and prediction do not have any gaps in time. This is the ideal because the model is trained on the freshest available data. We recommend you set your forecast this way if your prediction interval allows time to retrain, for example in more fixed data situations such as financial forecasts rate or supply chain applications using historical revenue or known order volumes.

When forecasting you may know future values ahead of time. These values act as contextual information that can greatly improve the accuracy of the forecast. For example, the price of a grocery item is known weeks in advance, which strongly influences the “sales” target variable. Another example is when you are running what-if analyses, experimenting with future values of drivers like foreign exchange rates. In these scenarios the forecast interface lets you specify forecast drivers describing time periods for which you want the forecasts (Xfuture).

If train and prediction data have a gap in time, the trained model becomes stale. For example, in high-frequency applications like IoT it is impractical to retrain the model constantly, due to high velocity of change from sensors with dependencies on other devices or external factors e.g. weather. You can provide prediction context with recent values of the target (ypast) and the drivers (Xpast) to improve the forecast. The forecast function will gracefully handle the gap, imputing values from training and prediction context where necessary.

In other scenarios, such as sales, revenue, or customer retention, you may not have contextual information available for future time periods. In these cases, the forecast function supports making zero-assumption forecasts out to a “destination” time. The forecast destination is the end point of the forecast horizon. The model maximum horizon is the number of periods the model was trained to forecast and may limit the forecast horizon length.

The forecast model enriches the input data (e.g. adds holiday features) and imputes missing values. The enriched and imputed data are returned with the forecast.

Rolling-origin cross validation

Cross-validation (CV) is a vital procedure for estimating and reducing out-of-sample error for a model. For time series data we need to ensure training only occurs using values to the past of the test data. Partitioning the data without regard to time does not match how data becomes available in production, and can lead to incorrect estimates of the forecaster’s generalization error.

To ensure correct evaluation, we added rolling-origin cross validation (ROCV) as the standard method to evaluate machine learning models on time series data. It divides the series into training and validation data using an origin time point. Sliding the origin in time generates the cross-validation folds.

As an example, when we do not use ROCV, consider a hypothetical time-series containing 40 observations. Suppose the task is to train a model that forecasts the series up-to four time-points into the future. A standard 10-fold cross validation (CV) strategy is shown in the image below. The y-axis in the image delineates the CV folds that will be made while the colors distinguish training points (blue) from validation points (orange). In the 10-fold example below, notice how folds one through nine result in model training on dates future to be included the validation set resulting inaccurate training and validation results.

This scenario should be avoided for time-series instead, when we use an ROCV strategy as shown below, we preserve the time series data integrity and eliminate the risk of data leakage.

ROCV is used automatically for forecasting. You simply pass the training and validation data together and set the number of cross validation folds. Automated machine learning (ML) will use the time column and grain columns you have defined in your experiment to split the data in a way that respects time horizons. Automated ML will also retrain the selected model on the combined train and validation set to make use of the most recent and thus most informative data, which under the rolling-origin splitting method ends up in the validation set.

Lags and rolling window aggregates

Often the best information a forecaster can have is the recent value of the target. Creating lags and cumulative statistics of the target then increases accuracy of your predictions.

In automated ML, you can now specify target lag as a model feature. Adding lag length identifies how many rows to lag based on your time interval. For example, if you wanted to lag by two units of time, you set the lag length parameter to two.

The table below illustrates how a lag length of two would be treated. Green columns are engineered features with lags of sales by one day and two day. The blue arrows indicate how each of the lags are generated by the training data. Not a number (Nan) are created when sample data does not exist for that lag period.

In addition to the lags, there may be situations where you need to add rolling aggregation of data values as features. For example, when predicting energy demand you might add a rolling window feature of three days to account for thermal changes of heated spaces. The table below shows feature engineering that occurs when window aggregation is applied. Columns for minimum, maximum, and sum are generated on a sliding window of three based on the defined settings. Each row has a new calculated feature, in the case of date January 4, 2017 maximum, minimum, and sum values are calculated using temp values for January 1, 2017, January 2, 2017, and January 3, 2017. This window of “three” shifts along to populate data for the remaining rows.

Generating and using these additional features as extra contextual data helps with the accuracy of the trained model. This is all possible by adding a few parameters to your experiment settings.

Holiday features

For many time series scenarios, holidays have a strong influence on how the modeled system behaves. The time before, during, and after a holiday can modify the series’ patterns, especially in scenarios such as sales and product demand. Automated ML will create additional features as input for model training on daily datasets. Each holiday generates a window over your existing dataset that the learner can assign an effect to. With this update, we will support over 2000 holidays in over 110 countries. To use this feature, simply pass the country code as a part of the time series settings. The example below shows input data in the left table and the right table shows updated dataset with holiday featurization applied. Additional features or columns are generated that add more context when models are trained for improved accuracy.

Get started with time-series forecasting in automated ML

With these new capabilities automated ML increases support more complex forecasting scenarios, provides more control to configure training data using lags and window aggregation and improves accuracy with new holiday featurization and ROCV. Azure Machine Learning aims to enable data scientists of all skill levels to use powerful machine learning technology that simplifies their processes and reduces the time spent training models.

Thursday, 23 May 2019

Artificial Intelligence Azure Machine Learning

Visual interface for Azure Machine Learning service

14:54 By Kristen Waston 3 Comment

Artificial Intelligence, Azure Machine Learning Service, Azure Study Materials, Azure Guides

This new drag-and-drop workflow capability in Azure Machine Learning service simplifies the process of building, testing, and deploying machine learning models for customers who prefer a visual experience to a coding experience. This capability brings the familiarity of what we already provide in our popular Azure Machine Learning Studio with significant improvements to ease the user experience.

Visual interface

The Azure Machine Learning visual interface is designed for simplicity and productivity. The drag-and-drop experience is tailored for:

◈ Data scientists who are more familiar with visual tools than coding.

◈ Users who are new to machine learning and want to learn it in an intuitive way.

◈ Machine learning experts who are interested in rapid prototyping.

It offers a rich set of modules covering data preparation, feature engineering, training algorithms, and model evaluation. Another great aspect of this new capability is that it is completely web-based with no software installation required. All of this to say, users of all experience levels can now view and work on their data in a more consumable and easy-to-use manner.

Scalable Training

One of the biggest challenges data scientists previously faced when training data sets was the cumbersome limitations to scaling. If you were to start by training on a smaller model and then had a need to expand it due to an influx of data, or complex algorithms you would be required to migrate your entire data set to continue your training. With the new visual interface for Azure Machine Learning we’ve replaced the back end to reduce these limitations.

An experiment authored in the drag-and-drop experience can run on any Azure Machine Learning Compute cluster. As your training scales up on larger data sets or complex models, the Azure Machine Learning compute can autoscale from single node to multi node each time an experiment is submitted to run. With autoscaling you can now start with small models and not worry about expanding your production work with bigger data. By removing scaling limitations, data scientists now can focus on their training work.

Easy deployment

Deploying a trained model to a production environment previously required knowledge of coding, model management, container service, and web service testing. We wanted to provide an easier solution to this challenge so that these skills are no longer necessary. With the new visual interface, customers of all experience levels can deploy a trained model with just a few clicks. We will discuss how to launch this interface later in this blog.

Once a model is deployed, you can test the web service immediately from this new user visual interface. Now you can test to make sure your models are correctly deployed. All web service inputs are now pre-populated for convenience. The web service API and sample code are also automatically generated. These procedures normally used to take hours to perform, but now with the new visual interface it can all happen within just a few clicks.

Full integration of Azure Machine Learning service

As the newest capability of Azure Machine Learning service, the visual interface brings the best of Azure Machine Learning service and Machine Learning Studio together. The assets created in this new visual interface experience can be used and managed in the Azure Machine Learning service workspace. These include experiments, compute, models, images, and deployments. It also natively inherits the capabilities like run history, versioning, and security of Azure Machine Learning service.

How to use

See for yourself just how easy it is to use this interface with just a few clicks. To access this new capability, open your Azure Machine Learning workspace in the Azure portal. In your workspace, select visual interface (preview) to launch the visual interface.

Saturday, 23 February 2019

Artificial Intelligence Azure Data Factory Azure Machine Learning Data Science

PyTorch on Azure: Deep learning in the oil and gas industry

14:15 By Kristen Waston 0 Comment

Drilling for oil and gas is one of the most dangerous jobs on Earth. Workers are exposed to the risk of events ranging from small equipment malfunctions to entire off shore rigs catching on fire. Fortunately, the application of deep learning in predictive asset maintenance can help prevent natural and human made catastrophes.

Azure Certification, Azure Tutorial and Material, Azure Guides, Azure Learning

We have more information than ever on our equipment thanks to sensors and IoT devices, but we are still working on ways to process the data so it is valuable for preventing these catastrophic events. That’s where deep learning comes in. Data from multiple sources can be used to train a predictive model that helps oil and gas companies predict imminent disasters, enabling them to follow a proactive approach.

Using the PyTorch deep learning framework on Microsoft Azure, Accenture helped a major oil and gas company implement such a predictive asset maintenance solution. This solution will go a long way in protecting their staff and the environment.

What is predictive asset maintenance?

Predictive asset maintenance is a core element of the digital transformation of chemical plants. It is enabled by an abundance of cost-effective sensors, increased data processing, automation capabilities, and advances in predictive analytics. It involves converting information from both real-time and historical data into simple, accessible, and actionable insights. This is in order to enable the early detection and elimination of defects that would otherwise lead to malfunction. For example, by simply detecting an early defect in a seal that connects the pipes, we can prevent a potential failure that can result in a catastrophic collapse of the whole gas turbine.

Under the hood, predictive asset maintenance combines condition-based monitoring technologies, statistical process control, and equipment performance analysis to enable data from disparate sources across the plant to be visualized clearly and intuitively. This allows operations and equipment to be better monitored, processes to be optimized, better controlled, and energy management to be improved.

It is worth noting that the predictive analytics at the heart of this process do not tell the plant operators what will happen in the future with complete certainty. Instead, they forecast what is likely to happen in the future with an acceptable level of reliability. It can also provide “what-if” scenarios and an assessment of risks and opportunities.

Figure 1 – Asset maintenance maturity matrix (Source: Accenture)

The challenge with oil and gas

Event prediction is one of the key elements in predictive asset maintenance. For most prediction problems there are enough examples of each pattern to create a model to identify them. Unfortunately, in certain industries like oil and gas where everything is geared towards avoiding failure, the sought-after examples of failure patterns are rare. This means that most standard modelling approaches either perform no better than experienced humans or fail to work at all.

Accenture’s solution with PyTorch and Azure

Although there only exists a small number of failure examples, there exists a wealth of times series and inspection data that can be leveraged.

Figure 2 – Approach for Predictive Maintenance (Source : Accenture)

After preparing the data in stage one, a two-phase deep learning solution was built with PyTorch in stage two. First, a recurrent neural network (RNN) was trained in combination with a long short-term memory (LSTM) architecture which is phase one of stage two. The neural network architecture used in the solution was inspired by Koprinkova-Hristova et al 2011 and Aydin and Guldamlasioglu 2017. This RNN timeseries model forecasts important variables, such as the temperature of an important seal. These forecasts are then fed into a classifier algorithm (random forest) to identify the variable is outside of the safe range and if so, the algorithm produces a ranking of potential causes which experts can examine and address. This effectively enables experts to address the root causes of potential disasters before they occur.

The following is a diagram of the system that was used for training and execution of the solution:

Figure 3 - System Architecture

The architecture above was chosen to ensure the customer requirement of maximum flexibility in modeling, training, and in the execution of complex machine learning workflows are using Microsoft Azure. At the time of implementation, the services that fit these requirements were HDInsights and Data Science Virtual Machines (DSVM). If the project was implemented today, Azure Machine Learning service would have been used for training/inferencing with HDInsights or Azure Databricks for data processing.

PyTorch was used due to the extreme flexibility in designing the computational execution graphs, and not being bound into a static computation execution graph like in other deep learning frameworks. Another important benefit of PyTorch is that standard python control flow can be used and models can be different for every sample. For example, tree-shaped RNNs can be created without much effort. PyTorch also enables the use of Python debugging tools, so programs can be stopped at any point for inspection of variables, gradients, and more. This flexibility was very beneficial during training and tuning cycles.

The optimized PyTorch solution resulted in faster training time by over 20 percent compared to other deep learning frameworks along with 12 percent faster inferencing. These improvements were crucial in the time critical environment that team was working in. Please note, that the version tested was PyTorch 0.3.

Overview of benefits of using PyTorch in this project:

◈ Training time

◈ Reduction in average training time by 22 percent using PyTorch on the outlined Azure architecture.

◈ Debugging/bug fixing

◈ The dynamic computational execution graph in combination with Python standard features reduced the overall development time by 10 percent.

◈ Visualization

◈ The direct integration into Power BI enabled a high end-user acceptance from day one.

◈ Experience using distributed training

◈ The dynamic computational execution graph in combination with flow control allowed us to create a simple distributed training model and gain significant improvements in overall training time.

How did Accenture operationalize the final model?

Scalability and operationalization were key design considerations from day one of the project, as the customer wanted to scale out the prototype to several other assets across the fleet. As a result, all components within the system architecture were chosen with those as criteria. In addition, the customer wanted to have the ability to add more data sources using Azure Data Factory. Azure Machine Learning service and its model management capability were used to operationalize the final model. The following diagram illustrates the deployment workflow used.

Figure 4 – Deployment workflow

The deployment model was also integrated into a Continuous Integration/Continuous Delivery (CI/CD) workflow as depicted below.

PyTorch on Azure: Better together

The combination of Azure AI offerings with the capabilities of PyTorch proved to be a very efficient way to train and rapidly iterate on the deep learning architectures used for the project. These choices yielded a significant reduction in training time and increased productivity for data scientists.

Azure is committed to bringing enterprise-grade AI advances to developers using any language, any framework, and any development tool. Customers can easily integrate Azure AI offerings into any part of their machine learning lifecycles to productionize their projects at scale, without getting locked into any one tool or platform.

Saturday, 9 December 2023

Mapping mitigations and evaluations to the LLMOps lifecycle

Ideating and exploring loop: Add model layer and safety system mitigations

Building and augmenting loop: Add metaprompt and grounding mitigations

Operationalizing loop: Add monitoring and UX design mitigations

Looking ahead with Azure AI

Thursday, 1 June 2023

Bring your data to life with generative AI

Run and scale AI-powered, intelligent apps on Azure

Integrated, AI-based tools to help developers thrive

Enable secure and trusted experiences in the era of AI

Let’s write this history, together

Thursday, 25 May 2023

Climate impacts raise stakes for Net Zero transition

Maximizing wind energy production with digital twins

Leveraging digital twins to power Net Zero transition

Powering the Next Era of Industrial Digitalization

Key takeaways about a Net Zero economy and digital twins

Saturday, 19 November 2022

Azure high-performance computing provides scalable solutions

Cloud infrastructure purpose-built for AI

Retail and AI

Manufacturing

Saturday, 8 June 2019

Expanded forecast function

Rolling-origin cross validation

Lags and rolling window aggregates

Holiday features

Get started with time-series forecasting in automated ML

Thursday, 23 May 2019

Visual interface

Scalable Training

Easy deployment

Full integration of Azure Machine Learning service

How to use

Saturday, 23 February 2019

What is predictive asset maintenance?

How did Accenture operationalize the final model?

PyTorch on Azure: Better together

Blog Archive

Labels

Popular posts

Subscribe To

Total Pageviews

Translate