Saturday, 23 February 2019

PyTorch on Azure: Deep learning in the oil and gas industry

Drilling for oil and gas is one of the most dangerous jobs on Earth. Workers are exposed to the risk of events ranging from small equipment malfunctions to entire offshore rigs catching on fire. Fortunately, the application of deep learning to predictive asset maintenance can help prevent natural and human-made catastrophes.


We have more information than ever on our equipment thanks to sensors and IoT devices, but we are still working on ways to process the data so it is valuable for preventing these catastrophic events. That’s where deep learning comes in. Data from multiple sources can be used to train a predictive model that helps oil and gas companies predict imminent disasters, enabling them to follow a proactive approach.

Using the PyTorch deep learning framework on Microsoft Azure, Accenture helped a major oil and gas company implement such a predictive asset maintenance solution. This solution will go a long way in protecting their staff and the environment.

What is predictive asset maintenance?


Predictive asset maintenance is a core element of the digital transformation of chemical plants. It is enabled by an abundance of cost-effective sensors, increased data processing, automation capabilities, and advances in predictive analytics. It involves converting information from both real-time and historical data into simple, accessible, and actionable insights that enable the early detection and elimination of defects that would otherwise lead to malfunction. For example, simply detecting an early defect in a seal that connects the pipes can prevent a potential failure that could result in a catastrophic collapse of the whole gas turbine.

Under the hood, predictive asset maintenance combines condition-based monitoring technologies, statistical process control, and equipment performance analysis to enable data from disparate sources across the plant to be visualized clearly and intuitively. This allows operations and equipment to be better monitored, processes to be optimized and better controlled, and energy management to be improved.

It is worth noting that the predictive analytics at the heart of this process do not tell the plant operators what will happen in the future with complete certainty. Instead, they forecast what is likely to happen with an acceptable level of reliability. They can also provide “what-if” scenarios and an assessment of risks and opportunities.


Figure 1 – Asset maintenance maturity matrix (Source: Accenture)

The challenge with oil and gas

Event prediction is one of the key elements in predictive asset maintenance. For most prediction problems there are enough examples of each pattern to create a model to identify them. Unfortunately, in certain industries like oil and gas where everything is geared towards avoiding failure, the sought-after examples of failure patterns are rare. This means that most standard modeling approaches either perform no better than experienced humans or fail to work at all.

Accenture’s solution with PyTorch and Azure

Although only a small number of failure examples exist, there is a wealth of time series and inspection data that can be leveraged.


Figure 2 – Approach for Predictive Maintenance (Source: Accenture)

After preparing the data in stage one, a two-phase deep learning solution was built with PyTorch in stage two. In phase one, a recurrent neural network (RNN) with a long short-term memory (LSTM) architecture was trained; the network architecture was inspired by Koprinkova-Hristova et al. 2011 and Aydin and Guldamlasioglu 2017. This RNN time series model forecasts important variables, such as the temperature of a critical seal. In phase two, these forecasts are fed into a classifier algorithm (a random forest) to identify whether the variable is outside of the safe range; if so, the algorithm produces a ranking of potential causes which experts can examine and address. This effectively enables experts to address the root causes of potential disasters before they occur.
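
To make the two-phase idea concrete, here is a minimal sketch in PyTorch and scikit-learn: an LSTM forecasts the next value of a monitored variable (for example, a seal temperature), and a random forest then classifies whether the situation looks unsafe. The layer sizes, the random stand-in data, and the placeholder labels are illustrative assumptions, not the model Accenture built.

import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

class SealTemperatureForecaster(nn.Module):
    """Phase one: an LSTM that forecasts the next value of a sensor time series."""
    def __init__(self, n_features=8, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # forecast for the next time step

# Toy training loop on random data standing in for real sensor history.
model = SealTemperatureForecaster()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
x, y = torch.randn(256, 48, 8), torch.randn(256, 1)
for _ in range(20):
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()

# Phase two: feed the forecasts to a classifier that flags unsafe conditions
# (and, in the real solution, ranks potential causes). Labels here are placeholders.
with torch.no_grad():
    forecasts = model(x).numpy()
labels = np.random.randint(0, 2, size=len(forecasts))
clf = RandomForestClassifier(n_estimators=100).fit(forecasts, labels)
print(clf.predict(forecasts[:5]))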

The following is a diagram of the system that was used for training and execution of the solution:  


Figure 3 - System Architecture

The architecture above was chosen to meet the customer's requirement of maximum flexibility in the modeling, training, and execution of complex machine learning workflows on Microsoft Azure. At the time of implementation, the services that fit these requirements were HDInsight and Data Science Virtual Machines (DSVM). If the project were implemented today, Azure Machine Learning service would be used for training and inferencing, with HDInsight or Azure Databricks for data processing.

PyTorch was used because of its flexibility in designing computational execution graphs: it is not bound to a static computation graph like some other deep learning frameworks. Another important benefit of PyTorch is that standard Python control flow can be used, and models can be different for every sample. For example, tree-shaped RNNs can be created without much effort. PyTorch also enables the use of Python debugging tools, so programs can be stopped at any point for inspection of variables, gradients, and more. This flexibility was very beneficial during training and tuning cycles.
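
Because the graph is built as Python executes, ordinary control flow and debugging tools work inside the model. The snippet below is a generic illustration of that point (not code from the project): the depth of the forward pass depends on the input, and a breakpoint can be dropped mid-model.

import torch
import torch.nn as nn

class DynamicDepthNet(nn.Module):
    """The number of layer applications is decided at run time, per batch."""
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(16, 16)

    def forward(self, x):
        # Standard Python control flow: the loop count depends on the data itself.
        steps = int(x.abs().mean().item() * 10) % 5 + 1
        for _ in range(steps):
            x = torch.relu(self.layer(x))
        # breakpoint()  # uncomment to inspect x, parameters, or gradients mid-forward
        return x

print(DynamicDepthNet()(torch.randn(4, 16)).shape)  # torch.Size([4, 16])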

The optimized PyTorch solution resulted in training times that were over 20 percent faster than other deep learning frameworks, along with 12 percent faster inferencing. These improvements were crucial in the time-critical environment the team was working in. Please note that the version tested was PyTorch 0.3.

Overview of benefits of using PyTorch in this project:

◈ Training time
     ◈ Reduction in average training time by 22 percent using PyTorch on the outlined Azure architecture.
◈ Debugging/bug fixing
     ◈ The dynamic computational execution graph in combination with Python standard features reduced the overall development time by 10 percent.
◈ Visualization
     ◈ The direct integration into Power BI enabled high end-user acceptance from day one.
◈ Experience using distributed training
     ◈ The dynamic computational execution graph in combination with flow control allowed us to create a simple distributed training model and gain significant improvements in overall training time.
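
As a rough sketch of the distributed-training point above, the example below uses today's torch.distributed API rather than the PyTorch 0.3 interfaces the project actually ran on, with a toy model and random data standing in for the real workload.

import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = DDP(torch.nn.Linear(8, 1))   # gradients are averaged across workers
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(5):
        x, y = torch.randn(32, 8), torch.randn(32, 1)
        opt.zero_grad()
        F.mse_loss(model(x), y).backward()
        opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)  # two CPU workers on one machine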

How did Accenture operationalize the final model?


Scalability and operationalization were key design considerations from day one of the project, as the customer wanted to scale out the prototype to several other assets across the fleet. As a result, all components within the system architecture were chosen with those criteria in mind. In addition, the customer wanted the ability to add more data sources using Azure Data Factory. Azure Machine Learning service and its model management capability were used to operationalize the final model. The following diagram illustrates the deployment workflow used.


Figure 4 – Deployment workflow

The deployment model was also integrated into a Continuous Integration/Continuous Delivery (CI/CD) workflow as depicted below.


PyTorch on Azure: Better together


The combination of Azure AI offerings with the capabilities of PyTorch proved to be a very efficient way to train and rapidly iterate on the deep learning architectures used for the project. These choices yielded a significant reduction in training time and increased productivity for data scientists.

Azure is committed to bringing enterprise-grade AI advances to developers using any language, any framework, and any development tool. Customers can easily integrate Azure AI offerings into any part of their machine learning lifecycles to productionize their projects at scale, without getting locked into any one tool or platform.

Friday, 22 February 2019

Modernize alerting using Azure Resource Manager storage accounts

Classic alerts in Azure Monitor will reach retirement this coming June. We recommend that you migrate the classic alert rules defined on your storage accounts if you want to retain alerting functionality on the new alerting platform. If you have classic alert rules configured on classic storage accounts, you will need to upgrade those accounts to Azure Resource Manager (ARM) storage accounts before you migrate the alert rules.

Identify classic alert rules


You should first find all of your classic alert rules before you migrate. The following screenshot shows how you can identify classic alert rules in the Azure portal. Note that you can filter by subscription, so you can find all classic alert rules without checking each resource separately.


Migrate classic storage accounts to ARM


New alerts do not support classic storage accounts, only ARM storage accounts. If you have classic alert rules configured on a classic storage account, you will need to migrate it to an ARM storage account.

You can use the "Migrate to ARM" option in the menu of your classic storage account to perform the migration. The screenshot below shows an example of this.


Re-create alert rules in new alerting platform


After you have migrated the storage account to ARM, you then need to re-create your alert rules. The new alerting platform supports alerting on ARM storage accounts using new storage metrics. In the storage blade, the menu is named "Alert" for the new alerting platform.

Before you re-create alert rules as a new alert for your storage accounts, you may want to understand the difference between classic metrics and new metrics and how they are mapped. 

The following screenshot shows how to create an alert based on “UsedCapacity.”


Some metrics include dimensions, which allow you to see and use different dimension values. For example, the Transactions metric has a dimension named “ResponseType” whose values represent different types of errors and successes. Using “ResponseType”, you can create an alert to monitor transactions for a particular error such as “ServerBusyError” or “ClientOtherError”.

The following screenshot shows how to create an alert based on Transactions with “ClientOtherError.”


In the list of dimension values, you won't see all supported values by default. You will only see values that have been triggered by actual requests. If you want to monitor conditions that have not occurred yet, you can add a custom dimension value during alert creation. For example, even if you have not had any anonymous requests to your storage account yet, you can still set up alerts in advance to monitor such activity in upcoming requests.

The following screenshot shows how to add a custom dimension value to monitor upcoming anonymous transactions.


We recommend creating the new alert rules first, verifying that they work as intended, and then removing the classic alerts.

Azure Monitor is a unified monitoring service that includes alerting and other monitor capabilities.

Thursday, 21 February 2019

Class schedules on Azure Lab Services

Classroom labs in Azure Lab Services make it easy to set up labs by handling the creation and management of virtual machines and enabling the infrastructure to scale. Through our continuous enhancements to Azure Lab Services, we are proud to share that the latest deployment now includes support for class schedules.

Schedule management is one of the key features requested by our customers. This feature lets teachers easily create, edit, and delete schedules for their classes. A teacher can set up a recurring or one-time schedule and provide a start date, end date, and time for the class in the time zone of their choice. Schedules can be viewed and managed through a simple, easy-to-use calendar view.


Students' virtual machines are turned on and ready to use when a class schedule starts and are turned off at the end of the schedule. This feature helps limit the usage of virtual machines to class times only, thereby helping IT admins and teachers manage costs efficiently.

Schedule hours are not counted against the quota allotted to a student. Quota is the time limit outside of scheduled hours during which a student can use a virtual machine.

With schedules, we are also introducing no quota hours. When no quota hours are set for a lab, students can only use their virtual machines during scheduled hours, or when the teacher turns on virtual machines for them to use.


Students will be able to clearly see when a lab schedule session is in progress on their virtual machines view.

Tuesday, 19 February 2019

Controlling costs in Azure Data Explorer using down-sampling and aggregation

Azure Data Explorer (ADX) is an outstanding service for continuous ingestion and storage of high-velocity telemetry data from cloud services and IoT devices. Leveraging its first-rate performance for querying billions of records, the telemetry data can be further analyzed for various insights such as monitoring service health, production processes, and usage trends. Depending on data velocity and retention policy, data size can rapidly scale to petabytes and increase the costs associated with data storage. A common solution for storing large datasets for a long period of time is to store the data at differing resolutions: the most recent data is stored at maximum resolution, meaning all events are stored in raw format, while the historic data is stored at reduced resolution, filtered and/or aggregated. This approach is often used in time series databases to control hot storage costs.

I’ll use the GitHub events public dataset as the playground. I’ll describe how ADX users can take advantage of stored functions, the “.set-or-append” command, and the Microsoft Flow Azure Kusto connector to create and update tables with filtered, down-sampled, and aggregated data for controlling storage costs. The following are the steps I performed.

Create a function for down-sampling and aggregation


The ADX demo11 cluster contains a database named GitHub. Since 2016, all events from GHArchive have been ingested into the GitHubEvent table and now total more than 1 billion records. Each GitHub event is represented in a single record with event-related information on the repository, author, comments, and more.


Initially, I created the stored function AggregateReposWeeklyActivity which counts the total number of events in every repository for a given week.

.create-or-alter function with (folder = "TimeSeries", docstring = "Aggregate Weekly Repos Activity")
AggregateReposWeeklyActivity(StartTime:datetime)
{
     let PeriodStart = startofweek(StartTime);
     let Period = 7d;
     GithubEvent
     | where CreatedAt between(PeriodStart .. Period)
     | summarize EventCount=count() by RepoName = tostring(Repo.name), StartDate=startofweek(CreatedAt)
     | extend EndDate=endofweek(StartDate)
     | project StartDate, EndDate, RepoName, EventCount
}

I can now use this function to generate a down-sampled dataset of the weekly repository activity. For example, using the AggregateReposWeeklyActivity function for the first week of 2017 results in a dataset of 867,115 records.
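
The same invocation can also be issued programmatically. Below is a sketch using the azure-kusto-data Python package; the package choice, the device-code authentication method, and the exact client calls are assumptions to verify against the current SDK.

from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster = "https://demo11.westus.kusto.windows.net"
kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(cluster)
client = KustoClient(kcsb)

# Invoke the stored function for the first week of 2017 and count the rows it returns.
query = "AggregateReposWeeklyActivity(datetime(2017-01-01)) | summarize Rows = count()"
response = client.execute_query("GitHub", query)
for row in response.primary_results[0]:
    print(row["Rows"])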


Using Kusto query, create a table with historic data


Since the original dataset starts in 2016, I wrote a program that creates a table named ReposWeeklyActivity and backfills it with weekly aggregated data from the GitHubEvent table. The program ingests the weekly aggregated datasets in parallel using the “.set-or-append” command. The first ingestion operation also creates the table that holds the aggregated data.

Code sample:
using Kusto.Data.Common;
using Kusto.Data.Net.Client;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace GitHubProcessing
{
     class Program
     {
         static void Main(string[] args)
         {
             var clusterUrl = "https://demo11.westus.kusto.windows.net:443;Initial Catalog=GitHub;Fed=True";
             using (var queryProvider = KustoClientFactory.CreateCslAdminProvider(clusterUrl))
             {
                // Backfill 137 weeks of history, ingesting up to 8 weekly aggregates in parallel.
                Parallel.For(
                    0,
                    137,
                    new ParallelOptions() { MaxDegreeOfParallelism = 8 },
                    (i) =>
                    {
                        // Weeks start on Sunday, 2016-01-03, advancing seven days per iteration.
                        var startDate = new DateTime(2016, 01, 03, 0, 0, 0, 0, DateTimeKind.Utc) + TimeSpan.FromDays(7 * i);
                        var startDateAsCsl = CslDateTimeLiteral.AsCslString(startDate);

                        // Append the week's aggregate to ReposWeeklyActivity (the first run creates the table).
                        var command = $@"
                        .set-or-append ReposWeeklyActivity <|
                        AggregateReposWeeklyActivity({startDateAsCsl})";
                        queryProvider.ExecuteControlCommand(command);

                        Console.WriteLine($"Finished: start={startDate.ToUniversalTime()}");
                    });
             }
         }
     }
}

Once the backfill is complete, the ReposWeeklyActivity table will contain 153 million records.


Configure weekly aggregation jobs using Microsoft Flow and Azure Kusto connector


Once the ReposWeeklyActivity table is created and filled with the historic data, we want to make sure it stays updated with new data appended every week. For that purpose, I created a flow in Microsoft Flow that leverages the Azure Kusto connector to ingest aggregated data on a weekly basis. The flow consists of two simple steps:

1. Weekly trigger of Microsoft Flow.
2. Use of “.set-or-append” to ingest the aggregated data from the past week.


Start saving


To illustrate the cost-saving potential of down-sampling, I used the “.show table <table name> details” command to compare the size of the original GitHubEvent table and the down-sampled ReposWeeklyActivity table.

.show table GithubEvent details
| project TableName, SizeOnDiskGB=TotalExtentSize/pow(1024,3), TotalRowCount

.show table ReposWeeklyActivity details
| project TableName, SizeOnDiskGB=TotalExtentSize/pow(1024,3), TotalRowCount

The results, summarized in the table below, show that for the same time frame the down-sampled data is approximately 10 times smaller in record count and approximately 180 times smaller in storage size.

                                              Original data              Down-sampled/aggregated data
Time span                                     2016-01-01 … 2018-09-26    2016-01-01 … 2018-09-26
Record count                                  1,048,961,967              153,234,107
Total size on disk (indexed and compressed)   725.2 GB                   4.38 GB


Converting the cost-savings potential into real savings can be done in various ways; a combination of the different methods is usually most effective in controlling costs.

◈ Control cluster size and hot storage costs: Set different caching policies for the original data table and the down-sampled table, for example, 30 days of caching for the original data and two years for the down-sampled table. This configuration lets you enjoy ADX's first-rate performance for interactive exploration of raw data and analyze activity trends over years, all while controlling cluster size and hot storage costs (see the sketch below).

◈ Control cold storage costs: Set different retention policies for the original data table and down-sampled table. For example, 30 days retention for the original data and two years for the down-sampled table. This configuration allows you to explore the raw data and analyze activity trends over years while controlling cold storage costs. On a different note, this configuration is also common for meeting privacy requirements as the raw data might contain user-identifiable information and the aggregated data is usually anonymous.

◈ Use the down-sampled table for analysis: Running queries on the down-sampled table for time series trend analysis will consume fewer CPU and memory resources. In the example below, I compare the resource consumption of a typical query that calculates the total weekly activity across all repositories. The query statistics show that analyzing weekly activity trends on the down-sampled dataset is approximately 17 times more efficient in CPU consumption and approximately eight times more efficient in memory consumption.

Running this query on the original GitHubEvent table consumes approximately 56 seconds of total CPU time and 176 MB of memory.


The same calculation on the aggregated ReposWeeklyActivity table consumes only about three seconds of total CPU time and 16 MB of memory.
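
As a sketch of the first two bullets above, the caching and retention policies can be applied with Kusto control commands, issued here from Python against a hypothetical cluster. The cluster URL, table names, and time spans are illustrative, and the command syntax should be checked against the Kusto policy documentation.

from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster = "https://<your-cluster>.<region>.kusto.windows.net"  # placeholder cluster URL
client = KustoClient(KustoConnectionStringBuilder.with_aad_device_authentication(cluster))

# Keep 30 days of raw data in hot cache, but two years of the aggregated table.
client.execute_mgmt("GitHub", ".alter table GithubEvent policy caching hot = 30d")
client.execute_mgmt("GitHub", ".alter table ReposWeeklyActivity policy caching hot = 730d")

# Likewise, retain raw data for 30 days and the aggregated table for two years.
client.execute_mgmt("GitHub", ".alter-merge table GithubEvent policy retention softdelete = 30d")
client.execute_mgmt("GitHub", ".alter-merge table ReposWeeklyActivity policy retention softdelete = 730d")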


Monday, 18 February 2019

Monitor at scale in Azure Monitor with multi-resource metric alerts

Our customers rely on Azure to run large scale applications and services critical to their business. To run services at scale, you need to set up alerts to proactively detect, notify, and remediate issues before they affect your customers. However, configuring alerts can be hard when you have a complex, dynamic environment with lots of moving parts.

Today, we are excited to release multi-resource support for metric alerts in Azure Monitor to help you set up critical alerts at scale. Metric alerts in Azure Monitor work on a host of multi-dimensional platform and custom metrics, and notify you when the metric breaches a threshold that was either defined by you or detected automatically.

With this new feature, you will be able to set up a single metric alert rule that monitors:

◈ A list of virtual machines in one Azure region
◈ All virtual machines in one or more resource groups in one Azure region
◈ All virtual machines in a subscription in one Azure region

Benefits of using multi-resource metric alerts


◈ Get alerting coverage faster: With a small number of rules, you can monitor all the virtual machines in your subscription. Multi-resource rules set at subscription or resource group level can automatically monitor new virtual machines deployed to the same resource group/subscription (in the same Azure region). Once you have such a rule created, you can deploy hundreds of virtual machines all monitored from day one without any additional effort.

◈ Much smaller number of rules to manage: You no longer need to have a metric alert for every resource that you want to monitor.

◈ You still get resource level notifications: You still get granular notifications per impacted resource, so you always have the information you need to diagnose issues.

◈ Even simpler at-scale experience: Using Dynamic Thresholds along with multi-resource metric alerts, you can monitor each virtual machine without having to manually identify and set thresholds that fit all the selected resources. The dynamic condition type applies tailored thresholds based on advanced machine learning (ML) capabilities that learn each metric's historical behavior and identify patterns and anomalies.

Setting up a multi-resource metric alert rule


When you set up a new metric alert rule in the alert rule creation experience, use the checkboxes to select all the virtual machines you want the rule to be applied to. Please note that all the resources must be in the same Azure region.


You can select one or more resource groups, or select a whole subscription to apply the rule to all virtual machines in the subscription.


If you select all virtual machines in your subscription, or one or more resource groups, you get the option to auto-grow your selection. Selecting this option means the alert rule will automatically monitor any new virtual machines that are deployed to this subscription or resource group. With this option selected, you don’t need to create a new rule or edit an existing rule whenever a new virtual machine is deployed.


You can also use Azure Resource Manager templates to deploy multi-resource metric alerts.

Pricing


The pricing for metric alert rules is based on the number of metric time series monitored by an alert rule. The same pricing applies to multi-resource metric alert rules.

Wrapping up


We are excited about this new capability, which makes configuring and managing metric alert rules at scale easier. This functionality is currently only supported for virtual machines, with support for other resource types coming soon. We would love to hear what you think about it and what improvements we should make.

Sunday, 17 February 2019

Moving your Azure Virtual Machines has never been easier!

To meet customer demand, Azure is continuously expanding. We’ve been adding new Azure regions and introducing new capabilities. As a result, customers want to move their existing virtual machines (VMs) to new regions while adopting the latest capabilities. There are other factors that prompt our customers to relocate their VMs as well; for example, you may want to move VMs to increase availability SLAs.


In this blog, we will walk you through the steps you need to follow to move your VM as is or to increase availability, across regions.

Why do customers want to move their Azure IaaS Virtual Machines?


Some of the most common reasons that prompt our customers to move their virtual machines include:

• Geographical proximity: “I deployed my VM in region A and now region B, which is closer to my end users, has become available.”

• Mergers and acquisitions: “My organization was acquired, and the new management team wants to consolidate resources and subscriptions into one region.”

• Data sovereignty: “My organization is based in the UK with a large local customer base. As a result of Brexit, I need to move my Azure resources from various European regions to the UK in order to comply with local rules and regulations.”

• SLA requirements: “I deployed my VMs in Region A, and I would like to get a higher level of confidence regarding the availability of my services by moving my VMs into Availability Zones (AZ). Region A doesn’t have an AZ at the moment. I want to move my VMs to Region B, which is still within my latency limits and has Availability Zones.”

If you or your organization are going through any of these scenarios or you have a different reason to move your virtual machines, we’ve got you covered!

Move Azure VMs to a target region


For any of the scenarios outlined above, if you want to move your Azure Virtual Machines to a different region with the same configuration as the source region or increase your availability SLAs by moving your virtual machines into an Availability Zone, you can use Azure Site Recovery (ASR). We recommend taking the following steps to ensure a successful transition:

1. Verify prerequisites: To move your VMs to a target region, there are a few prerequisites we recommend you review. This ensures that you have a basic understanding of Azure Site Recovery replication, the components involved, the support matrix, and so on.

2. Prepare the source VMs: This involves verifying the network connectivity of your VMs and the certificates installed on them, identifying the networking layout of your source and dependent components, and so on.

3. Prepare the target region: You should have the necessary permissions to create resources in the target region, including for the resources that are not replicated by Site Recovery. For example, check the permissions for your subscription in the target region, the available quota in the target region, Site Recovery’s support for replication across the source-target regional pair, and the pre-creation of load balancers, network security groups (NSGs), key vaults, and so on.

4. Copy data to the target region: Use Azure Site Recovery replication technology to copy data from the source VM to the target region.

5. Test the configuration: Once the replication is complete, test the configuration by performing a failover test to a non-production network.

6. Perform the move: Once you’re satisfied with the testing and you have verified the configuration, you can initiate the actual move to the target region.

7. Discard the resources in the source region: Clean up the resources in the source region and stop replication of data.


Move your Azure VM ‘as is’


If you intend to retain the same configuration in the target region as in the source region, you can do so with Azure Site Recovery. Your virtual machine availability SLAs will be the same before and after the move: a single instance VM will come back online as a single instance VM, VMs in an Availability Set will be placed into an Availability Set, and VMs in an Availability Zone will be placed into an Availability Zone within the target region.

Move your Azure virtual machines to increase availability


As many of you know, we offer Availability Zones (AZs), a high availability offering that protects your applications and data from datacenter failures. AZs are unique physical locations within an Azure region and are equipped with independent power, cooling, and networking. To ensure resiliency, there’s a minimum of three separate zones in all enabled regions. With AZs, Azure offers 99.99 percent VM uptime SLA.

You can use Azure Site Recovery to move a single instance VM, or VMs in an Availability Set, into an Availability Zone, thereby achieving a 99.99 percent uptime SLA. You can choose to place your single instance VM or VMs in an Availability Set into Availability Zones when you enable replication for your VMs using Azure Site Recovery. Ideally, the VMs in an Availability Set should be spread across Availability Zones. The SLA for availability will be 99.99 percent once you complete the move operation.

Friday, 1 February 2019

QnA Maker simplifies knowledge base management for your Q&A bot

With Microsoft Bot Framework, you can build chatbots and conversational applications in a variety of ways, whether you’re looking to develop a bot from scratch with the open source Bot Framework, create your own branded assistant with the Virtual Assistant solution accelerator, or create a Q&A bot in minutes with QnA Maker. QnA Maker is an easy-to-use, web-based service that makes it simple to power a question-and-answer application or chatbot from semi-structured content like FAQ documents and product manuals. With QnA Maker, developers can build, train, and publish question and answer bots in minutes.

Today, we are excited to announce the launch of a highly requested feature: Active Learning in QnA Maker. Active Learning helps identify and recommend question variations for any question and allows you to add them to your knowledge base. Your knowledge base content won’t change unless you choose to add or edit the suggestions.

How it works


Active Learning is triggered based on the scores of the top N answers returned by QnA Maker for any given query. If the score differences lie within a small range, then the query is considered a possible “suggestion” for each of the possible answers. The exact score-difference threshold is a function of the confidence score of the top answer.

All the suggestions are then clustered together by similarity, and the top suggestions for alternate questions are displayed based on how frequently end users asked the particular queries. Therefore, Active Learning gives the best possible suggestions when the endpoints receive a reasonable quantity and variety of usage queries.
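
The paragraph above can be pictured with a toy sketch. This is an illustration of the general idea only, not QnA Maker's actual scoring or clustering logic: queries whose top answers have near-equal confidence scores are treated as candidate alternate questions, grouped by rough string similarity, and ranked by how often end users asked them.

from collections import Counter, defaultdict
from difflib import SequenceMatcher

# (user query, confidence scores of top answers, sorted in descending order)
query_log = [
    ("how do i reset my password",  [0.62, 0.58, 0.20]),
    ("reset password steps",        [0.61, 0.57, 0.19]),
    ("how to reset password",       [0.60, 0.59, 0.18]),
    ("what are your opening hours", [0.95, 0.30, 0.10]),
]

MARGIN = 0.05  # treat a query as ambiguous if the top two scores are this close

def ambiguous(scores):
    return len(scores) > 1 and (scores[0] - scores[1]) <= MARGIN

# Cluster ambiguous queries by crude string similarity, then rank by frequency.
clusters = defaultdict(list)
for text, scores in query_log:
    if not ambiguous(scores):
        continue
    for key in clusters:
        if SequenceMatcher(None, text, key).ratio() > 0.6:
            clusters[key].append(text)
            break
    else:
        clusters[text].append(text)

for key, members in clusters.items():
    print(f"suggested alternate questions near '{key}':", Counter(members).most_common())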

QnA Maker learns new question variations in two possible ways.

◈ Implicit feedback – The ranker understands when a user question has multiple answers with scores which are very close and considers that as implicit feedback.

◈ Explicit feedback – When multiple answers with little variation in scores are returned from the knowledge base, the client application can ask the user which question is the correct question. When the user selects the correct question, the user's explicit feedback is sent to QnA Maker with the Train API.

Either method provides the ranker with similar queries, which are clustered. QnA Maker then suggests these user-submitted questions to the knowledge base designer to accept or reject.
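
For the explicit-feedback path, the client application sends the user's chosen question to the Train API. Below is a hedged sketch using the Python requests library; the endpoint URL shape, endpoint key, knowledge base ID, and qnaId are placeholders, and the exact request contract should be checked against the QnA Maker documentation for your service version.

import requests

# Placeholder values - substitute your own QnA Maker runtime endpoint and keys.
runtime_endpoint = "https://<your-qnamaker-resource>.azurewebsites.net"
endpoint_key = "<endpoint-key>"
kb_id = "<knowledge-base-id>"

feedback = {
    "feedbackRecords": [
        {
            "userId": "user-123",
            "userQuestion": "how do I reset my password?",  # what the user actually typed
            "qnaId": 42,                                     # the QnA pair the user picked
        }
    ]
}

resp = requests.post(
    f"{runtime_endpoint}/qnamaker/knowledgebases/{kb_id}/train",
    headers={"Authorization": f"EndpointKey {endpoint_key}"},
    json=feedback,
)
resp.raise_for_status()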

How to turn on active learning


By default, Active Learning is disabled. Follow the steps below to enable it.

1. To turn active learning on, go to your Service Settings in the QnA Maker portal, in the top-right corner.


2. Find your QnA Maker service, then toggle Active Learning.


Once Active Learning is enabled, the knowledge base suggests new questions at regular intervals based on user-submitted questions. You can disable Active Learning by toggling the setting again.

How to add Active Learning suggestion to the knowledge base


1. In order to see the suggested questions, on the Edit knowledge base page, select Show Suggestions.


2. Filter the knowledge base with question and answer pairs to only show suggestions by selecting Filter by Suggestions.


3. Each question section with suggestions shows the new questions, with a check mark to accept the question or an x mark to reject the suggestion. Click the check mark to add the question.


You can add or delete all suggestions by selecting Add all or Reject all.

4. Select Save and Train to save the changes to the knowledge base.

To use Active Learning efficiently, the bot should receive a good amount of traffic: the higher the number of end-user queries, the better the quality and quantity of suggestions.

QnA Maker active learning Dialog

The QnA Maker active learning Dialog does the following:

◈ Get the Top N matches from the QnA service for every query above the threshold set.
◈ If the top result confidence score is significantly more than the rest of the results, show only the top answer.
◈ If the top N results have similar confidence scores, prompt the user to choose which question they meant.
◈ Once the user selects the question that matches their intent, show the answer for that question.
◈ This selection also triggers feedback into the QnA Maker service via the Train API.


Migrating knowledge bases from the old preview portal 


You may recall that at //Build in May 2018, we announced the general availability (GA) of QnA Maker with a new architecture built on Azure. As a result, knowledge bases created with the QnA Maker free preview will need to be migrated to QnA Maker GA, as the QnA Maker preview will be deprecated on January 31, 2019.

Below is a screenshot of the old QnA Maker preview portal for reference:


QnA Maker GA highlights:


1. New architecture. The data and runtime components of the QnA Maker stack are hosted in the user’s Azure subscription.

2. No more throttling. Pay for the services you host, instead of per transaction.

3. Data privacy and compliance. The QnA data will be hosted within your Azure compliance boundary.

4. Brand new portal experience to create and manage your knowledge base. 

5. Scale as you go. Scale different parts of the stack according to your needs.