Sunday, 31 December 2017

Azure Media Services announces support for AAD and deprecation of ACS authentication

This month we are announcing the release of support for Azure Active Directory (AAD) authentication in Azure Media Services. Customers of our REST API and .NET client libraries can now use AAD authentication to authorize requests. In addition, we are releasing a new management blade in the Azure Portal to simplify the usage of User and Service Principal authentication with AAD.

With the release of this update to our REST API, we are now able to provide the same role-based access control (RBAC) as provided by the Azure Resource Manager (ARM) service. By moving to AAD authentication you will also be able to track and audit all changes made by specific users or by an application connected to your Media Services account. The new Azure Media REST API requires that the user or application making REST API requests have either Contributor or Owner level access to the resources it is attempting to manage. More details on how role-based access control works for Azure resources are available at Azure Role-based Access Control.

12-month deprecation notice of ACS authentication support in Azure Media Services

Because Azure Active Directory provides powerful role-based access control features and supports more fine-grained access to resources in your account compared to the ACS token authentication model ("account keys"), we strongly recommend that you update your code and migrate from ACS to AAD-based authentication by June 22, 2018. A key reason for the rapid migration is the upcoming deprecation of the ACS key-based authentication system.

What does this mean for you?


◉ Microsoft Azure Media Services will end support for Microsoft Azure Access Control Service (ACS)-based authentication on June 22, 2018.
◉ To provide customers sufficient time to update their application code, we are providing 12 months' notice to manage the necessary transition.

What actions should you take?


We recommend that you take the following actions prior to June 22, 2018 to ensure that your applications continue to work as expected:

◉ Update the code for your applications authored for Media Services.
◉ Migrate from ACS-based authentication.
◉ Begin using AAD-based authentication.

Mitigation steps must be taken on or before June 22, 2018 to ensure your applications authored for Media Services using ACS authentication tokens will continue to function as expected without failures in production. Please review each of the new authentication scenarios below closely and take the appropriate action to update to using AAD authentication in your source code.

The Azure Media Services REST API supports authentication for both interactive users and web API, middle-tier, or daemon applications. The following sections provide details on how to use AAD authentication when working directly with the REST API or through the .NET client library.

User Authentication with AAD in Media Services


If you are looking to build a management application for your Azure Media Services account, like the Azure Media Services Explorer tool, you can simply log in with the credentials of a user who has been granted access to the Media Services resource in the portal via the Access Control (IAM) blade. This type of solution is very useful when you want human interaction with the service that fits one of the following scenarios:

◉ Monitoring dashboard for your Encoding jobs
◉ Monitoring dashboard for your Live Streams
◉ Management application for desktop or mobile users to administer resources in a Media Services account.

A native application would first acquire an access token from Azure Active Directory and then use that access token to make all REST API calls. The following diagram shows a typical interactive application authentication flow. For a REST API request to succeed, the calling user must be a “Contributor” or “Owner” of the Azure Media Services account it is trying to access. Unauthorized requests fail with status code 401. If you see this failure, please double check that you have configured your user as a “Contributor” or “Owner” on the Media Services account. You can check this through the Azure portal by searching for your media account and clicking on the “Access control” tab.


Users of the .NET client SDK for Media Services must upgrade to the latest version on NuGet (windowsazure.mediaservices version 4.1.0.1 or greater) to use AAD authentication when communicating with the REST API. The following example shows the differences between the old way of authenticating with the .NET client SDK using ACS and the new way that uses AAD credentials.

NOTE: Applications will also need to update their references to include the new assembly "Microsoft.WindowsAzure.MediaServices.Client.Common.Authentication.dll" and add using statements for that namespace, as well as a reference to the "Microsoft.IdentityModel.Clients.ActiveDirectory" assembly to get access to the ITokenProvider interface.

DEPRECATED method of authenticating using ACS credentials
// Create and cache Media Services credentials in a static class variable.
_cachedCredentials = new MediaServicesCredentials(
            _mediaServicesAccountName,
            _mediaServicesAccountKey, 
            "urn:windowsazuremediaservices",
            "https://wamsprodglobal001acs.accesscontrol.windows.net");
            
// Use the cached credentials to create CloudMediaContext.
var mediaContext = new CloudMediaContext(_cachedCredentials);
mediaContext.Assets.FirstOrDefault();

New method of authenticating using AAD credentials and User authentication

var tokenCredentials = new AzureAdTokenCredentials("{YOUR AAD TENANT DOMAIN HERE}", AzureEnvironments.AzureCloudEnvironment);
var tokenProvider = new AzureAdTokenProvider(tokenCredentials);
var mediaContext = new CloudMediaContext(new Uri("YOUR REST API ENDPOINT HERE"), tokenProvider);
mediaContext.Assets.FirstOrDefault(); // This call returns 401 Unauthorized if you are not set up as an authorized user

The "AzureEnvironments.AzureCloudEnvironment" constant is a helper in the .NET SDK to get the right environment variable settings for a public Azure Data Center. It contains pre-defined environment settings for accessing Media Services in the public data centers only. For sovereign or government cloud regions, you can use the " AzureChinaCloudEnvironment", "AzureUsGovernmentEnvrionment", or "AzureGermanCloudEnvironment" respectively.

Many of the details involved in acquiring an AAD access token have been wrapped and simplified for you in the AzureAdTokenProvider and AzureAdTokenCredentials classes. For example, you do not need to provide the AAD authority, the Media Services resource URI, or the native AAD application details. These are well-known values that are already configured by the AAD access token provider class. If you are not using our .NET client SDK, it is recommended to use the ADAL library to simplify the creation of the access token request using these parameters. The following values are used by default in the AzureAdTokenProvider and AzureAdTokenCredentials classes.

You also have the option of replacing the default implementation of the AzureAdTokenProvider with your own implementation.
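If you are not using the .NET client SDK, the following minimal sketch shows how a native application might acquire a user token with the ADAL library and call the REST API directly. The authority and Media Services resource URI shown are the well-known public-cloud values described above; the client ID, redirect URI, API version, and REST API endpoint placeholders are assumptions you would replace with values from your own AAD application registration and Media Services account.

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Microsoft.IdentityModel.Clients.ActiveDirectory;

class InteractiveRestSample
{
    static async Task Main()
    {
        // Prompt the user to sign in and acquire an access token for the Media Services resource.
        var authContext = new AuthenticationContext("https://login.microsoftonline.com/{YOUR AAD TENANT DOMAIN HERE}");
        var authResult = await authContext.AcquireTokenAsync(
            "https://rest.media.azure.net",                   // Media Services resource URI
            "{YOUR NATIVE APP CLIENT ID HERE}",               // placeholder: your AAD native application
            new Uri("{YOUR REDIRECT URI HERE}"),              // placeholder: your app's redirect URI
            new PlatformParameters(PromptBehavior.Auto));

        // Call the Media Services REST API with the bearer token.
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", authResult.AccessToken);
            client.DefaultRequestHeaders.Add("x-ms-version", "2.17");
            client.DefaultRequestHeaders.Add("DataServiceVersion", "3.0");
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

            // A 401 response here means the signed-in user is not a Contributor or Owner on the account.
            var response = await client.GetAsync("{YOUR REST API ENDPOINT HERE}/Assets");
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}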

AAD Service Principal Authentication in Media Services


For non-human interaction through daemon services, web APIs, consumer (mobile or desktop) applications, or web applications, where interactive login or direct user management/monitoring of resources in the Media Services account is not required, you will first need to create an Azure Active Directory application in your own tenant.

Once it is created, you will have to give this application “Contributor” or “Owner” level access to the Media Services account in the Access Control (IAM) blade. Both steps can easily be done through the Azure Portal, the Azure CLI, or a PowerShell script. Note that for AAD resources, “Contributor” has the same access to the resource as “Owner”, but only the “Owner” role can grant access to other users. Currently this version of the Media Services REST API does not provide RBAC at the entity level, but that is something we have on the roadmap for a future API update in the fall. We have also provided the new "API Access" blade in your Media Services account to make it easy to generate the required application or select from an existing one. If you would like to use x509 certificates instead of a Client ID and Client Key, you can reference the documentation for details on how to configure the SDK.

The following examples show how a daemon application may use AAD web application credentials to authenticate requests with the REST service.


Deprecated way of authenticating using ACS credentials

// Create and cache Media Services credentials in a static class variable.
_cachedCredentials = new MediaServicesCredentials(
            _mediaServicesAccountName,
            _mediaServicesAccountKey, 
            "urn:windowsazuremediaservices",
            "https://wamsprodglobal001acs.accesscontrol.windows.net");
            
// Use the cached credentials to create CloudMediaContext.
var mediaContext = new CloudMediaContext(_cachedCredentials);

New way of authenticating with an AAD Service Principal and client symmetric key

var tokenCredentials = new AzureAdTokenCredentials("{YOUR AAD TENANT DOMAIN HERE}", new AzureAdClientSymmetricKey("{YOUR CLIENT ID HERE}", "{YOUR CLIENT SECRET}"), AzureEnvironments.AzureCloudEnvironment);
var tokenProvider = new AzureAdTokenProvider(tokenCredentials);

var mediaContext = new CloudMediaContext(_mediaServicesApiServerUri, tokenProvider);

mediaContext.Assets.FirstOrDefault();
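If you are not using the .NET client SDK, a service principal can obtain the same access token directly through ADAL's client credentials flow. This is only a sketch: the authority and resource URI are the same well-known public-cloud values used above, and the tenant, client ID, and client secret placeholders come from your own AAD application.

using Microsoft.IdentityModel.Clients.ActiveDirectory;

// Acquire a token for the Media Services resource using the AAD application's credentials.
var authContext = new AuthenticationContext("https://login.microsoftonline.com/{YOUR AAD TENANT DOMAIN HERE}");
var clientCredential = new ClientCredential("{YOUR CLIENT ID HERE}", "{YOUR CLIENT SECRET}");
var authResult = await authContext.AcquireTokenAsync("https://rest.media.azure.net", clientCredential);

// Pass authResult.AccessToken as a Bearer token on REST calls, exactly as in the interactive example earlier.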

Making it easy to get started with the new API Access Blade for Media Services


Azure Active Directory authentication can be complex for users unfamiliar with the details of AAD, so we wanted to make it very easy to get started with very little knowledge of AAD. For that reason, we are introducing a new "API Access" blade for Media Services accounts in the portal that replaces the previous ACS "Account keys" blade. We are also disabling the ability to rotate the ACS keys to prompt users to update their code and move to AAD.


The new API Access blade makes the process of connecting to Azure Media Services with AAD much simpler. When you first select the API Access blade, you will be presented with a choice of using either user authentication for human interactive management applications or creating a Service Principal and AAD application for non-human interaction with the Media Services API.


When selecting the user based authentication option, you will see a new panel that contains all the Active Directory information needed to authenticate with the API. This includes the API endpoint that you need to call, along with the ClientID, Domain, and Resource.


For Service Principal authentication, you will see additional values and the ability to select from an existing AAD Application or create a new one directly in the panel.


When the Service Principal blade opens, it selects the first AAD application that meets the following criteria:

◈ It is a registered AAD application
◈ It has "Contributor" or "Owner" RBAC permissions on the account

After creating or selecting an AAD app, you will be able to create and copy a Key (Client Secret) and copy the Client ID (Application ID), both of which are required to get the access token in this scenario.
In the blade, you can choose to “Create New” AAD application or select an existing one in your subscription. When selecting an existing one, you will see a new blade listing your existing applications to choose from.


Once you select an existing application or create a new one, you will see additional buttons to “Manage Permissions” or “Manage Application”. You can use these settings to open the AAD application management blade directly to perform management tasks such as changing keys or the reply URL, or customizing the application's manifest.

Clicking on the Manage Application button will bring up the AAD application management blade which allows you to create Keys for use with the API using this application.


If you do not have permissions to create AAD apps in your Domain, the AAD app controls of the blade are not shown and a warning message is shown instead.

Next Steps and Actions for Media Services Customers


We are very excited to be making the transition from the older ACS key-based authentication to the more secure, flexible, and role-based Azure Active Directory service. All Azure Media Services customers should begin migrating to the new AAD-based authentication model immediately by downloading the latest .NET SDK or updating their existing REST-based API calls.

In addition, we are working on a new version of our REST APIs with support for more client SDK languages with AAD authentication. More details on that updated API will come in a later blog post.

The key actions you should be taking today:

1. If you are using .NET, update to the latest SDK and migrate to AAD authentication.
2. Plan early for the deprecation of ACS authentication support in Media Services API. The older ACS authentication support will be shutting off officially on June 22, 2018.

Java SDK and Open Source and Community-driven client SDKs


If you are currently using the Java SDK or one of the many community or open source generated client SDKs for Media Services, you have a couple of options at this time. 

Azure Media Services client SDKs for both Java and PHP now support Azure Active Directory (AAD) authentication. To get the latest Java SDK release, see the details in our Java documentation. To download the latest PHP SDK for Media Services, look for version 0.5.7 of the Microsoft/WindowsAzure package in the Packagist repository.

For other open source libraries, since these are not supported directly by the Media Services team, you would need to work with the community SDK developer to prioritize updating the SDK to support AAD for your scenario.

In addition, we are working hard on an updated REST API (v3) that is coming out in 2018 with support for AutoRest generated client SDKs across PHP, Java, Python, and more which will support AAD authentication. We will be following up with more blog posts on migrating to the new v3 API and client SDKs when they are ready for preview.

Friday, 29 December 2017

Hardening Azure Analysis Services with the new firewall capability

Azure Analysis Services (Azure AS) is designed with security in mind and takes advantage of the security features available on the Azure platform. For example, integration with Azure Active Directory (Azure AD) provides a solid foundation for access control. Any user creating, managing, or connecting to an Azure Analysis Services server must have a valid Azure AD user identity. Object-level security within a model enables you to define permissions at the table, row, and column levels. Moreover, Azure AS uses encryption to help safeguard data at rest and in transit within the local data center, across data centers, between data centers and on-premises networks, as well as across public Internet connections. The combination of Transport Layer Security (TLS), Perfect Forward Secrecy (PFS), and RSA-based 2,048-bit encryption keys provides strong protection against would-be eavesdroppers.

However, keeping in mind that Azure Analysis Services is a multi-tenant cloud service, it is important to note that the service accepts network traffic from any client by default. Do not forget to harden your servers by taking advantage of basic firewall support. In the Azure Portal, you can find the firewall settings when you display the properties of your Azure AS server. Click on the Firewall tab, as the following screenshot illustrates. You must be a member of the Analysis Services Admins group to configure the firewall.

Enabling the firewall without providing any client IP address ranges effectively closes the Azure AS server to all inbound traffic—except traffic from the Power BI cloud service. The Power BI service is whitelisted in the default "Firewall on" state, but you can disable this rule if desired. Click Save to apply the changes.


With the firewall enabled, the Azure AS server responds to blocked traffic with a 401 error code. The corresponding error message informs you about the IP address that the client was using. This can be helpful if you want to grant this IP address access to your Azure AS server. This error handling is different from a network firewall in stealth mode not responding to blocked traffic at all. Although the Azure AS firewall does not operate in stealth mode, it enables you to lock down your servers effectively. You can quickly verify the firewall behavior in SQL Server Management Studio (SSMS), as shown in the following screenshot.


You can also discover the client IP address of your workstation in the Azure Portal. On the Firewall page, click on Add client IP to add the current workstation IP address to the list of allowed IP addresses. Please note that the IP address is typically a public address, most likely assigned dynamically at your network access point to the Internet. Your client computer might not always use the same IP address. For this reason, it is usually advantageous to configure an IP range instead of an individual address. See the following table for examples. Note that you must specify addresses in IPv4 format.

Name | Start IP Address | End IP Address | Comments
ClientIPAddress | 192.168.1.1 | 192.168.1.1 | Grants access to exactly one IP address.
ClientIPAddresses | 192.168.1.0 | 192.168.1.254 | Grants access to all IP addresses in the 192.168.1.x subnet.
US East 2 Data Center | 23.100.64.1 | 23.100.71.254 | This is the address range 23.100.64.0/21 from the US East 2 data center.

Besides Power BI and client computers in on-premises networks, you might also want to grant specific Azure-based solutions access to your Azure AS server. For example, you could be using a solution based on Azure Functions to perform automated processing or other actions against Azure AS. If the Azure AS firewall blocks your solution, you will encounter the error message, “System.Net.WebException: The remote server returned an error: (401) Unauthorized.” The following screenshot illustrates the error condition.


In order to grant the Azure App Service access to your Azure AS server, you must determine the IP address that your function app uses. In the properties of your function app, copy the outbound IP addresses (see the following screenshot) and add them to the list of allowed client IP addresses in your firewall rules.


Perhaps you are wondering at this point how to open an Azure AS server to an entire data center. This is slightly more complicated because the Azure data center address ranges are dynamic. You can download an XML file with the list of IP address ranges for all Azure data centers from the Microsoft Download Center. This list is updated on a weekly basis, so make sure you check for updates periodically.

Note that the XML file uses the classless inter-domain routing (CIDR) notation, while the Azure AS Firewall settings expect the ranges to be specified with start and end IP address. To convert the CIDR format into start and end IP addresses, you can use any of the publicly available IP converter tools. Alternatively, you can process the XML file by using Power Query, as the following screenshot illustrates.
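If you prefer to script the conversion yourself instead of using a converter tool or Power Query, the following helper is a minimal sketch (an illustrative utility, not part of any Azure SDK) that turns a CIDR range into the start and end IPv4 addresses expected by the firewall settings.

using System;
using System.Linq;
using System.Net;

static class CidrHelper
{
    // Converts a CIDR range such as "23.100.64.0/21" into its start and end IPv4 addresses.
    public static (string Start, string End) ToRange(string cidr)
    {
        var parts = cidr.Split('/');
        var bytes = IPAddress.Parse(parts[0]).GetAddressBytes();            // network (big-endian) order
        uint address = BitConverter.ToUInt32(bytes.Reverse().ToArray(), 0); // convert to host order
        int prefix = int.Parse(parts[1]);
        uint mask = prefix == 0 ? 0u : uint.MaxValue << (32 - prefix);
        uint start = address & mask;
        uint end = start | ~mask;
        return (ToIpString(start), ToIpString(end));
    }

    private static string ToIpString(uint value) =>
        new IPAddress(BitConverter.GetBytes(value).Reverse().ToArray()).ToString();
}

// Example: CidrHelper.ToRange("23.100.64.0/21") returns ("23.100.64.0", "23.100.71.255"),
// which you would enter as the start and end IP addresses of a firewall rule.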


Download the Excel workbook and make sure you update the XmlFilePath parameter to point to the XML file you downloaded. For your convenience, the workbook includes a column called Firewall Rule Added, which concatenates the data center information into firewall rules as they would be defined in an Azure Resource Manager (ARM) template. The following screenshot shows an ARM template with several rules that grant IP address ranges from the US East 2 data center access to an Azure AS server.


The ARM template makes it easy to apply a large list of rules programmatically by using Azure PowerShell, the Azure Command Line Interface (CLI), the Azure portal, or the Resource Manager REST API. However, an excessively long list of IP addresses is hard to manage. Moreover, the Azure AS firewall must evaluate each rule for every incoming request. For this reason, it is recommended to limit the number of rules to the absolute minimum necessary. For example, avoid adding approximately 3,500 rules for all IP ranges across all Azure data centers. Even if you limit the rules to your server’s local data center, there may still be more than 400 subnets. As a best practice, build your Azure AS business solutions using technologies that support static IP addresses, or at least a small set of dynamic IP addresses, as is the case with the Azure App Service. The smaller the surface area, the more effective the hardening of your Azure AS server.

Wednesday, 27 December 2017

Azure HDInsight Performance Benchmarking: Interactive Query, Spark and Presto

Fast SQL query processing at scale is often a key consideration for our customers. In this blog post we compare HDInsight Interactive Query, Spark, and Presto using the industry standard TPCDS benchmarks. These benchmarks are run using out of the box default HDInsight configurations, with no special optimizations.

Summary of the results


◉ HDInsight Interactive Query is faster than Spark.
◉ HDInsight Spark is faster than Presto.
◉ Text caching in Interactive Query, without converting data to ORC or Parquet, is equivalent to warm Spark performance.
◉ Interactive Query is the most suitable engine for large-scale data, as it was the only engine that could run all 99 TPCDS queries without any modifications at 100 TB scale.
◉ Interactive Query performs well with high concurrency.

About TPCDS


The TPC Benchmark DS (TPC-DS) is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. According to TPCDS, the benchmark provides a representative evaluation of performance as a general purpose decision support system. A benchmark result measures query response time in single user mode, query throughput in multi-user mode and data maintenance performance for a given hardware, operating system, and data processing system configuration under a controlled, complex, and multi-user decision support workload. The purpose of TPC benchmarks is to provide relevant, objective performance data to industry users. TPC-DS Version 2 enables emerging technologies, such as big data systems, to execute the benchmark. Please note that these are unaudited results.

HDInsight Interactive Query


HDInsight Interactive Query enables you to get super fast query results from your big data with ZERO ETL (Extract Transform & Load).

Interactive Query in HDInsight (Hive on LLAP) leverages intelligent caching, optimizations in the core engines, and Azure-specific optimizations to produce blazing-fast query results on remote cloud storage, such as Azure Blob storage and Azure Data Lake Store.

Comparative performance of Spark, Presto, and LLAP on HDInsight


We conducted these tests using LLAP, Spark, and Presto against TPCDS data stored in a higher-scale Azure Blob storage account*. These storage accounts now provide upwards of a 10x increase in Blob storage account scalability. Over the last few months we have also contributed improvements to the performance of the Windows Azure Storage Driver (WASB), which has helped improve the performance of all HDInsight workloads.

*To get your standard storage accounts to grow past the advertised limits in capacity, ingress/egress, and request rate, make a request through Azure support.

We picked a common external Hive metastore, Azure SQL DB S2, so that the various engines could run against the same data and metadata.

HDInsight configuration


For these tests, we used similarly configured clusters to run LLAP, Spark, and Presto.

Note: Tests were performed using the default out-of-the-box configurations resulting in no optimizations, no special settings, and no query change for any engine. 

The table below shows the 45 queries that ran successfully on all engines. As shown, LLAP was able to run many more queries than Presto or Spark.


As you can see from the above run, LLAP with ORC is faster than all the other engines. An even more interesting observation is that LLAP with text is also very fast, even faster than Spark with the Parquet file format.

Fast analytics on Hadoop have always come with one big catch: they require up-front conversion to a columnar format like ORC or Parquet, which can be time consuming and expensive with on-demand computing. LLAP Dynamic Text Cache converts CSV or JSON data into LLAP’s optimized in-memory format on the fly. Caching is dynamic, so the queries your users run determine what data is cached.

HDInsight Interactive Query (LLAP) architecture

LLAP also utilizes cluster DRAM and SSD to provide better performance. The cache pool is a joint pool made up of cluster DRAM and SSD. To give you an example, with D14 v2 VMs in Azure you can get 112 GB of RAM and 800 GB of local SSD per node, so just a couple of nodes are enough to keep over a terabyte of data in memory for fast query performance.

Text caching in Interactive Query


Text caching in Interactive Query is a very interesting concept which has caused us to think about big data pipelines very differently. Traditionally, after ingesting data in raw form we needed to convert it to an optimized file format such as ORC, Parquet, or Avro, as these file formats ensured users would get good performance while querying the big data. With text caching, raw text and JSON performance is very similar to ORC, which eliminates the need for additional steps in the big data pipeline, resulting in cost savings as well as faster and fresher query results.


Running Interactive Query on 100TB TPCDS data


Looking at the many benchmarks published across the web by different vendors, one thing we noticed was that they focus only on a select set of queries where their respective engine produces the best results. We decided to run all 99 queries at 100 TB scale, and only Interactive Query was able to run them unmodified. 41% of the queries returned in under 30 seconds, and 71% completed in under 2 minutes. This benchmark proves that Interactive Query is fast, has rich SQL support, and scales to much larger data sizes without any special effort.


Concurrency


With the introduction of much-improved fine-grained resource management and preemption, Interactive Query (Hive on LLAP) makes it easier to support concurrent users. With Interactive Query, the only limit to concurrency is cluster resources; the cluster can be scaled to achieve higher and higher levels of concurrency.

We used a number of different concurrency levels to test concurrency performance. For the dataset, we again used the 99 TPCDS queries on 1 TB of data, on a 32-worker-node cluster with max concurrency set to 32.

Test 1: Run all 99 queries, 1 at a time - Concurrency = 1

Test 2: Run all 99 queries, 2 at a time - Concurrency = 2

Test 3: Run all 99 queries, 4 at a time - Concurrency = 4

Test 4: Run all 99 queries, 8 at a time - Concurrency = 8

Test 5: Run all 99 queries, 16 at a time - Concurrency = 16

Test 6: Run all 99 queries, 32 at a time - Concurrency = 32

Test 7: Run all 99 queries, 64 at a time - Concurrency = 64

Results: As outlined in the above results, Interactive Query is a highly optimized engine for running concurrent queries. The longest time to finish the workload was with a single concurrent query.


Comparison with Hive and performance improvements over time


It's important to compare Interactive Query (LLAP) performance with Hive. There has been a ton of work done in the community to make Hive more performant, in addition to the work we have been doing to improve Windows Azure storage driver performance. Back in January 2017, it took 200 minutes to run the TPCDS workload with Hive 1.2; with the storage driver improvements, Hive can now run the benchmark in 137 minutes. With LLAP cached data, the benchmark completes in 49 minutes. These are impressive gains.


Integration with Power BI direct Query, Apache Zeppelin, and other tools


Power BI now allows you to connect directly to your HDInsight Interactive Query cluster to explore and monitor data without requiring a data model as an intermediate cache. This offers interactive exploration of your data and automatically refreshes the visuals without requiring a scheduled refresh. 


HDInsight Interactive Query supports many endpoints. You can also use Apache Zeppelin, Visual Studio, Visual Studio Code, Hive View, and Beeline to run your queries.

Friday, 22 December 2017

Bring your own vocabulary to Microsoft Video Indexer

Self-service customization for speech recognition


Video Indexer (VI) now supports industry and business specific customization for automatic speech recognition (ASR) through integration with the Microsoft Custom Speech Service!

ASR is an important audio analysis feature in Video Indexer. Speech recognition is artificial intelligence at its best, mimicking the human cognitive ability to extract words from audio. In this blog post, we will learn how to customize ASR in VI, to better fit specialized needs.

Before we get into technical details, let’s take inspiration from a situation we have all experienced. Try to recall your first days on a job. You can probably remember feeling flooded with new words, product names, cryptic acronyms, and ways to use them. After some time, however, you could understand all these new words; you adapted yourself to the vocabulary.

ASR systems are great, but when it comes to recognizing a specialized vocabulary, ASR systems are just like humans. They need to adapt. Video Indexer now supports a customization layer for speech recognition, which allows you to teach the ASR engine new words, acronyms, and how they are used in your business context.

How does Automatic Speech Recognition work? Why is customization needed?


Roughly speaking, ASR works with two basic models: an acoustic model and a language model. The acoustic model is responsible for translating the audio signal into phonemes, the parts of words. Based on these phonemes, the system generates guesses about how they can be sequenced into words known to its lexicon. The language model is then used to choose the most reasonable sequence of words from these guesses, based on the probabilities of words occurring one after the other, as learned from large samples of text.

When input speech contains new words, the system cannot propose them as guesses, and they won’t be recognized correctly. For instance, Kubernetes, the name of a container orchestration technology used with Azure Container Service, is a word that we will teach VI to recognize in the example below. In other cases, the words exist, but the language model is not expecting them to appear in a certain context. For example, container service is not a 2-word sequence that a non-specialized language model would score as highly probable.

How does customization work?


Video Indexer lets you customize speech recognition by uploading adaptation text, namely text from the domain whose vocabulary you’d like the engine to adapt to. New words appearing in the adaptation text will now be recognized, assuming default pronunciation, and the language model will learn new probable sequences of words.

An example

Let’s take a video on Azure containers as an example. First, we upload the video to Video Indexer without adaptation. Go to the VI portal, click ‘Upload’, and choose the file from your machine.

After a few minutes, the video on Kubernetes will be indexed. Let us see where adaptation can help. Go 9 minutes and 46 seconds into the video. The word ‘Kubernetes’ is a new, highly specific word that the system does not know, and it is therefore recognized as “communities”.


Here are two other examples. At 00:49, “a VM” was recognized as “IBM”. Again, specific domain vocabulary, this time an acronym. The same happens for “PM” at 00:17, where it is not recognized.


To solve these, and other, issues, we need to apply language adaptation. We will start with a partial solution, which will help us understand the full solution.

Example 1: Partial adaptation – words without context


VI allows you to provide adaptation text that introduces your vocabulary to the speech recognition system. At this point, we will introduce just three lines, each containing a single word: Kubernetes, VM, and PM. The file is available for your review.

Go to the customization settings by clicking on the highlighted icon on the upper-right hand corner of the VI portal, as shown below:


On the next screen, click “add file”, and upload the adaptation file.


Make sure you activate the file as adaptation data.


After the model has been adapted, re-index the file. And… Kubernetes is now recognized!


VM is also recognized, as well as PM at 00:17.


However, there is still room for more adaptation. Manually adding words can only help so much, since we cannot cover all the words, and we would also like the language model to learn from real instances of the vocabulary. This will make use of context, parts of speech, and other cues which can be learned from a larger corpus. In the next example, we will take a more complete approach by adding a decent corpus of real sentences from the domain. 

Example 2: Adapting the language model


Similar to what we have done above, let us now use a few pages of documentation about Azure containers as the adaptation text. We have collected this adaptation text for your review. Below is an example of this style of adaptation data:

To mount an Azure file share as a volume in Azure Container Instances, you need three values: the storage account name, the share name, and the storage access key… The task of automating and managing a large number of containers and how they interact is known as orchestration. Popular container orchestrators include Kubernetes, DC/OS, and Docker Swarm, all of which are available in the Azure Container Service.

We recommend taking a look at the whole file. Let’s see a few examples of the effect. Going back to 09:46, “Orchestrated” became “orchestrator” because of the context in the adaptation text.


Here is another nice example in which highly specific terms become recognizable.

Before adaptation:


After adaptation:


Do’s and don’ts for language model adaptation


The system learns based on probabilities of word combinations, so to learn best:

◉ Give enough real examples of sentences as they would be spoken, hundreds to thousands is a good base.
◉ Put only one sentence per line, not more. Otherwise the system will learn probabilities across sentences.
◉ It is okay to put one word as a sentence to boost the word against others, but the system learns best from full sentences.
◉ When introducing new words or acronyms, give as many examples of usage in full sentences as possible, to give the system as much context as possible.
◉ Try several adaptation options and see how they work for you.

Some patterns to avoid in adaptation data:

◉ Repetition of the exact same sentence multiple times. It does not further boost the probability and may create bias against the rest of the input.
◉ Including uncommon symbols (~, #, @, %, &), as they will be discarded, along with the sentences they appear in.
◉ Providing very large inputs, such as hundreds of thousands of sentences. These will dilute the effect of boosting.

Using the VI language adaptation API


To support adaptation, we have added a new customization tab to the site, and a new web API to manage the adaptation texts, training of the adaptation text, and transcription using adapted models.

With the Api/Partner/LinguisticTrainingData web API you can create, read, update, and delete the adaptation text files. The files are plain *.txt files that contain your adaptation data. For an improved user experience, mainly in the UI, each file belongs to a group. This is especially useful when you want to disable or enable multiple files at once in the UI.

After the adaptation data files are uploaded, we use them to customize the system through the Api/Partner/LinguisticModel APIs, which create a linguistic model based on one or more files. When more than one file is provided, we concatenate the files into a single one. Preparing a customized model can take several minutes, and you must make sure that your model status is "Complete" before using it for indexing.
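As an illustration only, a call sequence against these endpoints from C# might look like the sketch below. The base URL, authentication header, and upload format shown here are assumptions for the preview API rather than values taken from the official reference, so check the API documentation for the exact request shapes.

using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class LinguisticAdaptationSample
{
    static async Task Main()
    {
        var client = new HttpClient { BaseAddress = new Uri("https://videobreakdown.azure-api.net/Breakdowns/") }; // assumed base URL
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "{YOUR API KEY}");                            // assumed header name

        // 1. Upload an adaptation (training data) text file.
        var form = new MultipartFormDataContent
        {
            { new StreamContent(File.OpenRead("containers-adaptation.txt")), "file", "containers-adaptation.txt" }
        };
        var uploadResponse = await client.PostAsync("Api/Partner/LinguisticTrainingData", form);
        Console.WriteLine(await uploadResponse.Content.ReadAsStringAsync());

        // 2. List linguistic models; wait until your model's status is "Complete" before indexing with it.
        var modelsResponse = await client.GetAsync("Api/Partner/LinguisticModel");
        Console.WriteLine(await modelsResponse.Content.ReadAsStringAsync());
    }
}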

The last and most important step is the transcription itself. We have added a new field named “linguisticModel” to the upload API that accepts the ID of a valid customized linguistic model to be used for transcription. When re-indexing, we use the same model ID that was provided in the original indexing.

Important note: There is a slight difference in the user experience between our site and the API. When using our site, we allow enabling/disabling training data files and groups, and we choose the active model during file upload/re-index. When using the API, we disregard the active state and index the videos based on the model ID provided at run time. This difference is intentional, to allow both simplicity in our website and a more robust experience for developers.

Wednesday, 20 December 2017

Azure HDInsight Integration with Azure Log Analytics is now generally available

I am excited to announce the general availability of HDInsight Integration with Azure Log Analytics.

Azure HDInsight is a fully managed cloud service for customers to do analytics at scale using the most popular open-source engines such as Hadoop, Hive/LLAP, Presto, Spark, Kafka, Storm, HBase etc.

Thousands of our customers run their big data analytical applications on HDInsight at global scale. The ability to monitor this infrastructure, detect failures quickly and take quick remedial action is key to ensuring a better customer experience.

Log Analytics is part of Microsoft Azure's overall monitoring solution. Log Analytics helps you monitor cloud and on-premises environments to maintain availability and performance.

Our integration with Log Analytics makes it easier for our customers to operate their big data production workloads in a more effective and simple manner.

Monitor & debug full spectrum of big data open source engines at global scale


Typical big data pipelines utilize multiple open source engines, such as Kafka for ingestion, Spark Streaming or Storm for stream processing, Hive and Spark for ETL, and Interactive Query (LLAP) for blazing-fast querying of big data.

Additionally, these pipelines may be running in different datacenters across the globe.

With the new HDInsight monitoring capabilities, our customers can connect different HDInsight clusters to a Log Analytics workspace and monitor them through a single pane of glass.

Image: Monitoring your global big data deployments with a single pane of glass

Collect logs and metrics from open source analytics engines


Once Azure Log Analytics is enabled on your cluster, you will see important logs and metrics from a number of different open source frameworks, as well as cluster VM-level metrics such as CPU usage, memory utilization, and more. Customers get a full view into their cluster, from one location.

Many of our customers take advantage of the elasticity of the cloud by creating and deleting clusters to minimize their costs. However, they want to retain the job logs and other useful information even after the cluster is terminated. With Azure Log Analytics, customers can retain the job information even after the cluster is deleted.

Below are some of the key metrics and logs collected from your HDInsight clusters.

YARN Resource Manager, YARN applications, Hive, MapReduce, Kafka, Storm, HiveServer2, HiveServer Interactive, Oozie, Spark, Spark executors and drivers, Livy, HBase, Phoenix, Jupyter, LLAP, ZooKeeper, and many more.

Image: Logs & Metrics from various Open Source engines.

Visualize key metrics with solution templates


To make this easier, we have created a number of visualizations so that our customers can understand important metrics. We have published multiple solution templates for you to get started quickly. You can install these solution templates directly from the Azure portal, under Monitoring + Management.

Image: Installing HDInsight solution templates from Azure portal

Once installed, you can visualize the key metrics. In the example below, you can see the dashboard for your Spark clusters.

Image: Spark dashboard

Troubleshoot issues faster


It’s important to be able to detect and troubleshoot issues faster and find the root cause when developing big data applications in Hive, Spark or Kafka.

With the Log Analytics portal, you can:

◉ Write queries to quickly find issues and important data in your logs and metrics
◉ Filter, sort, and group results within a time range
◉ See your data in tabular format or in a chart

Below is an example query that looks at application metrics from a Hive query:

search *
| where Type contains "application_stats_dag_CL" and ClusterName_s contains "testhive02"
| order by TimeGenerated desc

Image: Troubleshooting Hive jobs

Enabling Log Analytics


Log Analytics integration with HDInsight is enabled via the Azure portal, PowerShell, or the Azure SDK.

Enable-AzureRmHDInsightOperationsManagementSuite
        [-Name] <String>
        [-WorkspaceId] <String>
        [-PrimaryKey] <String>
        [-ResourceGroupName <String>]
        [-DefaultProfile <IAzureContextContainer>]
        [-WhatIf]
        [-Confirm]
        [<CommonParameters>]

Image: Enabling Log Analytics from the Azure portal

Sunday, 17 December 2017

General availability of Azure Site Recovery Deployment Planner for VMware and Hyper-V

I am excited to announce the general availability (GA) of the Azure Site Recovery Deployment Planner for VMware and Hyper-V. This tool helps VMware and Hyper-V enterprise customers to understand their on-premises networking requirements, Microsoft Azure compute and storage requirements for successful Azure Site Recovery replication, and test failover or failover of their applications.


Apart from understanding infrastructure requirements, our customers also needed a way to estimate the total disaster recovery (DR) cost to Azure. In this GA release, we have added detailed estimated DR cost to Azure for your environment. You can generate a report with the latest Azure prices based on your subscription, the offer that is associated with your subscription, and the target Azure region for the specified currency. The Deployment Planner report gives you cost for compute, storage, network, and Azure Site Recovery licenses.

Key features of the tool


◉ The Deployment Planner can be run without having to install any Azure Site Recovery components to your on-premises environment.

◉ The tool does not impact the performance of production servers, as no direct connection is made to them. All performance data is collected from the Hyper-V server or VMware vCenter Server/VMware vSphere ESXi Server, which hosts the production virtual machines.

What aspects does the Azure Site Recovery Deployment Planner cover?


As you move from a proof of concept to a production rollout of Azure Site Recovery, we strongly recommend running the Deployment Planner. The tool provides the following details:

Compatibility assessment

◉ A VM eligibility assessment to protect to Azure with Site Recovery

Network bandwidth need vs. RPO assessment

◉ The estimated network bandwidth that's required for delta replication
◉ The throughput that Site Recovery can get from on-premises to Azure
◉ RPO that can be achieved for a given bandwidth
◉ Impact on the desired RPO if lower bandwidth is provisioned

Microsoft Azure infrastructure requirements

◉ The storage type (standard or premium storage account) requirement for each virtual machine
◉ The total number of standard and premium storage accounts to be set up for replication
◉ The storage-account placement for all virtual machines
◉ The number of Azure cores to be set up before test failover or failover on the subscription
◉ The Azure VM-recommended size for each on-premises VM

On-premises infrastructure requirements

◉ The required free storage on each volume of Hyper-V storage for successful initial replication and delta replication
◉ Maximum copy frequency to be set for Hyper-V replication
◉ The required number of Configuration Servers and Process Servers to be deployed on-premises for the VMware to Azure scenario

Initial replication batching guidance

◉ Number of virtual machines that can be replicated to Azure in parallel to complete initial replication

Estimated DR cost to Azure

◉ Estimated total DR cost to Azure: compute, storage, network, and Azure Site Recovery license cost
◉ Detailed cost analysis per virtual machine
◉ Specifies replication cost and the DR-Drill cost

Factoring future growth

◉ All of the above factors are calculated after considering possible future growth of the on-premises workloads due to increased usage

How does the Deployment Planner work?


The Deployment Planner has three main modes of operation:

◉ Profiling
◉ Report generation
◉ Throughput calculation

Profiling

In this mode, you profile all the on-premises servers that you want to protect over a period of a few days, e.g., 30 days. The tool stores various performance counters like read/write IOPS, write IOPS, and data churn, as well as other virtual machine characteristics like the number of cores, the number and size of disks, the number of NICs, etc., by connecting to the Hyper-V server or the VMware vCenter Server/VMware vSphere ESXi server where the virtual machines are hosted.

Report generation

In this mode, the tool uses the profiled data to generate a deployment planning report in Microsoft Excel format. The report has six to eight sheets based on the virtualization type:

◉ On-premises summary
◉ Recommendations
◉ Virtual machine to storage placement
◉ Compatible VMs
◉ Incompatible VMs
◉ On-premises storage requirement (only for Hyper-V)
◉ Initial replication batching (only for Hyper-V)
◉ Cost estimation

By default, the tool takes the 95th percentile of all profiled performance metrics and includes a growth factor of 30%. Both these parameters, percentile calculation and growth factor, are configurable.
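As a purely illustrative example of how those two parameters combine (the Deployment Planner performs this calculation internally; the numbers and variable names below are made up):

using System;

// Illustrative only: apply a 95th-percentile calculation and a 30% growth factor to profiled churn samples.
double[] churnMBps = { 3.8, 4.1, 4.6, 4.9, 5.0, 12.2 };                 // hypothetical profiled data-churn samples
Array.Sort(churnMBps);
double p95 = churnMBps[(int)Math.Ceiling(0.95 * churnMBps.Length) - 1]; // 95th percentile -> 12.2
double planned = p95 * 1.30;                                            // 30% growth factor -> ~15.9
Console.WriteLine($"Plan replication bandwidth for roughly {planned:F1} MB/s of churn.");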


Throughput calculation


In this mode, the tool finds the network throughput that can be achieved from your on-premises environment to Microsoft Azure for replication. This will help you determine what additional bandwidth you need to provision for replication.

With Azure Site Recovery’s promise of full application recovery on Microsoft Azure, thorough deployment planning is critical for disaster recovery. With the Deployment Planner, you can ensure that both brand-new deployments and existing deployments get the best replication experience and application performance when running on Microsoft Azure.

Friday, 15 December 2017

Conversational Bots Deep Dive – What’s new with the General Availability of Azure Bot Service and Language Understanding

Microsoft brings the latest advanced chatbot capabilities to developers' fingertips, allowing them to create apps that see, hear, speak, understand, and interpret users’ needs -- using natural communication styles and methods.

Today, we’re excited to announce we’re making generally available Microsoft Cognitive Services Language Understanding service (LUIS) and Azure Bot Service, two top notch AI services to create digital agents that interact in natural ways and make sense of the surrounding environment.

Think about the possibilities: all developers, regardless of expertise in data science, able to build conversational AI that can enrich and expand the reach of applications to audiences across a myriad of conversational channels. The app will be able to understand natural language, reason about content, and take intelligent actions. Bringing intelligent agents to developers and organizations that do not have expertise in data science is disruptive to the way humans interact with computers in their daily life and the way enterprises run their businesses with their customers and employees.

Through our preview journey over the past two years, we have learned a lot from interacting with thousands of customers undergoing digital transformation. We highlighted some of our customer stories (such as UPS, Equadex, and more) in our general availability announcement. This post covers conversational AI in a nutshell using Azure Bot Service and LUIS, shares what we’ve learned so far, and dives into the new capabilities. We will also show how easy it is to get started building a conversational bot with natural language.

Conversational AI with Azure Bot Service and LUIS


Azure Bot Service provides a scalable, integrated bot development and hosting environment for conversational bots that can reach customers across multiple channels on any device. Bots provide the conversational interface that accepts user input in different modalities, including text, speech, cards, or images. The Azure Bot Service offers a set of fourteen channels to interact with users, including Cortana, Facebook Messenger, Skype, etc. Intelligence is enabled in the Azure Bot Service through the cloud AI services forming the bot brain that understands and reasons about the user input. Based on its understanding of the input, the bot can help the user complete tasks, answer questions, or even chit chat through action handlers. The following diagram summarizes how conversational AI applications are enabled through the Azure Bot Service and the cloud AI services, including Language Understanding, speech recognition, QnA Maker, etc.


Language Understanding (LUIS) is the key part of the bot brain that allows the bot to understand natural language input and reason about it to take the appropriate action. As customization is critical for every business scenario, Language Understanding helps build custom models for your business vertical with little effort and without prior expertise in data science. Designed to identify valuable information in conversations, it interprets user goals (intents) and distills valuable information from sentences (entities), for a high quality, nuanced language model.
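To give a concrete sense of what the service returns, here is a minimal sketch of querying a published LUIS application over HTTP from C#. It assumes the LUIS v2.0 endpoint format in the West US region; the app ID, endpoint key, utterance, and intent names are placeholders for your own application.

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class LuisQuerySample
{
    static async Task Main()
    {
        var appId = "{YOUR LUIS APP ID}";
        var endpointKey = "{YOUR LUIS ENDPOINT KEY}";
        var utterance = "book me a flight to Seattle tomorrow";

        // Assumed v2.0 endpoint format: returns the top-scoring intent and extracted entities as JSON.
        var url = $"https://westus.api.cognitive.microsoft.com/luis/v2.0/apps/{appId}" +
                  $"?subscription-key={endpointKey}&verbose=true&q={WebUtility.UrlEncode(utterance)}";

        using (var client = new HttpClient())
        {
            var json = await client.GetStringAsync(url);
            Console.WriteLine(json); // e.g. a top-scoring intent such as "BookFlight" plus entities like "Seattle"
        }
    }
}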

With the General Availability of Language Understanding and Azure Bot Service, we're also introducing new capabilities to help you achieve more and delight your users:

Language Understanding:
  • With an updated user interface, we’re providing Language Understanding service (LUIS) users more intents and entities than ever: expanding up to 500 intents (task or action identified in the sentence) and 100 entities (relevant information extracted, from the sentence, to complete the task or action associated to the intent) per application.
  • Language Understanding is now available in 7 new regions (South Central US, East US, West US 2, East Asia, North Europe, Brazil South, Australia East) on top of the 5 existing regions (West Europe, West US, East US 2, West Central US, Southeast Asia). This will help customers improve network latency and bandwidth.
  • The Language Understanding service is also supporting more languages for its various features, in addition to English.
    • The prebuilt entities (representing common concepts like numbers, date, time) previously available in English are now available in French and Spanish.
    • Prebuilt domains (off-the-shelf collections of intents and entities grouped by domain that you can directly add and use in your application) are now also available in Chinese.
    • Phrase suggestions that help you customize your LUIS domain vocabulary are available in 7 new languages: Chinese, Spanish, Japanese, French, Portuguese, German, and Italian.
Azure Bot Service:
  • Speed bot development by providing an integrated environment with the Microsoft Bot Framework channels, development tools and hosting solutions.
  • Connect with your audience with no code modifications via our supported channels on the Bot Service; Office 365 Email, GroupMe, Facebook Messenger, Kik, Skype, Slack, Microsoft Teams, Telegram, text/SMS, Twilio, Cortana, Skype for Business – or provide a custom experience in your app or website.
  • Bot Service is now integrated into the Azure portal; easy access to 24x7 support, monitoring capabilities, integrated billing and more in the trusted Azure ecosystem.
  • Now generally available in 9 regions: West US, East US, West Europe, and Southeast Asia, plus new deployments in North Europe, Australia Southeast, Australia East, Brazil South, and East Asia.
  • We are also announcing premium channels, including Web Chat and Direct Line. Premium channels offer unique capabilities over the standard channels:
    • Communicate with your users on your website or in your application instead of sharing that data with public chat services.
    • Open source Web Chat and Direct Line clients enable advanced customization opportunities.
    • 99.9% availability guarantees for premium channels.
Developers can connect to other Azure services to enrich their bots as well as add Cognitive Services to enable your bots to see, hear, interpret, and interact in more human ways. For example, on top of language, the Computer Vision and Face APIs can enable bots to understand images and faces passed to the bot.

Learning through our customer’s experiences


For several years now, Microsoft has been leading the charge into the application of AI to build new intelligent conversational experiences: everything from proprietary solutions built to target a specific audience on a specific chat service to general-purpose APIs that expect the developer to create the rest of the custom solution themselves. We are still at the beginning of this evolution of the conversational application model, but already we have takeaways that are guiding how we think about the future.

Bots are changing how we do business. We are constantly having great discussions with customers who see bots as a key part of their digital transformation as a business. They see the opportunity to enhance their customer support experiences, provide easy access to information, or even expose their business to an audience that might not otherwise visit their website.

Developers need to have choice in technologies. With the growth in popularity of open source technologies, developers want choice of the technology components they use to build solutions.

Great conversational applications are multi-modal. Our customers are building conversational experiences which accomplish multiple tasks. For example, a customer support bot may have a Q&A search function, a support ticket entry function, a guided dialog to diagnose a problem, and an appointment scheduling function that hands off to a human for final confirmation.

AI platforms must scale to the needs of business. Often as not, business scenarios are based on sets of concepts that are being codified into the bot. Developers require the technologies they depend on to scale to the complexity of their business without arbitrary limits getting in the way.

Conversational app platforms need to be reliable and compliant. In the same way that mobile app platforms have needed to provide robust and secure platforms to enable great productivity scenarios, so too will conversational application platforms; they must be certifiably secure, reliable, compliant, and privacy aware. In addition, the platform should make it easy for developers building on it to build compliant solutions as well.

Businesses are global and multi-lingual. Businesses need to talk to customers world-wide 24/7 in their language of choice.

There is art in building a great conversational application. Much in the same way the '80s and '90s cemented what we now think of as common controls for native apps, and the 2000s did the same for web and mobile, the industry is still defining what it means to be a great conversational application.

Key design considerations


Given the learnings we've had, we've anchored our design on the following seven points to shape the Azure Bot Service and Language Understanding (LUIS) capabilities:

Code-first approach: Azure Bot Service is built on top of the BotBuilder SDK V3, which takes a code-first approach to give developers full control over their bots' conversational capabilities. Available for both Node.js and C#, the open source SDKs provide multiple dialog types and conversational orchestration tools to help the developer with tasks like slot filling, dialog management, and card representation.

Different dialog management flavors: developers build bots that range from simple question-and-answer bots to multi-turn solutions that span ten or fifteen turns to complete a task. We provide a rich set of dialog management flavors to cover the different task types a bot developer might wish to expose. You can create bots that use a mix of prompts, form filling, natural language, and your own dialog management system, with the ability to reuse components like prompts, as the sketch below illustrates.
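To make that concrete, here is a minimal sketch of the prompt-and-resume pattern in a BotBuilder V3 C# dialog. The dialog and method names are illustrative only, not part of any template; the sketch simply shows how a built-in prompt hands control back to your own code.

using System;
using System.Threading.Tasks;
using Microsoft.Bot.Builder.Dialogs;
using Microsoft.Bot.Connector;

[Serializable]
public class FeedbackDialog : IDialog<object>
{
    public Task StartAsync(IDialogContext context)
    {
        // Wait for the first message from the user.
        context.Wait(MessageReceivedAsync);
        return Task.CompletedTask;
    }

    private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<IMessageActivity> result)
    {
        await result;

        // Use a built-in confirm prompt, then resume in AfterConfirmAsync.
        PromptDialog.Confirm(context, AfterConfirmAsync, "Did that answer your question?");
    }

    private async Task AfterConfirmAsync(IDialogContext context, IAwaitable<bool> confirmed)
    {
        if (await confirmed)
        {
            await context.PostAsync("Great, glad that helped!");
        }
        else
        {
            await context.PostAsync("Sorry about that. Let me connect you with a person.");
        }

        // Wait for the next message from the user.
        context.Wait(MessageReceivedAsync);
    }
}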

Open bot platform: Building on Azure's commitment to open source technologies, applications using our SDK and LUIS can be deployed on any connected infrastructure and consumed from any device anywhere, targeting your audience on multiple chat channels. This open design allows the offering to be integrated with different deployment platforms, including public cloud or on-premises infrastructure.

Global and multi-lingual: We have put considerable effort into making our services highly available and as close to customers as possible as part of the Azure cloud.  Azure Bot Service and Language Understanding support a growing list of languages for understanding conversations.

Getting started quickly: While bots can be deployed anywhere, with Azure we provide rich connected cloud services for hosting your bot and AI applications with a single click.  The Azure Bot Service and LUIS get you a running bot that can converse with users in a natural way in minutes. Azure Bot Service takes care of provisioning all of the Azure resources you need so that developers can focus on their business logic. LUIS provides customizable pre-built apps and entity dictionaries, such as Calendar, Music, and Devices, so you can build and deploy a solution more quickly. Dictionaries are mined from the collective knowledge of the web and supply billions of entries, helping your model to correctly identify valuable information from user conversations.

Custom models with little effort: as customization is critical for every business scenario, LUIS capitalizes on the philosophy of machine teaching to help non-expert machine learning developers build effective custom language understanding models. While machine learning focuses on creating new algorithms and improving the accuracy of “learners”, the machine teaching discipline focuses on the efficacy of the “teachers”. Machine teaching as a discipline is a paradigm shift that follows and extends principles of software engineering and programming languages. It provides the developer with a set of tools to build machine learning models by transferring the developer domain knowledge to the machine learning algorithms. This contrasts with Machine Learning which is about creating useful models from this knowledge. Developer knowledge is expressed in LUIS through schema (what intents and entities are in the LUIS application) and labeled examples.  It supports a wide variety of techniques for reliably recognizing entities with normalization to allow them to be easily consumed in a program.
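As a rough illustration of how those normalized entities surface to the developer, the sketch below walks the entities in a BotBuilder V3 LuisResult (the same type used in the tutorial later in this post) and prints each one's raw text, type, and any normalized values exposed through its Resolution dictionary. The helper class name is hypothetical.

using System.Text;
using Microsoft.Bot.Builder.Luis.Models;

public static class EntityInspector
{
    // Builds a readable summary of the entities LUIS recognized, including any
    // normalized values exposed through each entity's Resolution dictionary.
    public static string Summarize(LuisResult result)
    {
        var summary = new StringBuilder();
        foreach (EntityRecommendation entity in result.Entities)
        {
            summary.AppendLine($"Entity '{entity.Entity}' of type '{entity.Type}' (score: {entity.Score})");
            if (entity.Resolution != null)
            {
                foreach (var pair in entity.Resolution)
                {
                    summary.AppendLine($"  normalized {pair.Key}: {pair.Value}");
                }
            }
        }

        return summary.ToString();
    }
}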

Always monitor, learn and improve: Azure Bot Service and LUIS use Azure monitoring tools to help developers monitor the performance of their bots including the quality of the language understanding models and the bot usage. Once the model starts processing input, LUIS begins active learning, allowing you to constantly update and improve the model. It helps you pick the most informative utterances from your real bot traffic to add to your model and continuously improve. This intelligent selection of examples to add to the training data of the LUIS model helps developers build cost effective models that don’t require a lot of data and yet perform with high accuracy.

Getting started with the Bot Service and Language Understanding


In this section, we’ll create a bot using the Azure Bot Service that uses Language Understanding (LUIS) to understand the user. When creating a bot using natural language, the bot determines what a user wants to do by identifying their intent. This intent is determined from spoken or textual input, or utterances, which in turn can be mapped to actions that bot developers have coded. For example, a note-taking bot recognizes a Note.Create intent to invoke the functionality for creating a note. A bot may also need to extract entities, which are important words in utterances. In the example of a note-taking bot, the Note.Title entity identifies the title of each note.

Create a Language Understanding bot with Bot Service


To create your bot, log in to the Azure portal, select Create new resource in the menu blade, and then select AI + Cognitive Services.


You can browse through the suggestions, or search for Web App Bot.


Once selected, the Bot Service blade should appear, which will be familiar to users of Azure services. For those who aren’t, this is where you specify information about your service for the Bot Service to use in creating your bot, such as where it will live, which subscription it belongs to, and so forth. In the Bot Service blade, provide the required information, and click Create. This creates and deploys the bot service and LUIS app to Azure. Some interesting fields:
  • Set App name to your bot’s name. The name is used as the subdomain when your bot is deployed to the cloud (for example, mynotesbot.azurewebsites.net). This name is also used as the name of the LUIS app associated with your bot. Copy it to use later, to find the LUIS app associated with the bot.
  • Select the subscription, resource group, hosting plan, and location.
  • For pricing, you can choose the free pricing tier. You can go back and change that at any time if you need more.
  • For this sample, select the Language understanding (C#) template for the Bot template field.
  • For the final required field, choose the Azure Storage where you wish to store your bot’s conversation state. Think of this as where the bot keeps track of where each user is in the conversation.


Once the required fields are complete, click Create. Azure will set about creating your bot, including the resources needed to operate it and a LUIS account to host your natural language model. Once complete, you’ll receive a notification via the bell in the top right corner of the Azure portal.
Next up, let’s confirm that the bot service has been deployed.

◉ Click Notifications (the bell icon that is located along the top edge of the Azure portal). The notification will change from Deployment started to Deployment succeeded.
◉ After the notification changes to Deployment succeeded, click Go to resource on that notification.

Try the bot


So now you should have a working bot. Let’s try it out.

Once the bot is registered, click Test in Web Chat to open the Web Chat pane. Type "hello" in Web Chat.


The bot responds by saying "You have reached Greeting. You said: hello". This confirms that the bot has received your message and passed it to a default LUIS app that it created. This default LUIS app detected a Greeting intent.

Note: Occasionally, the first message or two after startup may need to be retried before the bot will answer.

Voilà! You have a working bot! The default bot only knows a few things; it recognizes some greetings, as well as help and cancel. In the next section we’ll modify the LUIS app for our bot to add some new intents for our note-taking bot.

Modify the LUIS app


Log in to www.luis.ai using the same account you use to log in to Azure. Click on My apps. If all has gone well, in the list of apps, you’ll find the app with the same name as the App name from the Bot Service blade when you created the Bot Service.

After opening the app, you should see it has four intents: Cancel, Greeting, Help, and None. The first three we already mentioned. None is a special intent in LUIS that captures “everything else”.

For our sample, we’re going to add two intents for the user: Note.Create and Note.ReadAloud. Conveniently, one of the great features of LUIS is its pre-built domains, which can be used to bootstrap your application, and Note is one of them.
  • Click on Pre-built Domains in the lower left of the page. Find the Note domain and click Add domain.
  • This tutorial doesn't use all the intents included in the Note prebuilt domain. In the Intents page, click on each of the following intent names and then click the Delete Intent button to remove them from your app.
    • Note.ShowNext
    • Note.DeleteNoteItem
    • Note.Confirm
    • Note.Clear
    • Note.CheckOffItem
    • Note.AddToNote
    • Note.Delete
◉ IMPORTANT: The only intents that should remain in the LUIS app are the Note.ReadAloud, Note.Create, None, Help, Greeting, and Cancel intents. If other intents remain, your app will still work, but it may behave inconsistently more often.
As mentioned earlier, the Intents that we’ve now added represent the types of things we expect the user to want the bot to do.  Since these are pre-defined, we don’t have to do any further tuning to the model, so let’s jump right to training and publishing your model.

◈ Click the Train button in the upper right to train your app. Training takes everything you’ve entered into the model (the intents and entities you’ve created, and the utterances you’ve entered and labeled) and generates a machine-learned model, all with one click. You can test your app here in the LUIS portal, or move on to publishing so that it’s available to your bot.

◈ Click PUBLISH in the top navigation bar to open the Publish page. Click the Publish to production slot button. After a successful publish, copy the URL displayed in the Endpoint column of the Publish App page, in the row whose Resource Name is Starter_Key. Save this URL to use later in your bot’s code. The URL has a format similar to this example: https://westus.api.cognitive.microsoft.com/luis/v2.0/apps/xxxxxxxxxxxxxxxxx?subscription-key=xxxxxxxxxxxxxx3&timezoneOffset=0&verbose=true&q=

Your Language Understanding application is now ready for your bot. If the user asks to create or read back a note, Language Understanding will identify that and return the correct intent to the bot to be acted on. In the next section we’ll add logic to the bot to handle these intents.
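If you’d like to sanity-check the published endpoint before wiring it into the bot, a small console program like the sketch below (the class name and test utterance are made up; paste in the endpoint URL you copied above) appends an utterance to the URL’s q= parameter and prints the raw JSON response, which includes the top-scoring intent.

using System;
using System.Net.Http;
using System.Threading.Tasks;

class LuisEndpointCheck
{
    // Paste the endpoint URL you copied from the Publish page; it already ends with "&q=".
    private const string EndpointUrl = "<your-endpoint-url>";

    static void Main()
    {
        string json = QueryAsync("create a note called groceries").GetAwaiter().GetResult();

        // The JSON includes topScoringIntent plus any entities LUIS recognized.
        Console.WriteLine(json);
    }

    private static async Task<string> QueryAsync(string utterance)
    {
        using (var client = new HttpClient())
        {
            // The published endpoint is a simple GET; the utterance goes in the q= query parameter.
            return await client.GetStringAsync(EndpointUrl + Uri.EscapeDataString(utterance));
        }
    }
}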

Modify the bot code


The Bot Service is set up to work in a traditional development environment: sync your source code with Git and work in your favorite dev environment. That said, Azure Bot Service also offers the ability to edit right in the portal, which is great for our experiment. Click Build and then click Open online code editor.



First, some preamble. In the code editor, open BasicLuisDialog.cs. It contains the code for handling Cancel, Greeting, Help, and None intents from the LUIS app.
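For orientation, the default handlers in BasicLuisDialog.cs follow roughly the pattern sketched below; the generated template code may differ in its details, but each [LuisIntent] method receives the LuisResult and replies, which is where the earlier "You have reached Greeting" response came from.

// A rough sketch of the default intent handler pattern; your generated template may differ.
[LuisIntent("Greeting")]
public async Task GreetingIntent(IDialogContext context, LuisResult result)
{
    // Echo which intent fired and what the user typed, e.g. "You have reached Greeting. You said: hello".
    await context.PostAsync($"You have reached Greeting. You said: {result.Query}");
    context.Wait(MessageReceived);
}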

Add the following statement:

using System.Collections.Generic;

Create a class for storing notes


Add the following after the BasicLuisDialog constructor:

private readonly Dictionary<string, Note> noteByTitle = new Dictionary<string, Note>();
private Note noteToCreate;
private string currentTitle;

// CONSTANTS
// Name of note title entity
public const string Entity_Note_Title = "Note.Title";
// Default note title
public const string DefaultNoteTitle = "default";

[Serializable]
public sealed class Note : IEquatable<Note>
{
    public string Title { get; set; }
    public string Text { get; set; }

    public override string ToString()
    {
        return $"[{this.Title} : {this.Text}]";
    }

    public bool Equals(Note other)
    {
        return other != null
            && this.Text == other.Text
            && this.Title == other.Title;
    }

    public override bool Equals(object other)
    {
        return Equals(other as Note);
    }

    public override int GetHashCode()
    {
        return this.Title.GetHashCode();
    }
}

Handle the Note.Create intent


To handle the Note.Create intent, add the following code to the BasicLuisDialog class.

[LuisIntent("Note.Create")]

public Task NoteCreateIntent(IDialogContext context, LuisResult result)

{

EntityRecommendation title;

if (!result.TryFindEntity(Entity_Note_Title, out title))

{

// Prompt the user for a note title

PromptDialog.Text(context, After_TitlePrompt, "What is the title of the note you want to create?");

}

else

{

var note = new Note() { Title = title.Entity };

noteToCreate = this.noteByTitle[note.Title] = note;

// Prompt the user for what they want to say in the note

PromptDialog.Text(context, After_TextPrompt, "What do you want to say in your note?");

}

return Task.CompletedTask;

}

private async Task After_TitlePrompt(IDialogContext context, IAwaitable<string> result)

{

EntityRecommendation title;

// Set the title (used for creation, deletion, and reading)

currentTitle = await result;

if (currentTitle != null)

{

title = new EntityRecommendation(type: Entity_Note_Title) { Entity = currentTitle };

}

else

{

// Use the default note title

title = new EntityRecommendation(type: Entity_Note_Title) { Entity = DefaultNoteTitle };

}

// Create a new note object

var note = new Note() { Title = title.Entity };

// Add the new note to the list of notes and also save it in order to add text to it later

noteToCreate = this.noteByTitle[note.Title] = note;

// Prompt the user for what they want to say in the note

PromptDialog.Text(context, After_TextPrompt, "What do you want to say in your note?");

}

private async Task After_TextPrompt(IDialogContext context, IAwaitable<string> result)

{

// Set the text of the note

noteToCreate.Text = await result;

await context.PostAsync($"Created note **{this.noteToCreate.Title}** that says \"{this.noteToCreate.Text}\".");

context.Wait(MessageReceived);

}

Handle the Note.ReadAloud Intent


The bot can use the Note.ReadAloud intent to show the contents of a note, or of all the notes if the note title isn't detected. Paste the following code into the BasicLuisDialog class.

[LuisIntent("Note.ReadAloud")]

public async Task NoteReadAloudIntent(IDialogContext context, LuisResult result)

{

Note note;

if (TryFindNote(result, out note))

{

await context.PostAsync($"**{note.Title}**: {note.Text}.");

}

else

{

// Print out all the notes if no specific note name was detected

string NoteList = "Here's the list of all notes: \n\n";

foreach (KeyValuePair<string, Note> entry in noteByTitle)

{

Note noteInList = entry.Value;

NoteList += $"**{noteInList.Title}**: {noteInList.Text}.\n\n";

}

await context.PostAsync(NoteList);

}

context.Wait(MessageReceived);

}

public bool TryFindNote(string noteTitle, out Note note)

{

// TryGetValue returns false if no match is found.

bool foundNote = this.noteByTitle.TryGetValue(noteTitle, out note);

return foundNote;

}

public bool TryFindNote(LuisResult result, out Note note)

{

note = null;

string titleToFind;

EntityRecommendation title;

if (result.TryFindEntity(Entity_Note_Title, out title))

{

titleToFind = title.Entity;

}

else

{

titleToFind = DefaultNoteTitle;

}

// TryGetValue returns false if no match is found.

return this.noteByTitle.TryGetValue(titleToFind, out note);

}

Build the bot


Now that the cut-and-paste part is done, right-click build.cmd in the code editor and choose Run from Console. Your bot will be built and deployed from within the online code editor environment.

Test the bot


In the Azure Portal, click on Test in Web Chat to test the bot. Try typing messages like "Create a note" and "read my notes". Because you’re using natural language, you have more flexibility in how you state your request; in turn, Language Understanding’s Active Learning feature lets you open your Language Understanding application and review suggestions about things users said that it didn’t understand, which can make your app more effective.


Tip: If you find that your bot doesn't always recognize the correct intent or entities, improve your Language Understanding app's performance by giving it more example utterances to train it. You can retrain your Language Understanding app without any modification to your bot's code.