Sunday, 30 December 2018

Conversational AI updates – December 2018

We are thrilled to announce the release of Bot Framework SDK version 4.2, and we want to use this opportunity to provide additional updates on Conversational AI releases from Microsoft.


In the SDK 4.2 release, the team focused on enhancing the monitoring, telemetry, and analytics capabilities of the SDK by improving the integration with Azure App Insights. As with any release, we fixed a number of bugs, continued to improve Language Understanding (LUIS) and QnA integration, and enhanced our engineering practices. There were additional updates across other areas such as language, prompts and dialogs, and connectors and adapters. You can review all the changes that went into 4.2 in the detailed changelog.

Telemetry updates for SDK 4.2

With the SDK 4.2 release, we started improving the built-in monitoring, telemetry, and analytics capabilities provided by the SDK. Our goal is to give developers the ability to understand their overall bot health, detailed reports about the bot’s conversation quality, and tools to understand where conversations fall short. To do that, we decided to further enhance the built-in integration with Microsoft Azure Application Insights. To that end, we have streamlined the integration and the default telemetry emitted from the SDK. This includes waterfall dialog instrumentation, docs, examples for querying data, and a Power BI dashboard.

Bot Framework can use the App Insights telemetry to provide information about how your bot is performing and to track key metrics. For example, once you enable App Insights for your bot, the SDK automatically traces important information for each activity that gets sent to your bot. Essentially, for each activity – for example, a user interacting with your bot by typing an utterance – the SDK emits traces for the different stages of activity processing. These traces can then be placed on a timeline showing each component's latency and performance, as you can see in the following image.

This can help identify slow responses and further optimize your bot performance.

Beyond basic bot performance analysis, we have instrumented the SDK to emit traces for the dialog stack, primarily the waterfall dialog. The following image is a visualization showing the behavior of a waterfall dialog. Specifically, this image shows three events before and after someone completes a dialog, across all sessions. The center “Initial Event” is the starting point, fanning left and right to show before and after, respectively. This is great for showing the drop-off rate (in red) and where most conversations ‘flow’, indicated by the thickness of the line. This view is a default App Insights report; all we had to do was connect the wires between the SDK, dialogs, and App Insights.


The SDK and integration with App Insights provide a lot more capabilities, for example:

◈ Complete activity tracing, including all dependencies.

◈ LUIS telemetry, including non-functional metrics such as latency and error rate, and functional metrics such as intent distribution, intent sentiment, and more.

◈ QnA telemetry, including non-functional metrics such as latency and error rate, and functional metrics such as QnA score and relevance.

◈ Word cloud of common utterances, showing the most used words and phrases – this can help you spot intents or QnA pairs you may have missed.

◈ Conversation length, expressed in terms of time and step count.

◈ Custom reports using your own queries.

◈ Custom logging from your bot.

Solutions


The creation of a high-quality conversational experience requires a foundational set of capabilities. To help customers and partners succeed with building great conversational experiences, we released the enterprise bot template at Microsoft Ignite 2018. This template brings together all the best practices and supporting components we've identified through the building of conversational experiences.


Synchronized with the SDK 4.2 release, we have delivered updates to the enterprise template that provide additional localization for LUIS models and responses, including multi-language dispatcher support for customers who wish to support multiple native languages in one bot deployment. We’ve also replaced the custom telemetry work with the new native SDK support for dialog telemetry, and added a new Conversational Analytics Power BI dashboard providing deep analytics into usage, dialog quality, and more.

The enterprise template is now joined by a retail customer support template, which provides additional LUIS models for this scenario and example dialogs for order management, stock availability, and store location.

The virtual assistant solution accelerator, which enables customers and partners to build their own virtual assistants tailored to their brand and scenarios, has continued to evolve. Ignite was our first crucial milestone for virtual assistant and skills. Work has continued with regular updates to all elements of the overall solution.

We now have full support for six languages, including Chinese, for the virtual assistant and skills. The productivity skills (i.e., calendar, email, and tasks) have updated conversation flows, entity handling, new pre-built domain language models, and work with Microsoft Graph. This release also includes the first automotive capabilities enabling control of car features, along with updates to skills enabling proactive experiences, Speech SDK integration, and experimental skills (restaurant booking and news).

Language Understanding December update


December was a very exciting month for Language Understanding at Microsoft. On December 4, 2018, we announced Docker container support for LUIS in public preview. Hosting the LUIS runtime in containers provides a great set of benefits, including the following (a small sketch of calling a locally hosted container follows the list):

◈ Control over data: Allow customers to use the service with complete control over their data. This is essential for customers that cannot send data to the cloud but need access to the technology. Support consistency in hybrid environments – across data, management, identity, and security.

◈ Control over model updates: Provide customers flexibility in versioning and updating of models deployed in their solutions.

◈ Portable architecture: Enable the creation of a portable application architecture that can be deployed in the cloud, on-premises, and the edge.

◈ High throughput/low latency: Provide customers the ability to scale for high-throughput, low-latency requirements by enabling Cognitive Services to run in Azure Kubernetes Service physically close to their application logic and data.
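For illustration, once the LUIS container is running locally, querying it looks much like querying the hosted endpoint. The sketch below is only an assumption-laden Python example: it presumes the container is listening on localhost:5000 and exposes the standard v2.0 prediction route, so check the container documentation for the exact path and parameters.

# A hedged sketch of querying a locally hosted LUIS container.
import requests

app_id = "<your-luis-app-id>"  # placeholder
endpoint = f"http://localhost:5000/luis/v2.0/apps/{app_id}"  # assumed default route

response = requests.get(endpoint, params={"q": "book me a flight to Cairo"})
response.raise_for_status()
prediction = response.json()
print(prediction.get("topScoringIntent"))  # e.g. {'intent': 'BookFlight', 'score': 0.98}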

LUIS has also expanded its service to seven new regions, completing worldwide availability across all major Azure regions, including the UK, India, Canada, and Japan.

Among the other notable updates is an enhanced training experience, including a reduction in the time required to train an application. The team also released new pre-built entity extractors for people’s names and geographical locations in English and Chinese, and expanded the phone number, URL, and email entities to all languages.

QnA Maker updates

In December, the QnA Maker service released an improvement for its intelligent extraction capabilities. Along with accuracy improvements for existing supported sources, QnA Maker can now extract information from simple “Support” URLs. Read more about extraction and supported data sources in the documentation, “Data sources for QnA Maker content.” QnA Maker also rolled out an improved ranking and scoring algorithm for all English KBs.

The team also released SDKs for the service in .NET, Node.js, Go, and Ruby.

Web chat speech update


We now support the new Cognitive Services Speech to Text and Text to Speech services directly in Web Chat 4.2. This sample is a great place to learn about the new feature and start migrating your bot from Bing Speech to the new Speech Services.

We also added a few samples, including backchannel injection and minimize mode. Backchannel injection demonstrates how to add sideband data to outgoing activities. You can leverage this technique to send browser language and time zone information alongside the messages sent by the user. The minimize mode sample shows how to load Web Chat on demand and overlay it on top of your existing web page.

Thursday, 27 December 2018

Best practices for queries used in log alert rules

Queries can start with a table name such as SecurityEvent or Perf, or with the “search” and “union” operators, which provide a multi-table/multi-column search experience. These operators are useful during data exploration and for searching terms across the entire data model. However, they are not efficient for production use in alerts. Log alert rule queries in Log Analytics and Application Insights should always start with a table to define a clear scope for the query execution, which improves both query performance and the relevance of the results.


Note that using cross-resource queries in log alert rules is not considered inefficient even though the “union” operator is used. The “union” in cross-resource queries is scoped to specific resources and tables, as shown in the following example, while the query scope for “union *” is the entire data model.

union
app('Contoso-app1').requests,
app('Contoso-app2').requests,
workspace('Contoso-workspace1').Perf

After data exploration and query authoring, you may want to create a log alert based on your query. The following examples show how you can modify queries to avoid the “search” and “union *” operators.

Example 1


You want to create a log alert on the following query.

search ObjectName == 'Memory' and (CounterName == '% Committed Bytes In Use' or CounterName == '% Used Memory') and TimeGenerated > ago(5m)
| summarize Avg_Memory_Usage = avg(CounterValue) by Computer
| where Avg_Memory_Usage between(90 .. 95)
| count

To author a valid alert query without using the “search” operator, follow these steps:

1. Identify the table that hosts the properties.

search ObjectName == 'Memory' and (CounterName == '% Committed Bytes In Use' or CounterName == '% Used Memory')
| summarize by $table

The result indicates that these properties belong to the Perf table.


2. Since the properties used in the query are from the Perf table, the query should start with it and scope the query execution to that table.

Perf
| where ObjectName == 'Memory' and (CounterName == '% Committed Bytes In Use' or CounterName == '% Used Memory') and TimeGenerated > ago(5m)
| summarize Avg_Memory_Usage = avg(CounterValue) by Computer
| where Avg_Memory_Usage between(90 .. 95)
| count

Example 2


You want to create a log alert on the following query.

search (ObjectName == 'Processor' and CounterName == '% Idle Time' and InstanceName == '_Total')
| where Computer !in ((union * | where CounterName == '% Processor Utility' | summarize by Computer))
| summarize Avg_Idle_Time = avg(CounterValue) by Computer, CounterPath
| where Avg_Idle_Time < 5
| count

To modify the query, follow these steps:

1. Since the query uses both the “search” and “union *” operators, you need to identify the tables hosting the properties in two stages.

search (ObjectName == 'Processor' and CounterName == '% Idle Time' and InstanceName == '_Total')
| summarize by $table

The properties of the first part of the query belong to the Perf table.


Note that “withsource = table” adds a column designating the name of the table that hosts each record.

union withsource = table * | where CounterName == '% Processor Utility'
| summarize by table

The property in the second part of the query also belongs to the Perf table.


2. Since the properties used in the query are from the Perf table, both the outer and inner queries should start with it to scope the query execution to this table.

Perf
| where ObjectName == 'Processor' and CounterName == '% Idle Time' and InstanceName == '_Total'
| where Computer !in ((Perf | where CounterName == '% Processor Utility' | summarize by Computer))
| summarize Avg_Idle_Time = avg(CounterValue) by Computer, CounterPath
| where Avg_Idle_Time < 5
| count

Wednesday, 19 December 2018

Fine-tune natural language processing models using Azure Machine Learning service

In the natural language processing (NLP) domain, pre-trained language representations have traditionally been a key topic for a few important use cases, such as named entity recognition (Sang and Meulder, 2003), question answering (Rajpurkar et al., 2016), and syntactic parsing (McClosky et al., 2010).

The intuition for utilizing a pre-trained model is simple: a deep neural network trained on a large corpus, say all of Wikipedia, should have enough knowledge about the underlying relationships between different words and sentences. It should also adapt easily to a different domain, such as the medical or financial domain, with better performance than training from scratch.

Recently, the paper introducing BERT (Bidirectional Encoder Representations from Transformers) was published by Devlin et al., achieving new state-of-the-art results on 11 NLP tasks using the pre-training approach mentioned above. In this technical blog post, we want to show how customers can efficiently and easily fine-tune BERT for their custom applications using the Azure Machine Learning service.

Intuition behind BERT

The intuition behind the new language model, BERT, is simple yet powerful. Researchers believe that a deep neural network model that is large enough, with a large enough training corpus, can capture the relationships within that corpus. In the NLP domain, it is hard to get a large annotated corpus, so researchers used a novel technique to generate a lot of training data. Instead of having human beings label the corpus and feed it into neural networks, researchers used large corpora freely available on the Internet – BookCorpus (Zhu, Kiros, et al.) and English Wikipedia (800M and 2,500M words, respectively). Two approaches, each for a different language task, are used to generate the labels for the language model.

◈ Masked language model: To understand the relationships between words. The key idea is to mask some of the words in the sentence (around 15 percent) and use those masked words as labels, forcing the model to learn the relationships between words. For example, the original sentence would be:

The man went to the store. He bought a gallon of milk.

And the input/label pair to the language model is:

Input: The man went to the [MASK1]. He bought a [MASK2] of milk.
Labels: [MASK1] = store; [MASK2] = gallon

◈ Sentence prediction task: To understand the relationships between sentences. This task asks the model to predict whether sentence B is likely to be the next sentence following a given sentence A. Using the same example above, we can generate training data like the following (a small code sketch of both labeling schemes appears after these examples):

Sentence A: The man went to the store.
Sentence B: He bought a gallon of milk.
Label: IsNextSentence
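To make the two labeling schemes concrete, here is a toy Python sketch (not the actual BERT data pipeline) of how such training pairs can be generated from raw sentences:

import random

def mask_tokens(sentence, mask_ratio=0.15):
    # Replace roughly 15 percent of the tokens with [MASK] markers and keep the originals as labels.
    tokens = sentence.split()
    labels = {}
    for i in range(len(tokens)):
        if random.random() < mask_ratio:
            labels["[MASK%d]" % (len(labels) + 1)] = tokens[i]
            tokens[i] = "[MASK%d]" % len(labels)
    return " ".join(tokens), labels

def next_sentence_pair(sent_a, sent_b, corpus, p_positive=0.5):
    # Build a sentence-pair example: half the time the true next sentence, half the time a random one.
    if random.random() < p_positive:
        return sent_a, sent_b, "IsNextSentence"
    return sent_a, random.choice(corpus), "NotNextSentence"

corpus = ["The man went to the store.", "He bought a gallon of milk.",
          "Penguins live in the Antarctic."]
print(mask_tokens(corpus[0]))
print(next_sentence_pair(corpus[0], corpus[1], corpus))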

Applying BERT to customized dataset


After BERT is trained on a large corpus (say all the available English Wikipedia) using the above steps, the assumption is that because the dataset is huge, the model can inherit a lot of knowledge about the English language. The next step is to fine-tune the model on different tasks, hoping the model can adapt to a new domain more quickly. The key idea is to use the large BERT model trained above and add different input/output layers for different types of tasks. For example, you might want to do sentiment analysis for a customer support department. This is a classification problem, so you might need to add an output classification layer (as shown on the left in the figure below) and structure your input. For a different task, say question answering, you might need to use a different input/output layer, where the input is the question and the corresponding paragraph, while the output is the start/end answer span for the question (see the figure on the right). In each case, the way BERT is designed can enable data scientists to plug in different layers easily so BERT can be adapted to different tasks.


Figure 1. Adapting BERT for different tasks (Source)

The image below shows the result for one of the most popular datasets in the NLP field, the Stanford Question Answering Dataset (SQuAD).


Figure 2. Reported BERT performance on SQuAD 1.1 dataset (Source).

Depending on the specific task types, you might need to add very different input/output layer combinations. In the GitHub repository, we demonstrated two tasks, General Language Understanding Evaluation (GLUE) (Wang et al., 2018) and Stanford Question Answering Dataset (SQuAD) (Rajpurkar and Jia et al., 2018).

Using the Azure Machine Learning Service


We are going to demonstrate different experiments on different datasets. In addition to tuning different hyperparameters for various use cases, the Azure Machine Learning service can be used to manage the entire lifecycle of the experiments. The Azure Machine Learning service provides an end-to-end cloud-based machine learning environment, so customers can develop, train, test, deploy, manage, and track machine learning models, as shown below. It also has full support for open-source technologies, such as PyTorch and TensorFlow, which we will be using later.


Figure 3. Azure Machine Learning Service Overview

What is in the notebook


Defining the right model for a specific task

To fine-tune the BERT model, the first step is to define the right input and output layers. In the GLUE example, the task is defined as classification, and the code snippet shows how to create a sentence classification model on top of the BERT pre-trained model:

model = modeling.BertModel(
     config=bert_config,
     is_training=is_training,
     input_ids=input_ids,
     input_mask=input_mask,
     token_type_ids=segment_ids,
     use_one_hot_embeddings=use_one_hot_embeddings)

# Use the pooled [CLS] output as the sentence representation and add a
# classification layer on top of it.
output_layer = model.get_pooled_output()
hidden_size = output_layer.shape[-1].value

output_weights = tf.get_variable(
     "output_weights", [num_labels, hidden_size],
     initializer=tf.truncated_normal_initializer(stddev=0.02))
output_bias = tf.get_variable(
     "output_bias", [num_labels], initializer=tf.zeros_initializer())

# Standard softmax cross-entropy loss over the label set.
logits = tf.matmul(output_layer, output_weights, transpose_b=True)
logits = tf.nn.bias_add(logits, output_bias)
probabilities = tf.nn.softmax(logits, axis=-1)
log_probs = tf.nn.log_softmax(logits, axis=-1)
one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
loss = tf.reduce_mean(per_example_loss)

Set up training environment using Azure Machine Learning service

Depending on the size of the dataset, training the model on the actual data might be time-consuming. Azure Machine Learning Compute provides access to GPUs, either on a single node or on multiple nodes, to accelerate the training process. Creating a cluster with one or more nodes on Azure Machine Learning Compute is very intuitive, as shown below:

from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.train.dnn import PyTorch

# Provision an auto-scaling GPU cluster (0 to 8 nodes).
compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC24s_v3',
                                                        min_nodes=0,
                                                        max_nodes=8)
# create the cluster
gpu_compute_target = ComputeTarget.create(ws, gpu_cluster_name, compute_config)
gpu_compute_target.wait_for_completion(show_output=True)

# Configure the training job; scaling out is just a matter of changing
# node_count and providing a distributed backend.
estimator = PyTorch(source_directory=project_folder,
                    compute_target=gpu_compute_target,
                    script_params={...},
                    entry_script='run_squad.azureml.py',
                    conda_packages=['tensorflow', 'boto3', 'tqdm'],
                    node_count=node_count,
                    process_count_per_node=process_count_per_node,
                    distributed_backend='mpi',
                    use_gpu=True)

Azure Machine Learning greatly simplifies the work involved in setting up and running a distributed training job. As you can see, scaling the job to multiple workers is done by just changing the number of nodes in the configuration and providing a distributed backend. For distributed backends, Azure Machine Learning supports popular frameworks such as TensorFlow Parameter Server as well as MPI with Horovod, and it ties in with Azure hardware such as InfiniBand to connect the different worker nodes and achieve optimal performance. We will have a follow-up blog post on how to use the distributed training capability in the Azure Machine Learning service to fine-tune NLP models.

Hyperparameter tuning

For a given customer’s specific use case, model performance depends heavily on the hyperparameter values selected. Hyperparameters can have a big search space, and exploring each option can be very expensive. The Azure Machine Learning service provides hyperparameter tuning capabilities that can search across various hyperparameter configurations to find one that results in the best performance.

In the provided example, random sampling is used, in which case hyperparameter values are randomly selected from the defined search space. In the example below, we explore the learning rate space from 1e-6 to 1e-4 on a log-uniform scale, so sampled values are spread roughly evenly across orders of magnitude – some values around 1e-4, some around 1e-5, and some around 1e-6.

Customers can also select which metric to optimize. Validation loss, accuracy score, and F1 score are some popular metrics that could be selected for optimization.

from azureml.train.hyperdrive import *
import math

param_sampling = RandomParameterSampling( {
         'learning_rate': loguniform(math.log(1e-4), math.log(1e-6)),
})

hyperdrive_run_config = HyperDriveRunConfig(
     estimator=estimator,
     hyperparameter_sampling=param_sampling,
     primary_metric_name='f1',
     primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
     max_total_runs=16,
     max_concurrent_runs=4)
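Once the run configuration is defined, submitting it and retrieving the best run takes only a few lines. The sketch below assumes the workspace object ws from the compute snippet above and a hypothetical experiment name:

from azureml.core import Experiment
from azureml.widgets import RunDetails

experiment = Experiment(ws, name='bert-finetune')  # hypothetical experiment name
hyperdrive_run = experiment.submit(hyperdrive_run_config)

RunDetails(hyperdrive_run).show()                   # live progress inside a notebook
hyperdrive_run.wait_for_completion(show_output=True)
best_run = hyperdrive_run.get_best_run_by_primary_metric()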

For each experiment, customers can watch the progress for different hyperparameter combinations. For example, the picture below shows the mean loss over time using different hyperparameter combinations. Some of the experiments can be terminated early if the training loss doesn’t meet expectations (like the top red curve).


Figure 4. Mean loss for training data for different runs, as well as early termination

Visualizing the result

Using the Azure Machine Learning service, customers can achieve 85 percent evaluation accuracy when fine-tuning on the MRPC task in the GLUE dataset (this requires 3 epochs for the BERT base model), which is close to the state-of-the-art result. Using multiple GPUs shortens the training time, and using more powerful GPUs (say, V100) improves it further. The details for one specific experiment are as follows:

GPU# (training time per epoch as the number of GPUs increases)
K80 (NC Family):    191 s/epoch | 105 s/epoch | 60 s/epoch
V100 (NCv3 Family):  36 s/epoch |  22 s/epoch | 13 s/epoch

Table 1. Training time per epoch for MRPC in the GLUE dataset

For SQuAD 1.1, customers can achieve around an 88.3 F1 score and an 81.2 Exact Match (EM) score. This requires 2 epochs using the BERT base model, and the time per epoch is shown below:

GPU# (training time per epoch as the number of GPUs increases)
K80 (NC Family):    16,020 s/epoch | 8,820 s/epoch | 4,020 s/epoch
V100 (NCv3 Family):  2,940 s/epoch | 1,393 s/epoch |   735 s/epoch

Table 2. Training time per epoch for the SQuAD dataset

After all the experiments are done, the Azure Machine Learning service SDK also provides a summary visualization of the selected metrics and the corresponding hyperparameter(s). Below is an example of how the learning rate affects validation loss. Throughout the experiments, the learning rate was varied from around 7e-6 (far left) to around 1e-3 (far right), and the best learning rate, with the lowest validation loss, is around 3.1e-4. The same chart can be used to evaluate other metrics that customers want to optimize.


Figure 5. Learning rate versus validation loss

Monday, 17 December 2018

Extracting insights from IoT data using the cold path data flow

This blog continues our coverage of the solution guide published by Microsoft’s Industry Experiences team. The guide covers the following components:

◈ Ingesting data
◈ Hot path processing
◈ Cold path processing
◈ Analytics clients

We already covered the recommendations for processing data for an IoT application in the solution guide and suggested using a Lambda architecture for the data flow. To reiterate the data paths:

◈ A batch layer (cold path) stores all incoming data in its raw form and performs batch processing on the data. The result of this processing is stored as a batch view. It is a slow-processing pipeline, executing complex analysis, combining data from multiple sources over a longer period (such as hours or days), and generating new information such as reports and machine learning models.

◈ A speed layer and a serving layer (warm path) analyze data in real time. This layer is designed for low latency, at the expense of accuracy. It is a faster-processing pipeline that archives and displays incoming messages and analyzes these records, generating short-term critical information and actions such as alarms.

This blog post covers the cold path processing components of the solution guide.


We have covered time series analysis with Azure Time Series Insights (TSI) in detail in the solution guide. It is an analytics, storage, and visualization service for time series data. Please read the relevant section for the use of TSI.

As you may remember from previous blog posts, we are using the sample data published by the NIST SMS Test Bed endpoint. Our previous posts ended with the data pushed to separate Azure Event Hubs for “events” and “samples” data records.

Before we begin the rest of the discussion, we would like to emphasize that the solution to an “analytics” problem depends on each plant, line, machine, and so on. The data must be available and must be what the business needs. We will cover two different approaches for organizing the data, but they are not exhaustive and are meant as examples only.

Storing the raw data

Our sample implementation has a basic set of Azure Stream Analytics (ASA) queries that take the incoming data stream from the Event Hubs that the raw data is posted to and copy it into Azure Storage blobs and tables. As an example, the queries look like the following:

SELECT
     *
INTO
     [samplesTable]
FROM
     [EventHubIn]

One table is for samples and another is for events. As we were flattening the incoming data in the custom component, we added a property for the hour window the incoming data record falls in, using the following C# code snippet, to help us more easily organize the data in the processing pipelines:

// Part of an object initializer: truncate the record's timestamp to the hour.
HourWindow = new DateTime(
    sample.timestamp.Year,
    sample.timestamp.Month,
    sample.timestamp.Day,
    sample.timestamp.Hour,
    0,
    0),

This data record field is especially useful for organizing the records in the Azure Storage table, simply by using it as the partition key. We use the sequence number of the incoming record as the row key. The object model for storage tables is covered in the documentation, “Understanding the Table Service Data Model.”
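As a sketch of what writing such an entity can look like (assuming the azure-cosmosdb-table Python package; the names, values, and connection string below are placeholders):

from azure.cosmosdb.table.tableservice import TableService

table_service = TableService(connection_string="<storage-connection-string>")  # placeholder
entity = {
    "PartitionKey": "2018-12-17T14:00:00",  # HourWindow, truncated to the hour
    "RowKey": "1042",                       # sequence number of the incoming record
    "DeviceName": "Machine08",
    "SampleName": "S2temp",
    "SampleValue": "37.5",
}
table_service.insert_entity("samples", entity)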

The Azure Blob Storage blobs generated by the ASA job are organized into containers for each hour, as a single blob with the data for that hour, in comma-separated values (CSV) format. We will be using these in the future for artificial intelligence (AI) needs.

Loading data into Azure SQL Database

We will cover a basic way to incrementally load the records into an Azure SQL Database, and later discuss potential ways to further process them into new aggregations and summary data.


Our goal is to provide a barebones approach to show how data can flow into data stores and demonstrate the technologies useful for this. Any analytics solution depends heavily on the context and requirements, but we will attempt to provide basic mechanisms to demonstrate the related Azure services.

Azure Data Factory (ADF) is a cloud integration service to compose data storage, movement, and processing services in automated data pipelines. We have a simple ADF pipeline that demonstrates the incremental loading of a table using a storage table as the source.


The pipeline has a lookup activity that performs the following query on the SQL Database:

select
     CONVERT(
         char(30),
         case when  max(SampleTimestamp) is null then '1/1/2010 12:00:00 AM'
             else max(SampleTimestamp) end, 126) as LastLoad
from [Samples]

The style used in the CONVERT function, 126, formats the timestamp value as “yyyy-mm-ddThh:mi:ss.mmm,” which matches the string representation of the partition key value in the storage table. The query returns the timestamp of the last record transferred to the SQL database. We can then pass that value to the next activity, which queries the table storage to retrieve the new records.

Next is a “Copy Data” activity, which uses the “LastLoad” value returned from the lookup activity to build the following table query for the source. Please refer to “Querying Tables and Entities” for details on querying storage tables.

SampleTimestamp gt datetime'@{formatDateTime(activity('LookupSamples').output.FirstRow.LastLoad, 'yyyy-MM-ddThh:mm:ss.fffZ')}'

Later, this activity maps the storage table columns (properties) to SQL Database table columns. This pipeline is scheduled to run every 15 minutes, thus incrementally loading the destination SQL Database table.

Processing examples

Further processing the raw data depends on the actual requirements. This section covers two potential approaches for processing and organizing the data to demonstrate the capabilities.

Let’s first start looking at the data we collect to discover the details. Notice that the raw data on the samples table is in the form of name/value pairs. The first query will give us the different sample types recorded by each machine.

SELECT DeviceName, ComponentName, SampleName, COUNT(SampleSequence) AS SampleCount
FROM Samples
GROUP BY DeviceName, ComponentName, SampleName 
ORDER BY DeviceName ASC, ComponentName ASC, SampleName ASC, SampleCount DESC

We observe there are eight machines, and each one is sending a different set of sample types. Following is a partial result of the preceding query; we analyzed the result a bit further in Microsoft Excel to get an idea of the relative counts of the samples.


We may conclude that the best way to aggregate and summarize the results is first to organize the results by machine — for example, a raw data table per machine.

We will go step by step to demonstrate the concepts here. Some readers will surely find more optimized ways to implement some queries, but our goal here is to provide clear examples that demonstrate the concepts.

We may wish to process the data further by first transposing the raw data, which is in name/value pairs.


We can use the following query to create a new table and transpose whole rows. This query assumes that we do not differentiate any of the components and see the machine as a whole:

; WITH Machine08SamplesTransposed AS
(
    SELECT * FROM
    (
        SELECT SampleTimestamp, sampleName, CAST(sampleValue AS NUMERIC(20,3)) AS sampleValueNumeric
        FROM Samples
        WHERE DeviceName = 'Machine08' AND ISNUMERIC(sampleValue) != 0
    ) AS S
    PIVOT(
        MAX(sampleValueNumeric)
        FOR SampleName IN ([S2temp], [Stemp], [Zabs], [Zfrt], [S2load], [Cfrt],
            [total_time], [Xabs], [Xload], [Fact], [Cload], [cut_time], [Zload],
            [S2rpm], [Srpm], [auto_time], [Cdeg], [Xfrt], [S1load])
    ) AS PivotTable
)

SELECT * INTO Machine08Samples
FROM Machine08SamplesTransposed

We can bring this query into the ADF pipeline by moving it into a stored procedure with a parameter that filters the raw table so that only the most recently loaded rows are brought in, and by changing the “SELECT * INTO …” statement to an “INSERT INTO … SELECT” statement. We recommend relying on stored procedures as much as possible to use SQL database resources efficiently.

The resulting table looks like the following (some columns removed for brevity).


One way to process this interim data set is to fill in the null values of samples from the last received value, as shown below.

We should emphasize that we are not recommending this solution for every business case and every sample value. This approach makes sense for values that are meaningful together. For example, in a certain case, grouping Fact (actual path feed rate) and Zfrt (Z-axis feed rate) may make sense. However, in another case, grouping Xabs (absolute position on the X axis) and Zfrt on one record this way may not make sense. Grouping of sample values must be decided on a case-by-case basis, depending on the business need.


Another way is to put the individual records into time buckets and apply an aggregate function within each bucket.


Let’s give a small example of achieving the first option. In the preceding example, we received V1.1 at t1 and V2.2 at t2. We want to fill in the Sample1 value for t2 with t1’s value, V1.1.

;WITH NonNullRank AS
(
    SELECT SampleTimestamp, S2temp,
        cnt = COUNT(S2temp) OVER (ORDER BY SampleTimestamp)
    FROM Machine08Samples
),
WindowsWithNoValues AS
(
    SELECT SampleTimestamp, S2temp,
        r = ROW_NUMBER() OVER (PARTITION BY cnt ORDER BY SampleTimestamp ASC) - 1
    FROM NonNullRank
)
SELECT SampleTimestamp, S2temp,
    S2tempWithValues = ISNULL(S2temp, LAG(S2temp, r) OVER (ORDER BY SampleTimestamp ASC))
FROM WindowsWithNoValues

When we dissect the preceding queries, the first common table expression (CTE), NonNullRank, gives us the rank of the non-null values of S2temp sample values among the received data records.


The second CTE, WindowsWithNoValues, gives us windows of samples with the received value at the top, and the order of null values within the windows (column r).


The concluding query fills in the null values using the LAG analytic function by bringing in the received value from the top of the window to the current row.


The second option we mentioned previously is to group the received values and apply an aggregate function within the group.

;WITH With30SecondBuckets AS
(
    SELECT *,
        DATEADD(second, (DATEDIFF(second, '2010-1-1', [SampleTimestamp]) / 30) * 30, '2010-1-1')
            AS [SampleTimestamp30Seconds]
    FROM Machine08Samples
)
SELECT SampleTimestamp30Seconds, AVG(S2Temp)
FROM With30SecondBuckets
GROUP BY SampleTimestamp30Seconds
ORDER BY SampleTimestamp30Seconds

We can put these queries in a stored procedure to generate new aggregate and summary tables as necessary to be used by the analytics solution.
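For quick exploration outside the database, the same two ideas translate directly to pandas. The sketch below assumes one of the hourly CSV blobs described earlier has been downloaded locally and already transposed per machine (the file and column names are illustrative):

import pandas as pd

df = pd.read_csv("machine08_samples.csv", parse_dates=["SampleTimestamp"])
df = df.set_index("SampleTimestamp").sort_index()

# Option 1: forward-fill missing sample values from the last received value.
filled = df.ffill()

# Option 2: 30-second buckets with an aggregate (average) per bucket.
bucketed = df["S2temp"].resample("30S").mean()
print(bucketed.head())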

We would like to repeat our opening argument here once more. The solution to an analytics problem depends on the available data and on what the business needs. There may not be one single solution, but Azure provides many technology options for implementing a given solution.

Sunday, 16 December 2018

Streamlined IoT device certification with Azure IoT certification service

For over three years, we have helped customers find devices that work with Azure IoT technology through the Azure Certified for IoT program and the Azure IoT device catalog. In that time, our ecosystem has grown to one of the largest in the industry with more than 1,000 devices and starter kits from over 250 partners.

Today, we are taking steps to further grow our device partner ecosystem with the release of the Azure IoT certification service (AICS), a new web-based test automation workflow, which is now generally available. AICS will significantly reduce the operational overhead and engineering costs for hardware manufacturers to get their devices certified for the Azure Certified for IoT program and showcased in the Azure IoT device catalog.

Over the past year, we’ve made significant improvements to the program such as improving the discovery of certified devices in the Azure Certified for IoT device catalog and expanding the program to support Azure IoT Edge devices. The goal of our certification program is simple – to showcase the right set of IoT devices for our customers’ industry specific vertical solutions and simplify IoT device development.

AICS is designed and engineered to help achieve these goals, delivering on four key areas listed below:

Consistency


AICS is a web-based test automation workflow that works with any operating system and web browser. AICS communicates with its own set of Azure IoT Hub instances to automatically validate a device’s bi-directional connectivity to Azure IoT Hub, along with other IoT Hub primitives.

Previously, hardware manufacturers had to instantiate their own IoT Hub using their own Azure subscription in order to get certified. AICS not only eliminates Azure subscription costs for hardware manufacturers, but also streamlines the certification process through automation. These changes drive more quality and consistency compared to the manual processes that were in place before.

Additional tests


The certification program for IoT devices has always validated bi-directional connectivity between the device and the IoT Hub cloud service (namely device-to-cloud and cloud-to-device). As IoT devices become more intelligent and support more capabilities, we have now expanded our program to validate the device twins and direct methods IoT Hub primitives. AICS validates these capabilities, and the Azure IoT device catalog showcases them correspondingly, making it easy for device seekers to build IoT solutions on these richer capabilities.

The screenshot below shows customizable test cases. By default, device-to-cloud is the required test and all others are optional. This new requirement allows constrained devices such as microcontrollers to be certified.


The screenshot below shows how tested capabilities are shown on the device description page in the device catalog.


Flexibility


Previously, hardware manufacturers were required to use the Azure IoT device SDK to build an app that establishes connectivity from their device(s) to the cloud managed by the Azure IoT Hub service. Based on partners’ feedback, we now support devices that do not use the Azure IoT device SDK to establish connectivity to Azure IoT Hub – for example, devices that use the IoT Hub resource provider REST API to create and manage IoT Hub programmatically, or hardware manufacturers that opt to use an equivalent device SDK to establish connectivity.

In addition, AICS allows hardware manufacturers to configure the necessary parameters for customized test cases, such as the number of telemetry messages sent from the devices.

The screenshot below illustrates an example page that shows the ability to configure each test case.


Simplicity


Finally, we have invested in a user experience that is simple and intuitive for hardware manufacturers. For example, in the device catalog, we have streamlined the process from device registration to running the validations with AICS through a simple wizard-driven flow. Hardware developers can easily troubleshoot failed tests through detailed logs that improve diagnosability.

Because it’s a web-based workflow, servicing AICS is simple: hardware manufacturers are not required to deploy any standalone test kits (no .exe, .msi, and so on) locally on their devices – kits that tend to become outdated over time.

The screenshot below shows each test case run. Log files show the pass/fail status along with the raw data sent from the device to the cloud. The submit button only appears when all the selected test cases pass. Once the tests are complete, we will review the results and notify the submitter of any additional steps to complete the certification process.


Friday, 14 December 2018

Taking a closer look at Python support for Azure Functions

Azure Functions provides a powerful programming model for accelerated development and serverless hosting of event-driven applications. Ever since we announced the general availability of the Azure Functions 2.0 runtime, support for Python has been one of our top requests. At Microsoft Connect() last week, we announced the public preview of Python support in Azure Functions. This post gives an overview of the newly introduced experiences and capabilities made available through this feature.

What's in this release?


With this release, you can now develop your functions using Python 3.6, based on the open-source Functions 2.0 runtime, and publish them to a Consumption plan (pay-per-execution model) in Azure. Python is a great fit for data manipulation, machine learning, scripting, and automation scenarios. Building these solutions using serverless Azure Functions takes away the burden of managing the underlying infrastructure, so you can move fast and focus on the differentiating business logic of your applications. Keep reading to find more details about the newly announced features and developer experiences for Python functions.

Powerful programming model


The programming model is designed to provide a seamless and familiar experience for Python developers, so you can import existing .py scripts and modules and quickly start writing functions using code constructs you're already familiar with. For example, you can implement your functions as asynchronous coroutines using async def, or send monitoring traces to the host using the standard logging module. Additional dependencies to pip install can be declared using the requirements.txt format.
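For illustration, here is a minimal sketch of what an HTTP-triggered Python function can look like (the trigger binding itself is declared in the function's function.json, which is omitted here):

import logging
import azure.functions as func

async def main(req: func.HttpRequest) -> func.HttpResponse:
    # Traces sent through the standard logging module show up in the host logs.
    logging.info("Python HTTP trigger function processed a request.")
    name = req.params.get("name") or "world"
    return func.HttpResponse(f"Hello, {name}!", status_code=200)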


With the event-driven programming model in Functions, based on triggers and bindings, you can easily configure the event that triggers the function execution and any data sources your function needs to work with. Common scenarios such as ML inferencing and automation scripting workloads benefit from this model, as it helps streamline the diverse data sources involved while reducing the amount of code, SDKs, and dependencies a developer needs to configure and work with at the same time. The preview release supports binding to HTTP requests, timer events, Azure Storage, Cosmos DB, Service Bus, Event Hubs, and Event Grid. Once configured, you can quickly retrieve data from these bindings or write back using the method attributes of your entry point function.
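As a hedged sketch of the binding model, the function below is triggered by a queue message and writes a document through a Cosmos DB output binding; the binding names ("msg" and "doc") are placeholders that would be configured in the accompanying function.json:

import json
import azure.functions as func

def main(msg: func.QueueMessage, doc: func.Out[func.Document]) -> None:
    # The queue payload is read from the trigger binding...
    payload = json.loads(msg.get_body().decode("utf-8"))
    # ...and written back out through the Cosmos DB output binding.
    doc.set(func.Document.from_dict(payload))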


Easier development


As a Python developer, you don't need to learn any new tools to develop your functions. In fact, you can quickly create, debug, and test them locally using a Mac, Linux, or Windows machine. The Azure Functions Core Tools (CLI) enable you to get started using trigger templates and publish directly to Azure, while automatically handling the build and configuration for you.


What's even more exciting is that you can use the Azure Functions extension for Visual Studio Code for a tightly integrated GUI experience to help you create a new app, add functions, and deploy, all within a matter of minutes. The one-click debugging experience lets you test your functions locally against real-time Azure events, set breakpoints, and evaluate the call stack, simply by pressing F5. Combine this with the Python extension for VS Code, and you have a best-in-class auto-complete, IntelliSense, linting, and debugging experience for Python development, on any platform!


Linux based hosting


Functions written in Python can be published to Azure in two different modes: a Consumption plan and an App Service plan. The Consumption plan automatically allocates compute power based on the number of incoming events. Your app is scaled out when needed to handle load and scaled back down when events become sparse. Billing is based on the number of executions, execution time, and memory used, so you don't have to pay for idle VMs or reserved capacity in advance.

In an App Service plan, dedicated instances are allocated to your function app, which means you can take advantage of features such as long-running functions, premium hardware, Isolated SKUs, and VNET/VPN connectivity while still leveraging the unique Functions programming model. Since using dedicated resources decouples cost from the number of executions, execution time, and memory used, the cost is capped at the number of instances you've allocated to the plan.

Under the covers, both hosting plans run your functions in a Docker container based on the open-source azure-functions/python base image. The platform abstracts away the container, so you're only responsible for providing your Python files and don't need to worry about managing the underlying Azure Functions and Python runtimes.

Tuesday, 11 December 2018

Deploying Apache Airflow in Azure to build and run data pipelines

Apache Airflow is an open-source platform used to author, schedule, and monitor workflows. Airflow overcomes some of the limitations of the cron utility by providing an extensible framework that includes operators, a programmable interface for authoring jobs, a scalable distributed architecture, and rich tracking and monitoring capabilities. Since being open-sourced in 2015 and later joining the Apache foundation, Airflow has seen great adoption by the community for designing and orchestrating ETL pipelines and ML workflows. In Airflow, a workflow is defined as a Directed Acyclic Graph (DAG), ensuring that the defined tasks are executed one after another while managing the dependencies between tasks.
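As a quick illustration, a DAG is just Python code. The minimal sketch below (using Airflow 1.10-era imports and a hypothetical two-task ETL flow) shows how tasks and the dependency between them are declared:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

default_args = {
    "owner": "airflow",
    "start_date": datetime(2018, 12, 1),
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

def extract(**kwargs):
    print("extract raw data")

def transform(**kwargs):
    print("transform and load")

with DAG("simple_etl", default_args=default_args, schedule_interval="@hourly") as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract, provide_context=True)
    t2 = PythonOperator(task_id="transform", python_callable=transform, provide_context=True)
    t1 >> t2  # transform runs only after extract succeeds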

A simplified version of the Airflow architecture is shown below. It consists of a web server that provides the UI, a relational metadata store that can be a MySQL or PostgreSQL database, a persistent volume that stores the DAG files, a scheduler, and worker processes.


The above architecture can be implemented to run in four execution modes, including:

◈ Sequential Executor – This mode is useful for dev/test or demo purposes. It serializes the operations and allows only a single task to be executed at a time.
◈ Local Executor – This mode supports parallelization and is suitable for small to medium size workloads. It doesn’t support scaling out.
◈ Celery Executor – This is the preferred mode for production deployments and is one of the ways to scale out the number of workers. For this to work, an additional celery backend, which is a RabbitMQ or Redis broker, is required for coordination.
◈ Dask Executor – This mode also allows scaling out by leveraging the Dask.distributed library, allowing users to run tasks in a distributed cluster.

The above architecture can be implemented on Azure VMs or by using managed services in Azure, as shown below. For production deployments, we recommend leveraging managed services with built-in high availability and elastic scaling capabilities.


Puckel's Airflow Docker image contains the latest build of Apache Airflow, with automated builds released to the public Docker Hub registry. Azure App Service for Linux is integrated with the public Docker Hub registry and allows you to run the Airflow web app in Linux containers with continuous deployment. Azure App Service also allows multi-container deployments with Docker Compose and Kubernetes, which is useful for the celery execution mode.

We have developed the Azure QuickStart template, which allows you to quickly deploy and create an Airflow instance in Azure by using Azure App Service and an instance of Azure Database for PostgreSQL as a metadata store.


The QuickStart template automatically downloads and deploys the latest Docker container image from puckel/docker-airflow and initializes the database in Azure Database for PostgreSQL server as shown in the following graphic:


The environment variables for the Airflow docker image can be set using application settings in Azure App Service as shown in the following graphic:


The environment variables used in the deployment are:

◈ AIRFLOW__CORE__SQL_ALCHEMY_CONN – Sets the connection string for web app to connect to Azure Database for PostgreSQL.
◈ AIRFLOW__CORE__LOAD_EXAMPLES – Set to true to load DAG examples during deployment.

The application setting WEBSITES_ENABLE_APP_SERVICE_STORAGE is set to true, which provides persistent storage for DAG files that is accessible to the scheduler and worker container images.
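For reference, the value of AIRFLOW__CORE__SQL_ALCHEMY_CONN is a standard SQLAlchemy URL. The snippet below is only a sketch with placeholder server, user, and password values; note that the Azure Database for PostgreSQL user name has the form user@servername, with the '@' URL-encoded as %40:

import os

# Placeholder values shown purely to illustrate the connection string format.
os.environ["AIRFLOW__CORE__SQL_ALCHEMY_CONN"] = (
    "postgresql+psycopg2://airflowuser%40mypgserver:MyPassword"
    "@mypgserver.postgres.database.azure.com:5432/airflow"
)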

After it is deployed, you can browse the web server UI on port 8080 to see and monitor the DAG examples, as shown in the following graphic.


Next steps


You are now ready to orchestrate and design data pipelines for ETL and machine learning workflows by leveraging the Airflow operators. You can also leverage Airflow for scheduling and monitoring jobs across a fleet of managed databases in Azure by defining the corresponding connections, as shown below.


If you are looking for an exciting challenge, you can deploy the kube-airflow image with the celery executor on Azure Kubernetes Service using Helm charts, Azure Database for PostgreSQL, and RabbitMQ. Let us know if you have built this, and we would be happy to link to it from this blog.