Wednesday 20 December 2017

Azure HDInsight Integration with Azure Log Analytics is now generally available

I am excited to announce the general availability of HDInsight Integration with Azure Log Analytics.

Azure HDInsight is a fully managed cloud service that lets customers run analytics at scale using the most popular open-source engines, such as Hadoop, Hive/LLAP, Presto, Spark, Kafka, Storm, and HBase.

Thousands of our customers run their big data analytical applications on HDInsight at global scale. The ability to monitor this infrastructure, detect failures quickly and take quick remedial action is key to ensuring a better customer experience.

Log Analytics is part of Microsoft Azure's overall monitoring solution. Log Analytics helps you monitor cloud and on-premises environments to maintain availability and performance.

Our integration with Log Analytics makes it simpler for our customers to operate their big data production workloads effectively.

Monitor & debug full spectrum of big data open source engines at global scale


Typical big data pipelines use multiple open source engines, such as Kafka for ingestion, Spark Streaming or Storm for stream processing, Hive and Spark for ETL, and Interactive Query (LLAP) for fast querying of big data.

Additionally, these pipelines may be running in different datacenters across the globe.

With the new HDInsight monitoring capabilities, our customers can connect different HDInsight clusters to a Log Analytics workspace and monitor them through a single pane of glass.

Image: Monitoring your global big data deployments with single pane of glass

Collect logs and metrics from open source analytics engines


Once Azure Log Analytics is enabled on your cluster, you will see important logs and metrics from a number of different open source frameworks, as well as cluster VM-level metrics such as CPU usage and memory utilization. Customers get a full view into their cluster from one location.

Many of our customers take advantage of the elasticity of the cloud by creating and deleting clusters to minimize their costs. However, they want to retain the job logs and other useful information after a cluster is terminated. With Azure Log Analytics, customers can retain the job information even after the cluster is deleted.

Below are some of the key metrics and logs collected from your HDInsight clusters.

YARN Resource Manager, YARN applications, Hive, MapReduce, Kafka, Storm, HiveServer2, Hive Server Interactive, Oozie, Spark, Spark executor and driver, Livy, HBase, Phoenix, Jupyter, LLAP, ZooKeeper, and many more.

Image: Logs & Metrics from various Open Source engines.

Visualize key metrics with solution templates


To make important metrics easier to understand, we have created a number of visualizations. We have also published multiple solution templates to help you get started quickly. You can install these solution templates directly from the Azure portal, under Monitoring + Management.

Image: Installing HDInsight solution templates from Azure portal

Once the templates are installed, you can visualize the key metrics. The example below shows the dashboard for your Spark clusters.

Image: Spark dashboard

Troubleshoot issues faster


It’s important to be able to detect and troubleshoot issues faster and find the root cause when developing big data applications in Hive, Spark or Kafka.

With the Log Analytics portal, you can:

◉ Write queries to quickly find issues or important data in your logs and metrics
◉ Filter, sort, and group results within a time range
◉ See your data in tabular format or in a chart

Below is an example query that looks at application metrics for Hive queries on a cluster named "testhive02", newest records first:

search *
| where Type contains "application_stats_dag_CL" and ClusterName_s contains "testhive02"
| order by TimeGenerated desc

Image: Troubleshooting Hive jobs

Enabling Log Analytics


Log Analytics integration with HDInsight can be enabled via the Azure portal, PowerShell, or the Azure SDK. The PowerShell cmdlet has the following syntax:

Enable-AzureRmHDInsightOperationsManagementSuite
        [-Name] <String>
        [-WorkspaceId] <String>
        [-PrimaryKey] <String>
        [-ResourceGroupName <String>]
        [-DefaultProfile <IAzureContextContainer>]
        [-WhatIf]
        [-Confirm]
        [<CommonParameters>]
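
As an illustration of the syntax above, here is a minimal sketch of enabling the integration on an existing cluster. It assumes the AzureRM modules are installed, that you have already signed in with Login-AzureRmAccount, and that the cluster and the workspace live in the same resource group. The resource group and workspace names are hypothetical placeholders, and the cluster name is reused from the query example. The workspace ID and key are looked up with the AzureRM.OperationalInsights cmdlets rather than pasted in by hand.

```powershell
# Hypothetical placeholder names; replace them with your own resources.
$resourceGroup = "my-resource-group"
$clusterName   = "testhive02"
$workspaceName = "my-log-analytics-workspace"

# Look up the Log Analytics workspace; CustomerId is its workspace ID.
$workspace = Get-AzureRmOperationalInsightsWorkspace `
    -ResourceGroupName $resourceGroup `
    -Name $workspaceName

# Fetch the workspace keys; only the primary key is needed here.
$keys = Get-AzureRmOperationalInsightsWorkspaceSharedKeys `
    -ResourceGroupName $resourceGroup `
    -Name $workspaceName

# Connect the HDInsight cluster to the workspace.
Enable-AzureRmHDInsightOperationsManagementSuite `
    -Name $clusterName `
    -ResourceGroupName $resourceGroup `
    -WorkspaceId $workspace.CustomerId `
    -PrimaryKey $keys.PrimarySharedKey
```

Adding -WhatIf to the last command previews the change without applying it.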

Image: Enabling Log Analytics from the Azure portal
