I am excited to announce the general availability of HDInsight Integration with Azure Log Analytics.
Azure HDInsight is a fully managed cloud service for customers to do analytics at scale using the most popular open-source engines such as Hadoop, Hive/LLAP, Presto, Spark, Kafka, Storm, and HBase.
Thousands of our customers run their big data analytical applications on HDInsight at global scale. The ability to monitor this infrastructure, detect failures quickly and take quick remedial action is key to ensuring a better customer experience.
Log Analytics is part of Microsoft Azure's overall monitoring solution. Log Analytics helps you monitor cloud and on-premises environments to maintain availability and performance.
Our integration with Log Analytics will make it easier for our customers to operate their big data production workloads in a more effective and simple manner.
Monitor & debug the full spectrum of big data open source engines at global scale
Typical big data pipelines utilize multiple open source engines such as Kafka for ingestion, Spark Streaming or Storm for stream processing, Hive & Spark for ETL, and Interactive Query [LLAP] for blazing fast querying of big data.
Additionally, these pipelines may be running in different datacenters across the globe.
With the new HDInsight monitoring capabilities, our customers can connect different HDInsight clusters to a Log Analytics workspace and monitor them through a single pane of glass.
Image: Monitoring your global big data deployments with a single pane of glass
Collect logs and metrics from open source analytics engines
Once Azure Log Analytics is enabled on your cluster, you will see important logs and metrics from a number of different open source frameworks, as well as cluster VM-level metrics such as CPU usage, memory utilization and more. Customers will be able to get a full view into their cluster from one location.
Many of our customers take advantage of the elasticity of the cloud by creating and deleting clusters to minimize their costs. However, they want to retain the job logs and other useful information even after the cluster is terminated. With Azure Log Analytics, customers can retain the job information even after the cluster is deleted.
Below are some of the key metrics and logs collected from your HDInsight clusters.
YARN Resource Manager, YARN applications, Hive, MapReduce, Kafka, Storm, Hive Server 2, Hive Server Interactive, Oozie, Spark, Spark executors and drivers, Livy, HBase, Phoenix, Jupyter, LLAP, ZooKeeper, and many more.
Image: Logs & Metrics from various Open Source engines.
Visualize key metrics with solution templates
To make it easier, we have created a number of visualizations so that our customers can understand important metrics. We have published multiple solution templates for you to get started quickly. You can install these solution templates directly from the Azure portal, under Monitoring + Management.
Image: Installing HDInsight solution templates from Azure portal
Once installed, you can visualize the key metrics. In the example below you can see the dashboard for your Spark clusters.
Image: Spark dashboard
Troubleshoot issues faster
It’s important to be able to detect and troubleshoot issues faster and find the root cause when developing big data applications in Hive, Spark or Kafka.
With the Log Analytics portal, you can:
◉ Write queries to quickly find issues or important data in your logs and metrics
◉ Filter, sort, and group results within a time range
◉ See your data in tabular format or in a chart
Below is an example query to look at application metrics from a Hive query:
search *
| where Type contains "application_stats_dag_CL" and ClusterName_s contains "testhive02"
| order by TimeGenerated desc
Image: troubleshooting hive jobs
Enabling Log Analytics
Log Analytics integration with HDInsight is enabled via the Azure portal, PowerShell or the Azure SDK.
Enable-AzureRmHDInsightOperationsManagementSuite
[-Name] <String>
[-WorkspaceId] <String>
[-PrimaryKey] <String>
[-ResourceGroupName <String>]
[-DefaultProfile <IAzureContextContainer>]
[-WhatIf]
[-Confirm]
[<CommonParameters>]
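As a rough illustration of the syntax above, the call might look like the sketch below. The cluster name, resource group, workspace ID and key are hypothetical placeholders, not values from this post, and the sketch assumes you have already signed in to your Azure subscription in the same PowerShell session.

```powershell
# Minimal sketch: enable Log Analytics on an existing HDInsight cluster.
# Every value below is a hypothetical placeholder; substitute your own.
$params = @{
    Name              = "my-hdinsight-cluster"         # name of the HDInsight cluster
    ResourceGroupName = "my-resource-group"            # resource group containing the cluster
    WorkspaceId       = "<log-analytics-workspace-id>" # ID of the Log Analytics workspace
    PrimaryKey        = "<log-analytics-primary-key>"  # primary key of that workspace
}

# Splat the parameters onto the cmdlet shown in the syntax above.
Enable-AzureRmHDInsightOperationsManagementSuite @params
```

Splatting the parameters from a hashtable keeps the workspace ID and key in one place, which can make it easier to reuse the same values when connecting several clusters to a single workspace.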
Image: Enabling Log Analytics from Azure portal