Saturday 31 March 2018

BigDL Spark deep learning library VM now available on Microsoft Azure Marketplace

The BigDL deep learning library is a Spark-based framework for creating and deploying deep learning models at scale. While it has previously been deployed on Azure HDInsight and the Data Science VM, it is now also available on the Azure Marketplace as a pre-built VM image, a further step by Intel to reduce deployment complexity for users.

Because BigDL is an integral part of Spark, a user does not need to explicitly manage distributed computations. A BigDL application provides high-level control “knobs”, such as the number of compute nodes, cores, and batch size, and it also leverages stable Spark infrastructure for node communications and resource management during its execution. BigDL applications can be written in either Python or Scala and achieve high performance through both algorithm optimization and close integration with Intel’s Math Kernel Library (MKL).


What is the Microsoft Azure Marketplace? The Azure Marketplace is an online applications and services marketplace that enables start-ups, independent software vendors (ISVs), and MSP/SIs to offer their Azure-based solutions or services to customers around the world. 

Introduction


This post describes two use case scenarios to deploy BigDL_v0.4 in Azure VMs:

◈ First scenario: Deploying an Azure VM with a pre-built BigDL_v0.4 image and running a basic deep learning example.

◈ Second scenario: Deploying BigDL_v0.4 on a bare-bones Ubuntu VM (for advanced users).

First scenario: Deploying a pre-built BigDL_v0.4 VM image:


Log in to your Microsoft Azure account. BigDL requires you to have an Azure subscription. You can get a free trial by visiting the BigDL offering on the Azure Marketplace and clicking Get it now.


You should see the following page. Click on the Create button at the bottom.


Enter the requested information in the fields. Note that Azure imposes syntax limitations on some of the fields (such as using only alphanumeric characters and no capital letters); stick to lowercase letters and digits and you will be fine. Use the following three screenshots for guidance.


Spark is memory-intensive, so select a machine with a larger amount of RAM. Note that not all VM types and sizes are available in every region. For simple tasks and testing, the virtual machine displayed in the following screenshot will meet the requirements:


After the VM is provisioned, copy its public IP address. Note that this public IP address will change every time you stop and restart your VM. Keep this in mind if you are thinking of BigDL automation.
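
If you do automate against this VM, the current public IP can be looked up programmatically instead of hard-coded. Below is a minimal Python sketch using the Azure SDK; it assumes the azure-identity and azure-mgmt-network packages are installed, and the subscription ID, resource group, and public IP resource names are placeholders you would replace with your own.

# Minimal sketch: look up a VM's current public IP with the Azure Python SDK.
# The resource group and public IP resource names below are hypothetical placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<your-subscription-id>"   # placeholder
resource_group = "bigdl-rg"                  # hypothetical resource group name
public_ip_name = "bigdlvm-ip"                # hypothetical public IP resource name

client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)
ip_resource = client.public_ip_addresses.get(resource_group, public_ip_name)
print(ip_resource.ip_address)  # changes on stop/deallocate unless the IP is made static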


After deployment, you can modify the public IP address resource in the resource group and configure it as a static IP address:


You are now ready to SSH into your BigDL VM. You can use your favorite SSH client. For this example, MobaXterm is used.

Enter the IP address and the username you selected when creating the VM.


Check the versions of installed dependencies:


Before using pre-installed BigDL, you will need to change ownership of the directory.


BigDL was pre-installed into the bigdlazrmktplc directory. Yet ‘testuser’ does not have full privileges to access it.

To change this, type:

$sudo chown -R testuser:testuser bigdlazrmktplc


Now ‘testuser’ owns the bigdlazrmktplc directory.

Finally, test that BigDL actually works in this VM by entering the following commands:

$cd bigdlazrmktplc/BigDL
$export SPARK_HOME=/usr/local/spark/spark-2.2.0-bin-hadoop2.7
$export BIGDL_HOME=/home/bigdlazrmktplc/BigDL
$BigDL/bin/pyspark-with-bigdl.sh --master local[*]

If the commands are successful, you will see the following:


At the PySpark prompt, copy and paste the following example code (the source can be found on GitHub):

from bigdl.util.common import *
from pyspark import SparkContext
from bigdl.nn.layer import *
import bigdl.version
# create sparkcontext with bigdl configuration
sc = SparkContext.getOrCreate(conf=create_spark_conf().setMaster("local[*]"))
init_engine() # prepare the bigdl environment
bigdl.version.__version__ # Get the current BigDL version
linear = Linear(2, 3) # Try to create a Linear layer

If the commands are successful, you will see the following:


BigDL is now ready for you to use.
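
As an optional extra sanity check, you can push a small input through the Linear layer you just created in the same PySpark session. This is a sketch that assumes BigDL's Python Layer API accepts a NumPy array in its forward method (as in BigDL 0.4); adjust if your version differs.

import numpy as np
# Feed one random 2-element input through the Linear(2, 3) layer created above.
sample_input = np.random.rand(2).astype("float32")
output = linear.forward(sample_input)   # assumed Layer.forward(ndarray) behavior
print(output)                           # expect a length-3 output vector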

Second Scenario: Deploying BigDL_v0.4 on a bare-bones Ubuntu Azure VM


First, you will need to create an Azure subscription. You can get a free trial by navigating to the BigDL offering on the Azure Marketplace and clicking Get it now.

Log in to the Azure Portal, go to New, and select the Ubuntu Server 16.04 LTS VM (LTS = Long Term Support).


Enter the basic VM attributes using only lower-case letters and numbers.


For Spark jobs, select a VM with a large amount of RAM available.


After your VM has been created, you can SSH into it using the username and password which you created previously.

Copy the Public IP address of the VM:


This creates a very basic Ubuntu machine, so you must install the following additional components to run BigDL:

◈ Java Runtime Environment (JRE)
◈ Scala
◈ Spark
◈ Python packages
◈ BigDL

Installing the Java Runtime Environment (JRE)


At the command prompt, type the following commands:

$sudo add-apt-repository ppa:webupd8team/java
$sudo apt-get update
$sudo apt-get install oracle-java8-installer
$sudo apt-get install oracle-java8-set-default

Confirm the installation and JRE version by typing

$java -version

Installing Scala and confirming version


At the command prompt, type the following commands:

$sudo apt-get install scala
$scala -version

Installing Spark 2.2.x


At the command prompt, type the following commands:

$sudo wget http://mirrors.ibiblio.org/apache/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
$sudo tar xvzf spark-2.2.0-bin-hadoop2.7.tgz
$rm spark-2.2.0-bin-hadoop2.7.tgz
$sudo mkdir /usr/local/spark
$sudo mv spark-2.2.0-bin-hadoop2.7 /usr/local/spark
Verify Spark installation:
$cd /usr/local/spark/spark-2.2.0-bin-hadoop2.7/
$./bin/spark-submit --version


Installing BigDL


Downloadable BigDL releases are available from the main BigDL release repository.

For Spark 2.2.0 and Scala 2.11.x, select dist-spark-2.2.0-scala-2.11.8-all-0.4.0-dist.zip

At the command prompt, type the following commands:

$cd ~
$mkdir BigDL
$cd BigDL
$sudo wget https://s3-ap-southeast-1.amazonaws.com/bigdl-download/dist-spark-2.2.0-scala-2.11.8-all-0.4.0-dist.zip

$sudo apt-get install unzip
$unzip dist-spark-2.2.0-scala-2.11.8-all-0.4.0-dist.zip
$rm dist-spark-2.2.0-scala-2.11.8-all-0.4.0-dist.zip

Installing Python 2.7 packages


Ubuntu 16.04 on Azure comes with Python 2.7 pre-installed. However, there are a couple of additional packages that must be installed.
At the command prompt, type the following commands:

$sudo apt-get install python-numpy
$sudo apt-get install python-six

Update the package lists by typing

$sudo apt-get update

Verify BigDL installation


Instructions to verify that BigDL was installed correctly are available in the BigDL documentation.

At the command prompt, type the following commands:

$export SPARK_HOME=/usr/local/spark/spark-2.2.0-bin-hadoop2.7
$export BIGDL_HOME=~/BigDL

Launch PySpark (from the BigDL directory):

$bin/pyspark-with-bigdl.sh --master local[*]

At the prompt, copy and paste the following code; it can also be found on GitHub.

from bigdl.util.common import *
from pyspark import SparkContext
from bigdl.nn.layer import *
import bigdl.version
# create sparkcontext with bigdl configuration
sc = SparkContext.getOrCreate(conf=create_spark_conf().setMaster("local[*]"))
init_engine() # prepare the bigdl environment
bigdl.version.__version__ # Get the current BigDL version
linear = Linear(2, 3) # Try to create a Linear layer

You should see the following:

creating: createLinear
cls.getname: com.intel.analytics.bigdl.python.api.Sample
BigDLBasePickler registering: bigdl.util.common  Sample
cls.getname: com.intel.analytics.bigdl.python.api.EvaluatedResult
BigDLBasePickler registering: bigdl.util.common  EvaluatedResult
cls.getname: com.intel.analytics.bigdl.python.api.JTensor
BigDLBasePickler registering: bigdl.util.common  JTensor
cls.getname: com.intel.analytics.bigdl.python.api.JActivity
BigDLBasePickler registering: bigdl.util.common  JActivity
>>>

Finally, install Maven to allow you to build BigDL applications by typing the following:

$sudo apt-get install maven

Your VM is now ready for running deep learning examples at scale.

Friday 30 March 2018

SQL Database Transparent Data Encryption with Azure Key Vault configuration checklist

Azure SQL Database and Data Warehouse offer encryption-at-rest by providing Transparent Data Encryption (TDE) for all data written to disk, including databases, log files and backups. This protects data in case of unauthorized access to hardware. TDE provides a TDE Protector that is used to encrypt the Database Encryption Key (DEK), which in turn is used to encrypt the data. By default, the TDE Protector is managed by the service in a fully transparent fashion, rotated every 90 days and maintained in archive for access to backups. Optionally, management of the TDE Protector can be assumed by the customer if more control is desired. This requires storing the TDE Protector in a customer-owned Azure Key Vault. If this option is chosen, it is important to fully understand all TDE implications and carefully plan for ongoing key management.

Overview of TDE with customer-managed keys and Azure Key Vault integration:



In this scenario, customers must maintain the Azure Key Vault, control SQL Database permissions to the key vault, and maintain access to all TDE Protectors in order to open or restore databases or backups and to enable all other operations that require database access to the TDE Protector. The following checklist will help to systematically plan all key-management-related duties in Azure Key Vault. In addition, we list the most important setup considerations and configuration requirements that must be followed to configure TDE with customer-managed keys in Azure Key Vault.

General guidelines:

  • Ensure Azure Key Vault and Azure SQL Database are going to be in the same tenant. Cross-tenant key vault and server interactions are not supported.
  • Decide which subscriptions will be used for the required resources. Moving the server across subscriptions later requires a new setup of TDE with BYOKs.
  • When configuring TDE with BYOK, it is important to consider the load placed on the key vault by repeated wrap/unwrap operations. For example, since all databases associated with a logical server use the same TDE protector, a failover of that server will trigger as many key operations against the vault as there are databases in the server. Based on our experience and documented key vault service limits, we recommend associating at most 500 Standard or 200 Premium databases with one Azure Key Vault in a single subscription to ensure consistently high availability when accessing the TDE protector in the vault.
  • Recommended: Keep a copy of the TDE Protector on premises. This requires an HSM device to create a TDE Protector locally and a key escrow system to store a local copy of the TDE Protector.

Guidelines for configuring Azure Key Vault:


  • Use a key vault with soft-delete enabled (required) to protect from data loss in case of accidental key or key vault deletion scenarios:
    • Soft deleted resources are retained for 90 days unless they are recovered or purged.
    • The recover and purge actions have their own permissions associated in a key vault access policy.
  • Grant the SQL server access to the key vault using its Azure Active Directory (Azure AD) Identity. When using the Portal UI, the Azure AD identity will be automatically created and the key vault access permissions will be granted to the server. Using PowerShell, these steps must be completed individually in the correct order and need to be verified. See Configure TDE with BYOK for detailed step-by-step instructions when using PowerShell. Note: The server will lose access to the key vault if the Azure AD Identity is accidentally deleted or the server’s permissions are revoked using the key vault’s access policy.
  • Enable auditing and reporting in Azure Key Vault on all encryption keys: Key Vault provides logs that are easy to inject into other security information and event management (SIEM) tools. Operations Management Suite (OMS) Log Analytics is one example of a service that is already integrated.
  • To ensure high-availability of encrypted databases, configure one logical server with two Azure Key Vaults in different regions.
  • For High Availability of a single SQL database, consider configuring two key vaults:

  • Use the Backup-AzureKeyVaultKey cmdlet to retrieve the key in encrypted format and then use the Restore-AzureKeyVaultKey cmdlet and specify a key vault in the second region (a scripted sketch of this step follows this list).
  • For Geo-replicated databases, the following AKV configuration is required:

  • One primary database with a key vault in its region and one secondary database with a key vault in its region.
  • One secondary is required; up to four secondaries are supported.
  • Secondaries of secondaries (chaining) are not supported.
    • Note: when assigning the server identity, assign the identity for the secondary first, and for the primary second.
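
For the key backup and restore step called out in the list above, the same operation can also be scripted. The sketch below uses the Azure Key Vault keys SDK for Python rather than the PowerShell cmdlets; the vault URLs and key name are placeholders, and restoring into a vault in a second region is subject to Key Vault's restore rules (same subscription and geography).

# Minimal sketch: back up a TDE protector key from one key vault and restore it
# into a vault in a second region. Assumes `pip install azure-identity azure-keyvault-keys`;
# vault URLs and the key name below are hypothetical placeholders.
from azure.identity import DefaultAzureCredential
from azure.keyvault.keys import KeyClient

credential = DefaultAzureCredential()
primary_vault = KeyClient("https://contoso-kv-eastus.vault.azure.net", credential)    # hypothetical
secondary_vault = KeyClient("https://contoso-kv-westus.vault.azure.net", credential)  # hypothetical

backup_blob = primary_vault.backup_key("tde-protector")      # protected, encrypted backup bytes
restored_key = secondary_vault.restore_key_backup(backup_blob)
print(restored_key.id)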

Guidelines for configuring TDE Protectors (asymmetric key) stored in Azure Key Vault:

  • Create your encryption key locally on a local HSM device. Ensure this is an asymmetric, RSA 2048 key so it is storable in Azure Key Vault. Larger key sizes are currently not supported by Azure SQL Database.
  • Escrow the key in a key escrow system.
  • Import the encryption key file (.pfx, .byok, or .backup) to Azure Key Vault.
    • (Note: For testing purposes, it is possible to create a key with Azure Key Vault, however this key cannot be escrowed, because the private key can never leave the key vault. A key used to encrypt production data should always be escrowed, as the loss of the key (accidental deletion in key vault, expiration etc.) will result in permanent data loss.)
  • Use a key without an expiration date and never set an expiration date on a key already in use: once the key expires, the encrypted databases lose access to their TDE Protector and are dropped within 24 hours.
  • Ensure the key is enabled and has permissions to perform get, wrap key, and unwrap key operations.
  • Create an Azure Key Vault key backup before using the key in Azure Key Vault for the first time. Learn more about the Backup-AzureKeyVaultKey command.
  • Create a new backup whenever any changes are made to the key (for example, add ACLs, add tags, add key attributes).
  • Keep previous versions of the key in the key vault when rotating keys, so that databases can still access their virtual log files that continue to be encrypted with the original keys. In addition, when the TDE Protector is changed for a database, old backups of the database are not updated to use the latest TDE Protector. Key rotations can be performed following the instructions at Rotate the Transparent Data Encryption Protector Using PowerShell.
  • Keep all previously used keys in Azure Key Vault after changing back to service-managed keys. This will ensure database backups can be restored with the TDE protectors stored in Azure Key Vault. TDE protectors will have to be maintained in Azure Key Vault until all needed backups have been created while using service-managed keys.
  • Make recoverable backup copies of these keys using Backup-AzureKeyVaultKey.
  • To remove a potentially compromised key during a security incident without the risk of data loss, follow the steps at Remove a potentially compromised key.

Tuesday 27 March 2018

Microsoft creates industry standards for datacenter hardware storage and security

Today I’m speaking at the Open Compute Project (OCP) U.S. Summit 2018 in San Jose where we are announcing a next generation specification for solid state device (SSD) storage, Project Denali. We’re also discussing Project Cerberus, which provides a critical component for security protection that to date has been missing from server hardware: protection, detection and recovery from attacks on platform firmware. Both storage and security are the next frontiers for hardware innovation, and today we’re highlighting the latest advancements across these key focus areas to further the industry in enabling the future of the cloud.

A new standard for cloud SSD storage


Storage paradigms have performed well on-premises, but they haven’t delivered the innovation in performance and cost efficiency needed for cloud-based models. For this reason, we’re setting out to define a new standard for flash storage specifically targeted for cloud-based workloads, and I’m excited to reveal Project Denali, which we’re establishing with CNEX Labs. Fundamentally, Project Denali standardizes the SSD firmware interfaces by disaggregating the functionality for software-defined data layout and media management. With Project Denali, customers can achieve greater levels of performance, while leveraging the cost-reduction economics that come at cloud scale.

Project Denali is a standardization and evolution of Open Channel that defines the role of the SSD versus that of the host in a standard interface. Media management, error correction, mapping of bad blocks and other functionality specific to the flash generation stays on the device, while the host receives random writes, transmits streams of sequential writes, maintains the address map, and performs garbage collection. Denali allows for support of FPGAs or microcontrollers on the host side.


This provides an architectural framework that is truly cloud first. The modular architecture proposed will enable agility for new non-volatile media adoption (both NAND and Storage class memory), along with improved workload performance, through closer integration between the application and the SSD device.  It also defines a model for using software-defined data placement on SSDs to disaggregate older, monolithic storage models. When management of data placement is separated from the NAND management algorithms, non-volatile storage media is freed up to follow its own schedule for innovation. Project Denali will allow hardware companies to build simpler, less complicated hardware which will lower costs, decrease time to market, allow for workload specific tuning and enable rapid development of new NAND and memory technologies.

After maturing Project Denali with a full array of ecosystem partners, we intend to contribute the Project Denali standard to the industry to help foster even broader adoption.


Enabling hardware security


Microsoft Azure represents the cutting edge of cloud security and privacy. Microsoft spends one billion dollars per year on cybersecurity, and much of that investment goes to fundamental improvements that make Azure a trusted cloud platform. With such an intense focus on security, we recognize the need for an industry standard for hardware security. Microsoft’s Project Cerberus has been developed with the intent of creating an open industry standard for platform security.

Project Cerberus is a security co-processor that establishes a root of trust in itself for all of the hardware devices on a computing platform and helps defend platform firmware from:

◈ Malicious insiders with administrative privilege or access to hardware
◈ Hackers and malware that exploit bugs in the operating system, application, or hypervisor
◈ Supply chain attacks (manufacturing, assembly, in-transit)
◈ Compromised firmware binaries

Cerberus consists of a cryptographic microcontroller running secure code which intercepts accesses from the host to flash over the SPI bus (where firmware is stored), so it can continuously measure and attest these accesses to ensure firmware integrity and hence protect against unauthorized access and malicious updates. This enables robust pre-boot, boot-time and runtime integrity for all the firmware components in the system.

The specification is CPU and I/O architecture agnostic and is intended to easily integrate into various vendor designs over time, thus enabling more secure firmware implementations on all platform types across the industry, ranging from datacenter to IoT devices. The specification also supports hierarchical root of trust so that platform security can be extended to all I/O peripherals using the same architectural principles.

Since the introduction of Project Cerberus in late 2017, the ecosystem supporting the standard has continued to grow and we’re on the verge of contributing the hardware implementation to the community for greater collaboration and adoption.


Since 2015, we’ve been sharing the server and datacenter designs that power Microsoft Azure with the OCP community, working to empower the industry to take advantage of innovations that improve datacenter performance, efficiency, and power consumption.

Saturday 24 March 2018

Azure DNS Private Zones now available in public preview

We are pleased to announce the public preview of DNS Private Zones in all Azure Public cloud regions. This capability provides secure and reliable name resolution for your virtual networks in Azure. Private Zones was announced as a managed preview in fall of last year.


No more custom DNS server burden


Private Zones obviates the need to set up and manage custom DNS servers. You can bring DNS zones to your virtual network as you lift-and-shift applications to the Azure cloud, or if you are building cloud-native applications. You also have the flexibility to use custom domain names, such as your company’s domain name.

Name resolution across virtual networks and across regions


Private zones provide name resolution both within a virtual network and across virtual networks. Private zones can span not only virtual networks in the same region, but also regions and subscriptions. This feature is available in all Azure Public cloud regions.

Split-horizon support


You can configure zones with a split-horizon view, allowing for a private and a public DNS zone to share the same name. This is a common scenario when you want to validate your workloads in a local test environment, before rolling out in production for broader consumption. To realize this scenario, simply configure the same DNS zone as both a public zone and private zone in Azure DNS. Now for clients in a virtual network attached to the zone, Azure will return the DNS response from the private zone, and for clients on the internet, Azure will return the DNS response from the public zone. Since name resolution is confined to configured virtual networks, you can prevent DNS exfiltration.


Dynamic DNS Registration


We are introducing two concepts to DNS zones with this update: Registration virtual networks and Resolution virtual networks. When you designate a virtual network as a Registration virtual network, either when creating a private zone or later when updating the zone, Azure will dynamically register DNS A records in the private zone for the virtual machines within that virtual network, and will track virtual machine additions and removals to keep your private zone updated. This happens without any work on your part.

You can also designate up to 10 virtual networks as Resolution virtual networks when creating or updating a private zone. Forward DNS queries will resolve against the private zone records from any of these virtual networks. There is no dependency or requirement that the virtual networks be peered for DNS resolution to work across virtual networks.
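
To illustrate the registration/resolution distinction, here is a rough Python sketch using the preview-era azure-mgmt-dns SDK. The zone_type, registration_virtual_networks and resolution_virtual_networks fields are my assumption of the preview model's field names, and the resource group, zone name and virtual network IDs are placeholders; check the SDK version you have installed before relying on this.

# Rough sketch (preview-era API, field names assumed): create a private DNS zone that
# auto-registers VMs from one virtual network and answers queries from another.
# Assumes `pip install azure-identity azure-mgmt-dns`; all IDs below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.dns import DnsManagementClient
from azure.mgmt.dns.models import Zone, SubResource

subscription_id = "<your-subscription-id>"
client = DnsManagementClient(DefaultAzureCredential(), subscription_id)

reg_vnet_id = "/subscriptions/<sub>/resourceGroups/rg1/providers/Microsoft.Network/virtualNetworks/vnet-reg"  # placeholder
res_vnet_id = "/subscriptions/<sub>/resourceGroups/rg1/providers/Microsoft.Network/virtualNetworks/vnet-res"  # placeholder

zone = client.zones.create_or_update(
    "rg1",                      # hypothetical resource group
    "contoso.internal",         # hypothetical private zone name
    Zone(
        location="global",
        zone_type="Private",                                     # assumed preview field
        registration_virtual_networks=[SubResource(id=reg_vnet_id)],
        resolution_virtual_networks=[SubResource(id=res_vnet_id)],
    ),
)
print(zone.name)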

Azure DNS Private Zones also supports Reverse DNS queries for the private IP address space of the Registration virtual network.

Thursday 22 March 2018

Deploying WordPress Application using Visual Studio Team Services and Azure – Part two

This post is the second part of two blog posts describing how to setup a CI/CD pipeline using Visual Studio Team Services (VSTS) for deploying a Dockerized custom WordPress website working with Azure WebApp for Containers and Azure Database for MySQL. In Deploying WordPress application using Visual Studio Team Services and Azure - part 1 we described how to build a Continuous Integration (CI) process using VSTS, while in this part we are going to focus on the Continuous Delivery (CD) part by using the VSTS Release management.

Prerequisites for this part


◈ MySQL client installed on an Azure Virtual Machine (apt-get install mysql-client-5.7).
◈ Allow connectivity between the Azure VM and Azure Database for MySQL.
◈ Install the VSTS Route Traffic extension.

Visual Studio Team Services – Release phase


I recommend saving after completing each of the following steps:

First, we need to create a new empty release definition: go to Releases, click on the + icon, choose Create release definition, and select Empty process as the template.


Let’s start by adding artifacts to our pipeline. Click on Add in the Artifacts section (step 1); a blade will be presented on the right. Choose the following:

◈ Source type – Build
◈ Project – The relevant project name
◈ Source – The name of the build definition we created
◈ Default version – Latest
◈ Source alias – Keep the default

After clicking on Add, click on the trigger icon under the Artifacts section (step 2); a blade will be shown on the right. Choose to enable the continuous deployment trigger.

Next, we need to configure the environments section: click on the pre-deployment conditions icon (step 3), select the After release trigger, and close the right blade.

The next step is adding tasks to the development pipeline; either click on the link (step 4) or click on Tasks.

Development Pipeline


Under the Tasks tab, click on the agent phase and select Hosted Linux Preview as the agent queue. Add the following tasks by clicking the + icon:

◈ Three SSH command tasks
◈ One Azure App Service deploy task
◈ One Azure App Service Manage task


Now, we need to add the variables that we are going to use during the development CD process. Click on the Variables tab and start adding from the list below, selecting development as the scope:

◈ $(destappinsight) – Application Insights Instrumentation Key of the Dev environment
◈ $(desturl) – App Service URL of the Dev environment
◈ $(migrationfile) – Temp file name used when executing the DB backup and restore operation
◈ $(mysqldestdb) – DB name of the Dev environment
◈ $(mysqlhost) – Server name of Azure Database for MySQL
◈ $(mysqlpass) – Password for Azure Database for MySQL
◈ $(mysqlport) – Port for Azure Database for MySQL
◈ $(mysqlsourcedb) – DB name of the Local environment
◈ $(mysqluser) – User name for Azure Database for MySQL
◈ $(resultfile) – Temp file name used when executing the DB backup and restore operation
◈ $(sourceappinsight) – Application Insights Instrumentation Key of the Local environment
◈ $(sourceurl) – Local environment URL

It’s possible to use a more secure solution for storing sensitive values; read more about using Azure Key Vault to store sensitive values and use them in VSTS.

Before returning to the tasks tab, we need to add a new SSH endpoint (Settings/Services/New Service Endpoint/SSH). Fill your Azure Virtual Machine details.


Now, let’s go back to the tasks tab and start editing the tasks.

The 1st SSH task is backup DB to file with the following values:

SSH endpoint – select the relevant SSH endpoint
Run – Commands
Commands –
mysqldump -P $(mysqlport) -h $(mysqlhost)  -u $(mysqluser)  -p$(mysqlpass) $(mysqlsourcedb) > $(resultfile)  

The 2nd SSH task is replace values with the following values:

SSH endpoint – select the relevant SSH endpoint
Run – Commands
Commands –
sed 's/$(sourceurl)/$(desturl)/g;s/$(sourceappinsight)/$(destappinsight)/g' $(resultfile) > $(migrationfile)

The 3rd SSH task is restore DB from migrated file with the following values:

SSH endpoint – select the relevant SSH endpoint
Run – Commands
Commands –
mysql -h $(mysqlhost) -u $(mysqluser)  -p$(mysqlpass) $(mysqldestdb) < $(migrationfile)

The 4th task is Azure App Service Deploy version 3 with the following values

Azure subscription – select the relevant Azure subscription
App type – Linux Web App
App Service name – select Dev environment App Service
Image Source – Container Registry
Registry or Namespace – Azure Container Registry login server value
Image – The Docker image name from CI process
Tag - $(Build.BuildId)
App settings –
-DB_ENV_NAME $(mysqldestdb) -DB_ENV_USER $(mysqluser) -DB_ENV_PASSWORD $(mysqlpass) -DB_ENV_HOST $(mysqlhost)

The 5th task is Azure App Service Manage, restart Azure App Service with the following values:

Azure subscription – select the relevant Azure subscription
Action – Restart App Service
App Service name - select the App Service of Dev environment

We have completed building the development CD pipeline.

Test Pipeline


Go back to the Pipeline tab, highlight the development environment, and choose to clone the environment (step 5); call the new environment test.

For the pre-deployment conditions of the test environment (step 6), select the After environment trigger, enable the pre-deployment approvals option, and choose a member of your team as an approver to initiate the deployment process for the test environment.

After closing the blade, click on the link below to view environment tasks (step 7).

The Tasks tab will be presented; there is no need to update the first three SSH tasks.
The 4th task – update the App Service name to the correct App Service for the test environment.
The 5th task – update the App Service name again.

Go to the Variables tab, filter by the test scope, and set the variable values for the test environment. We have now completed building the test CD pipeline.

Production Pipeline


Our goal is to have a production rollout without any downtime. To achieve that, we will use the slot mechanism and routing capabilities that Azure App Service offers.

To create the production environment, repeat the same steps for creating the test environment (steps 8, 9 and 10).

Go to the Variables tab, filter by the production scope, and set the values for the production environment. Set the value of $(mysqldestdb) to the staging DB. In addition, add a new variable $(mysqlproddb) and set its value to the production DB.

Go back to the Tasks tab and update the App Service name for the 4th and 5th tasks; this time check the deploy to slot option and choose the ‘staging’ slot which we created for the production App Service.

Add additional tasks by clicking the + icon:

◈ One Route Traffic task – Select the Production App Service and stage slot, and route 100% of the traffic (see the screenshot below)

◈ Two SSH command tasks – same configuration as the other SSH tasks, just with different commands

1st task Command:

mysqldump -P $(mysqlport) -h $(mysqlhost)  -u $(mysqluser)  -p$(mysqlpass) $(mysqldestdb) > $(resultfile)

2nd task command:

mysql -h $(mysqlhost) -u $(mysqluser)  -p$(mysqlpass) $(mysqlproddb) < $(resultfile)

◈ One Azure App Service deploy task

Same as 4th task but this time without checking the slot option

◈ One Azure App Service Manage task

Same as 5th task but this time without checking the slot option

◈ One Route Traffic task

Select Production App Service, stage slot and route 0% of the traffic

We have completed building the CD for Production. See the result:


Tuesday 20 March 2018

New app usage monitoring capabilities in Application Insights

Our goal with Azure Monitoring tools is to provide full-stack monitoring for your applications. The top of this “stack” isn’t the client-side of your app; it’s your users themselves. Understanding user behavior is critical for making the right changes to your apps to drive the metrics your business cares about.

Recent improvements to the usage analytics tools in Application Insights can help your team better understand overall usage, dive deep into the impact of performance on customer experience, and give more visibility into user flows.

A faster, more insightful experience for Users, Sessions, and Events



Understanding application usage is critical to making smart investments with your development team. An application can be fast, reliable, and highly available, but if it doesn’t have many users, it’s not contributing value to your business.

The Users, Sessions, and Events tools in Application Insights make it easy to answer the most basic usage analytics question: “How much do my application and each of its features get used?”

We've re-built the Users, Sessions, and Events tools to make them even more responsive. A new sidebar of daily and monthly usage metrics helps you spot growth and retention trends. Clicking on each metric gives you more detail, like a custom workbook for analyzing monthly active users (MAU). Also, the new “Meet your users” cards put you in the shoes of some of your customers, letting you follow their journeys step-by-step in a timeline.

Introducing the Impact tool



Are slow page loads the cause of user engagement problems in your app?

The new Impact tool in Application Insights makes it easy to find out. Just by choosing a page in your app and a user action on that page, the Impact tool graphs conversion rates by page load time. This makes it easy to spot if performance really does cause your users to churn.

The Impact tool can analyze more than just performance impact. It can look for correlations between any property or measurement in your telemetry and conversion rates. So you can see how conversion varies by country, device type, and more.

On our team, the Impact tool has uncovered several places where slow page load time was strongly correlated with decreased conversion rates. Better yet, the Impact tool quantified the page load time we should aim for, the slowest page load time that still had high conversion rates.

More capabilities for User Flows



Now the User Flows tool can analyze what users did before they visited some page or custom event in your site, in addition to what they did afterward. New “Session Started” nodes show you where a node was the first in a user session so you can spot how users are entering your site.

A new Split By option allows you to create more detailed User Flows visualizations by segmenting nodes by property values. For example, let’s say your team is collecting a custom event name with an overly generic name like “Button Clicked”. You can better understand user behavior by separating out which button was clicked by splitting by a “Button Name” custom dimension. Then in the visualization, you’ll see nodes to the effect of “Button Clicked where Button Name = Save,” “Button Clicked where Button Name = Edit,” and so on.

We’ve made a few smaller improvements to User Flows, too. The visualization now better adapts to smaller screen sizes. A “Tour” button gives you a step-by-step look at how to get more out of the User Flows tool. Also, on-node hide and reveal controls make it easier to control the density of information on the visualization.

Saturday 17 March 2018

Network Forensics with Windows DNS Analytical Logging

Overview


DNS queries and responses are a key data source used by network defenders in support of incident response as well as intrusion discovery. If these transactions are collected for processing and analytics in a big data system, they can enable a number of valuable security analytic scenarios. An exercise to this end was conducted with Microsoft internal DNS systems. This document outlines the approach and results to enable Windows DNS customers to reproduce the outcome.

Motivation


The at-scale processing and analysis of DNS data in a big data system is a powerful capability that can be used to support analyst investigations and discovery of intrusions. Below is a selection of scenarios that are enabled –

IOC Detection

Domain names and IP addresses are one of the most common sources of Indicators of Compromise (IOC), often referring to Command and Control servers of attacker infrastructure. The collection, processing and storage of DNS data allows queried domains and resource record response data for hosts within the network to be searched for these IOCs, providing a quick and accurate detection of whether the network has been impacted by an intrusion. The on-going collection of this data also allows for a powerful retrospective search of IOCs on computer networks.
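
As a concrete illustration of such a retrospective sweep, the sketch below scans a flat file of collected query/response records for known-bad domains and IP addresses. The log file name, field layout, and IOC lists are illustrative assumptions; adapt the parsing to however your DNS events are actually stored.

# Minimal sketch of a retrospective IOC sweep over collected DNS records.
# Assumes a tab-separated log of: timestamp, client_ip, queried_domain, response_data.
ioc_domains = {"evil-c2.example", "dropper.example"}   # hypothetical IOC domains
ioc_ips = {"203.0.113.7", "198.51.100.23"}             # hypothetical IOC IPs

def sweep(log_path="dns_records.tsv"):
    hits = []
    with open(log_path) as log:
        for line in log:
            timestamp, client_ip, domain, response = line.rstrip("\n").split("\t")
            domain = domain.lower().rstrip(".")
            if (domain in ioc_domains
                    or any(domain.endswith("." + d) for d in ioc_domains)
                    or response in ioc_ips):
                hits.append((timestamp, client_ip, domain, response))
    return hits

if __name__ == "__main__":
    for hit in sweep():
        print(hit)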

Protocol Agnostic Detection

Network defenders often have access to other data sources that can be searched for IP and Domain IOCs, such as Web proxy and Firewall logs. DNS collection provides a higher-fidelity detection of these, where the protocol implemented by attacker Command and Control infrastructure does not involve HTTP, or where DNS itself is being used as a covert channel.

Covert Channel Detection

DNS can be used by adversaries as a covert channel to provide remote configuration or data transfer capability to malware inside a computer network. At scale analysis of abnormal response packets can be used to identify such covert channels.

Adversary Tracking

Historic logging of query and response data and associated analysis enables the tracking of command and control infrastructure usage used by adversaries over time, where multiple domains and IP addresses are used and infrastructure is transitioned following the discovery of activity.

Analytical Logging in Windows DNS


Windows Server DNS (2012R2 onwards) has implemented enhanced logging of various DNS server actions in Windows, including the logging of query and response data with a focus on negligible performance impact.

Negligible Performance Impact of Enabling

A DNS server running on modern hardware that is receiving 100,000 queries per second (QPS) can experience a performance degradation of 5% when analytic logs are enabled. There is no apparent performance impact for query rates of 50,000 QPS and lower.

Details of Logged Data


The Analytic log type implemented through this feature contains much of the day-to-day operational detail of the DNS server. Although many types of data are recorded, including zone transfer requests, responses and dynamic updates, for forensics and threat analytics we will focus on the QUERY_RECEIVED and RESPONSE_SUCCESS data types in this example. These form the core of our current internal collection and pose the biggest challenge in collection due to volumes of data.

QUERY_RECEIVED and RESPONSE_SUCCESS events that are logged contain a number of the fields that make up a query and response, but crucially also contain the full packet data received, enabling the processing of any aspect of one of these objects. Here is an example response event from the Applications and Services Logs\Microsoft\Windows\DNS-Server analytic log –


Implementation


The logging of this data was implemented as an ETW (Event Tracing for Windows) provider. Many event types in Windows are enabled via this mechanism, which allows for high-performance logging of data and subscription to these providers via their unique GUIDs. In the case of the Microsoft-Windows-DNSServer provider, GUID {EB79061A-A566-4698-9119-3ED2807060E7} is used as its identity. Windows comes with many tools to record samples of this data, such as tracelog, which will record events offered by the ETW channel and write them to a file. The Windows event viewer essentially replicates this subscription model when presenting the administrator with a view of these events, writing the collection sample to a temporary file location. As an example, here is the location of the data underlying the enhanced DNS logging feature –

%SystemRoot%\System32\Winevt\Logs\Microsoft-Windows-DNSServer%4Analytical.etl


The event viewer acts as a browser for this file.

Collection of Data


One method of collecting events from Windows servers is Windows Event Collection (WEC). WEC is a mechanism built into Windows that will forward an XML representation of an event to a configured collection server, based upon a filter specifying an event identifier and selection criteria. WEC, however, can only be configured for log types of ‘Operational’. Operational events are stored in a rolled permanent location inside an .evtx file on the host. When these events are created they are also written via the Windows Event Collector service, which performs forwarding off host. For more information on Windows Event Collection, see the following article on MSDN –

https://msdn.microsoft.com/en-us/library/windows/desktop/bb427443(v=vs.85).aspx

There is an inherent overhead in logging an event in this way, and this is the reason DNS query and response logging was implemented as an ‘Analytic’ type. Analytic log types do not write events via the WEC service and as such have a lower performance impact. This gives us the negligible performance impact mentioned earlier at a high query rate, on the order of 100,000 queries per second.

For servers that do not have such high QPS needs, using WEC from an operational channel becomes a more viable option. Internal DNS servers which serve a dedicated enterprise network may have significantly lower QPS requirements (around 10,000 QPS). At these levels, collection over WEC becomes a more realistic scenario. Further, from a security analytics standpoint, the majority of queries and responses for reputable domains, such as *.microsoft.com, are less valuable to us and can be dropped, further reducing the effective QPS logged for WEC.

For the internal Microsoft project, a high-performance event collector was implemented that consumes the QUERY_RECEIVED and RESPONSE_SUCCESS events from the ETW channel on DNS servers. This consumer filters out highly reputable domains that are less valuable to us for security analytics and writes the remaining events to an operational log equivalent, ready for collection over normal WEC channels. The following diagram gives a high-level overview of the functionality of this tool –


Query Volume Modelling


Selecting domain names for filtering can be a balancing act that requires modelling using sample data. Front-loading too many domain filters in the tool can cause unnecessary processing, whilst letting high-volume domains through to the event writer can result in unnecessary volume and associated storage costs.

Customers can collect a sample of query and response data from the analytic logging in a .etl file from a DNS server on their network. This ETL file can be analyzed for the top-volume queried domains in terms of Second Level Domain (SLD). Taking the top SLDs, excluding country TLDs such as .co.uk, customers can model query volume reduction by the application of various SLD filters, and extrapolate these to enterprise network coverage. These figures can then be used to calculate approximate storage costs of implementing such a solution. Alternatively, data such as the Alexa Top 100 SLDs can be taken as a starting point and refined as required to fit the needs of the enterprise.
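
The volume modelling described above can be prototyped with a short script. The sketch below assumes the sampled queried names have already been exported from the .etl capture to a text file (one domain per line) with a tool of your choice; the file name and the naive SLD extraction (which, as noted above, ignores country-code TLDs such as .co.uk) are assumptions for illustration.

# Sketch: model query-volume reduction from filtering out the top-N second-level domains.
from collections import Counter

def sld(domain):
    # Naive SLD extraction; does not handle country-code TLDs such as .co.uk.
    parts = domain.lower().rstrip(".").split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else domain

def model_reduction(sample_path="sampled_queries.txt", top_n=100):
    counts, total = Counter(), 0
    with open(sample_path) as sample:
        for line in sample:
            name = line.strip()
            if name:
                counts[sld(name)] += 1
                total += 1
    dropped = sum(count for _, count in counts.most_common(top_n))
    remaining = total - dropped
    print(f"Total sampled queries: {total}")
    print(f"Dropped by top-{top_n} SLD filter: {dropped}")
    print(f"Remaining: {remaining} ({100.0 * remaining / total:.2f}% of the total)")

if __name__ == "__main__":
    model_reduction()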

In our prototype, a value of 100 SLDs was chosen for implementation based upon sample data logged on the network. This resulted in a reduction in volume from 3,286 QPS in our sample data set to 136 QPS, 4.13% of the total volume. The extrapolated effective QPS rate for the whole of the network drops significantly in this scenario, easily manageable by big data and WEC infrastructure.

The Pilot Deployment and Results


We worked with Microsoft IT to pilot the analytic logging and the ETW consumer/filtering tool on the Microsoft corporate DNS infrastructure. The pilot project was rolled out to 29 DNS caching servers. Here is a snapshot of the query volume that this project was dealing with:


The total size of raw data storage post filtering for all 29 Microsoft corporate DNS servers peaks at approximately 100 GB/day at the busiest times, and drops down to around 15 GB/day during less busy periods such as weekends.

This enables a whole new way of using information related to compromised domains and the identification of malicious transactions and infected machines, enabling us to monitor and fortify our network.

Thursday 15 March 2018

Heuristic DNS detections in Azure Security Center

Today, we are discussing some of our more complex, heuristic techniques to detect malicious use of this vital protocol and how these detect key components of common real-world attacks.

These analytics focus on behavior that is common to a variety of attacks, ranging from advanced targeted intrusions to the more mundane worms, botnets and ransomware. Such techniques are designed to complement more concrete signature-based detection, giving the opportunity to identify such behavior prior to the deployment of analyst driven rules. This is especially important in the case of targeted attacks, where time to detection of such activity is typically measured in months. The longer an attacker has access to a network, the more expensive the eventual clean-up and removal process becomes. Similarly, while rule-based detection of ransomware is normally available within a few days of an outbreak, this is often too late to avoid significant brand and financial damage for many organizations.

These analytics, along with many more, are enabled through Azure Security Center upon enabling the collection of DNS logs on Azure based servers. While this logging requires Windows DNS servers, the detections themselves are largely platform agnostic, so they can run across any client operating system configured to use an enabled server.

A typical attack scenario


A bad guy seeking to gain access to a cloud server starts a script attempting to log in by brute force guessing of the local administrator password. With no limit to the number of incorrect login attempts, following several days of effort the attacker eventually correctly guesses the perceived strong password of St@1w@rt.

Upon successful login, the intruder immediately proceeds to download and install a malicious remote administration tool. This enables a raft of useful functions, such as the automated stealing of user passwords, detection of credit card or banking details, and assistance in subsequent brute force or Denial-of-Service attacks. Once running, this tool begins periodically beaconing over HTTP to a pre-configured command and control server, awaiting further instruction.

This type of attack, while seemingly trivial to detect, is not always easy to prevent. For instance, limiting incorrect login attempts appears to be a sensible precaution, but doing so introduces a severe risk of denial of service through lockouts. Likewise, although it is simple to detect large numbers of failed logins, it is not always easy to differentiate legitimate user activity from the almost continual background noise of often distributed brute force attempts.

Detection opportunities


For many of our analytics, we are not specifically looking for the initial infection vector. While our above example could potentially have been detected from its brute force activity, in practice, this could just as easily have been a single malicious login using a known password, as might be the case following exploitation of a legitimate administrator’s desktop or successful social engineering effort. The following techniques are therefore looking to detect the subsequent behavior or the downloading and running of the malicious service.

Network artifacts


Attacks, such as the one outlined above, have many possible avenues of detection over the network, but a consistent feature of almost all attacks is their usage of DNS. Regardless of transport protocol used, the odds are that a given server will be contacted by its domain name. This necessitates usage of DNS to resolve this hostname to an IP address. Therefore, by analyzing only DNS interactions, you get a useful view of outbound communication channels from a given network. An additional benefit to running analytics over DNS, rather than the underlying protocols, is local caching of common domains. This reduces their prevalence on the network, reducing both storage and computational expense of any analytic framework.


WannaCry Ransomware detected by Random Domain analytic.


Malware report listing hard-coded domains enumerated by WannaCry ransomware.

Random domains


Malicious software has a tendency towards randomly generated domains. This may be for many reasons, ranging from simple language issues (avoiding the need to tailor domains to each victim’s native language), to assisting in the automated registration of large numbers of such names, to helping reduce the chances of accidental reuse or collision. This is highlighted by techniques such as Domain Generation Algorithms (DGAs), but randomly generated names are also frequently used for static download sites and command and control servers, as in the WannaCry example above.

Detecting these “random” names is not always straightforward. Standard tests tend to only work on relatively large amounts of data. Entropy, for instance, requires a minimum of several times the size of the character set, or at least hundreds of bytes, while domain name labels are at most 63 characters long. To address this issue, we have used basic language modelling, calculating the probabilities of various n-grams occurring in legitimate domain names. We then use these to detect the occurrence of highly unlikely combinations of characters in a given name.
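
A minimal version of this language-model check can be sketched as follows: train character bigram probabilities on a corpus of known-good domain labels, then flag names whose average per-bigram log-probability falls below a threshold. The corpus file name, smoothing constant, threshold, and example labels below are assumptions you would tune against your own data; this is an illustration of the idea, not the production analytic.

# Sketch: flag "random-looking" domain labels with a smoothed character bigram model.
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"

def bigrams(label):
    padded = "^" + label.lower() + "$"          # mark start and end of the label
    return list(zip(padded, padded[1:]))

def train(corpus_path="good_domains.txt"):      # assumed file: one known-good name per line
    pair_counts, first_counts = Counter(), Counter()
    with open(corpus_path) as corpus:
        for line in corpus:
            label = line.strip().split(".")[0]  # score the leftmost label only
            for a, b in bigrams(label):
                pair_counts[(a, b)] += 1
                first_counts[a] += 1
    return pair_counts, first_counts

def score(label, pair_counts, first_counts, smoothing=1.0):
    # Average log-probability per bigram; lower means less "language-like".
    log_prob, n = 0.0, 0
    for a, b in bigrams(label):
        p = (pair_counts[(a, b)] + smoothing) / (first_counts[a] + smoothing * (len(ALPHABET) + 2))
        log_prob += math.log(p)
        n += 1
    return log_prob / max(n, 1)

pair_counts, first_counts = train()
for name in ["xq3vprzt9kd0aaml", "update"]:      # illustrative labels only
    s = score(name, pair_counts, first_counts)
    print(name, round(s, 2), "looks random" if s < -3.5 else "looks normal")  # assumed threshold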


Malware report detailing use of randomly generated domain names by ShadowPad trojan.

Periodicity


As mentioned, this attack involved the periodic beaconing of a command and control server. For the sake of argument, let’s assume this is an hourly HTTP request. When attempting to make this request, the HTTP client will first attempt to resolve the server’s domain name through the local DNS resolver. This resolver will tend to keep a local cache of such resolutions, meaning that you cannot guarantee you will see a DNS request on every beacon. However, the requests you do see will fall on some multiple of an hour.

In attempting to find such periodic activity, we use a version of Euclid’s algorithm to keep track of an approximate greatest common divisor (GCD) of the time between lookups of each specific domain. Once a domain’s GCD falls within the permitted error (in the exact case, to one), it is added to a bloom filter of domains to be ignored in further calculations. Assuming a GCD greater than this error, we take the current GCD, or estimate of the beacon period, and the number of observations to calculate the probability of observing this many consecutive lookups on multiples of this period. For example, the chance of randomly seeing three consecutive lookups to some domain, all on multiples of two seconds, is 1/2^3, or 1 in 8. On the other hand, as with our example, the probability of seeing three lookups, precisely to the nearest second, on multiples of one hour is 1/3600^3, or 1 in 46,656,000,000. Thus, the longer the time delta, the fewer observations we need before we are certain it is periodic.
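
The approximate-GCD bookkeeping described above can be sketched in a few lines: for each domain, fold every new inter-lookup gap into a running GCD (with a tolerance), and once enough observations accumulate on multiples of a sufficiently long period, the probability of that happening by chance becomes negligibly small. The tolerance, minimum period, and alert threshold below are illustrative assumptions, and a production version would also use a bloom filter, as described above, to stop tracking domains whose GCD collapses to the error bound.

# Sketch: track an approximate greatest common divisor of inter-lookup gaps per domain.
from collections import defaultdict

TOLERANCE = 2          # seconds of jitter allowed when reducing the GCD (assumption)
MIN_PERIOD = 60        # ignore periods shorter than this, likely caching noise (assumption)
ALERT_PROB = 1e-9      # alert when the chance probability drops below this (assumption)

def approx_gcd(a, b, tol=TOLERANCE):
    # Euclid's algorithm, stopping once the remainder falls within the tolerance.
    while b > tol:
        a, b = b, a % b
    return a

state = defaultdict(lambda: {"last": None, "gcd": None, "hits": 0})

def observe(domain, timestamp):
    entry = state[domain]
    if entry["last"] is not None:
        gap = timestamp - entry["last"]
        entry["gcd"] = gap if entry["gcd"] is None else approx_gcd(entry["gcd"], gap)
        entry["hits"] += 1
        period = entry["gcd"]
        if period and period >= MIN_PERIOD:
            # Chance of `hits` gaps all landing on multiples of `period` to the nearest second.
            chance = (1.0 / period) ** entry["hits"]
            if chance < ALERT_PROB:
                print(f"possible beacon: {domain} every ~{period}s "
                      f"({entry['hits']} observations, p~{chance:.2e})")
    entry["last"] = timestamp

# Example: lookups exactly one hour apart; after three observed gaps the chance
# probability is 1/3600^3 (about 2.1e-11), which triggers the alert.
for t in (0, 3600, 7200, 10800):
    observe("c2.example", t)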

Conclusion


As demonstrated in the above scenario, analyzing network artifacts can be extremely useful in detecting malicious activity on endpoints. While the ideal situation is the analysis of all protocols from every machine on a network, in practice this is too expensive to collect and process. Choosing a single protocol to give the highest chance of detecting malicious communications while minimizing the volume of data collected results in a choice between HTTP and DNS. By choosing DNS, you lose the ability to detect direct IP connections. In practice, these are rare, due to the relative scarcity of static IP addresses, alongside the potential to block such connections at firewalls. The benefit of examining DNS is its ability to observe connections across all possible network protocols from all client operating systems in a relatively small dataset. The compactness of this data is further aided by the default behavior of on-host caching of common domains.