
Tuesday, 18 February 2020

SQL Server runs best on Azure. Here’s why

SQL Server customers migrating their databases to the cloud have multiple choices for their cloud destination. To thoroughly assess which cloud is best for SQL Server workloads, two key factors to consider are:

1. Innovations that the cloud provider can uniquely provide.

2. Independent benchmark results.

What innovations can the cloud provider bring to your SQL Server workloads?


As you consider your options for running SQL Server in the cloud, it's important to understand what the cloud provider can offer both today and tomorrow. Can they provide you with the capabilities to maximize the performance of your modern applications? Can they automatically protect you against vulnerabilities and ensure availability for your mission-critical workloads?

SQL Server customers benefit from our continued expertise developed over the past 25 years, delivering performance, security, and innovation. This includes deploying SQL Server on Azure, where we provide customers with innovations that aren’t available anywhere else. One great example of this is Azure BlobCache, which provides fast, free reads for customers. This feature alone provides tremendous value to our customers that is simply unmatched in the market today.

Additionally, we offer preconfigured, built-in security and management capabilities that automate tasks like patching, high availability, and backups. Azure also offers advanced data security that enables both vulnerability assessments and advanced threat protection. Customers benefit from all of these capabilities both when using our Azure Marketplace images and when self-installing SQL Server on Azure virtual machines.

Only Azure offers these innovations.

What are their performance results on independent, industry-standard benchmarks?


Benchmarks can often be useful tools for assessing your cloud options. It's important, though, to ask if those benchmarks were conducted by independent third parties and whether they used today’s industry-standard methods.


In December, an independent study by GigaOm compared SQL Server on Azure Virtual Machines to AWS EC2 using a field test derived from the industry standard TPC-E benchmark. GigaOm found Azure was up to 3.4x faster and 87 percent cheaper than AWS. Today, we are pleased to announce that in GigaOm’s second benchmark analysis, using the latest virtual machine comparisons and disk striping, Azure was up to 3.6x faster and 84 percent cheaper than AWS.

Thursday, 30 May 2019

Key causes of performance differences between SQL managed instance and SQL Server

Migrating to a Microsoft Azure SQL Database managed instance provides a host of operational and financial benefits you can only get from a fully managed and intelligent cloud database service. Some of these benefits come from features that optimize or improve overall database performance. After migration many of our customers are eager to compare workload performance with what they experienced with on-premises SQL Server, and sometimes they're surprised by the results. In many cases, you might get better results on the on-premises SQL Server database because a SQL Database managed instance introduces some overhead for manageability and high availability. In other cases, you might get better results on a SQL Database managed instance because the latest version of the database engine has improved query processing and optimization features compared to older versions of SQL Server.


This article will help you understand the underlying factors that can cause performance differences and the steps you can take to make fair comparisons between SQL Server and SQL Database.

If you're surprised by the comparison results, it's important to understand what factors could influence your workload and how to configure your test environments to ensure you have a fair comparison. Some of the top reasons why you might experience lower performance on a SQL Database managed instance compared to SQL Server are listed below. You can mitigate some of these by increasing and pre-allocating file sizes or adding cores; however, the others are prerequisites for guaranteed high availability and are part of the PaaS service.

Simple or bulk recovery model


Databases placed on a SQL Database managed instance use the full database recovery model to provide high availability and guarantee no data loss. In this scenario, one of the most common reasons for worse performance on a SQL Database managed instance is that your source database uses a simple or bulk-logged recovery model. The drawback of the full recovery model is that it generates more log data than the simple or bulk-logged recovery models, so DML transaction processing under the full recovery model will be slower.

You can use the following query to determine what recovery model is used on your databases:

select name, recovery_model_desc from sys.databases

If you want to compare a workload running on SQL Server with one running on a SQL Database managed instance, make sure the databases on both sides use the full recovery model for a fair comparison.
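
If the source database uses the simple or bulk-logged model, a minimal sketch like the following (the database name and backup path are placeholders) switches it to the full recovery model; taking a full backup afterwards starts the log chain so full logging actually takes effect:

-- Switch the on-premises database to the full recovery model
-- ("YourDatabase" and the backup path are placeholders for your own names).
ALTER DATABASE [YourDatabase] SET RECOVERY FULL;

-- Take a full backup so the log chain starts and full logging takes effect.
BACKUP DATABASE [YourDatabase] TO DISK = N'D:\Backups\YourDatabase.bak';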

Resource governance and HA configuration


SQL Database managed instance has built-in resource governance that ensures 99.99% availability and guarantees that management operations such as automated backups complete even under high workloads. If you don't apply similar constraints on your SQL Server, the built-in resource governance on the SQL Database managed instance might limit your workload.

For example, there's an instance log throughput limit (up to 22 MB/s on the general purpose tier and up to 48 MB/s on the business critical tier) that ensures you can't load data faster than the instance can back it up. When you hit this limit, you might see INSTANCE_LOG_RATE_GOVERNOR wait statistics that don't exist on your SQL Server instance. These resource governance constraints might slow down operations such as bulk load or index rebuild because these operations require higher log rates.

In addition, the secondary replicas in business critical tier instances might slow down the primary database if they can't keep up with the changes and apply them, so you might see additional HADR_DATABASE_FLOW_CONTROL or HADR_THROTTLE_LOG_RATE_SEND_RECV wait statistics.
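
A quick way to check whether these governance and HA related waits are significant on the managed instance is to read the cumulative wait statistics; below is a minimal sketch using the standard sys.dm_os_wait_stats DMV (INSTANCE_LOG_RATE_GOVERNOR is the documented wait type for the instance log rate limit):

-- Cumulative wait statistics for the resource-governance and HA wait types
-- discussed above; run this on the SQL Database managed instance.
SELECT wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'INSTANCE_LOG_RATE_GOVERNOR',
                    N'HADR_DATABASE_FLOW_CONTROL',
                    N'HADR_THROTTLE_LOG_RATE_SEND_RECV')
ORDER BY wait_time_ms DESC;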

If you're comparing a SQL Server workload running on local SSD storage to the business critical tier, note that a business critical instance is an Always On availability group cluster with three secondary replicas. Make sure your source SQL Server has a comparable HA implementation using Always On availability groups with at least one synchronous commit replica. Comparing the business critical tier with a single SQL Server instance writing to a local disk is unrealistic because the source instance has no HA at all. If you use asynchronous Always On replicas, you still have HA with better performance, but you're trading the possibility of data loss for that performance, so you will get better results on the SQL Server instance.

Automated backup schedule


One of the main reasons to choose a SQL Database managed instance is that it guarantees your databases are always backed up, even under heavy workloads. The databases in a SQL Database managed instance have scheduled full, differential, and log backups. Full backups are taken every seven days, differential backups every twelve hours, and log backups every five to ten minutes. If you have multiple databases on the instance, there's a high chance at least one backup is running at any given time.

Since backup operations use some instance resources (CPU, disk, network), they can affect workload performance. Make sure the databases on the system you compare with the managed instance have similar backup schedules. Otherwise, you might need to accept better results on your SQL Server instance because you're trading database recoverability for performance, a trade-off that isn't possible on a SQL Database managed instance.

If you're seeing unexpected performance differences, check whether an ongoing full or differential backup on either the SQL Database managed instance or the SQL Server instance is affecting the currently running workload, using the following query:

SELECT r.command, query = a.text, start_time, percent_complete,
      eta = dateadd(second,estimated_completion_time/1000, getdate())
FROM sys.dm_exec_requests r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) a
 WHERE r.command IN ('BACKUP DATABASE','BACKUP LOG')

If you see a full or differential backup running during a short benchmark, you might pause your workload and resume it once the backup finishes.

Connection and App to Database proximity


The application that accesses the databases and executes the benchmark queries must have similar network proximity to the SQL Database managed instance and to the SQL Server instance. If you place your application and SQL Server database in the same local environment (or run an app like HammerDB on the machine where SQL Server is installed), you will get better results on SQL Server than on the SQL Database managed instance, which sits in a distributed cloud environment relative to the application. To get valid results, in both cases run the benchmark application or query on a separate virtual machine in the same region as the SQL Database managed instance. If you're comparing an on-premises environment with the equivalent cloud environment, measure bandwidth and latency between the app and the database and try to ensure they are similar.

SQL Database managed instance is accessed via proxy gateway nodes that accept client requests and redirect them to the actual database engine nodes. To get results closer to your environment, enable ProxyOverride mode on your instance using the Set-AzSqlInstance PowerShell cmdlet, which enables direct access from the client to the nodes currently hosting your SQL Database managed instance.

In addition, due to compliance requirements, a SQL Database managed instance always enforces SSL/TLS transport encryption. Encryption can introduce overhead when there is a large number of queries. If your on-premises environment doesn't enforce SSL encryption, you will see additional network overhead on the SQL Database managed instance.

Transparent data encryption


The databases on a SQL Database managed instance are encrypted by default using Transparent Data Encryption (TDE). TDE encrypts and decrypts every page exchanged with disk storage, which consumes additional CPU and adds latency when fetching and saving data pages. Make sure the databases on both the SQL Database managed instance and SQL Server have Transparent Data Encryption either turned on or off, and that any database encryption or decryption operations have completed before you start performance testing.

You can use the following query to determine whether the databases are encrypted:

select name, is_encrypted from sys.databases

Another important factor that might affect your performance is an encrypted TempDB. TempDB is encrypted if at least one database on your SQL Server or SQL Database managed instance is encrypted. As a result, you might be comparing two unencrypted databases, but because some other database on the SQL Database managed instance is encrypted (even though it isn't involved in the workload), TempDB is encrypted too. The unencrypted databases still use the encrypted TempDB, and any query that creates temporary objects or spills to TempDB will be slower. Note that TempDB is only decrypted once all user databases on the instance are decrypted and the instance restarts. Scaling a SQL Database managed instance to a new pricing tier and back is one way to restart it.
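
To verify both settings, including TempDB, a minimal sketch over the standard catalog views is shown below; tempdb only appears in sys.dm_database_encryption_keys while it is encrypted.

-- Encryption flag and actual encryption state for every database.
-- encryption_state: 2 = encryption in progress, 3 = encrypted, 5 = decryption in progress.
SELECT d.name, d.is_encrypted, k.encryption_state
FROM sys.databases AS d
LEFT JOIN sys.dm_database_encryption_keys AS k
    ON d.database_id = k.database_id
ORDER BY d.name;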

Database engine settings


Make sure the database engine settings such as database compatibility levels, trace flags, system configurations ('cost threshold for parallelism', 'max degree of parallelism'), database scoped configurations (LEGACY_CARDINALITY_ESTIMATOR, PARAMETER_SNIFFING, QUERY_OPTIMIZER_HOTFIXES, etc.), and database settings (AUTO_UPDATE_STATISTICS, DELAYED_DURABILITY) are the same on the SQL Server database and the SQL Database managed instance.

The following sample queries can help you identify these settings on SQL Server and Azure SQL Database managed instance:

select compatibility_level, snapshot_isolation_state_desc, is_read_committed_snapshot_on,

  is_auto_update_stats_on, is_auto_update_stats_async_on, delayed_durability_desc
from sys.databases;
GO

select * from sys.database_scoped_configurations;
GO

dbcc tracestatus;
GO

select * from sys.configurations;

Compare the results of these queries on the SQL Database managed instance and SQL Server, and try to align any differences you identify.

Note: The list of trace flags and configurations can be very long, so we recommend filtering it or looking only at the trace flags you've changed or know affect performance. Some trace flags are pre-configured on a SQL Database managed instance as part of the PaaS configuration and do not affect performance.

You might experiment with raising the compatibility level, turning on the legacy cardinality estimator, or enabling automatic tuning on the SQL Database managed instance, which might give you better results than your SQL Server database.
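
As a hedged sketch of those experiments (the database name and compatibility level are placeholders; pick values that match your own baseline), the statements below raise the compatibility level, enable the legacy cardinality estimator, and turn on the FORCE_LAST_GOOD_PLAN automatic tuning option:

-- Example only: choose the values that match your own baseline and workload.
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 150;

-- Run in the context of the database being tuned.
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;

-- Automatic plan correction (forcing the last known good plan).
ALTER DATABASE [YourDatabase] SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);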

Also note that a SQL Database managed instance might provide better performance even after you align all parameters, because it includes the latest improvements and fixes that aren't bound to the compatibility level, as well as features like forcing the last good plan that might improve your workload.

Hardware and environment specification


SQL Database managed instance runs on standardized hardware with pre-defined technical characteristics that are probably different from your environment. Some characteristics you might need to consider when comparing your environment with the environment where the SQL Database managed instance runs are:

1. The number of cores should be the same on SQL Server and the SQL Database managed instance. Note that a SQL Database managed instance uses 2.3-2.4 GHz processors, which might differ from your processor speed, so the same operation might consume more or less CPU. If possible, check whether hyperthreading is used in the SQL Server environment when comparing to the Gen4 and Gen5 hardware generations on a SQL Database managed instance: an instance on Gen4 hardware does not use hyperthreading, while Gen5 does. If you're comparing SQL Server running on a bare-metal machine with a SQL Database managed instance or with SQL Server running on a virtual machine, you'll probably get better results on the bare-metal machine.

2. The amount of memory, including the memory/core ratio (5.1 GB/core on Gen5, 7 GB/core on Gen4). A higher memory/core ratio provides a bigger buffer pool cache and a higher cache hit ratio. If your workload does not perform well on a managed instance with a 5.1 GB/core memory ratio, you probably need to choose a virtual machine with the appropriate memory/core ratio instead of a SQL Database managed instance.

3. IO characteristics – Be aware that the performance of the storage system might be very different from your on-premises environment. A SQL Database managed instance is a cloud database and relies on Azure cloud infrastructure.

◈ The general purpose tier uses remote Azure Premium disks, where IO performance depends on file size. If you reach the log rate limit that depends on the file size, you might notice WRITELOG waits and lower IOPS in the file statistics. This can occur on a SQL Database managed instance when log files are small and not pre-allocated. You might need to increase the size of some files in the general purpose tier to get better performance (see the sketch after this list).

◈ A SQL Database managed instance does not use instant file initialization, so you might see additional PREEMPTIVE_OS_WRITEFILEGATHER wait statistics because data files are filled with zero bytes during file growth.

4. Local or remote storage types – Make sure you consider local SSD versus remote storage when making the comparison. The general purpose tier uses remote storage (Azure Premium Storage) that can't match an on-premises environment using local SSD or a high-performance SAN; in that case you would need to target the business critical tier instead. The general purpose tier can be compared with other cloud databases, such as SQL Server on Azure Virtual Machines, that also use remote storage (Azure Premium Storage). In addition, be aware that the remote storage used by a general purpose instance still differs from the remote storage used by a SQL virtual machine because:

◈ The general purpose tier uses dedicated IO resources for each database file, which depend on the size of the individual files, while SQL Server on an Azure virtual machine uses shared IO resources for all files, where IO characteristics depend on the size of the disk. If you have many small files, you'll get better performance on a SQL virtual machine; you can get better performance on a SQL Database managed instance if file usage can be parallelized, because there are no noisy neighbors sharing the same IO resources.

◈ SQL Virtual Machines use a read-caching mechanism that improves read speed.
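
To see how big each file is and how it performs on the general purpose tier, a minimal sketch like the one below reads the per-file I/O statistics and then pre-grows a file; the database name, file name, and 128 GB target size are all placeholders.

-- Per-file size and cumulative I/O statistics (file size is reported in 8 KB pages).
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.name AS file_name,
       mf.size * 8 / 1024 AS size_mb,
       vfs.num_of_reads, vfs.num_of_writes,
       vfs.io_stall_read_ms, vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
ORDER BY vfs.io_stall_write_ms DESC;

-- Hypothetical example: pre-allocate a data or log file so it falls into a higher
-- premium disk performance band on the general purpose tier.
ALTER DATABASE [YourDatabase]
    MODIFY FILE (NAME = N'YourDataFile', SIZE = 128GB);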

If your hardware specs and resource allocation differ, you might see performance differences that can only be resolved by changing the service tier or increasing file sizes. If you're comparing a SQL Database managed instance with SQL Server on Azure Virtual Machines, choose a virtual machine series with a memory/CPU ratio similar to the SQL Database managed instance, such as the DS series.

Azure SQL Database managed instance provides a powerful set of tools that can help you troubleshoot and improve the performance of your databases, in addition to built-in intelligence that can automatically resolve potential issues.

Friday, 11 January 2019

Performance troubleshooting using new Azure Database for PostgreSQL features

At Ignite 2018, Microsoft’s Azure Database for PostgreSQL announced the preview of Query Store (QS), Query Performance Insight (QPI), and Performance Recommendations (PR) to help ease performance troubleshooting, in response to customer feedback. This blog intends to inspire ideas on how you can use features that are currently available to troubleshoot some common scenarios.

This blog nicely categorizes the problem space into several areas and the common techniques to rule out possibilities to quickly get to the root cause. We would like to further expand on this with the help of these newly announced features (QS, QPI, and PR).

In order to use these features, you need to enable data collection by setting the pg_qs.query_capture_mode and pgms_wait_sampling.query_capture_mode server parameters to ALL.
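
Once those server parameters are set (through the Azure portal or the Azure CLI), a quick sanity check from psql, sketched below, confirms the values the server is actually running with:

-- Verify the capture modes currently in effect.
SHOW pg_qs.query_capture_mode;
SHOW pgms_wait_sampling.query_capture_mode;

-- Or read them from pg_settings.
SELECT name, setting
FROM pg_settings
WHERE name IN ('pg_qs.query_capture_mode', 'pgms_wait_sampling.query_capture_mode');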


You can use Query Store for a wide variety of scenarios, enabling data collection that helps you troubleshoot them more effectively. In this article, we will limit the scope to the regressed queries scenario.

Regressed queries


One of the important scenarios that Query Store enables you to monitor is regressed queries. By setting pg_qs.query_capture_mode to ALL, you get a history of your query performance over time, and you can leverage this data for simple or more complex comparisons based on your needs.

One of the challenges you face when generating a regressed query list is the selection of comparison period in which you baseline your query runtime statistics. There are a handful of factors to think about when selecting the comparison period:

◉ Seasonality: Does the workload or the query of your concern occur periodically rather than continuously?
◉ History: Is there enough historical data?
◉ Threshold: Are you comfortable with a flat percentage change threshold or do you require a more complex method to prove the statistical significance of the regression?

Now, let’s assume no seasonality in the workload and that the default seven days of history will be enough to evaluate a simple threshold of change to pick regressed queries. All you need to do is to pick a baseline start and end time, and a test start and end time to calculate the amount of regression for the metric you would like to track.

Looking at the past seven days of history compared to the last two hours of execution, the function below returns the top regressed queries in descending order of percentage change. Note that a negative value indicates an improvement from the baseline to the test period, while zero means the query is either unchanged or was not executed during the baseline period.

create or replace function get_ordered_query_performance_changes(
baseline_interval_start int,
baseline_interval_type text,
current_interval_start int,
current_interval_type text)
returns table (
     query_id bigint,
     baseline_value numeric,
     current_value numeric,
     percent_change numeric
) as $$
with data_set as (
select query_id
, round(avg( case when start_time >= current_timestamp - ($1 || $2)::interval and start_time < current_timestamp - ($3 || $4)::interval then mean_time else 0 end )::numeric,2) as baseline_value
, round(avg( case when start_time >= current_timestamp - ($3 || $4)::interval then mean_time else 0 end )::numeric,2) as current_value
from query_store.qs_view where query_id != 0 and user_id != 10 group by query_id ) , 
query_regression_data as (
select *
, round(( case when baseline_value = 0 then 0 else (100*(current_value - baseline_value) / baseline_value) end )::numeric,2) as percent_change 
from data_set ) 
select * from query_regression_data order by percent_change desc;
$$
language 'sql';

If you create this function and execute the following, you will get the top regressed queries in the last two hours in descending order compared to their calculated baseline value over the last seven days up to two hours ago.

select * from get_ordered_query_performance_changes (7, 'days', 2, 'hours');


The top changes are all good candidates to investigate, unless you expect that kind of delta from your baseline period because, say, you know the data size has changed or the transaction volume has increased. Once you have identified the query you want to investigate further, the next step is to look deeper into the Query Store data, see how the baseline statistics compare to the current period, and collect additional clues.

create or replace function compare_baseline_to_current_by_query_id(baseline_interval_cutoff int,baseline_interval_type text,query_id bigint,percentile decimal default 1.00)
returns table(
     query_id bigint,
     period text,
     percentile numeric,
     total_time numeric,
     min_time numeric,
     max_time numeric,
     rows numeric,
     shared_blks_hit numeric,
     shared_blks_read numeric,
     shared_blks_dirtied numeric,
     shared_blks_written numeric,
     local_blks_hit numeric,
     local_blks_read numeric,
     local_blks_dirtied numeric,
     local_blks_written numeric,
     temp_blks_read numeric,
     temp_blks_written numeric,
     blk_read_time numeric,
     blk_write_time numeric
)
as $$

with data_set as
( select *
, ( case when start_time >= current_timestamp - ($1 || $2)::interval then 'current' else 'baseline' end ) as period
from query_store.qs_view where query_id = ( $3 )
)
select query_id
, period
, round((case when $4 <= 1 then 100 * $4 else $4 end)::numeric,2) as percentile
, round(percentile_cont($4) within group ( order by total_time asc)::numeric,2) as total_time
, round(percentile_cont($4) within group ( order by min_time asc)::numeric,2) as min_time
, round(percentile_cont($4) within group ( order by max_time asc)::numeric,2) as max_time
, round(percentile_cont($4) within group ( order by rows asc)::numeric,2) as rows
, round(percentile_cont($4) within group ( order by shared_blks_hit asc)::numeric,2) as shared_blks_hit
, round(percentile_cont($4) within group ( order by shared_blks_read asc)::numeric,2) as shared_blks_read
, round(percentile_cont($4) within group ( order by shared_blks_dirtied asc)::numeric,2) as shared_blks_dirtied
, round(percentile_cont($4) within group ( order by shared_blks_written asc)::numeric,2) as shared_blks_written
, round(percentile_cont($4) within group ( order by local_blks_hit asc)::numeric,2) as local_blks_hit
, round(percentile_cont($4) within group ( order by local_blks_read asc)::numeric,2) as local_blks_read
, round(percentile_cont($4) within group ( order by local_blks_dirtied asc)::numeric,2) as local_blks_dirtied
, round(percentile_cont($4) within group ( order by local_blks_written asc)::numeric,2) as local_blks_written
, round(percentile_cont($4) within group ( order by temp_blks_read asc)::numeric,2) as temp_blks_read
, round(percentile_cont($4) within group ( order by temp_blks_written asc)::numeric,2) as temp_blks_written
, round(percentile_cont($4) within group ( order by blk_read_time asc)::numeric,2) as blk_read_time
, round(percentile_cont($4) within group ( order by blk_write_time asc)::numeric,2) as blk_write_time
from data_set
group by 1, 2
order by 1, 2 asc;
$$
language 'sql';

Once you create the function, provide the query ID you would like to investigate. The function compares the aggregate values before and after the cutoff time you provide. For instance, the following statement compares all points prior to two hours ago with the points from the last two hours for the given query:

select * from compare_baseline_to_current_by_query_id(2, 'hours', 4271834468);

If you are aware of outliers that you want to exclude, you can pass a percentile value; for example, the following call uses a 30-minute cutoff and the 95th percentile:

select * from compare_baseline_to_current_by_query_id(30, 'minutes', 4271834468, 0.95);

If you don't pass one, the default includes all data points.


If you rule out a significant change in data size and the cache hit ratio is fairly steady, you may also want to investigate any obvious changes in wait event occurrences within the same period. Because wait event types group similar wait types into buckets, there isn't a single prescription for how to analyze the data; however, a general comparison can give you an idea of how the system state has changed.

create or replace function compare_baseline_to_current_by_wait_event (baseline_interval_start int,baseline_interval_type text,current_interval_start int,current_interval_type text)
returns table(
     wait_event text,
     baseline_count bigint,
     current_count bigint,
     current_to_baseline_factor double precision,
     percent_change numeric
)
as $$
with data_set as
( select event_type || ':' || event as wait_event
, sum( case when start_time >= current_timestamp - ($1 || $2)::interval and start_time < current_timestamp - ($3 || $4)::interval then 1 else 0 end ) as baseline_count
, sum( case when start_time >= current_timestamp - ($3 || $4)::interval then 1 else 0 end ) as current_count
, extract(epoch from ( $1 || $2 ) ::interval) / extract(epoch from ( $3 || $4 ) ::interval) as current_to_baseline_factor
from query_store.pgms_wait_sampling_view where query_id != 0
group by event_type || ':' || event
) ,
wait_event_data as
( select *
, round(( case when baseline_count = 0 then 0 else (100*((current_to_baseline_factor*current_count) - baseline_count) / baseline_count) end )::numeric,2) as percent_change
from data_set
)
select * from wait_event_data order by percent_change desc;
$$
language 'sql';

select * from compare_baseline_to_current_by_wait_event (7, 'days', 2, 'hours');

The above query lets you see abnormal changes between the two periods. Note that the event count here is an approximation, and the numbers should be interpreted in the context of the comparative load on the instance during each period.


As you can see, with the time series data available in Query Store, your creativity is the only limit to the kinds of analysis and algorithms you can implement. We showed some simple calculations that apply straightforward techniques to identify candidates for improvement. We hope this is a starting point, and that you share with us what works, what doesn't, and how you take this to the next level.

Wednesday, 25 April 2018

Spring Data Azure Cosmos DB: NoSQL data access on Azure

We are pleased to announce that Spring Data Azure Cosmos DB is now available to provide essential Spring Data support for Azure Cosmos DB using SQL API. Azure Cosmos DB is Microsoft’s globally distributed, multi-model database service with exceptional scalability and performance.

With Spring Data Azure Cosmos DB, Java developers can now get started quickly building NoSQL data access for their apps on Azure. It offers a Spring-based programming model for data access while keeping the special traits of the underlying Azure Cosmos DB data store. Features of Spring Data Azure Cosmos DB include a POJO-centric model for interacting with an Azure Cosmos DB collection and an extensible repository-style data access layer.

Getting started


Download the Spring Data Azure Cosmos DB Sample Project to get started. The sample illustrates the process of using annotations to interact with a collection, customizing a query operation with specific fields, and exposing a discoverable REST API for clients.


Create a new database instance

To get started, first create a new database instance using the Azure portal. You can find Azure Cosmos DB under Databases; choose SQL (Document DB) for the API. When your database has been created, you can find the URI and keys on the overview page. These values will be used to configure your Spring Boot application.


Configure your project

You can create a simple Spring Boot application using Spring Initializr and locate the pom.xml file in the directory of your app. In the pom.xml file, add spring-data-cosmosdb to the list of dependencies. spring-data-cosmosdb is published in the Maven Central Repository.

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>spring-data-cosmosdb</artifactId>
    <version>2.0.3</version>
</dependency>

Features of Spring Data Azure Cosmos DB

Using Spring Data Azure Cosmos DB, you can get started quickly building NoSQL data access for your apps on Azure.

Use Annotation to interact with Collection

@Id annotation: Annotate a field in the domain class with @Id, and this field will be mapped to the document id in Azure Cosmos DB.

@Document annotation: By default, the collection name will be the name of the domain class. To customize it, add the annotation @Document(collection="yourCollectionName").

@Document(collection = "mycollection")
public class User {
    @Id
    private String id;
    private String email;
    private String name;
    private Address address;
    private List<Role> roleList;
   ...
}

Customize query operation


A customized query is useful for building constraining queries over entities of the repository. You can extend the basic DocumentDbRepository for different business logic.

public interface UserRepository extends DocumentDbRepository<User, String> {

    List<User> findByName(String firstName);
    List<User> findByEmailAndAddress(String email, Address address);
    ...
}

Expose a discoverable REST API


@RepositoryRestResource annotation: exposes a discoverable REST API for your domain model.

@Repository
@RepositoryRestResource(collectionResourceRel = "user", path = "user")
public interface UserRepository extends DocumentDbRepository<User, String> {

    List<User> findByName(String firstName);
    ...

}

Thursday, 8 March 2018

Sync SQL data in large scale using Azure SQL Data Sync

Azure SQL Data Sync allows users to synchronize data between Azure SQL databases and SQL Server databases, either one-directionally or bi-directionally. This feature was first introduced in 2012. At that time, people didn't host many large databases in Azure, so some size limitations were applied when we built the data sync service, including up to 30 databases (five on-premises SQL Server databases) in a single sync group, and up to 500 tables in any database in a sync group.

Today, there are more than two million Azure SQL databases and the maximum database size is 4 TB, but those data sync limitations are still in place. This is mainly because syncing data is a size-of-data operation; without an architectural change, we can't ensure the service can sustain the heavy load of syncing at large scale. We are working on improvements in this area, and some of these limitations will be raised or removed in the future. In this article, we show how to use data sync between a large number of databases and tables, including some best practices and how to temporarily work around the database and table limitations.

Sync data between many databases


Large companies and ISVs use data sync to distribute data from a central master database to many client databases. Some customers have hundreds or even thousands of client databases in the whole topology. Users may hit one of the following issues when trying to sync between many databases:

1. Hit the 30 databases per sync group limitation.

2. Hit the five on-premises SQL Server databases per sync group limitation.

3. Since all member databases sync with the hub database, there's a significant performance impact on workloads running in the hub database.

To work around the limitation of 30 databases (or five on-premises databases) per sync group, we suggest using a multi-level sync architecture. You can create a sync group to sync your master database with several member databases, and those member databases can become the hub databases of sub sync groups that sync data to other client databases. Depending on your business and cost requirements, you can use the databases in the middle layers as client databases or as dedicated forwarders.


This multi-level sync architecture has benefits even if you don't hit the 30 databases per sync group limitation:

◈ You can group clients based on certain attributes (location, brand…) and use different sync schema and sync frequency.
◈ You can easily add more clients when your business is growing.
◈ The forwarders (member databases in the middle layers) can share the sync overhead from the master database.

To make this multi-level sync topology work in your system, you need a good balance between how many client databases are in a single sync group and how many levels are in the overall system. The more databases in a single sync group, the higher the impact on the overall performance of the hub database. The more levels in your system, the longer it takes for data changes to be broadcast to all clients.

When you add more member databases to the system, closely monitor resource usage in the hub databases. If you see consistently high resource usage, consider upgrading the database to a higher service level objective (SLO). Since the hub database is an Azure SQL database, you can upgrade it easily without downtime.
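
Since the hub is an Azure SQL database, one way to keep an eye on it is the built-in resource statistics DMV; a minimal sketch (run in the hub database itself) is shown below.

-- Recent resource usage of the hub database (roughly one row per 15 seconds for the last hour).
SELECT end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent,
       avg_memory_usage_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;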

Sync data between databases with many tables


Currently, data sync can only sync between databases with fewer than 500 tables. You can work around this limitation by creating multiple sync groups using different database users. For example, say you want to sync two databases with 900 tables. First, define two different users in the database you load the sync schema from, each of which can only see 450 (or any number fewer than 500) tables. Sync setup requires the ALTER DATABASE permission, which implies CONTROL permission over all tables, so you will need to explicitly DENY permissions on the tables you don't want a specific user to see, instead of using GRANT. You can find the exact privileges needed for sync initialization in the best practice guidance. Then create two sync groups, one for each user; each sync group will sync 450 tables between the two databases. Since each user can only see fewer than 500 tables, you will be able to load the schema and create the sync groups. After the sync groups are created and initialized, we recommend you follow the best practice guidance to update the user permissions and make sure they have the minimum privileges for ongoing sync.
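
A minimal sketch of this two-user setup is shown below; the user names, passwords, and table names are placeholders, and in practice the DENY statements would cover every table you want hidden from each user (see the best practice guidance for the exact privileges sync initialization needs).

-- Hypothetical example: two setup users that each "see" only part of the tables.
CREATE USER SyncSetupUser1 WITH PASSWORD = '<strong password 1>';
CREATE USER SyncSetupUser2 WITH PASSWORD = '<strong password 2>';

-- Grant the database-level permissions needed for sync setup.
GRANT ALTER, SELECT TO SyncSetupUser1;
GRANT ALTER, SELECT TO SyncSetupUser2;

-- Hide the tables each user should NOT include in its sync group.
DENY SELECT, VIEW DEFINITION ON OBJECT::dbo.Table451 TO SyncSetupUser1;   -- repeat for tables 451-900
DENY SELECT, VIEW DEFINITION ON OBJECT::dbo.Table001 TO SyncSetupUser2;   -- repeat for tables 1-450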


Optimize the sync initialization


After the sync group is created, the first time you trigger the sync, it will create all tracking tables and stored procedures and load all data from source to target database. The initial data loading is a size-of-data operation. Initializing sync between large databases could take hours or even days if it is not set up properly. Here are some tips to optimize the initialization performance:

1. Data sync initializes the target tables using bulk insert if the target tables are empty. If you have data on both sides, even if the data in the source and target databases is identical (data sync won't know that!), data sync will do a row-by-row comparison and insertion, which can be extremely slow for large tables. To get the best initialization performance, we recommend you consolidate data in one of your databases and keep the others empty before setting up data sync (a quick check is sketched after this list).

2. Currently, the data sync local agent is a 32-bit application and can only use up to 4 GB of RAM. When you try to initialize large databases, especially multiple sync groups at the same time, it may run out of memory. If you encounter this issue, we recommend you add part of the tables to the sync group first, initialize with those tables, and then add more tables. Repeat this until all tables are added to the sync group.

3. During initialization, the local agent loads data from the database and stores it as temp files in your system temp folder. If you're initializing a sync group between large databases, make sure your temp folder has enough space before you start the sync. You can change your temp folder to another drive by setting the TEMP and TMP environment variables; you will need to restart the sync service after you update them. You can also add and initialize tables in the sync group in batches. Make sure the temp folder is cleaned up between batches.

4. If you're initializing data from on-premises SQL Server to Azure SQL Database, you can upgrade your Azure database temporarily before the initialization and downgrade it to the original SLO once the initialization is done; the extra cost will be minimal. If your target database is SQL Server running in a VM, adding more resources to the VM will do the same.
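
Referring back to the first tip, a quick hedged check that the target database really is empty before you start initialization might look like the following sketch over standard catalog views:

-- Row counts per user table in the target database (heap and clustered index partitions only).
SELECT t.name AS table_name, SUM(p.rows) AS row_count
FROM sys.tables AS t
JOIN sys.partitions AS p
    ON p.object_id = t.object_id AND p.index_id IN (0, 1)
GROUP BY t.name
ORDER BY row_count DESC;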

Experiment of sync initialization performance


The following is the result of a simple experiment. I created a sync group to sync data from a SQL Server database in an Azure VM to an Azure SQL database. The VM and SQL database are in the same Azure region, so the impact of network latency can be ignored. The sync group synced one table with 11 columns and about 2.1M rows; the total data size is 49.1 GB. I did three runs with different source and target database configurations:

In the first run, the target database was S2 (50 DTU) and the source database was running on a D4S_V3 VM (4 vCPU, 16 GB RAM). It took 50 minutes to extract the data to the temp folder and 471 minutes to load the data from the temp folder to the target database.

I upgraded the target database to S6 (400 DTU) and the Azure VM to D8S_V3 (8 vCPU, 32 GB RAM) for the second run. This reduced the loading time to 98 minutes! The data extraction surprisingly took longer in this run; I can't explain the regression since I didn't capture local resource usage during the run, but it might have been a disk I/O issue. Even so, upgrading the target database to S6 reduced the total initialization time from 521 minutes to 267 minutes.

In the third run, I upgraded the target database to S12 (3000 DTU) and used the local SSD as the temp folder. This reduced the data extraction time to 39 minutes, the data loading time to 56 minutes, and the total initialization time to 95 minutes. That is 5.5 times faster than the first configuration, for the extra cost of a cup of coffee!

Friday, 16 February 2018

Announcing Virtual Network integration for Azure Storage and Azure SQL

We are glad to announce the public preview of Virtual Network (VNet) Service Endpoints for Azure Storage and Azure SQL.

For many of our customers moving their business-critical data to the cloud, data breaches remain a top concern. Various Azure services that store or process the business data have Internet-reachable IP addresses. Leaked credentials or malicious insiders with administrative privileges gaining access to the data, from anywhere in the world, is an increasing concern to our customers.

To protect against these threats, private connectivity to Azure services is becoming essential to moving more critical workloads to the cloud. Most customers want to limit access to their critical resources to only their private environments, i.e. their Azure Virtual Networks and on-premises.

While some of the Azure services can be directly deployed into VNets, many others still remain public. With VNet service endpoints, we are expanding Virtual Network support to more multi-tenant Azure services.

Service endpoints extend your VNet private address space and identity to the Azure services, over a direct connection. This allows you to secure your critical service resources to only your virtual networks, providing private connectivity to these resources and fully removing Internet access.

Configuring service endpoints is very simple: a single click on a subnet in your VNet. A direct route to the services is auto-configured for you. There are no NAT or gateway devices required to set up the endpoints, and you no longer need reserved public IP addresses in your VNets to secure Azure resources through the IP firewall. Service endpoints make it easy to configure and maintain network security for your critical resources.

Step 1: Set up service endpoints once on your virtual network. Network administrators can turn this setting on independently, allowing for separation of duties.


Step 2: Secure your new or existing Azure service resources to the VNet with a simple click. Set this up once for the storage account or SQL server and it automatically applies to any access to child resources. Data administrators can set it up independently (optional).


Service endpoints are available in preview for the following services and regions:

Azure Storage: WestUS, EastUS, WestCentralUS, WestUS2, AustraliaEast, and AustraliaSouthEast

Azure SQL: EastUS, WestCentralUS, WestUS2

We will be expanding the feature to more regions soon.

We are very excited to bring enhanced network security to your Azure service resources. This is only the beginning of our roadmap for tightening security for Azure services; we will expand service endpoints to more Azure services. In addition to service endpoints, we are also committed to giving you private connectivity to your Azure resources from your firewalls and on-premises. Service tags are another investment in this direction, allowing your Network Security Groups (NSGs) to selectively open access only to Azure services from your VNets. Service tags are also available in preview now.