Thursday 4 July 2019

Highlights from SIGMOD 2019: New advances in database innovation

The emergence of the cloud and the edge as the new frontiers for computing is an exciting direction: data is now dispersed within and beyond the enterprise, on-premises, in the cloud, and at the edge. We must enable intelligent analysis, transactions, and responsible governance for data everywhere, from creation through to deletion, across the entire lifecycle of ingestion, updates, exploration, data prep, analysis, serving, and archival.


Our commitment to innovation is reflected in our unique collaborative approach to product development. Product teams work in synergy with research and advanced development groups, including Cloud Information Services Lab, Gray Systems Lab, and Microsoft Research, to push boundaries, explore novel concepts and challenge hypotheses.

The Azure Data team continues to lead the way in on-premises and cloud-based database management. SQL Server has been identified as the top DBMS by Gartner for four consecutive years. Our aim is to re-think and redefine data management by developing optimal ways to capture, store and analyze data.

I’m especially excited that this year we have three teams presenting their work: “Socrates: The New SQL Server in the Cloud,” “Automatically Indexing Millions of Databases in Microsoft Azure SQL Database,” and the Gray Systems Lab research team’s “Event Trend Aggregation Under Rich Event Matching Semantics.”

The Socrates paper describes the foundations of Azure SQL Database Hyperscale, a revolutionary new cloud-native solution purpose-built to address common cloud scalability limits. It lets existing applications scale elastically, without fixed limits and without being rearchitected, with storage of up to 100 TB.

Its highly scalable storage architecture lets a database expand on demand, which eliminates the need to pre-provision storage resources and gives flexibility to optimize performance for each workload. The downtime needed to restore a database or to scale up or down is no longer tied to the volume of data in the database, and point-in-time restores are very fast, typically taking minutes rather than hours or even days. For read-intensive workloads, Hyperscale provides rapid scale-out by provisioning additional read replicas instantaneously, with no data copy needed.

Azure SQL Database also introduced a new serverless compute option: Azure SQL Database serverless. Serverless allows compute and memory to scale independently and on demand, based on workload requirements. Compute is automatically paused and resumed, which removes the need to manage capacity and reduces cost. This makes it an efficient option for applications with unpredictable or intermittent compute requirements.
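To make the pause-and-resume behavior concrete, here is a minimal sketch of how much time compute would stay resumed for an intermittent workload. It assumes a simple policy that this post does not spell out: compute resumes on activity and pauses once it has been idle for a fixed delay. The function name `active_seconds` and the parameter `pause_delay` are hypothetical and are not part of any Azure API.

```python
def active_seconds(activity_times, pause_delay):
    """Total seconds compute stays resumed (hypothetical model, not an Azure API).

    activity_times: sorted timestamps (in seconds) at which the database is used.
    pause_delay: assumed idle time, in seconds, after which compute pauses.
    """
    total = 0
    window_start = None
    window_end = None
    for t in activity_times:
        if window_end is None or t > window_end:
            # Compute was paused: account for the previous active window,
            # then resume at the time of this activity.
            if window_end is not None:
                total += window_end - window_start
            window_start = t
        # Any activity pushes the pause deadline out by pause_delay.
        window_end = t + pause_delay
    if window_end is not None:
        total += window_end - window_start
    return total


# Two bursts of activity, one hour of assumed idle delay before pausing.
resumed = active_seconds([0, 100, 5000], pause_delay=3600)
print(resumed)
```

Under this assumed model, a workload that is active only in short bursts keeps compute resumed for the bursts plus the idle delay, rather than around the clock.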

Index management is a challenging task even for expert human administrators. Automating the process fully and efficiently is critically important to businesses, as discussed in the Data team’s presentation on the auto-indexing feature in Azure SQL Database.

This automation, coupled with identifying how to achieve optimal query performance for complex real-world applications, underpins the auto-indexing feature.

The auto-indexing feature is generally available and generates index recommendations for every database in Azure SQL Database. If the customer chooses, it can automatically implement index changes on their behalf and validate these index changes to ensure that performance improves. This feature has already significantly improved the performance of hundreds of thousands of databases.
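The validation step can be pictured as a before-and-after comparison. The sketch below is only an illustration of that idea, not the service's actual logic: the function name `validate_index_change`, the use of mean query latency as the metric, and the 5% improvement threshold are all assumptions made for the example.

```python
from statistics import mean


def validate_index_change(before_ms, after_ms, min_improvement=0.05):
    """Decide whether to keep an index change (hypothetical validation rule).

    before_ms / after_ms: query latencies in milliseconds sampled before and
    after the index change. min_improvement is an assumed threshold: the
    fraction by which mean latency must drop for the change to be kept.
    """
    baseline = mean(before_ms)
    observed = mean(after_ms)
    improvement = (baseline - observed) / baseline
    return "keep" if improvement >= min_improvement else "revert"


# Latency dropped clearly after the index was created: keep it.
print(validate_index_change([100, 120, 110], [60, 70, 65]))
# Latency got worse after the change: undo it.
print(validate_index_change([100, 100], [130, 125]))
```

A production system would need a more careful comparison (for example, accounting for workload changes between the two measurement windows), but the keep-or-revert decision has this general shape.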

Discover the benefits of the auto-tuning feature in Azure SQL Database.

In the world of streaming systems, the key challenges are supporting rich event matching semantics (e.g. Kleene patterns to capture event sequences of arbitrary lengths), and scalability (i.e. controlling memory pressure and latency at very high event throughputs).

The advanced research team focused on supporting this class of queries at very high scale and compiled their findings in “Event Trend Aggregation Under Rich Event Matching Semantics.” The key intuition is to incrementally maintain the coarsest-grained aggregates that can support the semantics of a given query, which keeps memory pressure under control and delivers very good latency at scale. By carefully implementing this insight, the team built a research prototype that achieves up to six orders of magnitude speed-up and up to seven orders of magnitude memory reduction compared to state-of-the-art approaches.
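The difference between maintaining an aggregate incrementally and constructing every matching trend can be seen in a deliberately small sketch. This is not the algorithm from the paper. It assumes the simplest possible case: a Kleene pattern A+ where any non-empty subsequence of A events counts as a trend, and a query that asks only for the number of trends. The function names are hypothetical.

```python
from itertools import combinations


def count_trends_incremental(events, event_type="A"):
    """Count trends matching A+ with a single running counter."""
    count = 0
    for e in events:
        if e == event_type:
            # Every existing trend is kept as is and also extended by the
            # new event (2 * count); the new event alone starts a trend (+1).
            count = 2 * count + 1
    return count


def count_trends_bruteforce(events, event_type="A"):
    """Enumerate every trend explicitly; the work grows exponentially."""
    matching = [i for i, e in enumerate(events) if e == event_type]
    return sum(
        1
        for k in range(1, len(matching) + 1)
        for _ in combinations(matching, k)
    )


stream = ["A", "B", "A", "A", "C", "A"]
print(count_trends_incremental(stream))  # one integer of state
print(count_trends_bruteforce(stream))   # same answer, all trends enumerated
```

Both functions return the same count, but the incremental version keeps one integer of state per pattern while the brute-force version visits every trend. Richer semantics (for example, predicates between adjacent events) would require finer-grained aggregates than this single counter.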

Microsoft has the unique advantage of a world-class data management system in SQL Server and a leading public cloud in Azure. This is especially exciting at a time when cloud-native architectures are revolutionizing database management.

There has never been a better time to be part of database systems innovation at Microsoft, and we invite you to explore the opportunities to be part of our team.
