Wednesday, 6 December 2017

ADL Tools for Visual Studio Code (VSCode) supports Python & R Programming

We are thrilled to introduce support for the Azure Data Lake (ADL) Python and R extensions within Visual Studio Code (VSCode). You can now add Python or R scripts as custom code extensions in U-SQL scripts and submit those scripts directly to ADL with one click. For data scientists who value the productivity of Python and R, ADL Tools for VSCode offers a fast and powerful code editing solution. VSCode makes it simple to get started and integrates easily with U-SQL for data extraction, data processing, and data output.

With ADL Tools for VSCode, you can choose your preferred language and use already familiar techniques to build your custom code. For example, developers using Python can now use REFERENCE ASSEMBLY to bring in the needed Python libraries and leverage built-in reducers to run Python code on each job execution vertex. You can also embed your Python code, which accepts a pandas DataFrame as input and returns a pandas DataFrame as output, into your U-SQL script. Data scientists using R can perform massively parallel execution of R code for data science scenarios such as merging various data files, parallel feature engineering, partitioned data model building, and so on. To facilitate code clarity and reuse, the tools also let you write code-behind files in different languages for a single U-SQL file.
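To make the DataFrame-in, DataFrame-out contract concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not stated in this post: the entry-point name `usqlml_main` (check the extension documentation for the exact name it expects), the column names `author` and `tweet`, and the helper `simulate_reduce`, which is a hypothetical local stand-in for a reducer calling your code once per key group. It only illustrates the shape of the code you would embed and does not run anything on ADL.

```python
import pandas as pd


def extract_mentions(tweet: str) -> str:
    """Return the @-mentions in a tweet as a semicolon-separated string."""
    return ";".join(
        word[1:] for word in tweet.split()
        if word.startswith("@") and len(word) > 1
    )


def usqlml_main(df: pd.DataFrame) -> pd.DataFrame:
    """Assumed entry point: takes input rows as a DataFrame, returns output rows."""
    return pd.DataFrame({
        "author": df["author"],
        "mentions": df["tweet"].apply(extract_mentions),
    })


def simulate_reduce(df: pd.DataFrame, key: str) -> pd.DataFrame:
    """Hypothetical local stand-in for a reducer: one call per key group."""
    parts = [usqlml_main(group) for _, group in df.groupby(key, sort=True)]
    return pd.concat(parts, ignore_index=True)


# Illustrative sample rows (made up for this sketch).
rows = pd.DataFrame({
    "author": ["ann", "bob", "ann"],
    "tweet": ["hi @bob and @cy", "no mentions here", "@dee thanks"],
})
result = simulate_reduce(rows, "author")
print(result)
```

Because the function depends only on pandas, you can exercise it on a small sample like this on your own machine before embedding it in a U-SQL script and submitting the job.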

Key customer benefits


◉ Local editor authoring and execution experience for Python Code-Behind to support distributed analytics.
◉ Local editor authoring and execution experience for R Code-Behind to support distributed analytics.
◉ Flexible mechanism to allow you to write single or multiple Python, R, and C# Code-Behind as part of a single U-SQL file.
◉ Dynamic Code-Behind to embed Python and R script into your U-SQL script.
◉ Integration with Azure Data Lake for Python and R with easy U-SQL job submissions.

How to develop U-SQL with Python and R


◉ Right-click the U-SQL script file, select ADL: Generate Python Code Behind File, and a xxx.usql.py file is generated in your working folder. Then write your Python code.

◉ Right-click the U-SQL script file, select ADL: Generate R Code Behind File, and a xxx.usql.r file is generated in your working folder. Then write your R code. 

How to install or update


First, install Visual Studio Code and Mono 4.2.x (Mono is needed on Linux and Mac). Then get the latest Azure Data Lake Tools by going to the VSCode Extension repository or the VSCode Marketplace and searching for “Azure Data Lake Tools”.

Second, complete the one-time setup to register the Python and R extension assemblies for your ADL account.
