
Tuesday, 10 January 2023

Microsoft named a Leader in 2022 Gartner® Magic Quadrant™ for Insight Engines


As the amount of data being generated continues to grow at an exponential rate, it's becoming increasingly important for organizations to have a rich set of tools that can help them make sense of it all. That's where insight engines come in. These powerful solutions apply relevancy methods to data of all types, from structured to highly unstructured, allowing users to describe, discover, organize, and analyze it to deliver information proactively or interactively at the right time, in the right context.

Microsoft has recently been named a Leader in the 2022 Gartner Magic Quadrant for Insight Engines, a report that evaluates the capabilities of various vendors in the market.


Microsoft offers two integrated solutions in this space: Microsoft Search, which is available with Microsoft 365, and Azure Cognitive Search, which is available as a platform-as-a-service (PaaS) with Microsoft Azure. These solutions are designed to help professionals and developers build impactful AI-powered search solutions that can solve complex problems and enhance the customer experience by enabling information discovery across the spectrum from unstructured to structured data. Whether you need a turnkey solution to reason over enterprise data or the flexibility to tailor search to specific scenarios, Microsoft has you covered.

Azure Cognitive Search can be used in a variety of industries to improve efficiency and decision-making. Some specific examples of how it can be used include:

◉ Manufacturing: Cognitive Search can be used to help manufacturers quickly find information about production processes, equipment, and materials. It can be applied to structured data scenarios such as part catalogs as well as unstructured content such as equipment manuals, safety procedures, and imagery. 
◉ Energy: Cognitive Search can be used to quickly find information related to exploration, drilling, and production. Geo-location search combined with traditional search input enables discovery experiences that get the most out of past and present geological site studies, and extensibility allows incorporating energy industry-specific information.
◉ Retail: Cognitive Search can be used to develop a powerful product catalog search experience for retail web sites and apps. Customizable ranking options, scale capability to handle peak traffic with low latency, and the ability for near-real time updates for critical data such as inventory make it a great fit for the scenario. 
◉ Financial services: Cognitive Search can be used by financial institutions to quickly find data related to investments, market trends, and regulatory compliance. Its sophisticated semantic ranking and question-answering capabilities can enable users to answer business questions faster and more confidently.
◉ Healthcare: Cognitive Search can be used by healthcare organizations to improve patient care, streamline operations, and make better informed decisions by quickly finding and accessing relevant information within electronic medical record systems, providing real-time access to clinical guidelines and evidence-based best practices.

Nearly every user knows what to do when they see a search box. All SaaS applications targeting audiences from consumer to enterprise can benefit greatly from a strong search experience over their own data. Azure Cognitive Search can deliver an out-of-the-box solution, inclusive of various multi-tenancy strategies, support for over 50 languages, and a global presence to ensure your solution is delivered in the right location for your customers.
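
For developers evaluating the service, getting a first query running takes only a few lines. The following is a minimal sketch using the azure-search-documents Python SDK; the endpoint, index name, key, and field names are placeholders for illustration, not a specific reference implementation.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Placeholders: replace with your own service endpoint, index, and query key.
endpoint = "https://<your-search-service>.search.windows.net"
index_name = "products"  # hypothetical index
credential = AzureKeyCredential("<query-key>")

client = SearchClient(endpoint=endpoint, index_name=index_name, credential=credential)

# Full-text query with a filter; 'category' and 'name' are assumed index fields.
results = client.search(search_text="cordless drill", filter="category eq 'tools'", top=5)
for doc in results:
    print(doc["name"], doc.get("@search.score"))
```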

If you're a technical decision maker in one of these industries, or any other industry, and you're interested in learning more about how Microsoft's cognitive search solutions can help you unlock the full potential of your data, you can visit the Azure Cognitive Search website and the Microsoft Search website.

You can also download a complimentary copy of the Gartner Magic Quadrant for Insight Engines to see how Microsoft is recognized in the space.

Source: microsoft.com

Tuesday, 27 December 2022

Microsoft Azure CLX: A personalized program to learn Azure


The rise of cloud computing has created demand for proven cloud experts. That’s why we’ve launched the Microsoft Azure Connected Learning Experience (CLX) program, designed to help aspiring learners and IT professionals become Microsoft Azure cloud pros. CLX is a personalized and self-paced journey that culminates in a certificate of completion—allowing you to maximize learning while minimizing time invested.

What is the CLX program?


The CLX program prepares you for the Microsoft Azure certification exams while optimizing your learning experience and minimizing time invested. Curated to meet every learner’s unique needs, the program consists of four steps:

◉ A knowledge assessment
◉ A Microsoft Learn study materials review
◉ A practice test
◉ A cram session

At the start of the program, you’ll take a knowledge assessment to test your skills and create a personalized learning path. You’ll then take only the Microsoft Learn courses that are useful to you—saving you time and ensuring that you learn the skills you need to accelerate your career.


What courses will I take?

The courses you take are up to you. The self-paced program is tailored to your skill set, and you can choose from six tracks: Microsoft Azure Fundamentals, Microsoft Azure AI Fundamentals, Microsoft Azure Data Fundamentals, Microsoft Azure Administrator, Administering Windows Server Hybrid Core Infrastructure, and Windows Server Hybrid Advanced Series—with more on the way. Learn more about these tracks below.

◉ Microsoft Azure Fundamentals (learner personas: Administrators, Business Users, Developers, Students, Technology Managers): This course strengthens your knowledge of cloud concepts and Azure services, workloads, security, privacy, pricing, and support. It’s designed for learners with an understanding of general technology concepts, such as networking, computing, and storage.
◉ Microsoft Azure AI Fundamentals (learner personas: AI Engineers, Developers, Data Scientists): This course, designed for both technical and non-technical professionals, bolsters your understanding of typical machine learning and artificial intelligence workloads and how to implement them on Azure.
◉ Microsoft Azure Data Fundamentals (learner personas: Database Administrators, Data Analysts, Data Engineers, Developers): The Data Fundamentals course instructs you on Azure core data concepts, Azure SQL, Azure Cosmos DB, and modern data warehouse analytics. It’s designed for learners with a basic knowledge of core data concepts and how they’re implemented in Azure.
◉ Microsoft Azure Administrator (learner personas: Azure Cloud Administrators, VDI Administrators, IT Operations Analysts): In Azure Administrator, you’ll learn to implement cloud infrastructure, develop applications, and perform networking, security, and database tasks. It’s designed for learners with a robust understanding of operating systems, networking, servers, and virtualization.
◉ Administering Windows Server Hybrid Core Infrastructure (learner personas: Systems Administrators, Infrastructure Deployment Engineers, Senior System Administrators, Senior Site Reliability Engineers): In this course, you’ll learn to configure on-premises Windows Server, hybrid, and Infrastructure as a Service (IaaS) platform workloads. It’s geared toward those with the knowledge to configure, maintain, and deploy on-premises Windows Server, hybrid, and IaaS platform workloads.
◉ Windows Server Hybrid Advanced Series (learner personas: System Administrators, Infrastructure Deployment Engineers, Associate Database Administrators): This advanced series, which is designed for those with deep administration and deployment knowledge, strengthens your ability to configure and manage Windows Server on-premises, hybrid, and IaaS platform workloads.

How do I get certified?


After you finish your personalized curriculum, you’ll complete a two-hour practice test that mimics the final certification exam. Next, you’ll attend a virtual, instructor-led cram session that dives deeply into the Microsoft Azure Certification Exam content. The four-hour session covers the entire course syllabus to ensure you’re well-prepared to pass with ease.

Once you’ve sharpened your understanding of the Azure platform and its solutions, you’ll receive your certificate of completion. You’ll also walk away with the skills to confidently pass the Microsoft Azure Certification Exams—and the proven expertise to advance your career and exceed your cloud computing goals today and in the future.

Source: microsoft.com

Saturday, 24 December 2022

Improve speech-to-text accuracy with Azure Custom Speech


With Microsoft Azure Cognitive Services for Speech, customers can build voice-enabled apps confidently and quickly in more than 140 languages. We make it easy for customers to transcribe speech to text (STT) with high accuracy, produce natural-sounding text-to-speech (TTS) voices, and translate spoken audio. In the past few years, we have been inspired by the ways customers use our customization features to fine-tune speech recognition to their use cases.

As our speech technology continues to change and evolve, we want to introduce four custom speech-to-text capabilities and their respective customer use cases. With these features, you can evaluate and improve the speech-to-text accuracy for your applications and products. A custom speech model is trained on top of a base model. With a custom model, you can improve recognition of domain-specific vocabulary by providing text data to train the model. You can also improve recognition based on the specific audio conditions of the application by providing audio data with reference transcriptions.

Custom Speech data types and use cases


Our Custom Speech features will let you customize Microsoft's speech-to-text engine. You will be able to customize the language model by tailoring it to the vocabulary of the application and customize the acoustic model to adapt to the speaking style of your users. By uploading text and/or audio data through Custom Speech, you'll be able to create these custom models, combine them with Microsoft's state-of-the-art speech models, and deploy them to a custom speech-to-text endpoint that can be accessed from any device.

Phrase list: A real-time accuracy enhancement feature that does not require model training. For example, in a meeting or podcast scenario, you can add a list of participant names, products, and uncommon jargon using a phrase list to boost their recognition.
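
As a rough illustration, here is a minimal sketch of supplying a phrase list with the Speech SDK for Python (azure-cognitiveservices-speech). The key, region, audio file, and phrases are placeholders.

```python
import azure.cognitiveservices.speech as speechsdk

# Placeholders: use your own key, region, and audio file.
speech_config = speechsdk.SpeechConfig(subscription="<key>", region="<region>")
audio_config = speechsdk.audio.AudioConfig(filename="meeting.wav")
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

# Boost recognition of names and jargon for this session; no model training needed.
phrase_list = speechsdk.PhraseListGrammar.from_recognizer(recognizer)
for phrase in ["Contoso", "Jessie Irwin", "DCD996"]:
    phrase_list.addPhrase(phrase)

result = recognizer.recognize_once()
print(result.text)
```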

Plain text: Our simplest custom speech model can be made using just text data. Customers in the media industry use this in use cases such as commentary of sports events. Because each sporting event’s vocabulary differs significantly from others, building a custom model specific to a sport increases accuracy by biasing to the vocabulary of the event.

Structured text: This is text data that boosts patterns of sentences in speech. These patterns could be utterances that differ only by individual words or phrases, for example, “May I speak with name” where name is a list of possible names of individuals. The pattern can link to this list of entities (name in this case), and you can also provide their unique pronunciations.

Audio: You can train a custom speech model using audio data, with or without human-labeled transcripts. With human-labeled transcripts, you can improve recognition accuracy on speaking styles, accents, or specific background noises. For American English, you can now train without needing a labeled transcript to improve acoustic aspects such as slight accents, speaking styles, and background noises.

Research milestones


Microsoft’s speech and dialog research group achieved a milestone in reaching human parity in 2016 on the Switchboard conversational speech recognition task, meaning we had created technology that recognized words in a conversation as well as professional human transcribers. After further experimentation, we then followed up with a 5.1 percent word error rate, exceeding human parity in 2017. A published technical report outlines the details of our system. Today, Custom Speech helps enterprises and developers improve upon the milestones achieved by Microsoft Research.

Customer inspiration


Peloton: In the past, Peloton provided subtitles only for its on-demand classes. But that meant that the signature live experience so valued by members was not accessible to those who are deaf or hard of hearing. While the decision to introduce live subtitles was clear, executing on that vision proved a bit murkier. A primary challenge was determining how automated speech recognition software could accommodate Peloton’s specific vocabulary, including the numerical phrases used for class countdowns and to set resistance and cadence levels. Latency was another issue—subtitles wouldn’t be very useful, after all, if they lagged behind what instructors were saying. Peloton chose Azure Cognitive Services because it was cost-effective and allowed Peloton to customize its own machine learning model for converting speech to text—and was significantly faster than other solutions on the market. Microsoft also provided a team of engineers that worked alongside Peloton throughout the development process.

Speech Services and Responsible AI


We are excited about the future of Azure Speech with human-like, diverse, and delightful quality under the high-level architecture of the XYZ-code AI framework. Our technology advancements are also guided by Microsoft’s Responsible AI process, and our principles of fairness, inclusiveness, reliability and safety, transparency, privacy and security, and accountability. We put these ethical standards into practice through the Office of Responsible AI (ORA)—which sets our rules and governance processes, the AI Ethics and Effects in Engineering and Research (Aether) Committee—which advises our leadership on the challenges and opportunities presented by AI innovations, and Responsible AI Strategy in Engineering (RAISE)—a team that enables the implementation of Microsoft Responsible AI rules across engineering groups.

Source: microsoft.com

Sunday, 13 December 2020

Harness analytical and predictive power with Azure Synapse Analytics

Since its preview announcement, we’ve witnessed incredible excitement and adoption of Azure Synapse from our customers and partners. We want to sincerely thank everyone who provided feedback and who is now helping us bring the power of limitless analytics to all.

Unified experience

Azure Synapse brings together data integration, enterprise data warehousing, and big data analytics—at cloud scale. The unification of these workloads enables organizations to massively reduce their end-to-end development time and accelerate time to insight. It now also provides both no-code and code-first experiences for critical tasks such as data ingestion, preparation, and transformation.


With this release, the management and monitoring of your analytics system becomes significantly easier. With one click, teams can secure their entire analytics system and prevent data exfiltration by simply selecting the managed virtual network feature when creating a Synapse workspace. This gives valuable time back to teams hired to discover insights—rather than investing considerable time securing connections between services, building firewalls, or managing subnets.

Unified analytics


Over the past years, we’ve set out to rearchitect and create the next generation of query processing engine and data management to meet the needs of modern, high-scale data workloads. The result is the new, cloud-native, distributed SQL engine that powers Azure Synapse. It can scale from queries on a handful of cores to thousands of nodes—all depending on your workload needs.

Azure Synapse enables a level of productivity and collaboration among data professionals that previously wasn’t possible by deeply integrating Apache Spark and its new SQL engine. It also supports popular languages that developers prefer, including T-SQL, Python, Scala, and Java.

The new flexible service model for query processing allows data teams to use both serverless and dedicated options. Organizations can now choose the most cost-effective option for each use case—enjoying the advantages of a data lake for quick data exploration with pay-per-query pricing and/or a dedicated data warehouse for more predictable and mission-critical workloads.
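
To make the serverless option concrete, here is a minimal sketch that queries Parquet files in a data lake through the Synapse serverless SQL endpoint using pyodbc. The workspace name, credentials, storage path, and driver version are placeholders, and the OPENROWSET pattern assumes the files are accessible to the endpoint.

```python
import pyodbc

# Placeholders: your Synapse workspace's serverless (on-demand) SQL endpoint and credentials.
conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<workspace>-ondemand.sql.azuresynapse.net;"
    "Database=master;UID=<user>;PWD=<password>"
)

# Pay-per-query exploration over files in the data lake; path and format are assumptions.
query = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<storageaccount>.dfs.core.windows.net/<container>/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS rows
"""
for row in conn.cursor().execute(query):
    print(row)
```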

Unified data teams


Azure Synapse is also deeply integrated with Microsoft Power BI and Azure Machine Learning.

With Power BI directly integrated in the Synapse Studio, BI professionals can work in the same service that houses data pipelines, data lakes, and data warehouses—reducing the time it takes to access clean and secure data for dashboards. And for lightning-fast query performance, the new Power BI performance accelerator for Azure Synapse automates the creation and optimization of materialized views with just a few clicks.

For predictive analytics, teams can deploy machine learning models from the Azure Machine Learning model registry directly to Azure Synapse using a simple, guided user experience—no data movement required. The in-engine ML scoring can generate millions of predictions in seconds all while maintaining full data security as data doesn’t leave the platform. And with AutoML, all data teams—even organizations without highly trained data scientists—can automatically apply machine learning models to their data and generate predictive insights.

The code-free and programmatic integration—including CI/CD support with Git integration—enables seamless version control, collaboration, and code management between data engineers, data scientists, and BI professionals, allowing them to be highly productive across a variety of use cases.


Saturday, 4 April 2020

Microsoft powers transformation at NVIDIA’s GTC Digital Conference

The world of supercomputing is evolving. Work once limited to high-performance computing (HPC) on-premises clusters and traditional HPC scenarios is now being performed at the edge, on-premises, in the cloud, and everywhere in between. Whether it’s a manufacturer running advanced simulations, an energy company optimizing drilling through real-time well monitoring, an architecture firm providing professional virtual graphics workstations to employees who need to work remotely, or a financial services company using AI to navigate market risk, Microsoft’s collaboration with NVIDIA makes access to NVIDIA graphics processing unit (GPU) platforms easier than ever.


These modern needs require advanced solutions that were traditionally limited to a few organizations because they were hard to scale and took a long time to deliver. Today, Microsoft Azure delivers HPC capabilities, a comprehensive AI platform, and the Azure Stack family of hybrid and edge offerings that directly address these challenges.

This year during GTC Digital, we’re spotlighting some of the most transformational applications powered by NVIDIA GPU acceleration that highlight our commitment to edge, on-prem, and cloud computing. Registration is free, so sign up to learn how Microsoft is powering transformation.

Visualization and GPU workstations


Azure enables a wide range of visualization workloads, which are critical for desktop virtualization as well as professional graphics such as computer-aided design, content creation, and interactive rendering. Visualization workloads on Azure are powered by NVIDIA’s world-class GPUs and Quadro technology, the world’s preeminent visual computing platform. With access to graphics workstations on Azure cloud, artists, designers, and technical professionals can work remotely, from anywhere, and from any connected device.

Artificial intelligence


We’re sharing the release of the updated execution provider in ONNX Runtime with integration for NVIDIA TensorRT 7. With this update, ONNX Runtime can execute Open Neural Network Exchange (ONNX) models on NVIDIA GPUs in the Azure cloud and at the edge using Azure Stack Edge, taking advantage of new features in TensorRT 7 like dynamic shape, mixed precision optimizations, and INT8 execution.

Dynamic shape support enables users to run variable batch size, which is used by ONNX Runtime to process recurrent neural network (RNN) and Bidirectional Encoder Representations from Transformers (BERT) models. Mixed precision and INT8 execution are used to speed up execution on the GPU, which enables ONNX Runtime to better balance the performance across CPU and GPU. Originally released in March 2019, TensorRT with ONNX Runtime delivers better inferencing performance on the same hardware when compared to generic GPU acceleration.
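
As a rough sketch, selecting the TensorRT execution provider in ONNX Runtime looks like the following. It assumes a TensorRT-enabled onnxruntime-gpu build; the model path, input name, and input shape are placeholders, and the exact provider configuration depends on your onnxruntime version.

```python
import numpy as np
import onnxruntime as ort

# Placeholders: your ONNX model and an input shaped for it.
model_path = "model.onnx"
input_batch = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Prefer TensorRT, fall back to CUDA and then CPU for any unsupported operators.
session = ort.InferenceSession(
    model_path,
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: input_batch})
print(outputs[0].shape)
```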

Additionally, the Azure Machine Learning service now supports RAPIDS, a high-performance GPU execution accelerator for data science framework using the NVIDIA CUDA platform. Azure developers can use RAPIDS in the same way they currently use other machine learning frameworks, and in conjunction with Pandas, Scikit-learn, PyTorch, and TensorFlow. These two developments represent major milestones towards a truly open and interoperable ecosystem for AI. We’re working to ensure these platform additions will simplify and enrich those developer experiences.

Edge


Microsoft provides various solutions in the Intelligent Edge portfolio to empower customers to make sure that machine learning not only happens in the cloud but also at the edge. The solutions include Azure Stack Hub, Azure Stack Edge, and IoT Edge.

Whether you are capturing sensor data and inferencing at the edge, or performing end-to-end processing with model training in Azure and leveraging the trained models at the edge for enhanced inferencing operations, Microsoft can support your needs however and wherever you need.

Supercomputing scale


Time-to-decision is incredibly important with a global economy that is constantly on the move. With the accelerated pace of change, companies are looking for new ways to gather vast amounts of data, train models, and perform real-time inferencing in the cloud and at the edge. The Azure HPC portfolio consists of purpose-built computing, networking, storage, and application services to help you seamlessly connect your data and processing needs with infrastructure options optimized for various workload characteristics.

Azure Stack Hub announced preview


Microsoft, in collaboration with NVIDIA, is announcing that Azure Stack Hub with Azure NC-Series Virtual Machine (VM) support is now in preview. Azure NC-Series VMs are GPU-enabled Azure Virtual Machines available on the edge. GPU support in Azure Stack Hub unlocks a variety of new solution opportunities. With our Azure Stack Hub hardware partners, customers can choose the appropriate GPU for their workloads to enable AI training, inference, and visualization scenarios.

Azure Stack Hub brings together the full capabilities of the cloud to deploy and manage workloads that otherwise would not be possible to bring into a single solution. We are offering two NVIDIA GPU models during the preview period: the NVIDIA V100 Tensor Core GPU and the NVIDIA T4 Tensor Core GPU. These physical GPUs map to the following Azure N-Series VM types:

◉ NCv3 (NVIDIA V100 Tensor Core GPU): These enable learning, inference and visualization scenarios.

◉ TBD (NVIDIA T4 Tensor Core GPU): This new VM size (available only on Azure Stack Hub) enables light learning, inference, and visualization scenarios.

Hewlett Packard Enterprise is supporting the Microsoft GPU preview program as part of its HPE ProLiant for Microsoft Azure Stack Hub solution. “The HPE ProLiant for Microsoft Azure Stack Hub solution with the HPE ProLiant DL380 server nodes are GPU-enabled to support the maximum CPU, RAM, and all-flash storage configurations for GPU workloads,” said Mark Evans, WW product manager, HPE ProLiant for Microsoft Azure Stack Hub, at HPE. “We look forward to this collaboration that will help customers explore new workload options enabled by GPU capabilities.”

As the leading cloud infrastructure provider, Dell Technologies helps organizations remove cloud complexity and extend a consistent operating model across clouds. Working closely with Microsoft, the Dell EMC Integrated System for Azure Stack Hub will support additional GPU configurations, which include NVIDIA V100 Tensor Core GPUs, in a 2U form factor. This will provide customers increased performance density and workload flexibility for the growing predictive analytics and AI/ML markets. These new configurations also come with automated lifecycle management capabilities and exceptional support.


Azure Stack Edge preview


We also announced the expansion of our Microsoft Azure Stack Edge preview with the NVIDIA T4 Tensor Core GPU. Azure Stack Edge is a cloud-managed appliance that provides processing for fast local analysis and insights on your data. With the addition of an NVIDIA GPU, you’re able to build in the cloud and then run at the edge.

Tuesday, 31 March 2020

Extending the power of Azure AI to Microsoft 365 users

What is Azure AI?


Azure AI is a set of AI services built on Microsoft’s breakthrough innovation from decades of world-class research in vision, speech, language processing, and custom machine learning. What is particularly exciting is that Azure AI provides our customers with access to the same proven AI capabilities that power Microsoft 365, Xbox, HoloLens, and Bing. In fact, there are more than 20,000 active paying customers—and more than 85 percent of the Fortune 100 companies have used Azure AI in the last 12 months.

Azure AI helps organizations:

◉ Develop machine learning models that can help with scenarios such as demand forecasting, recommendations, or fraud detection using Azure Machine Learning.

◉ Incorporate vision, speech, and language understanding capabilities into AI applications and bots, with Azure Cognitive Services and Azure Bot Service.

◉ Build knowledge-mining solutions to make better use of untapped information in their content and documents using Azure Search.

Microsoft 365 provides innovative product experiences with Azure AI


The announcement of Microsoft Editor is one example of innovation. Editor, your personal intelligent writing assistant, is available across Word, Outlook.com, and browser extensions for Edge and Chrome. Editor is an AI-powered service available in more than 20 languages that has traditionally helped writers with spell check and grammar recommendations. Powered by AI models built with Azure Machine Learning, Editor can now recommend clear and concise phrasing, suggest more formal language, and provide citation recommendations.


Additionally, Microsoft PowerPoint utilizes Azure AI in multiple ways. PowerPoint Designer uses Azure Machine Learning to recommend design layouts to users based on the content on the slide. In the example image below, Designer made the design recommendation based on the context in the slide. It can also intelligently crop objects and people in images and place them in an optimal layout on a slide. Since its launch, PowerPoint Designer users have kept nearly two billion Designer slides in their presentations.


PowerPoint also uses Azure Cognitive Services such as the Speech service to power live captions and subtitles for presentations in real-time, making it easier for all audience members to follow along. Additionally, PowerPoint also uses Translator Text to provide live translations into over 60 languages to reach an even wider audience. These AI-powered capabilities in PowerPoint are providing new experiences for users, allowing them to connect with diverse audiences they were unable to reach before.

These same innovations can also be found in Microsoft Teams. As we look to stay connected with co-workers, Teams has some helpful capabilities intended to make it easier to collaborate and communicate while working remotely. For example, Teams offers live captioning for meetings, which leverages the Speech API for speech transcription. But it doesn’t stop there. As you saw with PowerPoint, Teams also uses Azure AI for live translations when you set up Live Events. This functionality is particularly useful for company town hall meetings or any virtual event with up to ten thousand attendees, allowing presenters to reach audiences worldwide.


These are just a few of the ways Microsoft 365 applications utilize Azure AI to deliver industry-leading experiences to billions of users. When you consider the fact that other Microsoft products such as Microsoft 365, Xbox, HoloLens 2, Dynamics 365, and Power Platform all rely on Azure AI, you begin to see the massive scale and the breadth of scenarios that only Azure can offer. Best of all, these same capabilities are available to anyone in Azure AI.

Thursday, 30 January 2020

Six things to consider when using Video Indexer at scale

Your archive of videos to index is ever-expanding, so you have been evaluating Microsoft Video Indexer and have decided to take your relationship with it to the next level by scaling up.

In general, scaling shouldn’t be difficult, but when you first face such a process you might not be sure of the best way to do it. Questions like “are there any technological constraints I need to take into account?”, “is there a smart and efficient way of doing it?”, and “can I prevent spending excess money in the process?” can cross your mind. So, here are six best practices for using Video Indexer at scale.

1. When uploading videos, prefer URL over sending the file as a byte array


Video Indexer does give you the choice to upload videos from URL or directly by sending the file as a byte array, but remember that the latter comes with some constraints.

First, it has file size limitations. The size of a byte array file is limited to 2 GB, compared to the 30 GB limit when uploading from a URL.

Second, and more importantly for your scaling, sending files as multi-part uploads creates a heavy dependency on your network. Service reliability, connectivity, upload speed, and lost packets somewhere on the web are just some of the issues that can affect your performance and hence your ability to scale.


When you upload videos using a URL, you just need to give us a path to the location of a media file and we will take care of the rest.

To upload videos by URL via the API, you can check this short code sample, or you can use AzCopy for a fast and reliable way to get your content into a storage account, from which you can submit it to Video Indexer using a SAS URL.
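
As a rough sketch, an upload by URL with the Video Indexer REST API looks like the following; the location, account details, access token, and SAS URL are placeholders, and you should confirm the exact parameter names against the current API reference.

```python
import requests

# Placeholders: your Video Indexer location, account id, access token, and a SAS URL to the video.
location = "trial"
account_id = "<account-id>"
access_token = "<access-token>"
video_sas_url = "https://<storageaccount>.blob.core.windows.net/videos/demo.mp4?<sas-token>"

upload_url = f"https://api.videoindexer.ai/{location}/Accounts/{account_id}/Videos"
params = {
    "accessToken": access_token,
    "name": "demo-video",
    "videoUrl": video_sas_url,   # upload by URL instead of sending the file as a byte array
    "privacy": "Private",
}

response = requests.post(upload_url, params=params)
response.raise_for_status()
print(response.json()["id"])     # the video id, used later for polling or callbacks
```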


2. Increase media reserved units if needed


Usually, in the proof of concept stage when you are just starting to use Video Indexer, you don’t need a lot of computing power. Now that you want to scale up your usage of Video Indexer, you have a larger archive of videos to index and you want the process to run at a pace that fits your use case. Therefore, you should think about increasing the number of compute resources you use if the current amount of computing power is just not enough.

In Azure Media Services, when talking about computing power and parallelization, we talk about media reserved units (RUs): the compute units that determine the parameters for your media processing tasks. The number of RUs affects the number of media tasks that can be processed concurrently in each account, and their type determines the speed of processing; one video might require more than one RU if its indexing is complex. When your RUs are busy, new tasks will be held in a queue until another resource is available.

We know you want to operate efficiently and don’t want resources that end up sitting idle part of the time. For that reason, we offer an auto-scale system that spins RUs down when less processing is needed and spins them up during your rush hours (up to fully using all of your RUs). You can easily enable this functionality by turning on autoscale in the account settings or by using the Update-Paid-Account-Azure-Media-Services API.


To minimize indexing duration and avoid low throughput, we recommend you start with 10 RUs of type S3. Later, if you scale up to support more content or higher concurrency and need more resources to do so, you can contact us using the support system (on paid accounts only) to ask for a larger RU allocation.

3. Respect throttling


Video Indexer is built to deal with indexing at scale, and when you want to get the most out of it you should also be aware of the system’s capabilities and design your integration accordingly. You don’t want to send an upload request for a batch of videos just to discover that some of them didn’t upload because you received an HTTP 429 response code (too many requests). This can happen when you send more requests per minute than the limit we support. Don’t worry: in the HTTP response, we add a Retry-After header that specifies when you should attempt your next retry. Make sure you respect it before sending your next request.
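
A minimal sketch of respecting that header with the requests library might look like this; the URL and parameters are the same placeholders as in the upload example, and the 30-second fallback is an assumption.

```python
import time
import requests

def post_with_throttling(url, params, max_attempts=5):
    """POST to the Video Indexer API, backing off whenever HTTP 429 is returned."""
    for _ in range(max_attempts):
        response = requests.post(url, params=params)
        if response.status_code != 429:
            return response
        # Honor the Retry-After header before sending the next request.
        wait_seconds = int(response.headers.get("Retry-After", 30))
        time.sleep(wait_seconds)
    raise RuntimeError("Still throttled after several retries")
```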


4. Use callback URL


Have you ever called customer service and been told, “I’m now processing your request; it will take a few minutes. You can leave your phone number and we’ll get back to you when it is done”? Leaving your number and getting a call back the moment your request has been processed is exactly the same concept as using a callback URL.

Thus, instead of constantly polling the status of your request from the second you send it, we recommend you simply add a callback URL and wait for us to update you. As soon as there is any status change in your upload request, we will send a POST notification to the URL you provided.

You can add a callback URL as one of the parameters of the upload-video API (see the description from the API below). If you are not sure how to do it, you can check the code samples in our GitHub repo. By the way, for the callback URL you can also use Azure Functions, a serverless, event-driven platform that can be triggered by HTTP and used to implement the follow-up flow.
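
For illustration, a callback receiver written as an HTTP-triggered Azure Function might look like the sketch below. It assumes Video Indexer passes the video id and state as query-string parameters, which you should verify against the current API documentation.

```python
import logging
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    # Assumed parameters: Video Indexer reports the video id and its new state.
    video_id = req.params.get("id")
    state = req.params.get("state")
    logging.info("Video %s changed state to %s", video_id, state)

    if state == "Processed":
        # Kick off your own downstream flow here, e.g. fetch the index and store insights.
        pass

    return func.HttpResponse("OK", status_code=200)
```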


5. Use the right indexing parameters for you


Probably the first thing you need to do when using Video Indexer, and especially when trying to scale, is to think about how to get the most out of it with the right parameters for your needs. Think about your use case: by defining the right parameters you can save yourself money and make the indexing process for your videos faster.

We give you the option to customize your usage of Video Indexer by choosing these indexing parameters. Don’t set the streaming preset if you don’t plan to watch the video, and don’t index video insights if you only need audio insights. It is that easy.
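
For example, the upload call accepts indexing and streaming presets; the values below (audio-only indexing, no streaming, plus a callback URL) are illustrative placeholders and should be checked against the API reference for your scenario.

```python
import requests

# Same placeholders as in the earlier upload example.
params = {
    "accessToken": "<access-token>",
    "name": "demo-video",
    "videoUrl": "https://<storageaccount>.blob.core.windows.net/videos/demo.mp4?<sas-token>",
    "indexingPreset": "AudioOnly",     # only audio insights are needed in this scenario
    "streamingPreset": "NoStreaming",  # don't prepare the video for streaming playback
    "callbackUrl": "https://<your-function-app>.azurewebsites.net/api/vi-callback",
}
response = requests.post(
    "https://api.videoindexer.ai/trial/Accounts/<account-id>/Videos", params=params
)
```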

6. Index in optimal resolution, not highest resolution


Not too long ago, HD video didn’t exist; now we have videos of varied qualities, from HD to 8K. The question is, what video quality do you need for indexing your videos? The higher the quality of the video you upload, the larger the file size, and the more computing power and time are needed to upload and process it.

Our experience shows that, in many cases, indexing performance differs very little between HD (720p) videos and 4K videos. Eventually, you’ll get almost the same insights with the same confidence.

For example, for the face detection feature, a higher resolution can help with the scenario where there are many small but contextually important faces. However, this will come with a quadratic increase in runtime and an increased risk of false positives.

Therefore, we recommend that you verify you get the right results for your use case and test it locally first. Upload the same video in 720p and in 4K and compare the insights you get. Remember, there is no need to use a cannon to kill a fly.

Saturday, 12 October 2019

Leveraging Cognitive Services to simplify inventory tracking


Who spends their summer at the Microsoft Garage New England Research & Development Center (or “NERD”)? The Microsoft Garage internship seeks out students who are hungry to learn, not afraid to try new things, and able to step out of their comfort zones when faced with ambiguous situations. The program brought together Grace Hsu from Massachusetts Institute of Technology, Christopher Bunn from Northeastern University, Joseph Lai from Boston University, and Ashley Hong from Carnegie Mellon University. They chose the Garage internship because of the product focus—getting to see the whole development cycle from ideation to shipping—and learning how to be customer obsessed.

Microsoft Garage interns take on experimental projects in order to build their creativity and product development skills through hacking new technology. Typically, these projects are proposals that come from our internal product groups at Microsoft, but when Stanley Black & Decker asked if Microsoft could apply image recognition for asset management on construction sites, this team of four interns accepted the challenge of creating a working prototype in twelve weeks.

Starting with a simple request for leveraging image recognition, the team conducted market analysis and user research to ensure the product would stand out and prove useful. They spent the summer gaining experience in mobile app development and AI to create an app that recognizes tools at least as accurately as humans can.

The problem


In the construction industry, it’s not unusual for contractors to spend over 50 hours every month tracking inventory, which can lead to unnecessary delays, overstocking, and missing tools. Altogether, large construction sites can lose more than $200,000 worth of equipment over the course of a long project. Addressing this problem today is an unstandardized mix of approaches that typically involves barcodes, Bluetooth, RFID tags, and QR codes. The team at Stanley Black & Decker asked, “wouldn’t it be easier to just take a photo and have the tool automatically recognized?”

Because there are many tool models with minute differences, recognizing a specific drill, for example, requires you to read a model number like DCD996. Tools can also be assembled with multiple configurations, such as with or without a bit or battery pack attached, and can be viewed from different angles. You also need to take into consideration the range of lighting conditions and possible backgrounds you’d come across on a typical construction site. It quickly becomes a very interesting problem to solve using computer vision.


How they hacked it


Classification algorithms can easily be trained to strong accuracy when identifying distinct objects, like differentiating between a drill, a saw, and a tape measure. The harder question was whether a classifier could accurately distinguish between very similar tools, like the four drills shown above. In the first iteration of the project, the team explored PyTorch and Microsoft’s Custom Vision service. Custom Vision appeals to users by not requiring a high level of data science knowledge to get a working model off the ground, and with enough images (roughly 400 for each tool), Custom Vision proved to be an adequate solution. However, it immediately became apparent that manually gathering this many images would be challenging to scale for a product line with thousands of tools. The focus quickly shifted to finding ways of synthetically generating the training images.

For their initial approach, the team did both three-dimensional scans and green screen renderings of the tools. These images were then overlaid with random backgrounds to mimic a real photograph. While this approach seemed promising, the quality of the images produced proved challenging.

In the next iteration, in collaboration with Stanley Black & Decker’s engineering team, the team explored a new approach using photo-realistic renders from computer-aided design (CAD) models. They were able to use relatively simple Python scripts to resize, rotate, and randomly overlay these images on a large set of backgrounds. With this technique, the team could generate thousands of training images within minutes.
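
A rough sketch of that kind of synthetic image generation with Pillow is shown below; the file paths, scale range, and rotation range are illustrative assumptions, not the team’s actual script.

```python
import random
from PIL import Image

def make_synthetic_image(render_path, background_path, out_path):
    """Overlay a CAD render (with transparent background) onto a random spot of a background photo."""
    background = Image.open(background_path).convert("RGB")
    render = Image.open(render_path).convert("RGBA")

    # Random scale and rotation to vary the tool's apparent size and orientation.
    scale = random.uniform(0.3, 0.8)
    size = (int(render.width * scale), int(render.height * scale))
    render = render.resize(size).rotate(random.uniform(0, 360), expand=True)

    # Paste at a random position, using the render's alpha channel as the mask.
    x = random.randint(0, max(0, background.width - render.width))
    y = random.randint(0, max(0, background.height - render.height))
    background.paste(render, (x, y), render)
    background.save(out_path)

make_synthetic_image("renders/dcd996.png", "backgrounds/site_01.jpg", "train/dcd996_0001.jpg")
```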


On the left is an image generated in front of a green screen; on the right, a render extracted from CAD.

Benchmarking the iterations


The Custom Vision service offers reports on the accuracy of the model as shown below.


For a classification model that targets visually similar products, a confusion matrix like the one below is very helpful. A confusion matrix visualizes the performance of a prediction model by comparing the true label of a class in the rows with the label output by the model in the columns. The higher the scores on the diagonal, the more accurate the model is. When high values appear off the diagonal, it helps the data scientists understand which two classes are being confused with each other by the trained model.

Existing Python libraries can be used to quickly generate a confusion matrix with a set of test images.
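
For instance, a minimal sketch with scikit-learn and matplotlib could look like the following; the class names and label lists are placeholders for the true labels of your test images and the Custom Vision model’s predictions.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Placeholders: true labels of the test images and the labels predicted by the model.
class_names = ["DCD996", "DCD991", "DCD796", "DCD791"]
y_true = ["DCD996", "DCD991", "DCD796", "DCD791", "DCD996"]
y_pred = ["DCD996", "DCD991", "DCD796", "DCD996", "DCD996"]

cm = confusion_matrix(y_true, y_pred, labels=class_names)
ConfusionMatrixDisplay(cm, display_labels=class_names).plot(xticks_rotation=45)
plt.tight_layout()
plt.show()
```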


The result


The team developed a React Native application that runs on both iOS and Android and serves as a lightweight asset management tool with a clean and intuitive UI. The app adapts to various degrees of Wi-Fi availability and when a reliable connection is present, the images taken are sent to the APIs of the trained Custom Vision model on Azure Cloud. In the absence of an internet connection, the images are sent to a local computer vision model.

These local models can be obtained using Custom Vision, which exports models to Core ML for iOS, TensorFlow for Android, or as a Docker container that can run on a Linux App Service in Azure. An easy framework for the addition of new products to the machine learning model can be implemented by exporting rendered images from CAD and generating synthetic images.


Images in order from left to right: inventory checklist screen, camera functionality to send a picture to Custom Vision service, display of machine learning model results, and a manual form to add a tool to the checklist.


What’s next


Looking for an opportunity for your team to hack on a computer vision project? Search for an OpenHack near you.

Microsoft OpenHack is a developer-focused event where a wide variety of participants (Open) learn through hands-on experimentation (Hack) using challenges based on real-world customer engagements designed to mimic the developer journey. OpenHack is a premium Microsoft event that provides a unique upskilling experience for customers and partners. Rather than a traditional presentation-based conference, OpenHack offers a hands-on coding experience for developers.

Thursday, 26 September 2019

Azure Media Services' new AI-powered innovation

Animated character recognition, multilingual speech transcription and more now available


At Microsoft, our mission is to empower every person and organization on the planet to achieve more. The media industry exemplifies this mission. We live in an age where more content is being created and consumed in more ways and on more devices than ever. At IBC 2019, we’re delighted to share the latest innovations we’ve been working on and how they can help transform your media workflows.

Video Indexer adds support for animation and multilingual content


We made our award winning Azure Media Services Video Indexer generally available at IBC last year, and this year it’s getting even better. Video Indexer automatically extracts insights and metadata such as spoken words, faces, emotions, topics and brands from media files, without you needing to be a machine learning expert. Our latest announcements include previews for two highly requested and differentiated capabilities for animated character recognition and multilingual speech transcription, as well as several additions to existing models available today in Video Indexer.

Animated character recognition


Animated content or cartoons are one of the most popular content types, but standard AI vision models built for human faces do not work well with them, especially if the content has characters without human features. In this new preview solution, Video Indexer joins forces with Microsoft’s Azure Custom Vision service to provide a new set of models that automatically detect and group animated characters and allow customers to then tag and recognize them easily via integrated custom vision models. These models are integrated into a single pipeline, which allows anyone to use the service without any previous machine learning skills. The results are available through the no-code Video Indexer portal or the REST API for easy integration into your own applications.


We built these animated character models in collaboration with select customers who contributed real animated content for training and testing. The value of the new functionality is well articulated by Andy Gutteridge, Senior Director, Studio & Post-Production Technology at Viacom International Media Networks, which was one of the data contributors: “The addition of reliable AI-based animated detection will enable us to discover and catalogue character metadata from our content library quickly and efficiently. Most importantly, it will give our creative teams the power to find the content they want instantly, minimize time spent on media management and allow them to focus on the creative.”

Multilingual identification and transcription


Some media assets like news, current affairs, and interviews contain audio with speakers using different languages. Most existing speech-to-text capabilities require the audio recognition language to be specified in advance, which is an obstacle to transcribing multilingual videos. Our new automatic spoken language identification feature for multilingual content leverages machine learning technology to identify the different languages used in a media asset. Once detected, each language segment undergoes an automatic transcription process in the language identified, and all segments are integrated back together into one transcription file consisting of multiple languages.


The resulting transcription is available both as part of Video Indexer JSON output and as closed-caption files. The output transcript is also integrated with Azure Search, allowing you to immediately search across videos for the different language segments. Furthermore, the multi-language transcription is available as part of the Video Indexer portal experience so you can view the transcript and identified language by time, or jump to the specific places in the video for each language and see the multi-language transcription as captions as a video is played. You can also translate the output back-and-forth into 54 different languages via the portal and API.

Additional updated and improved models


We are also adding new models and improving existing ones within Video Indexer, including:

Extraction of people and locations entities


We’ve extended our current brand detection capabilities to also incorporate well-known names and locations, such as the Eiffel Tower in Paris or Big Ben in London. When these appear in the generated transcript or on-screen via optical character recognition (OCR), a specific insight is created. With this new capability, you can review and search by all people, locations and brands that appeared in the video, along with their timeframes, description, and a link to our Bing search engine for more information.


Editorial shot detection model


This new feature adds a set of “tags” in the metadata attached to an individual shot in the insights JSON to represent its editorial type (such as wide shot, medium shot, close up, extreme close up, two shot, multiple people, outdoor and indoor, etc.). These shot-type characteristics come in handy when editing videos into clips and trailers as well as when searching for a specific style of shots for artistic purposes.


Explore and read more about editorial shot type detection in Video Indexer.

Expanded granularity of IPTC mapping

Our topic inferencing model determines the topic of videos based on transcription, optical character recognition (OCR), and detected celebrities even if the topic is not explicitly stated. We map these inferred topics to four different taxonomies: Wikipedia, Bing, IPTC, and IAB. With this enhancement, we now include level-2 IPTC taxonomy.

Taking advantage of these enhancements is as easy as re-indexing your current Video Indexer library.

New live streaming functionality


We are also introducing two new live-streaming capabilities in preview to Azure Media Services.

Live transcription supercharges your live events with AI


Using Azure Media Services to stream a live event, you can now get an output stream that includes an automatically generated text track in addition to the video and audio content. This text track is created using AI-based live transcription of the audio of the contribution feed. Custom methods are applied before and after speech-to-text conversion in order to improve the end-user experience. The text track is packaged into IMSC1, TTML, or WebVTT, depending on whether you are delivering in DASH, HLS CMAF, or HLS TS.

Live linear encoding for 24/7 over-the-top (OTT) channels


Using our v3 APIs, you can create, manage, and stream live channels for OTT services and take advantage of all the other features of Azure Media Services like live to video on demand (VOD), packaging, and digital rights management (DRM).


New packaging features


Support for audio description tracks

Broadcast content frequently has an audio track that contains verbal explanations of on-screen action in addition to the normal program audio. This makes programming more accessible for vision-impaired viewers, especially if the content is highly visual. The new audio description feature enables a customer to annotate one of the audio tracks to be the audio description (AD) track, which in turn can be used by players to make the AD track discoverable by viewers.

ID3 metadata insertion

In order to signal the insertion of advertisements or custom metadata events on a client player, broadcasters often make use of timed metadata embedded within the video. In addition to SCTE-35 signaling modes, we now also support ID3v2 or other custom schemas defined by an application developer for use by the client application.

Microsoft Azure partners demonstrate end-to-end solutions


Bitmovin is debuting its Bitmovin Video Encoding and Bitmovin Video Player on Microsoft Azure. Customers can now use these encoding and player solutions on Azure and leverage advanced functionality such as 3-pass encoding, AV1/VVC codec support, multi-language closed captions, and pre-integrated video analytics for QoS, ad, and video tracking.

Evergent is showing its User Lifecycle Management Platform on Azure. As a leading provider of revenue and customer lifecycle management solutions, Evergent leverages Azure AI to enable premium entertainment service providers to improve customer acquisition and retention by generating targeted packages and offers at critical points in the customer lifecycle.

Haivision will showcase its intelligent media routing cloud service, SRT Hub, that helps customers transform end-to-end workflows starting with ingest using Azure Data Box Edge and media workflow transformation using Hublets from Avid, Telestream, Wowza and Cinegy, and Make.tv.

SES has developed a suite of broadcast-grade media services on Azure for its satellite connectivity and managed media services customers. SES will show solutions for fully managed playout services, including master playout, localized playout and ad detection and replacement, and 24x7 high-quality multichannel live encoding on Azure.

SyncWords is making its caption automation technology and user-friendly cloud-based tools available on Azure. These offerings will make it easier for media organizations to add automated closed captioning and foreign language subtitling capabilities to their real-time and offline video processing workflows on Azure.

Global design and technology services company Tata Elxsi has integrated TEPlay, its OTT platform SaaS, with Azure Media Services to deliver OTT content from the cloud. Tata Elxsi has also brought FalconEye, its quality of experience (QoE) monitoring solution that focuses on actionable metrics and analytics, to Microsoft Azure.

Verizon Media is making its streaming platform available in beta on Azure. Verizon Media Platform is an enterprise-grade managed OTT solution including DRM, ad insertion, one-to-one personalized sessions, dynamic content replacement, and video delivery. The integration brings simplified workflows, global support and scale, and access to a range of unique capabilities available on Azure.

Many of our partners will also be presenting in the theater at our booth, so make sure you stop by to catch them!

Saturday, 20 July 2019

New ways to train custom language models – effortlessly!

Video Indexer (VI), the AI service for Azure Media Services, enables the customization of language models by allowing customers to upload examples of sentences or words belonging to the vocabulary of their specific use case. Since speech recognition can sometimes be tricky, VI enables you to train and adapt the models to your specific domain. Harnessing this capability allows organizations to improve the accuracy of the Video Indexer-generated transcriptions in their accounts.

Over the past few months, we have worked on a series of enhancements to make this customization process even more effective and easy to accomplish. Enhancements include automatically capturing any transcript edits done manually or via API as well as allowing customers to add closed caption files to further train their custom language models.

The idea behind these additions is to create a feedback loop where organizations begin with a base, out-of-the-box language model and gradually improve its accuracy through manual edits and other resources over a period of time, resulting in a model that is fine-tuned to their needs with minimal effort.

An account’s custom language models, and all the enhancements this blog describes, are private and are not shared between accounts.

In the following sections, I will drill down into the different ways this can be done.

Improving your custom language model using transcript updates


Once a video is indexed in VI, customers can use the Video Indexer portal to introduce manual edits and fixes to the automatic transcription of the video. This can be done by clicking the Edit button at the top right corner of the Timeline pane of a video to move to edit mode, and then simply updating the text, as seen in the image below.


The changes are reflected in the transcript, captured in a text file named From transcript edits, and automatically inserted into the language model used to index the video. If you were not already using a custom language model, the updates will be added to a new Account Adaptations language model created in the account.

You can manage the language models in your account and see the From transcript edits files by going to the Language tab in the content model customization page of the VI website.

Once one of the From transcript edits files is opened, you can review the old and new sentences created by the manual updates, and the differences between them as shown below.


All that is left to do is click Train to update the language model with the latest changes. From that point on, these changes will be reflected in all future videos indexed using that model. Of course, you do not have to use the portal to train the model; the same can be done via the Video Indexer train language model API. Using the API opens new possibilities, such as automating a recurring training process to leverage ongoing updates.
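
As a rough sketch, automating that training call with the REST API might look like the following; the route, account details, and model id are placeholders, and you should confirm the exact path and parameters against the Video Indexer API reference.

```python
import requests

# Placeholders: your Video Indexer location, account id, language model id, and access token.
location = "trial"
account_id = "<account-id>"
model_id = "<language-model-id>"
access_token = "<access-token>"

# Assumed route for the "train language model" operation; verify it in the API reference.
train_url = (
    f"https://api.videoindexer.ai/{location}/Accounts/{account_id}"
    f"/Customization/Language/{model_id}/Train"
)

response = requests.put(train_url, params={"accessToken": access_token})
response.raise_for_status()
print("Training started:", response.json())
```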


There is also an update video transcript API that allows customers to update the entire transcript of a video in their account by uploading a VTT file that includes the updates. As part of the new enhancements, when a customer uses this API, Video Indexer also automatically adds the uploaded transcript to the relevant custom model in order to leverage the content as training material. For example, calling update video transcript for a video titled “Godfather” will result in a new transcript file named “Godfather” in the custom language model that was used to index that video.

Improving your custom language model using closed caption files

Another quick and effective way to train your custom language model is to leverage existing closed caption files as training material. This can be done manually, by uploading a new closed caption file to an existing model in the portal, as shown in the image below, or by using the create language model and update language model APIs to upload VTT, SRT, or TTML files (similarly to what was done until now with TXT files).


Once uploaded, VI cleans up all the metadata in the file and strips it down to the text itself. You can see the before and after results for each file type below.

VTT
Before:
NOTE Confidence: 0.891635
00:00:02.620 --> 00:00:05.080
but you don't like meetings before 10 AM.
After:
but you don’t like meetings before 10 AM.

SRT
Before:
2
00:00:02,620 --> 00:00:05,080
but you don't like meetings before 10 AM.
After:
but you don’t like meetings before 10 AM.

TTML
Before:
<!-- Confidence: 0.891635 -->
<p begin="00:00:02.620" end="00:00:05.080">but you don't like meetings before 10 AM.</p>
After:
but you don’t like meetings before 10 AM.

From that point on, all that is left to do is review the additions to the model and click Train or use the train language model API to update the model.