After Satya Nadella opened his account at Microsoft with a big pitch for “Mobile First, Cloud First”, he has evolved this into a focus on intelligent applications across the Microsoft stack, most visibly at the recent Microsoft Ignite 2017 conference with Cognitive Services. Microsoft is seeing evolution in the cloud hosting and development world, but the bigger revolution is in empowering people to make use of complex machine learning solutions, opening up ways of working that weren’t possible before. Welcome to “AI First”.
Learning how to apply these complicated algorithms to your own data is time-consuming, expensive and potentially risky, as you may not see a benefit within the timeframes you are looking for. Microsoft has realised this and wrapped many common scenarios into a simpler, service-driven approach, and this is where Cognitive Services comes in.
What is Cognitive Services?
Cognitive Services is a suite of web services provided through Azure that give you access to machine learning algorithms that are always learning from everyone else using these services. The services are broken down into some key groupings:
Vision: image-processing algorithms to smartly identify, caption and moderate your pictures.
Speech: convert spoken audio into text, use voice for verification, or add speaker recognition to your app.
Knowledge: map complex information and data in order to solve tasks such as intelligent recommendations and semantic search.
Search: add Bing Search APIs to your apps and harness the ability to comb billions of webpages, images, videos and news with a single API call.
Language: allow your apps to process natural language with pre-built scripts, evaluate sentiment and learn how to recognize what users want.
From Microsoft Azure/Cognitive Services
Some of these services allow you to try a sample out directly from the website, such as Text Analytics. This shows how you can identify the language of the text, pull out the key phrases and even assess the sentiment, as you can see below.
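If you would rather see this in code, here is a minimal sketch of calling the sentiment endpoint with Python’s requests library. The region in the URL, the placeholder key and the sample text are assumptions and should be replaced with your own:

import requests

# Minimal sketch: the region and key are placeholders for your own
# Text Analytics subscription. The v2.0 API also exposes /languages and
# /keyPhrases endpoints alongside /sentiment.
ENDPOINT = "https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0"
API_KEY = "<your-text-analytics-key>"

headers = {"Ocp-Apim-Subscription-Key": API_KEY}
documents = {"documents": [
    {"id": "1", "language": "en",
     "text": "Cognitive Services made this demo delightfully easy."}
]}

# Sentiment comes back as a score between 0 (negative) and 1 (positive).
response = requests.post(ENDPOINT + "/sentiment", headers=headers, json=documents)
print(response.json())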
If you are looking to delve a little deeper, the API documentation makes it very easy not only to learn how to code against the services but to actually try them out first. This is so helpful when it comes to determining whether a service can help with your solution, without having to write any code first. Vision is a great example: you can upload an image and have it identify faces and emotions and suggest what is in the picture.
The first step is to get an API key, which you can do easily from the Azure Portal by selecting AI + Cognitive Services from the Marketplace. Choose your service (e.g. Computer Vision API) and you will be able to retrieve the API key once the resource has been created.
Using the API documentation, you can make calls against the API. For example, below shows me checking one of our website images to suggest some tags. You select which Visual Features you want, the language the results should be returned in, and a link to the public-facing image. It is worth noting that you can also send the image itself as a binary stream, so it does not have to be public-facing.
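The same request is easy to reproduce outside of the documentation page. The sketch below shows one way to do it with Python’s requests library; the region, key and image URL are placeholders for your own:

import requests

# Sketch of an Analyze Image call; replace the region, key and image URL
# with your own. visualFeatures controls which analyses are run.
ENDPOINT = "https://westeurope.api.cognitive.microsoft.com/vision/v1.0/analyze"
API_KEY = "<your-computer-vision-key>"

params = {"visualFeatures": "Tags,Description", "language": "en"}
headers = {"Ocp-Apim-Subscription-Key": API_KEY,
           "Content-Type": "application/json"}
body = {"url": "https://example.com/images/laptop.jpg"}  # hypothetical public image

response = requests.post(ENDPOINT, params=params, headers=headers, json=body)
analysis = response.json()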
This then returns information about that image as JSON with a confidence level for each of the tags:
"tags": [
  { "name": "laptop", "confidence": 0.999942421913147 },
  { "name": "computer", "confidence": 0.9992400407791138 },
  { "name": "person", "confidence": 0.9880917072296143 },
  { "name": "electronics", "confidence": 0.9809827208518982 },
  { "name": "indoor", "confidence": 0.9737054705619812 },
  { "name": "using", "confidence": 0.8011789321899414 },
  { "name": "lap", "confidence": 0.6971763372421265 },
  { "name": "open", "confidence": 0.49391528964042664 },
  { "name": "keyboard", "confidence": 0.3783775568008423 }
]
From Microsoft Azure/Cognitive Services
You can see that it is almost certain the image shows a laptop and a computer and has a person in it. It thinks there is a keyboard but is far less certain. What is great about this service is that it is supplied with hundreds of thousands of images and can learn from all of them, continually improving its models.
These services are already used by many Microsoft products to enhance usability, often without users being aware. For example, Outlook 2016 now checks your emails for certain patterns and can recommend an action. You can see below that I received a mail saying that we have an updated email signature, and Outlook has identified that I may have a task. Clicking on the Action Items gives me the option to take those items and add them as follow-up tasks in Outlook. If you are using Microsoft To-Do, they will automatically be added there as well.
Another very useful example is the PowerPoint Presentation Translator, which came from Microsoft Garage. It gives live subtitles to your PowerPoint presentations and can even translate them into other languages. There is an accompanying phone app that can show you the translations in your own language simultaneously. If you want this capability earlier in the design of a slide deck, you can also add a Dictation add-in to Office that allows you to speak to add text to your Office documents. This video offers a quick demonstration:
So how could you use this in your own applications? A good example would be tagging your images with keywords and a description using the Computer Vision API. If those images are held in SharePoint Online, Microsoft Flow could trigger the Computer Vision service directly and then update the files’ metadata. This would hugely reduce the time taken to make the images searchable, although the suggested tags should be validated first. A rough sketch of the tagging step is shown below.
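This sketch sends the image bytes directly (the binary stream option mentioned earlier) so the file never needs to be public-facing. The region, key, helper name and confidence threshold are all assumptions for illustration:

import requests

ENDPOINT = "https://westeurope.api.cognitive.microsoft.com/vision/v1.0/analyze"
API_KEY = "<your-computer-vision-key>"

def suggest_keywords(image_path, min_confidence=0.8):
    """Hypothetical helper: return only tags the service is reasonably sure of."""
    headers = {"Ocp-Apim-Subscription-Key": API_KEY,
               "Content-Type": "application/octet-stream"}
    with open(image_path, "rb") as image_file:
        response = requests.post(ENDPOINT, params={"visualFeatures": "Tags"},
                                 headers=headers, data=image_file.read())
    tags = response.json().get("tags", [])
    # Filter out low-confidence tags; the survivors would be written back
    # to the file's metadata and then validated by a person.
    return [tag["name"] for tag in tags if tag["confidence"] >= min_confidence]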
Another example is the use of LUIS, the Language Understanding Intelligent Service, which can interpret a sentence of text, matching it against a set of intents and entities to work out what a person is trying to say. This is used with bots in the Bot Framework to track chats and connect them to an action: it can take the intention to book leave and link phrases like “can I take time off”, “book holiday” and “get leave” to the same action. This was used to great effect in the keynote of the recent Microsoft Ignite event, where the Cortana speakers were demonstrated booking leave from internal systems.
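Here is a hedged sketch of querying a published LUIS app for those phrases; the region, app ID and key are placeholders, and the BookLeave intent is a hypothetical example of how they could all resolve to one action:

import requests

LUIS_URL = "https://westeurope.api.cognitive.microsoft.com/luis/v2.0/apps/<your-app-id>"
API_KEY = "<your-luis-key>"

for utterance in ["can I take time off", "book holiday", "get leave"]:
    response = requests.get(LUIS_URL,
                            params={"subscription-key": API_KEY, "q": utterance})
    intent = response.json()["topScoringIntent"]["intent"]
    print(utterance, "->", intent)  # each should map to e.g. 'BookLeave'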
Microsoft also announced the Azure Machine Learning Workbench at Microsoft Ignite, allowing people to build out their own machine learning algorithms. Where Cognitive Services meet only part of your needs, you can now craft algorithms for your exact requirements.
If you have a good idea for how these services could be used and would like to discuss how we could help make that happen, don’t hesitate to get in contact now.