Tuesday, 01 March 2016
Machine Learning is one of those tech areas which has always seemed just slightly out of reach for me. In this article, I'll demystify Machine Learning and explain how accessible it is for regular developers.
This article was published at GitHub. It is open source and you can make edits, comments etc.
Machine Learning is one of those tech areas which has always seemed just slightly out of reach for me. I've always assumed you need a degree in extreme nerdiness from the Boffins university to understand the algorithms and data science that sit behind Machine Learning. Perhaps you need some kind of magic to 'get it' and those of us born without magic (i.e. muggles) should not attempt to enter this world for fear of smashing our face on platform 9 3/4 at London King's Cross station.
I've discovered of late that this is really not the case. Machine Learning is conceptually fairly simple and Microsoft has some great tools in the Azure Machine Learning service, Project Oxford and Cortana Analytics which make Machine Learning accessible to us mere muggles.
The principle of Machine Learning is actually very simple: Machine Learning is a process by which computers find patterns in data and make those patterns available to applications. An application can then gain insights into new data based on how closely it conforms to the identified patterns.
Take a canonical example such as product recommendations. If a Machine Learning process is given a set of historical order data and asked to "find out which products are commonly ordered together", it can examine that data and work out which products most often appear in the same order.
Once these patterns are identified, the application can ask "which products are most commonly ordered with this one" when a user adds something to their shopping cart. The Machine Learning process will return a result set and the application can display its product recommendations.
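As a rough illustration of those two steps (find the patterns, then query them), here is a minimal Python sketch. The order data, the product names and the `find_patterns` and `recommend` functions are all made up for this example, and simple co-occurrence counting stands in for whatever algorithm a real Machine Learning service would use.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical historical orders: each order is the set of products bought together.
ORDERS = [
    {"wand", "cauldron", "robes"},
    {"wand", "cauldron"},
    {"wand", "owl"},
    {"cauldron", "robes"},
]


def find_patterns(orders):
    """Count, for every product, how often each other product shares an order with it."""
    together = defaultdict(Counter)
    for order in orders:
        # permutations gives both (a, b) and (b, a), so lookups work in either direction.
        for a, b in permutations(sorted(order), 2):
            together[a][b] += 1
    return together


def recommend(model, product, limit=2):
    """Return up to `limit` products most commonly ordered with `product`."""
    counts = model.get(product, Counter())
    return [other for other, _ in counts.most_common(limit)]


model = find_patterns(ORDERS)
print(recommend(model, "wand", limit=1))
```

The `model` here is just a table of counts, but it plays the same role as the patterns described above: it is built once from historical data and then queried each time a user adds something to their cart.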
Recommendations are just one example; Machine Learning can help in any scenario where patterns can be usefully recognised in data. Some other examples include:
Regardless of the platform being used, Machine Learning follows a relatively simple process.
The primary goal of the process is to identify a 'Model'. The Model is the thing that applications submit requests to in order to gain insight into new data. A person working in the role of a Data Scientist performs the Machine Learning process and will ultimately decide on the right Model to use.
The process starts with a question: what are you trying to learn from your Machine Learning experiment? For example, in the case of recommendations, the question might be "identify the most commonly sold products for each product in the inventory".
The next step is to provide 'prepared data'. Prepared data is one or more data sets that have been pre-processed (formatted, cleaned and sampled) so that they are ready to have Machine Learning algorithms applied to them. Preparing the data means that the data is in the best shape to draw scientific conclusions from and is not skewed in any way.
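To make "formatted, cleaned and sampled" a little more concrete, here is a small sketch in Python. The rows, field names and the `prepare` function are hypothetical, and real data preparation is usually far more involved; this only shows the shape of the three steps.

```python
import random

# Hypothetical raw order rows, as they might arrive from an export.
RAW_ROWS = [
    {"order_id": "1", "product": " Wand ", "quantity": "1"},
    {"order_id": "1", "product": "cauldron", "quantity": "2"},
    {"order_id": "2", "product": "", "quantity": "1"},        # missing product
    {"order_id": "3", "product": "Owl", "quantity": "lots"},  # unusable quantity
    {"order_id": "1", "product": " Wand ", "quantity": "1"},  # exact duplicate
]


def prepare(rows, sample_size=None, seed=0):
    """Format, clean and (optionally) sample raw rows into tidy tuples."""
    cleaned, seen = [], set()
    for row in rows:
        # Format: normalise product names to trimmed lower case.
        product = row["product"].strip().lower()
        # Clean: drop rows with no product or a non-numeric quantity.
        if not product or not row["quantity"].isdigit():
            continue
        record = (int(row["order_id"]), product, int(row["quantity"]))
        # Clean: drop exact duplicates so they do not skew the counts.
        if record in seen:
            continue
        seen.add(record)
        cleaned.append(record)
    # Sample: take a reproducible random subset if one was requested.
    if sample_size is not None and sample_size < len(cleaned):
        cleaned = random.Random(seed).sample(cleaned, sample_size)
    return cleaned


prepared = prepare(RAW_ROWS)
print(prepared)
```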
Once you have your prepared data, you apply one or more Machine Learning algorithms to it with a view to producing a Model. This is an iterative process and you may loop around testing various algorithms until you have a Model that sufficiently answers your question.
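The loop of "try an algorithm, test it, try another" can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tiny data set, the two candidate algorithms (a mean baseline and a least-squares straight line) and the choice of mean absolute error as the test. A data scientist would pick algorithms and measures to suit the actual question.

```python
from statistics import mean

# Hypothetical prepared data: (input, observed value) pairs, split into two sets.
TRAIN = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8)]
TEST = [(5, 11.1), (6, 12.9)]


def fit_mean(data):
    """Candidate 1: always predict the average observed value."""
    average = mean(y for _, y in data)
    return lambda x: average


def fit_line(data):
    """Candidate 2: fit a straight line by ordinary least squares."""
    x_bar = mean(x for x, _ in data)
    y_bar = mean(y for _, y in data)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in data) / sum(
        (x - x_bar) ** 2 for x, _ in data
    )
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_abs_error(model, data):
    """How far off the model's predictions are, on average."""
    return mean(abs(model(x) - y) for x, y in data)


ALGORITHMS = {"mean baseline": fit_mean, "straight line": fit_line}

# Train every candidate on TRAIN, score it on data it has not seen, keep the best.
scores = {name: mean_abs_error(fit(TRAIN), TEST) for name, fit in ALGORITHMS.items()}
best_name = min(scores, key=scores.get)
print(best_name, scores)
```

Scoring on data the candidates were not trained on is what lets you judge whether a Model "sufficiently answers your question" rather than just memorising the examples it was given.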
Once you have produced your chosen model, it will typically be exposed via some kind of API.
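As a sketch of what "exposed via some kind of API" could look like, here is a tiny web application written against Python's standard WSGI interface. The `MODEL` table and the `?product=` query parameter are invented for this example; a hosted service such as Azure Machine Learning provides its own endpoints, so treat this purely as an illustration of the idea.

```python
import json
from urllib.parse import parse_qs

# Hypothetical trained model: product -> products commonly ordered with it.
MODEL = {
    "wand": ["cauldron", "robes"],
    "cauldron": ["wand", "robes"],
}


def app(environ, start_response):
    """WSGI app: answer a request like /?product=wand with JSON recommendations."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    product = query.get("product", [""])[0]
    body = json.dumps(
        {"product": product, "recommendations": MODEL.get(product, [])}
    ).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever()` will serve it, after which any application that can make an HTTP request can ask the Model a question.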
If you want to learn more about the Machine Learning process, I highly recommend David Chappell's Introduction for Technical Professionals white paper.
Azure Machine Learning service is one of the main platforms for doing Machine Learning in a quick, easy, cloud-based way.
The service contains a set of tools and modules that help the data scientist set up and run the Machine Learning process. It is designed for applied machine learning, meaning it is intended to be used by real-world applications and developers.
The Azure machine learning service offers 4 main components
Learn more about the Azure Machine Learning service, including a free trial here: https://azure.microsoft.com/en-us/services/machine-learning/
If all of the above still seems like it is the work of wizards and other magical folk, do not worry because the boffins at Microsoft have pre-packaged some of the more common Machine Learning scenarios and made them available as APIs which you can easily integrate into your application.
These APIs are available under two distinct brand names (not sure why, please tweet me if you know): Project Oxford and Cortana Analytics. Both of these services offer a neatly packaged set of APIs which have been built by Microsoft using the Azure Machine Learning service. Both offer most APIs for free under a certain number of transactions per month (typically 10,000) and paid-for models beyond that. In both cases, the APIs are exposed as simple, well documented REST APIs which can be called by pretty much any programming language on any platform. Project Oxford has a set of SDKs for Windows, Android, iOS and other platforms.
There are many, many APIs to choose from but some of the more popular ones are
The idea of these APIs is that as a developer (on any platform), you can integrate some of the intelligence of Machine Learning into your every day applications with little or no understanding of the underlying Machine Learning process.
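To give a feel for how little code a REST call needs, here is a Python sketch that builds (but deliberately does not send) a request to a sentiment-style API. The URL, the `X-Api-Key` header name, the key and the JSON shape are all placeholders made up for this example, not the real Project Oxford or Cortana Analytics contract; check each API's documentation for the actual endpoint, authentication and payload.

```python
import json
import urllib.request

# Assumptions: this URL, header name and key are placeholders, not a real service.
API_URL = "https://example.invalid/text/sentiment"
API_KEY = "your-subscription-key"


def build_sentiment_request(text):
    """Build a JSON POST request asking a (hypothetical) API to score some text."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json", "X-Api-Key": API_KEY},
        method="POST",
    )


request = build_sentiment_request("Machine Learning is pretty cool")
print(request.get_method(), request.full_url)
```

Against a real endpoint, passing the request to `urllib.request.urlopen` and reading the JSON response would be the only remaining step, which is the point: no knowledge of the underlying Machine Learning process is needed.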
I myself have written an app in the Office store called Sentimental which uses the Cortana Analytics Text Analytics API to do sentiment analysis and key phrase extraction directly from within Office; it's pretty cool. There is also a website that does the same thing written in ASP.NET Core: http://sentimentalweb.azurewebsites.net.
If you want to get a quick flavour of how these APIs work, Microsoft have created a set of websites that use a combination of these APIs as well as some other Azure Machine Learning services.
Machine Learning is for muggles too, especially if you include the ready-to-use APIs that Project Oxford and Cortana Analytics provide. Any developer can get started with these APIs in minutes and add some powerful intelligence to their applications.
Got a comment?
All my articles are written and managed as Markdown files on GitHub.
Please add an issue or submit a pull request if something is not right on this article or you have a comment.