
An Ultimate Guide to Understanding Supervised Learning


In the early 18th century, the invention of the steam engine took the world by storm and, by paving the way for the Industrial Revolution, changed the course of history.

The world shifted from manual labour to mechanized operations.

In the mid-20th century, the advent of computers and digital electronics in general gave birth to the digital revolution, in which manual data storage and basic computation went electronic.

The early 21st century marks the birth of the next great revolution, which is expected to shift to computers much of the remaining work that only humans can currently do.

The technology that is triggering this revolution is called Artificial Intelligence (AI), and at the heart of AI is ‘Machine Learning’.

Bill Gates once said, “A breakthrough in machine learning would be worth ten Microsofts.”

What is Supervised Learning and Machine Learning?

For computers to be artificially intelligent, i.e., to be able to learn new things, identify, categorize, classify, and make decisions, the most basic requirement is data.

This ability of machines to learn is termed ‘Machine Learning’. It can be described as the ability to learn and improve from experience without being explicitly programmed.


Let us see a simple example. For a machine to distinguish between a middle-aged man and an elderly woman, it first has to learn the attributes of each.

This can be learned from a large set of data about men and women. In more complex cases, the data can run to millions of values, and is then termed ‘big data’.

Machines are not cognitive beings.

Humans have to undertake the preliminary task of setting up a system for learning, which is what makes understanding machine learning important.

That is why it is no wonder that machine learning jobs are on the rise.

According to Monster.com, one of the three most in-demand skills is Machine Learning.

Most in-demand Skills (Image source: Forbes)

Types of Machine Learning

On a very basic level, ML is not vastly different from the way humans learn.

For instance, if you show a child a pair of shoes and a pair of socks, the next time the child sees a pair of socks, they can point them out and identify them.

The more shoes and socks the child comes across, the better they become at identifying them. Something similar happens in machines.

The data about shoes and socks that has been fed to the system is called ‘training data’.

Machines can learn in several ways:

1. Supervised Learning

Whenever we speak of machine learning, it is essential to be clear about what supervised learning is.

In supervised learning, the training data provided is in a labelled format.

For example, every shoe is labelled as a shoe and every sock as a sock, so the system knows the labels. When it is then shown a new type of shoe, it will identify it as ‘shoes’ without being explicitly programmed to do so.

Supervised Learning Working (Image source: NVIDIA)

2. Unsupervised Learning

Unlike in supervised learning, the training data is not labelled, so the system has to discover for itself the recurring patterns that set one type of item or value apart from another.

It will not know that one is called shoes and the other socks, but it learns that they form different groups and places each item accordingly.
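To make the idea concrete, here is a minimal sketch of grouping without labels. It uses a very small version of k-means clustering with two groups, which is one common unsupervised technique and not the only one. The item weights are made-up numbers and the function name `two_means` is hypothetical.

```python
def two_means(values, iterations=10):
    """Split unlabelled 1-D values into two groups (k-means with k = 2)."""
    # Deterministic starting centres: the smallest and the largest value.
    centres = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for value in values:
            # Put each value in the group whose centre is nearest.
            nearest = 0 if abs(value - centres[0]) <= abs(value - centres[1]) else 1
            groups[nearest].append(value)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups

# Hypothetical weights in grams; note that no item carries a label.
weights_g = [410.0, 45.0, 390.0, 52.0, 430.0, 48.0]
group_a, group_b = two_means(weights_g)
print(group_a)  # the light items
print(group_b)  # the heavy items
```

The program never learns the words ‘shoes’ or ‘socks’. It only finds that the light items and the heavy items fall into two separate groups.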

3. Semi-Supervised Learning

This is a combination of supervised and unsupervised learning, where the training data provided is a mixture of labelled and unlabelled examples, with the unlabelled portion being the larger.

Semi-supervised learning is helpful when a large amount of data is to be processed but only some of it is labelled and there aren’t sufficient resources to label the rest.
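One simple flavour of this idea is pseudo-labelling: the few labelled examples are used to guess labels for the many unlabelled ones. The sketch below is only an illustration under that assumption. The data is made up and `pseudo_label` is a hypothetical helper, not a standard library function.

```python
# A few labelled examples (weight in grams, label) and a larger unlabelled pool.
labelled = [(410.0, "shoes"), (45.0, "socks")]
unlabelled = [390.0, 52.0, 430.0, 48.0]

def pseudo_label(labelled_pairs, unlabelled_values):
    """Give each unlabelled value the label of its nearest labelled example."""
    result = list(labelled_pairs)
    for value in unlabelled_values:
        nearest = min(labelled_pairs, key=lambda pair: abs(pair[0] - value))
        result.append((value, nearest[1]))
    return result

enlarged_training_data = pseudo_label(labelled, unlabelled)
for weight, label in enlarged_training_data:
    print(weight, label)
```

The enlarged data set can then be used to train a supervised model. The guessed labels may be wrong, so in practice they are usually treated with more caution than the hand-made ones.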

Types of Machine Learning (Image source: Hacker Noon)

4. Reinforcement Learning

Reinforcement learning employs goal-oriented algorithms: the system learns to achieve an objective (goal) by maximizing a cumulative reward over a number of steps.

A good example is a game such as chess, where the payoff is maximized over many moves rather than in a single one.

You can refer to this article on reinforcement learning for a better understanding of reinforcement learning.
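Chess is far too large for a short example, so the sketch below uses a made-up corridor of five squares in which only the last square gives a reward. It applies the Q-learning update rule, one well-known reinforcement learning method. To keep the run deterministic, it sweeps over every square and action in turn instead of exploring at random, which is a simplification. All names and parameter values are assumptions.

```python
N_STATES = 5                 # squares 0..4 in a hypothetical corridor
GOAL = N_STATES - 1          # reaching square 4 gives a reward of 1
ACTIONS = (-1, +1)           # step left or step right
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor (assumed values)

def step(state, action):
    """Deterministic environment: move, stay inside the corridor, reward at the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

# q[(state, action)] estimates the total discounted reward of taking that action.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(100):                 # repeated sweeps instead of random exploration
    for state in range(GOAL):        # the goal square is terminal, so it is skipped
        for action in ACTIONS:
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            target = reward + GAMMA * best_next
            q[(state, action)] += ALPHA * (target - q[(state, action)])

# The learned policy picks, in each square, the action with the highest estimate.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

Although only the final step is rewarded, the estimates spread backwards along the corridor, so the learned policy steps towards the goal from every square.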

Supervised vs Reinforcement Learning (Image source: SFL Scientific)

Understanding Supervised Learning

Supervised Learning technically means the learning of a function that gives an output for a given input based on a set of defined input-output pairs.

It does this with the help of labelled ‘training data’, which consists of a set of training examples.

In our previous example, the picture of shoes and the name ‘shoes’ are input and output respectively.

After learning from hundreds or thousands of different shoe pictures paired with the name ‘shoes’, and the same for socks, our system can be given an input only (a new picture of shoes) and it will give an output (name: shoes).

Often, the function y = f(x) is used to represent supervised ML where ‘x’ is the input data and ‘y’ the output variable, a function of ‘x’ that is to be predicted.

In any training data, each example pair normally consists of an input, which is typically a vector (a collection of features describing a sample), and the desired output value, which is called the ‘supervisory signal’ because it supervises, or guides, the learning.
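As a rough illustration of y = f(x), the sketch below learns from a handful of labelled pairs and then labels a new input. It uses the nearest-neighbour rule, which is just one simple way to build such a function. The feature values (weight in grams, height in centimetres) are made up.

```python
import math

# Each training example pairs an input vector x with its label y.
# Features are hypothetical: (weight in grams, height in cm).
training_pairs = [
    ((420.0, 12.0), "shoes"),
    ((390.0, 11.0), "shoes"),
    ((45.0, 1.0), "socks"),
    ((52.0, 1.5), "socks"),
]

def f(x):
    """Return the label of the training input closest to x (1-nearest neighbour)."""
    _, nearest_label = min(training_pairs, key=lambda pair: math.dist(pair[0], x))
    return nearest_label

print(f((400.0, 10.0)))  # a new, unseen item
```

The new item was never in the training data, yet the function can still label it because it sits close to the known shoes in feature space.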


Interestingly, supervised machine learning is analogous to ‘concept learning’ or ‘category learning’ in humans or animals.

This is defined as ‘the search for and listing of attributes that can be used to distinguish exemplars from non-exemplars of various categories’ (Bruner, Goodnow, Austin (1967)).

Broadly, supervised ML can be categorized into the following types:

1. Classification

As the name suggests, classification algorithms undertake the job of predicting a label or putting the variable into a category (categorization).

For example, classifying something as ‘socks’ from our previous example.

An everyday application of a classification algorithm is the email spam detector, which identifies attributes that help it categorize an email as ‘spam’ or ‘not spam’.

It is important to be able to identify whether a problem is one of classification or regression.

Characteristics of a classification problem:

(i) Examples can be classified into one of two or more classes

(ii) A problem with two classes can also be called a binary problem

(iii) A problem with more than two classes can be called a multi-class classification problem

The following image shows a typical classification problem where the variable is ‘categorized’ in either the ‘cats’ category or the ‘dogs’ category.

It can also be seen from the boundary line that some errors have been made by classifying some dogs as cats and vice-versa.

This problem occurs in cases where attributes may be similar, such as when height is one of the classification criteria and some breeds of dog have shorter bodies.

In general, the larger the training set, the lower the chance of error.

Classification Problem (Image source: en.proft.me)

In some cases though, a classification model may present a continuous value instead of a discrete one, which it does as a way of depicting the probability of a certain category being applicable.

For example, a specific animal may be assigned a probability of 0.9 for being a dog and 0.1 for being a cat.

It simply means there is a higher likelihood of that animal being a dog.

In such cases, the predicted probability is converted into a discrete class value by selecting the class with the highest probability.
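The conversion from probabilities to a class can be sketched in a couple of lines. The function name `to_class` is hypothetical, and the probabilities are the ones from the dog and cat example above.

```python
def to_class(probabilities):
    """Return the class whose predicted probability is highest."""
    return max(probabilities, key=probabilities.get)

predicted = to_class({"dog": 0.9, "cat": 0.1})
print(predicted)  # dog
```

The same function works for more than two classes, because it simply picks the largest value.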

2. Regression

The regression predictive algorithm does not identify a ‘category’ of the variable but assigns a quantity/number to it based on historical data.

It learns the relationship between an independent variable and a dependent variable from historical data and uses it to predict a quantity.

A common example of this is predicting temperature on a particular day or predicting commodity prices with respect to time.

The outputs in regression take continuous values, for instance any amount in the range $10,000 to $50,000.

Regression Problem (Image source: Stanford)

The image shows how a straight line is fitted as closely as possible to the data points in order to depict a linear relationship; this is called linear regression.

Now, if the system encounters a value of age, say 10, for which we do not have any training data examples, the system can still predict a numerical continuous value such as 1.3 – 1.4 based on the historical relationship.
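A minimal sketch of this kind of prediction is shown below. It fits a straight line by ordinary least squares and then evaluates it at an age that is missing from the training data. The numbers are invented for illustration and are not the data behind the image.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data; note that age 10 is not present.
ages = [2, 4, 6, 8, 12]
values = [0.9, 1.0, 1.15, 1.25, 1.5]

slope, intercept = fit_line(ages, values)
predicted = slope * 10 + intercept
print(round(predicted, 2))
```

For this made-up data the prediction for age 10 falls between 1.3 and 1.4, even though no training example has that age.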

Steps Involved in Supervised Learning

1. Determine the Type of Training Examples

For example, decide which kinds of shoe and sock images, or cat and dog examples, are to be fed in for training.

2. Prepare/Gather the Training Data

All the input and output values in labelled form have to be gathered. The set should be representative of the real-world applications of the function.

For example, if a particular breed makes up 30% of dogs in the real world, it should not make up 60% of the training data.

Even for semi-supervised learning, although most of the data is not labelled, it should still be relevant to real-world conditions.
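A quick check of this kind can be sketched as follows. The breed name, the assumed real-world shares, and the 10-point tolerance are all hypothetical choices made for the example.

```python
from collections import Counter

def class_shares(labels):
    """Return the fraction of the data that each label accounts for."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Assumed real-world shares and a deliberately skewed training set.
real_world_share = {"beagle": 0.3, "other": 0.7}
training_labels = ["beagle"] * 6 + ["other"] * 4

shares = class_shares(training_labels)
over_represented = [
    label for label, share in shares.items()
    if share - real_world_share[label] > 0.1   # tolerance chosen for illustration
]
print(shares)
print(over_represented)
```

Here the breed makes up 60% of the training labels against an assumed 30% in the real world, so it is flagged as over-represented.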

3. Determine the Input Feature Representation of the Learned Function

The input in practical cases will not be as simple as shoes and socks. It is complex, with multiple features, and is therefore typically represented as a vector.

The accuracy of the learned function depends on this representation. Too few features may not carry enough information to predict the output, while too many can make learning harder, so the number of features should be chosen with care.
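As a small illustration, the sketch below turns one raw record into a numeric feature vector with a fixed order. The three features chosen here are hypothetical; a real project would pick them to suit the problem.

```python
def to_feature_vector(item):
    """Map a raw record (a dict) to a fixed-order list of numbers."""
    return [
        float(item["weight_g"]),             # weight in grams
        float(item["height_cm"]),            # height in centimetres
        1.0 if item["has_laces"] else 0.0,   # yes/no feature encoded as 1 or 0
    ]

vector = to_feature_vector({"weight_g": 410, "height_cm": 12, "has_laces": True})
print(vector)  # [410.0, 12.0, 1.0]
```

Every example must be converted in the same way, so that position 0 always means weight, position 1 always means height, and so on.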

4. Select a Learning Algorithm

Selection of the correct algorithm matters in both supervised and unsupervised learning.

A variety of algorithms are available for selection and there is no single perfect one, but the one suitable to the problem’s need should be selected.

For example, a linear regression model can be selected when the relationship between variables is linear; a decision tree model can be selected where a decision on the final value is to be made based on a set of sequential input variables.


5. Run the Selected Algorithm on Training Data

Running the algorithm on the training data completes the training. Its performance can be tuned by measuring it on a validation set held out from the training data, or by cross-validation.
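Cross-validation splits the training data into several parts, called folds, and lets each fold take a turn as the validation set. The sketch below only produces the index splits. The function name `k_fold_indices` is hypothetical, and the training and scoring of a model on each split are left out.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra sample.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop

folds = list(k_fold_indices(6, 3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Each sample is used for validation exactly once, and the scores from the folds are usually averaged to compare different settings of the algorithm.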

6. Evaluate the Accuracy of the Learned Function Using Values from Test Set

The function is ready to use only after its performance has been measured on a test set that is separate from the training set.
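Steps 5 and 6 can be sketched together on a toy scale. The model below is deliberately simple: it learns a single weight threshold halfway between the two class averages. The data, the split, and the function names are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical labelled examples: (weight in grams, label).
examples = [
    (420.0, "shoes"), (45.0, "socks"), (390.0, "shoes"), (52.0, "socks"),
    (405.0, "shoes"), (48.0, "socks"), (430.0, "shoes"), (60.0, "socks"),
]
# Keep the last two examples aside; the model never sees them during training.
train_set, test_set = examples[:6], examples[6:]

def train(pairs):
    """Learn a threshold halfway between the average shoe and sock weights."""
    shoes = [weight for weight, label in pairs if label == "shoes"]
    socks = [weight for weight, label in pairs if label == "socks"]
    return (mean(shoes) + mean(socks)) / 2

def predict(threshold, weight):
    return "shoes" if weight > threshold else "socks"

def accuracy(threshold, pairs):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(threshold, weight) == label for weight, label in pairs)
    return correct / len(pairs)

threshold = train(train_set)
test_accuracy = accuracy(threshold, test_set)
print(threshold, test_accuracy)
```

A real test set would be much larger than two examples. The point is only that accuracy is measured on data that played no part in training.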

The following video by Google Cloud Platform gives a good basic idea about the steps involved in machine learning.

Common Issues Faced While Using Supervised Learning

(i) A lot of computation time is required for training and also for classification, especially when big data is involved.

(ii) Overfitting: The model may learn the noise in the training data so closely that it treats the noise as a genuine pattern instead of an inconsistency, and then generalizes poorly to new data.

(iii) Unlike an unsupervised model, a supervised model cannot create a new class: an input that belongs to none of the known classes is still assigned to one of the existing ones.
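Issue (ii), overfitting, can be seen even on a toy scale. In the sketch below, one training item is deliberately given the wrong label. A model that trusts only the single nearest neighbour memorizes that mistake, while a model that takes a vote among three neighbours smooths it out. The data and the choice of k-nearest neighbours are assumptions made for the example.

```python
from collections import Counter

# Hypothetical weights in grams. The item at 55.0 is really a sock
# but carries a wrong label, which plays the role of noise.
train = [
    (40.0, "socks"), (45.0, "socks"), (50.0, "socks"), (55.0, "shoes"),
    (60.0, "socks"), (380.0, "shoes"), (400.0, "shoes"), (420.0, "shoes"),
]
test = [(54.0, "socks"), (56.0, "socks"), (410.0, "shoes")]

def knn_predict(training, x, k):
    """Majority label among the k training items closest to x."""
    neighbours = sorted(training, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def accuracy(training, data, k):
    return sum(knn_predict(training, x, k) == y for x, y in data) / len(data)

for k in (1, 3):
    print(k, accuracy(train, train, k), accuracy(train, test, k))
```

With k = 1 the model is perfect on the training data, noise included, but gets the two test items near the mislabelled one wrong. With k = 3 it scores slightly lower on the training data and better on the test data, which is the usual signature of reduced overfitting.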

Practical Applications of Supervised Machine Learning

For beginners at least, knowing what supervised learning achieves is probably as important as, or more important than, simply knowing what it is.

A very large number of practical applications of the method can be outlined, but the following are some of the common ones:

(i) Detection of spam

(ii) Detection of fraudulent banking or other activities

(iii) Medical Diagnosis

(iv) Image recognition

(v) Predictive maintenance

With applications increasing each day across many fields, knowledge of machine learning is becoming an essential skill.


We hope this article has helped you move forward in that direction and that you will never need to ask something like ‘Exactly what is supervised learning really?’

If you are an aspiring Machine Learning Engineer and want to build your career in Machine Learning, enroll in our Machine Learning using Python Course.



