

Different Types Of Machine Learning


What is Machine Learning really?

As you probably know, there are several definitions of Machine Learning floating around the internet. One of the more reliable ones, commonly attributed to Arthur Samuel, is: “the field of study that gives computers the ability to learn without being explicitly programmed.” However, this is an older, informal definition.

Tom Mitchell provides a more modern definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Scratching your head? Don’t worry, let’s break it down with an example. Take playing checkers, where:

E = the experience of playing many games of checkers.

T = the task of playing checkers.

P = the probability that the program will win the next game.

If the machine’s performance at checkers, measured by how often it wins, improves as it plays more games, we can say that the machine is learning by itself. That is Machine Learning.

So, how do we categorise Machine Learning problems? In general, there are two main types of machine learning algorithms: Supervised Machine Learning and Unsupervised Machine Learning. As the field has developed, a third category, reinforcement learning, has emerged alongside them. Let’s dive into what these categories are and how they work.

Supervised Learning

In supervised learning, we are given a data set and already know what the correct output should look like, on the assumption that there is a relationship between the input and the output. Supervised learning problems are further categorized into regression and classification problems.

Regression

In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. For example, given data about the size of houses on the real estate market, try to predict their price. Another example: given a picture of a person, predict their age.

Classification

Classification, on the other hand, is finding the category an input belongs to, or in more academic terms, mapping input variables into discrete categories. A good way to spot a classification problem is to ask whether the answer is “this or that”: yes or no, 0 or 1, true or false. For example, if we take the house price example above and change the output to “sells for more or less than the asking price,” it becomes a classification problem. Another example: given a patient with a tumour, we have to predict whether the tumour is malignant or benign.

How Supervised learning works

To describe the supervised learning problem slightly more formally, our goal is, given a training set, to learn a function h : X → Y so that h(x) is a “good” predictor for the corresponding value of y. For historical reasons, this function h is called a hypothesis. Seen pictorially, the process is therefore like this:

Figure: How Supervised Learning Works (Source: Coursera)

When the target variable that we’re trying to predict is continuous, such as in our housing example, we call the learning problem a regression problem. When y can take on only a small number of discrete values (such as if, given the living area, we wanted to predict if a dwelling is a house or an apartment, say), we call it a classification problem.
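To make the idea of a hypothesis h : X → Y a bit more concrete, here is a minimal sketch in Python. It assumes a 1-nearest-neighbour rule as the learning method, chosen only because it is short; the function name `learn_nearest_neighbour` and all the numbers are made up for illustration. The same "learn h from a training set" shape covers both regression and classification:

```python
def learn_nearest_neighbour(training_set):
    """Return a hypothesis h: X -> Y built from (x, y) training pairs.

    Illustrative assumption: h(x) simply returns the y of the training
    example whose x is closest to the query (1-nearest-neighbour).
    """
    def h(x):
        nearest_x, nearest_y = min(training_set, key=lambda pair: abs(pair[0] - x))
        return nearest_y
    return h


# Regression: living area -> price (continuous target, made-up numbers).
price_data = [(800, 150_000), (1200, 210_000), (2000, 340_000)]
h_price = learn_nearest_neighbour(price_data)

# Classification: living area -> dwelling type (discrete target, made-up labels).
type_data = [(600, "apartment"), (900, "apartment"), (1800, "house"), (2500, "house")]
h_type = learn_nearest_neighbour(type_data)

print(h_price(1100))  # price of the closest known house
print(h_type(2000))   # label of the closest known dwelling
```

The only thing that changes between the two cases is the kind of value y takes: a number for regression, a category for classification.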

Algorithms

There are plenty of different algorithms for different kinds of problems. No algorithm is right or wrong in itself; some simply suit certain problems better than others. Supervised machine learning algorithms include Linear regression, Logistic regression, Random forest, KNN, Decision tree and so on. Let’s understand how a few of these supervised learning algorithms work.

Linear Regression

Linear regression estimates real (continuous) values from one or more input variables. In more technical terms, we establish a relationship between the independent and dependent variables by fitting a best-fit line (as in the real estate example). This line is known as the regression line and is represented by the linear equation Y = a*X + b, where,

Y— Dependent variable

a— Slope

X— Independent variable

b— Intercept

Source- AnalyticsVidya

Moreover, linear regression comes in two main types: simple and multiple. In simple linear regression there is only one independent variable, whereas in multiple linear regression, as the name suggests, there is more than one.
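As a rough illustration of fitting the regression line Y = a*X + b, here is a small sketch of simple linear regression using the ordinary least-squares formulas for the slope and intercept. The function name `fit_line` and the house data are made up for this example; the data is chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = a*X + b for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope a: covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # Intercept b: the fitted line passes through the point of means.
    b = mean_y - a * mean_x
    return a, b


# Made-up data: house size and price, lying exactly on price = 2 * size + 10.
sizes = [50, 80, 100, 120]
prices = [110, 170, 210, 250]

a, b = fit_line(sizes, prices)
predicted_price = a * 90 + b  # prediction for a house of size 90
print(a, b, predicted_price)
```

Real data will not sit perfectly on a line; the same formulas then give the line that minimises the sum of squared vertical distances to the points.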

Logistic Regression

Don’t be confused by its name: Logistic Regression is a classification algorithm. It estimates discrete values (such as 0 or 1) based on independent variable(s). It predicts the probability of a particular event occurring by fitting the data to a logistic function, so its output, as expected, lies between 0 and 1.

Source- AnalyticsVidya
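To see the logistic function at work, here is a minimal sketch of logistic regression with one input variable, trained by plain gradient descent. The training method, the learning rate, the number of epochs and the "hours studied vs. passed" data are all assumptions made for illustration; libraries typically use more refined optimisers.

```python
import math


def sigmoid(z):
    """Logistic function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.3, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Made-up data: hours studied -> passed the exam (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(hours, passed)
p_low = sigmoid(w * 1.0 + b)   # probability of passing after 1 hour
p_high = sigmoid(w * 4.0 + b)  # probability of passing after 4 hours
print(p_low, p_high)
```

To turn the probability into a class, we would usually apply a threshold, for example predicting "pass" when the probability is above 0.5.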

Decision Tree

This is one of the most popular algorithms. It is used mainly for classification problems, although it can also handle regression, and it is of course a supervised algorithm with a pre-defined target variable. In this algorithm, we split the sample into two or more sub-groups based on the most significant differentiator among the input variables. The split is chosen using techniques such as Gini, Chi-square, entropy etc.

Source- AnalyticsVidya
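Here is a small sketch of how a single split could be chosen using Gini impurity, one of the techniques mentioned above. It handles only one numeric input variable and one split, so it is a building block rather than a full decision tree; the function names `gini` and `best_split` and the tumour data are made up for illustration.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for a more mixed one."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def best_split(xs, labels):
    """Return (threshold, weighted_gini) for the best 'x <= threshold' split."""
    best = (None, gini(labels))  # start from the impurity with no split
    values = sorted(set(xs))
    # Candidate thresholds lie midway between neighbouring distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [l for x, l in zip(xs, labels) if x <= threshold]
        right = [l for x, l in zip(xs, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up data: tumour size -> diagnosis.
tumour_sizes = [1.0, 1.5, 2.0, 3.5, 4.0, 5.0]
diagnoses = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

threshold, impurity = best_split(tumour_sizes, diagnoses)
print(threshold, impurity)
```

A full tree would repeat this search on each resulting sub-group, across all input variables, until the groups are pure enough or a depth limit is reached.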

Unsupervised Learning

In contrast to Supervised learning, Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don’t necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data. With Unsupervised learning there is no feedback based on the prediction results. For example, take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on. This is a good example of clustering. A non-clustering example is the “Cocktail Party Problem”, where the goal is to pick out individual voices and music from a jumble of sounds at a cocktail party.

Algorithms

Unsupervised learning algorithms help with a wide range of problems, such as Social Network Analysis, Astronomical Data Analysis, and so on. Google News uses this approach as well. Neural networks can also be used for unsupervised learning, although they are not limited to it. Let’s understand how one of these algorithms works.

K-means (Clustering)

The goal of clustering is to create groups of data points such that points in different clusters are dissimilar while points within a cluster are similar. With k-means clustering, we want to cluster our data points into k groups. A larger k creates smaller groups with more granularity; a lower k means larger groups and less granularity.

Source- AnalyticsVidya
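The grouping idea can be sketched in a few lines of Python. This is a bare-bones version of the usual k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) for 2D points. Taking the first k points as starting centroids is a simplification made here to keep the example deterministic; practical implementations usually pick the starting centroids more carefully. The points are made up.

```python
def k_means(points, k, iterations=10):
    """Cluster 2D points into k groups; returns (centroids, assignments)."""
    centroids = [points[i] for i in range(k)]  # naive start: first k points
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments


# Made-up data: two visibly separate groups of points.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]

centroids, assignments = k_means(points, k=2)
print(assignments)
print(centroids)
```

Note that no labels are given anywhere: the groups come purely from how close the points are to each other.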

Reinforcement Learning

Reinforcement Learning is about how a machine, when exposed to an environment, trains itself through trial and error. The machine learns mainly from past experience and tries to find the best possible solution to a given problem. The past couple of years have seen a lot of improvement in this area. A prominent example is DeepMind’s AlphaGo, which beat a champion of the game Go in 2016.

The Reinforcement Learning Process

Figure: Reinforcement Learning Process (Source: Medium.com)

Let’s understand the learning process of the machine/agent through the example of an agent learning to play Super Mario Bros. The process can be modelled as a loop that works like this:

  • The agent receives state S0 from the environment, which in our case is the first frame of the game.
  • Based on that state S0, the agent takes an action A0, for example moving right/forward.
  • Right after that, the environment transitions to a new state S1, which is basically a new frame.
  • The environment gives some reward R1 to the agent (not dead: +1).
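The loop above can be sketched with a toy environment. Everything here is a stand-in: `CorridorEnv` is a hypothetical corridor the agent walks along (not Super Mario Bros), the `reset` and `step` method names are assumptions made for this example, and the agent follows a fixed "always move right" rule instead of learning. The point is only to show the state, action, reward cycle.

```python
class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to the end."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Start a new episode and return the initial state S0."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        if action == "right":
            self.position += 1
        done = self.position >= self.length
        reward = 1  # the agent is still alive after this step: +1
        return self.position, reward, done


def always_right(state):
    """Fixed policy used as a placeholder; a learning agent would adapt this."""
    return "right"


env = CorridorEnv(length=5)
state = env.reset()             # agent receives state S0
total_reward = 0
done = False
while not done:
    action = always_right(state)            # agent picks action A_t from S_t
    state, reward, done = env.step(action)  # environment returns S_t+1, R_t+1
    total_reward += reward

print(state, total_reward)
```

In actual reinforcement learning, the agent would use the rewards it collects to adjust how it chooses actions, so that its total reward grows over many episodes.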

Endnotes

By now, I am sure you have a good enough idea of the different machine learning types and algorithms to get started. Machine Learning is a field in which you learn much faster by doing than by only studying. I would suggest picking small problems, thinking about how Machine Learning could solve them, then finding an appropriate algorithm, and having fun. Do let us know in the comments if you have any questions about anything written above; we are happy to help.

Happy learning.

Guest Blogger (Data Science) at Digital Vidya. A data enthusiast who loves reading and diving deeper into the arts of Machine Learning and Data Science. Always eager to learn about new research and new ways to solve problems using ML and AI.
