## Introduction

Enough time has passed for us to know what Machine Learning is! But Machine Learning is a broad area, because there are several techniques you can use to analyze your data.

However, before learning the advanced concepts, we must learn some common Machine Learning tools and techniques to understand what is really happening in the much-hyped Machine Learning world. Therefore, in this post, we are going to explore different types of Machine Learning techniques.

But before diving into that, let’s talk about terminology. Here I am going to use three different terms: *techniques*, *algorithms*, and *models*. Let me explain each one.


Techniques in Machine Learning refer to ways of solving problems. For example, regression (which we will see later on) is a technique to predict a value. To perform regression, a Data Scientist would apply a specific algorithm, such as linear regression, to get the job done.

And finally, having applied an algorithm to some data, the end result is a trained model, to which you can feed new data to generate new results. Don’t worry if you didn’t get all of that; it will become lucid as you read on.

## 5 Essential Machine Learning Techniques

Let’s walk right through some of the techniques in Machine Learning.

### #1 Regression

In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. For example, given data about the size of houses on the real estate market, we try to predict their price. Another example: given a picture of a person, we have to predict their age.

*Linear regression* is one of the most widely used and well-understood regression algorithms in Machine Learning as well as statistics. It simply estimates real values based on continuous variable(s). In more technical terms, we establish a relationship between the independent and dependent variables by fitting the best line (as in the real estate example). This line is known as the regression line and is represented by the linear equation Y = a*X + b, where Y is the dependent variable, a is the slope, X is the independent variable, and b is the intercept.
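To make the equation concrete, here is a minimal sketch that fits Y = a*X + b by ordinary least squares in plain Python. The house sizes and prices are hypothetical figures, not real market data:

```python
# Fit Y = a*X + b by ordinary least squares on toy house-size/price data.
sizes = [50, 80, 100, 120, 150]     # square metres (hypothetical)
prices = [150, 240, 300, 355, 450]  # thousands of dollars (hypothetical)

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n

# Least-squares slope a and intercept b.
a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
     / sum((x - mean_x) ** 2 for x in sizes))
b = mean_y - a * mean_x

# Predict the price of a 130 m^2 house from the fitted line.
predicted = a * 130 + b
print(round(a, 3), round(b, 3), round(predicted, 1))
```

With this toy data the slope comes out close to 3, i.e. roughly three thousand dollars per extra square metre.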

There are also several other techniques to fit a linear regression model, including simple linear regression, ordinary least squares, gradient descent, and regularization. Regularization techniques in Machine Learning come in handy when your model overfits the training data.

### #2 Classification

Classification is basically predicting the class to which our data points belong. As we know, the Machine Learning community is not always good at naming things, so a class is sometimes also referred to as a *target*, *label*, or *category*.


Classification is in the same category as regression: supervised learning. For instance, spam detection in emails can be framed as a classification problem. It is a binary classification, since there are only two classes, *spam* or *no spam*. The applications of classification are wide; it can be useful in domains such as credit approval, medical diagnosis, target marketing, and so on.

Diving a little deeper, there are two types of learners in classification: *lazy learners* and *eager learners*. Lazy learners store the training data and wait until test data appears; hence, they spend less time training but more time predicting. An example is k-nearest neighbors. Eager learners, on the other hand, construct a classification model from the training data before classifying anything. Examples of eager learners are decision trees and Naïve Bayes.
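To illustrate a lazy learner, here is a minimal 1-nearest-neighbour sketch on hypothetical spam features (number of links, number of exclamation marks). Notice that "training" is just storing the data; all the distance work happens at prediction time:

```python
# A tiny lazy learner: 1-nearest-neighbour spam classifier on
# hypothetical (link count, exclamation mark count) feature pairs.
train = [((8, 5), "spam"), ((7, 9), "spam"),
         ((1, 0), "no spam"), ((0, 1), "no spam")]

def predict(point):
    # "Training" was just storing `train`; the work happens at query time.
    def dist2(p):
        return (p[0] - point[0]) ** 2 + (p[1] - point[1]) ** 2
    features, label = min(train, key=lambda item: dist2(item[0]))
    return label

print(predict((6, 6)))   # near the spam examples
print(predict((0, 0)))   # near the no-spam examples
```

An eager learner such as a decision tree would instead do its work up front, building rules from `train` before any query arrives.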

### #3 Clustering

Clustering is a common unsupervised technique in Machine Learning which groups similar data points together. Given some data points, we can use a clustering algorithm to assign each data point to a specific group. In Data Science, it is heavily used to gain valuable insights from data. There are many clustering algorithms; the most popular and widely used for solving clustering problems is K-means clustering.

In a typical K-means visualization, you can see data points clustered into, say, five categories, with arrows tracing part of the process of calculating the clusters and their boundaries. This approach is frequently used for customer segmentation.

You can evaluate credit risk, or you can even do things like finding similarities between written documents. Basically, if you have a large amount of data and don’t know where to start, clustering data points into groups is a good way to begin.

### #4 Regularization

One major aspect of training your model is avoiding overfitting. An overfit model will have low accuracy on new data, because it has tried too hard to capture the noise in the training dataset. There are several methods to deal with overfitting, such as cross-validation. Another one is regularization: a technique which discourages learning an overly complex or flexible model, thereby reducing the risk of overfitting.

So, what do we achieve with regularization? If you know your way around the bias-variance trade-off: regularization significantly reduces the variance of the model without substantially increasing its bias. A popular library implementing all of these algorithms is *scikit-learn*; find yourself some data and play with it to get a better idea of how these things work.
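As a tiny illustration of that shrinkage effect, here is a one-dimensional ridge regression sketch on hypothetical, pre-centred data. The closed-form slope shows how a larger penalty lambda pulls the coefficient toward zero; in practice you would reach for scikit-learn’s Ridge or Lasso rather than hand-rolling this:

```python
# One-dimensional ridge regression (data pre-centred, no intercept):
# minimise sum((y - w*x)^2) + lam * w^2, which has the closed form below.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -2.2, 0.1, 2.0, 4.3]   # roughly y = 2x plus noise (hypothetical)

def ridge_slope(xs, ys, lam):
    # Closed form: w = sum(x*y) / (sum(x^2) + lam); lam = 0 is plain OLS.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

for lam in (0.0, 1.0, 10.0):
    print(lam, round(ridge_slope(xs, ys, lam), 3))  # slope shrinks as lam grows
```

The shrunken slope fits the training noise slightly worse (a little more bias) but is less sensitive to which particular noisy sample you trained on (much less variance).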

### #5 Anomaly detection

Sometimes you don’t want to group things or classify them into categories. Instead, what you are looking for is something unusual, something that stands out in some way. That approach is anomaly detection.


It is a classic technique in practical data mining, because in real life finding outliers is a tedious task. Anomalies can be broadly categorized as:

- *Point anomalies* — a single instance of data is anomalous. For example, flagging a credit card transaction because of an unusually large amount spent.

- *Contextual anomalies* — the abnormality is context-specific. For example, spending $100 on food every day is alright during the holiday season but not otherwise.

- *Collective anomalies* — a set of data instances is anomalous collectively. For example, a potential cyber attack flagged because someone is trying to copy data from a remote machine to the localhost.
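A minimal point-anomaly sketch on hypothetical daily card spends: flag any amount more than a chosen number of standard deviations from the mean. The threshold of 2 here is a tunable assumption, not a universal rule, and real systems use far more robust methods:

```python
import statistics

# Hypothetical daily credit card spends; the last one is suspicious.
spends = [42, 38, 55, 47, 51, 44, 39, 48, 50, 43, 975]

mean = statistics.mean(spends)
sd = statistics.stdev(spends)

# Point anomalies: values more than 2 standard deviations from the mean.
anomalies = [s for s in spends if abs(s - mean) > 2 * sd]
print(anomalies)
```

Contextual and collective anomalies need more machinery, since you must model the context (season, time of day) or relationships between instances rather than each value in isolation.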

#### Endnotes

These five techniques, though by no means an exhaustive list, are the ones I find most basic to start with. The applications and advantages of Machine Learning techniques are endless, from detecting cancer to predicting stock prices to self-driving cars, and so on. Finding the right technique to solve the right problem is the key to success in Machine Learning.

To build a successful career in Machine Learning, enroll in Digital Vidya’s Machine Learning Course using Python.

If you have any doubts about these techniques, do let us know in the comments. We will be happy to resolve your doubts.

Happy learning.
