**Introduction**

The world is growing rapidly, and so is technology. Every day you come across something you haven’t heard of before. Machine Learning is one such field, with new advancements taking place every day. However, in order to reach the summit (in our case, mastering Machine Learning techniques) you must start from the bottom. Linear regression is one of the earliest and most widely used algorithms in Machine Learning and a good starting point for novice Machine Learning wizards.

Therefore, in this tutorial on linear regression using Python, we will look at the model representation of the linear regression problem, followed by the representation of the hypothesis. After that, we will dive into how the cost function works and get a brief idea of what gradient descent is, before ending the tutorial with an example. Let’s get started.

**Model Representation**

To state the linear regression problem more technically: given a training set, our goal is to learn a function h: X -> Y so that h(x) is a *good* predictor for the corresponding value of y. Because we, people in Machine Learning, are not good at naming things, this ‘h’ is called the hypothesis, which is basically a function. To visualize it, look at the diagram below.

When the target variable is continuous, as our y is, the problem is a regression problem and we use regression algorithms to solve it. If y can take only discrete values, we call it a classification problem.

**How do we represent the Hypothesis?**

Now that we know what our hypothesis is, let’s take a look at how it works. Linear regression comes in two forms, univariate and multivariate, and the hypothesis is different for each. If you know some algebra, you will recall that the equation of a straight line is,

*Y = mX + c*

Where m is the slope of the line and c is a constant (the intercept). Our hypothesis is the same thing in one way or another,

*h(x) = theta0 + theta1(x)* (univariate)

*h(x) = theta0 + theta1(x1) + theta2(x2) + …* (multivariate)

Where x is our input variable, and theta0 and theta1 are what we call *parameters*. As you can see from the image below, the line is our hypothesis, linear in nature. You must be wondering, what about theta0 and theta1? What are they? What is their value? Where do we use them? Right? Let’s get to it.

Now that you have the equation for the hypothesis, you should select the values of theta0 and theta1 such that the line we plot over the data fits as closely as possible and gives us the output we desire. The question remains, how do we find those values? For that, there are several techniques which we will see later in this post.
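To make the two forms of the hypothesis concrete, here is a minimal sketch in Python. The function names `h_univariate` and `h_multivariate` and the input values are made up for illustration. They are not part of any library.

```python
import numpy as np

def h_univariate(x, theta0, theta1):
    """Univariate hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def h_multivariate(X, theta):
    """Multivariate hypothesis: h(x) = theta0 + theta1*x1 + theta2*x2 + ...

    X has one row per example and one column per feature.
    theta[0] is the intercept theta0; theta[1:] holds one weight per feature.
    """
    X = np.asarray(X, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return theta[0] + X @ theta[1:]

# Made-up inputs, only for illustration
x = np.array([1.0, 2.0, 3.0])
print(h_univariate(x, theta0=0.0, theta1=0.5))    # predictions 0.5, 1.0, 1.5

X = np.array([[1.0, 2.0], [3.0, 4.0]])
print(h_multivariate(X, theta=[1.0, 2.0, 3.0]))   # predictions 9.0, 19.0
```

Changing theta0 shifts the line up or down, while changing theta1 changes its slope. Picking good values for both is what the rest of this post is about.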

Let’s recap what we have seen so far. Linear regression is a supervised learning problem. Given some data, we predict the value of something. For example, given the areas and prices of 50 houses, we predict what the price of our house will be. In order to predict, we need a hypothesis, or function, to which we feed the data and which gives us the output. That hypothesis must fit the data well to give us the most accurate output. Clear until now? If not, I’d suggest you go back and read it once more. If you are clear, let’s move on to understand how we will derive theta0 and theta1.

**Cost Function**

The idea behind the cost function is that we choose the values of theta0 and theta1 such that h(x), our hypothesis (or, you could say, the output of the function), is close to y, the output variable, for our training examples (x, y). This cost function is also called the squared error function.
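For reference, the squared error cost function in its standard textbook form is written out here. The symbols $\theta_0$ and $\theta_1$ stand for theta0 and theta1, $m$ is the number of training examples, and $(x^{(i)}, y^{(i)})$ is the i-th training example. The superscript notation is an assumption on my part, since it follows the usual convention rather than anything defined earlier in this post.

$$
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2,
\qquad h(x) = \theta_0 + \theta_1 x
$$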

The equation is as shown above. Don’t mind my bad handwriting. Also, please note that m is the number of training examples, and the ½ is included to simplify the calculation at a later stage (it cancels the 2 that appears when the squared term is differentiated). Let’s take an example to make the picture even clearer. Our goal for this example is to find the values of theta0 and theta1 that give us the global minimum of the cost function.

- Let’s assume theta0 = 0 and theta1 = 1. Given that, if we plot our hypothesis, it would look something like this,

Now that we have theta0 and theta1, let’s find J(theta), our cost function. All output points match our hypothesis perfectly, so the difference between them is, of course, zero. As the image on the right side shows, the value of J(theta1) at theta1 = 1, which is J(1), is 0.

- Now let’s find the value for theta0 = 0 and theta1 = 0.5. For that, the plots would look like this,

As you can see, there’s a difference between the output value and our predicted value. If we plug the differences into our cost function equation, it gives us a value near 0.58, as marked.

- For the third example, let’s take theta0 and theta1 both equal to zero. That is a line along the x-axis.

That will give us a value near 2.3, as marked. Plotting J for different values of theta1 would give us something like this,

From this you can tell that J(theta1) reaches its minimum at theta1 = 1, where J(1) = 0, so we select that as our value for theta1. Now you know the core idea behind linear regression.
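The three cases above can be checked with a few lines of Python. This is a minimal sketch that assumes the training set is the three points (1, 1), (2, 2) and (3, 3). The plots are not reproduced here, so that data is an assumption, but it is consistent with the values of roughly 0.58 and 2.3 quoted above. The function name `cost` is hypothetical.

```python
import numpy as np

def cost(theta0, theta1, x, y):
    """Squared error cost: J = 1/(2m) * sum((h(x) - y)^2)."""
    m = len(x)                         # number of training examples
    predictions = theta0 + theta1 * x  # h(x) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# Assumed training set (not taken from the original plots)
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

for theta1 in (1.0, 0.5, 0.0):
    print(f"theta1 = {theta1}: J = {cost(0.0, theta1, x, y):.2f}")
# theta1 = 1.0: J = 0.00
# theta1 = 0.5: J = 0.58
# theta1 = 0.0: J = 2.33
```

The cost is smallest when the line passes through every point, and it grows as the line moves away from the data.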

**A Step Further**

Gradient descent is a first-order optimization algorithm, meaning it relies on the first derivative. Now that we have our hypothesis and cost function, we optimize the parameters using the derivative of the cost function. The slope tells us which way to move, and the size of each step is determined by the learning rate *alpha*.
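Here is a minimal sketch of batch gradient descent for the univariate hypothesis, to show how the slope and alpha work together. The training data, the value `alpha=0.1`, the iteration count and the function name `gradient_descent` are all assumptions chosen for illustration. A real problem would need its own learning rate and stopping rule.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, n_iters=2000):
    """Minimise J(theta0, theta1) = 1/(2m) * sum((h(x) - y)^2) by gradient descent."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0              # arbitrary starting point
    for _ in range(n_iters):
        error = (theta0 + theta1 * x) - y  # h(x) - y for every example
        grad0 = np.sum(error) / m          # dJ/dtheta0
        grad1 = np.sum(error * x) / m      # dJ/dtheta1
        # Step against the slope; alpha controls the step size.
        # Both parameters are updated from the same error vector.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Assumed toy data lying exactly on the line y = x
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

theta0, theta1 = gradient_descent(x, y)
print(f"theta0 = {theta0:.3f}, theta1 = {theta1:.3f}")
```

On this toy data the result should approach theta0 = 0 and theta1 = 1, the same minimum we found by hand in the cost function section. If alpha is too large the updates can overshoot and diverge, and if it is too small convergence becomes very slow.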

**Example**

Let’s look at some Python regression code to get a clearer idea of what we saw earlier. One of the easiest ways to implement a machine learning algorithm is to use the *scikit-learn* library. To know more about scikit-learn, visit their official website.

[Code]

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn import datasets, linear_model
    from sklearn.metrics import mean_squared_error, r2_score

    # Load the diabetes dataset
    diabetes = datasets.load_diabetes()

    # Use only one feature
    diabetes_X = diabetes.data[:, np.newaxis, 2]

    # Split the data into training/testing sets
    diabetes_X_train = diabetes_X[:-20]
    diabetes_X_test = diabetes_X[-20:]

    # Split the targets into training/testing sets
    diabetes_y_train = diabetes.target[:-20]
    diabetes_y_test = diabetes.target[-20:]

    # Create linear regression object
    regr = linear_model.LinearRegression()

    # Train the model using the training sets
    regr.fit(diabetes_X_train, diabetes_y_train)

    # Make predictions using the testing set
    diabetes_y_pred = regr.predict(diabetes_X_test)

    # The coefficients
    print('Coefficients: \n', regr.coef_)
    # The mean squared error
    print("Mean squared error: %.2f" % mean_squared_error(diabetes_y_test, diabetes_y_pred))
    # R^2 score (printed as "Variance score"): 1 is perfect prediction
    print('Variance score: %.2f' % r2_score(diabetes_y_test, diabetes_y_pred))

    # Plot outputs
    plt.scatter(diabetes_X_test, diabetes_y_test, color='black')
    plt.plot(diabetes_X_test, diabetes_y_pred, color='blue', linewidth=3)
    plt.xticks(())
    plt.yticks(())
    plt.show()

[Output]

Coefficients: [938.23786125]

Mean squared error: 2548.07

Variance score: 0.47

[Plot]
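To see what the two reported numbers mean, here is a minimal sketch that computes the same kind of metrics by hand with NumPy. The helper names `mse_manual` and `r2_manual` and the four data points are made up for illustration. They are not the diabetes data used above.

```python
import numpy as np

def mse_manual(y_true, y_pred):
    """Mean squared error: the average of the squared prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def r2_manual(y_true, y_pred):
    """R^2: 1 minus (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up targets and predictions
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 8.0]

print(f"Mean squared error: {mse_manual(y_true, y_pred):.3f}")  # 0.375
print(f"R^2 score: {r2_manual(y_true, y_pred):.3f}")            # 0.925
```

Note that this mean squared error has no ½ factor, unlike the cost function J from earlier. The ½ only makes the derivative tidier and does not change where the minimum lies. An R^2 of 1 means a perfect fit, so the 0.47 reported above tells us that a single feature explains only part of the variation in the target.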

**Endnotes**

If you don’t know where to start, learning Machine Learning can be a tedious task. But now that you have what you need to get started with your first algorithm, linear regression in Python, you are set to embark on your journey to become a Data Science or AI wizard. However, if any part of the article isn’t clear to you, feel free to leave a comment in the comment box below and we will make sure you don’t stay stuck.

Happy Learning.