### Introduction

In statistics, linear regression is a linear approach to modeling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; with more than one, it is called multiple linear regression.

### Description

In linear regression, the relationships are modeled using linear predictor functions whose unknown model parameters are estimated from the data. Such models are called linear models. Most commonly, the model describes the conditional mean of the response given the values of the explanatory variables. Less commonly, the conditional median or some other quantile is used. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of the response given the values of the predictors, rather than on the joint probability distribution of all of these variables, which is the domain of multivariate analysis.

### Practical Uses

Linear regression has many practical uses. Most applications fall into one of the following two broad categories:

If the goal is prediction, forecasting, or error reduction, linear regression can be used to fit a predictive model to an observed data set of values of the response and explanatory variables. Once such a model has been developed, if additional values of the explanatory variables are collected without an accompanying response value, the fitted model can be used to predict the response.

If the goal is to explain the variation in the response variable that can be attributed to variation in the explanatory variables, linear regression analysis can be applied to quantify the strength of the relationship between the response and the explanatory variables. In particular, it can help determine whether some explanatory variables have no linear relationship with the response at all, or identify which subsets of explanatory variables contain redundant information about the response.
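The two uses above can be sketched with a small, self-contained example. This is a minimal illustration, not a reference implementation: the data are made up, the helper names (`fit_simple_ols`, `r_squared`) are hypothetical, and it assumes a single explanatory variable fitted by ordinary least squares. R-squared is used here as one common way to quantify the strength of a linear relationship.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys):
    """Share of the variation in ys explained by a line fitted on xs."""
    intercept, slope = fit_simple_ols(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up data: one response and two candidate explanatory variables.
response = [2.0, 4.1, 5.9, 8.2, 9.8, 12.0]
related_input = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
unrelated_input = [3.0, 1.0, 2.0, 2.0, 1.0, 3.0]

# Use 1 (prediction): fit on observed pairs, then predict for a new input
# that arrives without an accompanying response value.
intercept, slope = fit_simple_ols(related_input, response)
new_input = 7.0
predicted_response = intercept + slope * new_input

# Use 2 (explanation): compare how much variation each input accounts for.
r2_related = r_squared(related_input, response)
r2_unrelated = r_squared(unrelated_input, response)

print(f"prediction for x={new_input}: {predicted_response:.2f}")
print(f"R^2 related: {r2_related:.4f}, R^2 unrelated: {r2_unrelated:.4f}")
```

An R-squared close to 1 suggests a strong linear relationship, while a value close to 0 suggests the input has little or no linear relationship with the response.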

### Machine Learning and Linear Regression

Linear regression plays a vital role in the subfield of artificial intelligence known as machine learning. The linear regression algorithm is one of the fundamental supervised machine learning algorithms because of its relative simplicity and well-known properties.

More specifically, the field of predictive modeling in machine learning is primarily concerned with minimizing the error of a model, or making the most accurate predictions possible, at the expense of explainability. In applied machine learning we borrow, reuse, and steal algorithms from many different fields, including statistics, and use them towards these ends.

Linear regression was developed in the field of statistics, where it is studied as a model for understanding the relationship between input and output numerical variables. It has since been borrowed by machine learning, so it is both a statistical algorithm and a machine learning algorithm. It is one of the best-known and best-understood algorithms in statistics and machine learning.

### Linear Regression Model Representation

Linear regression is an attractive model because its representation is so simple. The representation is a linear equation that combines a specific set of input values (x), the solution to which is the predicted output for that set of input values (y). As such, both the input values (x) and the output value (y) are numeric.

The linear equation assigns one scale factor to each input value or column. This factor is called a coefficient and is represented by the capital Greek letter Beta (B). One additional coefficient is also added, giving the line an additional degree of freedom; it is often called the intercept or the bias coefficient.

For example, in a simple regression problem (a single x and a single y), the form of the model would be:

y = B0 + B1*x

In higher dimensions, when we have more than one input (x), the line is called a plane or a hyper-plane. The representation is therefore the form of the equation together with the specific values used for the coefficients. It is common to talk about the complexity of a regression model like linear regression. This refers to the number of coefficients used in the model.
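As a rough sketch of this representation, the function below evaluates the linear equation for one or several inputs. The name `predict` and the coefficient values are made up for illustration; the coefficient list is assumed to be ordered as `[B0, B1, ..., Bn]`, with `B0` as the intercept.

```python
def predict(coefficients, inputs):
    """Evaluate y = B0 + B1*x1 + ... + Bn*xn.

    coefficients: [B0, B1, ..., Bn], intercept first (assumed ordering).
    inputs:       [x1, ..., xn], one numeric value per input column.
    """
    if len(coefficients) != len(inputs) + 1:
        raise ValueError("expected one coefficient per input plus an intercept")
    intercept, weights = coefficients[0], coefficients[1:]
    return intercept + sum(b * x for b, x in zip(weights, inputs))


# Simple regression: a single input, so the model is a line.
simple_model = [0.5, 2.0]            # B0 = 0.5, B1 = 2.0
y_simple = predict(simple_model, [3.0])

# Two inputs: the same equation now describes a plane.
plane_model = [1.0, 2.0, -0.5]       # B0, B1, B2
y_plane = predict(plane_model, [3.0, 4.0])

# Complexity, in the sense used above, is the number of coefficients.
print(y_simple, len(simple_model))
print(y_plane, len(plane_model))
```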

When a coefficient becomes zero, it effectively removes the influence of that input variable on the model, and therefore on the prediction made by the model (0 * x = 0). This becomes relevant when we look at regularization methods, which change the learning algorithm to reduce the complexity of regression models by putting pressure on the absolute size of the coefficients, driving some to zero.
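The effect of a zero coefficient can be illustrated with a short sketch. The text does not name a specific regularization method, so the `soft_threshold` step below is an assumption: it is the shrinkage operation associated with L1 (lasso-style) penalties, applied here to made-up coefficients rather than as part of a full fitting procedure. All names and values are hypothetical.

```python
def predict(coefficients, inputs):
    """Evaluate y = B0 + B1*x1 + ... + Bn*xn (intercept first)."""
    intercept, weights = coefficients[0], coefficients[1:]
    return intercept + sum(b * x for b, x in zip(weights, inputs))


def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero; small ones become exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


# Made-up coefficients: B0 (intercept), B1, B2.
coefficients = [1.0, 2.0, 0.3]
penalty = 0.5  # assumed penalty strength, chosen only for illustration

# The intercept is conventionally left unpenalized.
shrunk = [coefficients[0]] + [soft_threshold(b, penalty) for b in coefficients[1:]]

# B2 is now zero, so the second input no longer affects the prediction.
y_a = predict(shrunk, [3.0, 10.0])
y_b = predict(shrunk, [3.0, -99.0])
print(shrunk, y_a, y_b)
```

With the second coefficient at zero, the model effectively has one fewer input, which is the sense in which regularization reduces model complexity.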

### How to Prepare Data for Linear Regression

Linear regression has been studied at great length, and many requirements and expectations have been described for how the data should be structured. There is a lot of sophistication in these requirements, which can be intimidating. In practice, we can treat them more like rules of thumb when using Ordinary Least Squares Regression, the most common implementation of linear regression. Try different preparations of the data using these heuristics and see what works best for our problem.

**Linear Assumption.** Linear regression assumes that the relationship between our input and output is linear. It does not support anything else. This may be obvious, but it is good to remember when we have a lot of attributes. We may need to transform data to make the relationship linear (e.g. a log transform for an exponential relationship).

**Remove Noise.** Linear regression assumes that our input and output variables are not noisy. Consider using data cleaning operations that let us better expose and clarify the signal in our data. This is most important for the output variable, and we want to remove outliers in the output variable (y) if possible.

**Remove Collinearity.** Linear regression will over-fit our data when we have highly correlated input variables. Consider calculating pairwise correlations for our input data and removing the most correlated.

**Gaussian Distributions.** Linear regression will make more reliable predictions if our input and output variables have a Gaussian distribution. We may get some benefit from using transforms on variables to make their distribution more Gaussian looking.

**Rescale Inputs.** Linear regression will often make more reliable predictions if we rescale input variables using standardization or normalization.
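Three of these heuristics (the log transform, the pairwise correlation check, and standardization) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the data are made up, the helper names are hypothetical, and the 0.9 correlation cutoff is an arbitrary choice rather than a recommended value. Noise removal and Gaussian transforms are not covered here.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    m = mean(values)
    std = math.sqrt(mean([(v - m) ** 2 for v in values]))
    return [(v - m) / std for v in values]


# Linear assumption: an exponential relationship becomes linear after a log.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0 * math.exp(0.5 * v) for v in x]
r_raw = pearson(x, y)
r_log = pearson(x, [math.log(v) for v in y])

# Collinearity: flag input pairs whose absolute correlation exceeds a cutoff.
inputs = {
    "a": x,
    "b": [2.0 * v + 1.0 for v in x],          # a linear copy of "a"
    "c": [3.0, 1.0, 2.0, 2.0, 1.0, 3.0],      # unrelated to "a"
}
cutoff = 0.9  # assumed threshold for illustration
names = sorted(inputs)
collinear_pairs = [
    (p, q)
    for i, p in enumerate(names)
    for q in names[i + 1:]
    if abs(pearson(inputs[p], inputs[q])) > cutoff
]

# Rescaling: standardize one input column.
x_standardized = standardize(x)

print(f"correlation before log: {r_raw:.3f}, after log: {r_log:.3f}")
print("highly correlated pairs:", collinear_pairs)
```

In this sketch, one input from each flagged pair would be a candidate for removal before fitting.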