Tuesday, November 14, 2017

Basics of Data science and machine learning

Hypothesis function = prediction function.
Example: linear regression (i.e. the graph of the values predicted by a linear regression function is a line). The formula for this function is the equation of a line: H(x) = a + b*x. Another popular notation is h(x) = θ0 + θ1*x. Here θ0 and θ1 are the parameters that the learning algorithm tunes to make good predictions.

Here, x is the input variable/feature, e.g. the size of a house, and H(x) is the predicted price of the house.
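The hypothesis function above can be sketched in a couple of lines of Python (the parameter values below are made up, just for illustration):

```python
def h(x, theta0, theta1):
    """Linear hypothesis: predict a value for input feature x using a line."""
    return theta0 + theta1 * x

# e.g. predict a house price from its size, with hypothetical parameters
price = h(100, 50.0, 2.0)  # 50 + 2*100 = 250.0
```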

Training examples = existing data containing the features (i.e. x1, x2, x3, ...) and the corresponding value we want to predict. E.g. we have data about houses: size of the house, no. of rooms, year built, and price. This data, which we use to fit the prediction (hypothesis) function, is called the training examples.

Cost function - the function that measures the accuracy of our hypothesis function, i.e. the accuracy of our predictions (used to find the best possible line in a linear model).

Cost function in linear regression = sum of squared errors
      = sum of (actual value - predicted value)^2 over the entire dataset
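The sum-of-squared-errors cost above is easy to write out directly (the sample dataset and parameter values below are hypothetical):

```python
def cost(theta0, theta1, xs, ys):
    """Sum of squared errors of the line theta0 + theta1*x over the dataset."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3]
ys = [2, 4, 6]          # data that lies exactly on the line y = 2x
cost(0.0, 2.0, xs, ys)  # a perfect fit gives cost 0.0
cost(2.0, 0.0, xs, ys)  # a bad fit gives a large cost
```

A lower cost means a better fit, which is why minimising this function gives the best line.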

Gradient Descent
It is an iterative algorithm to find the minimum value of a cost function - i.e. to find the parameter values of the hypothesis function for which the cost (i.e. the inaccuracy) is minimum. It works by repeatedly adjusting each parameter a small step in the direction that decreases the cost.
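A minimal sketch of gradient descent for the one-variable linear case, using the squared-error cost described above (the learning rate alpha and the number of steps are hypothetical settings, not values from the original notes):

```python
def gradient_descent(xs, ys, alpha=0.1, steps=5000):
    """Find theta0, theta1 minimising the squared-error cost by
    repeatedly stepping against the gradient of the cost."""
    theta0 = theta1 = 0.0
    m = len(xs)
    for _ in range(steps):
        # prediction errors for the current parameters
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # partial derivatives of the mean squared-error cost
        grad0 = sum(errs) / m
        grad1 = sum(e * x for e, x in zip(errs, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# fit data lying on the line y = 2x; the result approaches theta0=0, theta1=2
theta0, theta1 = gradient_descent([1, 2, 3, 4], [2, 4, 6, 8])
```

If alpha is too large the steps overshoot and the cost diverges; if it is too small, convergence is slow. Picking a good learning rate is part of tuning the algorithm.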

Multivariate Linear Regression

Also known as Linear regression with Multiple variables.

Multiple features x1, x2, x3, ..., xn and multiple parameters θ0, θ1, θ2, θ3, ..., θn.
Hypothesis function: h(x) = θ0*x0 + θ1*x1 + θ2*x2 + θ3*x3 + ... + θn*xn, where x0 = 1 by convention, so that θ0 keeps its role as the intercept.
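This multivariate hypothesis is just a dot product of the parameter vector and the feature vector; a small sketch (the numbers below are hypothetical):

```python
def h(thetas, xs):
    """Multivariate hypothesis: dot product of parameters and features.
    The feature vector xs must start with x0 = 1 so that thetas[0]
    acts as the intercept term."""
    return sum(t * x for t, x in zip(thetas, xs))

# e.g. two features (x1, x2) plus the constant x0 = 1
h([1.0, 2.0, 3.0], [1, 10, 100])  # 1 + 20 + 300 = 321.0
```

In practice this is usually written as a vector operation (e.g. a NumPy dot product) rather than an explicit loop.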


Watch this space for more updates!
