The term ‘Machine Learning’ sounds quite fantastic and is in the air all the time. Here we are going to learn about Machine Learning (ML).
Before heading towards ML, there is some vocabulary to know:
Data- facts and statistics collected together for reference or analysis; in other words, things known or assumed to be facts that form the basis of reasoning or calculation.
Data Analysis- Data analysis is a method in which data is collected and organized so that one can derive helpful information from it. In other words, the main purpose of data analysis is to see what the data is trying to tell us.
For example, you can only extract from a database the data that is actually stored in it; if you ask for any other value, you get an error or nothing back.
Data Analytics- Data analytics uses recorded (past) data to predict the future. In other words, it looks at likely future outcomes using historical data.
What is Machine Learning?
From birth, we humans see, hear, touch, taste and smell (through our sense organs), and we come across each object many times before we remember it.
This is how we have learned from the very start until now.
For example, since childhood you have seen mobile phones many times, which builds an image in your brain of what a mobile looks like. So when someone asks ‘Is this a mobile phone?’, you can answer easily.
A machine doesn’t have sense organs, so it can’t see or feel an object; it relies on data. Yes, data is everything to a machine, just like food is everything to us.
Machine Learning is a technique in which the machine learns from recorded data to predict what will happen in the future. For example, take seasons (weather, not Netflix): we know roughly when monsoon, spring and autumn arrive through our own experience since childhood. A machine/program can do the same if we feed it data from, say, the last 10 years (the number can vary), and then it will predict.
Why Machine Learning?
The one and only reason I know of is that “machines are much faster than humans, and everyone knows that”. (If you know any other, please let me know.)
NOTE: Machine here does not mean your complete system, computer or laptop. Here machine means a program written by a human in some programming language; writing it is called coding.
The main question is what you want from your program: which type of answer you want, a number or a yes/no (Boolean) value. Here two types of prediction come into play:
Regression- Regression is a statistical method for estimating the relationship between one dependent variable and one or more independent variables. In layman’s terms, it means continuous progression: the predicted value can be any number along a continuous range.
Here, x is the independent variable and y is the dependent variable.
(Don’t get confused; everything will become clear.)
There are various types of regression; here we’re going to discuss two:
Linear Regression: linear means straight, or progressing from one stage to another in a single series of steps, and regression is continuous progression.
So progressing continuously in a straight line is called Linear Regression.
In (simple) linear regression there is only one independent variable and one dependent variable. In general, x is independent, meaning its value doesn’t depend on anything else, while y is dependent because its value is computed from x using the weight ‘a’ and the constant ‘b’.
y = ax + b is the formula of linear regression. Ummm, let me think: only one x and one y, which means I can plot a graph.
Here a straight line is plotted on the graph (we will talk about the blue dots later).
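To see what the formula y = ax + b means in practice, here is a minimal Python sketch that computes y for a few values of x. The weight a = 2 and the constant b = 1 are made up purely for illustration; they are not taken from any dataset in this post.

```python
def line(x, a, b):
    """Return y = a*x + b for a single value of x."""
    return a * x + b


# Illustrative (assumed) weight and constant, not fitted from real data.
a, b = 2.0, 1.0

xs = [0, 1, 2, 3]
ys = [line(x, a, b) for x in xs]
print(ys)  # [1.0, 3.0, 5.0, 7.0]
```

Every time x goes up by 1, y goes up by a, and b is the value of y when x is 0. Plot these points and you get a straight line.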
Multi-linear Regression: multi means more than one, so this is regression with more than one independent variable. In the real world, our data is mostly multilinear.
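With more than one independent variable, the formula grows to y = a1*x1 + a2*x2 + ... + b, with one weight per variable. Here is a small sketch of that idea; the function name predict_multi and all the numbers are assumptions chosen only to make the arithmetic easy to follow.

```python
def predict_multi(xs, weights, b):
    """Return y = a1*x1 + a2*x2 + ... + b for one row of inputs."""
    if len(xs) != len(weights):
        raise ValueError("need exactly one weight per independent variable")
    return sum(a * x for a, x in zip(weights, xs)) + b


# Made-up weights and constant for two independent variables.
weights = [0.5, 2.0]
b = 3.0

row = [4.0, 1.5]  # one observation: x1 = 4.0, x2 = 1.5
y = predict_multi(row, weights, b)
print(y)  # 8.0  (0.5*4.0 + 2.0*1.5 + 3.0)
```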
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — -
Classification: classification is the process of sorting a given set of data into classes or groups.
How to know when to use regression and when to use classification?
If the answer you want is a continuous value, use regression; if you want your answer in the form of ‘true or false’, ‘yes or no’, ‘placed or not placed’, etc., use classification.
For example, predicting what salary you will get after being placed is a regression problem, but predicting whether you will be placed or not is a classification problem.
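The same rule of thumb can be written as a tiny Python helper. This is only a rough sketch: the function suggest_task is hypothetical, the sample values are made up, and the rule is deliberately simple (for instance, it would wrongly call class labels stored as 0 and 1 a regression problem).

```python
def suggest_task(targets):
    """Guess 'classification' or 'regression' from example target values.

    Rule of thumb only: booleans and strings are treated as class labels,
    anything else (ints, floats) as a continuous quantity.
    """
    if all(isinstance(t, (bool, str)) for t in targets):
        return "classification"
    return "regression"


salaries = [350000.0, 420000.0, 510000.0]    # "what salary will I get?"
placed = ["placed", "not placed", "placed"]  # "will I be placed or not?"

print(suggest_task(salaries))  # regression
print(suggest_task(placed))    # classification
```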
The theoretical part ends here for now. Big relief.
Now for some practical work. For this, I’m using Python (version 3.6).
Python has lots of libraries that make our work easy.
Now again, why do we need machine learning?
The equation y = ax + b has four variables: y, a, x and b. X and Y are given by the user, which leaves two unknowns, a and b. If there are only a few values of X and Y, a human can work these out by hand, but what if there are thousands of X and thousands of Y? A human would take a long time, while the machine can do it in seconds (wow!) and then predict Y for any value of X.
Now come the three main lines, which will:
(i) Give the formula
(ii) Calculate weight (a) and constant (b)
(iii) Predict Y according to X
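The three steps above can be sketched in plain Python as follows. This is not necessarily the code used for this post: SimpleLinearModel is a hypothetical stand-in for a library model, written out by hand so that nothing is hidden, and the data is made up so that it lies exactly on y = 2x + 1. The weight and constant are computed with ordinary least squares, which is one standard way to fit a straight line.

```python
class SimpleLinearModel:
    """Hypothetical stand-in for a library model: y = a*x + b."""

    def __init__(self):
        self.a = None  # weight (slope)
        self.b = None  # constant (intercept)

    def fit(self, xs, ys):
        """Compute a and b by ordinary least squares."""
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sxx = sum((x - mean_x) ** 2 for x in xs)
        if sxx == 0:
            raise ValueError("all x values are identical; cannot fit a line")
        self.a = sxy / sxx
        self.b = mean_y - self.a * mean_x
        return self

    def predict(self, x):
        """Return the predicted y for a new x."""
        return self.a * x + self.b


# Made-up data that lies exactly on y = 2x + 1.
X = [1, 2, 3, 4, 5]
Y = [3, 5, 7, 9, 11]

model = SimpleLinearModel()  # (i) give the formula
model.fit(X, Y)              # (ii) calculate weight (a) and constant (b)
print(model.a, model.b)      # 2.0 1.0
print(model.predict(6))      # (iii) predict Y for a new X: 13.0
```

Real data never sits perfectly on a line, so the fitted a and b will only approximate it; the fitting step is the same.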
The accuracy of the predicted value depends, among other things, on how much data you have. As a rule of thumb, the more representative data you feed in, the better the predictions.
For example, I have two independent variables, duration and internals (that makes it multilinear, but you can still relate), and external as the dependent variable. My program reads the data and fits the weights and the constant to it, and when I give it a new x, it returns a predicted value for it.
In figure (i), the red line is called the regression line or best-fit line; it is the line drawn by your program. The blue dots are the actual data points, that is, the observed Y for each X. The vertical distance between a blue dot and the red line at a particular X is called the error or loss.
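To make the error concrete, here is a short sketch that computes it for a few points. The data and the line y = 2x + 1 are made up for illustration, and averaging the squared errors (mean squared error) is just one common way to boil all the errors down to a single number; this post does not say which measure it uses.

```python
def errors(xs, ys, a, b):
    """Return actual y minus predicted y (a*x + b) for every point."""
    return [y - (a * x + b) for x, y in zip(xs, ys)]


def mean_squared_error(errs):
    """Average of the squared errors: one common way to summarise loss."""
    return sum(e ** 2 for e in errs) / len(errs)


# Made-up "blue dots" and an assumed "red line" y = 2x + 1.
X = [1, 2, 3, 4]
Y = [3.5, 4.5, 7.5, 8.5]
a, b = 2.0, 1.0

errs = errors(X, Y, a, b)
print(errs)                      # [0.5, -0.5, 0.5, -0.5]
print(mean_squared_error(errs))  # 0.25
```

A positive error means the dot is above the line and a negative error means it is below. Squaring stops the two from cancelling each other out.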
In the next post, we will cover the whole concept of multilinear regression, error, how to split a dataset into training and test sets, and lots more.