In this article I have tried to cover the main concepts of Machine Learning. For a detailed analysis, you can click on the headings or sub-headings.
What is Machine Learning?
Machine Learning is a branch of Artificial Intelligence that uses algorithms to enable a system to learn from training data rather than through explicit programming. Instead of writing the logic yourself, you train a generic algorithm, and the resulting model builds the logic from the training data.
The process followed by Machine Learning:
- Preparing data
- Training the model on the data
- Generating a Machine Learning model
- Making and refining predictions
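The four steps above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy data set, the very simple nearest-centroid model, and the function names `train_centroids` and `predict` are all invented for this example and are not taken from any library.

```python
# Step 1: prepare data. Toy (feature, label) pairs, invented for illustration.
data = [(1.0, "small"), (1.5, "small"), (2.0, "small"),
        (8.0, "large"), (8.5, "large"), (9.0, "large")]
train, test = data[::2], data[1::2]  # simple alternating train/test split

def train_centroids(rows):
    """Steps 2-3: 'train' by computing the mean feature value per label."""
    sums, counts = {}, {}
    for x, label in rows:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, x):
    """Step 4: predict the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))

model = train_centroids(train)
accuracy = sum(predict(model, x) == y for x, y in test) / len(test)
print(model, accuracy)
```

Refining the predictions would then mean going back to the earlier steps, for example collecting more data or choosing a different model, and measuring the accuracy again.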
Machine Learning is further classified into three types, namely
- Supervised learning
Supervised learning, as the name suggests, involves a supervisor acting as a teacher. It is learning in which we teach or train the machine using well-labeled data, meaning the data is already tagged with the correct answer.
The machine is then given a new set of data (input data), and the supervised learning algorithm applies what it learned from the labeled training data to produce the correct outcome. It is further categorized into two parts: classification and regression.
Supervised learning algorithms:
- Linear regression
- Nearest neighbor
- Decision tree
- Gaussian Naive Bayes
- Random forest
- Support vector machines (SVM)
- Unsupervised learning
Unsupervised learning is the training of a machine using data that is neither classified nor labeled, allowing the algorithm to act on that data without guidance. Here the role of the machine is to group unsorted data according to similarities, differences, and patterns without any prior training on labeled data. It is further categorized into two parts
- Reinforcement learning
It’s the ability of an agent to interact with its environment and find the best outcome. It follows a trial-and-error approach. The agent is rewarded with a point for a correct answer and penalized for a wrong one, and the model trains itself based on the reward points it receives. Once trained, it can predict results for new input data.
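To make the reward-and-penalty idea concrete, here is a minimal sketch of trial-and-error learning on a toy problem with two possible actions. Everything in it is an assumption for illustration: the function name `run_bandit`, the fixed rewards (+1 for the correct action, -1 for the wrong one), and the epsilon-greedy strategy, which is just one common way to balance trying new actions against repeating the best known one.

```python
import random

def run_bandit(rewards, episodes=200, epsilon=0.1, seed=0):
    """Learn action values by trial and error on a toy multi-action problem."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)   # estimated value of each action
    counts = [0] * len(rewards)     # how often each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:                  # explore: random action
            action = rng.randrange(len(rewards))
        else:                                       # exploit: best estimate so far
            action = max(range(len(rewards)), key=lambda a: values[a])
        reward = rewards[action]                    # +1 reward or -1 penalty
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values

# Hypothetical environment: action 0 is "wrong" (-1), action 1 is "correct" (+1).
values = run_bandit(rewards=[-1.0, 1.0])
best_action = max(range(len(values)), key=lambda a: values[a])
print(values, best_action)
```

After a few tries the agent's estimates favor the rewarded action, which is the sense in which the model "trains itself" from the points it receives.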
Why is Machine Learning important?
Its growing prevalence in society and everyday life has made Machine Learning one of the most important aspects of our lives. Everyday examples include:
- Recommendations of what to watch on YouTube or Netflix
- Ads and messages that appear online
- Voice assistants like Alexa and Siri
- The emergence of self-driving cars
- Character recognition or facial detection
Applications of Machine Learning
- Classification – dividing objects into classes
- Regression – discovering relationships between variables
- Clustering – grouping objects based on similar characteristics
Decision Tree Analysis is a basic predictive modeling tool with applications across many different areas. Generally, decision trees are built by an algorithm that identifies ways to split a data set based on different conditions.
It’s among the most commonly used and useful techniques for supervised learning. It is a non-parametric supervised learning method used for both regression and classification. The aim is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
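As a rough illustration of how a tree might choose one of its splitting conditions, the sketch below scores every possible threshold on a single feature and keeps the one that separates the labels best. It assumes Gini impurity as the quality measure (one common choice among several), and the helper names `gini` and `best_split` as well as the data are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return the threshold on one feature with the lowest weighted impurity."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical data: hours studied vs. exam result.
hours = [1, 2, 3, 6, 7, 8]
result = ["fail", "fail", "fail", "pass", "pass", "pass"]
print(best_split(hours, result))
```

A full decision tree repeats this search on each resulting group until the groups are pure or some stopping rule is reached; the learned rule here reads as "if hours <= threshold then fail, else pass".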
Regression, probably the most popular type of machine learning model, estimates the relationships between variables. It is a statistical method for modeling the relationship between a dependent (target) variable and one or more independent (predictor) variables.
Types of regression in Machine Learning
- Linear Regression
- Logistic Regression
- Polynomial Regression
- Support Vector Regression
- Decision Tree Regression
- Random Forest Regression
- Ridge Regression
- Lasso Regression
Linear Regression is an ML algorithm based on supervised learning. It performs a regression task: it models a target prediction value based on independent variables. It’s mostly used for finding the relationship between variables and for forecasting. Regression models differ in two respects: the type of relationship they assume between the dependent and independent variables, and the number of independent variables they use.
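For the simplest case of one independent variable, the line can be fitted directly with the ordinary least squares formulas. The sketch below is a minimal example under that assumption; the function name `fit_line` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
forecast = slope * 5.0 + intercept  # forecast for an unseen x
print(slope, intercept, forecast)
```

The slope describes the relationship between the two variables, and plugging a new x into the fitted line gives the forecast.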
After Linear Regression, the most famous machine learning algorithm is Logistic Regression. In many ways, logistic regression and linear regression are similar; the difference lies in what they are used for. Linear regression is used for predicting or forecasting continuous values, whereas logistic regression is used for classification tasks. It is further divided into two parts:
- Binary or Binomial
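The binary case can be sketched as a linear score passed through the sigmoid function and then cut at a threshold. In the example below the weight and bias are hand-picked assumptions rather than values learned from data, and the function names are hypothetical; a real model would estimate the parameters during training.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weight, bias):
    """Logistic regression: a linear score passed through the sigmoid."""
    return sigmoid(weight * x + bias)

def predict_class(x, weight, bias, threshold=0.5):
    """Binary decision: class 1 if the probability reaches the threshold."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0

# Hypothetical, hand-picked parameters (not learned from data).
weight, bias = 1.0, -4.5
for x in (2.0, 6.0):
    print(x, round(predict_proba(x, weight, bias), 3), predict_class(x, weight, bias))
```

The linear part is the same as in linear regression; the sigmoid and the threshold are what turn the continuous score into a class.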
Classification can be performed on structured and unstructured data. It is the process of predicting the class of given data points. Classes are also called targets, labels, or categories. Classification predictive modeling is the process of approximating a mapping function (f) from input variables (X) to discrete output variables (y).
Types of classification
- Logistic regression
- Support Vector Machines
- Decision Tree Classification
- K-Nearest Neighbor (KNN)
- Naive Bayes
- Kernel Support Vector Machines (SVM)
- Random Forest Classification
A Neural Network is also known as an Artificial Neural Network or just a Neural Net. It is a computational learning system that uses a network of functions to understand and translate input data of one form into a desired output, usually in another form. The artificial neural network was inspired by human biology and the way neurons in the human brain work together to process inputs from the human senses.
Types of Neural Network
- Recurrent Neural Network (RNN)
- Convolutional Neural Network (CNN)
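The "network of functions" idea can be shown with a tiny feed-forward pass, which is simpler than either of the types listed above. This is only a sketch: the layer sizes, the sigmoid activation, and all weights are hypothetical values chosen by hand, and no training takes place.

```python
import math

def sigmoid(z):
    """Activation function that maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One layer: each neuron is a weighted sum of inputs plus a bias, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

def forward(inputs, layers):
    """Feed the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations

# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
output = forward([1.0, 1.0], layers)
print(output)
```

Training a network means adjusting those weights and biases so that the output moves closer to the desired one.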
Clustering is an unsupervised Machine Learning method that involves grouping data points. Given a set of data points, we can use a clustering algorithm to assign each data point to a specific group.
In theory, data points in the same group should have similar properties and/or features, while data points in different groups should have highly dissimilar properties and/or features. Clustering is a common method for statistical data analysis used in several fields.
Types of clustering in machine learning
- Hard clustering
- Soft clustering
- Hierarchical Clustering Algorithm
- K-means clustering
- Mean-Shift Clustering
- DBSCAN or Density-based clustering
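As an example of one entry from the list, here is a minimal sketch of K-means on two-dimensional points. It assumes the starting centroids are supplied by the caller and a fixed number of iterations; the function name `kmeans` and the data are made up for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c                      # keep an empty cluster's centroid
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical data: two visually separate groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```

Each point ends up in exactly one group, which makes this an instance of hard clustering.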
A Support Vector Machine (SVM) is a supervised machine learning model that uses classification algorithms for two-group classification problems. After being given sets of labeled training data for each of the two groups, an SVM model is able to categorize new examples.
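For a linear SVM, categorizing a new example comes down to checking which side of a separating hyperplane it falls on. The sketch below assumes the hyperplane parameters `w` and `b` are already known (they are hand-picked here, not trained) and also shows the hinge loss that SVM training typically tries to keep small; the function names are hypothetical.

```python
def decision_value(x, w, b):
    """Signed score w.x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x, w, b):
    """Two-group prediction: +1 on one side of the hyperplane, -1 on the other."""
    return 1 if decision_value(x, w, b) >= 0 else -1

def hinge_loss(x, y, w, b):
    """Zero when x is on the correct side with margin at least 1, else positive."""
    return max(0.0, 1.0 - y * decision_value(x, w, b))

# Hypothetical hyperplane x1 + x2 = 5 (parameters chosen by hand, not trained).
w, b = [1.0, 1.0], -5.0
print(classify([1.0, 1.0], w, b), classify([4.0, 4.0], w, b))
print(hinge_loss([2.0, 2.5], -1, w, b))
```

A point that is correctly classified but sits close to the hyperplane still gets a positive hinge loss, which is how training pushes for a wide margin between the two groups.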
Types of SVM
The K-NN algorithm assumes that a new case is similar to the available cases and places it in the category it resembles most. It stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can be assigned to a well-suited category using the K-NN algorithm. K-NN can be used for regression as well as classification, but it is mainly used for classification problems.
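A minimal sketch of that idea: keep all the labeled points, find the k closest to the new one, and take a majority vote. The function name `knn_predict`, the choice of Euclidean distance, and the data are assumptions for this example (`math.dist` needs Python 3.8 or later).

```python
from collections import Counter
import math

def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(training, key=lambda row: math.dist(row[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled points: (features, label).
training = [((1, 1), "A"), ((1, 2), "A"), ((2, 2), "A"),
            ((7, 7), "B"), ((8, 7), "B"), ((8, 8), "B")]
print(knn_predict(training, (2, 1)), knn_predict(training, (7, 8)))
```

For regression, the vote would be replaced by the average of the neighbors' target values.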
Random Forest is a popular ML algorithm that belongs to the supervised learning algorithms. It can be used for both regression and classification problems in ML. It’s based on the idea of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and improve the overall performance of the model.
A larger number of trees in the forest generally improves accuracy, although with diminishing returns, and combining many trees also reduces the risk of over-fitting.
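The voting step of such an ensemble can be sketched as follows. Note the assumptions: the three "trees" are hand-written rules standing in for trained decision trees, and the feature names are invented. A real random forest would train each tree on a random sample of the data and of the features; only the way the individual answers are combined is shown here.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Ensemble prediction: every tree votes and the majority label wins."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical hand-written "trees", each a simple rule on one feature.
trees = [
    lambda s: "spam" if s["links"] > 3 else "ham",
    lambda s: "spam" if s["exclamations"] > 5 else "ham",
    lambda s: "spam" if s["length"] < 20 else "ham",
]
message = {"links": 5, "exclamations": 1, "length": 10}
print(forest_predict(trees, message))
```

Because a single tree's mistake can be outvoted by the others, the combined prediction tends to be more stable than any one tree.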
Naive Bayes is a powerful algorithm for predictive modeling. It is a supervised learning algorithm based on Bayes' theorem and is used to solve classification problems. It is mainly used in text classification, which typically involves high-dimensional data sets.
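Here is a minimal sketch of Naive Bayes applied to text classification. It assumes a bag-of-words representation, add-one (Laplace) smoothing, and a tiny invented corpus; the function names `train_nb` and `predict_nb` are hypothetical.

```python
from collections import Counter, defaultdict
import math

def train_nb(documents):
    """Count words per class and documents per class from (text, label) pairs."""
    word_counts, class_counts = defaultdict(Counter), Counter()
    for text, label in documents:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary

def predict_nb(model, text):
    """Pick the class with the highest log prior plus summed log word likelihoods."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for word in text.lower().split():
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy corpus.
docs = [("win money now", "spam"), ("free money offer", "spam"),
        ("meeting at noon", "ham"), ("lunch meeting today", "ham")]
model = train_nb(docs)
print(predict_nb(model, "free money"), predict_nb(model, "meeting today"))
```

The "naive" part is the assumption that words are independent given the class, which is why the per-word likelihoods can simply be multiplied (added, in log form).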