Thursday, May 14, 2020

Decision Trees in Machine Learning - Computational Learning Theory



Decision tree learning is one of the key predictive modelling approaches used in statistics, data mining and machine learning.

Computational learning theory investigates the theoretical aspects of machine learning: what can and cannot be learned from data.

Computational learning theory


Computational learning theory is a subfield of Artificial Intelligence research that studies the design and analysis of machine learning algorithms in order to determine which types of problems are learnable.

The main aim of this field is to establish the theoretical basis of machine learning programs, so that their efficiency and accuracy can be understood and improved.

Probably Approximately Correct (PAC):


Leslie Valiant introduced this framework for the mathematical analysis of machine learning in 1984.

Under this framework, the learner receives samples and is required to select a generalization function (called the hypothesis) from a class of possible functions.

The goal is that, with high probability (the "probably" part), the selected function will have low generalization error (the "approximately correct" part).
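To make "high probability" and "low error" concrete, here is a minimal sketch of a standard PAC result that is not stated in this post: for a finite class of functions, and a learner that always picks a function consistent with the samples, roughly (ln|H| + ln(1/δ)) / ε samples are enough. The function name and parameter names below are only illustrative.

```python
import math

def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite class.

    Assumes the realizable case: some function in the class fits the data
    perfectly. epsilon is the allowed generalization error and delta is the
    allowed probability of failing to reach that error.
    """
    if hypothesis_count < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need hypothesis_count >= 1 and 0 < epsilon, delta < 1")
    needed = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(needed)

# 1000 candidate functions, error at most 10%, confidence at least 95%
print(pac_sample_bound(1000, 0.1, 0.05))  # prints 100
```

Note that the bound grows only logarithmically with the number of candidate functions, but linearly with 1/ε.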



Merits and demerits of Machine Learning


Merits:


  • Capable of identifying trends.
  • Being automation-based, it needs minimal human intervention.
  • Area of application is wide.
  • Can handle data under uncertain environments.
  • Efficient and effective in handling multidimensional data.

Demerits:


  • One major limitation of Machine Learning is that it needs unbiased, good-quality data.
  • Needs massive data sets to train on.
  • Requires huge computing resources to function.
  • Requires plenty of time for the algorithm to learn and develop to the point where it produces the desired outcome.
  • Highly susceptible to errors, for example when the training data is biased or too small.


Machine learning is widely believed to be highly productive when used properly, but it may not be the right choice for everyone.

What is a decision tree?


A decision tree is a flow-chart-like structure resembling a tree, where each internal node denotes a test on an attribute, each branch represents an outcome of the test and each leaf node holds a class label.

Under this concept a decision tree is used as a predictive model to go from observations about an item to a conclusion about the item's target value. Tree models in which the target variable takes a discrete set of values are referred to as classification trees.

In such a tree structure, leaves depict class labels and branches depict the conjunctions of features that lead to those labels.
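The structure described above can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the `classify` function and the weather attributes are made-up names, not part of any library.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    attribute: Optional[str] = None   # attribute tested at an internal node
    branches: Dict[str, "Node"] = field(default_factory=dict)  # test outcome -> child
    label: Optional[str] = None       # class label, set only on leaf nodes

def classify(node, item):
    """Follow one branch per test until a leaf is reached."""
    while node.label is None:
        node = node.branches[item[node.attribute]]
    return node.label

# Hypothetical toy tree: the root tests "outlook", one child tests "windy".
tree = Node(attribute="outlook", branches={
    "sunny": Node(attribute="windy", branches={
        "yes": Node(label="stay in"),
        "no": Node(label="play"),
    }),
    "rainy": Node(label="stay in"),
})

print(classify(tree, {"outlook": "sunny", "windy": "no"}))  # prints "play"
```

The path taken for the example item is the conjunction "outlook is sunny and windy is no", which leads to the label "play".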

In data mining two types of decision trees are used:


1. Classification tree analysis:

When the predicted outcome is the class to which data belongs.

2. Regression tree analysis:

When the predicted outcome is a real number, for example a price.

The term Classification & Regression Tree (CART) analysis is a combined term used while referring to both the above-mentioned procedures.
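One way to see the difference between the two tree types is to look at what a leaf returns. The sketch below assumes a common convention, which this post does not spell out: a classification leaf predicts the most frequent class among the training items that reached it, while a regression leaf predicts their mean. The function names are illustrative.

```python
from collections import Counter
from statistics import mean

def classification_leaf(labels):
    """Predict the most frequent class among the items in the leaf."""
    return Counter(labels).most_common(1)[0][0]

def regression_leaf(values):
    """Predict the average target value of the items in the leaf."""
    return mean(values)

print(classification_leaf(["spam", "ham", "spam"]))  # prints "spam"
print(regression_leaf([2.0, 4.0, 6.0]))              # prints 4.0
```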

Some techniques, known as ensemble methods, construct more than one decision tree.


Boosted Trees:


These are used for both regression and classification problems, e.g. AdaBoost.
Under this concept an ensemble is built gradually: each new tree is trained with emphasis on the training instances that the earlier trees mismodelled.
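The idea of putting more emphasis on mismodelled instances can be sketched with the weight update used in the classic two-class AdaBoost. This is a simplified illustration, assuming the instance weights already sum to 1; the function name is made up for this example.

```python
import math

def adaboost_reweight(weights, is_correct):
    """One AdaBoost round: return the tree's vote weight and the new instance weights.

    weights    -- current instance weights, assumed to sum to 1
    is_correct -- True where the current tree classified the instance correctly
    """
    err = sum(w for w, ok in zip(weights, is_correct) if not ok)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Shrink the weights of correct instances, grow those of mistakes.
    scaled = [w * math.exp(-alpha if ok else alpha)
              for w, ok in zip(weights, is_correct)]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]

alpha, new_weights = adaboost_reweight([0.25] * 4, [True, True, True, False])
print([round(w, 3) for w in new_weights])  # prints [0.167, 0.167, 0.167, 0.5]
```

After the update the single misclassified instance carries half of the total weight, so the next tree is pushed to get it right.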

Bootstrap aggregated decision trees:


Here (also known as bagging) multiple decision trees are built by repeatedly re-sampling the training data with replacement, and the trees then vote to reach a consensus prediction.
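The two ingredients, re-sampling with replacement and voting, can be sketched as follows. Training the individual trees is left out; the helper names and the toy data are assumptions made for this example.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Consensus prediction: the class predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = [("item1", "a"), ("item2", "b"), ("item3", "a"), ("item4", "b")]

# One bootstrap sample per tree; each tree would be trained on its own sample.
samples = [bootstrap_sample(data, rng) for _ in range(3)]

# Suppose the three trained trees predict these classes for a new item:
print(majority_vote(["a", "b", "a"]))  # prints "a"
```

Because sampling is done with replacement, a sample usually repeats some items and omits others, which is what makes the trees differ from one another.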

Rotation Forest:


Under this concept, every decision tree is trained after first applying Principal Component Analysis (PCA) to a random subset of the input features.

It is noteworthy that a decision list is a one-sided decision tree: every internal node has exactly one leaf node and exactly one internal node as children, except the bottom-most node, whose only child is a single leaf node.
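Read this way, a decision list behaves like a chain of "if ... else if ... else" rules. The sketch below is one possible encoding, not a standard API: each rule is a (test, label) pair standing for an internal node and its leaf child, and the default label stands for the single leaf under the bottom-most node.

```python
def decision_list_predict(rules, default, item):
    """Return the label of the first rule whose test passes, else the default."""
    for test, label in rules:
        if test(item):
            return label
    return default

# Hypothetical rules on a made-up "temp" attribute.
rules = [
    (lambda x: x["temp"] > 30, "hot"),
    (lambda x: x["temp"] > 15, "mild"),
]

print(decision_list_predict(rules, "cold", {"temp": 20}))  # prints "mild"
```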

To recap, in a decision tree each internal node denotes a test on an attribute, each branch reflects the result of a test and each leaf node holds a class label. The topmost node in a tree is the root node.

Some important specific decision tree algorithms are mentioned below:


  • CART.
  • C4.5 (the successor of ID3).
  • Chi-square automatic interaction detection (CHAID).
  • MARS (multivariate adaptive regression splines).
  • Conditional inference trees.


Merits of decision tree:


  • Easy to comprehend.
  • Capable of handling numerical as well as categorical data.
  • Minimal data preparation is needed.
  • Very robust, especially against collinearity.
  • Models can be validated by means of statistical tests.
  • Has the capacity to analyze large amounts of data.

Demerits of a decision tree:


  • Trees can be very non-robust: it has been observed that a small change in the training data can lead to a large change in the tree and eventually in the final prediction.
  • Practical decision tree learning usually relies on heuristics, that is, common-sense rules meant to increase the probability of reaching the right decision. For example, a greedy algorithm makes the locally optimal decision at each node, but this does not guarantee a globally optimal tree.
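The greedy step mentioned in the last point can be sketched as follows. The example uses Gini impurity as the measure of a "good" split, which is an assumption (it is the measure associated with CART, but the post does not name one), and the function names are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 when all labels in the group are the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Greedy choice of one (feature, threshold) test for the current node.

    Tries every observed value of every feature as a threshold and keeps the
    test with the lowest weighted impurity. Later splits are never considered,
    which is why the finished tree may not be globally optimal.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # this threshold does not separate anything
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
labels = ["a", "a", "b", "b"]
print(best_split(rows, labels))  # prints (0, 2.0, 0.0)
```

Here the test "feature 0 <= 2.0" separates the two classes perfectly, so its weighted impurity is 0.0. A full learner would apply the same step recursively to each side of the split.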


What are decision trees used for?


A decision tree can also be used as a graph that applies a branching method to exhibit every possible outcome of a decision. In this role it is a tool of operations research: it offers a methodology for spotting the best strategy for obtaining the desired outcome or reaching the target.

Conclusion:


A decision tree can guide us towards the right strategy for reaching a given target. It is a helpful flow-chart-like tool on which strategies can be based, but we must be wary of the fact that, alongside its strengths, it suffers from certain weaknesses.

