Introduction
A decision tree is a hierarchical data structure that represents data using a divide-and-conquer strategy
- Given a collection of examples, learn a decision tree that represents them
- Use this representation to classify new examples
It can be used as a non-parametric method for both classification and regression

- Classifiers for instances represented as feature vectors (color= ; shape= ; label= )
- Nodes are tests for feature values
- Leaves specify the categories (labels)
- The output is a discrete category; real-valued outputs are also possible (regression)
- Avoid using too many features, otherwise the tree will overfit
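To make the node/leaf picture concrete, here is a minimal Python sketch of a tree whose internal nodes test a feature value and whose leaves carry a label. The class names `Leaf` and `Node`, the function `classify`, and the toy fruit tree are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # category assigned to every example that reaches this leaf


@dataclass
class Node:
    feature: str  # name of the feature tested at this node
    # One subtree per observed value of the tested feature.
    children: Dict[str, "Tree"] = field(default_factory=dict)


Tree = Union[Leaf, Node]


def classify(tree, example):
    """Walk from the root to a leaf, following the branch matching each tested feature."""
    while isinstance(tree, Node):
        tree = tree.children[example[tree.feature]]
    return tree.label


# Hypothetical toy tree: test color first, then shape for red objects.
toy_tree = Node("color", {
    "red": Node("shape", {"round": Leaf("apple"), "long": Leaf("chili")}),
    "yellow": Leaf("banana"),
})

print(classify(toy_tree, {"color": "red", "shape": "round"}))
```

Classifying a new example only requires one test per level of the tree, so the cost is bounded by the tree depth rather than by the number of training examples.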
Algorithm: recursive
Examples: the data
Attributes: the feature attributes used for classification (color, shape)

The Pick step selects an attribute at random, so the trees generated differ from run to run
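The recursive procedure with a random Pick step can be sketched as follows. This is an illustrative sketch under assumptions, not the notes' exact pseudocode: the function names `build_tree` and `predict`, the tuple-based tree encoding, the majority-label fallback, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def build_tree(examples, attributes, rng=random):
    """examples: list of (features_dict, label); attributes: list of feature names.

    Returns a label (leaf) or a tuple (attribute, {value: subtree}).
    """
    labels = [label for _, label in examples]
    # Base case: all examples share one label, so return a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Base case (assumed fallback): no attributes left, return the majority label.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Pick step: chosen at random here, so different runs can yield different trees.
    attr = rng.choice(attributes)
    remaining = [a for a in attributes if a != attr]
    # Divide: one branch per observed value; conquer: recurse on each subset.
    branches = {}
    for value in sorted({feats[attr] for feats, _ in examples}):
        subset = [(f, l) for f, l in examples if f[attr] == value]
        branches[value] = build_tree(subset, remaining, rng)
    return (attr, branches)


def predict(tree, features):
    """Follow branches until a leaf (a plain label) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[features[attr]]
    return tree


data = [
    ({"color": "red", "shape": "round"}, "apple"),
    ({"color": "red", "shape": "long"}, "chili"),
    ({"color": "yellow", "shape": "long"}, "banana"),
]
tree = build_tree(data, ["color", "shape"], random.Random(0))
print(tree)
```

Whichever attribute is picked first, the resulting tree fits this toy data; only its shape changes with the random choice.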
ID3

The recursive algorithm is a greedy heuristic search for a simple tree, but cannot guarantee optimality.
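ID3's greedy step is usually stated as choosing, at each node, the attribute with the highest information gain (the reduction in label entropy after the split). The sketch below assumes that criterion; the function names `entropy`, `information_gain`, and `pick_best`, and the toy data, are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(examples, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    labels = [l for _, l in examples]
    remainder = 0.0
    for value in {f[attr] for f, _ in examples}:
        subset = [l for f, l in examples if f[attr] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def pick_best(examples, attributes):
    """Greedy choice: the attribute with the largest gain at this node only."""
    return max(attributes, key=lambda a: information_gain(examples, a))


toy = [
    ({"color": "red", "shape": "round"}, "yes"),
    ({"color": "red", "shape": "long"}, "yes"),
    ({"color": "green", "shape": "round"}, "no"),
    ({"color": "green", "shape": "long"}, "no"),
]
print(pick_best(toy, ["color", "shape"]))
```

Because the choice looks only at the current node, a locally best split can still lead to a larger tree overall, which is why optimality is not guaranteed.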