Machine learning algorithms broadly categorized as unsupervised or supervised by what experience allowed during process. A dataset collection examples, sometimes also call examples data points. the oldest datasets studied by statisticians and machine learning researchers Iris dataset ( Fisher, 1936 ). of measurements parts of 150 iris plants. Each individual plant corresponds example. The features within each example are the measurements of the parts of the plant: the sepal length, sepal width, petal length, and petal width. The dataset also records which species each plant belonged to. Three different species are
Many machine learning technologies perform both tasks. , the chain rule of probability states that for a vector ∈ R n, the joint distribution decomposed as each example a label or target. , the Iris dataset is annotated with the species iris plant. A supervised learning algorithm can study the Iris dataset and learn to classify iris plants into three different species their measurements. Roughly speaking, unsupervised learning involves observing several a random vector and attempting to implicitly or explicitly learn the probability distribution p( ), or some interesting properties of that distribution, while supervised learning involves observing several a random vector and an associated value or vector, and learning to predict from, usually by, estimating p( | ). The term supervised learning originates from the view of the target being provided by or teacher who shows the machine learning system what . In unsupervised learning, no instructor or teacher, algorithm must learn sense of without this guide. Unsupervised learning and supervised learning formally defined terms. The lines between them are often blurred. Many machine learning technologies perform both tasks. , the chain rule of probability states that for a vector ∈ R n, the joint distribution decomposed as
p( ) = p(x i | x , . . . , xi).
This decomposition solve the ostensibly unsupervised problem of modeling p( ) by splitting it into n supervised learning problems. Alternatively, solve the supervised learning problem of learning p( y | ) by using traditional unsupervised learning technologies the joint distribution p(, y ) and inferring.
p (y | ) = p ( , y ) / y p (,y)
Though unsupervised learning and supervised learning completely formal or distinct concepts, help to roughly categorize we do with machine learning algorithms. Traditionally, people regression, classification, and structured output problems as supervised learning. Density estimation in support of other tasks considered unsupervised learning. Other variants of paradigm are possible. , in semi-supervised learning, some examples include a supervision target but others do not. In multi-instance learning, collection of examples is labeled as containing or not containing an example of , but the individual members of labeled. For a recent example of multi-instance learning with deep models, see Kotzias et al. ( 2015 ). Some machine learning algorithms just experience dataset. , reinforcement learning algorithms interact with an environment, so a between system and its experiences. Such algorithms are beyond the scope of this book. Please see Sutton and Barto ( 1998 ) or Bertsekas and Tsitsiklis ( 1996 ) for information about reinforcement learning, and Mnih et al. ( 2013 ) for the deep learning approach to reinforcement learning.