+++
title = 'Classification 2'
template = 'page-math.html'
+++
# Classification 2
uncertainty is everywhere. probabilistic models look at learning as a process of reducing uncertainty.

probability can be for single variables, but also conditional/posterior — how existing beliefs change in light of new evidence

## Naive Bayes classifier

given a set of classes, use Bayes' rule to get the posterior probability that an object with features $X$ belongs to class $C_i$.

the class with the highest posterior probability is the most likely class.
naive — assumes that the elements of the feature vector are conditionally independent given the class

$P(C_{i} | X) = \frac{P(X | C_{i}) \times P(C_{i})}{P(X)}$
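
a minimal sketch of this rule in Python; the spam/ham classes, feature names, priors, and likelihoods are hypothetical stand-ins for values you would estimate from training data:

```python
# Naive Bayes sketch: P(C|X) is proportional to P(C) * product of P(x_j | C),
# because the features are assumed independent given the class.

priors = {"spam": 0.4, "ham": 0.6}  # P(C_i), hypothetical

# P(feature | class), hypothetical values standing in for trained counts
likelihoods = {
    "spam": {"contains_link": 0.7, "all_caps": 0.5},
    "ham":  {"contains_link": 0.2, "all_caps": 0.1},
}

def posterior(features):
    """Return P(class | features) for each class."""
    scores = {}
    for c, prior in priors.items():
        likelihood = 1.0
        for f in features:
            likelihood *= likelihoods[c][f]  # naive independence assumption
        scores[c] = likelihood * prior
    total = sum(scores.values())  # the evidence P(X), used to normalize
    return {c: s / total for c, s in scores.items()}

print(posterior(["contains_link", "all_caps"]))
# {'spam': 0.921..., 'ham': 0.078...} -> highest posterior wins, so "spam"
```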

## Hidden Markov Model Classifier
works on temporal data (when time is important)
at each clock tick, the system moves to a new state (which can be the same as the previous one)
we do not know these states (they are hidden), but we do see observations
steps (a sketch follows this list):

- Train by calculating:
    - the probability that the person is in state $x_i$
    - the transition probability $P(x_j | x_i)$
    - the observation probability $P(y_i | x_i)$
- Use the HMM as a classifier:
    - given an observation $y$, use Bayes' rule to calculate $P(x_i | y)$
    - the class (state) with the highest posterior probability wins

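a minimal sketch of the classification step in Python; the states, observations, and all probabilities are made-up stand-ins for trained values:

```python
# HMM-as-classifier sketch: Bayes gives P(x | y) proportional to P(y | x) * P(x),
# and the transition matrix carries the belief forward to the next clock tick.

states = ["walking", "running"]
state_prior = {"walking": 0.6, "running": 0.4}  # P(x), hypothetical

# transition probabilities P(x_j | x_i), hypothetical trained values
transition = {
    "walking": {"walking": 0.8, "running": 0.2},
    "running": {"walking": 0.3, "running": 0.7},
}

# observation probabilities P(y | x), hypothetical trained values
emission = {
    "walking": {"low_accel": 0.9, "high_accel": 0.1},
    "running": {"low_accel": 0.2, "high_accel": 0.8},
}

def classify(observation, prior):
    """Bayes: P(x | y) proportional to P(y | x) * P(x)."""
    scores = {x: emission[x][observation] * prior[x] for x in states}
    total = sum(scores.values())
    return {x: s / total for x, s in scores.items()}

def step(belief):
    """Advance one clock tick by pushing the belief through the transitions."""
    return {xj: sum(transition[xi][xj] * belief[xi] for xi in states)
            for xj in states}

belief = state_prior
for y in ["low_accel", "low_accel", "high_accel"]:
    belief = classify(y, belief)  # state with the highest posterior wins
    print(y, belief)
    belief = step(belief)         # predict the next state before the next tick
```
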
## Unsupervised learning

we do not have training sets; instead, we explore the data and search for naturally occurring patterns and clusters

once clusters are found, we can make decisions

two inputs belong to the same cluster if their feature vectors are similar (they are close to each other in feature space)

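one common way to find such clusters is k-means; the notes only describe clustering in general, so this is an illustrative sketch with made-up 2D points:

```python
# k-means sketch: repeatedly assign points to their nearest center,
# then move each center to the mean of its assigned points.
import math, random

def dist(a, b):
    return math.dist(a, b)  # Euclidean distance in feature space

def kmeans(points, k, iters=20):
    random.seed(0)
    centers = random.sample(points, k)  # initial centers picked from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assign each point to its nearest center
            i = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[i].append(p)
        for i, c in enumerate(clusters):  # move centers to cluster means
            if c:
                centers[i] = tuple(sum(x) / len(c) for x in zip(*c))
    return centers, clusters

points = [(1, 1), (1.2, 0.8), (0.9, 1.1), (5, 5), (5.1, 4.9), (4.8, 5.2)]
centers, clusters = kmeans(points, k=2)
print(centers)  # converges to one center near (1, 1) and one near (5, 5)
```
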
![screenshot.png](6690bab9dc8c17cf8396b94cd09f3d6f.png)

## Evaluating classifiers

predictive accuracy — the proportion of new, unseen instances that the classifier classifies correctly

classification error — whether an instance is correctly classified or not
error rate — # of classification errors / # of classifications attempted

true positives/negatives vs. false positives/negatives — false negatives can be the most dangerous!

true positive rate (hit rate) — the proportion of positive instances that are correctly classified as positive ($TP/(TP+FN)$)

false positive rate (false alarm rate) — the proportion of negative instances that are erroneously classified as positive ($FP/(FP+TN)$)

accuracy — the percentage of correct classifications

the confusion matrix shows how frequently instances were correctly/incorrectly classified. the diagonal is what's important.

when writing a report, it's best to explicitly give the confusion matrix
![screenshot.png](f43f65c9be0fe3566b08e933c48e957a.png)
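
a sketch deriving the metrics above from confusion-matrix counts (the counts themselves are made up):

```python
# The four cells of a binary confusion matrix, hypothetical counts.
tp, fn = 40, 10   # positive instances: correctly / incorrectly classified
fp, tn = 5, 45    # negative instances: incorrectly / correctly classified

tpr = tp / (tp + fn)                        # hit rate: TP / (TP + FN)
fpr = fp / (fp + tn)                        # false alarm rate: FP / (FP + TN)
accuracy = (tp + tn) / (tp + fn + fp + tn)  # diagonal over all instances
error_rate = 1 - accuracy                   # errors / classifications attempted

print(f"TPR={tpr:.2f} FPR={fpr:.2f} accuracy={accuracy:.2f} error rate={error_rate:.2f}")
# TPR=0.80 FPR=0.10 accuracy=0.85 error rate=0.15
```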

receiver operating characteristic (ROC) graphs
useful for organising classifiers and visualising their performance
depict the tradeoff between hit rate and false alarm rate (the idea originates in signal detection over noisy channels)

![screenshot.png](41a367424533a6e08fb95638b9c2b11e.png)![screenshot.png](e57408d5aa6439eabd8137bda295d117.png)
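
the points on a ROC curve come from sweeping a decision threshold over the classifier's scores; a sketch with made-up scores and labels:

```python
# ROC sketch: at each threshold, everything scoring at or above it is
# classified positive; record (FPR, TPR) for that threshold.
scores = [0.9, 0.8, 0.7, 0.55, 0.5, 0.4, 0.3, 0.2]  # hypothetical scores
labels = [1,   1,   0,   1,    0,   1,   0,   0]    # 1 = actually positive

P = sum(labels)        # number of positive instances
N = len(labels) - P    # number of negative instances

for threshold in sorted(set(scores), reverse=True):
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    print(f"threshold={threshold:.2f} -> FPR={fp/N:.2f}, TPR={tp/P:.2f}")
# plotting TPR against FPR over all thresholds traces the ROC curve
```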