lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git

Classification.html (5312B)


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><head><link rel="stylesheet" href="sitewide.css" type="text/css"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.5.2 (457164)"/><meta name="altitude" content="0"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2017-11-19 12:13:40 +0000"/><meta name="latitude" content="52.37362670126313"/><meta name="longitude" content="4.836090082173948"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2017-11-27 13:55:46 +0000"/><title>Classification</title></head><body><div>a pattern is a vaguely defined entity that can be given a name.</div><div>recognition is the identification of a pattern as a member of a category</div><div><br/></div><div><span style="font-weight: bold;">Types of pattern recognition (classification) systems:</span></div><div><ul><li>speech recognition: </li></ul></div><div><ul><ol><li>PC card converts analog waves from the mic into digital format</li><li>acoustical model breaks the word into phonemes</li><li>language model compares phonemes to words in a built-in dictionary</li><li>software decides what the spoken word was and displays the best match</li></ol><li>brain-computer interface that acquires signals directly from the brain</li><li>gesture recognition using acceleration magnitude from a watch</li><li>image recognition</li></ul><div><br/></div></div><div><span style="font-weight: bold;">Classification (known categories)</span></div><div><ul><li>given a few classes, each item belongs to exactly one class</li><li>objects are described by features</li><li>system needs a training set (both positive and negative examples)</li><li>when a new item arrives, its features are measured and the system decides which class it belongs to</li></ul></div><div><img src="Classification.resources/screenshot.png" height="625" width="920"/><br/></div><div><br/></div><div>Components:</div><div><ul><li>Sensing module</li><li>Preprocessing mechanism</li><li>Feature 
extraction mechanism</li><li>Classifier</li><li>Training set of already classified examples</li></ul><div><br/></div></div><div><span style="font-weight: bold;">Building a pattern recognition system:</span></div><div><ol><li>Choose features, define classes (e.g. coins 10 cent, 20 cent, 50 cent, 1$, 2$)</li><ul><li>features need to have discriminative power</li><li>not too many, but enough to reliably separate classes based on them</li><li>e.g. for coins: colour and diameter</li><li>algorithms</li><ul><li>simple: rule-based activity recognition (If…And/Or…Then)</li><li>complicated: machine learning (decision trees, HMMs, neural networks)</li></ul></ul><li>Extract features</li><ul><li>image recognition</li><ul><li>shape descriptors</li><ul><li>form factor (a round object has 1, others are smaller)</li><li>Euler number (number of objects minus number of holes in objects)</li><li>perimeter, area, roundness ratio…</li></ul><li>preprocessing</li><ul><li>binarisation, morphological operators, segmentation</li></ul><li>extract features (e.g. 
area, coordinates of centre of mass)</li><li>optical character recognition (OCR)</li><ul><li>converts an image into machine-readable text</li><li>uses statistical moments (total mass, centroid, elliptical parameters, etc.)</li><li>Hu's invariant moments</li></ul></ul><li>sound recognition</li><ul><li>features</li><ul><li>frequency spectrum</li><li>spectrograms</li><li>Mel cepstrum coefficients: the cepstrum results from applying FFT, then log(|x|), then IFFT</li></ul><li>vowel recognition: plot second formant frequency against first formant frequency (the significant frequencies of a vowel)</li></ul></ul><li>Train the classifier</li><li>Evaluate the performance of classification</li></ol></div><div><br/></div><div><b>Classifiers</b></div><div><u>Rule-based:</u> if-then-else</div><div><ul><li>exhaustive, mutually exclusive rules</li><li>works well if there aren’t too many features</li></ul><div><br/></div></div><div><u>Template-matching:</u> a set of reference patterns is available, and an unknown pattern is matched to its nearest neighbour</div><div>get a fingerprint for a specific signal, using FFT (freq. spectrum) or Mel cepstrum coefficients</div><div>train with various words, store fingerprints, and then apply</div><div>two approaches:</div><div><ul><li>maximum correlation</li><li>minimum error: calculate the Euclidean distance between vectors</li></ul></div><div><br/></div><div><u>Neural networks:</u></div><div>synapses are weights</div><div>output is binary and depends on whether the weighted sum of inputs exceeds the threshold θ</div><div>a neuron has:</div><div><ul><li>set of weighted inputs: dendrites + synapses</li><li>an adder: soma</li><li>an activation function to decide whether or not the neuron fires</li></ul><div><br/></div></div><div>a neuron on its own cannot learn, but a perceptron can, 
by changing its weights, which are adjustable.</div><div>neural networks are collections of artificial neurons, and can have hidden layers.</div><div>they learn by comparing the actual output with the desired output and adjusting the weights accordingly.</div><div><img src="Classification.resources/screenshot_1.png" height="121" width="216"/></div><div><br/></div></body></html>