+++
title = 'Introduction'
template = 'page-math.html'
+++

# Introduction

## What is ML?
Deductive vs. inductive reasoning:

* Deductive (conclusion by logic): discrete, unambiguous, provable, known rules
* Inductive (conclusion from experience): fuzzy, ambiguous, experimental, unknown rules

ML lets systems learn and improve from experience without being explicitly programmed for a specific situation.

Used in software, analytics, data mining, data science, and statistics.

A problem is suitable for ML _if we can't solve it explicitly_:

* when approximate solutions are OK
* when reliability is not the biggest focus

Why don't we have explicit solutions? An explicit solution may be too expensive to build, the problem may change over time, or there may be other reasons.

![overview-diagram.png](6610df2f6a4a4d21ad34c09c3468f115.png)

An intelligent agent:

* online learning: acting and learning simultaneously
* reinforcement learning: online learning in a world with delayed feedback

Offline learning: separate learning and acting

* take a fixed dataset of examples
* train a model on that dataset
* test the model, and if it works, use it in production

## Supervised ML
Supervised: explicit examples of input and output. Learn to predict the output for unseen inputs.

Learning tasks:

* classification: assign a class to each example
* regression: assign a number to each example

### Classification
How do you reduce a problem to classification? e.g. treat every pixel in a grayscale image as a feature, and assign a class label to each image.

classification: output labels are classes (categorical data)

linear classifier: just draw a line, plane, or hyperplane between the classes

* feature space: contains the data; one axis per feature
* model space: contains the models; in a loss-surface plot, the bright spots have low loss
* loss function: maps a model to a number expressing its performance on the data, the lower the better

decision tree classifier: every node is a condition on a feature; follow the branch whose condition holds. Plotted over the feature space, its decision boundary looks like a step function.

k-nearest-neighbors: a lazy learner. Training does nothing except remember the data; to classify a new point, find the k closest training examples and take a majority vote over their labels.
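A minimal sketch of this idea, assuming a small made-up dataset, k = 3, and Euclidean distance (none of which are specified in the notes):

```python
from collections import Counter
import math

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among the k nearest stored examples."""
    # "Training" already happened: it was just storing train_x and train_y.
    dists = [(math.dist(x, query), label) for x, label in zip(train_x, train_y)]
    dists.sort(key=lambda pair: pair[0])              # nearest first
    votes = Counter(label for _, label in dists[:k])  # labels of the k nearest
    return votes.most_common(1)[0][0]

# Hypothetical toy dataset: two numerical features, two classes.
train_x = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_x, train_y, (2, 2)))  # "a": nearest points are class a
print(knn_predict(train_x, train_y, (9, 9)))  # "b"
```

All the work happens at prediction time: classifying one query means computing a distance to every stored example.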
features: numerical or categorical

binary classification: only two classes

multiclass classification: more than two classes

### Regression
regression: output labels are numbers. The model we're trying to learn is a function from the feature space to ℝ.

loss function: maps a model to a number that expresses how well it fits the data

common example, the mean squared error: $loss(p) = \frac{1}{n} \sum_i (f_p(x_i) - y_i)^2$

Take the difference between the model's prediction and the target value (the residual), then square the residuals and average them.

overfitting: the model is too specific to the data; it memorizes the data instead of generalizing

Split the data into training and test sets. Don't judge performance on the training data: the aim is to minimise loss on the _test_ data.
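A minimal sketch of this loss, assuming a one-parameter linear model $f_p(x) = p \cdot x$ and made-up data (both are illustrative choices, not from the notes):

```python
def f(p, x):
    """A one-parameter linear model: f_p(x) = p * x."""
    return p * x

def loss(p, xs, ys):
    """Mean squared error: average the squared residuals f_p(x_i) - y_i."""
    return sum((f(p, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]        # roughly y = 2x

# Each candidate p is one model, i.e. one point in model space.
for p in [1.0, 1.5, 2.0, 2.5]:
    print(p, loss(p, xs, ys))    # loss is lowest near p = 2
```

Every candidate p is a point in model space, and the loop scans it for the "bright spot" where the loss is lowest (here, near p = 2).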
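And a minimal sketch of the split itself; the 80/20 ratio, the random seed, and the grid of candidate slopes are all arbitrary assumptions:

```python
import random

random.seed(0)
data = [(x, 2 * x + random.gauss(0, 0.5)) for x in range(20)]  # noisy y = 2x
random.shuffle(data)

split = int(0.8 * len(data))          # 80% train, 20% test
train, test = data[:split], data[split:]

def mse(p, pairs):
    return sum((p * x - y) ** 2 for x, y in pairs) / len(pairs)

# Choose the model (here: a slope from a small grid) on the training data...
best_p = min([1.8, 1.9, 2.0, 2.1, 2.2], key=lambda p: mse(p, train))

# ...but judge it on the test data it has never seen.
print(best_p, mse(best_p, train), mse(best_p, test))
```

If the training loss is much lower than the test loss, the model is memorizing rather than generalizing: exactly the overfitting described above.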
## Unsupervised ML
Unsupervised: only inputs are provided; find _any_ pattern that explains something about the data.

Learning tasks:

* clustering: like classification, except there is no target column, so the model outputs a cluster id
* density estimation: the model outputs a number (a probability density) that should be high for likely instances of the data, e.g. fitting a probability distribution to the data
* generative modeling: build a model from which you can sample new examples

## What isn't ML?
ML is a subdomain of AI.

* AI, but not ML: automated reasoning, planning
* Data science, but not ML: gathering, harmonising, and interpreting data
* Data mining is more closely related to ML, but some tasks, e.g. finding fraud in transaction networks, are closer to data mining
* Statistics wants to figure out the truth, whereas an ML model just has to work well enough; it doesn't necessarily have to be true