lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git

Introduction.html (10475B)


      1 
      2 				<!DOCTYPE html>
      3 				<html>
      4 					<head>
      5 						<meta charset="UTF-8">
      6 
      7 						<title>Introduction</title>
      8 					<link rel="stylesheet" href="pluginAssets/katex/katex.css" /><link rel="stylesheet" href="./style.css" /></head>
      9 					<body>
     10 
     11 <div id="rendered-md"><h1 id="introduction">Introduction</h1>
     12 <nav class="table-of-contents"><ul><li><a href="#introduction">Introduction</a><ul><li><a href="#what-is-ml">What is ML?</a></li><li><a href="#supervised-ml">Supervised ML</a><ul><li><a href="#classification">Classification</a></li><li><a href="#regression">Regression</a></li></ul></li><li><a href="#unsupervised-ml">Unsupervised ML</a></li><li><a href="#what-isnt-ml">What isn&#39;t ML?</a></li></ul></li></ul></nav><h2 id="what-is-ml">What is ML?</h2>
     13 <p>Deductive vs inductive reasoning:</p>
     14 <ul>
     15 <li>Deductive (conclusion by logic): discrete, unambiguous, provable, known rules</li>
     16 <li>Inductive (conclusion from experience): fuzzy, ambiguous, experimental, unknown rules</li>
     17 </ul>
     18 <p>ML lets systems learn and improve from experience without being explicitly programmed (for a specific situation).</p>
     19 <p>Used in software, analytics, data mining, data science, statistics.</p>
     20 <p>A problem is suitable for ML <em>if we can't solve it explicitly</em>:</p>
     21 <ul>
     22 <li>when approximate solutions are ok</li>
     23 <li>when reliability is not the biggest focus</li>
     24 </ul>
     25 <p>Why don't we have explicit solutions? Because the rules may be too expensive to work out, may change over time, or may simply be unknown.</p>
     26 <p><img src="_resources/6610df2f6a4a4d21ad34c09c3468f115.png" alt="overview-diagram.png"></p>
     27 <p>An intelligent agent:</p>
     28 <ul>
     29 <li>online learning: acting and learning simultaneously</li>
     30 <li>reinforcement learning: online learning in a world where feedback is delayed (rewards arrive only after a sequence of actions)</li>
     31 </ul>
     32 <p>Offline learning: separate learning and acting</p>
     33 <ul>
     34 <li>take fixed dataset of examples</li>
     35 <li>train model on that dataset</li>
     36 <li>test the model, and if it works, use it in prod</li>
     37 </ul>
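<p>The offline workflow above can be sketched in plain Python (the parameter names <code>train_model</code> and <code>evaluate</code> are placeholders of mine, not a real API):</p>

```python
# Offline learning: split a fixed dataset, train once, evaluate on held-out data.
def offline_learning(examples, train_model, evaluate):
    n = len(examples)
    split = int(0.8 * n)                 # 80/20 train/test split
    train_data = examples[:split]
    test_data = examples[split:]
    model = train_model(train_data)      # learn only from the training data
    score = evaluate(model, test_data)   # judge only on unseen data
    return model, score
```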
     38 <h2 id="supervised-ml">Supervised ML</h2>
     39 <p>Supervised: explicit examples of input and output. Learn to predict output for unseen input.</p>
     40 <p>learning tasks:</p>
     41 <ul>
     42 <li>classification: assign class to each example</li>
     43 <li>regression: assign number to each example</li>
     44 </ul>
     45 <h3 id="classification">Classification</h3>
     46 <p>how do you reduce a problem to classification? e.g. for a grayscale image, treat every pixel as a feature; the whole image is one example, and gets one class label</p>
     47 <p>classification: output labels are classes (categorical data)</p>
     48 <p>linear classifier: just draw a line, plane, or hyperplane</p>
     49 <ul>
     50 <li>feature space: each example is a point; its coordinates are its feature values</li>
     51 <li>model space: each point is a model (e.g. the parameters of a line); the regions of low loss are the good models</li>
     52 <li>loss function: performance of model on data, the lower the better</li>
     53 </ul>
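<p>A minimal sketch of these ideas in pure Python: a linear classifier is one point (w1, w2, b) in model space, the loss function scores each candidate on the data, and the grid search below is a crude stand-in for real training:</p>

```python
# A linear classifier in 2D: predict class 1 iff w1*x1 + w2*x2 + b > 0.
def predict(w1, w2, b, x):
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

# Loss: fraction of misclassified examples (lower is better).
def loss(w1, w2, b, data):
    return sum(predict(w1, w2, b, x) != y for x, y in data) / len(data)

# Crude search over model space: try candidate (w1, w2, b) triples, keep the best.
def best_model(data, candidates):
    return min(candidates, key=lambda m: loss(m[0], m[1], m[2], data))
```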
     54 <p>decision tree classifier: every node is a condition on a feature; go down a branch based on the condition. its decision boundary looks like a step function (axis-aligned segments) in a graph.</p>
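<p>A tiny hand-written decision tree, with made-up features and thresholds, just to show the shape of the model:</p>

```python
# A hand-written decision tree: each node tests one feature, each leaf is a class.
# The features and thresholds here are invented for illustration.
def classify(petal_length, petal_width):
    if petal_length < 2.5:       # root node: condition on the first feature
        return "A"
    elif petal_width < 1.8:      # next node: condition on another feature
        return "B"
    else:
        return "C"
```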
     55 <p>k-nearest-neighbors: a lazy learner; it has no real training step, it just remembers the data, and classifies a new point by a majority vote among its k closest stored examples<br>
     56 features: numerical or categorical</p>
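<p>To fill in the detail, a minimal k-nearest-neighbors classifier in pure Python. It does no work at training time; all the work happens at prediction time:</p>

```python
from collections import Counter

# k-nearest-neighbors: store the training data as-is; to classify a query,
# find the k closest stored points and take a majority vote of their labels.
def knn_classify(train, query, k=3):
    def dist2(a, b):  # squared Euclidean distance (order-preserving, no sqrt needed)
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = sorted(train, key=lambda ex: dist2(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```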
     57 <p>binary classification: only have two classes</p>
     58 <p>multiclass classification: more than two classes</p>
     59 <h3 id="regression">Regression</h3>
     60 <p>regression: output labels are numbers. the model we're trying to learn is a function from the feature space to ℝ</p>
     61 <p>loss function: maps model to number that expresses how well it fits the data</p>
     62 <p>common example: <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>l</mi><mi>o</mi><mi>s</mi><mi>s</mi><mo stretchy="false">(</mo><mi>p</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><msub><mo>∑</mo><mi>i</mi></msub><mo stretchy="false">(</mo><msub><mi>f</mi><mi>p</mi></msub><mo stretchy="false">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mo>−</mo><msub><mi>y</mi><mi>i</mi></msub><msup><mo stretchy="false">)</mo><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">loss(p) = \frac{1}{n} \sum_i (f_p (x_i) - y_i)^2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.01968em;">l</span><span class="mord mathdefault">o</span><span class="mord mathdefault">s</span><span class="mord mathdefault">s</span><span class="mopen">(</span><span class="mord mathdefault">p</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1.190108em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.845108em;"><span style="top:-2.6550000000000002em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">n</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" 
style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:-0.0000050000000000050004em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.16195399999999993em;"><span style="top:-2.40029em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.29971000000000003em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault" style="margin-right:0.10764em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.15139200000000003em;"><span style="top:-2.5500000000000003em;margin-left:-0.10764em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">p</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.286108em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.31166399999999994em;"><span 
style="top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1.064108em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.03588em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.31166399999999994em;"><span style="top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141079999999999em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span></span></span></p>
     63 <p>takes the difference between model prediction and target value (the residual), then squares the residuals and averages them</p>
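<p>The same loss, written out in Python as a sanity check (pure Python, no libraries):</p>

```python
# Mean squared error: square each residual (prediction - target), then average.
def mse_loss(predictions, targets):
    residuals = [p - t for p, t in zip(predictions, targets)]
    return sum(r * r for r in residuals) / len(residuals)
```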
     64 <p>overfitting: the model is too specific to the data, it's memorizing the data instead of generalizing</p>
     65 <p>split test and training data. don't judge performance on training data, the aim is to minimise loss on <em>test</em> data.</p>
     66 <h2 id="unsupervised-ml">Unsupervised ML</h2>
     67 <p>Unsupervised: only inputs provided, find <em>any</em> pattern that explains something about data.</p>
     68 <p>learning tasks:</p>
     69 <ul>
     70 <li>clustering: like classification, except there's no target column, so the model assigns each example a cluster id</li>
     71 <li>density estimation: model outputs a number (probability density), should be high for instances of data that are likely. e.g. fitting prob distribution to data</li>
     72 <li>generative modeling: build a model from which you can sample new examples</li>
     73 </ul>
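<p>For example, density estimation on 1D data can be as simple as fitting a Gaussian by its mean and variance (a minimal sketch, assuming numeric 1D data):</p>

```python
import math

# Density estimation sketch: fit a 1D Gaussian by its mean and variance,
# then score points by their probability density under the fitted model.
def fit_gaussian(xs):
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var

def density(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
```

Points near the fitted mean get a high density; outliers get a density near zero.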
     74 <h2 id="what-isnt-ml">What isn't ML?</h2>
     75 <p>ML is a subdomain of AI.</p>
     76 <ul>
     77 <li>AI, but not ML: automated reasoning, planning</li>
     78 <li>Data Science, not ML: gathering, harmonising, and interpreting data</li>
     79 <li>Data mining is closely related to ML, but e.g. finding fraud in transaction networks leans more toward data mining</li>
     80 <li>Statistics aims to find the true underlying model; ML only needs a model that works well enough, whether or not it's the truth</li>
     81 </ul>
     82 </div></div>
     83 					</body>
     84 				</html>