lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
Log | Files | Refs | Submodules

state-space-search.html (11638B)


      1 <!DOCTYPE html>
      2 <html>
      3 <head>
      4 <link rel="Stylesheet" type="text/css" href="style.css">
      5 <script type="text/javascript" async src="https://cdn.jsdelivr.net/gh/mathjax/MathJax@2.7.5/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
      6 <title>state-space-search</title>
      7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      8 </head>
      9 <body>
     10 
     11 <div id="State space search"><h1 id="State space search">State space search</h1></div>
     12 <p>
     13 How do we find the solutions of previous problems?
     14 </p>
     15 <ul>
     16 <li>
     17 search the state space
     18 
     19 <li>
     20 search through explicit tree generation: root is initial state, nodes/leaves generated through successor function
     21 
     22 <li>
     23 search generates a graph
     24 
     25 </ul>
     26 
     27 <p>
     28 state: representation of a physical configuration
     29 </p>
     30 
     31 <p>
     32 node: data structure in the search tree. it contains state, parent node, actions, etc.
     33 </p>
     34 
     35 <div id="State space search-Uninformed search strategies"><h2 id="Uninformed search strategies">Uninformed search strategies</h2></div>
     36 <p>
     37 strategy defines picking order of node expansion
     38 </p>
     39 
     40 <p>
     41 uninformed methods: only use problem definition
     42 </p>
     43 
     44 <p>
     45 informed methods: use heuristic function to estimate cost of solution
     46 </p>
     47 
     48 <p>
     49 evaluating performance:
     50 </p>
     51 <ul>
     52 <li>
     53 does it always find a solution if there is one? (completeness)
     54 
     55 <li>
     56 does it find least-cost solution? (optimality)
     57 
     58 <li>
     59 how many nodes generated/expanded? (time complexity)
     60 
     61 <li>
     62 how many nodes are stored in memory during search? (space complexity)
     63 
     64 <li>
     65 complexity measured in terms of:
     66 
     67 <ul>
     68 <li>
     69 b: max branching factor of search tree)
     70 
     71 <li>
     72 d: depth of least cost solution
     73 
     74 <li>
     75 m: max depth of state space, could be infinite
     76 
     77 </ul>
     78 <li>
     79 time/space complexity measured in terms of max branching factor of search tree, depth of least cost solution, max depth of state space (could be infinite)
     80 
     81 </ul>
     82 
     83 <p>
     84 <img src="img/search-alg-summary.png" alt="Summary of uninformed search algorithms" />
     85 </p>
     86 
     87 <div id="State space search-Uninformed search strategies-Breadth-first (BF) search"><h3 id="Breadth-first (BF) search">Breadth-first (BF) search</h3></div>
     88 <ul>
     89 <li>
     90 algorithm:
     91 
     92 <ul>
     93 <li>
     94 expand shallowest unexpanded node
     95 
     96 <li>
     97 implementation - fringe (nodes that have to be explored) is a FIFO queue
     98 
     99 </ul>
    100 <li>
    101 evaluation:
    102 
    103 <ul>
    104 <li>
    105 completeness: yes, if branching factor b is finite
    106 
    107 <li>
    108 time complexity: if every state has b successors, and solution is at depth d, then \(O(b^{d+1})\) because of number of nodes generated
    109 
    110 <li>
    111 space complexity: shitty if every node has to be in memory - \(O(b^{d+1})\)
    112 
    113 <li>
    114 optimality: in general yes, unless actions have different cost
    115 
    116 </ul>
    117 <li>
    118 memory requirements are bigger problem than execution time
    119 
    120 </ul>
    121 
    122 <div id="State space search-Uninformed search strategies-Depth-first (DF) search"><h3 id="Depth-first (DF) search">Depth-first (DF) search</h3></div>
    123 <ul>
    124 <li>
    125 algorithm:
    126 
    127 <ul>
    128 <li>
    129 expand deepest unexpanded node
    130 
    131 <li>
    132 implementation: fringe is a stack
    133 
    134 </ul>
    135 <li>
    136 evaluation:
    137 
    138 <ul>
    139 <li>
    140 completeness: no, unless search space is finite and no loops are possible
    141 
    142 <li>
    143 time complexity: shitty if m is larger than d (depth of optimal solution) -- \(O(b^m)\). but if many solutions, faster than BF
    144 
    145 <li>
    146 space complexity: backtracking search uses less memory, one successor instead of all b -- \(O(bm+1)\)
    147 
    148 <li>
    149 optimality: no, same issues as completeness
    150 
    151 </ul>
    152 </ul>
    153 
    154 <div id="State space search-Uninformed search strategies-Depth-limited search"><h3 id="Depth-limited search">Depth-limited search</h3></div>
    155 <ul>
    156 <li>
    157 DF-search with depth limit l (nodes at depth l have no successors)
    158 
    159 <li>
    160 solves infinite-path problem
    161 
    162 <li>
    163 evaluation:
    164 
    165 <ul>
    166 <li>
    167 time: \(O(b^l)\)
    168 
    169 <li>
    170 space: \(O(bl)\)
    171 
    172 <li>
    173 completeness: not if l &lt; d
    174 
    175 <li>
    176 optimality: not if if l &gt; d
    177 
    178 </ul>
    179 </ul>
    180 
    181 <div id="State space search-Uninformed search strategies-Iterative deepening search"><h3 id="Iterative deepening search">Iterative deepening search</h3></div>
    182 <ul>
    183 <li>
    184 strategy to find best depth limit l
    185 
    186 <li>
    187 often used in combination with DF search
    188 
    189 <li>
    190 after each iteration, throw away everything and increase depth limit
    191 
    192 <li>
    193 combines benefits of DF (space complexity) and BF search (time complexity)
    194 
    195 <li>
    196 evaluation:
    197 
    198 <ul>
    199 <li>
    200 completeness: yes (no infinite paths)
    201 
    202 <li>
    203 time: \(O(b^d)\)
    204 
    205 <li>
    206 space: \(O(bd)\)
    207 
    208 </ul>
    209 </ul>
    210 
    211 <div id="State space search-Informed search (heuristics)"><h2 id="Informed search (heuristics)">Informed search (heuristics)</h2></div>
    212 
    213 <p>
    214 Heuristic function: "educated guess that reduces/limits search for solutions"
    215 </p>
    216 
    217 <p>
    218 informedness property of heuristics:
    219 </p>
    220 <ul>
    221 <li>
    222 two heuristics \(h_1(n),\; h_2(n)\) with \(0 \leq h_1(n) \leq h_2(n) \leq h*(n)\)
    223 
    224 <li>
    225 then \(h_2(n)\) is more informed than \(h_1(n)\)
    226 
    227 <li>
    228 with \(h_1\) fewer nodes have to be searched with \(h_2\)
    229 
    230 <li>
    231 but \(h_2\) is often more expensive to calculate
    232 
    233 <li>
    234 perfect heuristics: \(h(n) = h*(n)\)
    235 
    236 <li>
    237 trivial heuristics: \(h(n) = 0\)
    238 
    239 </ul>
    240 
    241 <p>
    242 Best-first search
    243 </p>
    244 <ul>
    245 <li>
    246 the general approach of informed search
    247 
    248 <li>
    249 node selected for expansion based on <em>evaluation function f(n)</em>
    250 
    251 <li>
    252 evaluation function measures distance to goal, choose node which appears best
    253 
    254 <li>
    255 fringe is queue in order of decreasing desirability
    256 
    257 </ul>
    258 
    259 <div id="State space search-Informed search (heuristics)-A Search"><h3 id="A Search">A Search</h3></div>
    260 <p>
    261 best-known form of best-first search
    262 </p>
    263 
    264 <p>
    265 avoid expanding paths that are already expensive
    266 </p>
    267 
    268 <p>
    269 evaluation function: \(f(n) = g(n) + h(n)\)
    270 </p>
    271 <ul>
    272 <li>
    273 g(n) the cost so far to reach node n
    274 
    275 <li>
    276 h(n) estimated cost to get from node n to goal
    277 
    278 <li>
    279 f(n) estimated total cost of path through node n to goal
    280 
    281 </ul>
    282 
    283 <div id="State space search-Informed search (heuristics)-A* Search"><h3 id="A* Search">A* Search</h3></div>
    284 
    285 <p>
    286 A search, but with an admissible heuristic
    287 </p>
    288 <ul>
    289 <li>
    290 heuristic is admissible if it <em>never</em> overestimates the cost to get to goal
    291 
    292 <li>
    293 admissible heuristics are optimistic
    294 
    295 <li>
    296 formally: \(h(n) \leq h*(n)\) where \(h*(n)\) is true cost from n to goal
    297 
    298 </ul>
    299 
    300 <p>
    301 evaluation:
    302 </p>
    303 <ul>
    304 <li>
    305 complete: yes
    306 
    307 <li>
    308 time: exponential with path length
    309 
    310 <li>
    311 space: all nodes are stored
    312 
    313 <li>
    314 optimal: yes
    315 
    316 </ul>
    317 
    318 <div id="State space search-Adversarial search"><h2 id="Adversarial search">Adversarial search</h2></div>
    319 <p>
    320 search has no adversary, solution is a (heuristic) method for finding a goal
    321 </p>
    322 
    323 <p>
    324 games have an adversary, solution is a strategy. time limits force an <em>approximate</em> solution.
    325 </p>
    326 
    327 <p>
    328 you need a function to evaluate the "goodness" of a game position
    329 </p>
    330 
    331 <p>
    332 types of games:
    333 </p>
    334 
    335 <table>
    336 <tr>
    337 <th>
    338 &nbsp;
    339 </th>
    340 <th>
    341 deterministic
    342 </th>
    343 <th>
    344 chance
    345 </th>
    346 </tr>
    347 <tr>
    348 <td>
    349 perfect information
    350 </td>
    351 <td>
    352 chess, checkers, go, othello
    353 </td>
    354 <td>
    355 backgammon, monopoly
    356 </td>
    357 </tr>
    358 <tr>
    359 <td>
    360 imperfect information
    361 </td>
    362 <td>
    363 &nbsp;
    364 </td>
    365 <td>
    366 bridge poker, scrabble, nuclear war
    367 </td>
    368 </tr>
    369 </table>
    370 
    371 <div id="State space search-Adversarial search-Minimax"><h3 id="Minimax">Minimax</h3></div>
    372 
    373 <div id="State space search-Adversarial search-Minimax-Setup"><h4 id="Setup">Setup</h4></div>
    374 <p>
    375 two players: MAX, MIN
    376 </p>
    377 
    378 <p>
    379 MAX moves first, take turns until game is over. winner gets award, loser gets penalty.
    380 </p>
    381 
    382 <p>
    383 how does this relate to search?
    384 </p>
    385 <ul>
    386 <li>
    387 initial state: game configuration e.g. with chess
    388 
    389 <li>
    390 successor function: list of &lt;move, state&gt; pairs with legal moves
    391 
    392 <li>
    393 terminal test: game finished?
    394 
    395 <li>
    396 utility function: numerical value of terminal states (win +1, lose -1, draw 0)
    397 
    398 <li>
    399 MAX uses search tree to determine next move
    400 
    401 </ul>
    402 
    403 <div id="State space search-Adversarial search-Minimax-Optimal strategies"><h4 id="Optimal strategies">Optimal strategies</h4></div>
    404 
    405 <p>
    406 find contingent strategy for MAX assuming infallible MIN.
    407 </p>
    408 
    409 <p>
    410 assume that both players play optimally.
    411 </p>
    412 
    413 <p>
    414 given game tree, optimal strategy can be found with minimax value of each node:
    415 </p>
    416 
    417 <pre>
    418     minimax(n) = utility(n)                      if n is a terminal
    419                  minimax(max(successors of n))   if n is a max node
    420                  minimax(min(successors of n))   if n is a min node
    421 </pre>
    422 
    423 <div id="State space search-Adversarial search-Minimax-Evaluation"><h4 id="Evaluation">Evaluation</h4></div>
    424 <ul>
    425 <li>
    426 complete: yes
    427 
    428 <li>
    429 time: \(O(b^m)\)
    430 
    431 <li>
    432 space: \(O(bm)\)
    433 
    434 <li>
    435 optimal: yes
    436 
    437 </ul>
    438 
    439 <div id="State space search-Adversarial search-Reducing problems of complexity with Minimax"><h3 id="Reducing problems of complexity with Minimax">Reducing problems of complexity with Minimax</h3></div>
    440 
    441 <div id="State space search-Adversarial search-Reducing problems of complexity with Minimax-Cutting off search:"><h4 id="Cutting off search:">Cutting off search:</h4></div>
    442 <p>
    443 instead of <code>if TERMINAL(state) then return UTILITY(state)</code> do <code>if CUTOFF-TEST(state, depth) then return EVAL(state)</code>
    444 </p>
    445 
    446 <p>
    447 this introduces fixed-limit depth. also loses completeness!
    448 </p>
    449 
    450 <p>
    451 utility: value based on quality of state
    452 </p>
    453 
    454 <p>
    455 heuristics: value based on estimation of quality of state
    456 </p>
    457 
    458 <p>
    459 heuristic EVAL:
    460 </p>
    461 <ul>
    462 <li>
    463 produces estimate of expected utility of a game from a given position
    464 
    465 <li>
    466 should order terminal nodes in same way as utility
    467 
    468 <li>
    469 computation shouldn't take too long
    470 
    471 </ul>
    472 
    473 <div id="State space search-Adversarial search-Reducing problems of complexity with Minimax-Alpha-Beta pruning (efficient Minimax)"><h4 id="Alpha-Beta pruning (efficient Minimax)">Alpha-Beta pruning (efficient Minimax)</h4></div>
    474 
    475 <p>
    476 with minimax, the number of states is exponential to number of moves
    477 </p>
    478 
    479 <p>
    480 so, don't examine every node and prune the subtrees you don't have to examine
    481 </p>
    482 
    483 <p>
    484 Alpha: value of best MAX choice so far
    485 </p>
    486 
    487 <p>
    488 Beta: value of best MIN choice so far
    489 </p>
    490 
    491 <p>
    492 you prune the rest of the level if, at any point, beta &lt;= alpha.
    493 </p>
    494 
    495 <p>
    496 pruning doesn't affect final results, entire subtrees can be pruned
    497 </p>
    498 
    499 <p>
    500 good move ordering improves effectiveness of pruning
    501 </p>
    502 
    503 <p>
    504 with 'perfect ordering', time complexity is: \(O(b^{m/2})\)
    505 </p>
    506 
    507 <div id="State space search-Adversarial search-Search with no or partial information"><h3 id="Search with no or partial information">Search with no or partial information</h3></div>
    508 
    509 <p>
    510 problems:
    511 </p>
    512 <ul>
    513 <li>
    514 contingency: percepts provide new info
    515 
    516 <li>
    517 exploration: when states/actions of environment are unknown
    518 
    519 <li>
    520 sensorless/conformant: agent may have no idea where it is
    521 
    522 </ul>
    523 
    524 <div id="State space search-Adversarial search-Search with no or partial information-Perfect information Monte Carlo sampling (rdeep)"><h4 id="Perfect information Monte Carlo sampling (rdeep)">Perfect information Monte Carlo sampling (rdeep)</h4></div>
    525 <ol>
    526 <li>
    527 rollout:
    528 
    529 <ul>
    530 <li>
    531 assume a belief state (with perfect info)
    532 
    533 <li>
    534 play random game in that state.
    535 
    536 </ul>
    537 <li>
    538 average the rollouts
    539 
    540 <li>
    541 choose one with max average
    542 
    543 </ol>
    544 
    545 <div id="State space search-Adversarial search-Games with chance"><h3 id="Games with chance">Games with chance</h3></div>
    546 <p>
    547 <img src="img/games-chance.png" alt="Games with chance" />
    548 </p>
    549 
    550 <div id="State space search-Summary (Schnapsen)"><h2 id="Summary (Schnapsen)">Summary (Schnapsen)</h2></div>
    551 <p>
    552 Phase 2: minimax &amp; alpha-beta pruning
    553 </p>
    554 
    555 <p>
    556 Phase 1: PIMC sampling
    557 </p>
    558 
    559 <p>
    560 what next? give the agent information about the game
    561 </p>
    562 
    563 <div id="State space search-Search direction"><h2 id="Search direction">Search direction</h2></div>
    564 <p>
    565 Data-driven: start with initial state (e.g. a maze)
    566 </p>
    567 
    568 <p>
    569 Goal-driven: start with goal state, but has bigger branching factor (<span class="todo">TODO</span> confirm this)
    570 </p>
    571 
    572 </body>
    573 </html>