<!DOCTYPE html>
<html>
<head>
<link rel="Stylesheet" type="text/css" href="style.css">
<script type="text/javascript" async src="https://cdn.jsdelivr.net/gh/mathjax/MathJax@2.7.5/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<title>state-space-search</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>

<div id="State space search"><h1 id="State space search">State space search</h1></div>
<p>
How do we find the solutions to the previous problems?
</p>
<ul>
<li>
search the state space

<li>
search through explicit tree generation: the root is the initial state, nodes/leaves are generated through the successor function

<li>
search generates a graph

</ul>

<p>
state: representation of a physical configuration
</p>

<p>
node: data structure in the search tree. it contains state, parent node, actions, etc.
</p>

<div id="State space search-Uninformed search strategies"><h2 id="Uninformed search strategies">Uninformed search strategies</h2></div>
<p>
a strategy defines the order in which nodes are expanded
</p>

<p>
uninformed methods: only use the problem definition
</p>

<p>
informed methods: use a heuristic function to estimate the cost of a solution
</p>

<p>
evaluating performance:
</p>
<ul>
<li>
does it always find a solution if there is one? (completeness)

<li>
does it find the least-cost solution? (optimality)

<li>
how many nodes are generated/expanded? (time complexity)

<li>
how many nodes are stored in memory during search? (space complexity)

<li>
time/space complexity is measured in terms of:

<ul>
<li>
b: max branching factor of the search tree

<li>
d: depth of the least-cost solution

<li>
m: max depth of the state space (could be infinite)

</ul>
</ul>

<p>
<img src="img/search-alg-summary.png" alt="Summary of uninformed search algorithms" />
</p>

<div id="State space search-Uninformed search strategies-Breadth-first (BF) search"><h3 id="Breadth-first (BF) search">Breadth-first (BF) search</h3></div>
<ul>
<li>
algorithm:

<ul>
<li>
expand the shallowest unexpanded node

<li>
implementation: the fringe (nodes that still have to be explored) is a FIFO queue

</ul>
<li>
evaluation:

<ul>
<li>
completeness: yes, if the branching factor b is finite

<li>
time complexity: if every state has b successors and the solution is at depth d, then \(O(b^{d+1})\), because of the number of nodes generated

<li>
space complexity: problematic, since every node has to be kept in memory -- \(O(b^{d+1})\)

<li>
optimality: in general yes, unless actions have different costs

</ul>
<li>
memory requirements are a bigger problem than execution time

</ul>

<div id="State space search-Uninformed search strategies-Depth-first (DF) search"><h3 id="Depth-first (DF) search">Depth-first (DF) search</h3></div>
<ul>
<li>
algorithm:

<ul>
<li>
expand the deepest unexpanded node

<li>
implementation: the fringe is a stack (a short code sketch of both fringe variants follows below)

</ul>
<li>
evaluation:

<ul>
<li>
completeness: no, unless the search space is finite and no loops are possible

<li>
time complexity: \(O(b^m)\) -- bad if m is much larger than d (the depth of the optimal solution), but if there are many solutions, it can be faster than BF search

<li>
space complexity: \(O(bm)\), i.e. linear; backtracking search uses even less memory (keep one successor instead of all b)

<li>
optimality: no, same issues as completeness

</ul>
</ul>
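<p>
To make the fringe idea concrete, here is a minimal Python sketch (not from the original notes) of the shared tree-search skeleton: BF search and DF search differ only in whether the fringe is used as a FIFO queue or as a stack. The Node class and the successor-function signature are assumptions for illustration.
</p>

<pre>
from collections import deque

# a node wraps a state plus the bookkeeping needed to reconstruct the solution path
class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action

def tree_search(initial_state, goal_test, successors, strategy="bfs"):
    fringe = deque([Node(initial_state)])        # nodes that still have to be explored
    while fringe:
        # BF search: expand the shallowest node (FIFO); DF search: the deepest node (LIFO)
        node = fringe.popleft() if strategy == "bfs" else fringe.pop()
        if goal_test(node.state):
            return node                          # follow .parent links to recover the path
        for action, state in successors(node.state):
            fringe.append(Node(state, parent=node, action=action))
    return None                                  # no solution found
</pre>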
<div id="State space search-Uninformed search strategies-Depth-limited search"><h3 id="Depth-limited search">Depth-limited search</h3></div>
<ul>
<li>
DF search with a depth limit l (nodes at depth l have no successors)

<li>
solves the infinite-path problem

<li>
evaluation:

<ul>
<li>
time: \(O(b^l)\)

<li>
space: \(O(bl)\)

<li>
completeness: not if l < d

<li>
optimality: not if l > d

</ul>
</ul>

<div id="State space search-Uninformed search strategies-Iterative deepening search"><h3 id="Iterative deepening search">Iterative deepening search</h3></div>
<ul>
<li>
strategy to find the best depth limit l

<li>
often used in combination with DF search

<li>
after each iteration, throw everything away and increase the depth limit

<li>
combines the benefits of DF search (space complexity) and BF search (time complexity)

<li>
evaluation:

<ul>
<li>
completeness: yes (no infinite paths)

<li>
time: \(O(b^d)\)

<li>
space: \(O(bd)\)

</ul>
</ul>

<div id="State space search-Informed search (heuristics)"><h2 id="Informed search (heuristics)">Informed search (heuristics)</h2></div>

<p>
Heuristic function: "educated guess that reduces/limits search for solutions"
</p>

<p>
informedness property of heuristics:
</p>
<ul>
<li>
take two heuristics \(h_1(n),\; h_2(n)\) with \(0 \leq h_1(n) \leq h_2(n) \leq h^*(n)\)

<li>
then \(h_2(n)\) is more informed than \(h_1(n)\)

<li>
with \(h_2\), fewer nodes have to be searched than with \(h_1\)

<li>
but \(h_2\) is often more expensive to calculate

<li>
perfect heuristic: \(h(n) = h^*(n)\)

<li>
trivial heuristic: \(h(n) = 0\)

</ul>

<p>
Best-first search
</p>
<ul>
<li>
the general approach of informed search

<li>
node selected for expansion based on an <em>evaluation function f(n)</em>

<li>
the evaluation function estimates the distance to the goal; choose the node which appears best

<li>
fringe is a queue sorted in decreasing order of desirability (i.e. a priority queue)

</ul>

<div id="State space search-Informed search (heuristics)-A Search"><h3 id="A Search">A Search</h3></div>
<p>
best-known form of best-first search
</p>

<p>
avoid expanding paths that are already expensive
</p>

<p>
evaluation function: \(f(n) = g(n) + h(n)\)
</p>
<ul>
<li>
g(n): the cost so far to reach node n

<li>
h(n): estimated cost to get from node n to the goal

<li>
f(n): estimated total cost of the path through node n to the goal

</ul>

<div id="State space search-Informed search (heuristics)-A* Search"><h3 id="A* Search">A* Search</h3></div>

<p>
A search, but with an admissible heuristic
</p>
<ul>
<li>
a heuristic is admissible if it <em>never</em> overestimates the cost to reach the goal

<li>
admissible heuristics are optimistic

<li>
formally: \(h(n) \leq h^*(n)\) where \(h^*(n)\) is the true cost from n to the goal

</ul>

<p>
evaluation:
</p>
<ul>
<li>
complete: yes

<li>
time: exponential in the length of the solution path

<li>
space: all nodes are stored

<li>
optimal: yes

</ul>
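<p>
A minimal A* sketch in Python (added here, not part of the original notes), assuming the problem is given as a successor function that yields (action, next_state, step_cost) triples and an admissible heuristic h(state); all names are illustrative.
</p>

<pre>
import heapq
import itertools

def a_star(start, goal_test, successors, h):
    counter = itertools.count()          # tie-breaker so the heap never compares states
    # each fringe entry: (f = g + h, tie-breaker, g, state, path of actions)
    fringe = [(h(start), next(counter), 0, start, [])]
    best_g = {start: 0}                  # cheapest known cost g(n) per state
    while fringe:
        f, _, g, state, path = heapq.heappop(fringe)
        if goal_test(state):
            return path, g               # with an admissible h, the first goal popped is optimal
        if g > best_g.get(state, float("inf")):
            continue                     # stale entry: this state was already reached more cheaply
        for action, nxt, cost in successors(state):
            g2 = g + cost
            if g2 >= best_g.get(nxt, float("inf")):
                continue                 # an equally good or better path to nxt is already known
            best_g[nxt] = g2
            heapq.heappush(fringe, (g2 + h(nxt), next(counter), g2, nxt, path + [action]))
    return None, float("inf")            # no path to a goal
</pre>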
<div id="State space search-Adversarial search"><h2 id="Adversarial search">Adversarial search</h2></div>
<p>
search has no adversary; the solution is a (heuristic) method for finding a goal
</p>

<p>
games have an adversary; the solution is a strategy. time limits force an <em>approximate</em> solution.
</p>

<p>
you need a function to evaluate the "goodness" of a game position
</p>

<p>
types of games:
</p>

<table>
<tr>
<th>

</th>
<th>
deterministic
</th>
<th>
chance
</th>
</tr>
<tr>
<td>
perfect information
</td>
<td>
chess, checkers, go, othello
</td>
<td>
backgammon, monopoly
</td>
</tr>
<tr>
<td>
imperfect information
</td>
<td>

</td>
<td>
bridge, poker, scrabble, nuclear war
</td>
</tr>
</table>

<div id="State space search-Adversarial search-Minimax"><h3 id="Minimax">Minimax</h3></div>

<div id="State space search-Adversarial search-Minimax-Setup"><h4 id="Setup">Setup</h4></div>
<p>
two players: MAX, MIN
</p>

<p>
MAX moves first, then they take turns until the game is over. the winner gets a reward, the loser gets a penalty.
</p>

<p>
how does this relate to search?
</p>
<ul>
<li>
initial state: the game configuration, e.g. the board setup in chess

<li>
successor function: list of &lt;move, state&gt; pairs with legal moves

<li>
terminal test: is the game finished?

<li>
utility function: numerical value of terminal states (e.g. win +1, lose -1, draw 0)

<li>
MAX uses a search tree to determine the next move

</ul>

<div id="State space search-Adversarial search-Minimax-Optimal strategies"><h4 id="Optimal strategies">Optimal strategies</h4></div>

<p>
find a contingent strategy for MAX, assuming an infallible MIN.
</p>

<p>
assume that both players play optimally.
</p>

<p>
given the game tree, the optimal strategy can be found from the minimax value of each node:
</p>

<pre>
minimax(n) = utility(n)                              if n is a terminal node
             max of minimax(s) over successors s     if n is a MAX node
             min of minimax(s) over successors s     if n is a MIN node
</pre>

<div id="State space search-Adversarial search-Minimax-Evaluation"><h4 id="Evaluation">Evaluation</h4></div>
<ul>
<li>
complete: yes

<li>
time: \(O(b^m)\)

<li>
space: \(O(bm)\)

<li>
optimal: yes

</ul>

<div id="State space search-Adversarial search-Reducing problems of complexity with Minimax"><h3 id="Reducing problems of complexity with Minimax">Reducing problems of complexity with Minimax</h3></div>

<div id="State space search-Adversarial search-Reducing problems of complexity with Minimax-Cutting off search:"><h4 id="Cutting off search:">Cutting off search:</h4></div>
<p>
instead of <code>if TERMINAL(state) then return UTILITY(state)</code> do <code>if CUTOFF-TEST(state, depth) then return EVAL(state)</code>
</p>

<p>
this introduces a fixed depth limit, but also loses completeness!
</p>

<p>
utility: value based on the quality of the state
</p>

<p>
heuristic: value based on an estimate of the quality of the state
</p>

<p>
heuristic EVAL:
</p>
<ul>
<li>
produces an estimate of the expected utility of the game from a given position

<li>
should order terminal nodes in the same way as the utility function

<li>
computation shouldn't take too long

</ul>

<div id="State space search-Adversarial search-Reducing problems of complexity with Minimax-Alpha-Beta pruning (efficient Minimax)"><h4 id="Alpha-Beta pruning (efficient Minimax)">Alpha-Beta pruning (efficient Minimax)</h4></div>

<p>
with minimax, the number of states is exponential in the number of moves
</p>

<p>
so don't examine every node: prune the subtrees you don't have to examine
</p>

<p>
Alpha: value of the best MAX choice so far
</p>

<p>
Beta: value of the best MIN choice so far
</p>

<p>
you prune the rest of the level if, at any point, beta &lt;= alpha.
</p>

<p>
pruning doesn't affect the final result; entire subtrees can be pruned
</p>

<p>
good move ordering improves the effectiveness of pruning
</p>

<p>
with 'perfect ordering', time complexity is \(O(b^{m/2})\)
</p>
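<p>
A compact Python sketch of minimax with alpha-beta pruning and a depth cutoff (added here, not from the original notes). The game interface (is_terminal, utility, eval, successors) is an assumption for illustration; the pruning condition mirrors the "beta &lt;= alpha" rule above.
</p>

<pre>
import math

# assumed game interface: is_terminal(state), utility(state), eval(state), successors(state)
def alpha_beta(state, game, depth, alpha=-math.inf, beta=math.inf, maximizing=True):
    if game.is_terminal(state):
        return game.utility(state)      # exact value of a finished game
    if depth == 0:
        return game.eval(state)         # CUTOFF-TEST reached: heuristic EVAL of the position
    if maximizing:                      # MAX node
        value = -math.inf
        for move, child in game.successors(state):
            value = max(value, alpha_beta(child, game, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:           # beta cutoff: MIN will avoid this branch anyway
                break
        return value
    value = math.inf                    # MIN node
    for move, child in game.successors(state):
        value = min(value, alpha_beta(child, game, depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:               # alpha cutoff: MAX already has a better option
            break
    return value
</pre>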
<div id="State space search-Adversarial search-Search with no or partial information"><h3 id="Search with no or partial information">Search with no or partial information</h3></div>

<p>
problems:
</p>
<ul>
<li>
contingency: percepts provide new information

<li>
exploration: when the states/actions of the environment are unknown

<li>
sensorless/conformant: the agent may have no idea where it is

</ul>

<div id="State space search-Adversarial search-Search with no or partial information-Perfect information Monte Carlo sampling (rdeep)"><h4 id="Perfect information Monte Carlo sampling (rdeep)">Perfect information Monte Carlo sampling (rdeep)</h4></div>
<ol>
<li>
rollout:

<ul>
<li>
assume a belief state (with perfect info)

<li>
play a random game in that state.

</ul>
<li>
average the rollouts

<li>
choose the move with the max average

</ol>

<div id="State space search-Adversarial search-Games with chance"><h3 id="Games with chance">Games with chance</h3></div>
<p>
<img src="img/games-chance.png" alt="Games with chance" />
</p>

<div id="State space search-Summary (Schnapsen)"><h2 id="Summary (Schnapsen)">Summary (Schnapsen)</h2></div>
<p>
Phase 2: minimax & alpha-beta pruning
</p>

<p>
Phase 1: PIMC sampling
</p>

<p>
what next? give the agent information about the game
</p>

<div id="State space search-Search direction"><h2 id="Search direction">Search direction</h2></div>
<p>
Data-driven: start with the initial state (e.g. a maze)
</p>

<p>
Goal-driven: start with the goal state, but this has a bigger branching factor (<span class="todo">TODO</span> confirm this)
</p>

</body>
</html>