sql-overview.md (3076B)
1 +++ 2 title = "SQL Overview" 3 +++ 4 5 # SQL Overview 6 7 ## Basics 8 9 Basic query: `SELECT columns FROM table WHERE condition is true` 10 11 The `FROM` clause is like declaring variables that range over tuples of 12 a relation: 13 14 ```sql 15 SELECT E.ENO, E.TOPIC -- choose columns ENO, TOPIC 16 FROM EXERCISES E -- table exercises, E being an 'alias' for the current row 17 WHERE E.CAT = 'H' -- where the column CAT contains the value 'H' 18 ``` 19 20 If name of tuple variable (\'E\' in the code above) is not given 21 explicitly, the variable will have the name of the relation 22 (\'EXERCISES\' in the code above). 23 24 A reference to attribute A of variable R may be written as `A` if R is 25 *the only tuple variable* with an attribute named A. 26 27 It is almost always an error if there are two tuple variables that 28 aren\'t linked via join conditions. 29 30 Don\'t join more tables than needed. 31 32 In some scenarios, we might have to consider more than one tuple of the 33 same relation to get a result tuple. 34 35 Duplicates have to be explicitly eliminated, using `DISTINCT`. 36 37 Sufficient condition for superfluous DISTINCT, where K is set of 38 attributes uniquely determined by result. 39 40 1. Assume WHERE clause is a conjunction. Let K be the set of attributes 41 in the SELECT clause. 42 2. Add to K attributes A s.t.: 43 - A = c for a constant c is in the WHERE clause 44 - A = B for B ∈ K is in the WHERE clause 45 - if K contains key of a tuple variable, add all attributes of 46 that variable 47 3. Repeat 2 until K is stable. 48 4. If K contains a key of every tuple variable listed under `FROM`, 49 then `DISTINCT` is superfluous. 50 51 Typical mistakes: 52 53 - missing join conditions 54 - unnecessary joins (may slow down query) 55 - self joins with incorrect treatment of multiple tuple variables over 56 same relation, missing conditions 57 - unexpected duplicates 58 - unnecessary `DISTINCT` 59 60 ## Subqueries & non-monotonic constructs 61 62 monotonic: if further rows get inserted, the queries yield a superset of 63 rows 64 65 non-monotonic: \'there is no\', \'does not exist\', \'for all\', 66 \'min/max\'. testing whether or not a query yields an empty result. 67 68 example to select students without any homework result: 69 70 ```sql 71 SELECT FIRST, LAST 72 FROM STUDENTS 73 WHERE SID NOT IN (SELECT SID 74 FROM RESULTS 75 WHERE CAT = 'H') 76 ``` 77 78 conceptually, the subquery (`SELECT SID...`) is evaluated before main 79 query. 80 81 constructs: 82 83 - `NOT IN`: something is not in result of subquery 84 - `NOT EXISTS`: result of subquery is empty 85 - \"For all\": 86 - no universal quantifier in SQL, but you can use `EXISTS`. In 87 logic, `∀X(φ) ↔ ¬∃ X (¬ φ)`. 88 - common pattern is `∀X (φ₁ → φ₂)` becomes `¬∃ X (φ₁ ∧ ¬ φ₂)` 89 90 Comparing values: 91 92 - `ALL`: compare with all values in a set, has to be true for all 93 - `ANY`/`SOME`: has to be true for at least some values in a set 94 (also, `x IN s` == `x = ANY s`) 95 - for both, subquery has to return a single column 96 - comparisons with subquery results (`POINTS = (SELECT ...)`) must 97 return a single row