lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
Log | Files | Refs | Submodules

sql-overview.md (3076B)


      1 +++
      2 title = "SQL Overview"
      3 +++
      4 
      5 # SQL Overview
      6 
      7 ## Basics
      8 
      9 Basic query: `SELECT columns FROM table WHERE condition is true`
     10 
     11 The `FROM` clause is like declaring variables that range over tuples of
     12 a relation:
     13 
     14 ```sql
     15 SELECT  E.ENO, E.TOPIC  -- choose columns ENO, TOPIC
     16 FROM    EXERCISES E     -- table exercises, E being an 'alias' for the current row
     17 WHERE   E.CAT = 'H'     -- where the column CAT contains the value 'H'
     18 ```
     19 
     20 If name of tuple variable (\'E\' in the code above) is not given
     21 explicitly, the variable will have the name of the relation
     22 (\'EXERCISES\' in the code above).
     23 
     24 A reference to attribute A of variable R may be written as `A` if R is
     25 *the only tuple variable* with an attribute named A.
     26 
     27 It is almost always an error if there are two tuple variables that
     28 aren\'t linked via join conditions.
     29 
     30 Don\'t join more tables than needed.
     31 
     32 In some scenarios, we might have to consider more than one tuple of the
     33 same relation to get a result tuple.
     34 
     35 Duplicates have to be explicitly eliminated, using `DISTINCT`.
     36 
     37 Sufficient condition for superfluous DISTINCT, where K is set of
     38 attributes uniquely determined by result.
     39 
     40 1.  Assume WHERE clause is a conjunction. Let K be the set of attributes
     41     in the SELECT clause.
     42 2.  Add to K attributes A s.t.:
     43     -   A = c for a constant c is in the WHERE clause
     44     -   A = B for B ∈ K is in the WHERE clause
     45     -   if K contains key of a tuple variable, add all attributes of
     46         that variable
     47 3.  Repeat 2 until K is stable.
     48 4.  If K contains a key of every tuple variable listed under `FROM`,
     49     then `DISTINCT` is superfluous.
     50 
     51 Typical mistakes:
     52 
     53 -   missing join conditions
     54 -   unnecessary joins (may slow down query)
     55 -   self joins with incorrect treatment of multiple tuple variables over
     56     same relation, missing conditions
     57 -   unexpected duplicates
     58 -   unnecessary `DISTINCT`
     59 
     60 ## Subqueries & non-monotonic constructs
     61 
     62 monotonic: if further rows get inserted, the queries yield a superset of
     63 rows
     64 
     65 non-monotonic: \'there is no\', \'does not exist\', \'for all\',
     66 \'min/max\'. testing whether or not a query yields an empty result.
     67 
     68 example to select students without any homework result:
     69 
     70 ```sql
     71 SELECT  FIRST, LAST
     72 FROM    STUDENTS
     73 WHERE   SID NOT IN (SELECT  SID
     74                     FROM    RESULTS
     75                     WHERE   CAT = 'H')
     76 ```
     77 
     78 conceptually, the subquery (`SELECT SID...`) is evaluated before main
     79 query.
     80 
     81 constructs:
     82 
     83 -   `NOT IN`: something is not in result of subquery
     84 -   `NOT EXISTS`: result of subquery is empty
     85 -   \"For all\":
     86     -   no universal quantifier in SQL, but you can use `EXISTS`. In
     87         logic, `∀X(φ) ↔ ¬∃ X (¬ φ)`.
     88     -   common pattern is `∀X (φ₁ → φ₂)` becomes `¬∃ X (φ₁ ∧ ¬ φ₂)`
     89 
     90 Comparing values:
     91 
     92 -   `ALL`: compare with all values in a set, has to be true for all
     93 -   `ANY`/`SOME`: has to be true for at least some values in a set
     94     (also, `x IN s` == `x = ANY s`)
     95 -   for both, subquery has to return a single column
     96 -   comparisons with subquery results (`POINTS = (SELECT ...)`) must
     97     return a single row