lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git
Log | Files | Refs | Submodules

index.md (5631B)


      1 +++
      2 title = "The relational model"
      3 +++
      4 
      5 # The relational model
      6 
      7 ## Concepts: schema, state
      8 
      9 ![Database schema tree](database-schema-tree.png)
     10 
     11 ### Data types
     12 
     13 all table entries are data values that conform to some selection of data
     14 types.
     15 
     16 example data types:
     17 
     18 -   strings:
     19     -   `VARCHAR(n)` strings up to n characters, use this if you know a
     20         certain maximum length
     21     -   `TEXT` variable length up to 2 GB (but does not always take up 2
     22         GB)
     23 -   numbers:
     24     -   `INT` integer
     25     -   `NUMERIC(n)` decimal number with n digits
     26 -   date and time
     27 -   binary data, like a `BLOB` (binary large object)
     28 
     29 domain `val(D)` of a type D is the set of possible values for that type
     30 
     31 with SQL, we can create our own application-specific domains (new data
     32 types). examples:
     33 
     34 -   `CREATE DOMAIN EXNUM AS NUMERIC(2)`
     35 -   `CREATE DOMAIN EXNUM AS NUMERIC(2) CHECK(VALUE > 0)`
     36 
     37 ### Relation schema
     38 
     39 relation schema s (schema of single relation) defines:
     40 
     41 -   finite sequence A₁..An of distinct attribute names
     42 -   for each attribute Ai a data type (domain) Di
     43 
     44 written as: s = { A₁ : D₁, \..., An : Dn }
     45 
     46 SQL notation:
     47 
     48 ```sql
     49 CREATE TABLE exercises
     50     (CAT    CHAR(1),
     51      ENO    NUMERIC(2),
     52      TOPIC  VARCHAR(40),
     53      MAXPT  NUMERIC(2))
     54 ```
     55 
     56 a relational database schema S defines:
     57 
     58 -   finite set of relation names {R₁, \..., Rm}
     59 -   for every relation R a relation schema sch(R)
     60 -   a set of C integrity constraints
     61 
     62 so S = ({R₁, \..., Rm}, sch, C)
     63 
     64 ### Tuples
     65 
     66 used to formalize table rows. so e.g. in table exercises:
     67 `(’H’, 1, ’Rel.Alg.’, 10)`
     68 
     69 ### Database states
     70 
     71 let a database schema ({R₁, \..., Rm}, schema, C) be given.
     72 
     73 a database state I for this database schema defines for every relation
     74 name Ri to a finite set of tuples I(Ri) w.r.t the relation schema
     75 Schema(Ri)
     76 
     77 ## Null values
     78 
     79 relational model allows missing attribute values.
     80 
     81 these are represented by `NULL`. not the same as 0 or an empty string.
     82 
     83 used to model scenarios like:
     84 
     85 -   a value exists IRL but is not known
     86 -   no value exists
     87 -   attribute is not applicable
     88 -   any value will do
     89 
     90 without null values, you\'d have to split a relation into specific
     91 relations, like `student_with_email` and `student_without_email`. or the
     92 users would have to make up a fake value, which would not be uniform and
     93 you\'d have no idea how to query.
     94 
     95 problems:
     96 
     97 -   no clear semantics, since null values can be used in many scenarios
     98 -   SQL has three-valued logic (true, false, unknown), so if you want to
     99     check for null, you have to say `WHERE A IS NULL`
    100 
    101 so declaring many attributes as NOT NULL simplifies the program, but
    102 only do that if you\'re sure that there\'s always a value.
    103 
    104 ## Constraints
    105 
    106 the database should be an image of the relevant subset of the real
    107 world. plain definition of tables often allows too many database states.
    108 
    109 Integrity constraints (IC): conditions which every database state has to
    110 satisfy, restricting the set of possible database states.
    111 
    112 In `CREATE TABLE`, possible constraints:
    113 
    114 -   `NOT NULL`: no value in this column can be null
    115 -   `Keys`: each key value can only appear once
    116 -   `Foreign keys`: values in a column must appear as key values in
    117     another table
    118 -   `CHECK`: column values have to satisfy a given predicate
    119 
    120 Why specify constraints:
    121 
    122 -   some protection against data input errors
    123 -   document knowledge about database states
    124 -   enforce law/company standards
    125 -   protect against inconsistency
    126 -   simplify queries in programs
    127 
    128 ## Keys
    129 
    130 key of relation R is attribute A that uniquely identifies tuples in R.
    131 this refers to all possible database states, not just the current one.
    132 
    133 key constraint is satisfied in database state iff all tuples have
    134 different values for A.
    135 
    136 once something is declared as a key, the DBMS refuses insertion of
    137 tuples with duplicate key values.. this refers to all possible database
    138 states, not just the current one.
    139 
    140 keys can consist of several attributes, then they are composite keys. if
    141 columns A,B form a composite key, there cannot be two tuples t ≠ u which
    142 agree in both attributes. keys should never change.
    143 
    144 a key constraint becomes weaker (less restrictive) if attributes are
    145 added to it.
    146 
    147 a key is minimal if no attribute can be removed without destroying the
    148 key constraint.
    149 
    150 a relation may have more than one minimal key. one is the primary key,
    151 which cannot be null. other keys are alternate/secondary.
    152 
    153 choosing a primary key: single simple attribute, never updated
    154 
    155 ### Foreign keys
    156 
    157 use key attributes to uniquely reference a tuple, like a pointer.
    158 denoted with arrows:
    159 
    160 ```sql
    161 RESULTS (SID → STUDENTS,
    162         (CAT, ENO) → EXERCISES,
    163         POINTS)
    164         STUDENTS (SID,FIRST,LAST,EMAIL)
    165         EXERCISES (CAT,ENO,TOPIC,MAXPT)
    166 ```
    167 
    168 to refer from relation R to tuples of S, add primary key attributes of S
    169 to attributes of R. only stable if the logical \'address\' of a tuple
    170 does not change.
    171 
    172 implements a one-to-many relationship.
    173 
    174 an existence guarantee is needed.
    175 
    176 foreign key constraints in SQL:
    177 
    178 ```sql
    179 CREATE TABLE RESULTS (
    180     ...
    181     FOREIGN KEY (SID) REFERENCES STUDENTS(SID)
    182 )
    183 ```
    184 
    185 this ensures that every referenced row exists, which ensures referential
    186 integrity of the database.
    187 
    188 these operations violate the constraints:
    189 
    190 -   insertion into table with foreign key without matching referenced
    191     tuple. DBMS rejects update.
    192 -   deletion from table if the tuple is referenced via foreign key.
    193     either DBMS rejects, both are deleted (cascade), or foreign key is
    194     set to null.
    195 
    196 only keys may be referenced. a table with composite key must be
    197 referenced by composite key that has same number of attributes.