+++
title = "Lecture 4"
template = "page-math.html"
+++

# Elementary properties of regular languages
If L₁, L₂, L are regular languages, then so are: L₁ ∪ L₂, L₁ ∩ L₂, L₁L₂, $\bar{L}$, L₁ \ L₂, $L^*$, $L^R$

Membership is decidable (i.e. whether a word u is a member of a regular language L):
1. Represent L as a DFA M. Here you might run into practical issues with a potentially exponential number of states, so only generate the states that are visited when reading u (on-the-fly).
2. Check if u is accepted by M.

Emptiness is decidable (i.e. whether a regular language L is empty):
1. Construct a DFA/NFA M with L(M) = L.
2. Check if M has a path from the starting state to a final state.
3. If yes, then L ≠ ∅. Else, L = ∅.

Subsets are decidable (i.e. whether L₁ ⊆ L₂ for regular languages L₁, L₂):
1. L₁ ⊆ L₂ ↔ L₁ \ L₂ = ∅
2. The language L₁ \ L₂ is regular (regular languages are closed under difference).
3. So its emptiness is decidable, and therefore so is L₁ ⊆ L₂.

Equivalence is decidable (i.e. whether two regular languages are equal):
1. L₁ = L₂ ↔ (L₁ ⊆ L₂) ∧ (L₂ ⊆ L₁)
2. The RHS of the bi-implication is decidable.

# Word (string) matching
"For input word u and regex r, does u contain a subword in L(r)?"

Algorithm (used in Unix's `grep`):
1. Transform the regex $\Sigma^* \cdot r$ into an NFA.
2. Compute the path of u in the corresponding DFA on-the-fly.
3. Terminate as soon as a final state is reached.

Worst-case time complexity of O(|r|⋅|u|)

# Minimising DFAs
Joerg's explanation is too abstract for me, [I recommend this video](https://www.youtube.com/watch?v=1GZOzTJOBuM).
It's also a nicer algorithm imo, not as much guess-and-check. But sadly it's not accepted on iSubmit.

The one Joerg wants us to use, in English:
1. Split up the states of the DFA into two sets: final, and nonfinal. Also, making a transition table for the DFA might be useful.
2. Split the sets repeatedly:
    1. Take two of the sets of states (at the start, you can only choose final and nonfinal). One is the source, the other the target; which one is up to you.
    2. Pick a symbol in Σ.
    3. Split the source set into two sets of states:
        * those that can reach a state in the target set via a transition on the chosen symbol, and
        * those that cannot.
3. Once no more splitting is possible, you have a minimal DFA.
4. Convert the sets into a DFA diagram:
    * each set of states becomes one state on the minimal DFA,
    * if a state was final on the original DFA, any set containing that state becomes final on the minimal DFA
    * remember to mark the initial state - the set containing the state that was initial on the original DFA
5. Enjoy your free points

# Lexical analysis
Converts a sequence of characters into a sequence of tokens.

How?
* regular expressions - every regex corresponds to a token
* lexical analysis searches for the longest prefix of the input that matches one of the regexes, and that prefix is transformed into a token
* when no prefix matches, you get an error
* when several regexes match the same longest prefix, one of them is chosen

# Non-regular languages
L = { aⁿbⁿ | n ≥ 0 } is not regular; proof by contradiction.

You can also use the pumping lemma:
* Let L be a regular language
* there exists m > 0 such that every w ∈ L with |w| ≥ m
* can be written as w = xyz
* with |xy| ≤ m and |y| ≥ 1 and $xy^i z \in L$ for every i ≥ 0

In English: every word that is long enough has a non-empty section within its first m symbols that can be repeated an arbitrary number of times (or removed), and the resulting word is also part of the language.

If a language is regular, it _always_ satisfies the pumping lemma. So, by contraposition, if you show that a language does not satisfy the pumping lemma, you have proved that it is not regular.
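
For L = { aⁿbⁿ | n ≥ 0 }, the pumping lemma argument can be checked by brute force for small m. This is only a sketch to illustrate the idea, not a proof: the names `in_anbn`, `in_ab_star` and `pumpable_split`, and the bound `max_i`, are made up for this example. For each candidate m it takes the word aᵐbᵐ, tries every split w = xyz with |xy| ≤ m and |y| ≥ 1, and checks whether pumping stays inside the language.

```python
from typing import Callable, Optional, Tuple


def in_anbn(w: str) -> bool:
    """Membership test for L = { a^n b^n | n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n


def in_ab_star(w: str) -> bool:
    """Membership test for the regular language (ab)*, used for contrast."""
    return w == "ab" * (len(w) // 2)


def pumpable_split(
    w: str, m: int, member: Callable[[str], bool], max_i: int = 3
) -> Optional[Tuple[str, str, str]]:
    """Look for a split w = xyz with |xy| <= m and |y| >= 1 such that
    x y^i z is in the language for every i in 0..max_i.

    Returning None is conclusive: every allowed split fails for some i.
    Returning a split is NOT conclusive, since only finitely many i are tried.
    """
    for xy_len in range(1, min(m, len(w)) + 1):
        for x_len in range(xy_len):
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if all(member(x + y * i + z) for i in range(max_i + 1)):
                return (x, y, z)
    return None


# For a^m b^m, y always consists of a's only, so pumping breaks the balance.
for m in range(1, 6):
    print(m, pumpable_split("a" * m + "b" * m, m, in_anbn))
```

For comparison, the same search on the regular language (ab)* does find a split: `pumpable_split("abab", 2, in_ab_star)` gives x = "", y = "ab", z = "ab".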