Introduction to the Theory of Computation | Context-Free Languages
The sequence of substitutions to obtain a string is called a derivation.
Definition: A context-free grammar is a 4-tuple , where
- is a finite set called the variables
- is a finite set, disjoint from , called the terminals
- is a finite set of rules, with each rule being a variable and a string of variables and terminals
- is the start variable
The language of the grammar is
Method: Convert DFA into equivalent CFG.
- make a variable for each state of the DFA
- add the rule to the CFG if
- add the rule to the CFG if is an acceptstate of the DFA.
- make the start variable of the grammar, where is the start state of the machine.
Definition: A string is derived ambiguously in context-free grammar if it has two or more different leftmost derivations. Grammar G is ambiguous if it generates some string ambiguously.
Definition: Languages that can be generated only by ambiguous grammars are called inherently ambiguous.
Definition: A context-free grammar is in Chomsky normal form if every rule is of the form
where is any terminal and , , and are any variables—except that and may not be the start variable. In addition, we permit the rule , where is the start variable.
We can convert any grammar into Chomsky normal form.
- 去除无用符号,去除空产生式,去除单位产生式
- 把所有长度大于一的生成式全部改成变元,比如 改成
- 通过二分的方法把所有长度大于 2 的产生式改成 2 的,比如 改成 其中
乔姆斯基的一大作用是将分析树变成二叉树: 如果分析树的高度为 那么产出的最大也就是前 层是满二叉树的情况,产出长度为
Definition: A pushdown automaton is a 6-tuple , where , , , and are all finite sets, and
- is the set of states
- is the input alphabet
- is the stack alphabet
- is the transition function
- is the start state
- is the set of accept states
Theorem: A language is context free if and only if some pushdown automaton recognizes it.
Pumping lemma for context-free languages states that, if is a context-free language, then there is a number (the pumping length) where, if is any string in of length at least , then may be divided into five pieces satisfying the conditions
Nondeterministic pushdown automata are more powerful than their deterministic counterparts. The languages that are recognizable by deterministic pushdown automata (DPDAs) are called deterministic context-free languages (DCFLs).
Lemma: Every DPDA has an equivalent DPDA that always reads the entire input string.
For simplicity, we'll assume henceforth that DPDAs read their input to the end.
The class of DCFLs is closed under complementation.