Stanford CS143 – Compilers 笔记 1

Compiler Structure

  • Lexical Analyzer
  • Parsing
  • Semantic Analysis
  • Optimization
  • Code Generation

Lexical Analyzer


Lexical analysis is achieved by DFA/NFA derived from regular expressions that collect substrings to produce tokens.


Step 1 – Define the finite set of tokens(integer, keyword, relation, identifier, whitespaces, brackets and stuff)

Step 2 – Form an implementation to convert strings to token/lexeme pairs, discard meaningless tokens.

Implementation of a lexical analyzer

Define regular expressions to catch lexical structures, then use DFA, NFA to implement the process of regular expressions.

Procedure – partition into tokens; input the string to get the matching result.

Problems May Emerg:

  • Ambiguity 1: multiple prefixes can be matched. Solution: maximal much policy
  • Ambiguity 2: able to be interpreted as multiple types of tokens. Solution: priority list for token types
  • Error Handling: write a regex to match all errornous strings

Finite Automata


DFA Features:

  • One move leads to one transition exactly.
  • No epsilon-moves

NFA Features:

  • Multiple transitions are allowed with epsilon-moves.

NFA can be converted to DFA with exponentially increasing size in space.(Suppose there are \(n\) states for NFA, \(2^n – 1\) states are possible for generated DFA)

DFA can be represented as a jump table, so it is structure-trivial.





您的电子邮箱地址不会被公开。 必填项已用*标注