Regular Expression Concept!
Regular expressions are used to specify tokens (lexical constructs). A token stands for a pattern of characters that have the same meaning in the source program.
Regular expressions (regexps) are an important notation for describing lexeme patterns. Each regular expression is built from smaller regular expressions, each denoting a language, by applying a small set of defining rules.
A lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes.
A regular set is a language that can be described by a regular expression.
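To make the idea of grouping characters into lexemes concrete, here is a minimal sketch of a lexical analyzer built on Python's re module. The token names (NUMBER, IDENT, OP, SKIP) and their patterns are assumptions chosen for a toy language, not something taken from the text above.

```python
import re

# Hypothetical token classes for a toy language; each one is described
# by a regular expression. These names and patterns are assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"[ \t]+"),
]

# Combine the patterns into one alternation with a named group per token.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Group the characters of source into (token, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("count = count + 42"))
```

Here the two occurrences of the lexeme count both map to the same token, IDENT, which is what "a pattern of characters having the same meaning" amounts to in practice.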
Regular Expression Review!
Symbol: an abstract entity that we won't define formally -> (0, a, …)
Alphabet: a finite set of symbols from which we can build larger structures -> (Σ = {0, a, …})
String: a finite sequence of juxtaposed symbols from a specific alphabet -> (abcd, abd, …)
Formal language: a set of strings over a given alphabet, that is, a subset of Σ*, where Σ* is the set of all strings that can be created from the alphabet -> {set of all strings of length 2} = {ab, ac, ad, …}
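The definitions above can be tried out directly. The following sketch enumerates strings over a small alphabet; the alphabet {a, b} and the helper names strings_of_length and strings_up_to are assumptions made for illustration only.

```python
from itertools import product


def strings_of_length(alphabet, n):
    """Return every string of exactly n symbols from alphabet, in sorted order."""
    return ["".join(symbols) for symbols in product(sorted(alphabet), repeat=n)]


def strings_up_to(alphabet, max_len):
    """Return the finite part of the set of all strings: lengths 0..max_len."""
    result = []
    for n in range(max_len + 1):
        result.extend(strings_of_length(alphabet, n))
    return result


sigma = {"a", "b"}  # assumed example alphabet

# The language "all strings of length 2" over sigma.
print(strings_of_length(sigma, 2))  # ['aa', 'ab', 'ba', 'bb']

# The set of all strings over sigma is infinite, so only a prefix can be listed.
print(strings_up_to(sigma, 2))  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The empty string appears in the second list because a string of length 0 is still a string over the alphabet.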
Regular Expression Rules!
Built up from three operators:
Concatenation xy
Alternation x|y (x or y)
Repetition x* (x repeated 0 or more times); x+ (x repeated 1 or more times) is shorthand for xx*
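The three operators can be checked with Python's re module, whose syntax is a superset of the notation used here. This is only a sketch: the helper name in_language is an assumption, and re.fullmatch is used so that the whole string has to match the pattern.

```python
import re


def in_language(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# Concatenation xy: x followed by y.
print(in_language("ab", "ab"))   # True
print(in_language("ab", "ba"))   # False

# Alternation x|y: either x or y.
print(in_language("a|b", "b"))   # True
print(in_language("a|b", "ab"))  # False

# Repetition x*: zero or more copies of x.
print(in_language("a*", ""))     # True
print(in_language("a*", "aaa"))  # True

# x+ needs at least one copy, so it accepts the same strings as xx*.
print(in_language("a+", ""))     # False
print(in_language("aa*", ""))    # False
print(in_language("a+", "aaa"))  # True
```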
Recursive rules:
Regular expressions over an alphabet Σ can be defined recursively as follows:
Every symbol of Σ is a regular expression
ε is a regular expression
If r1 and r2 are regular expressions, so are (r1), r1r2, r1 | r2, and r1*
Nothing else is a regular expression.
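The recursive rules map naturally onto a recursive data type, with one class per rule. The sketch below is one possible encoding, not a standard API: all class and function names are assumptions, the extra Empty class is a helper the matcher needs, and the matcher uses Brzozowski derivatives without any simplification, so it is only suitable for short inputs. Parentheses need no class of their own because grouping is expressed by the nesting of the objects.

```python
from dataclasses import dataclass


class Regex:
    """Base class for the regular-expression forms."""


@dataclass(frozen=True)
class Empty(Regex):    # helper: matches no string at all
    pass


@dataclass(frozen=True)
class Epsilon(Regex):  # ε: matches only the empty string
    pass


@dataclass(frozen=True)
class Symbol(Regex):   # a single symbol of Σ
    char: str


@dataclass(frozen=True)
class Concat(Regex):   # r1r2
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Alt(Regex):      # r1 | r2
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Star(Regex):     # r1*
    inner: Regex


def nullable(r):
    """True when the language of r contains the empty string."""
    if isinstance(r, (Epsilon, Star)):
        return True
    if isinstance(r, Concat):
        return nullable(r.left) and nullable(r.right)
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    return False  # Empty and Symbol


def derive(r, c):
    """Regex for what may follow c in the strings of r that start with c."""
    if isinstance(r, Symbol):
        return Epsilon() if r.char == c else Empty()
    if isinstance(r, Concat):
        first = Concat(derive(r.left, c), r.right)
        return Alt(first, derive(r.right, c)) if nullable(r.left) else first
    if isinstance(r, Alt):
        return Alt(derive(r.left, c), derive(r.right, c))
    if isinstance(r, Star):
        return Concat(derive(r.inner, c), r)
    return Empty()  # Empty and Epsilon contain no string starting with c


def matches(r, text):
    """True when text belongs to the language described by r."""
    for c in text:
        r = derive(r, c)
    return nullable(r)


# (a|b)*abb, built only from the rules above.
a, b = Symbol("a"), Symbol("b")
pattern = Concat(Star(Alt(a, b)), Concat(a, Concat(b, b)))
print(matches(pattern, "babb"))  # True
print(matches(pattern, "ab"))    # False
```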
Related knowledge:
A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, & Tools, 2nd ed. Boston: Pearson/Addison Wesley, 2007.