Tokens
A token is a pair consisting of a token name and an optional attribute value.
<token_name, attribute_value (optional)>
The token name is an abstract symbol for a kind of lexical unit, such as a keyword, identifier, constant, operator or punctuation symbol.
Patterns
A pattern describes the form that the lexemes of a token may take.
Regular expressions are used for representing the patterns of lexemes.
Lexemes
A sequence of characters in the source program that matches the pattern for a token. For example, in the statement count = 10 the lexeme count matches the pattern for an identifier.
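The relationship between tokens, patterns and lexemes can be sketched with a small scanner. This is only an illustrative sketch: the token names, the keyword list and the regular expressions in TOKEN_SPEC are assumptions made for the example, not a fixed standard, and the attribute value is simply taken to be the lexeme itself.

```python
import re

# Hypothetical token names and patterns, chosen only for illustration.
# Each entry pairs a token name with the pattern its lexemes must match.
TOKEN_SPEC = [
    ("keyword",    r"\b(?:if|else|while|return)\b"),
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("constant",   r"\d+"),
    ("operator",   r"[+\-*/=<>]"),
    ("skip",       r"\s+"),  # whitespace: matched but not emitted as a token
]

# Combine the patterns into one regular expression with named groups.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_name, lexeme) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if m.lastgroup != "skip":
            # m.lastgroup is the token name; m.group() is the matched lexeme.
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("while count = 10"))
```

Here the keyword pattern is listed before the identifier pattern so that a lexeme such as while is reported as a keyword rather than as an identifier. A production scanner would more likely store a symbol-table reference as the attribute value instead of the raw lexeme.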
Terminology to Remember:
Alphabet
A finite set of symbols, denoted Σ. For example, {0, 1} is the binary alphabet.
Symbols
The individual elements of an alphabet. Letters, digits and punctuation marks are typical examples.
String
A finite sequence of symbols drawn from an alphabet.
Language
Any countable set of strings over some fixed alphabet. The set may be infinite.
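The four terms above can be made concrete with a short sketch. The binary alphabet SIGMA and the example language "binary strings that end in 1" are assumptions chosen for illustration, as are the helper names.

```python
from itertools import product

# Alphabet: a finite set of symbols (here the binary alphabet, an assumed example).
SIGMA = {"0", "1"}


def is_string_over(s, alphabet):
    """A string over an alphabet is a finite sequence of its symbols."""
    return all(ch in alphabet for ch in s)


def strings_up_to(alphabet, max_len):
    """Yield every string over the alphabet of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for combo in product(sorted(alphabet), repeat=n):
            yield "".join(combo)


def in_language(s):
    """Membership test for a hypothetical language: binary strings ending in 1."""
    return is_string_over(s, SIGMA) and s.endswith("1")


print(list(strings_up_to(SIGMA, 2)))
print([s for s in strings_up_to(SIGMA, 3) if in_language(s)])
```

The language in this sketch is infinite, so it is described by a membership test rather than by listing its strings. Only a finite portion of it can be enumerated at a time.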