Key Components



Tokens

  • A token is represented as a pair consisting of a token_name and an optional attribute value.

<token_name, attribute_value (optional)>

  • token_name names the class of the token, for example keyword, identifier, constant, operator, or punctuation symbol.

Patterns

  • Describes the form that the lexemes of a token may take.

  • Regular expressions are used for representing the patterns of lexemes.

Lexemes

  • Sequence of characters present in the source program that matches the pattern for a token.
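
The relationship between tokens, patterns, and lexemes can be sketched with a small tokenizer. This is only an illustration, not a complete lexer: the token classes, the regular expressions in TOKEN_SPEC, and the tiny keyword list are assumptions chosen for the example, and the lexeme itself stands in for the attribute value.

```python
import re

# Hypothetical token classes and their patterns (illustrative only).
# Order matters: keywords are tried before identifiers.
TOKEN_SPEC = [
    ("keyword",    r"\b(?:if|else|while|return)\b"),
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("constant",   r"\d+"),
    ("operator",   r"[+\-*/=<>]"),
    ("skip",       r"\s+"),  # whitespace produces no token
]

# One combined regular expression with a named group per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_name, lexeme) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {source[pos]!r} at position {pos}"
            )
        if match.lastgroup != "skip":
            # match.group() is the lexeme; match.lastgroup is the token_name.
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("if count = 10"))
```

Here each regular expression is a pattern, each matched substring (such as count or 10) is a lexeme, and each emitted pair is a token. A production lexer would more likely store a symbol-table reference as the attribute of an identifier.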

Terminology to Remember:

Alphabet

  • A finite, non-empty set of symbols, denoted Σ.

Symbols

  • Includes letters, digits, and punctuation marks.

String

  • A finite sequence of symbols drawn from an alphabet.

Language

  • Any countable set of strings over some fixed alphabet; the set may be finite or infinite.
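
The four terms above can be made concrete with a short sketch. The binary alphabet SIGMA, the helper names, and the "even length" language are all assumptions picked for illustration. The example language contains infinitely many strings, yet they can still be listed one at a time.

```python
from itertools import count, islice, product

SIGMA = {"0", "1"}  # an example alphabet (assumed for illustration)


def is_string_over(s, alphabet):
    """True if every symbol of s is drawn from the alphabet."""
    return all(ch in alphabet for ch in s)


def all_strings(alphabet):
    """Yield every string over the alphabet, shortest first (never terminates)."""
    symbols = sorted(alphabet)
    for length in count(0):
        for combo in product(symbols, repeat=length):
            yield "".join(combo)


def even_length_language(alphabet):
    """An example language: all strings over the alphabet with even length."""
    return (s for s in all_strings(alphabet) if len(s) % 2 == 0)


print(is_string_over("0101", SIGMA))                  # a string over SIGMA
print(is_string_over("012", SIGMA))                   # '2' is not a symbol of SIGMA
print(list(islice(all_strings(SIGMA), 7)))            # first few strings over SIGMA
print(list(islice(even_length_language(SIGMA), 5)))   # first few members of the language
```

The empty string appears first in both listings: it is a valid string over any alphabet, and its length of zero is even.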