Back to ComputerTerms, InformationRetrieval

See Also: StopWords

Lexical analysis is the process of converting an input stream of characters into a stream of words or tokens. Tokens are groups of characters with collective significance. Lexical analysis is the first stage of both automated indexing and query processing.
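As a rough illustration of the definition above, the sketch below turns a string of characters into a list of tokens. The token rule used here (lowercased runs of ASCII letters and digits) and the function name `tokenize` are assumptions made for the example; a real indexing system would choose its own rules.

```python
import re

def tokenize(text):
    """Return the tokens in text.

    Assumption for this sketch: a token is a maximal run of ASCII
    letters or digits, folded to lowercase. Everything else is
    treated as a separator and discarded.
    """
    return re.findall(r"[a-z0-9]+", text.lower())

print(tokenize("Lexical analysis: the FIRST stage!"))
```

The same function would be applied to documents at indexing time and to queries at search time, so that both sides produce comparable tokens.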

Issues:

Implementation:

  1. Use a lexical analyzer generator like lex: This is the best approach when the lexical analyzer is complicated.
  2. Write a lexical analyzer by hand - ad hoc: The worst solution; it will likely have subtle errors and may not be efficient.
  3. Write a lexical analyzer by hand as a finite state machine: Presumably a good approach, because this is the one our book chose to implement.
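To make the third option more concrete, here is a minimal sketch of a hand-written lexical analyzer structured as a finite state machine. It is not the book's implementation; the two states, the names `State` and `fsm_tokens`, and the token rule (lowercased runs of alphanumeric characters) are all assumptions chosen for illustration.

```python
from enum import Enum, auto

class State(Enum):
    BETWEEN = auto()   # skipping separator characters
    IN_TOKEN = auto()  # accumulating the characters of a token

def fsm_tokens(chars):
    """Yield tokens from an iterable of characters.

    Hypothetical two-state machine: alphanumeric characters move the
    machine into (or keep it in) IN_TOKEN; any other character ends
    the current token, if there is one, and returns to BETWEEN.
    """
    state = State.BETWEEN
    buf = []
    for ch in chars:
        if ch.isalnum():
            buf.append(ch.lower())
            state = State.IN_TOKEN
        elif state is State.IN_TOKEN:
            yield "".join(buf)
            buf = []
            state = State.BETWEEN
    # Emit a token still in progress when the input ends.
    if state is State.IN_TOKEN:
        yield "".join(buf)

print(list(fsm_tokens("Hello, world! x2")))
```

Because the machine reads one character at a time and keeps only the current state and a small buffer, it works directly on a stream. Adding a rule (hyphenated words, for example) means adding a state and its transitions rather than patching ad hoc string handling.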
