Back to ComputerTerms, InformationRetrieval
Signature files typically use superimposed coding:
Each document is divided into logical blocks, each containing D distinct words (StopWords are usually removed before the blocks are formed).
Each word is hashed to a binary "word signature" that is F bits long, with m bits set to 1.
The word signatures are OR'd together to form the block signature.
The block signatures are concatenated together to form the document signature.
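
The steps above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the use of SHA-256 to pick bit positions, and the default values of F, m, and D are all hypothetical choices, not part of any particular signature-file implementation. For simplicity the sketch removes duplicate words across the whole document before cutting it into blocks, and it keeps the document signature as a list of block signatures rather than one long bit string.

```python
import hashlib


def word_signature(word, F=64, m=4):
    """Return an F-bit integer with exactly m bits set, derived from `word`.

    Assumption: bit positions come from SHA-256 of the word plus a counter;
    any hash that yields m distinct positions would do. Requires m <= F.
    """
    positions = set()
    counter = 0
    while len(positions) < m:
        digest = hashlib.sha256(f"{word}:{counter}".encode("utf-8")).digest()
        positions.add(int.from_bytes(digest[:8], "big") % F)
        counter += 1
    sig = 0
    for p in positions:
        sig |= 1 << p
    return sig


def block_signature(words, F=64, m=4):
    """OR the word signatures of one block together (superimposed coding)."""
    sig = 0
    for w in words:
        sig |= word_signature(w, F, m)
    return sig


def document_signature(words, D=3, F=64, m=4, stop_words=frozenset()):
    """Split the distinct non-stop words into blocks of D, one signature each.

    The returned list stands in for the concatenated document signature.
    """
    distinct = list(dict.fromkeys(w for w in words if w not in stop_words))
    blocks = [distinct[i:i + D] for i in range(0, len(distinct), D)]
    return [block_signature(b, F, m) for b in blocks]


def may_contain(doc_sig, word, F=64, m=4):
    """True if some block has every bit of the word's signature set."""
    ws = word_signature(word, F, m)
    return any(block & ws == ws for block in doc_sig)


doc = document_signature(
    "the quick brown fox jumps over the lazy dog".split(),
    D=3,
    stop_words={"the"},
)
print(may_contain(doc, "fox"))
```

A word that is in the document always matches. A word that is not in the document can still match when its bits happen to be set by other words in the same block, so a positive result has to be checked against the actual text.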