Wednesday, March 16, 2011

Word Suggest

This is general algorithm we use in search engines. Question is Given a dictionary suggest the words for a given misspelled word.

The word could be misspelled in either of the following 4 ways.

  1. Transpositions: These are set of words where 2 adjacent characters in a word are trans positioned. Ex: hlelo –> hello . There are atmost n-1 trans positioned words for a given word.
  2. Insertion: Word could be misspelled by inserting an extra character. Ex: helxlo –> hello There could be at the max n+1 words with this.
  3. Substitution: Word could be misspelled by typing wrong character in place of another character. Ex: helxo –> hello . Max No. of Substituted words could 25 * ( n )
  4. Omissions: These are set of words where a character is omitted Ex: hllo –> hello . Max No of omitted words could be 26*(n+1)

Lets code them now.

No comments: