Monday, March 21, 2011

Longest Palindrome in a string

Brute force Algorithm.

  1. Have 2 for loops
    for i = 1 to i less than array.length -1
    for j=i+1 to j less than array.length
  2. This way you can get substring of every possible combination from the array
  3. Have a palindrome function which checks if a string is palindrome
  4. so for every substring (i,j) call this function, if it is a palindrome store it in a string variable
  5. If you find next palindrome substring and if it is greater than the current one, replace it with current one.
  6. Finally your string variable will have the answer

Brute force approach for this problem takes O(n3) time.

Another approach is

  1. Reverse the string and store it in different string
  2. Now find the largest matching substring between both the strings

This too will take O(n2) time

We can solve this problem using suffix trees, but constructing the suffix tree itself seems to be more complex in terms of both time and space complexity.

We can solve this problem using a better approach which will revolve around with a center for every single character.

For each position in the string we store an upper and lower pointer that tracks the upper and lower bounds of the largest palindrome centered at that position.

For instance, position 2 in the string "racecar" would start as:

image

At this point, the longest known palindrome at position 2 is the single character 'c'. We then increment the upper pointer and decrement the lower pointer and compare their respective values.

image

Since the upper and lower values are not the same, we stop and now know that the longest palindrome centered at position 2 is 'c'. We then start the process again at the next string position, 3.

image

We increment and decrement the upper and lower pointers and see that they are equal. Our current longest palindrome centered at position 3 is now 'cec'.

image

Since they were equal, we increment and decrement the pointers again and see that they are still equal. Our new longest palindrome centered at position 3 is now 'aceca'.

image

We continue incrementing and decrementing the upper and lower pointers until they don't match or they hit one of the ends of the string.

image

Since our pointers are equal, we increment and decrement them again. We hit an end of the string and the pointers are equal so we know that our longest palindrome centered at position 3 is 'racecar'.

We're done with position 3 and continue this same process again at position 4. We'll do this until we've covered every position in the string.

NOTE: We initially assumed that the palindrome had an odd length. This algorithm works for even palindromes as well. The only change we make is that when we initialize the pointers, instead of having the lower and upper pointer point to the same position we have the upper pointer point at the next position. Everything else with the algorithm remains the same. i.e. The initial pointers for an even palindrome 'centered' at position 3 would look like:

image

Runtime:

Worst-Case: O(N^2)

Best-Case: O(N)

The worst case is achieved when the string is a single repeated character because every substring is a palindrome.

The best case is achieved when the string contains no palindromes.

Typically if a string only contains a single palindrome (the fewer the better), the closer to O(N) it will run. This is because every time it checks a position in the string, it checks the character before and after that position, and if they don't match then it stops looking for the palindrome. Positions in the string can be discarded after only one lookup if that position doesn't have a palindrome, so if there are no palindromes you only do N comparisons.

JavaCode:

No comments: