Seems to me stripping as you index would be more efficient for searching. What your code shows, however, does more than just strip out HTML: you're also removing parentheses, slashes, etc.
As far as the HTML is concerned, one simple RegEx should do it.
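To give a rough idea of what I mean, here's a minimal sketch in Python. The pattern and the `strip_html` name are my own assumptions for illustration, not necessarily what you'd find elsewhere, and I don't know which language your indexer is in. It only removes tags and decodes entities, so parentheses, slashes and the like are left alone.

```python
import re
from html import unescape

# Assumed pattern: "<", then one or more characters that are not ">", then ">".
TAG_RE = re.compile(r"<[^>]+>")

def strip_html(text: str) -> str:
    """Replace each tag with a space, decode entities, collapse whitespace."""
    no_tags = TAG_RE.sub(" ", text)
    return " ".join(unescape(no_tags).split())

print(strip_html("<p>Hello <b>world</b> (and/or more)</p>"))
```

Keep in mind a pattern this simple will trip on a ">" inside an attribute value, and it keeps the text inside script/style blocks, so if your input is messy a real HTML parser is probably the safer bet.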
(Editor is not letting me paste the URL for you to look at... )
Try this one more time..
There we go.. check that link and see if that helps.
^_^