That spaces part gives me an idea. Can I get a dump of the entire thread in one text file?
I think you could parse every word into its own row of a one-column table and then run something like:
SELECT word, COUNT(word) AS TimesRepeated
FROM Word
WHERE word NOT IN (SELECT word FROM CommonWords)
GROUP BY word
ORDER BY COUNT(word) ASC
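
For anyone who wants to try this without setting up a database, here is a rough sketch using Python's built-in sqlite3 module. The Word and CommonWords table names come from the query above; everything else is my own assumption, including the function name word_counts, the regex used to split the text into words, and the extra tie-break sort on word so the result order is stable.

```python
import re
import sqlite3


def word_counts(text, common_words):
    """Return (word, times_repeated) rows, rarest first, skipping common words."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Word (word TEXT)")
    conn.execute("CREATE TABLE CommonWords (word TEXT)")

    # Assumption: a "word" is a run of letters/apostrophes, compared in lowercase.
    words = re.findall(r"[a-z']+", text.lower())

    # One row per word occurrence, as the post suggests.
    conn.executemany("INSERT INTO Word (word) VALUES (?)", [(w,) for w in words])
    conn.executemany(
        "INSERT INTO CommonWords (word) VALUES (?)", [(w,) for w in common_words]
    )

    rows = conn.execute(
        """
        SELECT word, COUNT(word) AS TimesRepeated
        FROM Word
        WHERE word NOT IN (SELECT word FROM CommonWords)
        GROUP BY word
        ORDER BY COUNT(word) ASC, word ASC
        """
    ).fetchall()
    conn.close()
    return rows
```

If you had the thread dumped to a text file (say a hypothetical thread.txt), you would read it with open("thread.txt").read() and pass the result in as text. Swap ASC for DESC on the count if you want the most repeated words at the top instead of the rarest.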