An implementation of a suffix stripping algorithm for English

New version 1.1b, dated 31/Jan/2007. See link below.

As part of an ongoing research project on the discovery of low-frequency but important relationships in text files (a development of chance discovery), I recently implemented the widely used suffix stripping algorithm of Porter (1980) as a package available under the GPL.

The software can be downloaded from the link below.

wordstem_2006.pdf (104KB)

REFERENCE

http://www.leshatton.org/Documents/lh_stem11b.zip

FEEDBACK

I’m very grateful to Stephen Cox for completing the treatment of the letter ‘y’ in my original implementation and for fixing a defect. Stephen updated the code and shared much information on timings, along with an interesting implementation he has written using Stratego. His changes are incorporated in this release.
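
For readers unfamiliar with why ‘y’ needs special care: in Porter (1980) a consonant is any letter other than a, e, i, o, u, and other than ‘y’ preceded by a consonant. So ‘y’ counts as a consonant in "toy" but as a vowel in "syzygy". The sketch below illustrates that rule and the "measure" m that Porter's suffix rules test against. It is an illustration only, not code from the package; the names is_consonant and measure are hypothetical, and lowercase input is assumed.

```python
VOWELS = set("aeiou")


def is_consonant(word: str, i: int) -> bool:
    """Return True if word[i] is a consonant in Porter's sense.

    Assumes `word` is lowercase. Hypothetical helper, not part of the package.
    """
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        # 'y' is a consonant at the start of a word or after a vowel,
        # and a vowel when it follows a consonant.
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(word: str) -> int:
    """Return m, the number of vowel-to-consonant transitions,
    i.e. the m in Porter's form [C](VC)^m[V]."""
    m = 0
    prev_vowel = False
    for i in range(len(word)):
        cons = is_consonant(word, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


if __name__ == "__main__":
    for w in ("toy", "by", "ivy", "orrery"):
        print(w, measure(w))
```

Getting this classification wrong changes m, and therefore which suffix rules fire, which is why the handling of ‘y’ is an easy place for a stemmer to go astray.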