Go implementation of the Porter stemming algorithm for the Russian language.
go get github.com/liderman/rustemmer
Getting the base form of a word:
wordBase := rustemmer.GetWordBase("вазы")
// wordBase = "ваз"
Normalizing text:
text := "г. Москва, ул. Полярная, д. 31А, стр. 1"
fmt.Print(
rustemmer.NormalizeText(text),
)
// Displays:
// г Москв ул Полярн д 31А стр 1
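To make the idea behind normalization concrete, here is a minimal, self-contained sketch. It is not the package's implementation: `toyStem`, `toySuffixes`, and `normalize` are hypothetical names, and the three-entry suffix list is a stand-in for the much larger rule set of the real Porter algorithm for Russian. It only assumes that normalization amounts to splitting the text on punctuation and whitespace, stemming each token, and joining the results with spaces.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toySuffixes is a tiny, hypothetical suffix list for illustration only.
// The real Porter algorithm for Russian applies many more rules.
var toySuffixes = []string{"ая", "а", "ы"}

// toyStem strips the first matching suffix from word,
// but never reduces the word to an empty string.
func toyStem(word string) string {
	for _, suf := range toySuffixes {
		if strings.HasSuffix(word, suf) && len(word) > len(suf) {
			return strings.TrimSuffix(word, suf)
		}
	}
	return word
}

// normalize splits text on every rune that is neither a letter nor a digit,
// stems each token with toyStem, and joins the results with single spaces.
func normalize(text string) string {
	tokens := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for i, tok := range tokens {
		tokens[i] = toyStem(tok)
	}
	return strings.Join(tokens, " ")
}

func main() {
	fmt.Println(toyStem("вазы"))
	fmt.Println(normalize("г. Москва, ул. Полярная, д. 31А, стр. 1"))
}
```

For these two inputs the sketch prints `ваз` and `г Москв ул Полярн д 31А стр 1`, matching the examples above. The match is case-sensitive, so the uppercase `А` in `31А` is left alone. For real text, use `rustemmer.GetWordBase` and `rustemmer.NormalizeText`, since the toy suffix list will mis-stem most words.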
- Requires Go 1.2 or newer.
You can read package documentation here.
Unit-tests:
go test -v
Benchmarks:
go test -test.bench .
Benchmark results on a 2012 Mac mini (Intel Core i5):
PASS
BenchmarkNormalizeText-4 5000 304275 ns/op
BenchmarkGetWordBase-4 2000 1176104 ns/op
ok /src/github.com/liderman/rustemmer 4.043s