Using Wiktionary for large test data set

bruce kissinger brucekissinger at
Fri Feb 4 14:48:59 EST 2011

I was searching for a large data set that I could use to test Riak, and I
ended up using Wiktionary, the Wikimedia dictionary project.

You can download it here:

Wiktionary contains about 2.3 million entries, and the data is easy to
parse.  I just pulled out each word's title and definition.  If anyone wants
the little Java program that I wrote to parse the data and add it to Riak,
send me a private email.
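
For anyone who wants to try this before asking, here is a rough sketch of
the parsing step. It is not the program mentioned above. It assumes a
decompressed dump in the standard MediaWiki XML export layout (a <title>
and a <text> element per <page>) and reads it with the JDK's StAX parser.
The class name WiktionaryDumpReader, the method names, and the rule that a
definition is the first line starting with "# " are all my own assumptions
for illustration. The sink callback is where a Riak store call would go;
that part is left out on purpose.

```java
import java.io.InputStream;
import java.util.function.BiConsumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class name; a sketch, not the original program.
public class WiktionaryDumpReader {

    /**
     * Streams a MediaWiki XML export and passes each page's title and raw
     * wikitext to the sink. Returns the number of pages passed on.
     * Streaming keeps memory use flat even for a multi-gigabyte dump.
     */
    public static int readPages(InputStream in, BiConsumer<String, String> sink) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // The dump has no DTD, so DTD processing can be switched off.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            String title = null;
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if ("title".equals(name)) {
                    // Remember the title until the page's text shows up.
                    title = reader.getElementText();
                } else if ("text".equals(name) && title != null) {
                    sink.accept(title, reader.getElementText());
                    title = null;
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML in dump", e);
        }
    }

    /**
     * Assumed rule: the definition is the first line that starts with "# ".
     * Lines such as "#:" or "#*" are skipped. Returns "" if none is found.
     */
    public static String firstDefinition(String wikitext) {
        for (String line : wikitext.split("\n")) {
            if (line.startsWith("# ")) {
                return line.substring(2).trim();
            }
        }
        return "";
    }
}
```

To use it, open the dump file as an InputStream and pass a callback that
calls firstDefinition on the text and stores the result under the title.
The definition still contains wiki markup such as [[links]], which may or
may not matter for test data.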

Bruce Kissinger

More information about the riak-users mailing list