Using Wiktionary for large test data set

Sean Cribbs sean at
Fri Feb 4 14:53:02 EST 2011


If you have the time, sanitize the code you wrote to load the data and send a pull request to the Riak Function Contrib. We'd love to have another example!

We'd also love to put a link to the sample data on the wiki:

If you don't have time to do either of those, send me an email privately and we'll take care of it next week.

Sean Cribbs <sean at>
Developer Advocate
Basho Technologies, Inc.

On Feb 4, 2011, at 2:48 PM, bruce kissinger wrote:

> I was searching for a large data set to test Riak with, and I ended up using Wiktionary, the Wikimedia dictionary project.
> You can download it here:
> Wiktionary contains about 2.3 million entries, and the data is easy to parse. I just pulled out each word's title and definition. If anyone wants the little Java program I wrote to parse the data and add it to Riak, send me a private email.
> -- 
> Bruce Kissinger
> _______________________________________________
> riak-users mailing list
> riak-users at
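
For anyone who wants to try this before the real program is posted, here is a minimal sketch of the kind of loader described above. It is not Bruce's code. It assumes the download is a standard MediaWiki XML export (pages with <title> and <text> elements), that Riak's HTTP interface is listening on 127.0.0.1:8098, and that objects are written with a PUT to /riak/<bucket>/<key>. The class name WiktionaryLoader, the EntryHandler interface, and the bucket name "wiktionary" are made up for illustration. It stores the raw wikitext of each page rather than an extracted definition.

```java
import java.io.OutputStream;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical loader: streams a MediaWiki XML export and PUTs each page into Riak.
class WiktionaryLoader {

    /** Callback invoked once per page with its title and raw wikitext. */
    interface EntryHandler {
        void handle(String title, String text) throws Exception;
    }

    /** Streams the export with StAX so the whole dump never sits in memory. */
    static void parse(Reader in, EntryHandler handler) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader xml = factory.createXMLStreamReader(in);
        String title = null;
        while (xml.hasNext()) {
            if (xml.next() != XMLStreamConstants.START_ELEMENT) {
                continue;
            }
            String name = xml.getLocalName();
            if (name.equals("title")) {
                title = xml.getElementText();
            } else if (name.equals("text") && title != null) {
                handler.handle(title, xml.getElementText());
                title = null; // wait for the next page's title
            }
        }
        xml.close();
    }

    /** Builds the object URL, assuming the /riak/<bucket>/<key> path layout. */
    static String keyUrl(String baseUrl, String bucket, String key) throws Exception {
        // URLEncoder targets form data, so turn its "+" for spaces into "%20".
        String encoded = URLEncoder.encode(key, "UTF-8").replace("+", "%20");
        return baseUrl + "/riak/" + bucket + "/" + encoded;
    }

    /** PUTs one value as UTF-8 plain text and returns the HTTP status code. */
    static int store(String baseUrl, String bucket, String key, String value) throws Exception {
        URL url = new URL(keyUrl(baseUrl, bucket, key));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/plain; charset=utf-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(value.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws Exception {
        String dumpPath = args[0]; // path to the uncompressed XML dump
        String baseUrl = args.length > 1 ? args[1] : "http://127.0.0.1:8098";
        try (Reader in = Files.newBufferedReader(Paths.get(dumpPath), StandardCharsets.UTF_8)) {
            parse(in, (title, text) -> {
                int status = store(baseUrl, "wiktionary", title, text);
                if (status >= 300) {
                    System.err.println("PUT failed for " + title + ": HTTP " + status);
                }
            });
        }
    }
}
```

Run it with the path to the uncompressed dump as the first argument and, optionally, the Riak base URL as the second. One request per entry is slow for 2.3 million entries, so a real loader would probably reuse connections or write from several threads.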
