Using Wiktionary as a large test data set

bruce kissinger brucekissinger at gmail.com
Fri Feb 4 14:48:59 EST 2011


I was searching for a large data set that I could use to test Riak, and I
ended up using Wiktionary, the Wikimedia dictionary project.

You can download it here:

   http://download.wikimedia.org/enwiktionary/


Wiktionary contains about 2.3 million entries, and the data is easy to
parse.  I just pulled out each word's title and definition.  If anyone wants
the little Java program that I wrote to parse the data and add it to Riak,
send me a private email.
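
For anyone who wants to try something similar, here is a minimal sketch of
the parsing step.  It is not the program mentioned above.  It assumes the
dump is a MediaWiki XML export with <page>, <title> and <text> elements, and
that a definition is the first wikitext line starting with "# ".  The class
name, the method names and the sink callback are all made up for
illustration.  The Riak write is left out: the sink is where a call to
whichever Riak client you use would go.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: stream a MediaWiki XML export and emit (title, definition) pairs.
public class WiktionaryDumpSketch {

    // Assumption: a definition is the first wikitext line that starts with "# ".
    static String firstDefinition(String wikitext) {
        for (String line : wikitext.split("\n")) {
            if (line.startsWith("# ")) {
                return line.substring(2).trim();
            }
        }
        return null; // no definition line found (e.g. a redirect or project page)
    }

    // Reads the dump one event at a time, so the whole file is never held in memory.
    // Returns the number of pages that were handed to the sink.
    public static int parse(InputStream in, BiConsumer<String, String> sink)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        String title = null;
        String text = null;
        int stored = 0;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if ("title".equals(name)) {
                    title = reader.getElementText();
                } else if ("text".equals(name)) {
                    text = reader.getElementText();
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && "page".equals(reader.getLocalName())) {
                String definition = (text == null) ? null : firstDefinition(text);
                if (title != null && definition != null) {
                    sink.accept(title, definition); // a Riak put would go here
                    stored++;
                }
                title = null;
                text = null;
            }
        }
        reader.close();
        return stored;
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory stand-in for the real dump file.
        String sample = "<mediawiki><page><title>cat</title><revision>"
                + "<text>==English==\n# A domesticated feline.</text>"
                + "</revision></page></mediawiki>";
        Map<String, String> store = new LinkedHashMap<>();
        parse(new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)), store::put);
        System.out.println(store);
    }
}
```

For the real dump you would pass a FileInputStream (wrapped in a
decompressing stream if the file is still compressed) instead of the
in-memory sample.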

-- 
Bruce Kissinger