Using Wiktionary for large test data set

Sean Cribbs sean at basho.com
Fri Feb 4 14:53:02 EST 2011


Bruce,

If you have the time, sanitize the code you wrote to load the data and send a pull request to the Riak Function Contrib. We'd love to have another example!

https://github.com/basho/riak_function_contrib
http://contrib.basho.com/

We'd also love to put a link to the sample data on the wiki:

https://github.com/basho/riak_wiki
http://wiki.basho.com/Sample-Data.html

If you don't have time to do either of those, send me an email privately and we'll take care of it next week.

Sean Cribbs <sean at basho.com>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Feb 4, 2011, at 2:48 PM, bruce kissinger wrote:

> I was searching for a large data set that I could use to test Riak, and I ended up using Wiktionary, the Wikimedia dictionary project.
> 
> You can download it here:
> 
>    http://download.wikimedia.org/enwiktionary/
> 
> 
> Wiktionary contains about 2.3 million entries, and the data is easy to parse.  I just pulled out each word's title and definition.  If anyone wants the little Java program I wrote to parse the data and add it to Riak, send me a private email.
> 
> -- 
> Bruce Kissinger
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
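
For anyone who wants to try this before Bruce's Java loader is published, here is a rough sketch of the same idea in Python. It is not Bruce's code. It assumes the dump is an uncompressed MediaWiki XML export with <page>, <title> and <text> elements, and that Riak's HTTP interface is reachable at the default http://127.0.0.1:8098. The bucket name "wiktionary", the function names, and the file path passed to load_dump are all made up for illustration. It also stores the raw wikitext of each page rather than extracting a definition.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_entries(source):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export.

    `source` is a filename or a binary file object. Pages are streamed
    with iterparse so the whole dump is never parsed in one go.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        if title:
            yield title, text
        elem.clear()  # release the children of the finished page


def riak_url(base, bucket, key):
    """Build a /riak/<bucket>/<key> URL, percent-encoding bucket and key."""
    return "%s/riak/%s/%s" % (
        base.rstrip("/"),
        urllib.parse.quote(bucket, safe=""),
        urllib.parse.quote(key, safe=""),
    )


def store_entry(base, bucket, title, text):
    """PUT one entry into Riak over HTTP and return the status code."""
    request = urllib.request.Request(
        riak_url(base, bucket, title),
        data=text.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def load_dump(path, base="http://127.0.0.1:8098", bucket="wiktionary"):
    """Store every page of the dump at `path`; returns the entry count."""
    count = 0
    for title, text in iter_entries(path):
        store_entry(base, bucket, title, text)
        count += 1
    return count
```

Calling load_dump with the path to the unpacked dump would walk the file and issue one PUT per page. With about 2.3 million entries, a real loader would probably want batching or several connections, which this sketch leaves out.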
