Riak search, post schema change reindexation

Guillaume Boddaert guillaume at lighthouse-analytics.co
Mon Aug 29 03:56:22 EDT 2016


Hi,

I recently needed to alter my Riak Search schema for a bucket type that 
contains ~30 million rows. As a result, my index was wiped, since we are 
still waiting for the Riak Search 2.2 feature that will sync Riak storage 
with the Solr index on such an occasion.

I adapted a script suggested by Evren Esat Özkan here 
(https://github.com/basho/yokozuna/issues/130#issuecomment-196189344). 
It is a simple Python script that streams keys and triggers a store 
action for every item. Unfortunately, it failed after 178k items due to 
a timeout on the key stream. I calculated that this kind of reindexing 
mechanism would take up to 5 days to succeed, assuming it never crashes.
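
To make the problem concrete, here is a minimal sketch of how such a 
script could be made resumable. It is not the linked script: it first 
dumps the streamed keys to a local file, then re-stores from that file 
with a checkpoint, so a crash during the store phase does not require 
listing keys again. The names dump_keys, restore_from_file and the 
checkpoint file are hypothetical, and it assumes keys are text without 
newlines.

```python
import os
from typing import Callable, Iterable, List


def dump_keys(key_batches: Iterable[List[str]], path: str) -> int:
    """Write every streamed key to a local file, one key per line."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for batch in key_batches:
            for key in batch:
                out.write(key + "\n")
                count += 1
    return count


def _save(checkpoint_path: str, done: int) -> None:
    """Record how many keys have been re-stored so far."""
    with open(checkpoint_path, "w", encoding="utf-8") as cp:
        cp.write(str(done))


def restore_from_file(path: str,
                      restore: Callable[[str], None],
                      checkpoint_path: str,
                      every: int = 1000) -> int:
    """Call restore(key) for each key in the file, resuming after the
    last checkpointed position if a checkpoint file exists."""
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as cp:
            done = int(cp.read().strip() or 0)
    with open(path, encoding="utf-8") as keys:
        for line_no, line in enumerate(keys):
            if line_no < done:
                continue  # already re-stored in a previous run
            restore(line.rstrip("\n"))
            done = line_no + 1
            if done % every == 0:
                _save(checkpoint_path, done)
    _save(checkpoint_path, done)
    return done
```

With the Riak Python client, I would expect key_batches to be 
bucket.stream_keys() and restore to be something like 
lambda k: bucket.get(k).store(), but I have not verified this wiring at 
the 30 million scale. A key re-stored just before a crash may be stored 
again on resume, which should be harmless for reindexing.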

I was wondering if there is a pure Erlang means to force a complete 
rewrite of every single element in my bucket type, rather than an 
error-prone and very long Python process.

How would you guys reindex a 30 million item bucket type in a fast and 
reliable way?

Thanks, Guillaume