Riak Search - Fast Bulk Insert

Alexander Sicular siculars at gmail.com
Thu Aug 14 11:15:41 EDT 2014


And, afaik, a single index.xml file with multiple docs should probably be broken up into one file per doc to make better use of the parallelism already mentioned. 
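
A rough sketch of that split in Python, assuming the file has a single <add> root whose direct children are <doc> elements, as in Mark's example. The function name split_docs and the <id> field in the sample are made up for illustration and are not part of any Riak tool:

```python
import xml.etree.ElementTree as ET


def split_docs(xml_text):
    """Yield one standalone <add>...</add> string per <doc> child of the root."""
    root = ET.fromstring(xml_text)
    for doc in root.findall("doc"):
        # Wrap each doc in its own <add> so it can be written out or sent alone.
        wrapper = ET.Element("add")
        wrapper.append(doc)
        yield ET.tostring(wrapper, encoding="unicode")


# Hypothetical two-document input; real documents would carry real fields.
sample = "<add><doc><id>1</id></doc><doc><id>2</id></doc></add>"
parts = list(split_docs(sample))
```

Each string in parts can then be written to its own file or handed to a separate worker. This version parses the whole input in memory, so for a very large file a streaming parser such as ET.iterparse is probably the better fit.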

Regards,
Alexander 

@siculars
http://siculars.posthaven.com

> On Aug 14, 2014, at 10:43, "Eric Redmond" <eredmond at basho.com> wrote:
> 
> Note that search-cmd is for the pre-2.0 search, which does not use Solr. If you're planning to use the new Solr-based search, you'll need to run Riak 2.0 and write an import script, as Dmitri pointed out.
> 
> Eric Redmond, Engineer @ Basho
> 
> 
> On Thu, Aug 14, 2014 at 7:38 AM, Dmitri Zagidulin <dzagidulin at basho.com> wrote:
> 
> Hi Mark,
> 
> The best way to bulk load objects into Riak (and into Solr) is to take advantage of Riak's parallelism.
> Spin up a bunch of worker threads (sharing a pool of connections) and have them issue concurrent puts to all of the nodes in the cluster. You can either use something like HAProxy to load balance, or rely on a Riak client's internal load-balancing capabilities.
> 
> This is what https://github.com/basho-labs/riak-data-migrator does, for example, to quickly restore objects that were dumped to files.
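
The worker-pool approach described above might look roughly like this in Python. It is only a sketch under a few assumptions: bulk_put and http_put are hypothetical names, the node list and bucket are placeholders, and http_put uses Riak's HTTP interface (PUT /buckets/<bucket>/keys/<key>) with plain urllib rather than a Riak client library, so it does not share a connection pool the way a real client would.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def bulk_put(items, nodes, put, workers=16):
    """Write (key, body) pairs in parallel, spreading them across nodes.

    put(node, key, body) performs a single write and raises on failure.
    Returns the keys whose write failed, so they can be retried.
    """
    def task(indexed):
        i, (key, body) = indexed
        node = nodes[i % len(nodes)]  # simple round-robin over the cluster
        try:
            put(node, key, body)
            return None
        except Exception:
            return key

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(task, enumerate(items))
        return [key for key in results if key is not None]


def http_put(node, key, body, bucket="my_bucket"):
    """One write over HTTP; node is a 'host:port' string (assumed, e.g. port 8098)."""
    url = f"http://{node}/buckets/{bucket}/keys/{key}"
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/xml"},  # assumed content type
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        resp.read()
```

Something like bulk_put(docs, ["10.0.0.1:8098", "10.0.0.2:8098"], http_put) would then drive the load, with made-up addresses here. ThreadPoolExecutor.map submits every item up front, so for 20 million documents it is likely better to feed it in batches and retry the returned keys.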
> 
>> On Thu, Aug 14, 2014 at 5:26 AM, Mark Richard Thomas <mark.thomas at equifax.com> wrote:
>> Hello
>> 
>> 
>> What’s the fastest way (best practice) to insert 20 million documents into a Riak Search index?
>> 
>> 
>> search-cmd solr my_bucket /insert.xml
>> 
>> 
>> For a proof-of-concept I’ve created a file (index.html) containing 100,000 documents:
>> 
>> 
>> <add>
>> <doc></doc>
>> <doc></doc>
>> :
>> </add>
>> 
>> 
>> Thanks
>> 
>> 
>> Mark Thomas | Software Engineer | Equifax UK
>> 
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com