Yokozuna Scale

anandm anand at zerebral.co.in
Wed Sep 17 00:37:07 EDT 2014

I started checking out Riak as an alternative for one of my projects. We are
currently using SolrCloud 4.8 (36 shards with 3 replicas each), with all
fields stored (64 fields per doc, each doc about 6 KB on average).
I want to migrate this: push all the data out of Solr and store it in a KV
store like Riak, but keep the indexes going in Solr (as I already have a lot
of code written around Solr).

I came across Yokozuna today, and it sounds like that's going to be a perfect
match for my requirement...

I just have a couple of questions. I tried searching online for answers but
couldn't find references to large-scale Yokozuna deployments.

1. I have over 250M documents indexed & stored (that's very bad) in the current
SolrCloud deployment - with a replication factor of 3 - total Solr Index +
Data Size is about 4.5TB spread across 6 Servers (12 core (24 threads) +
    Index search performance and write performance are good enough with 36
shards and Composite Id routing - I want to migrate this straight to Riak
with Yokozuna enabled.
    I'll be deploying a 5-6 node Riak cluster - that would mean roughly
50M docs will be stored on each node - and Yokozuna will index them
locally on each node's Solr too (only indexed fields):
           a. Will this Solr instance have just one core to index the data?
(As of now I just plan to have one bucket)
           b. Would it be able to handle the load of searching through 50M
docs with just one core? I think RAM won't be an issue - but I have not seen
a single Solr instance serving 50M docs, so I'm a bit worried about that.
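
As a rough sanity check on the figures above, here is a small Python sketch
of the arithmetic. It assumes decimal units (1 KB = 1000 bytes), an even
spread of documents across nodes, and that each document is counted once
per node estimate (no replication on the Riak side); the function names are
made up for illustration.

```python
# Figures quoted in the post.
DOCS = 250_000_000          # "over 250M documents"
AVG_DOC_BYTES = 6_000       # "about 6kb on average" (decimal KB: an assumption)
REPLICATION_FACTOR = 3      # current SolrCloud replication factor


def total_size_tb(docs, avg_bytes, replicas):
    """Total stored size in decimal terabytes, counting every replica."""
    return docs * avg_bytes * replicas / 1e12


def docs_per_node(docs, nodes):
    """Documents per node if they are spread evenly and counted once."""
    return docs / nodes


if __name__ == "__main__":
    print(f"total size: {total_size_tb(DOCS, AVG_DOC_BYTES, REPLICATION_FACTOR)} TB")
    for nodes in (5, 6):
        print(f"{nodes} nodes: {docs_per_node(DOCS, nodes):,.0f} docs per node")
```

Under these assumptions the 6 KB average and the replication factor of 3
line up with the quoted 4.5TB, and 5 nodes gives the 50M-per-node estimate
(6 nodes gives somewhat fewer).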

2. Every time I query the Solr instance via Riak's /search handler, the
actual search query will run in a distributed manner on the Solr nodes in the
cluster - but will Yokozuna also fetch the stored fields for the docs, or the
entire docs, from the underlying Riak instances and return them with the
search response? Or would my client app need to fetch the Riak docs in a
separate query?
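
To make the "separate query" case concrete: if the client does have to
fetch the objects itself, the two steps might look roughly like the Python
sketch below. It only builds the URLs. It assumes Riak 2.0's HTTP interface
(the /search/query/<index> endpoint and the _yz_rt / _yz_rb / _yz_rk fields
that Yokozuna adds to each indexed doc); the host, port, index name and
sample doc are hypothetical.

```python
from urllib.parse import quote, urlencode

RIAK = "http://localhost:8098"   # hypothetical node address


def search_url(base, index, query):
    """Step 1: URL for a Solr-style query against a Yokozuna index."""
    params = urlencode({"wt": "json", "q": query})
    return f"{base}/search/query/{quote(index, safe='')}?{params}"


def kv_url(base, doc):
    """Step 2: URL of the Riak object behind one search result.

    Assumes the result doc carries Yokozuna's bucket type, bucket and key
    fields (_yz_rt, _yz_rb, _yz_rk).
    """
    return "{}/types/{}/buckets/{}/keys/{}".format(
        base,
        quote(doc["_yz_rt"], safe=""),
        quote(doc["_yz_rb"], safe=""),
        quote(doc["_yz_rk"], safe=""),
    )


if __name__ == "__main__":
    # Hypothetical index name and result doc, for illustration only.
    print(search_url(RIAK, "docs_index", "title:riak"))
    sample = {"_yz_rt": "default", "_yz_rb": "docs", "_yz_rk": "doc 1"}
    print(kv_url(RIAK, sample))
```

The client would GET the first URL, then GET one KV URL per returned doc,
which is the extra round trip I would like to avoid if Yokozuna can do it.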

3. If anybody has a large-scale Yokozuna deployment, a quick comment on the
deployment size, the hardware and the overall throughput would help.

