Riak performance issue with many connected objects (11 million +)

Jared Morrow jared at basho.com
Fri Aug 31 13:40:53 EDT 2012


*My first reply bounced because it was too big, so I'll try again with
your original email shortened.   Pardon the duplicate reply if the first
one did in fact go through.*

Lei,

One issue that stands out is that you are running a single-node cluster.
With an n_val of 3, you will be storing 3 copies of every piece of data on
that one node.  Also, if you are doing a read with an r of 3, you are
waiting for the data to be read three times from a single node.  If
possible, consider making a larger cluster (even with smaller machines) to
get a better idea about how Riak operates.  We recommend a minimum of five
physical nodes for an n_val of 3.  If you lower your n_val you can possibly
lower your physical node count and still get good performance, at the
cost of reduced availability.
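
To make the replica placement point concrete, here is a rough Python
sketch.  It is a toy model, not Riak's actual claim algorithm: it assumes
a ring of 64 partitions handed out to nodes round-robin, and that each
key's n_val replicas land on n_val consecutive partitions.  The function
name and the round-robin assignment are my own assumptions for
illustration.

```python
def min_distinct_nodes_per_preflist(num_nodes, n_val=3, ring_size=64):
    """Worst-case number of distinct physical nodes holding a key's replicas.

    Toy model (assumption): partitions are assigned to nodes round-robin,
    and replicas go to n_val consecutive partitions on the ring.
    """
    # owners[i] is the physical node that owns partition i.
    owners = [i % num_nodes for i in range(ring_size)]

    worst = n_val
    for start in range(ring_size):
        # The n_val consecutive partitions starting at `start`, wrapping around.
        replicas = [owners[(start + k) % ring_size] for k in range(n_val)]
        worst = min(worst, len(set(replicas)))
    return worst


if __name__ == "__main__":
    for nodes in (1, 3, 5):
        print(nodes, "node(s):", min_distinct_nodes_per_preflist(nodes))
```

With one node, every replica set collapses onto the same machine, so the
three copies buy no redundancy and an r of 3 just reads that node three
times.  In this toy model, three nodes can still put two copies on one
node where the ring wraps around, while five nodes keep all three copies
on separate machines.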

Given all of the above, it is still unusual that your reads *never* return.
There might be another issue there regardless of cluster size.  When you
say "connected" objects, are you referring to links and link-walking?   If
so, how deep are you going with your links?  Some context here would be
helpful.

Thanks,
Jared

On Fri, Aug 31, 2012 at 9:54 AM, Lei Gu <legu at e-dialog.com> wrote:

> I started performance testing Riak by loading 11 million connected objects
> into Riak. Loading has not been an issue, although it did take a long time.
> Now when I try to access an object, Riak never returns and uses
> 100% of CPU and 74% of memory.
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  8952 riak      20   0 23.4g  11g 1708 S 100.6 74.1   1520:49 beam.smp
> <snip>