Riak stalls, leveldb backend

Sorin Manole sorin.manole at trustev.com
Tue Jan 6 06:01:22 EST 2015


Hey Andy,

We had the same issue a few days ago: we were getting timeouts when trying
to read a key from Riak.

We were also seeing a warning in the logs about reading/writing a large
object. In our case that object was first read from Riak and then written
back. It was that big (17 MB) because of some load testing we had done.
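
To find which keys are affected, the warnings can be pulled out of the log
with a short script. This is a minimal sketch, assuming each log entry sits
on a single line and uses the "Writing very large object" format quoted in
Andy's message below; the function name and the min_bytes parameter are made
up for illustration, and the read-side warning is not covered because its
exact wording is not shown in this thread.

```python
import re

# Pattern for the riak_kv_vnode write warning quoted in this thread.
# Assumption: bucket and key are printed as <<"...">> binaries with no
# embedded double quotes.
LARGE_OBJECT = re.compile(
    r'Writing very large object \((\d+) bytes\) to '
    r'<<"([^"]*)">>/<<"([^"]*)">>'
)


def find_large_objects(log_text, min_bytes=0):
    """Return (size_in_bytes, bucket, key) tuples, largest first."""
    hits = []
    for match in LARGE_OBJECT.finditer(log_text):
        size = int(match.group(1))
        if size >= min_bytes:
            hits.append((size, match.group(2), match.group(3)))
    return sorted(hits, reverse=True)
```

Passing the contents of a node's log file to find_large_objects gives a
list of the biggest objects being written, which is a reasonable starting
point for deciding what to delete or split up.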

We deleted the object and we didn't experience timeouts anymore.

To be sure this was the cause, we tried to reproduce the issue: we posted a
20 MB JSON document to the same key and the timeouts came back. After we
deleted the 20 MB object, everything worked fine again.
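
The repro could be scripted roughly as follows. This is only a sketch, not
the exact commands we ran: it assumes Riak's HTTP interface is reachable on
a test node (8098 is the default HTTP port), uses the
/buckets/<bucket>/keys/<key> path of the HTTP API, and the function names,
bucket name and key name are all hypothetical.

```python
import json
import time
import urllib.request


def make_payload(target_bytes):
    """Build a JSON document whose encoded size is at least target_bytes."""
    return json.dumps({"filler": "x" * target_bytes}).encode("utf-8")


def build_put(base_url, bucket, key, body):
    """Prepare (but do not send) a PUT that stores body under bucket/key."""
    url = f"{base_url}/buckets/{bucket}/keys/{key}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def timed_get(base_url, bucket, key, timeout=5.0):
    """Read a key; return elapsed seconds, or None on timeout or error."""
    url = f"{base_url}/buckets/{bucket}/keys/{key}"
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except OSError:
        # Covers socket timeouts, refused connections and HTTP errors.
        return None
    return time.monotonic() - start
```

To use it, build a request with build_put("http://127.0.0.1:8098",
"test_bucket", "big_key", make_payload(20 * 1024 * 1024)), send it with
urllib.request.urlopen, and then call timed_get in a loop on other keys to
see whether reads start timing out. Deleting the object afterwards and
repeating the reads shows whether the timeouts go away.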

I don't know why this happens, but we can say that the large object was the
cause of our timeouts.

Maybe someone has a better understanding of this. I'd like to be kept in the
loop on this conversation if possible.

Thanks!
Sorin.

On 5 January 2015 at 18:07, Andy Pellett <andy at embed.ly> wrote:

> Hi all,
>
> I've been experiencing stalls where riak won't return any data (queries
> time out) with my riak cluster. Here are some basic details:
>
> - 8 nodes
> - riak 1.4.10 (upgraded from 1.4.6 -> 1.4.8 -> 1.4.10)
> - leveldb backend
> - n_val is 2
> - allow_mult is false
> - ec2 i2.2xlarge boxes (8 cores, 61gb ram, 800gb disk space)
> - about 33% disk space utilization per node
>
> The riak cluster will stall for as long as a few minutes at a time, but
> will otherwise work as expected for hours. There doesn't seem to be an
> obvious pattern as to when the stalls happen.
>
> My first thought was that the stalls may be related to AAE, but I've
> disabled that via 'riak attach' and the settings file. Sidenote, I still
> see messages like:
>
> 2015-01-05 12:24:04.666 [info]
> <0.574.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
> AAE throttle from 0 -> 10 msec/key, based on maximum vnode mailbox size 209
> from 'riak-user at riak-host'
>
> which makes me question whether AAE is actually turned off.
>
> Now I'm leaning towards leveldb compactions being the issue. What can I do
> to verify this is the issue, and how can I fix it?
>
> I see log messages about large objects:
>
> 2015-01-05 16:11:28.046 [warning]
> <0.6398.0>@riak_kv_vnode:encode_and_put_no_sib_check:1830 Writing very
> large object (11307735 bytes) to <<"BucketName">>/<<"keys_1420466400">>
>
> Could these be causing longer-running compactions, or more frequent
> compactions?
>
> Thanks for reading,
> Andy


-- 
Sorin Manole
Senior Software Engineer, Trustev
m:+353 86 051 2658 | e:sorin.manole at trustev.com | w:www.trustev.com
a: Trustev Ltd, 2100 Airport Business Park, Cork, Ireland.
