Riak nodes constantly crashing

Steven Joseph steven at streethawk.co
Wed Oct 26 17:17:46 EDT 2016


I don't think you should disable AAE; you can tune its frequency instead.

Steven

On Thu, 27 Oct 2016 03:50 Ricardo Mayerhofer <ricardo.ekm at gmail.com> wrote:

> Yes, I'll check if the problem is the AAE! I will disable it and see the
> results.
>
> Thanks Steven!
>
> On Tue, Oct 25, 2016 at 6:54 PM, Steven Joseph <steven at streethawk.co>
> wrote:
>
> Hi Ricardo,
>
> If you are using systemd, you might have to check LimitNOFILE for your
> units. Active anti-entropy (AAE) runs periodically.
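A shell's `ulimit -n` shows the limit for that shell, which can differ from the limit the running Riak process was started with (for example when a systemd unit sets LimitNOFILE). On Linux, the limit in effect for a process is listed in /proc/<pid>/limits. A minimal sketch of reading it, assuming that file's usual layout; the sample text and the function name `max_open_files` are hypothetical, not output from the cluster in this thread:

```python
def max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Values are returned as strings because a limit may read 'unlimited'.
    Returns None if the line is not present.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return soft, hard
    return None


# Hypothetical sample of /proc/<pid>/limits, for illustration only.
SAMPLE = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            4096                 4096                 files
Max locked memory         65536                65536                bytes
"""

print(max_open_files(SAMPLE))
```

On a node, the input would come from something like `open(f"/proc/{pid}/limits").read()`, where `pid` is the process ID of the Erlang VM (typically named beam.smp) running Riak.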
>
> Steven
>
> On Wed, 26 Oct 2016 04:36 Ricardo Mayerhofer <ricardo.ekm at gmail.com>
> wrote:
>
> What's weird is that the node crashes every minute at the same second. Is
> there anything Riak may be running every minute?
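The once-a-minute pattern can be checked directly from the error.log timestamps in the original report. A small sketch that computes the gap between consecutive crash reports; only the timestamps come from the thread, and the function name `crash_intervals` is made up for illustration:

```python
from datetime import datetime

# Timestamps copied from the error.log excerpt in the original report.
CRASH_TIMES = [
    "2016-10-24 21:57:29.185",
    "2016-10-24 21:58:29.187",
    "2016-10-24 21:59:29.228",
    "2016-10-24 22:00:29.218",
    "2016-10-24 22:01:29.197",
    "2016-10-24 22:02:29.231",
]


def crash_intervals(stamps):
    """Return the gaps, in seconds, between consecutive timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


intervals = crash_intervals(CRASH_TIMES)
for gap in intervals:
    print(f"{gap:.3f} s")
```

Every gap comes out within a few hundredths of a second of 60 s, which suggests something scheduled on a one-minute timer, whether inside Riak or an external client.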
>
> On Mon, Oct 24, 2016 at 8:28 PM, Ricardo Mayerhofer <ricardo.ekm at gmail.com>
> wrote:
>
> I'm also pasting the free -m:
>
>              total       used       free     shared    buffers     cached
> Mem:         15039      14557        482          0         37       4594
> -/+ buffers/cache:       9925       5114
> Swap:            0          0          0
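The "-/+ buffers/cache" row is just arithmetic on the "Mem:" row, and it is the row that shows how much memory processes actually hold. A short worked example using the figures above (MiB); the variable names are illustrative:

```python
# Values in MiB, copied from the "Mem:" row of the `free -m` output above.
total, used, free, buffers, cached = 15039, 14557, 482, 37, 4594

# Memory held by processes: "used" minus what the kernel uses for
# buffers and page cache.
used_by_processes = used - buffers - cached

# Memory that is free or held as buffers/page cache, which the kernel
# can largely give back to processes on demand.
free_or_reclaimable = free + buffers + cached

print(used_by_processes, free_or_reclaimable)
```

The results differ by 1 MiB from the 9925 and 5114 that `free` printed, because `free` rounds each column from KiB separately. Roughly a third of RAM is free or in cache, which fits the assessment that memory is not exhausted, although with no swap configured there is little headroom if usage spikes.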
>
> On Mon, Oct 24, 2016 at 8:24 PM, Ricardo Mayerhofer <ricardo.ekm at gmail.com>
> wrote:
>
> Hi Alexander,
> Thanks for your response. We use multi-backend with bitcask and leveldb.
>
> - File descriptors seem to be OK, at least according to the config.
>
> ubuntu at ip-10-2-58-5:/var/log/riak$ sudo su riak
> sudo: unable to resolve host ip-10-2-58-5
> riak at ip-10-2-58-5:/var/log/riak$ ulimit -n
> 65535
>
> - Memory seems to be ok as well:
>
> KiB Mem:  15400916 total, 14493744 used,   907172 free,    36244 buffers
>
> - Disk is ok
>
> /dev/xvda1       20G  4.1G   15G  22% /                       # root device
> /dev/xvdb       148G   69G   72G  49% /mnt/riak-data          # bitcask and riak data disk
> /dev/xvdc       296G   23G  258G   8% /mnt/riak-data/leveldb  # leveldb disk
>
> Any other idea? Thanks.
>
> On Mon, Oct 24, 2016 at 8:06 PM, Alexander Sicular <siculars at basho.com>
> wrote:
>
> Disk, memory or file descriptors would be my guess. Bitcask?
>
>
> On Monday, October 24, 2016, Ricardo Mayerhofer <ricardo.ekm at gmail.com>
> wrote:
>
> Hi all,
> I have a Riak 1.4 cluster where the nodes seem to be constantly crashing.
> All 5 nodes are affected.
>
> However, it seems Riak manages to bring them up again.
>
> Any idea what's going on? Error logs below.
>
> Thanks.
>
> error.log
> ...
> 2016-10-24 21:57:29.185 [error] <0.24570.1174> CRASH REPORT Process
> <0.24570.1174> with 0 neighbours crashed with reason: no case clause
> matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line
> 107
> 2016-10-24 21:58:29.187 [error] <0.7109.1175> CRASH REPORT Process
> <0.7109.1175> with 0 neighbours crashed with reason: no case clause
> matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line
> 107
> 2016-10-24 21:59:29.228 [error] <0.19612.1175> CRASH REPORT Process
> <0.19612.1175> with 0 neighbours crashed with reason: no case clause
> matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line
> 107
> 2016-10-24 22:00:29.218 [error] <0.1356.1176> CRASH REPORT Process
> <0.1356.1176> with 0 neighbours crashed with reason: no case clause
> matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line
> 107
> 2016-10-24 22:01:29.197 [error] <0.11380.1176> CRASH REPORT Process
> <0.11380.1176> with 0 neighbours crashed with reason: no case clause
> matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line
> 107
> 2016-10-24 22:02:29.231 [error] <0.24279.1176> CRASH REPORT Process
> <0.24279.1176> with 0 neighbours crashed with reason: no case clause
> matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line
> 107
>
> crash.log
> 2016-10-24 21:51:56 =CRASH REPORT====
>   crasher:
>     initial call: mochiweb_acceptor:init/3
>     pid: <0.28136.1621>
>     registered_name: []
>     exception error:
> {{case_clause,{ok,{http_error,"exit\r\n"},<<>>}},[{mochiweb_http,request,3,[{file,"src/mochiweb_http.erl"},{line,107}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
>     ancestors: ['http_0.0.0.0:8098_mochiweb',riak_core_sup,<0.148.0>]
>     messages: []
>     links: [<0.201.0>,#Port<0.235869290>]
>     dictionary: []
>     trap_exit: false
>     status: running
>     heap_size: 377
>     stack_size: 24
>     reductions: 423
>   neighbours:
> 2016-10-24 21:52:56 =CRASH REPORT====
>   crasher:
>     initial call: mochiweb_acceptor:init/3
>     pid: <0.7845.1622>
>     registered_name: []
>     exception error:
> {{case_clause,{ok,{http_error,"exit\r\n"},<<>>}},[{mochiweb_http,request,3,[{file,"src/mochiweb_http.erl"},{line,107}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
>     ancestors: ['http_0.0.0.0:8098_mochiweb',riak_core_sup,<0.148.0>]
>     messages: []
>     links: [<0.201.0>,#Port<0.235879110>]
>     dictionary: []
>     trap_exit: false
>     status: running
>     heap_size: 377
>     stack_size: 24
>     reductions: 406
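One possible reading of these reports, not confirmed anywhere in this thread: the process that dies is a mochiweb acceptor on the HTTP listener (initial call mochiweb_acceptor:init/3, ancestor 'http_0.0.0.0:8098_mochiweb'), and the case_clause carries an http_error for a line, "exit" followed by CRLF, that could not be parsed as HTTP. That would point at a client sending non-HTTP text to port 8098 rather than at the whole node going down. A minimal sketch that pulls those two facts out of a crash.log entry; the function name `summarize_crash` and the regular expressions are illustrative assumptions about the log layout shown above:

```python
import re

# Part of one crash.log entry, copied from the original report. A raw
# string keeps the Erlang-escaped \r\n as literal backslash sequences.
CRASH_REPORT = r'''
    initial call: mochiweb_acceptor:init/3
    exception error:
{{case_clause,{ok,{http_error,"exit\r\n"},<<>>}},[{mochiweb_http,request,3,[{file,"src/mochiweb_http.erl"},{line,107}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
    ancestors: ['http_0.0.0.0:8098_mochiweb',riak_core_sup,<0.148.0>]
'''


def summarize_crash(report):
    """Extract the rejected request line and the listener it arrived on."""
    payload = re.search(r'\{http_error,"((?:[^"\\]|\\.)*)"\}', report)
    listener = re.search(r"'http_([\d.]+):(\d+)_mochiweb'", report)
    return {
        "payload": payload.group(1) if payload else None,
        "address": listener.group(1) if listener else None,
        "port": int(listener.group(2)) if listener else None,
    }


summary = summarize_crash(CRASH_REPORT)
print(summary)
```

If that reading holds, the next thing to look for would be whatever connects to port 8098 once a minute, such as a monitoring or health-check script.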
> --
> Ricardo Mayerhofer
>
>
>
> --
>
>
> Alexander Sicular
> Solutions Architect
> Basho Technologies
> 9175130679
> @siculars
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com