my cluster spontaneously loses a node after ~48hrs
jason at soundhound.com
Tue May 12 14:30:50 EDT 2015
Bryan- thanks for this tip,
i probably wouldn't have suspected ulimit.
It's been 4 days now since my last restart, an uptime i'd never reached
before, though i'm hesitant to say that the problem has gone away.
i suspect this progress is due to some library updates
we performed on the hosts. it turns out some of the libraries were out
of date, which i wasn't aware of. we updated them all at once,
so there's no way to identify which one(s) caused the problem,
assuming that was the problem. hopeful now,
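
for anyone wanting to check the ulimit side of this, here is a minimal
sketch (python, standard library only) that reads the open-file limit a
process actually receives. the 65536 threshold and the names
ASSUMED_MIN_OPEN_FILES and open_file_limit_ok are assumptions made for
illustration, not values taken from this thread; the recommended
settings are the ones on docs.basho.com.

```python
import resource

# Assumed threshold for illustration only; use the value that
# docs.basho.com recommends for your Riak version.
ASSUMED_MIN_OPEN_FILES = 65536


def open_file_limit_ok(minimum=ASSUMED_MIN_OPEN_FILES):
    """Return (soft, hard, ok) for this process's open-file limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit", which always satisfies the minimum.
    ok = soft == resource.RLIM_INFINITY or soft >= minimum
    return soft, hard, ok


if __name__ == "__main__":
    soft, hard, ok = open_file_limit_ok()
    status = "ok" if ok else "below the assumed minimum"
    print("open files: soft=%s hard=%s (%s)" % (soft, hard, status))
```

limits are per process and inherited from the parent, so this only
reflects the node if it is run as the same user, and from the same kind
of session, that starts riak.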
On 12.05.2015 05:37, Bryan Hunt wrote:
> Also ensure ulimit is set according to the recommendations on
> docs.basho.com. ulimit set too low is a common cause of node
> On 5 May 2015 21:23, "Jason Golubock" <jason at soundhound.com> wrote:
>> Scott - thanks for the response,
>> yes i've used all those tools at one point, but i'm not sure
>> exactly what i'm looking for or what to do with the output.
>> i've restarted my cluster again but next time it happens,
>> i'll attach some output/snapshot files.
>> ~ Jason
>> On 04.05.2015 19:32, Scott Lystig Fritchie wrote:
>>> Hi, Jason. Have you tried using the system inspection utilities
>>> with Riak?
>>> The "top" utility can show very quickly the most active processes
>>> in the virtual machine.
>> riak-users mailing list
>> riak-users at lists.basho.com