crash after single insert

Nico Meyer nico.meyer at adition.com
Tue May 10 07:10:56 EDT 2011


Hi again!

I just encountered this problem again myself, so I was able to check my 
theory.
So one of the bitcask.write.lock files contained this:

2272 /var/lib/riak/bitcask/1121816686466884466511812771987303177196838846464/1305008752.bitcask.data

and sure enough 'ps axu' gives me:

riak      2269  0.0  0.0  10624   396 ?        S    12:46   0:00 inet_gethost 4
riak      2270  0.0  0.0  10624   432 ?        S    12:46   0:00 inet_gethost 4
riak      2271  0.0  0.0  10624   432 ?        S    12:46   0:00 inet_gethost 4
riak      2272  0.0  0.0  10624   384 ?        S    12:46   0:00 inet_gethost 4
root      3139  0.0  0.0      0     0 ?        S    13:00   0:00 [flush-254:1]
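
To illustrate the failure mode, here is a minimal Python sketch of a stricter
staleness check. It is not how bitcask does it (bitcask is Erlang), just one
possible approach: besides asking whether ANY process has the PID from the
lock file, it also compares the process name. The function names, the
"<pid> <path>" lock format as read from the file above, the Linux-only
/proc/<pid>/comm lookup, and the expected name "beam.smp" are all assumptions
made for this example.

```python
import os


def parse_lock(contents):
    """Split lock contents of the assumed form '<pid> <data file path>'."""
    parts = contents.split(None, 1)
    pid = int(parts[0])
    path = parts[1].strip() if len(parts) > 1 else ""
    return pid, path


def pid_exists(pid):
    """True if ANY process currently has this PID (POSIX only).

    Signal 0 performs the existence and permission checks without
    actually delivering a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def process_name(pid):
    """Linux-only: read the command name from /proc; None if unavailable."""
    try:
        with open(f"/proc/{pid}/comm") as f:
            return f.read().strip()
    except OSError:
        return None


def lock_is_stale(contents, expected_name, name_of=process_name):
    """Treat the lock as stale if its PID is gone or was reused.

    expected_name is a hypothetical parameter: the command name the
    lock owner is assumed to run under.
    """
    pid, _path = parse_lock(contents)
    if not pid_exists(pid):
        return True
    name = name_of(pid)
    # If the name cannot be determined, stay conservative and keep the lock.
    return name is not None and name != expected_name
```

With the lock file above, a plain PID check sees PID 2272 as alive, while a
call such as lock_is_stale(contents, "beam.smp") would report the lock as
stale, because the process that now holds PID 2272 is inet_gethost. Comparing
names is still only a heuristic; a second riak node on the same machine would
pass the check.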

Cheers,
Nico

On 10.05.2011 03:07, Gary William Flake wrote:
> (Removing riak-users.)
>
> This was on an Ubuntu 10.04 box.  Riaksearch was auto-started in init.d, but we occasionally start/stop the service as part of our application stack.  In this one case, we did a shutdown from an admin web console, which may not have called the proper shutdown procedures in init.d.  On restart, I noticed the issues and found the locked files.  Removing them did the trick.
>
> -- GWF
>
> On May 9, 2011, at 7:10 AM, David Smith wrote:
>
>> Hmm...ok. Will have to ponder how we can fix that.
>>
>> Thanks!
>>
>> D.
>>
>> On Mon, May 9, 2011 at 8:09 AM, Nico Meyer<nico.meyer at adition.com>  wrote:
>>> Hi Dave,
>>>
>>> I believe the problem occurs if another process happens to have the
>>> same PID as the old (now gone) riak node. This can happen if the
>>> machine was rebooted since the riak node crashed, or if the PIDs
>>> wrapped around; they are only two bytes, after all.
>>> os_pid_exists/1 only checks for ANY process with the PID from the
>>> lockfile
>>> (https://github.com/basho/bitcask/blob/master/src/bitcask_lockops.erl#L116).
>>>
>>>
>>>
>>> On Monday, 09.05.2011, at 07:06 -0600, David Smith wrote:
>>>> On Sat, May 7, 2011 at 9:25 AM, Gary William Flake<gary at flake.org>  wrote:
>>>>> That was it, Nico.  Thanks.
>>>>>
>>>>> I know we did a forced shutdown this week, which was probably the cause.  But I would have thought that riak would have taken care of its own lock file bookkeeping on restarting.
>>>> Bitcask does:
>>>>
>>>> https://github.com/basho/bitcask/blob/master/src/bitcask_lockops.erl#L46
>>>>
>>>> It's curious that the logic didn't handle the case. What platform/OS
>>>> are you on? Are you using init scripts to restart on boot?
>>>>
>>>> Thanks,
>>>>
>>>> D.
>>>
>>>
>>
>>
>> -- 
>> Dave Smith
>> Director, Engineering
>> Basho Technologies, Inc.
>> dizzyd at basho.com
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




