riak random block missing on GET after some hour

Charles Bijon bijon.charles at gmail.com
Wed Jul 23 05:55:21 EDT 2014


Hello,

We have had a problem for 4 days.

The story:

We migrated from our 38 old machines to 45 new servers to increase
our production capacity. As part of this, we upgraded Riak (1.4.8 ->
1.4.9 -> 1.4.10). Today we run Riak 1.4.10, Riak CS 1.4.5 and
Stanchion 1.4.3.

Now something strange has appeared in our storage: new data put on the
Riak cluster becomes corrupted over time (AAE is enabled and was enabled
during the migration).

We have this message in the log:

[error] <0.13320.0>@riak_cs_get_fsm:waiting_chunks:311 riak_cs_get_fsm:
Cannot get S3 <<"blabla">> <<"blabla/blabla/blabla/blabla.foo">> block#
{<<94,144,214,192,123,131,68,132,142,55,30,108,189,81,242,106>>,0}:
{error,notfound}
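
To make the affected block easier to search for, the identifier in this
message can be turned into a readable UUID. Below is a minimal Python
sketch. It assumes the 16-byte binary is the object's UUID and the
trailing integer is the block sequence number. The helper name
block_id_from_log is made up for illustration and is not part of Riak CS.

```python
import re
import uuid


def block_id_from_log(line):
    """Parse a '{<<b1,b2,...>>,N}' fragment from a riak_cs_get_fsm log line.

    Returns (uuid_string, sequence). Assumes the binary holds exactly
    16 bytes; uuid.UUID raises ValueError otherwise.
    """
    # Match an Erlang tuple whose first element is a binary of integers.
    match = re.search(r"\{<<([\d,\s]+)>>,\s*(\d+)\}", line)
    if match is None:
        raise ValueError("no block identifier found")
    raw = bytes(int(b) for b in match.group(1).split(","))
    return str(uuid.UUID(bytes=raw)), int(match.group(2))


if __name__ == "__main__":
    sample = ("block#{<<94,144,214,192,123,131,68,132,142,55,30,108,"
              "189,81,242,106>>,0}: {error,notfound}")
    print(block_id_from_log(sample))
```

For the message above this gives the UUID
5e90d6c0-7b83-4484-8e37-1e6cbd51f26a with sequence number 0.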

Yesterday I disabled AAE to test whether the problem continues, and we
put the data again to partially rebuild the storage.

The riak diag output is OK, and so is the ring-status.

Has anyone already had this problem?

Is it advisable to go back to version 1.4.8?

Do I have to re-enable AAE? And under what conditions?

If I should go back to Riak 1.4.8, how do I do it without losing
anything? Is this the right approach? Will the corrupted data become
usable again after the rollback?

It's a little hell right now, I really need a helping hand.

Regards,

Charles







