documents disappear from riak search index

francisco treacy francisco.treacy at gmail.com
Tue Jan 31 07:44:57 EST 2012


Hi Ryan,

It's happening again.

I think it's even worse now: when I compare books by counting (search
vs k/v), both yield the same number of documents. Yet users are still
complaining. I was mistakenly relying on those counts.

This is about users and the books they see on their bookshelves, which
is a search query on bookId AND userId. In this case userId 'maya012'
and bookId '8542'.

db.get('user_books', 'maya012-8542')
=> { }

It's found, k/v works at least.

db.addSearch('user_books', 'bookId:8542 AND userId:maya012').reduce({
language: 'erlang', module: 'riak_kv_mapreduce', function:
'reduce_identity' }).run(function(err,d) { console.log(d.length) })
=> 0

Uh oh, nothing there. Let's see what we have:

db.addSearch('user_books', 'userId:maya012').reduce({ language:
'erlang', module: 'riak_kv_mapreduce', function: 'reduce_identity'
}).run(function(err,d) { console.log(d) })
=> [ [ 'user_books', 'maya012-8528' ] ]

Aha. The user is supposed to have 2 books, but search returns only
one. Let's see if there's a disparity between counts in K/V and search:

db.add('user_books').map('query', 'where
.bookId:val("8542")').run(function(err, d) { console.log(d.length) })
=> 660

db.addSearch('user_books', 'bookId:8542').reduce({ language: 'erlang',
module: 'riak_kv_mapreduce', function: 'reduce_identity'
}).run(function(err,d) { console.log(d.length) })
=> 660

WTH? It seems that 'userId' is introducing some weirdness here.

I fetched/saved the document, and now things are back to normal...

db.addSearch('user_books', 'bookId:8542 AND userId:maya012').reduce({
language: 'erlang', module: 'riak_kv_mapreduce', function:
'reduce_identity' }).run(function(err,d) { console.log(d.length) })
=> 1

...for maya012. Go figure how many other users are getting incorrect
data. Oh yeah, I need to write another script to check them one by one.
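
A minimal sketch of the comparison such a script would need, in plain
JavaScript with no Riak calls. It assumes keys have the form
'userId-bookId' (as in 'maya012-8542') and that search results arrive
as [bucket, key] pairs, as reduce_identity returned them above. The
function names parseKey and missingFromSearch are hypothetical.

```javascript
// Sketch only: pure helpers, no Riak calls.

// Split 'maya012-8542' into { userId: 'maya012', bookId: '8542' }.
// Splits on the last hyphen, in case a userId itself contains one.
function parseKey(key) {
  const i = key.lastIndexOf('-');
  return { userId: key.slice(0, i), bookId: key.slice(i + 1) };
}

// kvKeys: keys known to exist in K/V, e.g. ['maya012-8542', ...]
// searchResults: [bucket, key] pairs returned by the search query
// Returns the K/V keys that the search results do not contain.
function missingFromSearch(kvKeys, searchResults) {
  const indexed = new Set(searchResults.map(function (pair) {
    return pair[1];
  }));
  return kvKeys.filter(function (key) {
    return !indexed.has(key);
  });
}

// Example using the keys from this message.
const kvKeys = ['maya012-8528', 'maya012-8542'];
const searchResults = [['user_books', 'maya012-8528']];
const missing = missingFromSearch(kvKeys, searchResults);
console.log(missing.map(parseKey));
```

Comparing keys rather than counts matters here: the per-bookId counts
matched (660 vs 660) even though the combined query missed a document.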

I waded through the logs, because there are tons of errors regarding
Luwak... (oh right, that too! Probably 5% of the files return 0 bytes,
awesome, but I digress) ... and can't find anything that is even
close to being search-related.

I did this for all 3 nodes in the cluster. I did health checks,
nothing abnormal... enough disk space, no resource spikes, all
transfers complete... I don't know. Things seem to be normal, or maybe
I'm not enough of a mad scientist to notice it. Do I need to do some
cleanup, or restart the nodes?

This is starting to get bad.

Thanks,
Francisco


2012/1/14 Ryan Zezeski <rzezeski at basho.com>:
>
>
> On Sat, Jan 14, 2012 at 11:58 AM, francisco treacy
> <francisco.treacy at gmail.com> wrote:
>>
>>
>> (I don't know how much Search has changed in 1.0.x, but keep in mind
>> I'm running 0.14.2 in production.)
>>
>> I'd love to provide you with more information, but don't have much
>> time to go digging around. The workaround is, well, a workaround, but
>> it let us move on. I'll keep a special eye on this until it happens
>> again, and let's definitely keep each other posted.
>>
>
> Even in 0.14.2 I find it hard to believe the index is dropping 5% of the
> data on the floor.  It's entirely possible, but just not the first, or
> second or third thing I would expect.
>
> Please keep me posted.  I'll keep this incident in the back of my head when
> looking at Search code.


