"Failed to compact" in RiakSearch

Rusty Klophaus rusty at basho.com
Thu Apr 14 14:00:14 EDT 2011

Hi Morten,

Thanks for sending the log files. I was able to figure out, at least
partially, what's going on here.

The "Failed to compact" message is a result of trying to index a token
that's greater than 32kb in size. (The index storage engine, called
merge_index, assumes token sizes smaller than 32kb.) I was able to decode
part of the term in question by pulling data from the log file, and it looks
like you may be indexing HTML with base64-encoded inline images, e.g. <img
src="data:image/jpeg;base64,iVBORw0KG...">. The inline image is being treated
as a single token, and it's greater than 32kb.

The short-term workaround is to do one of the following:

1) Preprocess your data to avoid this situation.
2) Create a custom analyzer that limits the size of terms. (See
http://wiki.basho.com/Riak-Search---Schema.html for more information about
analyzers and custom analyzers.)
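
As a rough illustration of option 1, here is a minimal Python sketch of a
preprocessing step that strips base64 data URIs from HTML and drops any
remaining oversized tokens before the document is sent for indexing. The
function names, the regular expression, and the exact 32 * 1024 byte limit
are assumptions made for this example; they are not part of Riak Search, and
whitespace splitting only approximates what a real analyzer does.

```python
import re
from typing import List

# Assumed limit: the exact threshold inside merge_index is not stated here,
# so 32 * 1024 bytes is a guess based on the "32kb" figure.
MAX_TOKEN_BYTES = 32 * 1024

# Matches inline data URIs such as data:image/jpeg;base64,iVBORw0KG...
DATA_URI_RE = re.compile(r"data:[\w.+-]+/[\w.+-]+;base64,[A-Za-z0-9+/=]+")


def strip_data_uris(html: str) -> str:
    """Remove base64 data URIs from the HTML text."""
    return DATA_URI_RE.sub("", html)


def oversized_tokens(text: str, limit: int = MAX_TOKEN_BYTES) -> List[str]:
    """Return whitespace-delimited tokens whose UTF-8 size reaches the limit."""
    return [t for t in text.split() if len(t.encode("utf-8")) >= limit]


def preprocess(html: str, limit: int = MAX_TOKEN_BYTES) -> str:
    """Strip data URIs, then drop any remaining oversized tokens.

    Note: runs of whitespace are collapsed to single spaces as a side effect.
    """
    cleaned = strip_data_uris(html)
    kept = [t for t in cleaned.split() if len(t.encode("utf-8")) < limit]
    return " ".join(kept)
```

Running oversized_tokens() over a sample of your documents before loading
them is a cheap way to confirm whether inline images are the only source of
large terms.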

The long-term solution is for us to increase the maximum token size in
merge_index. I've filed a bugzilla issue for this, trackable here:

I'm still investigating the "Too many db tables" error. It is caused by
the system opening too many ETS tables. It *may* be related to the
compaction error described above, but I'm not sure.

Search (specifically merge_index) uses ETS tables heavily, and the number of
tables is affected by a few different factors. Can you send me some more
information to help debug, specifically:

   - How many partitions (vnodes) are in your cluster? (If you haven't
   changed any settings, then the default is 64.)
   - How many machines are in your cluster?
   - How many segments are on the node where you are seeing these errors?
   (Run: "find DATAPATH/merge_index/*/*.data | wc -l", replacing DATAPATH
   with the path to your Riak data directory for that node.)
   - Approximately how much data are you loading (# Docs and # MB), and how
   quickly are you trying to load it?
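
If a per-partition breakdown of the segment count is easier to read than a
single total, the find command above can be approximated with a short Python
sketch like the one below. The function name count_segments is made up for
this example, and it assumes the same DATAPATH/merge_index/<partition>/*.data
layout that the find command relies on.

```python
from collections import Counter
from pathlib import Path


def count_segments(data_path: str) -> Counter:
    """Count *.data files under DATAPATH/merge_index/<partition>/."""
    counts = Counter()
    for segment in Path(data_path).glob("merge_index/*/*.data"):
        # Key each count by the partition directory that holds the segment.
        counts[segment.parent.name] += 1
    return counts
```

sum(count_segments(DATAPATH).values()) should give the same number as the
find command, and the individual entries show whether a few partitions hold
most of the segments.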


On Thu, Apr 14, 2011 at 3:07 AM, Morten Siebuhr <sbhr+lists at sbhr.dk> wrote:

> Hi Rusty & al,
> On Wed, Apr 13, 2011 at 11:20 PM, Rusty Klophaus <rusty at basho.com> wrote:
> > Thanks Morten, having the logs (including the numbers) will help us debug
> > what's going on.
> Here it is.
> It seems we've hit some db-imposed limit during the night's test data
> import - I'll have to investigate that too...
> Kind regards,
> Morten Siebuhr