riaksearch memory growth issues

Gordon Tillman gtillman at mezeo.com
Tue May 31 17:25:42 EDT 2011


Howdy Gilbert,

Hey, we are testing a fix now.  If it works, I will send you a copy of the update file.

--gordon


On May 31, 2011, at 12:55 , Gilbert Glåns wrote:

> Hi Gordon,
> Thank you for sharing the information.  We are seeing exactly the same
> type of behavior from our search cluster.  I have tracked the
> problem(s) through the query system.  It looks like the mailboxes we
> are both seeing are "abandoned" and/or the messages are never
> matched within the Erlang code (it_op_collector_loop,
> riak_search_op_utils.erl); the messages are then never processed,
> so the resources they use are never released.  This is a major
> problem.
> 
> I have been debugging this for some time and I wish I could say it was
> going well.  The implementation is convoluted -- have you gotten
> through it?  Can you verify the same cause?
> 
> We have been internally discussing the possibility of removing this
> query processing implementation completely and replacing it with
> something built in-house because the problems we have uncovered trying
> to debug the "abandoned mailbox" problem are related and systemic:  1)
> indeterminate and possibly very large data structures created and
> manipulated for intermediate and final sets of results, 2) very poor
> or non-existent ability to gain any insight into what is executing
> within the "plumbing" of the current query execution system without
> "herculean" effort (in my opinion), and 3) unacceptable performance
> (predictably or subjectively) from the merge_index riak_search
> backend.
> 
> Are there any other backends available for riak_search with the
> Enterprise Riak offering?  I really like the design of riak_search but
> the performance seems to be only a very small fraction of our
> equivalent SOLR installation, even with several times the amount of
> resources "thrown at it" -- it does not seem to use resources we
> "throw at it" well, either, or in the mailboxes case, responsibly.
> 
> I will quickly admit I may be doing something wrong.  Is there a
> user-error situation in which mailboxes would be abandoned while
> still taking up memory?
> 
> Does anyone else have experiences with equivalent riak_search vs. SOLR
> installations?
> 
> Thanks again for sharing Gordon.  Your results make me feel like this
> may not be entirely stupidity on my part.
> 
> Gilbert
> 
> 
> On Tue, May 31, 2011 at 8:51 AM, Gordon Tillman <gtillman at mezeo.com> wrote:
>> Howdy Gilbert,
>> I reproduced the issue this morning and then ran the command that you
>> specified on two of the non-empty mailboxes.
>> The output from that is posted here:
>> https://gist.github.com/1000735
>> Please let me know if this corresponds to the issue that you are seeing.
>> Thank you,
>> --gordon
>> 
>> On May 27, 2011, at 20:10 , Gilbert Glåns wrote:
>> 
>> Gordon,
>> Could you try:
>> 
>> erlang:process_info(list_to_pid("<0.16614.32>"), [messages,
>> current_function, initial_call, links, memory, status]).
>> 
>> in a riak search console for one/some of those mailboxes and share the
>> results? I am curious to see if you are having the same systemic
>> memory consumption I am experiencing.
>> 
>> Gilbert
>> 
>> On Fri, May 27, 2011 at 5:15 PM, Gordon Tillman <gtillman at mezeo.com> wrote:
>> 
>> Howdy Gang,
>> 
>> We are having a bit of an issue with our 3-node riaksearch cluster.  What is
>> happening is this:
>> 
>> The cluster is up and running.  We start testing our application against it.
>> As the application runs, the Erlang process consumes more and more memory
>> without ever releasing it.
>> 
>> In trying to investigate the issue we ran the riaksearch-admin cluster_info
>> command.  It appears that the bulk of this memory is being consumed by a
>> bunch of mailboxes.
>> 
>> I have posted both the output of the cluster_info command and the app.config
>> from one of the nodes here:
>> 
>> https://gist.github.com/996419
>> 
>> I would be very grateful if someone from Basho would take a look at the
>> cluster_info and see if they can spot anything obvious.
>> 
>> Each machine in the cluster has an 8-core Xeon and 16GB RAM.  I believe all
>> of the platform details, etc., are in the cluster_info dump.
>> 
>> Many thanks,
>> 
>> --gordon
>> 
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




