Solr error message

Jim Raney jim.raney at physiq.com
Mon Apr 11 21:21:12 EDT 2016



On Apr 11, 2016, at 3:35 PM, Fred Dushin <fdushin at basho.com> wrote:

Hi Jim,

Interesting problem.

That error is occurring here:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_cover.erl#L275

because length(Mapping) and length(UniqNodes) are unequal:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_cover.erl#L262

This might be because you are getting timeouts trying to query the port on remote nodes:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_solr.erl#L324

As you can see, there is a hard-wired 1-second timeout on that RPC call, which could account for why you are seeing this failure during a load run.

You might try rebuilding a version of this module with an increased timeout, to see if that gets you over the hump, or consider making the timeout configurable.
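
To illustrate the failure mode described above, here is a simplified Python model. It is not the yokozuna code, only a sketch of the reasoning: if a port lookup for any node exceeds the timeout, that node is missing from the mapping, the two lengths differ, and the error is raised. The node names, latencies, port value, and function names are all made up for illustration.

```python
RPC_TIMEOUT_S = 1.0  # stands in for the hard-wired 1-second RPC timeout


class PlanError(Exception):
    """Raised when not every node in the plan has a known Solr port."""


def solr_port_mapping(unique_nodes, simulated_latency_s,
                      timeout_s=RPC_TIMEOUT_S, port=8093):
    # Keep only the nodes whose (simulated) lookup finishes within the timeout.
    # `port` is a placeholder value; every node reports the same one here.
    return [(node, port) for node in unique_nodes
            if simulated_latency_s[node] <= timeout_s]


def check_plan(unique_nodes, mapping):
    # Mirrors the length comparison: a shorter mapping means a lookup failed.
    if len(mapping) != len(unique_nodes):
        raise PlanError(
            "Failed to determine Solr port for all nodes in search plan")
    return mapping


# Hypothetical cluster: node3 is slow to answer while under load.
nodes = ["riak@node1", "riak@node2", "riak@node3"]
latency = {"riak@node1": 0.05, "riak@node2": 0.20, "riak@node3": 1.70}

try:
    check_plan(nodes, solr_port_mapping(nodes, latency))
except PlanError as err:
    print(err)

# With a longer timeout the slow node answers in time and the plan succeeds.
print(check_plan(nodes, solr_port_mapping(nodes, latency, timeout_s=5.0)))
```

In this model a longer timeout only hides the slow lookup; it does not make the node respond any faster.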

Riak 2.1.3 ships with yokozuna 2.1.2, whose Git SHA is 3520d11ec21ee08b7c18478fbbe1b61d7e3d8e0f, so you'd want to branch off that point of the tree, if you care to experiment.

If you rebuild the module, you can place the generated beam file in the lib/basho-patches directory of each of your Riak installs and restart Riak (or manually reload the module on each node via the Riak console, if you need to keep your Riak nodes up and running).

Let us know what you find or if you need more assistance.

-Fred

On Apr 11, 2016, at 4:11 PM, Jim Raney <jim.raney at physiq.com> wrote:

Failed to determine Solr port for all nodes in search plan


Fred,

Thanks for the quick response.  After you basically verified that it was a Solr timeout issue, I rebuilt the cluster with 14 nodes to see what would happen.  The amount of time it took for the queries to fail (and for the associated log entries to appear) basically doubled as well.

I -could- try increasing the hard-coded timeout, but I don't think that's the route we want to go, as it is likely this system will have that much data or more being pushed into it, and long query times won't work.  I imagine there is probably some Solr tuning we can do - any ideas on what we could look at that we could pass through the Riak config?

I'm going to try an Oracle 1.8 JDK with it later and see if any GC tuning helps in case there are long GC pauses.

--
Jim Raney
Jim.raney at physiq.com
