Query on Riak Search in a cluster of 3 nodes behind ELB is giving different results every time

Zeeshan Lakhani zlakhani at basho.com
Mon Mar 9 10:02:19 EDT 2015


Hey Santi, Baskar,

Are you noticing increased CPU load as you create more and more indexes? Running `riak-admin top -interval 2` a few times may bring something to light.

I’d see whether you can increase resources, or think more critically about how you’re indexing data for Solr. Does the data share most fields? Can you reuse indexes for some of the data and filter certain queries?
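As a rough illustration of the reuse idea: instead of one index per tenant and document type, several kinds of documents could live in one shared index and be told apart at query time. The sketch below only builds the query string. The field names `tenant_s` and `doc_type_s` are hypothetical and assume the schema indexes `*_s` fields as strings; nothing in this thread confirms how the data is actually modeled.

```java
// Minimal sketch: narrow a shared index to one tenant and document type.
// The field names (tenant_s, doc_type_s) are assumptions for illustration only.
final class SharedIndexQuery {

    // Wraps a value in double quotes, escaping backslashes and quotes,
    // so it is treated as a single phrase in Lucene query syntax.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Combines the tenant and type restrictions with the caller's own query.
    static String buildQuery(String tenant, String docType, String userQuery) {
        return "tenant_s:" + quote(tenant)
                + " AND doc_type_s:" + quote(docType)
                + " AND (" + userQuery + ")";
    }
}
```

The resulting string would be passed as the query of an ordinary search request against the shared index, so adding a new tenant or document type would not require creating (and gossiping) another index.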

You may also want to look at this thread, https://groups.google.com/forum/#!topic/nosql-databases/9ECQpVS0QjE, which discusses modeling Riak Search data, the overhead you’ll incur from gossiping so much metadata, and the limits of what Solr can handle.

Zeeshan Lakhani
programmer | 
software engineer at @basho | 
org. member/founder of @papers_we_love | paperswelove.org
twitter => @zeeshanlakhani

> On Mar 9, 2015, at 8:25 AM, Santi Kumar <santi at veradocs.com> wrote:
> 
> Hi Zeeshan,
> 
> We have typically seen this issue when lots of indexes have been created on that instance. On a t2.medium machine we already have 512+ indexes in the data folder. In that case, creating any new index takes a long time, and associating the index with a bucket fails even after the FetchIndex operation returns success, as shown in the code below.
> 
> Is there any limit on the number of indexes? Could anything related to filesystem handles be causing this issue?
> 
> // Poll until the new index shows up in a FetchIndex response.
> while (!isCreated) {
>     FetchIndex fetchIndex = new FetchIndex.Builder(indexName).build();
>     RiakFuture<com.basho.riak.client.core.operations.YzFetchIndexOperation.Response, String> fetchIndexFuture = client.executeAsync(fetchIndex);
>     try {
>         fetchIndexFuture.await();
>         com.basho.riak.client.core.operations.YzFetchIndexOperation.Response response = fetchIndexFuture.get();
>         List<YokozunaIndex> indexes = response.getIndexes();
>         for (YokozunaIndex index : indexes) {
>             if (indexName.equals(index.getName())) {
>                 isCreated = true;
>                 logger.info("Index " + indexName + " created ");
>                 break; // found it; no need to scan the remaining indexes
>             }
>         }
>     } catch (Exception e) {
>         // The fetch failed or the index is not visible yet; retry on the next pass.
>         logger.warn("Unable to get " + indexName + " Still trying");
>         isCreated = false;
>     }
> }
> 
> 
> On Fri, Mar 6, 2015 at 2:11 AM, Zeeshan Lakhani <zlakhani at basho.com> wrote:
> Hello Baskar, Santi,
> 
> 2-15 minutes is a long while, and we’ve not seen index creation/propagation be so slow. I’d definitely take a closer look at how you’re creating these indexes dynamically on the fly, as index creation is typically a more straightforward admin task.
> 
> We’ve added defaults to solrconfig.xml to handle most typical use-cases. You can read more about solrconfig.xml at http://wiki.apache.org/solr/SolrConfigXml#mainIndex_Section. You may want to take another look and optimize/improve your schema design to prevent such issues. You can read more about Solr’s performance factors at http://wiki.apache.org/solr/SolrPerformanceFactors.
> 
> Thanks.
> 
> 
> 
>> On Mar 5, 2015, at 3:00 PM, Baskar Srinivasan <baskar at veradocs.com> wrote:
>> 
>> Hello Zeeshan,
>> 
>> Thanks for the pointer regarding waiting for index creation in each node in the cluster.
>> 
>> Presently, when the indices get created on one node, it takes a full 2-15 minutes for it to get created on other nodes in the cluster. Following are the timestamps on 3 nodes for a single index:
>> 
>> #Create index request from our server via load balancer
>> 11:16:52.999 [http-bio-8080-exec-3] INFO  c.v.s.u.RiakClientUtil - Created index for bsr-test-fromlocal-1-Access_index
>> 
>> #1st node, immediate creation (12 secs) once call is issued from our server
>> 2015-03-05 19:17:04.135 [info] <0.17388.104>@yz_index:local_create:189 Created index bsr-test-fromlocal-1-Access_index with schema
>> 
>> #2nd node, takes another 4 minutes for creation request to propagate
>> 2015-03-05 19:21:17.879 [info] <0.20606.449>@yz_index:local_create:189 Created index bsr-test-fromlocal-1-Access_index
>> 
>> #3rd node, takes 15 minutes for creation request to propagate
>> 2015-03-05 19:32:32.172 [info] <0.14715.94>@yz_index:local_create:189 Created index bsr-test-fromlocal-1-Access_index
>> 
>> Is there a Solr config we can tune so that propagation to the 2nd and 3rd nodes is faster, on the order of < 60 seconds?
>> 
>> Thanks,
>> 
>> Baskar
>> 
>> 
>> On Thu, Mar 5, 2015 at 9:11 AM, Zeeshan Lakhani <zlakhani at basho.com> wrote:
>> Hello Santi, Baskar. Please keep your messages on the user group mailing list, btw. Thanks.
>> 
>> Here’s an example of our testing harness’s wait_for_index function, https://github.com/basho/yokozuna/blob/develop/riak_test/yz_rt.erl#L420. We check for the index on each of the nodes, which is an approach you can take.
>> 
>> And, as I mentioned, I’m currently working on making Index creation synchronous to make this easier.
>> 
>> If your logs are not pointing to any errors, and given that your bucket and index contain so few objects, I’d delete or mv the search-root/index directory (./data/yz/<<index_name>>) and let AAE resync the data, which should then give you consistent results.
>> 
>> Thanks.
>> 
> 
> 
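For reference, the per-node check described earlier in this thread (the wait_for_index approach of checking for the index on each node) might look roughly like the Java sketch below. It is not a port of the Erlang test helper. The `nodeHasIndex` predicate is a hypothetical hook: in practice it could wrap a FetchIndex call issued through a client connected directly to a single node rather than through the load balancer. The timeout and poll interval are likewise assumptions to be tuned.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Predicate;

// Minimal sketch: wait until every node reports the index, or give up after a timeout.
final class IndexWaiter {

    // nodes:        identifiers of the cluster nodes to check (hypothetical, e.g. host names)
    // nodeHasIndex: hypothetical per-node check supplied by the caller
    // Returns true once all nodes report the index, false on timeout or interruption.
    static boolean waitForIndexOnAllNodes(List<String> nodes,
                                          Predicate<String> nodeHasIndex,
                                          Duration timeout,
                                          Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            boolean allReady = true;
            for (String node : nodes) {
                boolean ready;
                try {
                    ready = nodeHasIndex.test(node);
                } catch (RuntimeException e) {
                    ready = false; // treat a failed check as "not there yet"
                }
                if (!ready) {
                    allReady = false;
                    break; // no point checking the rest in this round
                }
            }
            if (allReady) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollInterval.toMillis()); // pause instead of busy-looping
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

Only after this returns true would the caller go on to associate the index with a bucket. Bounding the wait and sleeping between rounds avoids the tight retry loop in the snippet quoted above, which can spin indefinitely if the index never appears.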
