Reindexing solr after backup restore

Jason Campbell xiaclo at xiaclo.net
Fri Apr 24 20:56:37 EDT 2015


This may be a case of force-replace vs replace vs reip.  I'm happy to see if I can get a new cluster restored from backup to keep the Solr indexes.

The disk backup was all of /var/lib/riak, so it definitely included the YZ indexes before the force-replace, and they were kept on the first node, which was changed with reip.  I stopped each node before its snapshot to ensure consistency.  So I would expect the final restored cluster to be somewhere between the first and last node snapshots in terms of data, and AAE to repair things to a consistent state for that few-minute gap.

I'll experiment with different methods of rebuilding the cluster on Monday and see if I can get it to keep the Solr indexes.  Maybe moving the YZ indexes out of the way during the force-replace, then stopping the node and putting them back could help as well.  I'll let you know the results of the experiments either way.
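
Roughly the sequence I have in mind for each node that would otherwise lose its index (only a sketch; the paths assume the default /var/lib/riak data directory, and riak@old-ip / riak@new-ip are placeholders for the real node names):

```
riak stop

# move the Yokozuna/Solr data aside so the rename can't drop it
# (path assumes the default /var/lib/riak layout)
mv /var/lib/riak/yz /var/lib/riak/yz.keep

# bring the node up under its new name and run the force-replace
# steps from the renaming guide (node names are placeholders)
riak start
riak-admin cluster force-replace riak@old-ip riak@new-ip
riak-admin cluster plan
riak-admin cluster commit

# once the ring settles, put the Solr indexes back
riak stop
rm -rf /var/lib/riak/yz
mv /var/lib/riak/yz.keep /var/lib/riak/yz
riak start
```

Whether Yokozuna will happily pick those index directories back up is exactly what I want to find out.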

Thanks,
Jason

> On 25 Apr 2015, at 09:25, Zeeshan Lakhani <zlakhani at basho.com> wrote:
> 
> Hey Jason,
> 
> Yeah, nodes can normally be joined without the cluster dropping its Solr index, and AAE normally rebuilds the missing KV bits.
> 
> In the case of restoring from a backup and having missing data, we can only recommend reindexing (the indexes that have the issue) with aggressive AAE settings to speed things up. It can be pretty fast. Recreating indexes is cheap in Yokozuna, but are the `data/yz` directories missing from the nodes that were force-replaced? Unless someone else wants to chime in, I’ll gather more info on what occurred with reip vs force-replace.
> 
> Zeeshan Lakhani
> programmer | 
> software engineer at @basho | 
> org. member/founder of @papers_we_love | paperswelove.org
> twitter => @zeeshanlakhani
> 
>> On Apr 24, 2015, at 7:02 PM, Jason Campbell <xiaclo at xiaclo.net> wrote:
>> 
>> Is there a way to do a restore without rebuilding these indexes though?  Obviously this could take a long time depending on the amount of indexed data in the cluster.  It's a fairly big gotcha to say that Yokozuna fixes a lot of the data access issues that Riak has, but if you restore from a backup, it could be useless for days or weeks.
>> 
>> As far as disk consistency goes, the nodes were stopped during the snapshots, so I'm assuming each node is consistent on disk.  And cluster-wide, I would expect the overall data to fall somewhere between the first and last node snapshots.  AAE should still repair the bits left over, but it shouldn't have to rebuild the entire Solr index.
>> 
>> So the heart of the question is: can I join a node to a cluster without dropping its Solr index?  force-replace obviously doesn't work, so what is the harm in running reip on every node instead of just the first?
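>> 
>> Concretely, something like this on every node instead of only the first (a sketch only; the node names are placeholders, and I'm assuming reip would need to be run once per old/new pair so each node's ring file gets rewritten for every renamed member):
>> 
>> ```
>> riak stop
>> # update the node name in vm.args / riak.conf to the new address, then
>> # rewrite the local ring entries (placeholder names below, one reip per pair)
>> riak-admin reip riak@old-ip-1 riak@new-ip-1
>> riak-admin reip riak@old-ip-2 riak@new-ip-2
>> riak start
>> ```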
>> 
>> Thanks for the help,
>> Jason
>> 
>>> On 25 Apr 2015, at 00:36, Zeeshan Lakhani <zlakhani at basho.com> wrote:
>>> 
>>> Hey Jason,
>>> 
>>> Here’s a little more discussion on Yokozuna backup strategies: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2014-January/014514.html.
>>> 
>>> Nonetheless, I wouldn’t say the behavior’s expected; we’re going to be adding more to the docs on how to rebuild indexes.
>>> 
>>> To do so, you could just remove the yz_anti_entropy directory, and make AAE more aggressive, via
>>> 
>>> ```
>>> rpc:multicall([node() | nodes()], application, set_env, [yokozuna, anti_entropy_build_limit, {100, 1000}]).
>>> rpc:multicall([node() | nodes()], application, set_env, [yokozuna, anti_entropy_concurrency, 4]).
>>> ```
>>> 
>>> and the indexes will rebuild. You can try to initialize the building of trees with `yz_entropy_mgr:init([])` via `riak attach`, but a restart would also kick AAE into gear. There’s a bit more related info on this thread: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-March/016929.html.
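>>> 
>>> Roughly, the per-node sequence would be something like this (the path below assumes the default /var/lib/riak data directory, so adjust for your platform):
>>> 
>>> ```
>>> riak stop
>>> 
>>> # remove only the Yokozuna AAE trees; the Solr data itself (data/yz) stays put
>>> rm -rf /var/lib/riak/yz_anti_entropy
>>> 
>>> riak start
>>> ```
>>> 
>>> After the restart, paste the two rpc:multicall expressions above into `riak attach` on one node to apply the aggressive settings cluster-wide, and the trees and indexes will rebuild as the entropy exchanges run.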
>>> 
>>> Thanks.
>>> 
>>> Zeeshan Lakhani
>>> programmer | 
>>> software engineer at @basho | 
>>> org. member/founder of @papers_we_love | paperswelove.org
>>> twitter => @zeeshanlakhani
>>> 
>>>> On Apr 24, 2015, at 1:34 AM, Jason Campbell <xiaclo at xiaclo.net> wrote:
>>>> 
>>>> I think I figured it out.
>>>> 
>>>> I followed this guide: http://docs.basho.com/riak/latest/ops/running/nodes/renaming/#Clusters-from-Backups
>>>> 
>>>> The first Riak node (changed with riak-admin reip) kept its Solr index.  However, the other nodes, which were joined via riak-admin cluster force-replace, dropped their Solr indexes.
>>>> 
>>>> Is this expected?  If so, it should really be in the docs, and there should be another way to restore a cluster keeping Solr intact.
>>>> 
>>>> Also, is there a way to rebuild a Solr index?
>>>> 
>>>> Thanks,
>>>> Jason
>>>> 
>>>>> On 24 Apr 2015, at 15:16, Jason Campbell <xiaclo at xiaclo.net> wrote:
>>>>> 
>>>>> I've just done a backup and restore of our production Riak cluster, and Yokozuna has dropped from around 125 million records to 25 million.  Obviously the IPs have changed, and although the Riak cluster is stable, I'm not sure Solr handled the transition as nicely.
>>>>> 
>>>>> Is there a way to force Solr to rebuild the indexes, or at least get back to the state it was in before the backup?
>>>>> 
>>>>> Also, is this expected behaviour?
>>>>> 
>>>>> Thanks,
>>>>> Jason
>>>> 
>>>> 
>>> 
>> 
> 





More information about the riak-users mailing list