Deleting items from search index increases disk usage

Jeremy Raymond jeraymond at gmail.com
Mon Oct 29 08:19:44 EDT 2012


So the only way to actually free the disk space consumed by the
tombstones in the search index is to bring down the cluster and blow
away the merge index (at /var/lib/riak/merge_index)?

--
Jeremy
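
For illustration only, here is a toy model of the tombstone behaviour discussed in this thread. It assumes an append-only store in which a delete writes a new tombstone record instead of overwriting the old entry, so usage grows until a compaction pass runs. The class `AppendOnlyIndex` and its methods are hypothetical names made up for this sketch; this is not merge_index's actual code or API.

```python
class AppendOnlyIndex:
    """Toy append-only index: every write, including a delete, adds a record."""

    def __init__(self):
        # Each record is (key, value); a value of None marks a tombstone.
        self.log = []

    def put(self, key, value):
        self.log.append((key, value))

    def delete(self, key):
        # A delete does not erase the earlier record; it appends a tombstone.
        self.log.append((key, None))

    def size(self):
        return len(self.log)

    def compact(self):
        # Keep only the latest record per key, then drop the tombstones.
        latest = {}
        for key, value in self.log:
            latest[key] = value
        self.log = [(k, v) for k, v in latest.items() if v is not None]


index = AppendOnlyIndex()
for i in range(5):
    index.put(f"doc{i}", f"terms{i}")
size_before = index.size()

for i in range(3):
    index.delete(f"doc{i}")
size_after_delete = index.size()  # larger than before the deletes

index.compact()
size_after_compact = index.size()  # space is reclaimed only here

print(size_before, size_after_delete, size_after_compact)
```

In this model the record count rises from 5 to 8 after three deletes and falls to 2 only once compaction has run, which matches the symptom of disk usage growing while items are being deleted.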


On Fri, Oct 26, 2012 at 9:14 AM, Jeremy Raymond <jeraymond at gmail.com> wrote:
> Yes, I've read about the tombstoning, but I figured writing the tombstone
> would overwrite the existing item (at worst the data usage would stay
> the same) rather than write something new (the data usage is growing).
>
> In another list post [2], John M outlines the tombstone removal
> behaviour. I've just re-read it. With the default settings for Riak,
> to have the tombstones removed from the search index, will I need to
> read the deleted items again (not the actual items; I'd need to read
> each item's index somehow)?
>
> [2]:http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-October/006048.html
>
> --
> Jeremy
>
>
> On Fri, Oct 26, 2012 at 8:40 AM, Vladimir Shapovalov
> <shapovalov at gmail.com> wrote:
>> Hi Jeremy,
>>
>> As far as I know, a delete operation doesn't physically delete the data;
>> it just marks it as deleted. I ran into that problem a while ago and was
>> also surprised that the data grows instead of shrinking.
>>
>> Cheers
>> Vladimir
>>
>> On Fri, Oct 26, 2012 at 2:26 PM, Jeremy Raymond <jeraymond at gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I had Riak search enabled on a bucket containing millions of items. I
>>> no longer need these items to be searchable so I uninstalled search on
>>> the bucket via search-cmd. I'm looking to free the space consumed by
>>> the search index for this bucket. Following a previous post [1] on
>>> this list I'm deleting the items from the search index by running
>>> search:delete_docs/1. After getting through about 1 million items I'm
>>> seeing a noticeable increase in disk usage across the 3 node cluster,
>>> an increase of about 7GB per node.
>>>
>>> Any ideas on why my disk usage would be increasing rather than
>>> decreasing? The data in the cluster is static, the only activity is me
>>> deleting the items from the search index.
>>>
>>> I'm running Riak 1.1.2 with LevelDB as the backend.
>>>
>>>
>>> [1]:http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-September/005696.html
>>>
>>> --
>>> Jeremy
