object sizes

Alex De la rosa alex.rosa.box at gmail.com
Mon Apr 13 07:26:07 EDT 2015

Hi Bryan,

Thanks for your answer. I don't know how to code in Erlang, so my whole
system relies on Python.

Following Ciprian's curl suggestion, I compared its output with this Python
code over the weekend:

Map object:
curl -I
> 1058 bytes
print sys.getsizeof(obj.value)
> 3352 bytes

Standard object:
curl -I
> 9718 bytes
print sys.getsizeof(obj.encoded_data)
> 9755 bytes

The standard object's size is nearly the same with both approaches, even
though the image binary data was only 5 KB (I assume some overhead here).

For the map object, the size reported via Python is about 3x the size
reported by curl.

I'm not sure this is a realistic way to measure their growth, especially
because the objects I need to monitor are Maps, not unaltered binary data
whose size I can know before storing it.
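
A minimal sketch of why the two numbers can diverge. sys.getsizeof()
reports the in-memory footprint of the Python object itself (for a dict,
only the container, not its contents), which is not the number of bytes
stored or sent. The JSON encoding below is only a stand-in assumption for
"some serialized form"; it is not the format Riak uses for Maps, and
serialized_size and sample_map are hypothetical names.

```python
import json
import sys


def serialized_size(value):
    """Byte length of a compact UTF-8 JSON encoding (a proxy, not Riak's format)."""
    return len(json.dumps(value, separators=(",", ":")).encode("utf-8"))


# Binary payload: len() is the payload size; getsizeof() adds interpreter overhead.
payload = b"x" * 5000
print(len(payload), sys.getsizeof(payload))

# Map-like value: getsizeof() measures only the dict container, not its contents,
# so it is unrelated to any serialized size.
sample_map = {"name": "alex", "tags": ["a", "b", "c"], "counter": 42}
print(serialized_size(sample_map), sys.getsizeof(sample_map))
```

For raw binary values, len() of the encoded bytes is the closer match to
what curl reports; for Maps, neither number above says how large the
object is inside Riak.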

Would it be possible for the Python get() function to return something
like "obj.content-length", giving the size the object currently takes?
That would be a pretty nice feature.
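
Until something like that exists, the curl -I check can be reproduced from
Python with a plain HTTP HEAD request. This is only a sketch using the
standard library, not part of the Riak Python client; the function names
are hypothetical, and the URL in the comment reuses the placeholder host
and path from Ciprian's example.

```python
import urllib.request


def parse_content_length(headers):
    """Return the Content-Length header as an int, or None if absent or malformed."""
    raw = headers.get("Content-Length")
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


def head_content_length(url, timeout=5.0):
    """Send an HTTP HEAD request (like `curl -I`) and return the reported size."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers)


# Example call (placeholder host, same path shape as the curl command):
# head_content_length("http://<riak-node-ip>:8098/buckets/test/keys/demo")
```

This goes through the HTTP interface rather than the PB API, so it is a
monitoring aid rather than a replacement for a size attribute on the
fetched object.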


On Mon, Apr 13, 2015 at 12:47 PM, bryan hunt <bhunt at basho.com> wrote:

> Alex,
> Maps and Sets are stored just like a regular Riak object, but using a
> particular data structure and object serialization format. As you have
> observed, there is an overhead, and you want to monitor the growth of these
> data structures.
> It is possible to write a MapReduce map function (in Erlang) which
> retrieves a provided object by type/bucket/id and returns the size of its
> data. Would such a thing be of use?
> It would not be hard to write such a module, and I might even have some
> code for doing so if you are interested. There are also reasonably good
> examples in our documentation -
> http://docs.basho.com/riak/latest/dev/advanced/mapreduce
> I haven't looked at the Python PB API in a while, but I'm reasonably
> certain it supports the invocation of MapReduce jobs.
> Bryan
> On 10 Apr 2015, at 13:51, Alex De la rosa <alex.rosa.box at gmail.com> wrote:
> Also, I forgot: I'm most interested in bucket_types rather than simple Riak
> buckets, that is, in being able to know how much my mutable data inside a
> MAP/SET has grown.
> For a traditional standard bucket I can calculate the size of what I'm
> sending beforehand, so Riak won't get data bigger than 1MB. The problem
> arises with MAPS/SETS, which can grow.
> Thanks,
> Alex
> On Fri, Apr 10, 2015 at 2:47 PM, Alex De la rosa <alex.rosa.box at gmail.com>
> wrote:
>> Well... using the HTTP REST API would make no sense when using the PB
>> API. It would be extremely costly to maintain, and it may also include
>> some extra bytes on the transport.
>> I would be interested in being able to know the size via Python itself,
>> using the PB API as I'm doing.
>> Thanks anyway,
>> Alex
>> On Fri, Apr 10, 2015 at 1:58 PM, Ciprian Manea <ciprian at basho.com> wrote:
>>> Hi Alex,
>>> You can always query the size of a riak object using `curl` and the REST
>>> API:
>>> i.e. curl -I <riak-node-ip>:8098/buckets/test/keys/demo
>>> Regards,
>>> Ciprian
>>> On Thu, Apr 9, 2015 at 12:11 PM, Alex De la rosa <
>>> alex.rosa.box at gmail.com> wrote:
>>>> Hi there,
>>>> I'm using the Python client (by the way).
>>>> obj = RIAK.bucket('my_bucket').get('my_key')
>>>> Is there any way to know the actual size of an object stored in Riak?
>>>> I want to make sure something mutable (like a set) hasn't added up to
>>>> more than 1MB in storage size.
>>>> Thanks!
>>>> Alex
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com