object sizes

Alex De la rosa alex.rosa.box at gmail.com
Thu Apr 16 16:08:23 EDT 2015


Hi Matthew,

Thanks for your answer : ) I always have interesting questions : P

About point [2]... if you look at my examples, I'm already using
sys.getsizeof()... but the sizes are not very accurate. Also, I believe that
is the size they take in RAM when loaded by Python and not the exact size of
the stored object (especially for Maps, where it differs quite a lot).
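
As a rough illustration of why sys.getsizeof() and the stored size can
disagree, here is a minimal sketch. It assumes a plain dict as a stand-in for
a map's value, and it uses compact JSON only as a hypothetical proxy for a
serialized form; Riak's own map serialization is different, so the numbers are
indicative at best. deep_getsizeof and serialized_size are illustrative helper
names, not part of the Riak Python client.

```python
import json
import sys


def deep_getsizeof(obj, seen=None):
    """Sum sys.getsizeof over an object and the containers nested in it.

    This measures the in-memory footprint in the running interpreter,
    not the size of any stored or serialized representation.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Count shared objects only once.
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size


def serialized_size(obj):
    """Byte length of a compact JSON encoding (an assumed proxy only)."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


# Hypothetical stand-in for the value of a map object.
sample = {"name": "alex", "tags": ["a", "b", "c"], "counters": {"visits": 42}}

print(sys.getsizeof(sample))   # shallow: the outer dict only
print(deep_getsizeof(sample))  # in-memory size including nested values
print(serialized_size(sample)) # size of one possible serialized form
```

The shallow and deep figures depend on the Python version and platform, which
is one reason they are a poor guide to what is actually stored.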

I will open the ticket then : ) I think it could be a very helpful future
feature.

Thanks,
Alex

On Thu, Apr 16, 2015 at 10:03 PM, Matthew Brender <mbrender at basho.com>
wrote:

> Hi Alex,
>
> That is an interesting question! I haven't seen a request like that in
> our backlog, so feel free to open a new issue [1]. I'm curious: why
> not use something like sys.getsizeof [2]?
>
> [1] https://github.com/basho/riak-python-client/issues
> [2] http://stackoverflow.com/questions/449560/how-do-i-determine-the-size-of-an-object-in-python
>
> Matt Brender | Developer Advocacy Lead
> Basho Technologies
> t: @mjbrender
>
>
> On Mon, Apr 13, 2015 at 7:26 AM, Alex De la rosa
> <alex.rosa.box at gmail.com> wrote:
> > Hi Bryan,
> >
> > Thanks for your answer; I don't know how to code in Erlang, so my whole
> > system relies on Python.
> >
> > Following Ciprian's curl suggestion, I tried to compare it with this
> > Python code during the weekend:
> >
> > Map object:
> > curl -I
> >> 1058 bytes
> > print sys.getsizeof(obj.value)
> >> 3352 bytes
> >
> > Standard object:
> > curl -I
> >> 9718 bytes
> > print sys.getsizeof(obj.encoded_data)
> >> 9755 bytes
> >
> > The standard object's size looks consistent across both approaches, even
> > though the image binary data was only about 5 KB (I assume some overhead
> > here).
> >
> > For the map object, the size reported via Python is about 3x the size
> > reported by curl.
> >
> > I'm not sure this is a realistic way to measure their growth (especially
> > because the objects I need to monitor are Maps, not unaltered binary data
> > whose size I can know before storing it).
> >
> > Would it be possible for the Python get() function to return something
> > like "obj.content-length" with the size the object currently takes?
> > That would be a pretty nice feature.
> >
> > Thanks!
> > Alex
> >
> > On Mon, Apr 13, 2015 at 12:47 PM, bryan hunt <bhunt at basho.com> wrote:
> >>
> >> Alex,
> >>
> >>
> >> Maps and Sets are stored just like a regular Riak object, but using a
> >> particular data structure and object serialization format. As you have
> >> observed, there is an overhead, and you want to monitor the growth of
> >> these data structures.
> >>
> >> It is possible to write a MapReduce map function (in Erlang) which
> >> retrieves a provided object by type/bucket/id and returns the size of
> >> its data. Would such a thing be of use?
> >>
> >> It would not be hard to write such a module, and I might even have some
> >> code for doing so if you are interested. There are also reasonably good
> >> examples in our documentation -
> >> http://docs.basho.com/riak/latest/dev/advanced/mapreduce
> >>
> >> I haven't looked at the Python PB API in a while, but I'm reasonably
> >> certain it supports the invocation of MapReduce jobs.
> >>
> >> Bryan
> >>
> >>
> >> On 10 Apr 2015, at 13:51, Alex De la rosa <alex.rosa.box at gmail.com> wrote:
> >>
> >> Also, I forgot: I'm most interested in bucket_types rather than simple
> >> Riak buckets, so I can tell how much my mutable data inside a MAP/SET
> >> has grown.
> >>
> >> For a traditional standard bucket I can calculate the size of what I'm
> >> sending beforehand, so Riak won't get data bigger than 1MB. The problem
> >> arises with MAPS/SETS, which can grow.
> >>
> >> Thanks,
> >> Alex
> >>
> >> On Fri, Apr 10, 2015 at 2:47 PM, Alex De la rosa <alex.rosa.box at gmail.com>
> >> wrote:
> >>>
> >>> Well... using the HTTP REST API would make no sense when I'm already
> >>> using the PB API... it would be extremely costly to maintain, and it
> >>> may also include some extra bytes from the transport.
> >>>
> >>> I would be interested in being able to know the size via Python itself
> >>> using the PB API as I'm doing.
> >>>
> >>> Thanks anyway,
> >>> Alex
> >>>
> >>> On Fri, Apr 10, 2015 at 1:58 PM, Ciprian Manea <ciprian at basho.com> wrote:
> >>>>
> >>>> Hi Alex,
> >>>>
> >>>> You can always query the size of a riak object using `curl` and the
> >>>> REST API:
> >>>>
> >>>> i.e. curl -I <riak-node-ip>:8098/buckets/test/keys/demo
> >>>>
> >>>>
> >>>> Regards,
> >>>> Ciprian
> >>>>
> >>>> On Thu, Apr 9, 2015 at 12:11 PM, Alex De la rosa
> >>>> <alex.rosa.box at gmail.com> wrote:
> >>>>>
> >>>>> Hi there,
> >>>>>
> >>>>> I'm using the python client (by the way).
> >>>>>
> >>>>> obj = RIAK.bucket('my_bucket').get('my_key')
> >>>>>
> >>>>> Is there any way to know the actual size of an object stored in Riak,
> >>>>> to make sure something mutable (like a set) didn't add up to more
> >>>>> than 1MB in storage size?
> >>>>>
> >>>>> Thanks!
> >>>>> Alex
> >>>>>
> >>>>> _______________________________________________
> >>>>> riak-users mailing list
> >>>>> riak-users at lists.basho.com
> >>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >>>>>
> >>>>
> >>>
> >>
> >>
> >>
> >
> >
> >
>>