Lots of sparse columns. Efficient like Cassandra? Some measures of my dataset
jeremiah.peschka at gmail.com
Wed Jul 17 08:25:06 EDT 2013
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop
On Jul 17, 2013, at 4:38 AM, gbrits <gbrits at gmail.com> wrote:
> Somewhere (can't find it now) I've read that Riak, like Cassandra, could be
> classified as a column store.
That is incorrect. Riak is a key-value database where the value is an opaque blob.
> This is just a name of course but what I understand from Cassandra is that
> this allows for space-efficient encoding of column values. Basically storage
> is organized around columns instead of rows, allowing for different
> persistence strategies on a per-column, or column-family, basis. Moreover,
> it would allow for zero storage overhead for non-existent column values.
> I.e: basically allowing for efficient storage of sparse data-sets.
> Does Riak have this property as well?
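No. Riak will happily store whatever you throw at it and does not interpret the value. That being said, most good serialization libraries will omit properties whose value is null, so columns that are absent for a key need not take up space in the stored value.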
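As a rough illustration of leaving absent columns out at serialization time, here is a minimal Python sketch. It assumes the value is stored as JSON; the function names and the `col0007`-style column names are hypothetical, and nothing here is a Riak API. The resulting bytes would simply be handed to whatever client you use as the object's value.

```python
import json

def serialize_sparse(record):
    """Encode only the columns that actually have a value (hypothetical helper)."""
    present = {col: val for col, val in record.items() if val is not None}
    return json.dumps(present, sort_keys=True).encode("utf-8")

def deserialize_sparse(blob, all_columns):
    """Rebuild the full record, filling absent columns with None."""
    present = json.loads(blob.decode("utf-8"))
    return {col: present.get(col) for col in all_columns}

# Assumed example data: 1,000 possible columns, only two of them populated.
columns = ["col%04d" % i for i in range(1000)]
record = dict.fromkeys(columns)      # every column starts out as None
record["col0007"] = "a" * 8
record["col0420"] = "b" * 8

blob = serialize_sparse(record)      # bytes to store as the opaque value
restored = deserialize_sparse(blob, columns)
```

The stored blob only contains the two populated columns, so the sparseness is handled by the application rather than by the database.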
> More specifically, I've got a data structure on paper with the following
> properties, when mapped to Riak nomenclature:
> - ~ 1.000.000 keys (will not grow)
> - ~ 1.000 columns. (may grow)
> - 1 particular key has a median of ~50 columns. In other words the entire
> set is ~ 95% sparse.
> - Wherever a key has a value for a particular column, that value is always
> exactly a String (base 255) of 4KB length.
> - the 4KB values themselves are pretty 'sparse' so would benefit a lot from
> run-length encoding. Is this supported out of the box?
> Given these properties how would Riak hold up? Hard to say of course, but
> I'm looking for some general advice.
Riak objects should be no more than ~10MB for performance reasons. At a median of ~50 columns of 4KB each, a typical object comes to roughly 200KB, so you should be safe.
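Since the value is an opaque blob, any run-length encoding would have to happen in the application before the value is written. The following is a minimal Python sketch of that idea, assuming each 4KB value is handled as raw bytes and encoded as (count, byte) pairs; the function names and the sample value are hypothetical. It also works out the size estimate for a typical key from the figures in the question.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (run_length, byte) pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode by expanding each (run_length, byte) pair."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)

# Assumed sample: a 4096-byte value that is mostly zero bytes.
value = bytes(4000) + b"\x01\x02\x03" + bytes(93)
packed = rle_encode(value)

# Uncompressed size of a typical key: ~50 columns of 4KB each.
typical_object_bytes = 50 * 4096
```

Whether this beats a general-purpose compressor depends on the real data, so it is worth measuring both on a sample before committing to either.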