Riak on SAN

John E. Vincent lusis.org+riak-users at gmail.com
Wed Oct 2 22:24:07 EDT 2013

Man I go away for a few hours for family time and off things go ;)

So this led to some interesting convos on Twitter and here. Others have
addressed some things. I figure it helps to explain the sad lonely world I
live in - It's called "Enterprise".

A few folks are somewhat aware that our product uses Riak under the covers.
We have a hosted SaaS version and we also allow customers to install it
entirely isolated on their own networks. The only people who do this are
traditional enterprises.

The very first question that comes up during an installation after "you
need HOW many servers?" is "How do I back this up?" Since we use LevelDB,
we have the worst of backup options - coordinated node shutdown and
tarball'ing. The thing is we can't say to them "you never need to back this
up. Just add more nodes!"
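
To make the "coordinated node shutdown and tarball'ing" concrete, here is a minimal sketch that only builds the list of commands and runs nothing. The host names, the /var/lib/riak data path, the /backup destination and the stop/tar/start ordering are all assumptions for illustration, not a description of any real tooling; `riak stop` and `riak start` are the stock node commands.

```python
# Hypothetical values: adjust to your own cluster layout.
NODES = ["riak1.example.com", "riak2.example.com", "riak3.example.com"]
DATA_DIR = "/var/lib/riak"   # assumed data directory; check your own config
BACKUP_DIR = "/backup"       # assumed archive destination on each node


def backup_plan(nodes, stamp, data_dir=DATA_DIR, backup_dir=BACKUP_DIR):
    """Return (node, command) pairs, in order, for a stop-everything backup.

    Nothing is executed here; the plan is plain data so it can be reviewed
    or handed to whatever you use to run commands on remote hosts.
    """
    plan = []
    # 1. Stop every node so no LevelDB files are being written anywhere.
    for node in nodes:
        plan.append((node, "riak stop"))
    # 2. Tar up each node's data directory while it is down.
    for node in nodes:
        archive = f"{backup_dir}/riak-{node}-{stamp}.tar.gz"
        plan.append((node, f"tar -czf {archive} -C {data_dir} ."))
    # 3. Start the nodes again (same order as the stop; an assumption).
    for node in nodes:
        plan.append((node, "riak start"))
    return plan
```

Calling `backup_plan(NODES, "20131002")` yields nine steps for three nodes, and the cluster is down for all of them, which is the part that is hard to sell. The archives also still have to be copied off the nodes before they count as a backup.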

That doesn't check the box they have. That doesn't meet the legal and
industry guidelines they have to follow. So now that they've swallowed the
"you need 5 servers just for the DB" we now hit them with "your backup
strategy involves this complicated orchestrated shutdown process and some
tarballs". When faced with that, we ran into a new issue. They started
doing vm snapshots and stupid shit like vmotioning the instances (oh yes,
they virtualize it =/).

If you aren't familiar with vmotion, it's basically vmware's bullshit that
says they can somehow defy the laws of physics. If you read the details on
exactly what vmotion does (hint - it doesn't actually take the node offline
- vmware just "buffers" the pending network requests among other things),
you can see how this can TOTALLY fuck up Riak clusters.

Anyway so this is the world we have to live in and we have to provide
something that resembles a backup they can DR from. Our normal course of
action is to tell them to contact Basho for RiakDS and go multi-site. SAN
based snapshots largely meet that need for them.

For what it's worth, this is not just a problem with Riak and there are
legitimate use cases for wanting to have a "copy" of production data for
testing new code against. The biggest problem is once you get data IN to
riak (and other stores), it's REALLY difficult to prune it outside of
expensive "walk all the things", an external index of some kind or
resorting to application-level business logic tooling.
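
As an illustration of the "external index" option, the sketch below keeps a side table of which keys were written and when, so prune candidates can be found without walking every key in the store. It uses Python's standard sqlite3 module purely as a stand-in; the table layout and the function names are hypothetical, and nothing here talks to Riak itself.

```python
import sqlite3


def open_index(path=":memory:"):
    """Create (or open) the side index of keys and their last write times."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS written ("
        " bucket TEXT NOT NULL,"
        " key TEXT NOT NULL,"
        " written_at INTEGER NOT NULL,"  # Unix timestamp in seconds
        " PRIMARY KEY (bucket, key))"
    )
    db.execute("CREATE INDEX IF NOT EXISTS by_time ON written (written_at)")
    return db


def record_write(db, bucket, key, written_at):
    """Call alongside every store; rewriting a key just refreshes its time."""
    db.execute(
        "INSERT OR REPLACE INTO written (bucket, key, written_at)"
        " VALUES (?, ?, ?)",
        (bucket, key, written_at),
    )
    db.commit()


def prune_candidates(db, cutoff):
    """Return (bucket, key) pairs last written before cutoff, oldest first."""
    rows = db.execute(
        "SELECT bucket, key FROM written WHERE written_at < ?"
        " ORDER BY written_at, bucket, key",
        (cutoff,),
    )
    return rows.fetchall()
```

The catch is the usual one: the index is only as good as the discipline around `record_write`, and it is one more datastore that needs its own backup.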

I'm not making a judgement call. Trade-offs are a thing but it's definitely
an issue. At this point I'm considering resorting to a post-commit hook
machination of some kind.

On Wed, Oct 2, 2013 at 6:02 PM, Jeremiah Peschka
<jeremiah.peschka at gmail.com> wrote:

> Responses inline.
> TL;DR - I actually agree with John, SANs make management of storage
> stupidly easy, but you pay more money for it. Make the right decision for
> your org, but make sure you can monitor and backup that decision. The SAN
> isn't a magic box. And a Drobo b1200i [2] is definitely not a SAN.
> ---
> Jeremiah Peschka - Founder, Brent Ozar Unlimited
> MCITP: SQL Server 2008, MVP
> Cloudera Certified Developer for Apache Hadoop
> On Wed, Oct 2, 2013 at 2:12 PM, John E. Vincent <
> lusis.org+riak-users at gmail.com> wrote:
>> I'm going to take a competing view here.
>> SAN is a bit overloaded of a term at this point. Nothing precludes a SAN
>> from being performant or having SSDs. Yes the cost is overkill for fiber
>> but iSCSI is much more realistic. Alternately you can even do ATAoE.
> Agreed. You can buy a glorified direct attached storage device with a few
> ethernet ports in it, but vendors will call it a SAN.
>> From a hardware perspective, if I have 5 pizza boxes as riak nodes, I can
>> only fit so many disks in them. Meanwhile I can add another shelf to my SAN
>> and expand as needed.
> We have the ability to cram 16x 960GB SSDs into the front of a Dell R720
> for about $550 per drive... no SAN vendor can beat you on price for that.
> SAN storage is an order of magnitude more expensive, but...
>> Additionally backup of a SAN is MUCH easier than backup of a riak node
>> itself. It's a snapshot and you're done. Mind you nothing precludes you
>> from doing LVM snapshots in the OS but you still need to get the data OFF
>> that system for it to be truly backed up.
> The products worthy of being called a SAN offer you fantastic features like
> application aware volume snapshots, multi-site async and synchronous block
> level synchronization, and all kinds of amazing features that mean you
> never need to think about your storage beyond "HEY THERE, MAGIC BOX, I NEED
> 500GB OF SPACE!"
>> I love riak and other distributed stores but backing them up is NOT a
>> solved problem. Walking all keys, coordinating the take down of all your
>> nodes in a given order, or whatever your strategy is, is a serious pain point.
>> Using a SAN or local disk also doesn't excuse you from watching I/O
>> performance. With a SAN I get multiple redundant paths to a block device
>> and I don't get that necessarily with local storage.
>> Just my two bits.
> For many applications, if you need storage performance outside of the main
> chassis, you could also look at an approach like the one Microsoft takes with the
> Fast Track Data Warehouse Reference Architecture [1]. For those who don't
> want to read, you line up the ability of your CPUs to process data with the
> ability of your disks to produce data. For SQL Server, you assume ~300MB/s
> of processing per core. Core count * 300MB/s = total combined disk speed.
> It's easy to use something like a Dell MD1220 or an HP MSA to get this kind
> of performance, too, without breaking the bank and upgrading to something
> like a 3PAR or EMC.
> [1]:
> http://www.microsoft.com/en-us/sqlserver/solutions-technologies/data-warehousing/reference-architecture.aspx
> [2]: http://www.droboworks.com/B1200i.asp?gclid=CPbhhL2T-bkCFeI-Mgod0hEAaA
>> On Wed, Oct 2, 2013 at 2:18 AM, Jeremiah Peschka <
>> jeremiah.peschka at gmail.com> wrote:
>>> Could you do it? Sure.
>>> Should you do it? No.
>>> An advantage of Riak is that you can avoid the cost of SAN storage by
>>> getting duplication at the machine level rather than rely on your storage
>>> vendor to provide it.
>>> Running Riak on a SAN also exposes you to the SAN becoming your
>>> bottleneck; you only have so many fiber/iSCSI ports and a fixed number of
>>> disks. The risk of storage contention is high, too, so you can run into
>>> latency issues that are difficult to diagnose without looking into both
>>> Riak as well as the storage system.
>>> Keeping cost in mind, too, SAN storage is about 10x the cost of consumer
>>> grade SSDs. Not to mention feature licensing and support... The cost
>>> comparison isn't favorable.
>>> Please note: Even though your vendor calls it a SAN, that doesn't mean
>>> it's a SAN.
>>>  On Oct 1, 2013 11:08 PM, "Guy Morton" <Guy.Morton at bksv.com> wrote:
>>>> Does this make sense?
>>>> --
>>>> Guy Morton
>>>> Web Development Manager
>>>> Brüel & Kjær EMS
>>>> This e-mail is confidential and may be read, copied and used only by
>>>> the intended recipient. If you have received it in error, please contact
>>>> the sender immediately by return e-mail. Please then delete the e-mail and
>>>> do not disclose its contents to any other person.
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com