Riak on SAN

Sean Cribbs sean at basho.com
Wed Oct 2 23:16:15 EDT 2013


To inject a Basho-colored perspective briefly (opinions/thoughts are my
own, disclaimer blah blah):

We have had customers deploy with SANs backing Riak. We recommend against
it for all the reasons discussed in this thread: lack of isolation (storage
and bus), extensive cost, etc. Nevertheless, they have been successful in
their deployments -- sometimes you just have to use what you have. As a
smaller vendor, we can only change the behavior of "Enterprise" so much in
one go. You have to pick your battles.

Also, John is not the only person who has had to deal with the "how do you
back this up" problem and related problems/requests like "export my data to
this external data warehouse" and "backup only the stuff since yesterday".
I'd hazard to say that part of the reason we haven't had good tools for the
backup issue yet has been difficulty identifying what the checkbox actually
*means* for Riak. We have engineers actively investigating these needs and
prototyping.

The checkboxes *are* important, even if they seem infeasible or ridiculous
at first, because they're really about an underlying need. Sometimes it
just doesn't come across as the need, only the checkbox. Discussions like
this thread are good, and help us identify real things we can improve upon.

#hugops


On Wed, Oct 2, 2013 at 9:24 PM, John E. Vincent <
lusis.org+riak-users at gmail.com> wrote:

> Man I go away for a few hours for family time and off things go ;)
>
> So this led to some interesting convos on twitter and here. Others have
> addressed some things. I figure it helps to explain the sad lonely world I
> live in - It's called "Enterprise".
>
> A few folks are somewhat aware that our product uses Riak under the
> covers. We have a hosted SaaS version and we also allow customers to
> install it entirely isolated on their own networks. The only people who do
> this are traditional enterprises.
>
> The very first question that comes up during an installation after "you
> need HOW many servers?" is "How do I back this up".  Since we use LevelDB,
> we have the worst of backup options - coordinated node shutdown and
> tarball'ing. The thing is we can't say to them "you never need to back this
> up. Just add more nodes!"
>
> That doesn't check the box they have. That doesn't meet the legal and
> industry guidelines they have to follow. So now that they've swallowed the
> "you need 5 servers just for the DB" we now hit them with "your backup
> strategy involves this complicated orchestrated shutdown process and some
> tarballs". When faced with that, we ran into a new issue. They started
> doing vm snapshots and stupid shit like vmotioning the instances (oh yes,
> they virtualize it =/).
>
> If you aren't familiar with vmotion, it's basically vmware's bullshit that
> says they can somehow defy the laws of physics. If you read the details on
> exactly what vmotion does (hint - it doesn't actually take the node offline
> - vmware just "buffers" the pending network requests among other things),
> you can see how this can TOTALLY fuck up Riak clusters.
>
> Anyway so this is the world we have to live in and we have to provide
> something that resembles a backup they can DR from. Our normal course of
> action is to tell them to contact Basho for RiakDS and go multi-site. SAN
> based snapshots largely meet that need for them.
>
> For what it's worth, this is not just a problem with Riak and there are
> legitimate use cases for wanting to have a "copy" of production data for
> testing new code against. The biggest problem is once you get data IN to
> riak (and other stores), it's REALLY difficult to prune it outside of
> expensive "walk all the things", an external index of some kind or
> resorting to application-level business logic tooling.
>
> I'm not making a judgement call. Trade offs are a thing but it's
> definitely an issue. At this point I'm considering resorting to a
> post-commit hook machination of some kind.
>
>
> On Wed, Oct 2, 2013 at 6:02 PM, Jeremiah Peschka <
> jeremiah.peschka at gmail.com> wrote:
>
>> Responses inline.
>>
>> TL;DR - I actually agree with John, SANs make management of storage
>> stupidly easy, but you pay more money for it. Make the right decision for
>> your org, but make sure you can monitor and back up that decision. The SAN
>> isn't a magic box. And a Drobo b1200i [2] is definitely not a SAN.
>>
>> ---
>> Jeremiah Peschka - Founder, Brent Ozar Unlimited
>> MCITP: SQL Server 2008, MVP
>> Cloudera Certified Developer for Apache Hadoop
>>
>>
>> On Wed, Oct 2, 2013 at 2:12 PM, John E. Vincent <
>> lusis.org+riak-users at gmail.com> wrote:
>>
>>> I'm going to take a competing view here.
>>>
>>> SAN is a bit of an overloaded term at this point. Nothing precludes a SAN
>>> from being performant or having SSDs. Yes the cost is overkill for fiber
>>> but iSCSI is much more realistic. Alternately you can even do ATAoE.
>>>
>>
>> Agreed. You can buy a glorified direct attached storage device with a few
>> ethernet ports in it, but vendors will call it a SAN.
>>
>>
>>>
>>> From a hardware perspective, if I have 5 pizza boxes as riak nodes, I
>>> can only fit so many disks in them. Meanwhile I can add another shelf to my
>>> SAN and expand as needed.
>>>
>>
>> We have the ability to cram 16x 960GB SSDs into the front of a Dell R720
>> for about $550 per drive... no SAN vendor can beat you on price for that.
>> SAN storage is an order of magnitude more expensive, but...
>>
>>
>>> Additionally backup of a SAN is MUCH easier than backup of a riak node
>>> itself. It's a snapshot and you're done. Mind you nothing precludes you
>>> from doing LVM snapshots in the OS but you still need to get the data OFF
>>> that system for it to be truly backed up.
>>>
>>
>> The products worthy of being called a SAN offer you fantastic features
>> like application aware volume snapshots, multi-site async and synchronous
>> block level synchronization, and all kinds of amazing features that mean
>> you never need to think about your storage beyond "HEY THERE, MAGIC BOX, I
>> NEED 500GB OF SPACE!"
>>
>>
>>>
>>> I love riak and other distributed stores but backing them up is NOT a
>>> solved problem. Walking all keys, coordinating the takedown of all your
>>> nodes in a given order, or whatever your strategy is, is a serious pain point.
>>>
>>> Using a SAN or local disk also doesn't excuse you from watching I/O
>>> performance. With a SAN I get multiple redundant paths to a block device
>>> and I don't get that necessarily with local storage.
>>>
>>> Just my two bits.
>>>
>>
>> For many applications, if you need storage performance outside of the
>> main chassis, you could also look at an approach like Microsoft takes with
>> the Fast Track Data Warehouse Reference Architecture [1]. For those who
>> don't want to read, you line up the ability of your CPUs to process data
>> with the ability of your disks to produce data. For SQL Server, you assume
>> ~300MB/s of processing per core. Core count * 300MB/s = total combined disk
>> speed. It's easy to use something like a Dell MD1220 or an HP MSA to get
>> this kind of performance, too, without breaking the bank and upgrading to
>> something like a 3PAR or EMC.
>>
>>
>> [1]:
>> http://www.microsoft.com/en-us/sqlserver/solutions-technologies/data-warehousing/reference-architecture.aspx
>> [2]:
>> http://www.droboworks.com/B1200i.asp?gclid=CPbhhL2T-bkCFeI-Mgod0hEAaA
>>
>>
>>>
>>>
>>>
>>> On Wed, Oct 2, 2013 at 2:18 AM, Jeremiah Peschka <
>>> jeremiah.peschka at gmail.com> wrote:
>>>
>>>> Could you do it? Sure.
>>>>
>>>> Should you do it? No.
>>>>
>>>> An advantage of Riak is that you can avoid the cost of SAN storage by
>>>> getting duplication at the machine level rather than rely on your storage
>>>> vendor to provide it.
>>>>
>>>> Running Riak on a SAN also exposes you to the SAN becoming your
>>>> bottleneck; you only have so many fiber/iSCSI ports and a fixed number of
>>>> disks. The risk of storage contention is high, too, so you can run into
>>>> latency issues that are difficult to diagnose without looking into both
>>>> Riak as well as the storage system.
>>>>
>>>> Keeping cost in mind, too, SAN storage is about 10x the cost of
>>>> consumer grade SSDs. Not to mention feature licensing and support... The
>>>> cost comparison isn't favorable.
>>>>
>>>> Please note: Even though your vendor calls it a SAN, that doesn't mean
>>>> it's a SAN.
>>>>
>>>> On Oct 1, 2013 11:08 PM, "Guy Morton" <Guy.Morton at bksv.com> wrote:
>>>>
>>>>> Does this make sense?
>>>>>
>>>>> --
>>>>> Guy Morton
>>>>> Web Development Manager
>>>>> Brüel & Kjær EMS
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> riak-users at lists.basho.com
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>
>>>>
>>>
>>
>


-- 
Sean Cribbs <sean at basho.com>
Software Engineer
Basho Technologies, Inc.
http://basho.com/