Advice on making a Riak middleware easy to configure

Matthew Brender mbrender at basho.com
Mon Mar 16 11:40:28 EDT 2015


Marc,
Our side conversation was a helpful start, but I'm sure there's more to
discuss here. I believe you want to know which configuration assumptions
about Riak can be made before connecting it to apiman, versus which
functionality apiman has to support itself.

If you're looking to start simple, I'd think a good first move is to assume
that you are interfacing with a load balancer in front of a Riak cluster.

Perhaps other members can help by describing what a "typical" configuration
looks like.

Best,
Matt

----


Hi Matt,

It would seem that I fell on the wrong side of the divide between
providing enough context and making my query clear!

In short, apiman consists of two main parts: runtime (gateway) and
design-time (manager). The manager pushes configuration to the gateway
which then enforces policies on transiting traffic - that might be
something like rate-limiting, authentication, or metrics collection on
HTTP traffic.

The gateway is horizontally scalable, so it requires a data store to
facilitate shared state functionality (e.g. rate-limit applies across
the whole cluster of gateways). We have pluggability at the data-store
level - so, by implementing a few interfaces we can use a given
data-store in an abstracted manner (i.e. zero knowledge about the
underlying specifics when a policy uses a shared state component).
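
To illustrate the kind of abstraction described above, here is a minimal
sketch in Python, chosen only for brevity. The names SharedCounterStore,
InMemoryCounterStore and RateLimitPolicy are hypothetical and are not
apiman's actual interfaces; the in-memory class is a stand-in for where a
Riak-backed implementation would go.

```python
import threading
from abc import ABC, abstractmethod


class SharedCounterStore(ABC):
    """Hypothetical interface that each data-store plug-in would implement."""

    @abstractmethod
    def increment(self, key: str, amount: int = 1) -> int:
        """Add `amount` to the counter at `key` and return the new value."""


class InMemoryCounterStore(SharedCounterStore):
    """Illustrative stand-in; a Riak-backed class would take its place."""

    def __init__(self) -> None:
        self._counts = {}
        self._lock = threading.Lock()

    def increment(self, key: str, amount: int = 1) -> int:
        # The lock makes the read-modify-write atomic within one process.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + amount
            return self._counts[key]


class RateLimitPolicy:
    """Policy code sees only the interface, never the backing store."""

    def __init__(self, store: SharedCounterStore, limit: int) -> None:
        self._store = store
        self._limit = limit

    def allow(self, client_id: str) -> bool:
        # Every call counts as one request against the client's budget.
        return self._store.increment(f"rate:{client_id}") <= self._limit
```

A policy built as RateLimitPolicy(InMemoryCounterStore(), limit=2) would
accept two calls to allow() for a given client and reject the third, without
knowing which store is behind the interface.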

Users can choose whichever data-store suits their needs; simply select
it and provide relevant configuration information to the plug-in (via a
config file, or whatever).

The issue I have is simply: what configuration options should we provide
for our plug-in so it can connect to typical Riak set-ups (given I
have zero knowledge of their set-up and of Riak users' conventions)?

For instance in the config file, do I:

- Accept a list of Riak nodes and try to *construct* a cluster for them;
or is it safe to assume they've done this in advance?

- Try to define buckets & associated data-types, or should I assume this
is done in advance?

- Just assume everyone uses Riak behind a load-balancer, and I just need
to accept a single URI?

Some of these scenarios run into idempotence issues, so it may be that
it's unsafe or poor for performance to allow those.
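
To make the options above concrete, here is a rough sketch of how such a
config might be parsed. Every key name (riak.uri, riak.nodes,
riak.create-buckets, riak.create-data-types) and the RiakPluginConfig type
are assumptions made up for illustration; none of them come from apiman or
Riak. The sketch only validates and normalises settings, it does not talk
to Riak, and it defaults to performing no set-up on the user's behalf.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple
from urllib.parse import urlsplit


@dataclass(frozen=True)
class RiakPluginConfig:
    """Hypothetical option set for a Riak-backed gateway component."""
    endpoints: Tuple[str, ...]
    create_buckets: bool = False      # assume buckets exist unless told otherwise
    create_data_types: bool = False   # same assumption for data types


def _flag(raw: Mapping[str, str], key: str) -> bool:
    # Only the literal string "true" (any case) enables a flag.
    return raw.get(key, "false").strip().lower() == "true"


def parse_config(raw: Mapping[str, str]) -> RiakPluginConfig:
    if "riak.uri" in raw and "riak.nodes" in raw:
        raise ValueError("set either riak.uri or riak.nodes, not both")

    if "riak.uri" in raw:
        # Single URI: e.g. a load balancer in front of the cluster.
        endpoints = [raw["riak.uri"].strip()]
    else:
        # Comma-separated list of nodes that are already clustered.
        nodes = raw.get("riak.nodes", "")
        endpoints = [n.strip() for n in nodes.split(",") if n.strip()]

    if not endpoints:
        raise ValueError("no Riak endpoint configured")
    for endpoint in endpoints:
        parts = urlsplit(endpoint)
        if not parts.scheme or not parts.hostname:
            raise ValueError(f"not a full URI: {endpoint!r}")

    return RiakPluginConfig(
        endpoints=tuple(endpoints),
        create_buckets=_flag(raw, "riak.create-buckets"),
        create_data_types=_flag(raw, "riak.create-data-types"),
    )
```

Keeping the two create flags off by default means that several gateways
starting at once would not race to perform the same set-up unless the user
explicitly opts in.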

I'm happy to support multiple configurations; I'm just not sure which
ones are typical, given there are a large number of possible permutations.

I hope I've been a bit clearer this time; please let me know if I haven't!

Appreciate your assistance.

Regards,
Marc

*Matt Brender | Developer Advocacy Lead*
Basho Technologies
t: @mjbrender <https://twitter.com/mjbrender>
c: +1 617.817.3195

On Sat, Mar 7, 2015 at 10:35 AM, Marc Savy <msavy at redhat.com> wrote:

> Hi All,
>
> I'm involved in a FOSS API management project (apiman), and I've been
> thinking about providing a Riak implementation of its gateway components
> in the community (where we already have ElasticSearch and Infinispan).
> These components provide the distributed storage for tasks like
> rate-limiting counters, IP white-listing, black-listing, etc and are
> applied by a horizontally scalable, async gateway (to vastly
> oversimplify!).
>
> I'm in need of advice principally in regards to configuration and
> set-up. Namely, what assumptions can I safely make about a Riak user's
> set-up, and which settings I should expose in the component's
> configuration. Note that many gateways can exist, and hence any set-up
> ideally needs to already be done in advance, or be idempotent in case multiple
> nodes attempt to do it at once (or otherwise for it to be
> lockable/exclusionary).
>
> To be more concrete, should I, for example, expect the user to have
> already set up and joined together their Riak cluster a priori, with
> everything behind a load-balancer (just give me a single URI to connect
> to)? Or, should I expect a list of FQDNs/IPs and attempt to join them
> together into a cluster on the user's behalf - or will there be
> idempotence issues if I do that multiple times?
>
> As far as I can tell, there is no node discovery/sharing
> implementation[1], so I take it there's no way, for instance, to hit a
> single node (which has already been joined with other nodes), and
> thereby automatically gain knowledge of all cluster members?
>
> A couple of other configuration issues: Given the introduction of Riak
> Data Types on buckets, whom should I expect to set up the data types[2]?
> Should I create them automatically if they don't exist? Same for the
> bucket itself.
>
> I'm very interested to know how to present a convenient set of options that
> will allow a typical development and deployment environment to be
> supported.
>
> Regards,
> Marc
>
> [0] With the usual consistency limitations
> [1] https://github.com/basho/riak/issues/356
> [2] http://docs.basho.com/riak/latest/dev/using/data-types/#Setting-Up-Buckets-to-Use-Riak-Data-Types
>