Advice on making a Riak middleware easy to configure

Marc Savy marc.savy at redhat.com
Thu Mar 19 08:37:10 EDT 2015


Hi Dmitri,

Many thanks for your impressive response, it's very helpful indeed!

> In order to answer the setup questions specifically, we'd need to
> know more about what the project is intending to do. Will users be
> typically installing their own Riak clusters and then setting up
> apiman to help manage APIs?

This is one of the areas I'm not sure about. Ideally I'd like the user
to bring their own Riak cluster and manage it themselves, but I wasn't
sure if there was a large user base that might *expect* us to set
everything up for them.

In the ideal world: users come along and say "I want to use Riak for the
distributed components rather than Infinispan or ElasticSearch, and
here's an end-point(s) to connect to".

If I can make that assumption for a dev/test environment too, that's great!

> Or is this more of a multi-tenant kind of situation, where apiman
> would be spinning up nodes or clusters for users? To put it another
> way, how does apiman handle ElasticSearch?

We would leave the spinning up of additional nodes to some other element
of the system (e.g. Kubernetes). It's outside of our domain, really.

> Ah, ok, if I understand your question correctly -- if you're not
> spinning up VMs or setting up the nodes yourself via ssh (using
> something like our Ansible playbook), then you can expect an already
> set up cluster. (FWIW, the various configuration management tools
> such as Ansible that install Riak clusters do provide idempotency). I
> can't really picture a situation where users would set up nodes but
> not join them and leave that up to apiman.

You understood correctly. Thanks!

Essentially, I can use a simple RiakClient with a single address (or
list of addresses) and I don't need to worry about a more complex
RiakCluster set-up routine as below; correct?

addresses = <list of addrs>
... nodes = RiakNode.Builder.buildNodes(builder, addresses);
... cluster = new RiakCluster.Builder(nodes).build();
cluster.start();
RiakClient client = new RiakClient(cluster);
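
Whichever client set-up is used, the "single address (or list of addresses)" has to come from the component's configuration. A minimal sketch of how that value might be parsed, using only the JDK. The class name RiakEndpoints, the comma-separated format and the example host names are assumptions made for illustration, not part of apiman or the Riak client; 8087 is Riak's default protocol buffers port, and IPv6 literals are not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns a config value such as "riak1:8087, riak2" into socket addresses. */
final class RiakEndpoints {
    // Riak's default protocol buffers port; an assumption if your cluster listens elsewhere.
    static final int DEFAULT_PB_PORT = 8087;

    static List<InetSocketAddress> parse(String configValue) {
        List<InetSocketAddress> endpoints = new ArrayList<>();
        for (String entry : configValue.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray or trailing commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_PB_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while the configuration is being read
            endpoints.add(InetSocketAddress.createUnresolved(host, port));
        }
        if (endpoints.isEmpty()) {
            throw new IllegalArgumentException("at least one Riak endpoint is required");
        }
        return endpoints;
    }

    public static void main(String[] args) {
        // A single load-balancer address and a list of nodes go through the same code path.
        System.out.println(parse("riak-lb.example.org"));
        System.out.println(parse("10.0.0.1:8087, 10.0.0.2:8087"));
    }
}
```

With this shape, a user who fronts the cluster with a load balancer supplies one entry, and a dev/test set-up without one supplies several; the gateway code does not need to know which case it is in.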

> A load balancer is crucial (we recommend either a hardware based one,
> or something like HAProxy or Nginx). I know some users connect to a
> Riak cluster using the round-robin load balancing built into a Riak
> client, but that should be a last resort measure (if, for example,
> you're not allowed to spin up another machine for HAProxy). A
> dedicated load balancer (with a least-connection load balancing
> algorithm) is significantly faster. (Not to mention, provides logging
> and a rich ecosystem of tools and dashboards).
>
>> Given the introduction of Riak Data Types on buckets, whom should I
>> expect to set up the data types?
>
> There isn't currently an API to create bucket types remotely. So
> unless apiman has daemons that will be running on the individual Riak
> nodes and can make commandline calls, you will have to leave bucket
> type creation to the users.

This is all excellent information, and exactly what I wanted to know.

> For example, Strongly Consistent buckets are useful for atomic
> operations like user password management, security group management
> and so on. So, you could require that users would create a bucket
> type named 'sc' and enable Strong Consistency on it. (Any buckets
> under that bucket type would then also be strongly consistent, and
> usable by apiman or by the users' client code).
>
> Similarly, given that metering is a goal, you would also need
> bucket types for the various server-side Data Types. That is,
> require users to create a Maps bucket type named 'maps', a Counters
> type named 'counters', and a Sets type named 'sets', for example.
>
> Other things to keep on your radar, as far as bucket types:
>
> * You can attach a Solr Search index to a bucket type. However,
> given that you can only associate a single search index with a
> bucket type, this isn't as generic/reusable as Data Types. I could
> see setting up a Search index for something like API logging,
> though.
>
> * You probably want provisions for Riak Authentication &
> Authorization (http://docs.basho.com/riak/2.0.4/ops/running/authz/).
> (Specifically, for supporting user-created users & passwords,
> since at the moment we don't have a remote API to manage these).

I could provide a script to set everything up as an example, and also
document the process in the community. That being said, it would be nice
to be able to do this kind of preparatory and meta-data set-up using a
simple schema (e.g. json-schema). For instance, things like setting up
buckets, data types, name-spaces, etc.
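
As a rough illustration of what such an example script could look like, the sketch below generates the riak-admin commands for the bucket types suggested in this thread ('sc', 'maps', 'counters', 'sets') from a small declarative map. The class and method names are hypothetical. The commands are only printed, since they have to be run on a Riak node by whoever administers the cluster, and strong consistency also has to be enabled in the node's own configuration before the 'sc' type is usable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical generator: prints the riak-admin commands a cluster admin would run. */
final class BucketTypeScript {

    /** Bucket type name mapped to the body of its "props" object. */
    static Map<String, String> suggestedTypes() {
        Map<String, String> types = new LinkedHashMap<>(); // keeps declaration order
        types.put("sc", "\"consistent\":true");
        types.put("maps", "\"datatype\":\"map\"");
        types.put("counters", "\"datatype\":\"counter\"");
        types.put("sets", "\"datatype\":\"set\"");
        return types;
    }

    /** Emits a create command followed by an activate command for every type. */
    static List<String> commands(Map<String, String> typeToProps) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : typeToProps.entrySet()) {
            lines.add("riak-admin bucket-type create " + e.getKey()
                    + " '{\"props\":{" + e.getValue() + "}}'");
            lines.add("riak-admin bucket-type activate " + e.getKey());
        }
        return lines;
    }

    public static void main(String[] args) {
        commands(suggestedTypes()).forEach(System.out::println);
    }
}
```

Keeping the type definitions in one map means the same data could later be loaded from a JSON file instead, which is closer to the schema-driven set-up described above.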

This is invaluable information; thank you very much. We'll definitely
consider Riak implementations for those elements, too.

> In terms of options, do you mean like best-practice/recommended riak
>  config files that you'd point your users to?

I was thinking more of what config they would expect to be available in
Apiman's config to facilitate using their Riak cluster with our
components. I think you've answered this point already.

Regards,
Marc

On 17/03/2015 13:14, Dmitri Zagidulin wrote:
> Hi Marc,
>
> This sounds like a very cool project! I'd be very interested in hearing
> more about this, and answering any data modeling or setup questions.
>
> In order to answer the setup questions specifically, we'd need to know
> more about what the project is intending to do. Will users be typically
> installing their own Riak clusters and then setting up apiman to help
> manage APIs? Or is this more of a multi-tenant kind of situation, where
> apiman would be spinning up nodes or clusters for users? To put it
> another way, how does apiman handle ElasticSearch?
>
> Couple of thoughts, from your questions.
>
>  > To be more concrete, should I, for example, expect the user to have
>  > already set up and joined together their Riak cluster a priori (with
>  > everything behind a load-balancer: just give me a single URI to connect
>  > to)? [Or attempt to join them into a cluster].
>
> Ah, ok, if I understand your question correctly -- if you're not
> spinning up VMs or setting up the nodes yourself via ssh (using
> something like our Ansible playbook), then you can expect an already set
> up cluster. (FWIW, the various configuration management tools such as
> Ansible that install Riak clusters do provide idempotency). I can't
> really picture a situation where users would set up nodes but not join
> them and leave that up to apiman.
>
>  > As far as I can tell, there is no node discovery/sharing
>  > implementation
>
> If you know the IP of one node, you can definitely discover the other
> nodes via an HTTP call to /stats
> http://docs.basho.com/riak/latest/ops/running/nodes/inspecting/ (via
> 'ring_members'). But, unless apiman provides some sort of monitoring or
> keepalive-checking capability, I don't think there's any reason to do that.
>
> A load balancer is crucial (we recommend either a hardware based one, or
> something like HAProxy or Nginx). I know some users connect to a Riak
> cluster using the round-robin load balancing built into a Riak client,
> but that should be a last resort measure (if, for example, you're not
> allowed to spin up another machine for HAProxy). A dedicated load
> balancer (with a least-connection load balancing algorithm) is
> significantly faster. (Not to mention, provides logging and a rich
> ecosystem of tools and dashboards).
>
>  > Given the introduction of Riak
>  > Data Types on buckets, whom should I expect to set up the data types?
>
> There isn't currently an API to create bucket types remotely. So unless
> apiman has daemons that will be running on the individual Riak nodes and
> can make commandline calls, you will have to leave bucket type creation
> to the users.
>
> That said, I could easily see you requiring a certain set of bucket
> types of your users.
>
> For example, Strongly Consistent buckets are useful for atomic
> operations like user password management, security group management and
> so on. So, you could require that users would create a bucket type named
> 'sc' and enable Strong Consistency on it. (Any buckets under that bucket
> type would then also be strongly consistent, and usable by apiman or by
> the users' client code).
>
> Similarly, given that metering is a goal, you would also need bucket
> types for the various server-side Data Types. That is, require users to
> create a Maps bucket type named 'maps', a Counters type named
> 'counters', and a Sets type named 'sets', for example.
>
> Other things to keep on your radar, as far as bucket types:
>
> * You can attach a Solr Search index to a bucket type. However, given
> that you can only associate a single search index with a bucket type,
> this isn't as generic/reusable as Data Types. I could see setting up a
> Search index for something like API logging, though.
>
> * You probably want provisions for Riak Authentication & Authorization
> (http://docs.basho.com/riak/2.0.4/ops/running/authz/ ). (Specifically,
> for supporting user-created users & passwords, since at the moment we
> don't have a remote API to manage these).
>
>  > I'm very interested to know how to present a convenient set of options
>  > that will allow a typical development and deployment environment to be
>  > supported.
>
> In terms of options, do you mean like best-practice/recommended riak
> config files that you'd point your users to?
>
> Let me know if you have further questions.
>
> Dmitri
>
>
>
>
> On Sat, Mar 7, 2015 at 10:35 AM, Marc Savy <msavy at redhat.com> wrote:
>
>     Hi All,
>
>     I'm involved in a FOSS API management project (apiman), and I've been
>     thinking about providing a Riak implementation of its gateway components
>     in the community (where we already have ElasticSearch and Infinispan).
>     These components provide the distributed storage for tasks like
>     rate-limiting counters, IP white-listing, black-listing, etc., and are
>     applied by a horizontally scalable, async gateway (to vastly
>     oversimplify!).
>
>     I'm in need of advice principally in regards to configuration and
>     set-up. Namely, what assumptions can I safely make about a Riak user's
>     set-up, and which settings I should expose in the component's
>     configuration. Note that many gateways can exist, and hence any set-up
>     ideally needs to be done in advance, or be idempotent in case multiple
>     nodes attempt to do it at once (or otherwise for it to be
>     lockable/exclusionary).
>
>     To be more concrete, should I, for example, expect the user to have
>     already set up and joined together their Riak cluster a priori (with
>     everything behind a load-balancer: just give me a single URI to connect
>     to)? Or should I expect a list of FQDNs/IPs and attempt to join them
>     together into a cluster on the user's behalf - or will there be
>     idempotence issues if I do that multiple times?
>
>     As far as I can tell, there is no node discovery/sharing
>     implementation[1], so I take it there's no way, for instance, to hit a
>     single node (which has already been joined with other nodes), and
>     thereby automatically gain knowledge of all cluster members?
>
>     A couple of other configuration issues: Given the introduction of Riak
>     Data Types on buckets, whom should I expect to set up the data types[2]?
>     Should I create them automatically if they don't exist? Same for the
>     bucket itself.
>
>     I'm very interested to know how to present a convenient set of options that
>     will allow a typical development and deployment environment to be
>     supported.
>
>     Regards,
>     Marc
>
>     [0] With the usual consistency limitations
>     [1] https://github.com/basho/riak/issues/356
>     [2] http://docs.basho.com/riak/latest/dev/using/data-types/#Setting-Up-Buckets-to-Use-Riak-Data-Types
>
>
>




