Fwd: RiakCS 504 Timeout on s3cmd for certain keys

Kota Uenishi kota at basho.com
Mon Aug 18 22:03:39 EDT 2014


Alex,

Riak CS 1.4.5 and 1.5.0 include a lot of improvements over what is described
in the articles you linked [1]: listing no longer goes through Riak's bucket
listing but instead uses an internal Riak API that lists objects much more
efficiently. Which version of Riak CS are you using? Please make sure you are
running one of those versions and that the riak_cs section of app.config
contains the line `{fold_objects_for_list_keys, true},` (assuming all the
other Riak settings are configured correctly).
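
For reference, here is a minimal sketch of what that looks like in app.config
(only the listing flag is shown; keep your existing riak_cs settings around
it):

```
%% app.config -- sketch of the riak_cs section; all other settings omitted
{riak_cs, [
    %% switch bucket listing to the more efficient fold-based code path
    {fold_objects_for_list_keys, true}
]}
```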

> Based on this I’m thinking that the cost of this type of query is only
> going to get worse over time as we add more keys to this bucket (unless
> secondary indexes can be added). Or am I totally out to lunch here and
> there’s some other underlying problem?

The strange part is s3cmd. Riak CS has an incremental bucket-listing API that
requires clients to iterate in pages of 1000 objects (or common prefixes), but
s3cmd walks the whole specified bucket before printing anything. You can
observe how s3cmd and Riak CS interact if you pass the '-d' option, like this:

```
s3cmd -d -c yours.s3cfg ls -r s3://yourbucket/yourdir/
```

I would not expect Riak CS's listing API to be so slow as to need 5 seconds
(or, say, more than 10 seconds) per request, because each request returns only
1000 objects.
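
To be concrete, the iteration looks roughly like this at the HTTP level (the
host name is a placeholder; `marker` and `max-keys` are the standard S3
list-objects parameters the client sends on each page):

```
# First page: at most 1000 keys / common prefixes come back
GET /?delimiter=/&max-keys=1000 HTTP/1.1
Host: yourbucket.your-riak-cs-endpoint

# The response carries IsTruncated=true, so the client asks for the next page,
# starting after the last key it has already received
GET /?delimiter=/&max-keys=1000&marker=<last key of previous page> HTTP/1.1
Host: yourbucket.your-riak-cs-endpoint
```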

There might be another cause of the slow query: if you have many (say, more
than ten thousand) deleted objects in the same bucket, they can slow down each
1000-object page of the listing. This will eventually resolve itself as Riak
CS's garbage collection removes the deleted manifests, which for now are only
marked as deleted (and have to be skipped correctly during listing).
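
If you suspect this is what is happening, and your Riak CS version ships the
riak-cs-gc admin script, you can kick off a collection run by hand instead of
waiting for the scheduled interval (a sketch; check the options available in
your release):

```
# run on a Riak CS node: start a manual garbage-collection batch
riak-cs-gc batch

# check whether the batch is still running
riak-cs-gc status
```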

[1]
http://www.quora.com/Riak/Is-it-really-expensive-for-Riak-to-list-all-buckets-Why

On Thu, Aug 14, 2014 at 6:05 AM, Alex Millar <alex at gobonfire.com> wrote:

> Good afternoon Charlie,
>
> So the issue we’re having is only with bucket listing.
>
> alxndrmlr at alxndrmlr-mbp $ time s3cmd -c .s3cfg-riakcs-admin ls
> s3://bonfirehub-resources-can-east-doc-conversion
>                        DIR
> s3://bonfirehub-resources-can-east-doc-conversion/organizations/
>
> real 2m0.747s
> user 0m0.076s
> sys 0m0.030s
>
> whereas…
>
> alxndrmlr at alxndrmlr-mbp $ time s3cmd -c .s3cfg-riakcs-admin ls
> s3://bonfirehub-resources-can-east-doc-conversion/organizations/OrganizationID-1/documents/proposals
>                        DIR
> s3://bonfirehub-resources-can-east-doc-conversion/organizations/OrganizationID-1/documents/proposals/
>
> real 0m10.262s
> user 0m0.075s
> sys 0m0.028s
>
> This bucket contains a lot of very small files (basically, for each PDF we
> receive I split it into a .JPG for each page and store them here). Based on
> my latest counts it looks like we have around *170,000* .JPG files in that
> bucket.
>
> Here’s a snippet from the HAProxy log for the 504 timeouts…
>
> Aug 12 16:01:34 localhost.localdomain haproxy[4718]: 192.0.223.236:48457
> [12/Aug/2014:16:01:24.454] riak_cs~ riak_cs_backend/riak3 161/0/0/-1/10162
> 504 194 - - sH-- 0/0/0/0/0 0/0
> {bonfirehub-resources-can-east-doc-conversion.bf-riakcs.com} "GET
> /?delimiter=/ HTTP/1.1"
>
> I’ve put together a video showing the top output on each of the 5 Riak
> nodes while running $ time s3cmd -c .s3cfg-riakcs-admin ls
> s3://bonfirehub-resources-can-east-doc-conversion
>
>
> https://dl.dropboxusercontent.com/u/5723659/RiakCS%20ls%20monitoring%20results.mov
>
> Now I’ve had a hunch that this is just a fundamentally expensive operation
> which exceeds the 5000ms response-time threshold set in our HAProxy config
> (which I raised during the video to illustrate what’s going on). After
> reading
> http://www.quora.com/Riak/Is-it-really-expensive-for-Riak-to-list-all-buckets-Why
> and http://www.paperplanes.de/2011/12/13/list-all-of-the-riak-keys.html I’m
> feeling like this is just a fundamental issue with the data structure in
> Riak.
>
> Based on this I’m thinking that the cost of this type of query is only going
> to get worse over time as we add more keys to this bucket (unless secondary
> indexes can be added). Or am I totally out to lunch here and there’s some
> other underlying problem?
>
> I’ve cc’d the mailing list on this as suggested.
>
> *Alex Millar*, CTO
> Office: 1-800-354-8010 ext. 704
> Mobile: 519-729-2539
> *GoBonfire*.com <http://GoBonfire.com>
>
> From: Charlie Voiselle <cvoiselle at basho.com>
> Reply: Charlie Voiselle <cvoiselle at basho.com>
> Date: August 13, 2014 at 10:36:51 AM
> To: Alex Millar <alex at gobonfire.com>
> Cc: Tad Bickford <tbickford at basho.com>
> Subject: Fwd: RiakCS 504 Timeout on s3cmd for certain keys


-- 
Kota UENISHI / @kuenishi
Basho Japan KK