Using riak as a `Comment` Store - Slow results

Herman Junge hermanalonsojunge at gmail.com
Tue Oct 30 11:27:51 EDT 2012


Hi list.

I am researching the use of riak as a solution to store comments. 
Unfortunately my results were far from favorable. I will describe the 
architecture I used, the schemas chosen, the steps taken and the results, 
hoping to get feedback both from basho and from any experienced user on 
what to do to improve these times, or whether to discard riak as a store 
for comments.

1. The problem

Store comments. Given a _parent_url_ (which could be a blog post, 
an image, anything with a URL), group its comments.


2. Architecture


2.1. Riak Database

I used the joyent cloud and set up 4 SmartOS machines with 1024 MB RAM 
each. They have riak preinstalled.


2.2. Client Application

The application is built in node.js and uses the expressJS framework 
(https://github.com/visionmedia/express) to respond to HTTP requests 
(specifically PUT and GET). The Riak library is node_riak 
(https://github.com/mranney/node_riak), which has been `tested in 
combat` by its creators at voxer.


The client application runs on another machine in the joyent cloud, an 
Ubuntu 12.04 machine with 1024 MB RAM.


3. Schemas Chosen

I went with a very simple schema. Since the comments are grouped by 
_parent_url_, I'm using parent_url as the key, its value being a JSON 
array of the comments.

An example for a key is: 
<server_url>/riak/parent_url/http%3A%2F%2Fpath%2Fto%2Fmy%2Fsite%2Ffile.html

An example for a value is:

{ "comments"  :
   [
     { "date"    : "2012-10-30T14:50:11.898Z"
     , "text"    : "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
     , "author"  : "John Doe"
     }
   , { "date"    : "2012-10-30T14:50:11.898Z"
     , "text"    : "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
     , "author"  : "John Doe"
     }
   , { "date"    : "2012-10-30T14:50:11.898Z"
     , "text"    : "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
     , "author"  : "John Doe"
     }
   ]
}
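
A minimal sketch of how such a key and value could be built in node.js. 
The server address `http://riak.example:8098` and the helper names 
`keyUrl` and `makeComment` are assumptions for illustration only; they 
are not taken from the actual application.

```javascript
// Sketch only: serverUrl and both helper names are hypothetical.

// Build the key URL for a parent_url. encodeURIComponent escapes ':' and '/'
// so the whole parent URL fits into a single path segment.
function keyUrl(serverUrl, parentUrl) {
  return serverUrl + '/riak/parent_url/' + encodeURIComponent(parentUrl);
}

// Build one comment object with the three fields used in the schema.
function makeComment(author, text, date) {
  return { date: date.toISOString(), text: text, author: author };
}

// The value stored under a key: an object holding the array of comments.
const value = {
  comments: [
    makeComment(
      'John Doe',
      'Lorem ipsum dolor sit amet, consectetur adipiscing elit.',
      new Date('2012-10-30T14:50:11.898Z')
    ),
  ],
};

console.log(keyUrl('http://riak.example:8098', 'http://path/to/my/site/file.html'));
console.log(JSON.stringify(value, null, 2));
```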

4. Steps Taken

4.1. Client API:

My client API accepts two requests:

* PUT /comment
* GET /comments/:parent_url?offset=<offset>&limit=<limit>

4.1.1 PUT /comment

Stores a comment under the parent_url given inside the request. I use 
node_riak's `client.modify()` method, which `GET`s the current value of 
the parent_url key, applies the mutation (supplied by the library user; 
in this case it just pushes the comment's JSON onto the array), and then 
`PUT`s the new value back to the parent_url key.
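
The read-modify-write cycle described above can be sketched as follows. 
This is not node_riak's code: `fakeRiak` is a hypothetical in-memory 
stand-in for the database, and `modify` and `addComment` are 
illustrative helpers written for this example.

```javascript
// Hypothetical in-memory stand-in for Riak. Callbacks fire synchronously
// here to keep the sketch short; a real client is asynchronous.
const store = new Map();
const fakeRiak = {
  get(key, cb) {
    cb(null, store.has(key) ? JSON.parse(store.get(key)) : null);
  },
  put(key, value, cb) {
    store.set(key, JSON.stringify(value));
    cb(null);
  },
};

// Read-modify-write: GET the current value, apply the caller's mutation,
// then PUT the whole new value back under the same key.
function modify(db, key, mutate, cb) {
  db.get(key, (err, current) => {
    if (err) return cb(err);
    const next = mutate(current || { comments: [] });
    db.put(key, next, (putErr) => cb(putErr, next));
  });
}

// The mutation for PUT /comment: push one comment onto the array.
function addComment(db, parentUrl, comment, cb) {
  modify(db, parentUrl, (doc) => {
    doc.comments.push(comment);
    return doc;
  }, cb);
}

addComment(
  fakeRiak,
  'http://path/to/my/site/file.html',
  { date: new Date().toISOString(), text: 'First!', author: 'John Doe' },
  (err, doc) => console.log(err || doc.comments.length)
);
```

With this scheme every write transfers the whole array in both 
directions, so the cost of a single PUT grows with the number of 
comments already stored under the key.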

4.1.2 GET /comments/:parent_url?offset=<offset>&limit=<limit>

GETs the comments for the given parent_url, starting at <offset> and 
returning up to <limit> comments.

Internally I just issue a `GET` to riak; the controller of my client 
then extracts the requested range using offset and limit.
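
A sketch of what that extraction might look like in the controller, 
assuming <limit> is a count of comments rather than an end index (the 
test request uses offset=25 and limit=20). The function name `paginate` 
and its handling of missing or invalid parameters are assumptions.

```javascript
// Sketch only: `paginate` is a hypothetical helper. Query-string values
// arrive as strings, so they are parsed before use.
function paginate(comments, offset, limit) {
  // A missing or invalid offset falls back to 0.
  const start = Math.max(0, Number.parseInt(offset, 10) || 0);
  // A missing or invalid limit means "everything from start onwards".
  const parsedLimit = Number.parseInt(limit, 10);
  const end = Number.isNaN(parsedLimit)
    ? undefined
    : start + Math.max(0, parsedLimit);
  return comments.slice(start, end);
}

// Example: 100 numbered comments, same parameters as the stress test.
const all = Array.from({ length: 100 }, (_, i) => ({ text: 'comment ' + i }));
const page = paginate(all, '25', '20');
console.log(page.length, page[0].text, page[page.length - 1].text);
```

The full array is still fetched from riak on every request; only the 
slice is done in the client application.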

4.2. The Stress Test

I provisioned a new joyent machine (an Ubuntu 12.04 with 1024 MB RAM) 
just to run `ab` stress tests.

I ran six tests:

API method   No. of requests   Concurrency
PUT (*1)     10,000            5
PUT (*1)     10,000            50
PUT (*1)     10,000            500
GET (*2)     10,000            5
GET (*2)     10,000            50
GET (*2)     10,000            500

(*1) PUT /comment
(*2) GET /comments/http%3A%2F%2Fpath%2Fto%2Fmy%2Fsite%2F1111.html?offset=25&limit=20


5. Results

The following table shows the results I got on each test. Every test 
issued 10,000 requests; c is the concurrency. The values are the 
response-time percentiles reported by `ab`, which prints them in 
milliseconds.

Percentile   PUT c=5   PUT c=50   PUT c=500   GET c=5   GET c=50   GET c=500
50%          116       1879       20876       68        631        6363
65%          142       1990       21491       75        673        6636
70%          161       2068       21919       80        701        6820
85%          177       2124       22202       83        719        6934
90%          274       2364       23136       94        783        7218
95%          486       2734       23914       107       913        7442
98%          751       4062       25036       145       1054       7691
99%          1165      4591       25835       475       1099       7836
100%         1065      11258      29611       535       1265       8435
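
As a back-of-the-envelope reading of these numbers, the median times can 
be turned into a rough throughput estimate using Little's law 
(throughput is approximately concurrency divided by latency). This is a 
sketch under two assumptions: that the times are in milliseconds, as 
`ab` reports them, and that the median is an acceptable proxy for the 
mean latency. The function name `estimateThroughput` is illustrative.

```javascript
// Median (50%) response times in ms, copied from the results above,
// keyed by API method and concurrency.
const medians = {
  PUT: { 5: 116, 50: 1879, 500: 20876 },
  GET: { 5: 68, 50: 631, 500: 6363 },
};

// Little's law, rearranged: requests per second ~= concurrency / latency.
// This is only an estimate; it treats the median as the mean latency.
function estimateThroughput(concurrency, latencyMs) {
  return concurrency / (latencyMs / 1000);
}

for (const [method, byConcurrency] of Object.entries(medians)) {
  for (const [c, ms] of Object.entries(byConcurrency)) {
    const reqPerSec = estimateThroughput(Number(c), ms);
    console.log(method, 'c=' + c, reqPerSec.toFixed(1), 'req/s');
  }
}
```

On these medians the estimate stays at roughly 74 to 79 requests per 
second for GET at every concurrency level, and falls from about 43 to 
about 24 requests per second for PUT. That pattern would be consistent 
with the setup saturating at a fixed throughput, with the extra 
concurrency only adding queueing time, though the estimate is too rough 
to say where the bottleneck is.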


6. Conclusion

At first sight, I'm getting very unfavorable results (compared with an 
unconfigured, single-table MySQL setup under the very same requests). So 
I'm requesting feedback from you:

a) Is it a good idea to use Riak as a comment store?

b) Are these times expected? (In other words, where am I making a big 
mistake?)

Regards,

Herman Junge
@hermanjunge






