Riak and SEC Filings

Ryan Zezeski rzezeski at basho.com
Tue Nov 8 16:09:31 EST 2011


On Tue, Nov 8, 2011 at 7:08 AM, Hector Castro <hectcastro at gmail.com> wrote:
>
>
>        * In going through the search querying documentation, I haven't
> found a way to extract a section of a result containing matches.  Something
> similar to Google's search results page where you see an excerpt of the
> webpage contents that match your query.  Is something like this built-in so
> that it doesn't have to be done by the application?
>

It hasn't been documented anywhere (sorry) but the 1.0.0 release includes
field listing support for the solr-like interface [1].

http://localhost:8098/solr/<index>/select?q=query&fl=blurb
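
A minimal sketch of building that URL from code rather than by hand. The helper name `select_url`, the index name `filings`, and the field name `blurb` are illustrative assumptions, not part of Riak; only the `q` and `fl` parameters come from the URL above.

```python
from urllib.parse import urlencode

def select_url(index, query, field, host="localhost", port=8098):
    """Build a solr-like select URL that asks for a single stored field.

    `index`, `query` and `field` are whatever your schema defines;
    the values used below are placeholders.
    """
    params = urlencode({"q": query, "fl": field})  # percent-encodes ':' etc.
    return f"http://{host}:{port}/solr/{index}/select?{params}"

url = select_url("filings", "text:merger", "blurb")
print(url)
```

The URL is only constructed here; fetching it (for example with `urllib.request.urlopen`) requires a running node with search enabled.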

>        * Given that the documents total ~1TB of storage (not including the
> generated indexes), does something like decreasing the n_val make sense?
>  Mostly the documents are bulk inserted on a daily or weekly basis – other
> than that all of the operations are read-only.
>

I wouldn't recommend setting N=1 because if you lose disks you'll lose
data.  However, for loading data you could set W=1 for quicker loading.
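
As a rough sketch of what W=1 looks like during a bulk load over HTTP: the write quorum can be passed per request as a `w` query parameter. The bucket name, key, and the `/riak/<bucket>/<key>` path layout are assumptions about your setup, and `put_request` is a hypothetical helper, so adjust them to match your cluster.

```python
from urllib.parse import quote
from urllib.request import Request

def put_request(bucket, key, body, w=1, host="localhost", port=8098):
    """Prepare (but do not send) a PUT that waits for only `w` replicas.

    Assumes the /riak/<bucket>/<key> HTTP path; bucket and key names
    are placeholders.
    """
    url = f"http://{host}:{port}/riak/{quote(bucket)}/{quote(key)}?w={w}"
    return Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="PUT",
    )

req = put_request("filings", "doc-0001", "document text goes here")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the write. N stays at its default, so the object is still replicated; only the number of acknowledgements the client waits for changes.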


> Other than these specific questions, if anyone can provide general insight
> on issues that would arise from a dataset like this within Riak, please
> feel free to mention them.
>
>
Riak should handle the data fine, as your actual object sizes are very
practical.  My concern would be the types of queries you plan to run.
Search can become overwhelmed if you search for a frequently occurring term.
Typically these come from fields like gender, name, color, etc., i.e. fields
that have a small set of total terms that are repeated often.  These are
often tagging data used for classification.  For these you'll want to make
use of inline fields [2,3].  In that case you use a query to match the
documents and then inline fields to further filter those results [4].

http://localhost:8098/solr/<index>/select?q=query&filter=tag:foo&fl=blurb
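
The same query-plus-filter pattern, sketched as a small helper. The names `filtered_select_url`, `filings`, `tag`, and `blurb` are hypothetical, and the sketch assumes `tag` has been declared as an inline field in the index schema [2].

```python
from urllib.parse import urlencode

def filtered_select_url(index, query, inline_filter, field,
                        host="localhost", port=8098):
    """Build a select URL where `query` matches documents and
    `inline_filter` (e.g. "tag:foo") narrows them via an inline field.
    """
    params = urlencode({"q": query, "filter": inline_filter, "fl": field})
    return f"http://{host}:{port}/solr/{index}/select?{params}"

url = filtered_select_url("filings", "text:merger", "tag:foo", "blurb")
print(url)
```

Keeping the selective term in `q` and the frequently repeated term in `filter` is the point of the split: the common term is never used to drive the search itself.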

[1]: For all the gory details see
https://github.com/basho/riak_search/pull/86

[2]: http://wiki.basho.com/Riak-Search---Schema.html

[3]:
https://github.com/rzezeski/try-try-try/tree/master/2011/riak-search-inline-fields

[4]:
http://wiki.basho.com/Riak-Search---Querying.html#Querying-via-the-Solr-Interface