loading erlang terms

Charles Blair chas at uchicago.edu
Tue Oct 26 11:11:19 EDT 2010


On Mon, Oct 25, 2010 at 07:34:13PM -0700, Dan Reverri wrote:
> I don't understand your use case; can you expand on what you are doing?

Perhaps this is in the category of "too much information," but since you asked: I have written an OAI-PMH provider in erlang. My backing store is a dets file, which I read into an in-memory ets table when the server starts up. The memory-resident database consists of keys associated with values that are erlang terms. I use these terms to filter queries conforming to the OAI-PMH protocol specification. I create the dets file from a file of plain text. If that file held just one term (I'll typically have hundreds or thousands), it might look like this:

{oai_dc,"AEP-WYS97",                                                                       
        {{2004,10,28},{17,21,25}},
        ["aep"],                                                                           
        "/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}.

In erlang itself, if I've called the file AEP-WYS97, I can do something like this:

1> {_, Term} = file:consult("AEP-WYS97").
{ok,[{oai_dc,"AEP-WYS97",
             {{2004,10,28},{17,21,25}},
             ["aep"],
             "/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}]}

2> Term.
[{oai_dc,"AEP-WYS97",
         {{2004,10,28},{17,21,25}},
         ["aep"],
         "/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}]

3> element(2, hd(Term)).
"AEP-WYS97"
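
As a minimal sketch of the startup step described above, the terms returned by file:consult/1 can be inserted straight into an ets table keyed on the identifier (element 2 of each tuple). The module name, function names, and table name here are hypothetical, and the dets stage is left out for brevity:

```erlang
-module(oai_load_sketch).
-export([load/2, lookup/2]).

%% Read every term from a plain-text file with file:consult/1 and insert
%% them all into a new ets table. {keypos, 2} makes the second element of
%% each tuple (the record identifier) the lookup key.
load(TableName, Path) ->
    {ok, Terms} = file:consult(Path),
    Tab = ets:new(TableName, [set, {keypos, 2}]),
    true = ets:insert(Tab, Terms),
    Tab.

%% Return the list of stored tuples whose identifier equals Id
%% (an empty list if there is none).
lookup(Tab, Id) ->
    ets:lookup(Tab, Id).
```

With the file shown earlier, oai_load_sketch:lookup(Tab, "AEP-WYS97") would return a one-element list holding the whole oai_dc tuple, already in term form and ready for filtering.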

Now, if I replace dets/ets with riak, I can do two things (I think). The first would be to write an erlang function that does something like the above. The second (which is where my question comes into play) is this: if I can load arbitrary data types into riak, is there a way to specify that what I'm loading are erlang terms? If I don't do that, and load strings, I run into this sort of nastiness:

(riak@127.0.0.1)295> {ok, Tk} = C:get(<<"oai_dc">>, <<"dsalhensley-m025">>).

(riak@127.0.0.1)296> V = riak_object:get_value(Tk).
<<"{oai_dc,dsalhensley-m025,        {{2004,11,9},{15,38,27}},        [dsal,dsal:hensley],        /storage/dsal/hensley/"...>>

(riak@127.0.0.1)300> A = binary_to_list(V).
"{oai_dc,dsalhensley-m025,        {{2004,11,9},{15,38,27}},        [dsal,dsal:hensley],        /storage/dsal/hensley/metadata/oai_dc/oai_dc-dsalhensley-m025.xml}"

(riak@127.0.0.1)301> {ok, Tokens, _} = erl_scan:string(A).
{ok,[{'{',1},
     {atom,1,oai_dc},
     {',',1},
     {atom,1,dsalhensley},
     {'-',1},
     {atom,1,m025},
     {',',1},
     {'{',1},
     {'{',1},
     {integer,1,2004},
     {',',1},
     {integer,1,11},
     {',',1},
     {integer,1,9},
     {'}',1},
     {',',1},
     {'{',1},
     {integer,1,15},
     {',',1},
     {integer,1,38},
     {',',1},
     {integer,1,27},
     {'}',1},
     {'}',1},
     {',',1},
     {'[',...},
     {...}|...],
    1}

So far, so good. However:

(riak@127.0.0.1)302> erl_parse:parse_term(Tokens).
{error,{1,erl_parse,["syntax error before: ","'/'"]}}

This happens because I have unescaped quotation marks in the input. If I had intended my input to be strings, then that's something I should have taken care of beforehand, but if I intend the input to be terms, then I shouldn't have to.
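
For comparison, here is a minimal sketch of the same scan-and-parse step wrapped in a function; the module and function names are hypothetical. It assumes the input ends with a full stop, as terms in a consulted file do. When the string literals keep their quotation marks, erl_parse:parse_term/1 returns the term; when they are missing, it returns an error tuple of the kind shown above.

```erlang
-module(parse_term_sketch).
-export([parse/1]).

%% Tokenise a string with erl_scan:string/1 and parse the tokens as a
%% single erlang term. Returns {ok, Term} on success, {error, ErrorInfo}
%% if either scanning or parsing fails.
parse(String) ->
    case erl_scan:string(String) of
        {ok, Tokens, _EndLocation} ->
            erl_parse:parse_term(Tokens);
        {error, ErrorInfo, _ErrorLocation} ->
            {error, ErrorInfo}
    end.
```

For example, parse_term_sketch:parse("{oai_dc,\"AEP-WYS97\"}.") returns {ok,{oai_dc,"AEP-WYS97"}}, whereas an unquoted path such as "{oai_dc,/storage}." yields an {error, ...} tuple.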

I'm prepared to use option 1, above, or else to convert my terms into strings and parse those, but string handling in erlang is not particularly efficient. So before I go down either of those routes, I'm curious whether I can tell riak "this is an erlang term" and have it stored as such, so that when I go to use it, my code doesn't have to do any further transformation unrelated to the job at hand (filtering).
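
One way to avoid string parsing regardless of how riak treats the value is to serialise each term with the built-in term_to_binary/1 before storing it and to restore it with binary_to_term/1 after fetching it. The sketch below shows only that round trip; the module and function names are hypothetical, and whether riak can hold terms natively is a separate question that this does not answer.

```erlang
-module(term_codec_sketch).
-export([encode/1, decode/1]).

%% Serialise any erlang term to a binary in the external term format.
%% The result is an opaque binary, suitable as a stored value.
encode(Term) ->
    term_to_binary(Term).

%% Restore the original term from a binary produced by encode/1.
%% No scanning or parsing of strings is involved.
decode(Bin) when is_binary(Bin) ->
    binary_to_term(Bin).
```

A term that goes through encode/1 and then decode/1 comes back identical to the original, so strings, atoms, and nested tuples keep their types without any quoting concerns.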

I'm relatively new to riak (less than a week), so perhaps there is a better approach entirely.

Thanks.







