Jiak per-request read/write masks

Paul Rogers paul at riakrest.com
Tue Dec 15 16:23:50 EST 2009


I've coded a Riak modification that adds per-request read and write masks for
data accessed through Jiak. I use this modification to support a concept I call
"point of view" in the Ruby library RiakREST: http://riakrest.com

Summary:
  Riak mod allows per-request read and write fields:
    - During a write to existing bucket/key data, any fields not in the request
      JSON payload are copied from the existing data values on the server. This
      is the current Riak behavior for any fields not in the JSON payload that
      are also not in the current bucket read mask.
    - During a read of existing bucket/key data, only those fields specified in
      the request are returned in the JSON payload. Current behavior returns
      the fields in the current bucket read mask.

  Implementation notes:
    - I have not yet "polished" this code, for the following reasons:
       - If others don't want this, I'll just continue to carry my patch
         forward for users of RiakREST.
       - If it is wanted, there are other options for implementation and I'd
         welcome some discussion before committing to the "polishing".
    - The per-request write mask is enabled by the URI query parameter
      "copy=true".
    - The per-request read mask is given by the URI query parameter
      "read=f1,f2,fn".
    - The per-request read/write masks cannot read/write fields that are not in
      the current bucket read/write masks, i.e., this facility cannot be used to
      circumvent the current bucket masks.
    - For a link query, the request read mask only affects the fields returned
      for objects produced by the last "step" of the query. Applying different
      read masks to objects accumulated during intermediate steps would require
      more wrangling, and such link queries are not the common case. (Note: I
      don't support intermediate return results in RiakREST so I didn't concern
      myself with that use case.)
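
  As a rough illustration of the two query strings above, here is a minimal
  Python sketch that builds request URLs. The base URL and the helper name
  jiak_url are assumptions made for this example; only the "read=" and
  "copy=true" parameters come from the notes above.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for a local Jiak endpoint; adjust for your own setup.
JIAK_BASE = "http://localhost:8098/jiak"


def jiak_url(bucket, key, read=None, copy=False, base=JIAK_BASE):
    """Build a bucket/key URL with optional per-request mask parameters.

    Hypothetical helper for illustration; not part of Riak or RiakREST.
    """
    url = f"{base}/{quote(bucket, safe='')}/{quote(key, safe='')}"
    query = {}
    if read:
        # Per-request read mask: comma-separated field names.
        query["read"] = ",".join(read)
    if copy:
        # Per-request write mask: copy absent fields from the stored object.
        query["copy"] = "true"
    if query:
        # Keep commas literal so the value reads as "f1,f2,fn".
        url += "?" + urlencode(query, safe=",")
    return url


read_url = jiak_url("B", "K", read=["a", "b"])
write_url = jiak_url("B", "K", copy=True)
print(read_url)   # http://localhost:8098/jiak/B/K?read=a,b
print(write_url)  # http://localhost:8098/jiak/B/K?copy=true
```

  A GET on the first URL would ask for fields a and b only; a PUT to the second
  would carry just the fields being changed.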

  For a brief description of a use-case supporting the need for per-request
  masks in RiakREST see http://riakrest.com/#examples:buoy_pov


There are two primary benefits to having per-request read and write fields:
    
  Reduced HTTP message sizes. Only the per-request fields are transported via
  JSON messaging. Depending on the number and size of the data fields, the
  savings could be significant.

  Data protection under certain concurrent bucket access patterns:
    This requires a bit of explanation. Consider the following access pattern
    by clients X and Y:

      - X creates bucket B with default schema: AF=[*], RF=[], RM=[*], WM=[*]
      - X puts a=1,b=2 in B under key K.
      - Y sets B schema to AF=[*], RF=[], RM=[a], WM=[*]
      - Y reads K, getting a=1
      - X sets B schema to AF=[*], RF=[], RM=[a,b], WM=[*]
      - Y updates K with a=11
      - X reads K, getting a=11, b=null

    Note the loss of data value 'b'.

    The problem lies in the clients' use of bucket schemas, which work at a
    global level, to restrict data access at a request level. Client Y only
    wants to read/write field 'a', whereas client X wants to read/write fields
    'a' and 'b'. But since there is only one current bucket schema at any time,
    the above interaction pattern leads to apparent data loss of the value in
    field 'b': when Y writes a=11 the bucket read mask is [a,b], so 'b' is
    missing from Y's payload but is not copied from the stored value.
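
    The lost update can be reproduced with a small in-memory model. This is a
    simplified sketch of the write rule as described in this message (fields
    absent from the payload are copied only when they are outside the bucket
    read mask); it is not Riak code, the Bucket class is hypothetical, and the
    write mask is left at [*] as in the scenario.

```python
WILDCARD = "*"


def allowed(field, mask):
    """True if the field is covered by the mask (or the mask is [*])."""
    return WILDCARD in mask or field in mask


class Bucket:
    """Toy model of one bucket with a single, global read mask."""

    def __init__(self):
        self.read_mask = [WILDCARD]
        self.data = {}

    def read(self, key):
        # Only fields in the current bucket read mask are returned.
        return {f: v for f, v in self.data[key].items()
                if allowed(f, self.read_mask)}

    def write(self, key, payload):
        existing = self.data.get(key, {})
        merged = {}
        for field in sorted(set(existing) | set(payload)):
            if field in payload:
                merged[field] = payload[field]
            elif not allowed(field, self.read_mask):
                # Hidden from the writer, so the stored value is copied.
                merged[field] = existing[field]
            else:
                # Readable but missing from the payload: value is lost.
                merged[field] = None
        self.data[key] = merged


b = Bucket()
b.write("K", {"a": 1, "b": 2})   # X puts a=1,b=2
b.read_mask = ["a"]              # Y sets RM=[a]
y_view = b.read("K")             # Y reads a=1
b.read_mask = ["a", "b"]         # X sets RM=[a,b]
b.write("K", {"a": 11})          # Y updates with a=11 only
x_view = b.read("K")             # X reads a=11, b=null
print(y_view, x_view)
```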

    Using the per-request patch, the same interaction pattern would be:

      - X creates bucket B with default schema: AF=[*], RF=[], RM=[*], WM=[*]
      - X puts a=1,b=2 in B under key K.
      - Y reads K using 'read=a', getting a=1
      - Y updates K with a=11, using 'copy=true'
      - X reads K, getting a=11, b=2

    The clients no longer need to change the single, shared bucket schema to
    achieve restricted data access, and no data loss occurs.
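
    The per-request behaviour can be sketched with a similar in-memory model.
    Again this is only an illustration under assumptions, not the patch itself:
    the Bucket class and its read/copy arguments are hypothetical stand-ins for
    the "read=" and "copy=true" query strings, and the write mask stays at [*].

```python
WILDCARD = "*"


def allowed(field, mask):
    """True if the field is covered by the mask (or the mask is [*])."""
    return WILDCARD in mask or field in mask


class Bucket:
    """Toy model of a bucket that accepts per-request masks."""

    def __init__(self):
        self.read_mask = [WILDCARD]
        self.data = {}

    def read(self, key, read=None):
        # A request mask can only narrow the bucket read mask, never widen it.
        return {f: v for f, v in self.data[key].items()
                if allowed(f, self.read_mask) and (read is None or f in read)}

    def write(self, key, payload, copy=False):
        existing = self.data.get(key, {})
        merged = {}
        for field in sorted(set(existing) | set(payload)):
            if field in payload:
                merged[field] = payload[field]
            elif copy or not allowed(field, self.read_mask):
                # With copy=True every absent field keeps its stored value.
                merged[field] = existing[field]
            else:
                merged[field] = None
        self.data[key] = merged


b = Bucket()
b.write("K", {"a": 1, "b": 2})         # X puts a=1,b=2
y_view = b.read("K", read=["a"])       # Y reads with read=a
b.write("K", {"a": 11}, copy=True)     # Y updates with copy=true
x_view = b.read("K")                   # X reads a=11, b=2
print(y_view, x_view)

# The request mask cannot circumvent the bucket mask.
b.read_mask = ["a"]
restricted = b.read("K", read=["a", "b"])
print(restricted)
```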




