First pass at a quality-factor implementation in fabric #163
base: quality-factor-working-branch
Conversation
A bit unconventional to call back into the coordinating module but that's OK.
Nice first pass @ksnavely. I can, however, think of a significantly more efficient implementation which eliminates a lot of the extra RPC traffic and sets us up to be able to do more sophisticated processing over time. The key to this implementation would be to avoid trying to compute the QF in the RPC worker directly and instead transmit all the "raw materials" back to the coordinator for, y'know, coordination. Let's think about the bits of data that are needed from each shard here:

- the shard's current `update_seq`
- the shard's UUID (so we can identify checkpoint docs where it is the source)
- the relevant internal replication checkpoint docs
Figuring out the list of "relevant" internal replications can be a bit tricky. In your current approach you're asking for all the UUIDs of the peer replicas of the shard and then pulling up the checkpoint docs directly. An alternative that might be worth exploring is to actually just walk the local docs tree (using something like `couch_btree:foldl/4`). The checkpoint docs are supposed to be pretty compact, but if this turned out to be too large a payload we could strip out just the necessary bits of each checkpoint and return those. Then the coordinator would have all the information necessary to do whatever computation we needed in local memory. We might even consider exposing all of that information as an admin API endpoint and pushing the data out to Python or JavaScript, where we've got better libraries for manipulation and visualization. I realize I'm throwing a lot at you here, but I think this is a place where we can make a pretty nice improvement and where we currently have pretty poor visibility.
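To make the coordinator-side idea concrete, here is a minimal, self-contained sketch of how the "raw materials" from each shard might be reduced in local memory. Everything here is a hypothetical illustration, not the actual fabric API: the module and function names (`qf_sketch`, `worst_case_qf/1`) are invented, and the per-shard proplist shape simply mirrors what the worker code later in this thread returns.

```erlang
%% Hypothetical sketch (not real fabric code): each RPC worker returns a
%% proplist of raw materials; the coordinator folds over those results in
%% local memory instead of computing anything in the workers themselves.
-module(qf_sketch).
-export([worst_case_qf/1]).

%% ShardResults :: [[{uuid | update_seq | checkpoints, term()}]]
%% Each checkpoint is an EJSON-style {Proplist} carrying a
%% <<"docs_remaining">> count.
worst_case_qf(ShardResults) ->
    lists:foldl(fun(Info, Worst) ->
        Checkpoints = proplists:get_value(checkpoints, Info, []),
        Remaining = [docs_remaining(C) || C <- Checkpoints],
        lists:max([Worst | Remaining])
    end, 0, ShardResults).

docs_remaining({Props}) ->
    proplists:get_value(<<"docs_remaining">>, Props, 0).
```

Because the coordinator holds everything in memory, swapping in a more sophisticated metric later (or serializing the whole structure out through an admin endpoint) becomes a local change rather than a new RPC protocol.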
Oh, the other bit I wanted to comment on is that a difference of sequences is not an immediately meaningful value for users. I think in 99% of cases the number of documents that are not up-to-date is the relevant quantity. We can compute this using `couch_db:count_changes_since/2`.
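To illustrate the "documents not up-to-date" idea without a running CouchDB, here is a plain-Erlang stand-in for the counting semantics: given the update sequences present in a shard and a replication checkpoint sequence, count how many documents have been modified past the checkpoint. The module and function names (`changes_sketch`, `changes_since_count/2`) are made up for this sketch.

```erlang
%% Stand-in sketch mimicking the semantics of couch_db:count_changes_since/2:
%% every update sequence greater than the checkpoint sequence represents a
%% document the replication target has not yet seen.
-module(changes_sketch).
-export([changes_since_count/2]).

changes_since_count(UpdateSeqs, CheckpointSeq) ->
    length([S || S <- UpdateSeqs, S > CheckpointSeq]).
```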
Because I had some extra free time this afternoon, here's a stab at an implementation of what I'm talking about for the RPC worker:

```erlang
get_quality_factor_data(Db) ->
    #db{update_seq = CurrentSeq, local_tree = Bt} = Db,
    UUID = couch_db:get_uuid(Db),
    % Compute the checkpoint ID for replications where we are the source
    % size(<<"_local/shard-sync-base64(md5(UUID))">>) == 40
    <<MyPrefix:40/binary, _/binary>> = mem3_rep:make_local_id(UUID, nil),
    % Function to grab all checkpoint docs where we are the source, and for
    % each doc compute the number of documents in the database that have been
    % modified since the checkpointed sequence.
    Fun = fun({Id, {_Rev, {Body}}}, _Offset, Acc) ->
        case Id of
            <<MyPrefix:40/binary, _/binary>> ->
                CheckpointSeq = couch_util:get_value(<<"seq">>, Body, 0),
                ModifiedDocCount = couch_db:count_changes_since(Db, CheckpointSeq),
                NewBody = {[
                    {<<"_id">>, Id},
                    {<<"docs_remaining">>, ModifiedDocCount} |
                    Body
                ]},
                {ok, [NewBody | Acc]};
            _Else ->
                % We've reached a checkpoint doc with a different source node
                {stop, Acc}
        end
    end,
    % Execute the function above against the local docs btree. The combination
    % of start_key in the Options list and the pattern match for MyPrefix in
    % the function ensures that we only grab documents where we are the source
    {ok, _, AccOut} = couch_btree:foldl(Bt, Fun, [], [{start_key, MyPrefix}]),
    InfoList = [
        {uuid, UUID},
        {update_seq, CurrentSeq},
        {checkpoints, AccOut}
    ],
    {ok, InfoList}.
```

In this code I specifically avoided any checkpoint docs that concern sources other than our current shard file. I think this is OK for our purposes. One could make the argument to return all of them, because as far as the internal replicator is concerned a checkpoint sequence is only valid if both source and target agree on the sequence number. Here we're throwing away any information on whether the target agrees with us, the source. Like I said, I think it's OK for the purposes of this exercise. In very rare scenarios it could cause us to underestimate the QF temporarily, but the current implementation has this challenge as well.
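For orientation, the `{ok, InfoList}` term the worker above hands back to the coordinator would be shaped roughly like the following. All of the concrete values (UUIDs, sequence numbers, counts) are invented for illustration; only the shape mirrors the code above.

```erlang
%% Illustrative only: the shape of the worker's reply, with invented values.
%% Each checkpoint carries the original checkpoint doc body plus the
%% computed docs_remaining count prepended by the fold function.
Info = [
    {uuid, <<"7d6ff0a9">>},          % this shard's UUID (invented)
    {update_seq, 4250},              % shard's current update sequence
    {checkpoints, [
        {[{<<"_id">>, <<"_local/shard-sync-abc">>},  % invented checkpoint id
          {<<"docs_remaining">>, 12},                % computed by the worker
          {<<"seq">>, 4238}]}                        % checkpointed sequence
    ]}
],
{ok, Info}.
```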
@kocolosk TYVM for the comments! I knew I was going to end up with an under-performant implementation the first time around -- most of the algorithm I yanked verbatim from snippet. The best part of the hackday (for me at least) was just sitting down with fabric and rexi, reading the source, and coming away with a better understanding of that part of the system.
Sure, makes a ton of sense. Hopefully we're "speaking the same language" now 😃
Here I implement a new fabric module for calculating a database's quality factor. The quality factor is an upper bound of missing updates between shard replicas for all shard ranges in the database. The current logic is in snippet.
I looked at other fabric modules for examples and spent time flipping between chttpd, fabric, and rexi to make sure I had my bearings. However, I'm sure this needs more updating. I've opened this PR against a working branch so I can hopefully get some comments and refine this, and maybe we can eventually merge it in along with some chttpd work.
@chewbranca @kocolosk @rnewson @davisp
Verbose %% comments on for great learning.