Updated verifier logic #363
Conversation
jules left a comment:
So far, LGTM. I have tested it locally as well and it all worked fine; it'll be good to test this on the standby server before merging, though.
kellpossible left a comment:
looks good to me!
```diff
     finished_at: None,
     contributor_ids,
-    verifier_ids,
+    verifier_ids: vec![],
```
does verifier_ids need to be removed entirely from Round?
This field is still in use when the new round is created
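To illustrate, a rough sketch of what that looks like; the field types and constructor shape here are assumptions, not the actual `round.rs` code:

```rust
// Hypothetical sketch: `verifier_ids` is still populated when a new
// round is created, while the diff above clears it elsewhere.
struct Round {
    finished_at: Option<u64>,
    contributor_ids: Vec<String>,
    verifier_ids: Vec<String>,
}

impl Round {
    fn new(contributor_ids: Vec<String>, verifier_ids: Vec<String>) -> Self {
        Self {
            finished_at: None,
            contributor_ids,
            // Still in use at round creation, hence not removed entirely.
            verifier_ids,
        }
    }
}
```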
```rust
    return Err(CoordinatorError::ExpectedVerifier);
}
// Initialize the next challenge file.
storage.initialize(
```
I think this could benefit from `InitializeAction` (https://github.com/AleoHQ/aleo-setup/blob/01ee29f3ec74ee8a1068633f649f0fbb88149101/phase1-coordinator/src/objects/round.rs#L721) from #367, to avoid mutating storage in this module. We could keep an eye on which one gets merged first and implement it accordingly.
This is a bit controversial. In #367, do you want to replace a direct call, which performs the change immediately, with an indirect call, which performs the change somewhat later? I want to see an explicit call here.
It makes this closer to a pure function, which makes it more predictable and easier to test.
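A minimal sketch of the pattern being proposed; the `InitializeAction` name comes from #367, but `Locator`, the action variants, and the path format below are simplified assumptions, not the actual phase1-coordinator types:

```rust
// Sketch of the "return an action instead of mutating" pattern: the
// decision is computed purely and returned; applying it to storage
// happens at the caller, at the edge of the system.
struct Locator(String);

enum StorageAction {
    Initialize { locator: Locator, size: u64 },
}

// Pure: no storage handle, no side effects; trivially unit-testable.
fn next_challenge_action(round_height: u64, chunk_id: u64, size: u64) -> StorageAction {
    StorageAction::Initialize {
        locator: Locator(format!("round_{}/chunk_{}/challenge", round_height, chunk_id)),
        size,
    }
}
```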
```diff
 phase1-cli = { path = "../phase1-cli", features = ["parallel"] }
 phase1-coordinator = { path = "../phase1-coordinator", features = ["operator", "parallel"] }
-setup1-shared = { version = "0.1", path = "../setup1-shared" }
+setup1-shared = { version = "0.1", path = "../setup1-shared", features = ["twitter"] }
```
is this extra twitter feature flag part of this PR?
Yes, to reduce the number of dependencies the verifier pulls in.
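For context, a hedged sketch of how such a feature gate typically works on the consuming side; the module and function names here are made up for illustration:

```rust
// Hypothetical sketch: code behind the `twitter` feature (and any
// optional dependencies it needs) only compiles when the feature is
// enabled, keeping the verifier's dependency tree small otherwise.
#[cfg(feature = "twitter")]
pub mod twitter {
    pub fn announce(message: &str) {
        // Real code would call into twitter-specific dependencies,
        // which Cargo only pulls in when the feature is enabled.
        println!("announcing: {}", message);
    }
}

#[cfg(not(feature = "twitter"))]
pub mod twitter {
    // No-op stub so callers compile without the heavy dependencies.
    pub fn announce(_message: &str) {}
}
```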
```rust
    Ok(buffer)
}
// ///
```
to be removed?
Not sure; I intentionally left it here.
```rust
// Initialize the shutdown listener
let verifier = self.clone();
tokio::task::spawn(async move {
    let _ = verifier.shutdown_listener().await;
```
How does the verifier handle shutting down now?
That's a great question! As you can see, I've removed the task queue from the verifier. It reuses only 3 files now, and clears them when needed. If the verifier is killed, there are a few options:
- it has a task assigned but hasn't downloaded the files to verify yet: the assigned task will be cleared from the cache in a minute, and another verifier will pick it up
- if the verifier goes down after downloading the files, the result is the same as in the previous case
- if the verifier is killed in the middle of uploading the files, the coordinator will produce an error and reassign the work if the verifier doesn't show up in time
- if the verifier is killed after uploading the verification, nothing happens: the coordinator will write the files to disk and call the `try_verify` function
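A minimal sketch of the listener wiring shown in the diff above; the diff only shows that a task is spawned, so the `shutdown_listener` body here is an assumption (wait for Ctrl-C and exit, since the scenarios above need no extra cleanup):

```rust
use std::time::Duration;
use tokio::signal;

#[derive(Clone)]
struct Verifier;

impl Verifier {
    // Assumed minimal body: wait for a shutdown signal, then exit.
    // The real listener may also clear the three reused local files.
    async fn shutdown_listener(&self) -> std::io::Result<()> {
        signal::ctrl_c().await?;
        println!("shutdown signal received, exiting");
        std::process::exit(0);
    }
}

#[tokio::main]
async fn main() {
    let verifier = Verifier;
    // Spawn pattern from the diff: clone the verifier and run the
    // listener in a background task.
    let v = verifier.clone();
    tokio::task::spawn(async move {
        let _ = v.shutdown_listener().await;
    });
    // ... the sequential verification loop would run here ...
    tokio::time::sleep(Duration::from_secs(1)).await;
}
```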
Thanks for the comments; let's have some more tests before merging this.
Ready for review now
In this PR I relied on the `LockedLocators` structure a lot, which gives enough information to figure out all the required paths.

The verifier logic has been simplified a lot. The verifier downloads tasks one by one, sequentially, and verifiers are able to join mid-round. This allowed removing the verifier drop logic and simplifying many places where we previously had to handle both verifier and contributor; now many functions expect either only a contributor or only a verifier. The new verifier implementation also does the verification faster.
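A hedged sketch of the sequential loop this describes; all types and function names below are illustrative stand-ins, not the real setup1-verifier API:

```rust
use std::time::Duration;

// Placeholder types standing in for the real coordinator client.
struct CoordinatorClient;
struct Task;

impl CoordinatorClient {
    async fn get_task(&self) -> Option<Task> { Some(Task) }
    async fn download(&self, _task: &Task) -> Vec<u8> { vec![] }
    async fn upload(&self, _task: &Task, _proof: Vec<u8>) {}
}

fn verify(_response: &[u8]) -> Vec<u8> { vec![] }

// The simplified sequential loop: one task at a time, reusing the
// same local files. Because no per-verifier queue is kept, a verifier
// can join (or die) mid-round without any drop logic.
async fn run_verifier(client: &CoordinatorClient) {
    loop {
        match client.get_task().await {
            Some(task) => {
                let response = client.download(&task).await;
                let proof = verify(&response);
                // After upload, the coordinator writes the files to
                // disk and calls try_verify.
                client.upload(&task, proof).await;
            }
            // No pending task yet: wait briefly and ask again.
            None => tokio::time::sleep(Duration::from_secs(5)).await,
        }
    }
}
```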
As a side effect, this fixed one `StorageLocatorMissing` error, which occurred when a function was trying to remove a missing file.
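Presumably the fix amounts to checking existence before removal; a hedged sketch, where `Storage`, `Locator`, and the error variant are simplified placeholders:

```rust
struct Locator(String);

enum StorageError {
    LocatorMissing,
}

trait Storage {
    fn exists(&self, locator: &Locator) -> bool;
    fn remove(&mut self, locator: &Locator) -> Result<(), StorageError>;
}

// Guarded removal: skip the call when the locator is absent, so a
// file that is already gone no longer surfaces as a missing-locator error.
fn remove_if_exists<S: Storage>(storage: &mut S, locator: &Locator) -> Result<(), StorageError> {
    if storage.exists(locator) {
        storage.remove(locator)?;
    }
    Ok(())
}
```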