) -> Result<Option<EnqueueJob>, JobError> {
    let aggregators = Aggregators::find()
        .filter(AggregatorColumn::IsFirstParty.eq(true)) // eventually we may want to check for a capability
        .all(db)
Eventually there will be too many records to load in a single query, and we will want to stream them instead.
src/queue/job/v1/task_sync.rs
for aggregator in aggregators {
    let client = aggregator.client(job_state.http_client.clone());
    for task_id in client.get_task_ids().await? {
        if 0 == Tasks::find_by_id(&task_id).count(db).await? {
We may eventually want to perform a single query that prefetches all task ids for the given aggregator, rather than issuing one count query per task id.
This query should likely also be constrained to the given aggregator, since it's just as invalid for the aggregator to have another aggregator's task as it is to have a task that we don't know about at all.
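The two suggestions above (prefetch ids once, and scope the lookup to the aggregator being synced) can be sketched as a set difference. This is a minimal in-memory illustration, not the PR's code: `LocalTask`, its `aggregator_id` field, and `unknown_task_ids` are hypothetical names, and a real task row may reference more than one aggregator.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a locally stored task row.
struct LocalTask {
    id: String,
    aggregator_id: u32,
}

/// Return the task ids reported by `aggregator_id` that we do not have
/// recorded locally for that same aggregator, in the order they were reported.
fn unknown_task_ids(
    aggregator_id: u32,
    remote_ids: &[String],
    local_tasks: &[LocalTask],
) -> Vec<String> {
    // Build the set of known ids once, scoped to this aggregator, instead of
    // running one count query per remote task id.
    let known: HashSet<&str> = local_tasks
        .iter()
        .filter(|task| task.aggregator_id == aggregator_id)
        .map(|task| task.id.as_str())
        .collect();

    remote_ids
        .iter()
        .filter(|id| !known.contains(id.as_str()))
        .cloned()
        .collect()
}
```

With this scoping, a task that exists locally but belongs to a different aggregator is treated the same as a task we have never seen.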
tgeoghegan left a comment
I think this PR accomplishes what it aims to do, but I'm not certain if deleting all tasks from Janus that divviup-api doesn't know about is the right thing to do. How will this interact with taskprov? @inahga can you comment?
I think we'll need to add task provenance to the aggregator API representation.
This represents a sort of "minimal first version of task sync." It deletes tasks from first-party aggregators if we don't have them locally. It does not yet compare parameters. There are a lot of optimizations and rate-limiting changes that could be applied to this, but those changes should be informed by benchmarks on representative data.
Refs #328