-
Notifications
You must be signed in to change notification settings - Fork 12
fix: use threads directive for vCPU and platform-aware resource validation #38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
As described in issue snakemake#34, validation assumed a Fargate compute environment. We can detect the compute environment by looking up the queue, and only run the Fargate specific validation when necessary.
Use self.job.threads for vCPU instead of the undocumented _cores resource. Falls back to _cores for backward compatibility if threads is not set.
📝 WalkthroughWalkthroughAdds two boto3-delegating methods to BatchClient and implements platform-aware platform detection and resource validation in BatchJobBuilder, distinguishing EC2 vs FARGATE resource rules and using the detected platform for job definition registration. Changes
Sequence Diagram(s)sequenceDiagram
participant BJB as BatchJobBuilder
participant BC as BatchClient
participant API as AWS Batch API
participant QC as Job Queue / Compute Environments
rect rgba(100,150,200,0.5)
Note over BJB,QC: Platform Detection
BJB->>BC: describe_job_queues()
BC->>API: describe_job_queues()
API-->>BC: queue details
BC-->>BJB: queue details
BJB->>BC: describe_compute_environments()
BC->>API: describe_compute_environments()
API-->>BC: compute environment details
BC-->>BJB: compute environment details
BJB->>BJB: _get_platform_from_queue() (detect EC2 or FARGATE)
end
rect rgba(150,100,200,0.5)
Note over BJB: Resource Validation
alt Platform == FARGATE
BJB->>BJB: _validate_fargate_resources() (strict vCPU/memory mapping)
else Platform == EC2
BJB->>BJB: _validate_ec2_resources() (basic sanity checks)
end
end
rect rgba(200,150,100,0.5)
Note over BJB,API: Job Definition Registration
BJB->>BC: register_job_definition() (with detected platform)
BC->>API: register_job_definition()
API-->>BC: registration result
BC-->>BJB: success/failure
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@snakemake_executor_plugin_aws_batch/batch_job_builder.py`:
- Around line 119-125: The code assumes there is at least one matching entry in
VALID_RESOURCES_MAPPING for the given vcpu which can cause min([]) to raise
ValueError; change the branch that computes min_mem to first build the list
matches = [m for m, v in VALID_RESOURCES_MAPPING.items() if vcpu in v] and if
matches is empty raise a WorkflowError (include the invalid vcpu value in the
message) instead of calling min(), otherwise compute min_mem = min(matches), log
the warning via self.logger.warning and return the str(vcpu), str(min_mem) as
before; update any callers/tests expecting a WorkflowError when an invalid vCPU
is supplied.
🧹 Nitpick comments (2)
snakemake_executor_plugin_aws_batch/batch_job_builder.py (2)
70-71: Consider handling mixed-platform compute environments.The current implementation only inspects the first compute environment in the queue. If a job queue has multiple compute environments with mixed platforms (e.g., EC2 and FARGATE), this could lead to incorrect platform detection depending on priority order.
This may be acceptable if mixed-platform queues are rare in practice, but consider adding a comment explaining this design decision or logging when multiple compute environments are present.
139-150: Consider accepting integer parameters directly.The method signature takes strings but immediately converts to integers. The caller at line 163 converts ints to strings via
str(vcpu), str(mem), then this method converts back. Accepting integers directly would be cleaner.♻️ Proposed simplification
- def _validate_resources(self, vcpu: str, mem: str) -> tuple[str, str]: + def _validate_resources(self, vcpu: int, mem: int) -> tuple[str, str]: """Validates vcpu and memory based on platform requirements. https://docs.aws.amazon.com/batch/latest/APIReference/API_ResourceRequirement.html """ - vcpu_int = int(vcpu) - mem_int = int(mem) - if self.platform == BATCH_JOB_PLATFORM_CAPABILITIES.FARGATE.value: - return self._validate_fargate_resources(vcpu_int, mem_int) + return self._validate_fargate_resources(vcpu, mem) else: - return self._validate_ec2_resources(vcpu_int, mem_int) + return self._validate_ec2_resources(vcpu, mem)Then update the caller at line 163:
- vcpu_str, mem_str = self._validate_resources(str(vcpu), str(mem)) + vcpu_str, mem_str = self._validate_resources(vcpu, mem)
Avoid ValueError from min() on empty list when vCPU is not valid for any Fargate memory configuration.
|
@jakevc I would appreciate a review (CC: @johanneskoester) |
Summary
self.job.threadsfor vCPU allocation instead of the undocumented_coresresource (falls back to_coresfor backward compatibility)Details
This PR addresses two issues:
threads directive ignored: The plugin was ignoring
self.job.threadsand instead looking for an undocumented_coresresource (defaulting to 1 vCPU). This is now fixed to use the standard snakemakethreadsdirective.Overly restrictive resource validation: The
VALID_RESOURCES_MAPPINGvalidation is only applicable to Fargate (which has strict memory/vCPU ratios), but was being applied to all compute environments including EC2. This PR incorporates the fix from fix: issue with resource validation assuming Fargate (#34) #37 which auto-detects the platform and applies appropriate validation.Test plan
threads: 4and verify the AWS Batch job requests 4 vCPUFixes #34
Incorporates #37
Summary by CodeRabbit
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.