refactor!: config parsing #292
Merged
41 commits
c2e64f7 feat: begin config refactor (tristan-f-r)
4d1a19c feat: mostly structured config (tristan-f-r)
b56ecde feat: add enum variants on ml (tristan-f-r)
bf95888 fix: some defaults (tristan-f-r)
51d6a7b feat: fully finish config parsing (tristan-f-r)
a27d38d style: fmt (tristan-f-r)
dd4674a fix: remove dep mark, use strict is None (tristan-f-r)
5a8826d chore: correct config loc (tristan-f-r)
a47b0df fix: specify hac params (tristan-f-r)
d721656 Merge branch 'umain' into config-pydantic (tristan-f-r)
8d75604 fix: expand class params (tristan-f-r)
afa1de5 fix: expand on pca_params (tristan-f-r)
5eefc51 fix: drop include dict (tristan-f-r)
5243186 fix: call items (tristan-f-r)
3b20c48 fix: better typing and deafults (tristan-f-r)
4c1fcb6 Merge branch 'umain' into config-pydantic (tristan-f-r)
2d4a90f style: fmt (tristan-f-r)
1c55925 Merge branch 'umain' into config-pydantic (tristan-f-r)
2a4fb2e refactor: add config forbid (tristan-f-r)
4ded57e refactor: update config imports (tristan-f-r)
22b5686 refactor: better names to schema files (tristan-f-r)
ea59e4c fix: no default include, mention model_config allow reason (tristan-f-r)
fa7d7c9 fix(config): case-insensitive check on labels (tristan-f-r)
52eab21 refactor: merge config (tristan-f-r)
3c305f4 docs: use concepts link (tristan-f-r)
49e50a0 refactor: mv util_enum -> util (tristan-f-r)
cb28f61 docs: correct util_enum path (tristan-f-r)
2c938ed fix: add spras.config to pyproject (tristan-f-r)
cdbaf41 Merge branch 'umain' into config-pydantic (tristan-f-r)
ebcf6b0 docs: mention `args` in contributing (tristan-f-r)
4733b95 Revert "docs: mention `args` in contributing" (tristan-f-r)
fa51d79 docs: document some pydantic choices (tristan-f-r)
d07d2af Merge branch 'umain' into config-pydantic (tristan-f-r)
1bef8c7 fix: add defaults for kde and remove_empty_pathways (tristan-f-r)
593206c style: fmt (tristan-f-r)
c465c9c test: update rn import (tristan-f-r)
39faf41 Merge branch 'umain' into config-pydantic (tristan-f-r)
2443735 style: typos (tristan-f-r)
18a173f docs: grammar (tristan-f-r)
5225616 docs: use nicer alphanumeric explanation (tristan-f-r)
e82eddf Merge branch 'umain' into config-pydantic (tristan-f-r)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,168 @@ | ||
| """ | ||
| Contains the raw pydantic schema for the configuration file. | ||
|
|
||
| Using Pydantic as our backing config parser allows us to declaratively | ||
| type our config, giving us more robust user errors with guarantees | ||
| that parts of the config exist after parsing it through Pydantic. | ||
|
|
||
| We declare models using two classes here: | ||
| - `BaseModel` (docs: https://docs.pydantic.dev/latest/concepts/models/) | ||
| - `CaseInsensitiveEnum` (see ./util.py) | ||
| """ | ||
|
|
||
| import re | ||
| from typing import Annotated, Optional | ||
|
|
||
| from pydantic import AfterValidator, BaseModel, ConfigDict, Field | ||
|
|
||
| from spras.config.util import CaseInsensitiveEnum | ||
|
|
||
| # Most options here have an `include` property, | ||
| # which is meant to make disabling parts of the configuration easier. | ||
| # When an option does not have a default, it means that it must be set by the user. | ||
|
|
||
| class SummaryAnalysis(BaseModel): | ||
| include: bool | ||
|
|
||
| # We never allow extra keys, so that | ||
| # user typos in key names are caught. | ||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class CytoscapeAnalysis(BaseModel): | ||
| include: bool | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| # Note that CaseInsensitiveEnum is not a pydantic model: pydantic | ||
| # has special support for enums, but because enums are not models we | ||
| # avoid the pydantic-specific "model_config" key here. | ||
| class MlLinkage(CaseInsensitiveEnum): | ||
| ward = 'ward' | ||
| complete = 'complete' | ||
| average = 'average' | ||
| single = 'single' | ||
|
|
||
| class MlMetric(CaseInsensitiveEnum): | ||
| euclidean = 'euclidean' | ||
| manhattan = 'manhattan' | ||
| cosine = 'cosine' | ||
|
|
||
| class MlAnalysis(BaseModel): | ||
| include: bool | ||
| aggregate_per_algorithm: bool = False | ||
| components: int = 2 | ||
| labels: bool = True | ||
| kde: bool = False | ||
| remove_empty_pathways: bool = False | ||
| linkage: MlLinkage = MlLinkage.ward | ||
| metric: MlMetric = MlMetric.euclidean | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class EvaluationAnalysis(BaseModel): | ||
| include: bool | ||
| aggregate_per_algorithm: bool = False | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class Analysis(BaseModel): | ||
| summary: SummaryAnalysis = SummaryAnalysis(include=False) | ||
| cytoscape: CytoscapeAnalysis = CytoscapeAnalysis(include=False) | ||
| ml: MlAnalysis = MlAnalysis(include=False) | ||
| evaluation: EvaluationAnalysis = EvaluationAnalysis(include=False) | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
|
|
||
| # The default length of the truncated hash used to identify parameter combinations | ||
| DEFAULT_HASH_LENGTH = 7 | ||
|
|
||
| def label_validator(name: str): | ||
| """ | ||
| Returns a validator that takes in a label | ||
| and ensures that it contains only letters, numbers, or underscores. | ||
| """ | ||
| label_pattern = r'^\w+$' | ||
| def validate(label: str): | ||
| if not bool(re.match(label_pattern, label)): | ||
| raise ValueError(f"{name} label '{label}' contains invalid values. {name} labels can only contain letters, numbers, or underscores.") | ||
| return label | ||
| return validate | ||
|
|
||
| class ContainerFramework(CaseInsensitiveEnum): | ||
| docker = 'docker' | ||
| # TODO: add apptainer variant once #260 gets merged | ||
| singularity = 'singularity' | ||
| dsub = 'dsub' | ||
|
|
||
| class ContainerRegistry(BaseModel): | ||
| base_url: str | ||
| owner: str = Field(description="The owner or project of the registry") | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class AlgorithmParams(BaseModel): | ||
| include: bool | ||
| directed: Optional[bool] = None | ||
|
|
||
| # TODO: use array of runs instead. We currently rely on the | ||
| # extra parameters here to extract the algorithm parameter information, | ||
| # which is why this deviates from the usual ConfigDict(extra='forbid'). | ||
| model_config = ConfigDict(extra='allow') | ||
|
|
||
| class Algorithm(BaseModel): | ||
| name: str | ||
| params: AlgorithmParams | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class Dataset(BaseModel): | ||
| # We prefer AfterValidator here to allow pydantic to run its own | ||
| # validation & coercion logic before we check it against our own | ||
| # requirements | ||
| label: Annotated[str, AfterValidator(label_validator("Dataset"))] | ||
| node_files: list[str] | ||
| edge_files: list[str] | ||
| other_files: list[str] | ||
| data_dir: str | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class GoldStandard(BaseModel): | ||
| label: Annotated[str, AfterValidator(label_validator("Gold Standard"))] | ||
| node_files: list[str] | ||
| data_dir: str | ||
| dataset_labels: list[str] | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class Locations(BaseModel): | ||
| reconstruction_dir: str | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| # NOTE: This setting doesn't have any uses past setting the output_dir as of now. | ||
| class ReconstructionSettings(BaseModel): | ||
| locations: Locations | ||
|
|
||
| model_config = ConfigDict(extra='forbid') | ||
|
|
||
| class RawConfig(BaseModel): | ||
| # TODO: move these container values to a nested container key | ||
| container_framework: ContainerFramework = ContainerFramework.docker | ||
| unpack_singularity: bool = False | ||
| container_registry: ContainerRegistry | ||
|
|
||
| hash_length: int = DEFAULT_HASH_LENGTH | ||
| "The length of the hash used to identify a parameter combination" | ||
|
|
||
| algorithms: list[Algorithm] | ||
| datasets: list[Dataset] | ||
| gold_standards: list[GoldStandard] = [] | ||
| analysis: Analysis = Analysis() | ||
|
|
||
| reconstruction_settings: ReconstructionSettings | ||
|
|
||
| # We include use_attribute_docstrings here to preserve the docstrings | ||
| # after attributes at runtime (for future JSON schema generation) | ||
| model_config = ConfigDict(extra='forbid', use_attribute_docstrings=True) | ||
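
The `label_validator` in the diff above checks labels with `re.match` against `^\w+$`. The following is a minimal standard-library sketch of what that pattern accepts. The helper names `is_valid_label` and `is_strict_label` are hypothetical and not part of this PR, and the stricter variant is only an illustration of an alternative, not something the PR does.

```python
import re

# Same pattern string as label_validator in the diff above.
LABEL_PATTERN = r'^\w+$'


def is_valid_label(label: str) -> bool:
    """Mirror the check inside label_validator (hypothetical helper)."""
    return bool(re.match(LABEL_PATTERN, label))


def is_strict_label(label: str) -> bool:
    """Stricter variant (an assumption, not the PR's behavior):
    only ASCII letters, digits, and underscores, with no trailing newline."""
    return re.fullmatch(r'\w+', label, flags=re.ASCII) is not None


print(is_valid_label("data0"))         # True
print(is_valid_label("my-dataset"))    # False: hyphen is not matched by \w
print(is_valid_label("caf\u00e9"))     # True: \w is Unicode-aware for str patterns
print(is_valid_label("data0\n"))       # True: $ also matches before a trailing newline
print(is_strict_label("caf\u00e9"))    # False
print(is_strict_label("data0\n"))      # False
```

Whether the looser behavior matters depends on how labels are later used (for example in file names); the sketch only shows where the two checks differ.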
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| from enum import Enum | ||
| from typing import Any | ||
|
|
||
|
|
||
| # https://stackoverflow.com/a/76883868/7589775 | ||
| class CaseInsensitiveEnum(str, Enum): | ||
| """ | ||
| We prefer this over Enum to make sure the config parsing | ||
| is more relaxed when it comes to string enum values. | ||
| """ | ||
| @classmethod | ||
| def _missing_(cls, value: Any): | ||
| if isinstance(value, str): | ||
| value = value.lower() | ||
|
|
||
| for member in cls: | ||
| if member.lower() == value: | ||
| return member | ||
| return None | ||
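
To illustrate how `CaseInsensitiveEnum` behaves on lookup, here is a small self-contained sketch. The class body is copied from the diff above, with indentation assumed since the rendered diff does not preserve it, and `MlLinkage` is the enum from the schema file. Only the standard library is used.

```python
from enum import Enum
from typing import Any


class CaseInsensitiveEnum(str, Enum):
    @classmethod
    def _missing_(cls, value: Any):
        # Called by Enum only when the normal by-value lookup fails.
        if isinstance(value, str):
            value = value.lower()

            for member in cls:
                # Members are also str instances, so .lower() works on them.
                if member.lower() == value:
                    return member
        # Returning None makes Enum raise its usual ValueError.
        return None


class MlLinkage(CaseInsensitiveEnum):
    ward = 'ward'
    complete = 'complete'
    average = 'average'
    single = 'single'


print(MlLinkage("WARD") is MlLinkage.ward)          # True
print(MlLinkage("Complete") is MlLinkage.complete)  # True
print(MlLinkage.ward == "ward")                     # True, because of the str mixin

try:
    MlLinkage("median")
except ValueError as err:
    print(err)  # 'median' is not a valid MlLinkage
```

Unknown values still fail with a `ValueError`, so a config that names an unsupported linkage is rejected rather than silently accepted.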