Open
Labels
- C-tracking-issue: An issue to track the progress of multiple PRs or issues
- help wanted: Extra attention is needed
Description
Even though the foundation is set, it needs another push to actually make it work with different kinds of hashes.
Tasks
- remove hash-type specific methods from `git-hash` and replace them with parametric usage of `git_hash::Kind`
- all code assuming hashes of len 20 should receive this value as parameter instead. This is what git does for the old index and pack file formats.
- a way to pass `--object-hash` information to the `gix` CLI
- Remove default `sha1` feature from `gix-hash` crate and deal with the fallout
  - Add forwarding features to every plumbing crate that depends on it directly or indirectly
  - adjust tests to set `sha1` by default
  - configure docs.rs to enable sha1
  - CI should execute script that runs `cargo check -p gix-<name>` to show it's failing with an error message
  - adjust all other jobs
  - let `gix` choose SHA1 as default
- remove SHA1 mention from `git-features` feature toggles
- parameterize hash len when decoding non-blob objects (see this for an example)
- understand and implement pack idx V3.
  - see if git actually implements this, and maybe decide that `gitoxide` won't handle the transition period; it's either one hash or another.
- add new Sha256 enum variant, consider putting it behind a feature flag, and add a hasher for it as well.
- general tests for reading refs and objects of different len
- tests for writing and reading objects of different len, maybe even write a conversion program which transforms an entire repo and double-checks with git-fsck
- when cloning, check the `unimplemented!()` invocation to configure the repo for expecting a different hash
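
To illustrate the first few tasks, here is a minimal sketch of what "parametric usage" of a hash kind could look like. The `Kind` enum below is a stand-in written for this example and is not the actual `git_hash::Kind`; `len_in_bytes` and `parse_tree_entry` are hypothetical names. It shows a tree-entry decoder that receives the hash length through a parameter instead of assuming 20 bytes, plus a `FromStr` impl of the sort a `--object-hash` flag could parse into.

```rust
use std::str::FromStr;

/// Stand-in for a hash-kind enum; the real `git_hash::Kind` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Sha1,
    Sha256,
}

impl Kind {
    /// Length of the raw (binary) hash in bytes.
    pub const fn len_in_bytes(self) -> usize {
        match self {
            Kind::Sha1 => 20,
            Kind::Sha256 => 32,
        }
    }
}

/// Lets a value such as the argument of `--object-hash` be parsed into a `Kind`.
impl FromStr for Kind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "sha1" => Ok(Kind::Sha1),
            "sha256" => Ok(Kind::Sha256),
            other => Err(format!("unknown object hash: {other}")),
        }
    }
}

/// Split one tree entry (`<mode> <name>\0<raw hash>`) off the front of `data`.
/// Returns `(mode, name, raw hash, rest)`, or `None` if the input is malformed.
/// The hash length comes from `kind` rather than a hard-coded 20.
pub fn parse_tree_entry(data: &[u8], kind: Kind) -> Option<(&[u8], &[u8], &[u8], &[u8])> {
    let space = data.iter().position(|b| *b == b' ')?;
    let (mode, rest) = (&data[..space], &data[space + 1..]);
    let nul = rest.iter().position(|b| *b == 0)?;
    let (name, rest) = (&rest[..nul], &rest[nul + 1..]);
    let hash_len = kind.len_in_bytes();
    if rest.len() < hash_len {
        return None;
    }
    let (hash, rest) = rest.split_at(hash_len);
    Some((mode, name, hash, rest))
}
```

With this shape, the same decoder serves both hash kinds, and callers decide once (for example from repository configuration) which `Kind` to pass down.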
Implementation ideas
- make sure once Sha256 is added as ObjectId variant, that it's behind a feature toggle to allow builds that opt-out of SHA256 support to not unnecessarily use more memory than needed. Maybe there are alternatives to this, too.
- One way to do that with approximately zero overhead would be to [make] such functions generic on the object ID, using a trait that has a method to get the type. Then object IDs with a known type return a constant from that method, and object IDs with a runtime dispatched type return the value of that enum.
  - @joshtriplett - taken verbatim as I'd barely be able to improve on it when paraphrasing. In short, have a trait for `oid` or allow efficient conversions to `oid` (it's just a slice, so that should work for specifically sized types as well, especially if these were provided by `git-hash`).
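
The trait idea quoted above could look roughly like the following sketch. All names here (`HashKind`, `Sha1Id`, `Sha256Id`, the local `Kind` and `ObjectId`, `to_hex`) are hypothetical and chosen for illustration; they are not the `git-hash` API. Fixed-size ID types return a constant kind, while the enum-based ID dispatches at runtime.

```rust
/// Stand-in for a hash-kind enum; not the actual `git_hash::Kind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Sha1,
    Sha256,
}

/// Hypothetical trait: every object ID type reports its kind and exposes its bytes.
pub trait HashKind {
    fn kind(&self) -> Kind;
    fn as_bytes(&self) -> &[u8];
}

/// IDs with a statically known kind: `kind()` is a constant the compiler can inline.
pub struct Sha1Id(pub [u8; 20]);
pub struct Sha256Id(pub [u8; 32]);

impl HashKind for Sha1Id {
    fn kind(&self) -> Kind {
        Kind::Sha1
    }
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

impl HashKind for Sha256Id {
    fn kind(&self) -> Kind {
        Kind::Sha256
    }
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

/// ID whose kind is only known at runtime: `kind()` inspects the variant.
pub enum ObjectId {
    Sha1([u8; 20]),
    Sha256([u8; 32]),
}

impl HashKind for ObjectId {
    fn kind(&self) -> Kind {
        match self {
            ObjectId::Sha1(_) => Kind::Sha1,
            ObjectId::Sha256(_) => Kind::Sha256,
        }
    }
    fn as_bytes(&self) -> &[u8] {
        match self {
            ObjectId::Sha1(bytes) => bytes,
            ObjectId::Sha256(bytes) => bytes,
        }
    }
}

/// A function generic over the ID type; it works for all three types above.
pub fn to_hex<I: HashKind>(id: &I) -> String {
    id.as_bytes().iter().map(|b| format!("{b:02x}")).collect()
}
```

Because `to_hex` is monomorphized per ID type, code that only ever handles `Sha1Id` pays nothing for the existence of the other variants, which is the "approximately zero overhead" property the quote describes.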
Notes
- find ways to use the existing highly-parallel pack traversal (along with integration of loose objects) to build an inverse-ref table, so objects can be traversed quickly bottom-up to change the hash used along with all references. This ties into being able to build new packs quickly, ideally even with delta-compression. Deltas have to be re-created for most objects because their content actually changes; only deltas for blobs can be re-used.
- The existing traversal can mutate data in the tree, which is enough to decode the object and keep direct references for later.
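
As a rough illustration of the inverse-ref idea, the sketch below builds a child-to-parents table and rewrites IDs bottom-up, so every object is re-hashed only after all objects it references have their new IDs. It is a serial toy model, not gitoxide code: objects are identified by `u64`, all names are hypothetical, it assumes every referenced object is present in the input, and `DefaultHasher` merely stands in for the new hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, VecDeque};
use std::hash::{Hash, Hasher};

/// Toy object: an id, an opaque payload, and the ids of the objects it references.
pub struct Object {
    pub id: u64,
    pub payload: Vec<u8>,
    pub refs: Vec<u64>,
}

/// Map each referenced id to the ids of the objects that reference it.
pub fn inverse_refs(objects: &[Object]) -> HashMap<u64, Vec<u64>> {
    let mut inverse: HashMap<u64, Vec<u64>> = HashMap::new();
    for obj in objects {
        for child in &obj.refs {
            inverse.entry(*child).or_default().push(obj.id);
        }
    }
    inverse
}

/// Placeholder for the new hash (NOT SHA-256): covers the payload and the
/// already rewritten ids of the children.
fn new_id(payload: &[u8], children: &[u64]) -> u64 {
    let mut hasher = DefaultHasher::new();
    payload.hash(&mut hasher);
    children.hash(&mut hasher);
    hasher.finish()
}

/// Rewrite ids bottom-up and return a map from old id to new id.
/// Leaves go first; a parent is queued once all of its references are rewritten.
pub fn rewrite(objects: &[Object]) -> HashMap<u64, u64> {
    let by_id: HashMap<u64, &Object> = objects.iter().map(|o| (o.id, o)).collect();
    let inverse = inverse_refs(objects);
    // Number of references of each object that still await their new id.
    let mut pending: HashMap<u64, usize> =
        objects.iter().map(|o| (o.id, o.refs.len())).collect();
    let mut queue: VecDeque<u64> = objects
        .iter()
        .filter(|o| o.refs.is_empty())
        .map(|o| o.id)
        .collect();
    let mut mapping: HashMap<u64, u64> = HashMap::new();

    while let Some(id) = queue.pop_front() {
        let obj = by_id[&id];
        let children: Vec<u64> = obj.refs.iter().map(|c| mapping[c]).collect();
        mapping.insert(id, new_id(&obj.payload, &children));
        for parent in inverse.get(&id).into_iter().flatten() {
            let left = pending.get_mut(parent).expect("parent is a known object");
            *left -= 1;
            if *left == 0 {
                queue.push_back(*parent);
            }
        }
    }
    mapping
}
```

In this model a change to a leaf propagates to every ancestor's new ID, which mirrors why trees and commits must be rewritten after the blobs they point to.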
Related