diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index d714ddf..b71cab9 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -32,6 +32,84 @@ cargo version
 
 To keep code style consistent, run `cargo x lint --fix` to automatically fix any style issues before committing your changes.
 
+## Build and Test
+
+We recommend using `cargo x` as a single entrypoint (provided by the workspace `xtask` crate). This repo defines the `cargo x` alias in `.cargo/config.toml`, which maps to `cargo run --package x -- ...`.
+
+Build:
+
+```shell
+cargo build --workspace
+```
+
+Test:
+
+```shell
+cargo x test
+# or
+cargo test --workspace --no-default-features
+```
+
+Lint:
+
+```shell
+cargo x lint
+```
+
+## Manual workflow (without xtask)
+
+`cargo x lint` runs the following steps. Use these directly when you need more control or want to isolate failures:
+
+```shell
+cargo +nightly clippy --tests --all-features --all-targets --workspace -- -D warnings
+cargo +nightly fmt --all --check
+taplo format --check
+typos
+hawkeye check
+```
+
+Automatic fix commands:
+
+```shell
+cargo +nightly clippy --tests --all-features --all-targets --workspace --allow-staged --allow-dirty --fix
+cargo +nightly fmt --all
+taplo format
+hawkeye format --fail-if-updated=false
+```
+
+Install the extra tools with:
+
+```shell
+cargo install taplo-cli typos-cli hawkeye
+```
+
+## Serialization snapshots and test data generation
+
+Some tests depend on snapshot files under `datasketches/tests/serialization_test_data`. If they are missing, tests will fail. Regenerate them with:
+
+```shell
+python3 ./tools/generate_serialization_test_data.py --all
+```
+
+The script pulls `datasketches-java` and `datasketches-cpp` and writes files to:
+
+- `datasketches/tests/serialization_test_data/java_generated_files`
+- `datasketches/tests/serialization_test_data/cpp_generated_files`
+
+You can generate them separately:
+
+```shell
+python3 ./tools/generate_serialization_test_data.py --java
+python3 ./tools/generate_serialization_test_data.py --cpp
+```
+
+The script requires these commands on PATH (and network access):
+
+- Java data: `git`, `java`, `mvn`
+- C++ data: `git`, `cmake`, `ctest`
+
+The current `datasketches-java` generation flow requires JDK >= 25 and Maven >= 3.9.11, otherwise Maven Enforcer will fail.
+
 ## Code of Conduct
 
 We expect all community members to follow our [Code of Conduct](https://www.apache.org/foundation/policies/conduct.html).