Implement gazelle_compilation_test Bazel macro #188

Draft: mateusz-olczyk wants to merge 38 commits into EngFlow:main from mateusz-olczyk:testdata-compilation

Conversation

@mateusz-olczyk (Collaborator) commented Jan 23, 2026

A solution proposal for #189

gazelle_compilation_test() checks that the test data for gazelle_generation_test() makes sense, i.e., that the BUILD files generated by Gazelle allow the workspace to be successfully built by Bazel.

I've used the rules_bazel_integration_test module, which provides exactly what we need for this purpose. I also see further potential here, e.g., wrapping the .github/workflows/test_e2e.sh script as a Bazel target.

Currently, only language/cc/testdata/virtual_include_paths is covered by gazelle_compilation_test(), for demonstration purposes. If you like it, I'll cover the remaining test scenarios in a separate PR.
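For illustration, here's a rough sketch of how the macro could be invoked from a BUILD file; the load path and attribute names are my assumptions, not necessarily the PR's actual API:

```starlark
# Hypothetical usage; load path and attribute names are assumptions.
load("//bzl:gazelle_compilation_test.bzl", "gazelle_compilation_test")

gazelle_compilation_test(
    name = "virtual_include_paths_compilation_test",
    # All files of one generation-test scenario; its BUILD.out files are
    # renamed to BUILD.bazel in a scratch workspace, which is then built
    # with `bazel build //...` via rules_bazel_integration_test.
    workspace_files = glob(["testdata/virtual_include_paths/**"]),
)
```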

Signed-off-by: Mateusz Olczyk <molczyk@virtuslab.com>
@jayconrod (Collaborator) left a comment


Could you add the remaining tests and see how long it takes for the tests to run? It looks like the one test case took 20s, but I'm curious what the total is.

This change looks fine, and the added coverage from actually building the tests would be nice, BUT bazel_integration_test is extremely slow, so I'm very hesitant to adopt it. It necessarily creates a separate workspace directory and runs Bazel there, which means each workspace shares nothing cached locally. Each run needs to fetch and recompile everything. The tests also have tags = ["manual", "exclusive"], so they don't run automatically, and only one test can run at a time.

bazel_integration_test is only really necessary when you want to either run a different version of Bazel or verify a failure (actually I'm not sure it can do that).

For us, a more idealized solution might be a test-only module extension that creates a repo for each test directory and runs Gazelle there. Our tests could then check the generated build files and build the targets in each repo.

(If we choose to invest the effort in that, I think it would be better to provide it through the Gazelle repo to other extensions too, since it wouldn't be specific to gazelle_cc.)
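To make that idea concrete, a minimal sketch of such an extension might look like the following; every name here is hypothetical, and a repository rule notably cannot build Gazelle itself, so the binary would have to be prebuilt or fetched:

```starlark
def _scenario_repo_impl(repository_ctx):
    # Placeholder: the real version would copy the scenario's files into
    # this repo and run a prebuilt Gazelle binary here, exposing the
    # generated packages as buildable targets.
    repository_ctx.file("BUILD.bazel", "# populated by Gazelle in the real version")

_scenario_repo = repository_rule(
    implementation = _scenario_repo_impl,
    attrs = {"dir": attr.string(doc = "workspace-relative scenario directory")},
)

def _gazelle_test_repos_impl(module_ctx):
    # One repo per test directory; tests would then build @gazelle_cc_test_*//...
    for scenario in ["virtual_include_paths", "subdirectory_basic"]:
        _scenario_repo(
            name = "gazelle_cc_test_" + scenario,
            dir = "language/cc/testdata/" + scenario,
        )

gazelle_test_repos = module_extension(implementation = _gazelle_test_repos_impl)
```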

Signed-off-by: Mateusz Olczyk <molczyk@virtuslab.com>
@mateusz-olczyk (Collaborator, Author)

> Could you add the remaining tests and see how long it takes for the tests to run? It looks like the one test case took 20s, but I'm curious what the total is.

I've added more tests. Not all of the test scenarios can be included, however: some contain expected errors and can't compile by design, and some contain subtle errors that I was unable to fix at this point. Even so, I'm satisfied with the improved coverage.

> This change looks fine, and the added coverage from actually building the tests would be nice, BUT bazel_integration_test is extremely slow, so I'm very hesitant to adopt it. It necessarily creates a separate workspace directory and runs Bazel there, which means each workspace shares nothing cached locally. Each run needs to fetch and recompile everything.

You're absolutely right. The "inner" Bazel server knows nothing about the "outer" Bazel server. Even small changes in the test scenarios invalidate their corresponding cache entries. On the other hand, they don't change that often; once they succeed, the developer has their results cached for a long time.

> The tests also have tags = ["manual", "exclusive"], so they don't run automatically, and only one test can run at a time.

I've removed the "exclusive" tag, so they can run in parallel. Running so many parallel Bazel servers doesn't seem very efficient, but let's hope they won't crash the GitHub CI workers.

> bazel_integration_test is only really necessary when you want to either run a different version of Bazel or verify a failure (actually I'm not sure it can do that).

> For us, a more idealized solution might be a test-only module extension that creates a repo for each test directory and runs Gazelle there. Our tests could then check the generated build files and build the targets in each repo.

> (If we choose to invest the effort in that, I think it would be better to provide it through the Gazelle repo to other extensions too, since it wouldn't be specific to gazelle_cc.)

Yes. In an ideal world, I guess we'd have a kind of simplified version of the go_deps & go_repository extensions, so the tests would run within the same Bazel server, sharing resources such as the already-built gazelle_cc binary, without building everything from scratch.

The current solution is, of course, not perfect and can be improved later; I'd call it the initial implementation. But we gain better coverage, which is very valuable. Currently, when I design a new test scenario, I check that it builds in a temporary source tree on the side, but eventually I leave only the gazelle_generation_test, trusting that the compilation stays correct. I'm afraid of regressions, though, especially for the more complicated scenarios.

Signed-off-by: Mateusz Olczyk <molczyk@virtuslab.com>
@jayconrod (Collaborator)

> Even small changes in the test scenarios invalidate their corresponding cache entries. On the other hand, they don't change that often; once they succeed, the developer has their results cached for a long time.

gazelle_cc itself is an input for the tests. All of the integration tests need to be re-run whenever anything in gazelle_cc changes. It's important that these tests run very quickly.

> I've removed the "exclusive" tag, so they can run in parallel. Running so many parallel Bazel servers doesn't seem very efficient, but let's hope they won't crash the GitHub CI workers.

It looks like this has timed out in several configurations after 15 minutes.

I would like to be able to run our CI in 1-2 minutes. From where we are now, several other changes are needed to make that possible (remote caching at the very least), but this change would make that goal impossible. Running Bazel multiple times is just too slow and expensive.

Let's run compilation on one test only. Let's try to make that test representative of what gazelle_cc can do, covering all the generated rules, but don't worry about covering every edge case. This will basically be a smoke test, to make sure we're generating rules that work with the right load statements.

@mateusz-olczyk mateusz-olczyk marked this pull request as draft January 27, 2026 09:44
This reverts commit d59780b.

Signed-off-by: Mateusz Olczyk <molczyk@virtuslab.com>
@mateusz-olczyk (Collaborator, Author)

> gazelle_cc itself is an input for the tests. All of the integration tests need to be re-run whenever anything in gazelle_cc changes. It's important that these tests run very quickly.

No, my gazelle_compilation_test macro does not depend on the Gazelle binary. It only renames BUILD.out to BUILD.bazel in a generated workspace and runs bazel build there, so it needs to rerun only when the test scenarios change, not when gazelle_cc does. However, changes in gazelle_cc usually affect the test scenarios, so you're effectively right.
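In other words, the only transformation applied to a scenario is a path rename. This is my reconstruction of that step, not the PR's actual code:

```starlark
# Sketch: map each scenario file path into the inner workspace, turning
# the expected Gazelle output into the workspace's real build file.
def _inner_workspace_path(path):
    if path == "BUILD.out" or path.endswith("/BUILD.out"):
        return path[:-len("BUILD.out")] + "BUILD.bazel"
    return path
```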

> I would like to be able to run our CI in 1-2 minutes. From where we are now, several other changes are needed to make that possible (remote caching at the very least), but this change would make that goal impossible. Running Bazel multiple times is just too slow and expensive.

> Let's run compilation on one test only. Let's try to make that test representative of what gazelle_cc can do, covering all the generated rules, but don't worry about covering every edge case. This will basically be a smoke test, to make sure we're generating rules that work with the right load statements.

Yes, it turned out (unsurprisingly) that spawning multiple inner Bazel servers is a horrible idea. Running only one, however, is not that bad. So I prepared //language/cc/integration_test, which contains a compilable subset of the original unit tests from //language/cc/testdata, treating all the scenarios as subpackages of a single workspace. Thanks to this, only one inner Bazel process needs to be spawned. It required additional work, since all absolute labels and C++ include paths in the test scenarios had to be prefixed with those subpackage paths, but in general it works.
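To illustrate the rewriting (the scenario and target names below are examples, not the actual test files):

```starlark
# In the standalone scenario workspace, a target referred to
#   deps = ["//utils:strings"]   and   #include "utils/strings.h"
# As a subpackage of the single integration workspace, both gain the
# scenario's path as a prefix:
cc_library(
    name = "app",
    srcs = ["app.cc"],  # its #include becomes "virtual_include_paths/utils/strings.h"
    deps = ["//virtual_include_paths/utils:strings"],
)
```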

I didn't have time to convert all the test scenarios; here is the list of tests included so far:

  • keep_deps
  • keep-assigned-groups
  • package_by_unit
  • relative_include_paths
  • rules_cleanup
  • rules_with_no_sources
  • subdirectory_basic
  • subdirectory_invalid_glob
  • subdirectory_match
  • subdirectory_match_conflict
  • subdirectory_with_build_file
  • virtual_include_paths

I was also experimenting with a hack: compiling the test scenarios directly in the root repository, so we could drop @rules_bazel_integration_test and call gazelle_generation_test with build_out_suffix = ".bazel" to use plain BUILD.bazel files. This is not so easy, though.

The macro implementation expects MODULE.bazel to be passed, so it knows where the workspace root is. Normally build_out_suffix = ".out", so MODULE.bazel is excluded from the BUILD.in <-> BUILD.out comparison. With build_out_suffix = ".bazel" something weird happens, and the macro tries to apply the same comparison logic to MODULE.bazel itself. Some changes are probably needed to make build_out_suffix = ".bazel" a well-defined option.
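For reference, the hack would amount to something like this, assuming gazelle_generation_test's build_in_suffix/build_out_suffix parameters from bazel-gazelle behave as hoped (they currently don't, per the MODULE.bazel issue above); the binary label is an example:

```starlark
load("@bazel_gazelle//:def.bzl", "gazelle_generation_test")

gazelle_generation_test(
    name = "virtual_include_paths_generation_test",
    gazelle_binary = "//:gazelle_cc_binary",  # example label
    build_in_suffix = ".in",
    # Compare against checked-in BUILD.bazel files, which the root
    # workspace could then also build directly.
    build_out_suffix = ".bazel",
    test_data = glob(["testdata/virtual_include_paths/**"]),
)
```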

@mateusz-olczyk mateusz-olczyk marked this pull request as ready for review January 27, 2026 17:36
@jayconrod (Collaborator)

> No, my gazelle_compilation_test macro does not depend on the Gazelle binary. It only renames BUILD.out to BUILD.bazel in a generated workspace and runs bazel build there, so it needs to rerun only when the test scenarios change, not when gazelle_cc does. However, changes in gazelle_cc usually affect the test scenarios, so you're effectively right.

I see, that's a good way to do it. I was thinking the test built Gazelle, ran it, then tried to build all the targets.

> Yes, it turned out (unsurprisingly) that spawning multiple inner Bazel servers is a horrible idea. Running only one, however, is not that bad. So I prepared //language/cc/integration_test, which contains a compilable subset of the original unit tests from //language/cc/testdata, treating all the scenarios as subpackages of a single workspace. Thanks to this, only one inner Bazel process needs to be spawned. It required additional work, since all absolute labels and C++ include paths in the test scenarios had to be prefixed with those subpackage paths, but in general it works.

This looks good to me. The main cost is the number of inner Bazel workspaces, and this approach keeps that down while giving us pretty good coverage.

> I was also experimenting with a hack: compiling the test scenarios directly in the root repository, so we could drop @rules_bazel_integration_test and call gazelle_generation_test with build_out_suffix = ".bazel" to use plain BUILD.bazel files. This is not so easy, though.

> The macro implementation expects MODULE.bazel to be passed, so it knows where the workspace root is. Normally build_out_suffix = ".out", so MODULE.bazel is excluded from the BUILD.in <-> BUILD.out comparison. With build_out_suffix = ".bazel" something weird happens, and the macro tries to apply the same comparison logic to MODULE.bazel itself. Some changes are probably needed to make build_out_suffix = ".bazel" a well-defined option.

I feel like a better approach might be possible with a repository rule or module extension, but I'm struggling to come up with a concrete suggestion that would work. A repository rule could at least create a repo, copy files from a directory, and rename BUILD.out to BUILD. Then we could build all the targets within that repo without requiring a separate Bazel workspace.
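A minimal sketch of that repository rule, assuming repository_ctx.workspace_root (Bazel 7+) and a flat scenario directory; all names are hypothetical:

```starlark
def _testdata_repo_impl(repository_ctx):
    src = repository_ctx.workspace_root.get_child(repository_ctx.attr.dir)
    for f in src.readdir():  # non-recursive, for brevity
        name = f.basename
        if name == "BUILD.out":
            name = "BUILD"  # the expected output becomes the repo's build file
        repository_ctx.symlink(f, name)

testdata_repo = repository_rule(
    implementation = _testdata_repo_impl,
    attrs = {"dir": attr.string(doc = "workspace-relative test directory")},
)
```

Tests could then build targets like @testdata_virtual_include_paths//... directly, within the same Bazel server.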

@mateusz-olczyk mateusz-olczyk marked this pull request as draft February 2, 2026 11:14