Add ability to control path inside PEX file#42
Add ability to control path inside PEX file#42mouadino wants to merge 1 commit intobenley:masterfrom
Conversation
1e0b0e7 to
3650d94
Compare
3650d94 to
f3a6b64
Compare
|
This is great! I actually cooked up a similar hack that I called "pythonpath" for our own Python monorepo. I'd be happy to post my change for comparison, but this looks very similar! I'd love to help get something merged upstream! |
| dpath = dpath[3:] | ||
|
|
||
| for prefix in strip_pex_path_prefixes: | ||
| if dpath.startswith(prefix): |
There was a problem hiding this comment.
I think you have the same bug here that I had: a path can match two prefixes.
Consider strip_prefixes set to: ['aaa', 'bbb']
If the files are the following:
aaa/bbb/foo(expect to strip aaa and producebbb/foo)bbb/aaa/bar(expect to strip bbb and produceaaa/bar)
However, depending on the order the prefixes are applied, you will probably get one of foo or bar as an output, since it strips both.
To fix it, in my version I strip the LONGEST prefix match, which is a bit annoying but makes in unambiguous, and works when there are "overlapping" strip prefixes (e.g. strip root1 and root1/foo/bar/subroot)
There was a problem hiding this comment.
I am unsure if it's less unambiguous then stripping by the first matched prefix which is what this patch does.
There was a problem hiding this comment.
Ah crap I missed the break on the next line. Silly me, sorry.
Which one will be first though? It will depend on the order deps are specified in BUILD files. I guess that could actually be useful in some cases.
At any rate: Not matching more than one prefix is really the important part.
There was a problem hiding this comment.
Not deps but first in the strip_pex_path_prefixes list, the stripping afterall apply to all files in runfiles which can be either deps, srcs or even data attribute, if I recall well.
There was a problem hiding this comment.
Ah! I see: This adds the attribute to pex_binary. I see! I added it to pex_library since we wanted to have some pex_library make the python package whatever, no matter where it lives in the Bazel workspace.
Anyway: Great minds think alike! I agree that something like this should end up as part of these rules.
There was a problem hiding this comment.
What would be the benefit of adding this to pex_library versus pex_binary?
There was a problem hiding this comment.
It might be useful to have it apply to data from pex_library rules; otherwise you would have to manually propagate that information along to the final pex_binary in some situations.
There was a problem hiding this comment.
Hmm... Now that you mention it: I'm pretty sure my version of this does not actually consider the data attribute, and only applies it to the srcs attribute. I think the internal code where I'm using this doesn't use any data files.
There was a problem hiding this comment.
So I've been playing with this and I definitely think that applying it to the pex_library (both srcs and data) makes the most sense as it removes the need for duplicating code. In my case I have two binaries that make use of the same library and I'd have to strip the same prefix in both binaries. In Buck this is called base_module maybe we can adopt the same name (see: https://buckbuild.com/rule/python_library.html#base_module).
There was a problem hiding this comment.
A counter-argument will be that you wouldn’t include test data in the
pex_library? B/c if I understood correctly having this attribute in
pex_library only requires that you will need to define anything that
eligible to striping in the pex_library rule, correct?
On the other hand, having it in pex_binary and pex_test have direct effect
on the resulting Pex since this later are the rules that create them, and
yes, I agree that interface wise it's not DRY (and personally I don't like it too :)) however IMHO it gives users more control.
First off, thanks for this project it's have been helpful a lot.
Background
I am working on a project to migrate our monorepo from Make to Bazel where the repo layout is:
However with the default behaviour of this extension, PEX files generated end up packed with a path that requires us to change all import statements in our code which is not ideal, since it's a lot of changes and we want to allow smooth migration to Bazel with as few impact on the existent code as possible, hence this changeset which allows having a slight control on the path inside the PEX files, it's meant to be used as follow:
Example
Please advise if there is a better way of doing this.