I recently experienced some CI problems for several of my projects because of an (IMO unnecessarily tight) version specification of XML-XPATH’s dependency on XMLParser: `github://pharo-contributions/XML-XMLParser:v3.6.x/src`. At minimum, it seems the dependency should be on the major version (e.g. v3), not the minor (unless I’m missing a needed feature that was added in 3.6).
More importantly though, it raises questions about what the best practice is here. In summary, it seems to me that in general it’s better to only pin versions in reaction to CI failures because pinning has significant downsides and uncertainties. Defensive pinning smells like premature optimization IMO.
In more detail, partially pinning (to major or minor version) by default doesn't seem right to me for several reasons unless we're talking about a tagged release that should be reproducible because:
We have limited community resources to manage these pinnings, which create cascading conflict problems with all other dependent projects.
Pinnings seem somewhat arbitrary because community maintainers are often not familiar enough with both projects, or don’t have enough time, to do a thorough investigation. How does the maintainer know whether a particular dependency version really “works” with the project? Unless they are intimately familiar with both projects (and even then I wouldn’t have confidence in a manual review), the best way I can think of is to rely on passing CI, and in that case…
It seems easier to just react to CI failures; defensive pinning smells like premature optimization IMO
Partial pinning (minor or major version) will not lead to reproducible builds because the patch is floating
What does reproducibility even mean from an untagged baseline?
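To make the "floating patch" point concrete, here is a minimal Python sketch. It assumes a simplified model in which a ref such as `v3.6.x` stands for "the newest 3.6 patch available"; the tag lists and the `matches`/`resolve` helpers are hypothetical and are not part of Metacello.

```python
def matches(pattern, tag):
    """'v3.6.x' matches 'v3.6.0', 'v3.6.1', ...; an 'x' component is a wildcard."""
    p_parts = pattern.lstrip("v").split(".")
    t_parts = tag.lstrip("v").split(".")
    return len(p_parts) == len(t_parts) and all(
        p == "x" or p == t for p, t in zip(p_parts, t_parts)
    )


def resolve(pattern, tags):
    """Return the highest tag matching the pattern, or None if nothing matches."""
    candidates = [t for t in tags if matches(pattern, t)]
    return max(
        candidates,
        key=lambda t: tuple(int(n) for n in t.lstrip("v").split(".")),
        default=None,
    )


# Hypothetical release history: a new patch is tagged between two builds.
tags_first_build = ["v3.5.9", "v3.6.0", "v3.6.1"]
tags_second_build = tags_first_build + ["v3.6.2"]

first = resolve("v3.6.x", tags_first_build)
second = resolve("v3.6.x", tags_second_build)
pinned = resolve("v3.6.1", tags_second_build)
```

Under these assumptions the same `v3.6.x` declaration yields different code for the two builds, while only the full `v3.6.1` pin stays put.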
Looking through a bunch of repos, it doesn’t seem that there is consensus either way. Some specify baselines and some specific versions.
p.s. for tracking the major, I like ba-st’s naming convention of v{integer} instead of adding “.x”’s of unclear value
Anyone? It would be good to have some consensus, at least for community-contribution projects…
Hi Sean,
This is my take on this issue; do not take it as ground truth, it is far from that :)
TL;DR; forcing everybody to use major versions is not a solution: why do we have minor versions if we can only use major versions?
On 4 Aug 2024, at 5:22 a.m., sean@clipperadams.com wrote:
I recently experienced some CI problems for several of my projects because of an (IMO unnecessarily tight) version specification of XML-XPATH’s dependency on XMLParser: `github://pharo-contributions/XML-XMLParser:v3.6.x/src`. At minimum, it seems the dependency should be on the major version (e.g. v3), not the minor (unless I’m missing a needed feature that was added in 3.6).
I’ll start with a fact here: the developer chose to depend on 3.6.
Two options here: either it was a well-justified choice or it was not.
Let’s say we have projects A, B, C, and D forming a diamond.
      .----depends on v1 of ---> B ----depends on v1.x of ---.
     /                                                        \
    A                                                          -> D
     \                                                        /
      .----depends on v1 of ---> C ----depends on v1.1 of ---'
From a semantic versioning point of view, 1.1 is compatible with 1.x
So a good version resolution should load D v1.1 when loading A, right?
However, if you do this right now, with the current state of Metacello, dependency management is delegated to git or to the user.
Git knows nothing about semantic versioning, so it reports that 1.1 and 1.x are different refs, and there is a conflict.
=> My conclusion here is that semantic versioning is a tool
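The gap between the two views can be sketched in a few lines of Python. This is only an illustration under stated assumptions: refs are taken to look like `v1`, `v1.1` or `v1.x`, a missing or `x` component is treated as a wildcard, and the helper names `parse`, `compatible` and `same_git_ref` are hypothetical, not Metacello or git APIs.

```python
import re


def parse(ref):
    """Split a ref such as 'v1.1' or 'v1.x' into (major, minor, patch).

    A missing component or a literal 'x' becomes None, meaning "any value".
    """
    match = re.fullmatch(r"v(\d+)(?:\.(\d+|x))?(?:\.(\d+|x))?", ref)
    if match is None:
        raise ValueError(f"not a version-like ref: {ref!r}")
    return tuple(None if part in (None, "x") else int(part) for part in match.groups())


def compatible(ref_a, ref_b):
    """Semver-style view: the refs agree wherever both name a concrete number."""
    return all(
        a is None or b is None or a == b
        for a, b in zip(parse(ref_a), parse(ref_b))
    )


def same_git_ref(ref_a, ref_b):
    """Name-only view: two refs are the same only if their names are identical."""
    return ref_a == ref_b


semver_ok = compatible("v1.x", "v1.1")
git_ok = same_git_ref("v1.x", "v1.1")
```

With these definitions `compatible("v1.x", "v1.1")` holds while `same_git_ref("v1.x", "v1.1")` does not, which is the conflict described above: the information needed to reconcile the two refs lives in the naming convention, not in anything a name comparison can see.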
An even worse example: what if B and C depend on incompatible versions of D? How would you resolve that issue?
      .----depends on v1 of ---> B ----depends on v2.x of ---.
     /                                                        \
    A                                                          -> D
     \                                                        /
      .----depends on v1 of ---> C ----depends on v1.1 of ---'
Here it’s easy to blame either the developer of C (for not upgrading) or A (for getting a wrong configuration, but maybe it was his only choice!).
Or why not B’s developer, for “hasty upgrading”?
Thing is,
=> My conclusion here is that life is complicated :)
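No tool can pick a winner in the second diamond, but a tool can at least say where the disagreement is and who introduced it. A minimal Python sketch, assuming the declarations are available as a plain dictionary (the `declarations` structure and the helper names are hypothetical, not something Metacello exposes in this form):

```python
from collections import defaultdict

# Hypothetical declarations mirroring the second diamond:
# project -> {dependency: requested ref}
declarations = {
    "A": {"B": "v1", "C": "v1"},
    "B": {"D": "v2.x"},
    "C": {"D": "v1.1"},
}


def major(ref):
    """Extract the major number from a ref such as 'v2.x' or 'v1.1'."""
    return int(ref.lstrip("v").split(".")[0])


def find_major_conflicts(declarations):
    """Return the dependencies that are requested with more than one major version.

    The result maps each conflicting dependency to {requesting project: ref}.
    """
    requested = defaultdict(dict)
    for project, deps in declarations.items():
        for dependency, ref in deps.items():
            requested[dependency][project] = ref
    return {
        dependency: by_project
        for dependency, by_project in requested.items()
        if len({major(ref) for ref in by_project.values()}) > 1
    }


conflicts = find_major_conflicts(declarations)
```

For the diamond above this reports D, requested as `v2.x` by B and as `v1.1` by C. It deliberately stops there: deciding whether B upgraded too hastily or C should catch up is a human call.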
More importantly though, it raises questions about what the best practice is here. In summary, it seems to me that in general it’s better to only pin versions in reaction to CI failures because pinning has significant downsides and uncertainties. Defensive pinning smells like premature optimization IMO.
Why are you assuming somebody did a “defensive pinning”?
I mean, maybe it’s the case, but why are you assuming this was not justified?
In more detail, partially pinning (to major or minor version) by default doesn't seem right to me for several reasons unless we're talking about a tagged release that should be reproducible because:
In general, I agree with the rule :)
But what if this was a conscious decision and not just a default thing?
I understand this does not suit your current needs, but I see a lot of implicit assumptions in my interpretation of your view.
We have limited community resources to manage these pinnings, which create cascading conflict problems with all other dependent projects.
True
Pinnings seem somewhat arbitrary because community maintainers are often not familiar enough with both projects, or don’t have enough time, to do a thorough investigation. How does the maintainer know whether a particular dependency version really “works” with the project?
You test?
Unless they are intimately familiar with both projects (and even then I wouldn’t have confidence in a manual review), the best way I can think of is to rely on passing CI, and in that case…
I don’t see what solution you are proposing here...
It seems easier to just react to CI failures; defensive pinning smells like premature optimization IMO
What do you mean by reacting to CI failures?
Partial pinning (minor or major version) will not lead to reproducible builds because the patch is floating
True
What does reproducibility even mean from an untagged baseline?
True
Looking through a bunch of repos, it doesn’t seem that there is consensus either way. Some specify baselines and some specific versions.
Yes. The thing is also that Metacello is both at the same time used to
p.s. for tracking the major, I like ba-st’s naming convention of v{integer} instead of adding “.x”’s of unclear value
This is aesthetics only, right? Anyways, I’d like to have a more “standardised” way :)
Thank you for the discussion. I am learning!
Replies inline…
Guillermo Polito wrote:
forcing everybody to use major versions is not a solution: why do we have minor versions if we can only use major versions?
Unless they are intimately familiar with both projects (and even then I wouldn’t have confidence in a manual review), the best way I can think of is to rely on passing CI, and in that case…
I don’t see what solution you are proposing here...
I agree! My gut feeling is that the best default, especially given the tooling limitations you describe, is to depend on baselines, not specific versions, unless one has an important reason not to (e.g. for a tagged release, which ideally should be 100% reproducible), which in my experience is often not the case. I sense that we often reflexively specify versions because that is “the semantic versioning way”, even though our tooling does not really let us easily gain the benefits usually associated with semver. I only suggested that pinning major versions would require somewhat fewer cascading changes than minor and patch pinning, of which there seems to be a lot without any stated justification.
Why are you assuming somebody did a “defensive pinning”?
I often see commits like “update to latest Xyz project version” where the commit changes v1.2.3 to v1.5.3. I find it difficult to believe, especially given the lack of commit message details to justify it, that in all these cases the main project absolutely can’t work without the 1, 2 and 3 patches to 1.5. I feel it’s more likely a symptom of exactly what I’m pointing out and which you illustrated in your examples: unless different projects all point to the exact same version, Metacello will have problems, so people just specify full versions everywhere.
It seems easier to just react to CI failures; defensive pinning smells like premature optimization IMO
What do you mean by reacting to CI failures?
I mean that CI failures might be a useful guide to when we really need to pin versions based on clear evidence.