Closed Bug 927219 Opened 11 years ago Closed 9 years ago

Close some older branches in mozilla-central

Categories

(Release Engineering :: Release Requests, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: clownshoes)

Attachments

(2 files)

+++ This bug was initially created as a clone of Bug #611030 +++

We have a number of open release branches in our HG repos that IMO should be closed. e.g.

GECKO100_2011122113_RELBRANCH 81527:2eddbaa0c3c8
GECKO100_2011122805_RELBRANCH 81540:0bda3d7b4c05
GECKO100_2012010318_RELBRANCH 81551:167c21367cbd
GECKO100_2012010410_RELBRANCH 81566:e7007f318c51
GECKO100_2012011108_RELBRANCH 81592:03fb07b26e3d
GECKO100_2012011807_RELBRANCH 81612:51c6a31cd3a3
GECKO100_2012012323_RELBRANCH 81625:f2a16e2a67ca
GECKO100_2012012901_RELBRANCH 110437:b8f758c47897
GECKO110_2012020115_RELBRANCH 85212:5fc28dd4fcfc
GECKO110_2012020801_RELBRANCH 85350:bf91d5e0841a
GECKO110_2012021420_RELBRANCH 85394:26ba6bd02ce1
GECKO110_2012021522_RELBRANCH 85402:04e408ddcdcf
GECKO110_2012022207_RELBRANCH 85448:d8e57887e2f2
GECKO110_2012022820_RELBRANCH 85505:33223be526ea
GECKO110_2012030517_RELBRANCH 85530:377c06cd1151
GECKO110_2012030815_RELBRANCH 85556:972f303ffaaa
GECKO110_2012030912_RELBRANCH 110494:f44a3982c8ad
GECKO110_2012031017_RELBRANCH 85570:c368cf027fa3
GECKO110_2012031218_RELBRANCH 110499:227aa795dec1
GECKO120_2012031407_RELBRANCH 88576:3eac56b8d713
GECKO120_2012031419_RELBRANCH 88597:9dfcd6ac9083
GECKO120_2012032021_RELBRANCH 88618:9825cece5359
GECKO120_2012032804_RELBRANCH 92226:65cec0350218
GECKO120_2012040320_RELBRANCH 88662:37bcffbad31d
GECKO120_2012041106_RELBRANCH 88691:3acc8aea7693
GECKO120_2012041716_RELBRANCH 88713:390af05c3068
GECKO120_2012042014_RELBRANCH 110522:49fe214331fb

I strongly believe that we should have a policy of closing hg branches for releases that are no longer supported. If the branch is dead from a code perspective, close it. Mercurial branches are forever encoded in Mercurial's immutable commit object, so it's not like we lose the history of a branch by closing it. But it sure makes the output of |hg branches| and |hg heads| saner!
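As a rough illustration of how such a policy could be applied mechanically, the sketch below parses lines in the format of the listing above and picks out branches older than a cutoff. It assumes the ten-digit field in the branch name is a YYYYMMDDHH timestamp (the thread does not say so), and the function names are made up for this example.

```python
import re
from datetime import datetime

# Matches lines such as "GECKO100_2011122113_RELBRANCH 81527:2eddbaa0c3c8".
BRANCH_LINE = re.compile(
    r"^(?P<name>GECKO(?P<version>\w+?)_(?P<stamp>\d{10})_RELBRANCH)\s+"
    r"(?P<rev>\d+):(?P<node>[0-9a-f]{12})$"
)

def parse_branch_line(line):
    """Return a dict describing one relbranch line, or None if it does not match."""
    m = BRANCH_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "name": m["name"],
        "version": m["version"],
        # Assumption: the stamp is YYYYMMDDHH.
        "created": datetime.strptime(m["stamp"], "%Y%m%d%H"),
        "rev": int(m["rev"]),
        "node": m["node"],
    }

def stale_branches(lines, cutoff):
    """Names of relbranches whose (assumed) creation stamp is before cutoff."""
    parsed = (parse_branch_line(line) for line in lines)
    return [p["name"] for p in parsed if p and p["created"] < cutoff]
```

The result is only a candidate list; whether a release is still supported is a release-management decision that a timestamp cannot answer.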

I also believe there is room for us to consider not using branches for release management or to consider using fewer of them. We probably made the decision to use branches back before bookmarks were usable. Mercurial has lightweight branches called bookmarks. We should consider using them. But I don't want that discussion getting in the way of a much needed repository cleanup.
Bug 924024 tracks no longer using relbranches.

As noted in https://wiki.mozilla.org/ReleaseEngineering/VCSSync/HowTo#How_to_deal_with_GECKO90_2011121217_RELBRANCH , having commits in a relbranch in one gecko-dev repo that don't exist in another gecko-dev repo causes fast-forward issues.  In this case, if we follow the old branch-closing process, when these changes get to mozilla-beta, mozilla-release pushes will die due to non-fast-forward.
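For readers less familiar with the failure mode: a Git ref update is a fast-forward only when the ref's current commit is an ancestor of the new one. The toy sketch below models that check on a hand-written parent map; the commit names are hypothetical and this is not the vcs-sync code.

```python
def ancestors(parents, node):
    """Return node plus everything reachable from it through parent links."""
    seen, stack = set(), [node]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents.get(cur, ()))
    return seen

def is_fast_forward(parents, old_tip, new_tip):
    """True if moving a ref from old_tip to new_tip only adds history."""
    return old_tip in ancestors(parents, new_tip)

# Hypothetical history: a -> b -> c on one line, x forked from a.
toy = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
print(is_fast_forward(toy, "b", "c"))  # the ref moves forward along its own line
print(is_fast_forward(toy, "x", "c"))  # x is not in c's history, so this diverges
```

If the same relbranch name points at `x` in one repository and at `c` in another, a mirror that shares a single ref for that name hits the second case.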

Another option is to use |hg debugsetparents| to "merge" the branch back into default without making any changes to default.  I'm not sure if this satisfies the original requirements.
Comment 1 makes me think that we should wontfix this bug. What do you think?
Flags: needinfo?(gps)
I think there are two separate issues:

1) Not using Mercurial branches for tracking releases (bug 924024)
2) Closing old Mercurial branches when they are dead (this bug)

The issue described in comment #1 about fast forward issues is due to a flawed choice for branch names. Currently, branch names across aurora, beta, release, etc can collide. If we are going to use names (branches, bookmarks, whatnot) for tracking releases, we should be using names that are globally unique. That way, if we pull all repositories into a unified one (such as what the Git mirror does), there will be no naming collisions and the tools won't complain.
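A minimal sketch of how one might look for such collisions before unifying repositories, assuming each repository's branch heads have already been exported into a plain dict. The repository names, branch names and nodes in the usage example are placeholders, not real data.

```python
from collections import defaultdict

def colliding_names(repos):
    """Find names that point at different heads in different repositories.

    repos maps a repository name to {branch or bookmark name: head node}.
    Returns {name: {node: [repositories using that node]}} for collisions only.
    """
    seen = defaultdict(dict)
    for repo, branches in repos.items():
        for name, node in branches.items():
            seen[name].setdefault(node, []).append(repo)
    return {name: dict(nodes) for name, nodes in seen.items() if len(nodes) > 1}

# Hypothetical input: REL_A diverges between the two repos, REL_B does not.
example = {
    "beta": {"REL_A": "1111", "REL_B": "3333"},
    "release": {"REL_A": "2222", "REL_B": "3333"},
}
print(colliding_names(example))
```

A name that maps to the same node everywhere is harmless in a unified repository; only names with more than one distinct node are reported.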

I believe we should keep this bug open to close the old/dead release branches.
Flags: needinfo?(gps)
Thanks for the explanation!
These old and open branches are annoying me again today. Are there any plans to close the old branches?

Note that closing branches doesn't mean they disappear from history. But it does mean you can't write to them any more. It also impacts some Mercurial UI and performance.

Old release branches would also be good candidates for no-op merges so the number of topological heads is kept in check.
Gregory, would you mind grabbing this bug and closing the release branches on mozilla-central? I am pretty sure they are not used anywhere in release automation.
Flags: needinfo?(gps)
mozilla-central or mozilla-beta and mozilla-release, etc? I'm mostly concerned about the release repos at this point.
Flags: needinfo?(gps)
I'm happy to close branches on *any* repository provided release automation won't throw a fit.
I don't think the sheriffs have a horse in this race, but SGTM!
I don't think release automation relies on relbranches unless it's specified in the release build submission.

Release automation will create new branches though.

Ben, Nick, is there anything that I'm missing?
Flags: needinfo?(nthomas)
Flags: needinfo?(bhearsum)
There are two things that could be done:

1) Close old branches
2) Merge old heads (topological heads)

Closing branches hides them from the Mercurial interface. It's a good start.

But aspects of Mercurial's performance decrease as the number of heads in a repository approaches infinity. (Ditto for Git.) Keeping the number of DAG heads low prevents degradation over time.
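To make "topological heads" concrete: a head is any changeset that no other changeset names as a parent. The sketch below counts heads in a toy DAG and shows how merge commits bring the count back down. The changeset names are invented for illustration.

```python
def topological_heads(parents):
    """parents maps each changeset to the list of its parents."""
    all_parents = {p for ps in parents.values() for p in ps}
    return sorted(n for n in parents if n not in all_parents)

# Hypothetical DAG: a default line (d1, d2) plus two release branch heads.
dag = {
    "root": [],
    "d1": ["root"],
    "rel1": ["root"],
    "d2": ["d1"],
    "rel2": ["d1"],
}
print(topological_heads(dag))  # three heads: d2, rel1, rel2

# Merging each old head into the default line leaves a single head.
dag["m1"] = ["d2", "rel1"]
dag["m2"] = ["m1", "rel2"]
print(topological_heads(dag))
```

Closing a branch adds a commit on top of the old head, so the head count stays the same; only a merge removes a topological head.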

Because of the performance implications, I'd like to merge topological heads. But this raises more questions: where should they get merged to and how many heads should be merged?

I can make the argument that we should just merge all heads. Even the old release heads could get merged. In a few years we'll have hundreds of those and I'm not sure if leaving them open offers much value. (The history is still in the repo if people want to get at it.) But, that position may be a bit radical to some. I'd be happy to merge repos like beta so we have 1 head per major version number. Not ideal, but better than the several hundred heads we have now.
This is what I'd like to do on mozilla-central:

https://pastebin.mozilla.org/8406873

Essentially:

for each {head} {branch}:
  hg debugsetparents tip {head}
  hg commit -m 'Merge old branch head {branch} into default'
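The loop above could be turned into a dry run along these lines. This is only a sketch that builds and prints the two commands from the pseudocode for each (head, branch) pair; it does not run Mercurial, and the pastebin script itself may differ. The helper name `merge_commands` is made up.

```python
import shlex

def merge_commands(heads):
    """Build the command pairs from the pseudocode above.

    heads is an iterable of (head_node, branch_name) pairs.
    """
    cmds = []
    for head, branch in heads:
        msg = f"Merge old branch head {branch} into default"
        cmds.append(["hg", "debugsetparents", "tip", head])
        cmds.append(["hg", "commit", "-m", msg])
    return cmds

# Dry run: print the commands instead of executing them.
for cmd in merge_commands([("2eddbaa0c3c8", "GECKO100_2011122113_RELBRANCH")]):
    print(shlex.join(cmd))
```

Printing first makes it easy to review the full list of heads before anything is committed.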
glandium: please bless the procedure in comment #12 to remove all but 1 topological head from mozilla-central.
Flags: needinfo?(mh+mozilla)
wfm
Flags: needinfo?(mh+mozilla)
> I'd be happy to merge repos like beta so we have 1 head per major
> version number. Not ideal, but better than the several hundred
> heads we have now.

What good would that do?

Once merged, one can still hg up the_branch_name, and commit on top of that, which will reopen the branch and create a new head.
That said, come to think of it, there's one thing to be *very* careful about when doing those merges: tags. Please check that no tag is being removed.
Yeah, tags could be problematic.

I'll plan on closing old branches in the release repos soonish. Then I'll investigate tags behavior and push topological head merges iff there are no tags issues. This should be a net win for people pulling release repos - closed branch heads will get filtered from `hg heads` and make that command more usable.
For those -central branches, it might be fine, but for release and beta, every relbranch has its own tags.
For example:
$ git diff beta/GECKO360b1_2015011222_RELBRANCH:.hgtags beta/GECKO360b2_2015012018_RELBRANCH:.hgtags
diff --git a/beta/GECKO360b1_2015011222_RELBRANCH:.hgtags b/beta/GECKO360b2_2015012018_RELBRANCH:.hgtags
index f27807e..aeec8a4 100644
--- a/beta/GECKO360b1_2015011222_RELBRANCH:.hgtags
+++ b/beta/GECKO360b2_2015012018_RELBRANCH:.hgtags
@@ -138,13 +138,10 @@ ca89fe55717059e4e43040d16d260765ffa9dca7 FIREFOX_AURORA_36_BASE
 0000000000000000000000000000000000000000 FIREFOX_AURORA_36_BASE
 6047f510fb73c7dbe9866066fb01ddda3c170c9c FIREFOX_AURORA_37_BASE
 0000000000000000000000000000000000000000 FIREFOX_AURORA_37_BASE
 0000000000000000000000000000000000000000 FIREFOX_AURORA_36_BASE
 b297a6727acfd21e757ddd38cd61894812666265 FIREFOX_AURORA_36_BASE
 0cf828669d5a0911b6f2b83d501eeef5bdf9905e FIREFOX_AURORA_35_END
 75177371cb85baaa9d623f56d849a5c21d18040f FIREFOX_BETA_36_BASE
 137baee3dda45c6a3b38be74f5709c24f7c7701a FIREFOX_BETA_35_END
-1b26127c33231f17a2c4a9e960d31529ba158d55 FIREFOX_36_0b1_RELEASE
-1b26127c33231f17a2c4a9e960d31529ba158d55 FIREFOX_36_0b1_BUILD1
-1b26127c33231f17a2c4a9e960d31529ba158d55 FIREFOX_36_0b1_RELEASE
-521859f9eae2384c9720aee9b3191f0a3a59dd9a FIREFOX_36_0b1_RELEASE
-521859f9eae2384c9720aee9b3191f0a3a59dd9a FIREFOX_36_0b1_BUILD2
+ddf243cd8f14f99b4ff2af6ff50e299bbb4f6511 FIREFOX_36_0b2_RELEASE
+ddf243cd8f14f99b4ff2af6ff50e299bbb4f6511 FIREFOX_36_0b2_BUILD1

And as mentioned in bug 1120318, some branches have different values for the same tags.
(In reply to Mike Hommey [:glandium] from comment #18)
> For those -central branches, it might be fine

FWIW, none of the branches on central have added tags to what .hgtags was when the branch was forked from default or merged to default last.
I've closed branches on aurora and beta. I'm not going to attempt to merge heads due to tags issues. Not today at least.

I plan on closing branches on release and esr's today as well.
release and esr's are taken care of. I left the most recent release branches open. In my unified repo, `hg branches` now displays a very reasonable 11 entries. `hg heads` is also somewhat usable.

I may have a go at the b2g repos later.
(In reply to Gregory Szorc [:gps] from comment #21)
> I may have a go at the b2g repos later.

Only 30+ are supported at this point, FWIW.
I closed branches from all the b2g repos earlier tonight.

Also, we're receiving reports that vcs-sync isn't liking things. Smells like it isn't filtering out changes to non-"default" branches or something. The Git mirrors may not be happy for a little while. *sigh*
Keywords: clownshoes
Thanks to bug 1120318, you changed the meaning of some tags.
on release: FIREFOX_21_0_RELEASE
on beta: FENNEC_14_0b8_RELEASE, SEAMONKEY_2_24b1_RELEASE

Something fishy also happened on the esr branches.
mozilla-central was the only repository I merged heads on. I only created close branch commits everywhere else.

I suspect what's happening is Mercurial's tag resolution logic is temporarily in a wonky state. Mercurial reads the .hgtags content from all the DAG heads and then applies content to a dict. Most recent (in terms of revlog order) wins. Since we have old branch closures at tip (and no additional changes yet), .hgtags values from these old branch heads are "winning" the last-write "race."

One way to recover from all this is to merge all the heads then produce a .hgtags with sane values. I think.
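The suspected behaviour can be modelled in a few lines. This sketch implements only the simple "last write wins" rule described above, which a later comment in this thread finds to be cruder than what Mercurial really does. The tag name and nodes are hypothetical.

```python
def parse_hgtags(text):
    """Yield (node, tag) pairs from .hgtags content, one "node tag" per line."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            node, tag = line.split(" ", 1)
            yield node, tag

def last_write_wins(heads_in_revlog_order):
    """Resolve tags naively: later heads (higher revisions) overwrite earlier ones."""
    tags = {}
    for text in heads_in_revlog_order:
        for node, tag in parse_hgtags(text):
            tags[tag] = node
    return tags

# Hypothetical data: the old branch still has the original tag value,
# while default has since retagged REL to a newer node.
old_branch = "aaaa REL\n"
default = "aaaa REL\nbbbb REL\n"

# Before the closure, default's head is the most recent one.
print(last_write_wins([old_branch, default]))
# After closing the old branch, its closure commit sits at tip and "wins".
print(last_write_wins([default, old_branch]))
```

Under this model any new push to default would move default's head back to tip and restore the newer value.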
(In reply to Gregory Szorc [:gps] from comment #25)
> mozilla-central was the only repository I merged heads on. I only created
> close branch commits everywhere else.
> 
> I suspect what's happening is Mercurial's tag resolution logic is
> temporarily in a wonky state. Mercurial reads the .hgtags content from all
> the DAG heads and then applies content to a dict. Most recent (in terms of
> revlog order) wins. Since we have old branch closures at tip (and no
> additional changes yet), .hgtags values from these old branch heads are
> "winning" the last-write "race."

Did you close the branches in revlog order? If so, that will fix itself with a push to the default branch. If you didn't, then yeah, merging in the right order will fix it. Another possibility is to add another changeset on top of each head so that they are ordered in the revlog. Or to reorder the revlog manually.
For the most part, I processed branch closings in revlog order. There are a few exceptions to that.

Hopefully other people now know why tags aren't the immutable concept people often mistake them for :)
Before I go to bed so I can do legwork tomorrow, I just wanted to throw in the idea of stripping the diverging branch closing commits. No systems should be indexing these commits, so losing them shouldn't be a big deal. Mercurial clients that have pulled the commits will get two copies of essentially a commit that does the same thing: close a branch. I've done this locally and Mercurial doesn't complain. You'll just end up with a lot of local DAG heads until you reclone the repo. We could probably even build something into firefoxtree or even the build system to detect when this is present and recommend you reclone or something.

Something to think about. It sounds dangerous, but I like the option over alternatives, which require introducing several hundred more commits in all the repos. (Fortunately these commits are nearly free since there is no commit data in them.)
(In reply to Mike Hommey [:glandium] from comment #24)
> Thanks to bug 1120318, you changed the meaning of some tags.
> on release: FIREFOX_21_0_RELEASE
> on beta: FENNEC_14_0b8_RELEASE, SEAMONKEY_2_24b1_RELEASE

You know how we retag _RELEASE for each _BUILDn? We also have separate heads for each BUILDn, so in fact, those tag changes above are _RELEASE tags that switched from the last _BUILDn to an earlier one.

FIREFOX_21_0_RELEASE is now FIREFOX_21_0_BUILD3 instead of FIREFOX_21_0_BUILD4
The other two seem, in fact, to have been fixed when they were previously pointing to the wrong thing, whatever that was.
(In reply to Mike Hommey [:glandium] from comment #29)
> (In reply to Mike Hommey [:glandium] from comment #24)
> > Thanks to bug 1120318, you changed the meaning of some tags.
> > on release: FIREFOX_21_0_RELEASE
> > on beta: FENNEC_14_0b8_RELEASE, SEAMONKEY_2_24b1_RELEASE
> 
> You know how we retag _RELEASE for each _BUILDn? We also have separate heads
> for each BUILDn, so in fact, those tag changes above are _RELEASE tags that
> switched from the last _BUILDn to an earlier one.
> 
> FIREFOX_21_0_RELEASE is now FIREFOX_21_0_BUILD3 instead of
> FIREFOX_21_0_BUILD4

So it turns out mercurial is actually smarter than picking tags in order of increasing revlog (and consequently, that git-remote-hg is dumb, and sees _BUILD2 instead):
http://selenic.com/hg/file/8a544fb645bb/mercurial/tags.py#l163

*But*, in the case of FIREFOX_21_0_RELEASE, the head that has _BUILD4 has a lower revlog number *and* doesn't contain the values of FIREFOX_21_0_RELEASE for the other builds. From a cursory look at recent releases, it seems to be fine, now. I however don't know if it's fine now by chance, or if our processes changed in a way that ensures that mercurial's smarts work.

Since I'll have to fix git-remote-hg for this, I'll also check all the release tags at the same time.
Attaching email thread on this topic - looks like these changes may have caused fallout in modern vcs sync.
Attachment #8556403 - Flags: feedback?(gps)
Err, so in fact, this is the deal for FIREFOX_21_0_RELEASE:
2c41ea90753c04f0d6d53e96e772613154818c62 has a .hgtags with 30ec6828d10e86ac4d21ee20ded1b551794901ef and 916fdce8831c34612c69643ec1a454ed704e0590 for FIREFOX_21_0_RELEASE, in that order. 916fdce8831c34612c69643ec1a454ed704e0590 is BUILD4.
8c8150211c0d3f43536b94c795b6dc46af84e24c has a .hgtags with 0a66f8ca83a1723605e200df860957193d440c15, 30ec6828d10e86ac4d21ee20ded1b551794901ef, 916fdce8831c34612c69643ec1a454ed704e0590 and 30ec6828d10e86ac4d21ee20ded1b551794901ef for FIREFOX_21_0_RELEASE. 30ec6828d10e86ac4d21ee20ded1b551794901ef is BUILD3. This is a superset of the previous, so 30ec6828d10e86ac4d21ee20ded1b551794901ef wins.
773fa5504fbcf33c557414d5a3c7a0598b32fb27 has a .hgtags with 3402fc52312cb6c90b45a2724769421145d1f429 and 0a66f8ca83a1723605e200df860957193d440c15 for FIREFOX_21_0_RELEASE, which is a subset of what the previous had. so 30ec6828d10e86ac4d21ee20ded1b551794901ef still wins.

In fact, if I'm evaluating correctly, whatever order those come in, BUILD3 still wins. So in fact, FIREFOX_21_0_RELEASE didn't actually change value. Our tagging for that release is inconsistent and git-remote-hg is dumb and happily switches between tags because it doesn't work the same. In fact, now that I look at the history of that .hgtags, moving back to BUILD3 was done on purpose: http://hg.mozilla.org/releases/mozilla-release/rev/3b27f4f6e22658d6fc34179b8b840b64d2424691

So the tagging issues can be ignored altogether. We're left with the differing heads between esr, release, beta and aurora.
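The evaluation above can be replayed in code. The sketch below is an approximation of the supersede rule in the linked tags.py as understood here, not a copy of Mercurial's implementation: each head contributes the last value of the tag plus the list of values it replaced, and a head's value loses to the current winner when the winner already superseded it. Hashes are the 12-character short forms of the ones quoted above.

```python
from itertools import permutations

B3 = "30ec6828d10e"    # FIREFOX_21_0_BUILD3
B4 = "916fdce8831c"    # FIREFOX_21_0_BUILD4
OLD1 = "0a66f8ca83a1"
OLD2 = "3402fc52312c"

# Successive values of FIREFOX_21_0_RELEASE in each head's .hgtags.
HEADS = {
    "2c41ea90753c": [B3, B4],
    "8c8150211c0d": [OLD1, B3, B4, B3],
    "773fa5504fbc": [OLD2, OLD1],
}

def resolve(histories):
    """Fold per-head tag histories into one winning node (approximate model)."""
    current = None  # (winning node, nodes it has superseded)
    for values in histories:
        anode, ahist = values[-1], list(values[:-1])
        if current is not None:
            bnode, bhist = current
            # Keep the earlier winner if it superseded our value and either
            # we never saw the winner or its history is longer than ours.
            if bnode != anode and anode in bhist and (
                bnode not in ahist or len(bhist) > len(ahist)
            ):
                anode = bnode
            ahist.extend([n for n in bhist if n not in ahist])
        current = (anode, ahist)
    return current[0]

# Try every possible ordering of the three heads.
winners = {resolve([HEADS[h] for h in order]) for order in permutations(HEADS)}
print(winners == {B3})
```

Under this model the result does not depend on head order for this tag, which agrees with the conclusion that the value never actually changed.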
(In reply to Pete Moore [:pete][:pmoore] from comment #31)
> Created attachment 8556403 [details]
> More info significant issues with geckogit.pdf
> 
> Attaching email thread on this topic - looks like these changes may have
> caused fallout in modern vcs sync.

I can see these changes as being responsible for the non-fast-forward issues. Not for the <h<surkov thing, though.
This: http://www.oxymoronical.com/blog/2015/01/hgchanges-is-down-for-now made me look at the pointed changeset, which is the result of comment 12, and I don't know if mercurial actually likes it: the corresponding manifest is exactly that of one of the parent changesets... as opposed to a manifest with the same content as one of the parent changesets, but with two parent manifest references.

That said, the problem in his case is that he's trying to download the diff for the merge, which is huge because of the age difference between the trees (it's probably close to doing a diff versus 000000000000000000000000000000000000000)
Looks like there are natural occurrences of this pattern in mozilla-central's history, so the only problem would be the diff being too big for hgweb to produce it quickly. That also gives an easy DoS vector.
When we uplift branches we use debugsetparents to merge heads. To keep tags accessible from default we use a hack:

http://hg.mozilla.org/build/mozharness/file/6506f3e066b8/scripts/merge_day/gecko_migration.py#l233

Maybe you can use the same approach.
(In reply to Rail Aliiev [:rail] from comment #10)
> I don't think release automation relies on relbranches unless it's specified
> in the release build submission.
> 
> Release automation will create new branches though.
> 
> Ben, Nick, is there anything that I'm missing?

The release automation doesn't do anything with relbranches unless it's specified, or if the revision specified is the relbranch. I think it's safe to close all of the relbranches on mozilla-beta, mozilla-release, and mozilla-esr31 at this point. We never use them on Beta (always ship the latest), and mozilla-release and esr31 don't have changes landing to them except explicit backports, so the default branch is pretty much always in a ready state. (There are rare cases where we've used them, but they're few and far between -- and we can re-open a branch in the case that we need it, I assume.)
Flags: needinfo?(bhearsum)
(In reply to Gregory Szorc [:gps] from comment #23)
> I closed branches from all the b2g repos earlier tonight.

Looks like you missed b2g37.
Pete: <aki> pmoore|away: http://hg.mozilla.org/build/mozharness/file/89039ca68621/scripts/vcs-sync/initial_beagle.py#l526 through 549 deals with <h<surkov .

initial_beagle.py isn't run anymore; it's a one-time run.  But you might be able to use the logic to figure out where <h<surkov is coming from and strip that from the vcs-sync runs, although merging in these heads might make it problematic.
Depends on: 1127480
We actually shipped 21.0 build3, not build4 (this was when the AMD crash issues were hot and build4 was there just in case build3 was crashy). And for 34.0b9 we shipped build1 over build2. Please see pages like https://wiki.mozilla.org/Releases/Firefox_34.0b9/BuildNotes if there is any ambiguity.
Flags: needinfo?(nthomas)
I'm ready to strip conflicting branch closure commits at my leisure. However, Firefox release people don't want to take the risk today with a Beta GTB happening (backlog in #releng). The work will have to be performed later tonight or tomorrow.

It's still possible for someone to fix vcs sync in the interim.
I've now been informed that if this work is to happen today, it must happen very late due to the Beta build. This likely won't happen until tomorrow. And even then there could always be another build of Beta to delay things further :/
(In reply to Gregory Szorc [:gps] from comment #42)
> I've now been informed that if this work is to happen today, it must happen
> very late due to the Beta build. This likely won't happen until tomorrow.
> And even then there could always be another build of Beta to delay things
> further :/

That would prevent touching beta, but how does that block touching the other repos? In fact, you could just declare that the branches on beta are the ones that are correct, and make the others match.
As part of cleaning up things, I executed a command as the wrong user and temporarily corrupted the Aurora repository due to a wonky permissions issue. No data was lost from the server. But it was quicker to have RyanVM and KWierso just push it from their machines, so they did. As a result, we lost some pushlog data from earlier today (certain push IDs are now empty and their changesets are now associated with RyanVM's and KWierso's pushes).

I suspect the issue was compounded by a replication hook firing at an inopportune time. I will have to experiment with a user repo before moving forward with additional cleanup, as I don't want to risk more accidents.
Depends on: 1127944
Git replication on mozilla-central is completely confused. The dummy merges I did (8991b10184de::bfa194d93aed) came from heads that hadn't been consistently seen by both legacy and modern vcs sync machines. The computed p2 commit for the first merge commit (8991b10184de) is inconsistent between these systems. On modern vcs sync, the final merge commit (bfa194d93aed) has a p2 parent that contains 72923 commits, all the way down to a separate root commit (it should contain exactly 2 commits before the branch point). Had GitHub not rejected the push because of the faulty author line in 26a2aacf9e3a ("Alexander Surkov <h<surkov.alexander@gmail.com>"), this merge commit and its 72921 "duplicate" commits would have made their way into history.
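To illustrate why a wrongly mapped p2 drags in so much history: the commits a merge introduces are the ancestors of p2 that are not already ancestors of p1. The toy sketch below compares a correctly mapped p2 with one that descends from a separate root. All commit names are hypothetical and the graph is far smaller than the real one.

```python
def ancestors(parents, node):
    """Return node plus everything reachable from it through parent links."""
    seen, stack = set(), [node]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents.get(cur, ()))
    return seen

def introduced_by_p2(parents, p1, p2):
    """Commits that a merge of p2 into p1 adds to p1's history."""
    return ancestors(parents, p2) - ancestors(parents, p1)

toy = {
    # Main line.
    "r0": [], "r1": ["r0"], "r2": ["r1"],
    # Old branch forked from r1, two commits long.
    "b1": ["r1"], "b2": ["b1"],
    # Mis-converted copy of the same history hanging off a separate root.
    "x0": [], "x1": ["x0"], "x2": ["x1"],
}
print(sorted(introduced_by_p2(toy, "r2", "b2")))  # just the two branch commits
print(sorted(introduced_by_p2(toy, "r2", "x2")))  # the whole duplicate chain
```

With a shared branch point only the branch's own commits are new; with a separate root nothing is shared, so every commit below p2 counts as new.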

At this point, legacy vcs sync is in good shape to replicate b2g* repos again. Hal is working on that now.

However, anything after my merge commits in central/inbound/fx-team/b2g-inbound is a real mess. We're considering skipping my merge commits in the Git repos. They add little value to history and just cause headaches and a prolonged outage. We would rebase subsequent commits onto the commit immediately prior to my merges and update the SHA-1 map accordingly. This will make duplicating the SHA-1's in Git from an independent conversion almost impossible. But I'm pretty sure that ship has sailed a long time ago. If you have objections to this plan, speak now or forever hold your peace. You probably have until Sunday PST, since Hal and I both have earlier commitments for Saturday.
temporary changes per comment 45
This suggests the sha1 map is missing information for old changesets. Skipping even more changesets sounds to me like a recipe for future surprises. The problem with rushing things to shorten the outage is that whatever the git repo looks like afterwards will have to stay that way forever, unless we break the rule that the git sha1s must never change.
I tend to agree with Mike. I'm uncomfortable with manually altering things directly in the vcs sync hggit working directories in order to get us up and running quickly again. Experience has told me that anything that can go wrong will go wrong, so I see putting us in a position where it would be extremely difficult to re-fudge a clean environment as not beneficial overall, and causing us pain later on, either with unexpected side effects or surprises, or in a disaster recovery situation. I think if there is any way to keep the process clean (such that the conversion could be created from scratch) we will be better off. I propose we meet on Monday to discuss options.
Depends on: 1128698
Attachment #8556403 - Flags: feedback?(gps)
gecko-projects Git replication has been restored.
Depends on: 1128754
Depends on: 1129675
I have stripped mozilla-beta and mozilla-aurora of old branch closing commits that didn't have the proper date. I believe these were the only repositories containing these bad branch closing commits.

In the case of beta, duplicate branch closing commits have been stripped. In the case of aurora, these were the only branch closing commits. So branches now appear as open on fresh clones of aurora. We can re-close the branches after merge day.
We're good for now on vcs-sync impacts:

15:19 < rail> hwine: is this something that you should worry about with regards to vcs sync
              https://bugzilla.mozilla.org/show_bug.cgi?id=927219#c53 ?
15:20 < gps> rail: i looked at modern vcs sync after the stripping and it seemed happy
15:20 < rail> yay
...
15:21 < hwine> rail: yeah, what gps said for today
As was discovered during merge day activity today, someone re-pushed old and bad branch closing commits to mozilla-aurora last Friday. I re-stripped Aurora to unblock central->aurora.

I have since pushed proper branch closing commits to mozilla-aurora. If anyone attempts to push the old bad changesets again, the multiple heads per branch hook will reject them on the server.
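The kind of check such a hook performs can be sketched as follows. This is not the server's hook, only a guess at its core rule: after a push, no named branch may have more than one head. The branch names and nodes in the example are placeholders.

```python
from collections import defaultdict

def branches_with_multiple_heads(heads):
    """heads is an iterable of (node, branch) pairs, one per repository head."""
    by_branch = defaultdict(list)
    for node, branch in heads:
        by_branch[branch].append(node)
    return {b: nodes for b, nodes in by_branch.items() if len(nodes) > 1}

def would_reject(existing_heads, pushed_heads):
    """True if accepting the push would leave some branch with several heads."""
    return bool(branches_with_multiple_heads(list(existing_heads) + list(pushed_heads)))

# Hypothetical state: REL_A already has a proper closing commit ("good1").
server = [("good1", "REL_A"), ("tip1", "default")]
print(would_reject(server, [("bad1", "REL_A")]))  # second head on REL_A
print(would_reject(server, []))
```

A re-pushed old closing commit sits next to the new one on the same branch, which is exactly the two-heads case the check catches.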
I've also pushed some branch closing commits to mozilla-beta.

With this change, I'm willing to call this bug done. Good riddance.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Blocks: 1249802