Open Bug 1596135 Opened 5 years ago Updated 4 years ago

disable `archive` webcommand on hgweb

Categories

(Developer Services :: Mercurial: hg.mozilla.org, task)

task
Not set
normal

Tracking

(Not tracked)

REOPENED

People

(Reporter: sheehan, Assigned: sheehan)

Attachments

(1 file)

Since around November 7th, we've been seeing load increase steadily on hgweb, eventually requiring human intervention in the form of `apachectl graceful` to bring load values back to normal. Looking at the Apache error logs, I can see the attached stack trace repeated continuously. Examining the logs for the days around November 7th shows a huge increase in the occurrence of this stack trace around that date (26 occurrences on Nov 6th, 2,000 by Nov 8th).
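For anyone wanting to reproduce the per-day counts, here is a minimal sketch of how they could be pulled out of an Apache error log. It assumes the timestamp prefix and the final `IOError` line look exactly like the attached stack trace; the function name and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the timestamp prefix Apache writes, e.g. "[Wed Nov 13 15:42:14.834410 2019]".
TIMESTAMP = re.compile(r"^\[\w{3} (\w{3}) +(\d{1,2}) [\d:.]+ (\d{4})\]")
# Final line of the stack trace attached to this bug.
MARKER = "IOError: Apache/mod_wsgi failed to write response data: Broken pipe"


def count_broken_pipes_per_day(lines):
    """Return a Counter mapping (year, month, day) to the number of broken-pipe errors."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            month, day, year = match.groups()
            counts[(int(year), month, int(day))] += 1
    return counts


# Made-up sample lines in the same shape as the attached log (not real data).
sample_log = [
    "[Wed Nov 06 09:00:00.000001 2019] [wsgi:error] [pid 1] [remote 192.0.2.1:1000] " + MARKER,
    "[Fri Nov 08 09:00:00.000001 2019] [wsgi:error] [pid 2] [remote 192.0.2.1:1001] " + MARKER,
    "[Fri Nov 08 09:00:01.000001 2019] [wsgi:error] [pid 2] [remote 192.0.2.1:1002] " + MARKER,
    "[Fri Nov 08 09:00:02.000001 2019] [wsgi:error] [pid 2] [remote 192.0.2.1:1003] Traceback (most recent call last):",
]
per_day = count_broken_pipes_per_day(sample_log)
for key in sorted(per_day):
    print(key, per_day[key])
```

Counting only the final `IOError` line gives one hit per traceback, rather than one per frame.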

The stack trace is the result of a bot attempting to follow the archive URLs on hgweb. These URLs are actually an hgweb webcommand that generates a tarball/gzip/bzip archive of a repo at a given revision. When this webcommand fails (which it appears to be doing, a lot), it does not properly clean up resources, resulting in slowly increasing load.
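To make the failure mode concrete, here is a rough sketch of the general pattern that avoids this kind of leak: release per-request resources in a `finally` block so a client disconnect mid-stream cannot skip the cleanup. This is not Mercurial's or mod_wsgi's code; `DisconnectedClient`, `ScratchResource` and `stream_archive` are hypothetical names used only for illustration.

```python
class DisconnectedClient(object):
    """Stand-in for a response stream whose client goes away mid-download."""

    def __init__(self, fail_after):
        self.fail_after = fail_after
        self.received = 0

    def write(self, data):
        # Simulate the "Broken pipe" seen in the log once the limit is exceeded.
        if self.received + len(data) > self.fail_after:
            raise IOError("Broken pipe")
        self.received += len(data)


class ScratchResource(object):
    """Hypothetical per-request resource (file handle, buffer, lock, ...)."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


def stream_archive(chunks, client, resource):
    """Write chunks to the client, releasing the resource even on failure."""
    try:
        for chunk in chunks:
            client.write(chunk)
    finally:
        # Runs whether the loop finished or client.write() raised.
        resource.release()


resource = ScratchResource()
client = DisconnectedClient(fail_after=10)
try:
    stream_archive([b"12345", b"67890", b"abcde"], client, resource)
    failed = False
except IOError:
    failed = True
print(failed, resource.released, client.received)
```

The error still propagates to the caller, but the resource is released first.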

To fix this, we could ban the IP addresses of the offending hosts. However, the recommended way to get a source archive is to visit archive.mozilla.org, as described here. A more robust way to avoid the load increases is to disable the archive webcommand altogether.

Stack trace

[Wed Nov 13 15:42:14.834410 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660] Traceback (most recent call last):
[Wed Nov 13 15:42:14.834435 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/hgweb/hgwebdir_mod.py", line 358, in run_wsgi
[Wed Nov 13 15:42:14.834654 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     for r in self._runwsgi(req, res):
[Wed Nov 13 15:42:14.834669 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/hgweb/hgweb_mod.py", line 310, in run_wsgi
[Wed Nov 13 15:42:14.834807 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     for r in self._runwsgi(req, res, repo):
[Wed Nov 13 15:42:14.834818 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/version-control-tools/hgext/serverlog/__init__.py", line 432, in wrapped_runwsgi
[Wed Nov 13 15:42:14.834950 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     for what in orig(self, req, res, repo):
[Wed Nov 13 15:42:14.834960 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/hgweb/hgweb_mod.py", line 430, in _runwsgi
[Wed Nov 13 15:42:14.834977 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     return getattr(webcommands, cmd)(rctx)
[Wed Nov 13 15:42:14.834984 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/hgweb/webcommands.py", line 1220, in archive
[Wed Nov 13 15:42:14.835244 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     subrepos=web.configbool("web", "archivesubrepos"))
[Wed Nov 13 15:42:14.835260 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/archival.py", line 335, in archive
[Wed Nov 13 15:42:14.835353 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     write(f, 'x' in ff and 0o755 or 0o644, 'l' in ff, ctx[f].data)
[Wed Nov 13 15:42:14.835362 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/archival.py", line 308, in write
[Wed Nov 13 15:42:14.835378 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     archiver.addfile(prefix + name, mode, islink, data)
[Wed Nov 13 15:42:14.835384 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/archival.py", line 194, in addfile
[Wed Nov 13 15:42:14.835394 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     self.z.addfile(i, data)
[Wed Nov 13 15:42:14.835401 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/usr/lib64/python2.7/tarfile.py", line 2020, in addfile
[Wed Nov 13 15:42:14.835829 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     copyfileobj(fileobj, self.fileobj, tarinfo.size)
[Wed Nov 13 15:42:14.835841 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/usr/lib64/python2.7/tarfile.py", line 275, in copyfileobj
[Wed Nov 13 15:42:14.835859 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     dst.write(buf)
[Wed Nov 13 15:42:14.835865 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/usr/lib64/python2.7/tarfile.py", line 471, in write
[Wed Nov 13 15:42:14.835876 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     self.__write(s)
[Wed Nov 13 15:42:14.835882 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/usr/lib64/python2.7/tarfile.py", line 479, in __write
[Wed Nov 13 15:42:14.835891 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     self.fileobj.write(self.buf[:self.bufsize])
[Wed Nov 13 15:42:14.835897 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]   File "/var/hg/venv_hgweb/lib/python2.7/site-packages/mercurial/hgweb/request.py", line 353, in write
[Wed Nov 13 15:42:14.836029 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660]     res = self._write(s)
[Wed Nov 13 15:42:14.836051 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660] IOError: Apache/mod_wsgi failed to write response data: Broken pipe
[Wed Nov 13 15:42:14.843556 2019] [wsgi:error] [pid 15941] [remote 63.245.208.201:9660] Exception IOError: 'Apache/mod_wsgi client connection closed.' in <bound method _Stream.__del__ of <tarfile._Stream instance at 0x7fcdb7d01560>> ignored

As described in the bug, this command is causing load problems on
the public hgweb interface. Since we have xz archives available
for download elsewhere, and any archive can still be built from a
clone of the repo using hg archive, this commit disables the
webcommand to archive the repo server-side. We do this by setting
the web.allow-archive config to an empty value (i.e. no archives
allowed). This disables the webcommand on the server side and also
removes the various links on the web interfaces pointing at URLs
for the webcommand.
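As a rough illustration of what the empty value means, the sketch below reads an hgrc-style snippet and lists the archive types it allows. It assumes Python 3's `configparser` is close enough to Mercurial's own config parser for this simple case; the `allowed_archive_types` helper and both sample snippets are hypothetical, not the actual ansible-managed config.

```python
import configparser


def allowed_archive_types(hgrc_text):
    """Return the archive types listed under web.allow-archive (empty list if none)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)
    raw = parser.get("web", "allow-archive", fallback="")
    # Accept either commas or whitespace between entries.
    return [item for item in raw.replace(",", " ").split() if item]


# Empty value: no archive types allowed, so the webcommand is disabled.
DISABLED = "[web]\nallow-archive =\n"
# Hypothetical setting that would allow only zip archives.
ZIP_ONLY = "[web]\nallow-archive = zip\n"

print(allowed_archive_types(DISABLED))
print(allowed_archive_types(ZIP_ONLY))
```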

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED

This broke upstream web-platform-tests, which uses a URL like https://hg.mozilla.org/mozilla-central/archive/tip.zip/testing/profiles/ to get an archive containing just the profile settings that need to be applied when running tests in Firefox. AIUI this use case isn't covered by any of the proposed alternatives, which require downloading an entire source tree when we only want a few files. Is there a suitable replacement for this use case?
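For reference, the shape of that URL can be sketched as below. The layout is inferred only from the example URL in this comment; `archive_url` is a hypothetical helper, not something wpt or hgweb provides.

```python
def archive_url(base, repo, rev, archive_type, subdir=""):
    """Build an hgweb archive URL, optionally limited to one subdirectory."""
    url = "{}/{}/archive/{}.{}".format(base.rstrip("/"), repo.strip("/"), rev, archive_type)
    if subdir:
        # A trailing path restricts the archive to that part of the tree.
        url += "/" + subdir.strip("/") + "/"
    return url


profiles_url = archive_url(
    "https://hg.mozilla.org", "mozilla-central", "tip", "zip", "testing/profiles"
)
print(profiles_url)
```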

Flags: needinfo?(sheehan)

This is biting me too, though I may be able to work around it. There are sparse profiles in build/sparse-profiles/ that we use to clone a subset of files and directories from the tree in run-task.

At the moment we don't have a workaround. I've re-enabled zip archives for now so wpt is unbroken, and I'll continue to monitor the load.

If you can use a sparse checkout, that's preferred. However, I'd imagine w-p-t runs on the test machines, which likely don't have a copy of the repo cached.

Status: RESOLVED → REOPENED
Flags: needinfo?(sheehan)
Resolution: FIXED → ---

Thanks! This is upstream wpt CI, not the gecko CI, so we don't have any kind of checkout and I'm not really keen to create one; I just experimented with narrow clones and either I couldn't get it working, or it still tries to start from a bundle of the whole repo (I suspect the former).

The most obvious option if the archive feature goes away is to download each file one by one, which also seems suboptimal for everyone.

Or actually I guess what we'll do is get the relevant files as a taskcluster artifact and download from there. So if it's necessary to reenable this let me know and I'll work out how to do that.

Pushed by cosheehan@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/68d46e490098
ansible/hg-web: re-enable zip archives on hgweb

Status: REOPENED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
See Also: → 1597094

(In reply to Jordan Lund (:jlund) from comment #9)

> This bit Snap too: https://firefoxci.taskcluster-artifacts.net/GEl8wjxETuGohi2F9qx-9A/2/public/logs/live_backing.log

Hi Connor, this is blocking Snap releases. Do you know the status of it or when it will be resolved?

Flags: needinfo?(sheehan)

(In reply to Jordan Lund (:jlund) from comment #10)

> (In reply to Jordan Lund (:jlund) from comment #9)
>
> > This bit Snap too: https://firefoxci.taskcluster-artifacts.net/GEl8wjxETuGohi2F9qx-9A/2/public/logs/live_backing.log
>
> Hi Connor, this is blocking Snap releases. Do you know the status of it or when it will be resolved?

If you can switch to zip archives as a workaround, that would be ideal. They're still enabled on the server and seem to be more stable than the tarball archive generation. I'm working on a more permanent fix in the meantime.
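If the workaround is just a matter of changing the URL a script fetches, something like the sketch below could do the rewrite. It assumes the tarball URLs end in `.tar.gz` or `.tar.bz2` after the revision; `to_zip_archive_url` is a hypothetical helper and the example URLs are illustrative, not taken from the Snap build.

```python
import re

# Assumed tarball suffixes on hgweb archive URLs; ".zip" is the type still enabled.
_TARBALL_SUFFIX = re.compile(r"(/archive/[^/]+?)\.tar\.(?:gz|bz2)(?=/|$)")


def to_zip_archive_url(url):
    """Rewrite a tarball archive URL to its zip equivalent; leave other URLs alone."""
    return _TARBALL_SUFFIX.sub(r"\1.zip", url)


print(to_zip_archive_url("https://hg.mozilla.org/mozilla-central/archive/tip.tar.gz"))
```

The client would also need to unpack a zip instead of a tarball, which this sketch does not cover.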

Flags: needinfo?(sheehan)
Depends on: 1597928

Thanks for the hint, Connor! It made me realize Snap should just stop fetching the code this way 😃 More details in bug 1597928.

See Also: → 1656475