An absolutely filthy recipe for mirroring Debian repositories

May 12, 2020 by Lucian Mogosanu

I've been burning to get this off my head and onto this piece of virtual paper for more than a month now. Alas, I didn't have the time then, and I'm purposely putting other work aside to get this done now and... what else can I say; the winter's hard, the covid's big, the year shows good signs. Anyway, let us proceed.

Debian is, as many of you may know, a Linux-based operating system. I've been using it for a while now, a use born out of a mix of chance, the project's age, my own habits, setup time and, to a very small degree, the tools it provides. No, I don't trust all its utilities down to the binary level, hence the "absolutely filthy" in the title, but it's worked for me for a while now, and by "worked" I mean that it does what I want and doesn't do what I don't want it to, and by "it" I mean... well.

The project has seen its share of rape throughout time, the raping coming from both within and without. Thus at some point I stopped installing Debian on new systems, then I played with it just recently for bootstrapping processes, then I got pissed at the whole "computing" thing and thought that hey, I might as well try to mirror this shit.

And by "this shit" I mean the last version of Debian I've used, 7.0 (Wheezy), though I expect the recipe here will work for any APT-based operating system, including Ubuntu et al. Speaking of which: APT stands for the very pompous "Advanced Package Tool", a wrapper over "dpkg" (Debian Package), which is a tool for managing such things as "software packages". APT is mostly useful because it can grab stuff from "repositories", regardless of whether said repositories are to be found on your USB stick or on a web server.

This lengthy introduction being laid out, my goal is to steal a Debian release in its entirety, so that I may later use it to set up robust (albeit not necessarily trustworthy) systems based on this rusty ol' piece of shit. I wasn't kidding when I said I'm maintaining Feedbot-Linux, and I don't particularly care about the variety of "truckload of unknown unknowns" I'm grabbing, so... here I am. Anyway, these are the necessary ingredients:

  • a GNU/Linux system, preferably of Debian variety[1], used as a bootstrapper;
  • a tool for downloading e.g. pieces of archive.debian.org on one's machine; and
  • a minimal install Debian ISO image, e.g. the "netinst" image,

such that in the end both the "installer" and the resulting operating system will be able to grab packages off the "bootstrapper".

In order to make this possible, one should be able to mirror the Debian archive, which would be a curl away, except... I don't want to mirror 600GB for 150GB's worth; more precisely, it should be possible to mirror only the Wheezy repository, not those of other versions of Debian. To this end, I propose employing one abject Perl script going under the name of "apt-mirror", which I have mirrored here for my own pleasure[2].

Before using this script, I found that I needed to modify some paths that are apparently hardcoded in there; or at the very least, attempting to change them via the usual command-line magic didn't do anything for me. So I first changed the value of the base_path config variable (line 95) to something more fitting, e.g. some place in my home path. Then I changed the value of the config_file variable (line 120) to point to a custom mirror.lst, because I don't want to have to edit a global system config file to download some stuff off the Internet.

Regardless of your configuration options, make sure that the mirror.lst file is in its proper place and that it contains some minimal config info. For example, I set the following:

set defaultarch amd64
set nthreads 2
set base_path /my/attempt/to/override/base_path

followed by a bunch of APT sources.list entries, e.g.:

deb-amd64 http://archive.debian.org/debian/ wheezy contrib main non-free
deb-amd64 http://archive.debian.org/debian/ wheezy-backports contrib main non-free
deb-i386 http://archive.debian.org/debian/ wheezy contrib main non-free
deb-i386 http://archive.debian.org/debian/ wheezy-backports contrib main non-free
deb-src http://archive.debian.org/debian/ wheezy contrib main non-free
deb-src http://archive.debian.org/debian/ wheezy-backports contrib main non-free

So I just grabbed everything related to x86, plus sources, adding up to about 130GB's worth of Wheezy.
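The entries above are regular enough to be generated rather than typed. What follows is a minimal sketch, not anything that ships with apt-mirror: the function name mirror_list and all of its parameters are made up for illustration, and it writes the "set" lines with whitespace between key and value, which I'm assuming is the form the script's config parser expects.

```python
def mirror_list(base_path, archs, suites,
                url="http://archive.debian.org/debian/",
                components=("contrib", "main", "non-free"),
                nthreads=2):
    """Return the text of an apt-mirror config: every arch x suite, plus sources.

    All names here are hypothetical; only the line formats are taken
    from the example config in the article.
    """
    lines = [
        f"set defaultarch {archs[0]}",
        f"set nthreads {nthreads}",
        f"set base_path {base_path}",
        "",
    ]
    comps = " ".join(components)
    # One "deb-<arch>" entry type per architecture, then "deb-src" for sources.
    kinds = [f"deb-{arch}" for arch in archs] + ["deb-src"]
    for kind in kinds:
        for suite in suites:
            lines.append(f"{kind} {url} {suite} {comps}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(mirror_list("/home/me/debian-mirror",
                      ["amd64", "i386"],
                      ["wheezy", "wheezy-backports"]), end="")
```

Redirect the output into your mirror.lst and adjust to taste; dropping "i386" or the backports suite from the arguments is the obvious way to shave off some gigabytes.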

Assuming you've done all this, you now just have to fire up apt-mirror, then sit and wait for it to download everything. It supposedly also does some setup afterwards using some post-mirror scripts, but I didn't find the need for any such thing, since everything I needed was in:

$base_path/mirror/$url
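A cheap sanity check on what landed under that path is to compare the downloaded files against the hashes listed in a Packages index. The sketch below is hedged accordingly: mirror_dir assumes that "$url" on disk means the host name plus path with the scheme stripped, all function names are hypothetical, and a match only tells you the download is consistent with the index, not that either of them deserves any trust.

```python
import hashlib
from pathlib import Path
from urllib.parse import urlparse


def mirror_dir(base_path, url):
    """Map a repository URL to the directory assumed to hold its mirror."""
    parts = urlparse(url)
    return Path(base_path) / "mirror" / parts.netloc / parts.path.strip("/")


def parse_packages(text):
    """Yield (filename, sha256) pairs from the text of a Packages index."""
    for stanza in text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines, e.g. the body of a long description.
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key] = value.strip()
        if "Filename" in fields and "SHA256" in fields:
            yield fields["Filename"], fields["SHA256"]


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks, so that large packages don't eat all the RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def check(repo_root, packages_text):
    """Return the files that are missing or whose hash doesn't match the index."""
    bad = []
    for filename, expected in parse_packages(packages_text):
        path = Path(repo_root) / filename
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(filename)
    return bad


# Example use (paths are assumptions about the layout, adjust as needed):
#   import gzip
#   root = mirror_dir("/my/base_path", "http://archive.debian.org/debian/")
#   index = root / "dists/wheezy/main/binary-amd64/Packages.gz"
#   with gzip.open(index, "rt", encoding="utf-8") as f:
#       print(check(root, f.read()))
```

An empty list means every file named in that index exists and hashes to what the index says. Expect this to take a while on the full archive, since it reads every package once.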

The final step in this recipe involves verifying the thing, not that this is actually possible. Still, you should be able to run some minimal tests, e.g. fire up a web server to serve the files you've just downloaded, then fire up the installer on a machine in the local network, point it at the newly-mirrored repository and grab a coffee until the install is done[3].
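Any web server will do for this test. As one hedged possibility, here is a sketch that uses nothing but Python's standard library; make_server and sources_line are hypothetical helpers, the port is arbitrary, and the directory you hand to make_server is assumed to be the one that contains the "debian" tree, so that the URL path matches what the installer expects.

```python
import functools
import http.server


def make_server(directory, host="0.0.0.0", port=8080):
    """Build a threaded HTTP server exposing `directory` read-only."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


def sources_line(host, port, suite="wheezy", path="debian",
                 components=("main", "contrib", "non-free")):
    """Return a sources.list entry pointing a client at the local mirror."""
    return f"deb http://{host}:{port}/{path}/ {suite} {' '.join(components)}"


# Example use (the address and directory are placeholders, not real values):
#   server = make_server("/my/base_path/mirror/archive.debian.org")
#   print(sources_line("192.168.1.10", 8080))
#   server.serve_forever()
```

This is fine for a test on the local network and nothing more; it has no access control whatsoever, so don't leave it listening on anything that faces the Internet at large.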

That's about it. You may have noticed the "deb-src" in my list; my guess is that these people must have (had) some process to bootstrap the whole system from sources... and while in that field the alternatives are clearly superior, I guess it doesn't hurt to look.


  1. I haven't tested it on other systems, nor do I have any plans to. 

  2. Just in case you're asking: no, I'm not going to sign, maintain etc. such nonsense. Read the script on your own, or you can let it burn down your house for all I care. 

  3. Or until it fails, in which case I may be able to help, depending on the specifics of your hardware setup. So far the process worked great for me on 2012-era Intels, for example. 

Filed under: computing.

8 Responses to “An absolutely filthy recipe for mirroring Debian repositories”

  1. Was there supposed to be a link to the resulting mirror here?

  2. #2:
    spyked says:

    Huh, I forgot to discuss this, sorry.

    Stan, I would really like to share this with my L1 somehow, but I don't want to make it available to the Internet at large. I can see how this sorta makes me a jerk, except once in a while I open my various logs and mailboxes and see how "the Internet at large" tries to break into my machines, fill them with spam and so on and so forth. This gives me no faith in that "old Internet vibe", so hell no, I won't help the various randos do whatever it is they're doing. Besides, it's not like this is any secret: they can mirror the official archives while those are up. I suppose I'm already doing the WWW a great service by posting the recipe on a site without ads, JS and other diseases.

    If you're interested, I guess I can GPG you a tar of the repository, or upload the archive to one of your machines, or some other alternative you think might work. Let me know if you are.

  3. Re #2:

    Thanks for the offer, but I have my own mirror of the Gentoo which I actually use. Presently I am not using Debian anywhere.

    For other folks (who actually use Debian..) I suppose you could put up a passworded thing.

    FWIW my Gentoo mirror thus far hasn't cost me appreciable BW. Several (unknown) folks grabbed the whole thing; a few added it to their portage config and occasionally fetch packages. But this hasn't added up to backbreaking load.

  4. #4:
    spyked says:

    Regarding how to provide this: I guess I could, though at the moment I don't know how Debian's APT can be configured to deal with password-protected mirrors. If someone really wants this, I'll look into it.

    Regarding bandwidth usage: it's good to know, perhaps it'd be no trouble to leave it public. Still, I'd like people to make friends before asking for this, or even by asking for it, if only because then I'd know that the trouble to set it up wouldn't have been for nothing. I know that in itself is no big trouble after all, and I'm not doing anyone a favour either, but as I've said above, I have no love whatsoever for anonymouses.

    Besides, if no one cares enough to ask, then on one hand it would have made no sense to make it public after all, while on the other, I hope that it might encourage a few to try and replicate the experiment in this article.

  5. Re #4 -- I share in the disdain for leechers (none of the people who grabbed my mirror bothered to put up their own, thus far) but so far found that it isn't worth the effort to keep'em out.

  6. [...] http://thetarpit.org/2020/an-absolutely-filthy-recipe-for-mirroring-debian-repositories << The Tar Pit -- An absolutely filthy recipe for mirroring Debian repositories [...]

  7. [...] starting where we left off, we have a fresh, clean1 Debian system and we want to install TRB; how do we do [...]

  8. [...] to either maintain SBCL or... take another stab at rolling out my own? I guess I'm already doing one or the other, only this imposes a significant burden upon my brain; or as you can see, I'm 3369 [...]
