
Best Way to Download openSUSE

December 16th, 2008 by

For most people, downloading traditionally looks like this:

  1. looking at a traditional, more or less static mirror list, and picking a mirror :-(
  2. trying the mirror and seeing that it is too slow, outdated, or not reachable :-(
  3. looking at the mirror list again, and picking another mirror :-(
  4. downloading with a web browser or FTP program
  5. restarting a failed download, after losing the network connection for some reason :-(
  6. ditching the download because it never finishes, and starting from scratch from another mirror :-(
  7. finally having a completed download, but for some reason it doesn’t install… :-(
  8. finding the MD5 sum and manually verifying the download :-(
  9. finding it broken and not knowing whether to start from scratch, repair the download with rsync, … :-(
  10. scratching your head… and being frustrated :-(

Proceeding manually like this is no longer needed. At least not with openSUSE.

All you need is a Metalink client. This is a wonderful technology that fixes all the above issues, and makes downloading “just work”. A Wikipedia article explains how that is achieved.

The openSUSE download server fully supports this technology by using MirrorBrain. MirrorBrain is an open source download redirector and metalink generator that supports all advanced Metalink features, such as embedded Torrent links, verification hashes, cryptographic signatures, and transparent negotiation, so that no separate links are needed on our web sites. Most of these features were added during the course of 2008.

There are a number of Metalink client programs out there. There is a Firefox extension called DownThemAll, which works in Firefox on all platforms. There is aria2, a command-line program. Our wiki has a list with more clients. I tend to recommend aria2, because it is the most powerful one. It is very simple to use, nevertheless.

aria2 deserves special notice, because it fully supports all the goodies that one might think of. These include:

  • downloading from several mirrors at the same time (so it also makes you faster) :-)
  • automatically noticing mirror problems, and resuming from other mirrors :-)
  • simultaneously downloading via Peer-to-Peer (BitTorrent) :-)
  • checking transferred data for errors not only at the end, but already during the download: each part of the file that has arrived is checked right away, and if it is found to be broken, it is scheduled to be refetched from another mirror :-)
  • creating a local *.asc file which contains the cryptographic signature that can be used to verify the authenticity of the file :-)
  • automatically noticing if a server supports metalinks (if not, it will just act as “normal” download client)
  • being robust against all sorts of network failures :-)
  • avoiding head-scratching of its user :-)
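
The per-piece checking mentioned in the list above can be sketched roughly as follows. This is only an illustration of the idea, not aria2's actual code: the function names, the 256-byte piece size and the in-memory "file" are all assumptions made up for the example.

```python
import hashlib

def piece_hashes(data, piece_len):
    """SHA-1 of every piece, as a publisher might compute them for a metalink."""
    return [hashlib.sha1(data[i:i + piece_len]).hexdigest()
            for i in range(0, len(data), piece_len)]

def find_bad_pieces(received, expected, piece_len):
    """Return the indexes of pieces whose hash does not match the expected one."""
    actual = piece_hashes(received, piece_len)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]

# Hypothetical data: a 1024-byte "file" split into four 256-byte pieces.
original = bytes(range(256)) * 4
expected = piece_hashes(original, 256)

# Simulate a transfer error in byte 600, which lies in piece 2 (bytes 512..767).
corrupted = bytearray(original)
corrupted[600] ^= 0xFF

bad = find_bad_pieces(bytes(corrupted), expected, 256)
print(bad)  # only these pieces would have to be fetched again
```

The point of the design is that a single damaged piece costs one small refetch, possibly from a different mirror, instead of a restart of the whole download.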

Both aria2 and MirrorBrain are “location aware”, and work together to select mirrors that are as close to you as possible. In addition, mirrors known to be more powerful are assigned more users.
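
To give a feeling for what "location aware" plus "more powerful mirrors get more users" can mean, here is a small sketch. It is not MirrorBrain's real selection algorithm; the mirror list, the weights and the `pick_mirror` function are hypothetical and only illustrate weighted selection among nearby mirrors.

```python
import random

# Hypothetical mirror database: host, country code, and a relative capacity weight.
MIRRORS = [
    {"host": "mirror-a.example.org", "country": "de", "weight": 100},
    {"host": "mirror-b.example.org", "country": "de", "weight": 50},
    {"host": "mirror-c.example.org", "country": "us", "weight": 200},
]

def pick_mirror(client_country, mirrors, rng=random):
    """Prefer mirrors in the client's country; among those, pick by weight."""
    nearby = [m for m in mirrors if m["country"] == client_country]
    candidates = nearby or mirrors  # no nearby mirror: fall back to all of them
    weights = [m["weight"] for m in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# A client in Germany gets one of the two German mirrors, and
# mirror-a is chosen about twice as often as mirror-b.
print(pick_mirror("de", MIRRORS)["host"])
```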

What else do you need to know? Not much. The command that you run to download an image is as simple as:


aria2c http://download.opensuse.org/distribution/11.1/iso/openSUSE-11.1-DVD-i586.iso

(For some other clients, you need to append “.metalink” to the URL.)

Note that aria2 tries to maximize utilization of your Internet connection for download bandwidth. Most people want this, but it may be unwelcome if you want to use the connection for other work, or if you are in a company with shared Internet access. In that case, use aria2’s -C command line option to limit the number of servers being used simultaneously.

Special note for Torrent users: you don’t need to bother downloading Torrent files. aria2 does this automatically… since the Torrent link is embedded in the Metalink!

If you want to see what the magic behind all this is, look at http://download.opensuse.org/distribution/11.1/iso/openSUSE-11.1-DVD-i586.iso.metalink with an editor. You’ll see an XML file containing everything that the Metalink client needs. This file transfers the knowledge of the download server (and mirror database) to the client. With this knowledge, the client can work its way to a successful download even under adverse circumstances. In contrast, a traditional HTTP redirect to a mirror conveys only minimal information: one link to one server. There is no provision in the HTTP protocol to handle failures, or to add checksums that make problems detectable. Metalink is documented in an Internet Draft.
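
For the curious, here is a minimal sketch of how a client could read such a file. The XML below is a hand-written, shortened sample and not what the openSUSE server returns: the hosts, the file name and the hash are placeholders, and the element names follow the Metalink 3.0 format as I understand it, so compare with the real .metalink file before relying on it.

```python
import xml.etree.ElementTree as ET

NS = {"ml": "http://www.metalinker.org/"}  # Metalink 3.0 namespace (assumed)

# Hand-written sample; all hosts and values are placeholders.
SAMPLE = """<metalink version="3.0" xmlns="http://www.metalinker.org/">
  <files>
    <file name="example.iso">
      <size>1024</size>
      <verification>
        <hash type="md5">00000000000000000000000000000000</hash>
      </verification>
      <resources>
        <url type="http" location="de" preference="100">http://mirror-a.example.org/example.iso</url>
        <url type="ftp" location="us" preference="90">ftp://mirror-b.example.org/example.iso</url>
        <url type="bittorrent" preference="100">http://mirror-a.example.org/example.iso.torrent</url>
      </resources>
    </file>
  </files>
</metalink>
"""

def list_resources(xml_text):
    """Map each file name to its download resources, highest preference first."""
    root = ET.fromstring(xml_text)
    result = {}
    for f in root.findall("ml:files/ml:file", NS):
        urls = []
        for u in f.findall("ml:resources/ml:url", NS):
            urls.append({
                "type": u.get("type"),
                "location": u.get("location"),  # None when the attribute is absent
                "preference": int(u.get("preference", "0")),
                "url": (u.text or "").strip(),
            })
        urls.sort(key=lambda r: r["preference"], reverse=True)
        result[f.get("name")] = urls
    return result

resources = list_resources(SAMPLE)
for r in resources["example.iso"]:
    print(r["preference"], r["type"], r["url"])
```

Note how the Torrent link is just one more resource next to the HTTP and FTP mirrors, which is why a Metalink client can use all of them together.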

Many thanks to Tatsuhiro Tsujikawa for aria2!

This technology would be even easier to use if web browsers implemented native support for it. Let’s hope that we will see that in the future. The technical challenges are solved and the way is paved. Ask your favourite browser vendor for it today…!

And since this is so powerful, we intend to employ it for other downloads as well: those done by the openSUSE package management tools, YaST and zypper. A prototype for this is available in openSUSE 11.1. Please test it; it is enabled by installing aria2 and setting ZYPP_ARIA2C=1 in the environment.


13 Responses to “Best Way to Download openSUSE”

  1. Awesome stuff Peter! Thanks a lot for sharing this one.
    Seems to me that aria2 is what I’d call a “eierlegende Wollmilchsau” ;)

  2. MirrorBrain + aria2 = killer download combo!

    • rjladyman

      “And since this is so powerful, we intend to employ it for other downloads as well — those done by the openSUSE package management tool, YaST respectively zypper.”

      All this is fine, as long as it doesn’t bypass all the lovely squid caching we do for our network users – get the iso / updates once and every other machine gets them from the cache. If it does prevent caching, it would be ideal if the ‘aria2’ method can be made optional (and / or a reversion to previous behaviour is available as an option). The ‘-C’ option will also need to be available as an external option for the above tools as well.

      • Peter Poeml

        Don’t worry, it doesn’t bypass proxy caches at all.

        When downloading package metadata or packages, it’s all still plain old HTTP, no torrent involved. Just with added metalink goodies.

        Torrents are only relevant for downloading iso images.

        And yes: allowing for tuning with -C and similar things, in the context of aria2c used from libzypp, is one of the requirements in my view as well.

  3. Nicolas

    Just want to correct a little factual inaccuracy :)

    “…no provision in the HTTP protocol [...] to add checksums that make problems detectable.”
    There are two ways to add checksums in HTTP.

    One is to use the Content-MD5 header, defined in RFC2616 (HTTP/1.1), section 14.15. It kinda sucks. Only supports MD5. It applies to the body of the HTTP response, which may or may not be the same as the full file (for example due to Range, or transformations like compression). The server may waste CPU calculating it even when the client won’t use it, because the server has no way to know if the client will use it.

    The other way is specified in RFC 3230. The client has to first request a checksum (which may be MD5 or SHA-1, and allows future extensions), and the server will reply with the checksum of the whole file, even if the returned data is just a part of the file (due to Range).
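
A rough sketch of the RFC 3230 mechanism described in the comment above: the client sends a Want-Digest request header, and the server answers with a Digest header carrying a base64-encoded checksum. The helper names are made up for the example, no real server is contacted, and the Digest value is computed locally just to have something to check against.

```python
import base64
import hashlib

# What a client could send to ask for a checksum (preference given via q values).
REQUEST_HEADERS = {"Want-Digest": "SHA;q=1, MD5;q=0.5"}

def parse_digest_header(value):
    """Split a value like 'MD5=...,SHA=...' into {ALGORITHM: base64_digest}."""
    out = {}
    for item in value.split(","):
        # partition() cuts at the first '=', so base64 padding stays intact.
        algo, _, b64 = item.strip().partition("=")
        out[algo.upper()] = b64
    return out

def digest_matches(data, header_value):
    """Check data against the first supported algorithm found in the header."""
    digests = parse_digest_header(header_value)
    algorithms = {"SHA": hashlib.sha1, "MD5": hashlib.md5}
    for name, func in algorithms.items():
        if name in digests:
            return func(data).digest() == base64.b64decode(digests[name])
    return False  # no algorithm we understand

# Stand-in for a server response: the digest of the whole (hypothetical) file.
file_data = b"pretend this is an ISO image"
digest_header = "SHA=" + base64.b64encode(hashlib.sha1(file_data).digest()).decode("ascii")

print(digest_matches(file_data, digest_header))
```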

  4. TSU2

    Howdy Peter,
    Just spent a couple hours digging up all the interesting stuff I can about Metalinks and it sure does look like downloading SuSE files would be a unique experience primarily due to the implementation of MirrorBrain and if true the use of the Akamai network.

    But, I wonder about the potential for other scenarios where the Servers aren’t widely deployed geographically, mostly HTTP/FTP (non-segmented downloads and don’t upload) instead of Torrent and don’t benefit from Akamai technology.

    Metalink seems to only attempt to download from the most opportune source but the truth of the matter is that only yields a benefit in either distributing Server load or where there is ample excess Server capacity. In fact it looks particularly limited if the Servers don’t support “segmented downloads” where large files are broken down into individual small chunks which can be downloaded in any order and can be retrieved by anyone else purely because that little bit is available.

    I can’t see any kind of advantage over a robust and virulent Torrent swarm at all… Although initially it might look great that a Client has so many options to download but the fact of the matter is that if the client is not uploading (like Torrent clients) or too many people are connecting to HTTP/FTP downloads then the benefit could be marginal if at all.

    If you see a hole in my thinking, pls comment… :)

    • Nicolas

      Servers always support “segmented downloads”; you’d have to do quite a few configuration tweaks to make Apache *not* support them. I don’t understand why you mention HTTP as “non-segmented downloads”.

      Metalink doesn’t download from “the most opportune source”, it downloads from 5 most opportune sources at the same time.

      It’s true that it has no advantage over a torrent swarm, but it doesn’t claim that. It’s supposed to be better than giving a huge list of mirrors and an MD5 hash to check manually. Torrents are a different beast.

    • Peter Poeml

      99% of servers support segmented downloads. In HTTP, it’s implemented as so-called byteranges, and even though it is an optional part of the HTTP/1.1 specification it is supported by every webserver. It is a valid configuration to disable it but less than 1% of server admins do so.

      Theoretically, a Torrent swarm could be a replacement, but only theoretically. In practice, for most downloads there is no Torrent swarm. For some larger, popular files it is feasible for them to exist but it is unrealistic for them to ever exist for a myriad of smaller files that we are also serving.

      Metalinks can play all their advantages with the existing infrastructure of 10-150 mirrors that serve files. In those cases where Torrents exist (iso images) it is a welcome addition that can be also employed by metalinks.

      Whether Akamai servers take part or not doesn’t make a functional difference; it is just one server (farm) more that contributes to handling the user requests.
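
To make the "segmented downloads" via HTTP byte ranges from this thread a bit more concrete, here is a small sketch that splits a file into Range header values, one per segment, so each segment could be requested from a different mirror. The function name and the numbers are made up for the example; it only builds the header values and does not perform any requests.

```python
def split_into_ranges(size, segments):
    """Return HTTP Range header values that together cover bytes 0..size-1."""
    base, extra = divmod(size, segments)
    ranges, start = [], 0
    for i in range(segments):
        # The first `extra` segments get one byte more, so nothing is left over.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more segments than bytes
        end = start + length - 1  # the end position of a byte range is inclusive
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges

print(split_into_ranges(1000, 3))
# ['bytes=0-333', 'bytes=334-666', 'bytes=667-999']
```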

  5. Thailandian

    What a breakthrough Peter!

    I’m currently downloading 11.1 at 50-70 kB/s in Thailand, even though the Thai mirrors haven’t got the new distribution yet.

    Just a note about aria2 – for those interested, there are three gui’s available, albeit in early stages of development. Here’s a link:

    http://aria2.sourceforge.net/#guifrontends

    As a partial response to TSU2, you may well be correct that a “robust and virulent Torrent swarm” is ideal, but that is only the case for a highly popular torrent at its peak. After I’ve got my shiny new openSUSE 11.1 up and running, I’ll want to download a whole raft of niche packages that I seriously doubt would have such swarms.

    Moreover, since metalinks include torrents anyway (if they exist), using metalink clients should help swarms get nice and virulent that much more quickly. It certainly looks to me as if aria2 is uploading. Here’s a snippet of aria2’s output:

    [#3 SIZE:702.7MiB/4,442.5MiB(15%) CN:45 SPD:65.59KiB/s UP:32.45KiB/s(474.5MiB) ETA:16h13m09s]
    FILE: ./openSUSE-11.1-DVD-x86_64.iso

    One thing I’m curious about though, does the metalink update during the download process? For example, if the Thai mirrors come online before my download is complete, will aria2 “know” about that?

    • Chris

      I’m from Germany, hi!

      OK, I believe aria2 downloads the metalink file at the beginning and does not update it at any point, so it would not know: it fetches all the data it needs at the start of the download and does not update its knowledge. I was using aria2 as well for openSUSE 11.1, and I needed 5 hours, 20 minutes at around 200Kb/s average speed ;-)

  6. sami

    Hi all,
    Well, I’m new to the Linux world. A month ago I downloaded openSUSE 11.0, and it’s simply amazing and working perfectly…
    I need some advice: I downloaded the DVD version of the new release 11.1 two times and burned it, and I still have the same problem. At the first step of the installation I get an error saying that the SHA1 checksum of one file (in the boot folder, if I remember correctly) is wrong, and that I should continue only if I trust the source…
    When I choose yes, I trust the source, the installer has a completely different look, blue and black, and works differently from the older installation… Is there any hint? I even checked the MD5 checksum of the complete ISO, and it is good.

    Regards,
    Sam

    • Otto

      Hi all!

      I have the same trouble as Sam.

      Downloaded openSUSE 11.1 DVD (via Torrent link on official site), burned the image and now I am getting the same SHA1 sum error as Sam.
      The checksum seems to be OK, but the installer keeps complaining.

      Do I need to get the image some other way (without Torrent = painfully slow), or is there another way?

      Thank you,
      Otto

      • Peter Poeml

        Did you guys download with a metalink client? I suppose so, since you commented at this place? One of you talks about 11.0 installation, one about 11.1. Anyway, I strongly suggest that you create a bug report if you have trouble installing the downloaded image.