
openSUSE’s MirrorBrain and a New Lizard in China

August 29th, 2008 by

Do you know openSUSE’s MirrorBrain?  I have been working on it for over a year now. It is a mirror framework which is open source and can be used by anyone.

The other day, I received the following sentiment:

I fully appreciate your work. In my view openSUSE has the best managed download mirroring in place! Only few come close!

It was the admin of one of our mirrors who wrote this: a large one, which mirrors more than 100 projects besides openSUSE.

It is nice to see (and important for us) when mirror admins are happy. Mirrors are crucial to get openSUSE out to you. Without mirrors, we are nothing. Our little download server could not serve you on its own. download.opensuse.org receives 15,000,000 to 40,000,000 requests on a normal day. But together with the friendly organizations that mirror us, we have been serving at least 25-30 gigabytes per second (!) to you at peak times.

A lot happens behind the scenes to make sure that openSUSE is continuously and easily available. If you never actually notice anything about it, then it only means we are doing well!

For instance, I am always searching for new mirrors. One of the biggest recent achievements was that Coly Li, our Chinese friend, installed the first “real openSUSE” mirror in China: http://www.lizardsource.cn/. In mainland China, there are already several sites that mirror openSUSE; lizardsource.cn is now the first openSUSE-specific mirror there and the largest openSUSE mirror so far.

When talking to Coly about the situation in China, he provided the following insight:
(explanatory comment: GFW refers to the censorship system, nicknamed Great Firewall of China)

Our motivation is:

1) international internet connection is slow from China mainland, no matter GFW exists
2) Most of universities and institutes use CERNET, they can not connect to international internet directly.
3) South China and North China use different public internet networks, inter-connections is very slow.

There are several opensuse mirrors in China already, but they are 1) limited to a small group of people, or 2) slow for non-charge users, or 3) out of maintenance.

lizardsource.cn can be accessed from both universities, institutes, south China, north China. The download speed within China mainland is much faster, people from universities observed 200KBytes persec. That’s the advantage and importance of lizardsource.cn.

Some other mirrors I could acquire last month (good ones) were in Nicaragua, South Africa, Indonesia, Poland, Latvia and other countries.

On the more invisible side, last week I extended the mirror framework so that it can run in multiple instances on one machine; this may open up some interesting applications later, because we could run a separate redirector for separate file trees, each with a different set of mirrors.

Gerard Fàrras, one of our GSoC students, is working on incorporation of a metalink client into YaST/zypper. Once that is implemented, it will make our package installer much much more robust against all sorts of network issues. A working prototype exists!

Currently, I am researching a somewhat complicated idea to achieve a more fine-grained mirror selection scheme. More on that later, maybe.

The outdated wiki pages that list mirrors need to be replaced by real-time lists generated from the mirror database. I don’t know when I/we will get around to doing this. If anyone would like to hack on a web frontend for the mirror database (I am picturing a TurboGears app that integrates with the existing Python mirror toolbox), contributions would be most welcome; let me know if you are interested!

See http://mirrorbrain.org for more info about the framework we use. Info for site operators interested in mirroring us is to be found here.


26 Responses to “openSUSE’s MirrorBrain and a New Lizard in China”

  1. http://www.lizardsource.cn/ is down :(

  2. Some other mirrors I could acquire last month (good ones) were in Nicaragua, South Africa, Indonesia, Poland, Latvia and other countries.

    Indonesian openSUSE Community providing mirror because we got same problem as well as in China. download.o.o occasionally could not be accessed from Indonesia, and it was a big problem because most of new openSUSE user used one-click-install feature. We solve this problem by editing the one-click-install file, replace download.o.o by nearest mirror and then publish the one-click-install file on website, forum and mailing list.

    I don’t know is this possible or not, but it would be very nice if the mirror also provided front end of software.opensuse.org/search and search any software within the mirror.

    • poeml

      Vavai, I see the problem. It can in fact hit anyone, even if it is much more prevalent in regions like yours.

      One of the two large user groups of download.opensuse.org is libzypp (through YaST/zypper). I have a concept for how to deal with the issue you mention, but it hasn’t been implemented by anyone so far. It’s described here: http://en.opensuse.org/Libzypp/Failover
      (The other large user group being people wanting to download ISO images – they are in general not as much affected as libzypp, though.)
      A very important feature for libzypp, which I have wanted for a long time, is the possibility to configure local mirrors for preferred or additional use.

      software.opensuse.org/search is a different story. I think downtimes are less critical compared to download.opensuse.org, because security updates are not affected, and it’s rather needed for installing additional software. Is reachability of software.opensuse.org bad all the time? Or does it vary much? Is it due to slow international link or where is the bottleneck? You could approach the buildservice people and discuss the possibility of setting up a second instance of software.opensuse.org/search somewhere in your region, for load balancing and available as fallback. opensuse-buildservice at opensuse org or opensuse-project at opensuse org would be good contact points for this I think.

      As a sidenote, *any* concept for use of fallback servers / mirrors needs to take into account that the security of obtaining security updates can suffer, so it might be a better tradeoff to wait for the security update instead of getting it from an untrusted source. (I’m not only talking about packages, which are signed anyway, but also about the check for updates itself, where a rogue upstream source could claim “there are no new updates” and thereby keep you from installing an important update. This is also the reason why download.opensuse.org delivers some critical files always directly.)
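The idea of delivering some critical files directly while redirecting everything else can be sketched roughly as follows. This is only an illustration, not the actual redirector logic: the `DIRECT_PATTERNS` list, the `route` function and the example paths are assumptions made up for this sketch, and which files download.opensuse.org really treats as critical is not stated here.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files that should never come from a mirror,
# e.g. repository metadata that tells the client which updates exist.
DIRECT_PATTERNS = [
    "*/repodata/repomd.xml",
    "*/repodata/repomd.xml.asc",
    "*/repodata/repomd.xml.key",
]


def route(path, mirror_url=None):
    """Decide whether to serve `path` locally or redirect to a mirror.

    Returns ("serve", path) or ("redirect", full_mirror_url).
    """
    is_critical = any(fnmatchcase(path, pattern) for pattern in DIRECT_PATTERNS)
    if is_critical or mirror_url is None:
        # Critical file, or no suitable mirror known: deliver it ourselves.
        return ("serve", path)
    return ("redirect", mirror_url.rstrip("/") + path)
```

With this sketch, a request for a package is redirected, while a request for the update metadata is always answered by the trusted server itself.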

  3. neo

    Since Fedora already has a mirror manager that does what you want, why not use it instead of duplicating the work?

    https://fedorahosted.org/mirrormanager/

    It has a turbogears based administrative interface as well.

    • Beineri

      Does MirrorManager redirect downloads and generate metalinks?

    • poeml

      Neo, do you refer to Fedora’s MirrorManager web frontend, or to the complete mirror system? I am assuming the latter and replying as such.

      Fedora is not the only one having a mirror manager; there are other systems like Sourceforge, Mozilla.com, OpenOffice, Akamai, which all share a common goal (which always implies certain duplication), but still have different needs and implementations. “Does what you want” is a very relative thing :-) No, Fedora’s MirrorManager does not do what we want. But our system does. Even though I envy Fedora for their complete administrative interface ;)

      (The administrative interface could probably be grafted onto our database. It’s just that nobody has looked into that. Feel free to work on it)

      One of the unique features of our MirrorBrain is that it is a truly generic solution. It can be used for other purposes just fine, and is in no way tied to openSUSE content / DVD images / rpm packages, not even to mirrors. You could use it to host a network of shared video or image servers just fine. I’m actually confident that it will be adopted by other organizations sooner or later. This has been an important goal for me from the beginning. Of course, everybody already has “something” in place, and it is always work to switch…

      Fedora’s MirrorManager wouldn’t work for us because large parts of our file trees change too fast. In fact, the openSUSE build service strongly pushed the development, because of the virtually continuous stream of packages that it generates; but the trees containing Factory and security updates also have a high turnover rate, higher than mirrors can sync the trees. With a little bit of insight into the problems that large content distributors are fighting (for instance, FreeBSD, OpenOffice, or also Fedora) I can assure you that we can be very happy with the approach we use.

      Having a preconception of a “not invented here” syndrome? Don’t worry ;-) The last time I talked to the author of Fedora’s MM was maybe three days ago. It is very important to exchange ideas and views because, if you are building infrastructure like this, it is *very* difficult to find anyone to talk to at all.

  4. neo

    Yes, it does both.

    • poeml

      Neo: Yes and no. Redirection, yes; regarding updates, part of the mirror choice is actually done on the client side, though.
      Metalinks: Currently being worked on, so: not yet ;)

  5. manchette

    I oftentimes read people who kind of “compete” to find the best mirrors around,this on and on, then i read that download.opensuse.org had its own internal mechanism able to find according to the ip the best/fastest mirrors near to the user.
    Is this still true ?
    If so is it useful to use others mirrors than download.opensuse.org (e.g : skynet.be or gwdg.de … and so on )?
    Or using the orignal ones is enough as this magnificent mechanism does the work behind the scene for us ?

    • poeml

      Manchette, download.opensuse.org does indeed attempt to take as much work off your hands as possible. The criteria for a mirror’s usefulness are not only “reliable” and “fast”, but also which files it has, when it was last updated, and whether it is in a consistent state. With 1,000,000 files totaling a terabyte in size, no complete mirror of openSUSE exists in this world. download.o.o knows what’s on each mirror and picks the closest match for you.
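A minimal sketch of this kind of selection, under simplifying assumptions: the `Mirror` record, its `country` and `continent` fields, and the ranking rule are invented for illustration and are not taken from the MirrorBrain code. The point is only the order of the steps: first keep the mirrors that are reachable and actually have the file, then prefer the nearest of those.

```python
from dataclasses import dataclass, field


@dataclass
class Mirror:
    name: str
    country: str          # hypothetical: two-letter country code
    continent: str        # hypothetical: two-letter continent code
    files: set = field(default_factory=set)  # paths known to be on the mirror
    online: bool = True


def pick_mirror(mirrors, path, client_country, client_continent):
    """Return the nearest online mirror that has `path`, or None."""
    candidates = [m for m in mirrors if m.online and path in m.files]
    if not candidates:
        return None  # nothing suitable: the server would deliver the file itself

    def distance(mirror):
        # 0 = same country, 1 = same continent, 2 = anywhere else
        if mirror.country == client_country:
            return 0
        if mirror.continent == client_continent:
            return 1
        return 2

    return min(candidates, key=distance)
```

Note that a mirror in the client's own country is skipped if it does not carry the requested file yet, which is one reason why you may occasionally be sent to a mirror that seems far away.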

      Having said that, even though the server can make a good choice for you, it doesn’t know (and has no way of assessing) whether a chosen mirror is really usable for *you*, a given user, at a particular time. After all, the mirror could be shutting down or crashing at that very moment, or a network problem could make it unreachable.

      Therefore, a good download client is the key. For downloading ISO images, Metalink clients are the way to go. For installing packages and updates via YaST/zypper, the libzypp failover plan I mentioned (http://en.opensuse.org/Libzypp/Failover) is the way to go. There are enough mirrors to be used for fallback, aren’t there? Intelligent clients will use them.

      If you think you need to “hunt mirrors”, then there is likely something wrong. If you get broken files from a mirror, then we need to know about it so that the mirror can be fixed. (We can notify the mirror admin.) If you get assigned to a mirror that seems too far away, it could just be due to the fact that no other mirror has certain files (yet), and of course it could be a problem that you might want to report to us. You could report problems either via the ftpadmin at suse de mail address, or via Bugzilla (product openSUSE.org, component download infrastructure). We will investigate it then.

      • manchette

        Thanks for your answer, i do not have any problem. i just think the mechanism i described behind download.o.o is not know enough, so that people tend to advice each other “better” mirrors or tend to advice others mirrors to “ease the charge of the main one”.
        Is download.o.o capable to accept all these downloads ? i often read that choosing others mirrors will give me a quicker download, can i assume this is not true anymore ? Or never has been true maybe.
        It looks like you’d definitely advise download.o.o as the best choice (but in some special cases you described above ).

        • poeml

          Manchette:

          > Thanks for your answer, i do not have any problem. i just think the mechanism i described behind download.o.o is not know enough, so that people tend to advice each other “better” mirrors or tend to advice others mirrors to “ease the charge of the main one”.

          Yes, this is what’s still in the heads of many people.

          > Is download.o.o capable to accept all these downloads ?

          Totally, it is not even under load.

          > i often read that choosing others mirrors will give me a quicker download, can i assume this is not true anymore ? Or never has been true maybe.
          > It looks like you’d definitely advise download.o.o as the best choice (but in some special cases you described above ).

          Yes.

          In the distant past, there was one master download server, and everybody was
          advised to *not* use it. Everybody had to search for mirrors instead. Mirror
          lists were maintained by hand.

          Today, the mirrors are known by the master, and you simply let it assign you a
          mirror. Indeed you can expect that it makes a reasonable choice for you.

          However, one thing has not changed: a mirror that’s assigned could be broken at
          the very time, could suffer an attack or abuse, and it could become painfully
          slow or unavailable. This can happen at any time and it means that you need to
          try another mirror.
          But there usually is, when you download something from the Internet, no
          “feedback loop” which notifies the server whether your download was successful
          in the end. The server can’t help you with dealing with the situation, except
          that it can “try again” when *you* try again. Well, this is truly a general
          problem that affects all download clients, browsers, servers alike.

          If you wonder whether it wouldn’t be easy to make the client tell the server that a
          mirror didn’t work: no, it’s not. First, no standard client (the web browser
          you use) does it, and even if they did, it’d be difficult to distinguish
          problems that affect only the user from problems that really are with the mirror.
          It’d be hard to get useful data from it.

          The solution to this exists today, and is available to you. It’s just not well
          known. I’ll describe it in the following.

          (Sidenote: there may be other potential solutions you can think of, but for our
          demands it is a must that a solution be highly scalable, stateless, play well
          with proxies.)

          The principal solution is twofold. The responsibility of the server
          (download.opensuse.org) is to provide “knowledge”. The responsibility of the
          client (your download program) is to *use* this knowledge.

          In detail: download.opensuse.org can provide information about all potential
          download sources right away, and send checksums along which allow for
          verification of the download. With this information, the client has all it
          needs to be sure to successfully complete the download without ever having to
          query the server again for additional information. It also means the client can
          work fully automated without prompting the user for anything – until the
          download is done. Note, you don’t even have to manually check MD5 sums, like in
          the past. It’s already done by the client.
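To make this a bit more concrete, here is a small sketch of what a client might do with such a description. The sample document is modeled on the Metalink 3.0 XML layout, but the file name, mirror URLs and hash value are placeholders invented for this example, and `read_metalink` is a hypothetical helper, not part of any existing client.

```python
import xml.etree.ElementTree as ET

# Namespace used by Metalink 3.0 documents.
NS = {"ml": "http://www.metalinker.org/"}

# Illustrative sample only: names, URLs and the hash are placeholders.
SAMPLE = """<metalink version="3.0" xmlns="http://www.metalinker.org/">
  <files>
    <file name="example.iso">
      <verification>
        <hash type="md5">d41d8cd98f00b204e9800998ecf8427e</hash>
      </verification>
      <resources>
        <url type="http" location="de" preference="80">http://mirror-b.example.org/example.iso</url>
        <url type="http" location="cn" preference="100">http://mirror-a.example.org/example.iso</url>
      </resources>
    </file>
  </files>
</metalink>"""


def read_metalink(text):
    """Return one dict per file: its name, hashes, and URLs by preference."""
    root = ET.fromstring(text)
    files = []
    for entry in root.findall("ml:files/ml:file", NS):
        hashes = {h.get("type"): h.text.strip()
                  for h in entry.findall("ml:verification/ml:hash", NS)}
        # Highest preference first, so the client tries the best source first.
        urls = sorted(entry.findall("ml:resources/ml:url", NS),
                      key=lambda u: -int(u.get("preference", "0")))
        files.append({"name": entry.get("name"),
                      "hashes": hashes,
                      "urls": [u.text.strip() for u in urls]})
    return files
```

Everything the client needs (all sources plus checksums) arrives in this one response, which is why it never has to ask the server again.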

          This technology is called “Metalink”.

          The download program (the Metalink client) automatically tries other mirrors if
          one doesn’t work, or if it is too slow. The client verifies the checksum of each
          downloaded part, and automatically re-fetches broken segments from another
          mirror. This will guarantee that you get the correct file even if several
          mirrors had an outdated or broken file by accident.
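The behaviour described above can be sketched in a few lines. This is a simplified model, not how any particular Metalink client is implemented: mirrors are represented by plain functions instead of network connections, pieces are fetched one after another, and all names (`fetch_with_failover`, `memory_mirror`, `dead_mirror`) are made up for the example.

```python
import hashlib


def piece_hashes(data, piece_size):
    """SHA-1 of each fixed-size piece, as a server-side description would list them."""
    return [hashlib.sha1(data[i:i + piece_size]).hexdigest()
            for i in range(0, len(data), piece_size)]


def fetch_with_failover(mirrors, hashes, piece_size):
    """Fetch every piece, moving on to the next mirror when one fails.

    `mirrors` is a list of callables fetch(offset, length) -> bytes that
    may raise OSError when the mirror is unreachable.
    """
    pieces = []
    for index, expected in enumerate(hashes):
        offset = index * piece_size
        for fetch in mirrors:
            try:
                chunk = fetch(offset, piece_size)
            except OSError:
                continue  # mirror is down: try the next one
            if hashlib.sha1(chunk).hexdigest() == expected:
                pieces.append(chunk)
                break     # piece verified: go on to the next piece
            # checksum mismatch: broken or outdated copy, try the next mirror
        else:
            raise RuntimeError(f"no mirror delivered a valid piece {index}")
    return b"".join(pieces)


def memory_mirror(data):
    """Stand-in for a real mirror: serves byte ranges from memory."""
    return lambda offset, length: data[offset:offset + length]


def dead_mirror(offset, length):
    """Stand-in for a mirror that cannot be reached."""
    raise OSError("mirror unreachable")
```

For example, with one unreachable mirror, one mirror holding a corrupted copy and one good mirror, the download still completes with the correct content; only the damaged piece is fetched again from the good mirror.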

          Therefore, I heartily recommend using a Metalink client when downloading openSUSE.
          It’ll give you a happy download. It’s worth it.
          Yes, it’s not a utopia: it’s there :)

          If you use Firefox, install this Firefox extension.
          If you are a command-line junkie, there is aria2c (packages).
          Check http://metalinker.org/ for more clients.
          Our wiki has instructions: http://en.opensuse.org/Metalink

          Ask your browser vendor for native metalink support.

          The same technology could be leveraged for YaST/zypper. Your life will be a lot
          easier once YaST/zypper can sensibly deal with network and mirror failures.
          This is still a utopia, although a working prototype exists, thanks to
          Google, who allowed a GSoC student to work on it. See
          http://en.opensuse.org/Libzypp/Failover for more information about this and the
          status quo.

  6. phobe

    This is basically the only thing left preventing me from coming back to Suse. I like having my choice of mirrors embedded in software management.

    • poeml

      Phobe, do you mean in libzypp? I agree we need support in libzypp (i.e. in YaST, zypper) to configure arbitrary mirrors for preferred or additional usage. This is a very, very important feature which I hope will be implemented.

      If you want to see all the sources for downloading ISO images, you have the full choice – just browse to the download server (e.g. the 11.0 DVD dir) and click on a mirror link, like this one. Note that *real* mirrors will be listed, which have the file – not *potential* ones.

  7. You’re doing cool and important work! Thank you.

  8. log111

    Good Work!

  9. neo

    “One of the unique features of our MirrorBrain is that it is a truly generic solution. It can be used for other purposes just fine, and is in no ways tied to openSUSE content”

    Not so unique. I am using Fedora’s mirror manager to mirror non-Fedora content. So is Dell.

    “(The administrative interface could probably be grafted onto our database. It’s just that nobody has looked into that. Feel free to work on it)”

    You work on it if you need it. I don’t. I was merely pointing out that one already exists that you could use or else reinvent the wheel. Not so new for Novell anyway.

  10. http://suse.mirrors.tds.net/pub/opensuse

    All i use for OSS, non-OSS, and updates. Fast, reliable, down right killer. Used to use mirrors.kernel.org, but found the initial contact to be slow, and sometimes not up-to-date.

  11. thx for infos. Thx to manchette for questions ;) french community is here too :D

  12. LiuMan

    oh,it is a great news for us who living in China.
    So,we can download the update more faster then before.
    thanks,Lizard.cn

  13. Vincent Liu

    Good!
    But I found it not so fast at my location (South China; Xiamen, to be specific).
    The lupaworld is much faster than lizardsource.cn.