Netlib is a repository for mathematical software such as BLAS, LAPACK and, most importantly =), AMPL Solver Library (ASL). The software can be retrieved using a number of protocols and, as I found out, the download speed can vary greatly. For example, it took me 10 minutes to retrieve the entire ASL by FTP using wget, while other options reduce this time to under a minute. This is why I decided to do a small comparison of different methods of retrieving software from Netlib.
The slowest way is to retrieve individual files by FTP:
$ time wget --recursive ftp://netlib.org/ampl
...
Total wall clock time: 10m 9s
Downloaded: 880 files, 45M in 2m 26s (317 KB/s)
real 10m8.855s
user 0m0.804s
sys 0m3.704s
It turned out that a much faster method is to use HTTP instead of FTP. This was a complete surprise to me, considering that FTP was designed for file transfer. Maybe it's just an issue with wget? If you have any ideas, please let me know in the comment section below.
$ time wget --recursive --include-directories=ampl \
http://www.netlib.org/ampl
...
Total wall clock time: 1m 59s
Downloaded: 714 files, 41M in 30s (1.37 MB/s)
real 1m58.986s
user 0m0.476s
sys 0m2.856s
The --include-directories=ampl option ensures that only the content of the ampl directory is downloaded and that files in other locations referenced from HTML files are ignored.
However, there is an even faster method, which relies on Netlib's ability to provide whole directories as compressed files:
$ time wget ftp://netlib.org/ampl.tar.gz
...
real 0m49.365s
user 0m0.320s
sys 0m1.908s
Rsync is yet another method, almost as fast as downloading compressed directories by FTP:
$ time rsync -avz netlib.org::netlib/ampl .
...
real 0m53.467s
user 0m0.860s
sys 0m2.512s
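Putting it all together (wall-clock time as reported by time):
Recursive FTP: 10m 9s
Recursive HTTP: 1m 59s
Compressed directory by FTP (ampl.tar.gz): 49s
rsync: 53s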
The conclusion is simple: rsync and retrieving compressed directories by FTP are by far the fastest ways of downloading from Netlib. Recursive HTTP download is more than twice as slow, and recursive FTP is painfully slow.
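To put numbers on "more than twice as slow", here is a small sketch that turns the real times reported above into slowdown factors relative to the ampl.tar.gz download. The helper names to_seconds and slowdown are made up for this illustration, and the script assumes durations in the XmY.YYYs format that time prints.

```shell
#!/bin/sh
# Convert a `time`-style duration such as "10m8.855s" to seconds.
# (to_seconds is a hypothetical helper, not part of any tool used above.)
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times slower duration $1 is than duration $2.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

slowdown 10m8.855s 0m49.365s   # recursive FTP  vs. ampl.tar.gz
slowdown 1m58.986s 0m49.365s   # recursive HTTP vs. ampl.tar.gz
slowdown 0m53.467s 0m49.365s   # rsync          vs. ampl.tar.gz
```

With the measurements from this post, the three calls should give factors of roughly 12.3, 2.4 and 1.1 respectively.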
Update: I found a possible explanation for the poor performance of recursive FTP here:
Retrieving a single file from an FTP server involves an unbelievable number of back-and-forth handshaking steps.
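The wget summaries above are consistent with this explanation. As a rough sketch, assume that the "in 2m 26s" part of the Downloaded line is the time spent actually transferring data and that the rest of the wall-clock time is per-file overhead (connection setup, directory listings, handshaking). That interpretation is my assumption, and per_file_overhead is a made-up helper name.

```shell
#!/bin/sh
# per_file_overhead WALL_SECONDS TRANSFER_SECONDS FILES
# Average time per file that was NOT spent transferring data,
# under the assumption stated above.
per_file_overhead() {
    awk -v wall="$1" -v xfer="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", (wall - xfer) / n }'
}

per_file_overhead 609 146 880   # recursive FTP:  10m 9s total, 2m 26s transferring
per_file_overhead 119 30 714    # recursive HTTP: 1m 59s total, 30s transferring
```

Under that assumption the overhead comes out at about 0.53 seconds per file for FTP versus about 0.12 seconds per file for HTTP, which would account for most of the difference between the two recursive downloads.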
Last modified on 2012-06-25