How to mirror CPAN
There are several ways to mirror CPAN depending upon what you want to achieve.
How do I create a private or offline mirror?
Requirements for a full / public mirror
- Good internet connectivity
- Around 1GB of storage space for just the current modules.
- Around 30GB of storage space for the full mirror.
It's highly recommended that you also subscribe to the announcements-only cpan-mirrors mailing list by emailing cpan-mirrors-subscribe at perl.org.
Only use FTP if these other methods are absolutely impossible. Never mirror with HTTP - you will end up with a million duplicate files in tens of gigabytes.
Which CPAN Mirror should I use?
You can also sync from
"tier 1 mirrors"), though you currently might get better
performance from a "local" mirror.
Please limit to once or twice a day. For more frequent updates please see Instant mirroring.
On Unix systems
/usr/bin/rsync -av --delete cpan-rsync.perl.org::CPAN /project/CPAN/
Using 'crontab' you can make rsync run once a day, for example
40 4 * * * sleep $(expr $RANDOM \% 7200); /usr/bin/rsync -a --delete cpan-rsync.perl.org::CPAN /project/CPAN/
The "sleep $(...);" statement makes the command delay up to 2 hours before running rsync; the advantage of this is that you (and everybody else) won't access the mirror at the same time.
Unless you are mirroring to an SSD you might get timeouts using --delete-after when many symlinks are being purged. Using --delete will work properly.
If you have a problem with permissions (files are created with mode
-rw-------), set umask in your cronjob :
40 4 * * * umask 022 ; sleep ... ; /usr/bin/rsync ...
The umask 022 allows rsync to set proper permissions for files and directories.
On Windows systems
C:\Program Files\Rsync\rsync -av --delete cpan-rsync.perl.org::CPAN /project/CPAN/
Using the 'AT' tool, you can schedule rsync to run daily, for example:
AT 20:00 /every:M,T,W,Th,F,S,Su "C:\Program Files\Rsync\rsync -a --delete cpan-rsync.perl.org::CPAN /project/CPAN/"
How do I create a public mirror?
- Consider Instant mirroring, required if you wish to be a tier 1 mirror, or..
- rsync once a day
- Provide (in order of preference) rsync, HTTP and/or FTP public access
- To be added to http://www.cpan.org/SITES.html and mirrors.json please complete the template confirming the public accessible URLs to your mirror: rsync, ftp, http and email it to firstname.lastname@example.org.
"Instant mirroring" keeps your CPAN mirror up-to-date by continuously tracking the CPAN master; picking up the changes from the master, a short time (minutes) after they occur.
Instant mirroring is used for all Tier 1 mirrors (so cpan-rsync.perl.org stays in sync across mirrors).
To use "instant mirroring", you need a special client: "rrr-client" or "iim".
"rrr-client" is part of the File::Rsync::Mirror::Recent
(also known as
rrr) package ; it is the official client, used
on the CPAN master to get updates from PAUSE : the true heart and soul of "all things
perl", see the setup
guide for more details.
"iim" is an alternative for "rrr-client" ; basically it does the same thing, but it is more efficient (on start-up) and has some features that may be helpful to CPAN mirror operators.