All patches and comments are welcome. Please squash your changes into logical
commits before using git-format-patch and git-send-email to send them to
patches@git.madduck.net.
If you'd read over the Git project's submission guidelines and adhered to them,
I'd be especially grateful.
webcheckout - check out repositories referenced on a web page
B<webcheckout> [options] url [destdir]
B<webcheckout> downloads a URL and parses it, looking for version control
repositories referenced by the page. It checks out each repository into
a subdirectory of the current directory, using whatever VCS program is
appropriate for that repository (git, svn, etc.).

The information about the repositories is embedded in the web page using
the rel=vcs microformat, which is documented at
<http://kitenet.net/~joey/rfc/rel-vcs/>.

If the optional destdir parameter is specified, VCS programs will be asked
to check out repositories into that directory. If there are multiple
repositories to check out, each will be checked out into a separate
subdirectory of the destdir.
Prefer authenticated repositories. By default, webcheckout will use
anonymous repositories when possible. If you have an account that
allows you to use authenticated repositories, you might want to use this
option.

Do not actually check anything out, just print out the commands that would
be run to check out the repositories.

Quiet mode. Do not print out the commands being run. (The VCS commands
may still be noisy, however.)
Copyright 2009 Joey Hess <joey@kitenet.net>

Licensed under the GNU GPL version 2 or higher.

This program is included in mr <http://kitenet.net/~joey/code/mr/>
# Controls whether to print what is being done.

# Controls whether to actually check anything out.

# Controls whether to prefer repos that use authentication.

# Controls where to check out to. If not set, the vcs is allowed to

# how to perform checkouts
git => sub { doit("git", "clone", shift, $destdir) },
svn => sub { doit("svn", "checkout", shift, $destdir) },
bzr => sub { doit("bzr", "branch", shift, $destdir) },
# Regexps matching urls that are used for anonymous
# repository checkouts. The order is significant:
# urls matching earlier in the list are preferred over
# those matching later.
qr/^http:\/\//i, # generally the worst transport
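
The ordering rule in the comment above can be illustrated with a small sketch. Only the http pattern comes from the source. The other two patterns and the helpers `rank_url` and `pick_preferred` are assumptions made up for this example, and the script's own comparison may weigh other factors.

```perl
use strict;
use warnings;

# Hypothetical pattern list; only the http entry is taken from the source.
my @anon_urls = (
	qr/^git:\/\//i,
	qr/^https:\/\//i,
	qr/^http:\/\//i, # generally the worst transport
);

# Return the index of the first pattern matching $url, or undef if none match.
sub rank_url {
	my ($url) = @_;
	for my $i (0 .. $#anon_urls) {
		return $i if $url =~ $anon_urls[$i];
	}
	return undef;
}

# Of two URLs, return the one whose match comes earlier in the list.
# A URL matching no pattern loses to one that matches; if neither
# matches, the first URL is returned.
sub pick_preferred {
	my ($first, $second) = @_;
	my ($ra, $rb) = (rank_url($first), rank_url($second));
	return $first  if ! defined $rb;
	return $second if ! defined $ra;
	return $ra <= $rb ? $first : $second;
}

# prints git://example.com/repo.git
print pick_preferred("http://example.com/repo.git",
	"git://example.com/repo.git"), "\n";
```

Keeping the patterns in one ordered list means a preference can be changed by reordering entries, without touching the comparison code.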
Getopt::Long::Configure("bundling", "no_permute");
my $result=GetOptions(
"q|quiet" => \$quiet,
"n|noact" => \$noact,
"a|auth" => \$want_auth,
if (! $result || @ARGV < 1) {
die "usage: webcheckout [options] url [destdir]\n";
$destdir=shift @ARGV;
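
To show what the "bundling" and "no_permute" settings do, here is a minimal sketch that parses a made-up argument list rather than the real @ARGV. The helper `parse_args` and the hash it returns are assumptions for illustration. `GetOptionsFromArray` is the standard Getopt::Long entry point for parsing an array other than @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

Getopt::Long::Configure("bundling", "no_permute");

# Parse a hypothetical argument list. Returns a hash ref on success,
# or nothing if the options are invalid or the url is missing.
sub parse_args {
	my @args = @_;
	my %opt = (quiet => 0, noact => 0, auth => 0);
	my $ok = GetOptionsFromArray(\@args,
		"q|quiet" => \$opt{quiet},
		"n|noact" => \$opt{noact},
		"a|auth"  => \$opt{auth},
	);
	return unless $ok && @args >= 1;
	my ($url, $destdir) = @args;
	my %result = (%opt, url => $url, destdir => $destdir);
	return \%result;
}

# With bundling, "-qn" is read as "-q -n".
my $parsed = parse_args("-qn", "http://example.com/", "work");
# prints quiet=1 noact=1 destdir=work
print "quiet=$parsed->{quiet} noact=$parsed->{noact} destdir=$parsed->{destdir}\n";
```

Because of "no_permute", option parsing stops at the first non-option argument, so anything after the url is left in the argument list untouched.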
my @args=grep { defined } @_;
print join(" ", @args)."\n" unless $quiet;
return system(@args);
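
The lines above suggest a small command wrapper. The following is a sketch of how such a wrapper might combine the quiet and no-act settings. The name `run_cmd`, the flag values, and the point at which the no-act flag is checked are assumptions, since that part is not visible in this excerpt.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the script's $quiet and $noact settings.
my $quiet = 0;
my $noact = 1;

# Print the command unless quiet, then run it unless in no-act mode.
# Returns the wait status from system(), or 0 when nothing is run.
sub run_cmd {
	my @args = grep { defined } @_; # drops an unset trailing destdir
	print join(" ", @args), "\n" unless $quiet;
	return 0 if $noact; # assumed placement of the no-act check
	return system(@args);
}

# In no-act mode this only prints: git clone git://example.com/repo.git
my $status = run_cmd("git", "clone", "git://example.com/repo.git", undef);
```

Passing the command as a list to system() avoids the shell, so a URL containing shell metacharacters is handed to the VCS program unchanged.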
# Is repo a better than repo b?
foreach my $r (@anon_urls) {
if ($a->{href} =~ /$r/) {
elsif ($b->{href} =~ /$r/) {
return $firstanon != $a;
return $firstanon == $a;
# Eliminate duplicate repositories from list.
# Duplicate repositories have the same title, or the same href.
foreach my $repo (@_) {
if (exists $repo->{title} &&
length $repo->{title}) {
if (exists $bytitle{$repo->{title}}) {
my $other=$bytitle{$repo->{title}};
next unless better($repo, $other);
delete $bytitle{$other->{title}}
if (! $seenhref{$repo->{href}}++) {
$bytitle{$repo->{title}}=$repo;
return values %bytitle, @others;
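
The deduplication rule (same title, or same href) can be sketched in a simplified, self-contained form. This is not the script's exact logic. The function `dedup_repos`, the `$better` callback, and the sample data are all assumptions made up for this example.

```perl
use strict;
use warnings;

# Simplified sketch: drop repositories that share a title or an href with
# one already kept. $better is a hypothetical callback that returns true
# when its first argument should replace its second.
sub dedup_repos {
	my ($better, @repos) = @_;
	my (%seenhref, %bytitle, @untitled);
	foreach my $repo (@repos) {
		my $title = $repo->{title};
		if (defined $title && length $title) {
			my $other = $bytitle{$title};
			# Keep the existing entry unless the new one is better.
			next if $other && ! $better->($repo, $other);
			$bytitle{$title} = $repo unless $seenhref{$repo->{href}}++;
		}
		else {
			push @untitled, $repo unless $seenhref{$repo->{href}}++;
		}
	}
	return (values %bytitle, @untitled);
}

my @repos = (
	{ title => "mr", href => "git://example.com/mr.git",  type => "git" },
	{ title => "mr", href => "http://example.com/mr.git", type => "git" },
	{ href => "git://example.com/mr.git", type => "git" },
	{ href => "svn://example.com/other",  type => "svn" },
);

# Hypothetical preference: a git:// URL beats any other kind.
my $prefer_git = sub {
	$_[0]{href} =~ m{^git://} && $_[1]{href} !~ m{^git://};
};

my @unique = dedup_repos($prefer_git, @repos);
# prints 2 repositories kept
print scalar(@unique), " repositories kept\n";
```

Here the second entry is dropped because it shares a title with the first and is not better, and the third because its href has already been seen.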
my $parser=HTML::Parser->new(api_version => 3);
$parser->handler(start => sub {
return if lc $tagname ne 'link';
return if ! exists $attr->{rel} || lc $attr->{rel} ne 'vcs';
return if ! exists $attr->{href} || ! length $attr->{href};
return if ! exists $attr->{type} || ! length $attr->{type};
$parser->parse($page);
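
For readers unfamiliar with HTML::Parser's version 3 API, here is a self-contained sketch of the same filtering applied to a made-up page. The function `find_vcs_links`, the argspec string, and the sample HTML are assumptions for illustration. The four checks mirror the ones shown above. HTML::Parser is a CPAN module and must be installed.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Collect the attributes of every <link rel="vcs"> tag that has both an
# href and a type. Each result is a hash ref of the tag's attributes.
sub find_vcs_links {
	my ($page) = @_;
	my @repos;
	my $parser = HTML::Parser->new(api_version => 3);
	$parser->handler(start => sub {
		my ($tagname, $attr) = @_;
		return if lc $tagname ne 'link';
		return if ! exists $attr->{rel} || lc $attr->{rel} ne 'vcs';
		return if ! exists $attr->{href} || ! length $attr->{href};
		return if ! exists $attr->{type} || ! length $attr->{type};
		push @repos, $attr;
	}, "tagname, attr");
	$parser->parse($page);
	$parser->eof; # signal end of document so buffered text is flushed
	return @repos;
}

# Made-up page content for illustration.
my $page = <<'EOF';
<html><head>
<link rel="vcs" type="git" href="git://example.com/mr.git" title="mr">
<link rel="stylesheet" type="text/css" href="style.css">
</head><body></body></html>
EOF

my @found = find_vcs_links($page);
# prints git git://example.com/mr.git
print "$_->{type} $_->{href}\n" foreach @found;
```

The stylesheet link is skipped because its rel attribute is not "vcs", even though it carries both a type and an href.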
if (! defined $page) {
die "failed to download $url\n";
my @repos=dedup(parse($page));
die "no repositories found on $url\n";
if (defined $destdir && @repos > 1) {
# create subdirs of $destdir for the multiple repos
chdir($destdir) || die "failed to chdir to $destdir: $!";
foreach my $repo (@repos) {
my $handler=$handlers{$repo->{type}};
if ($handler->($repo->{href}) != 0) {
print STDERR "failed to checkout ".$repo->{href}."\n";
print STDERR "unknown repository type ".$repo->{type}.
" for ".$repo->{href}."\n";
#print Dumper(\@repos);