Everyone loves stats. OK, well, at least I do. I was doing some research into package maintenance within the Debian distribution, and since the results might be interesting for someone else, here they are.
On 19 August 2011 there were:
- 16935 unique source packages in Debian/sid
- 9977 packages with Vcs-* field in Debian/sid
- 6957 packages without a Vcs-* field in Debian/sid
Therefore ~59% of all packages in Debian/sid are officially managed with a version control system (VCS). Now, which VCS do those packages use?
- Svn: 4939
- Git: 4377
- Darcs: 284
- Bzr: 247
- Hg: 61
- Cvs: 31
- Arch: 28
- Mtn: 10
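For anyone who wants to re-check the arithmetic, here is a minimal Python sketch that tallies the counts listed above. The numbers are copied from this post; the names (`VCS_COUNTS`, `percent`, `vcs_shares`) are made up for the illustration and have nothing to do with UDD itself.

```python
# Counts as listed above (UDD snapshot, 19 August 2011).
TOTAL_SOURCE_PACKAGES = 16935
WITH_VCS_FIELD = 9977

# Hypothetical helper table: per-VCS package counts from the list above.
VCS_COUNTS = {
    "Svn": 4939,
    "Git": 4377,
    "Darcs": 284,
    "Bzr": 247,
    "Hg": 61,
    "Cvs": 31,
    "Arch": 28,
    "Mtn": 10,
}


def percent(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def vcs_shares(counts):
    """Map each VCS to its share (in percent) of all VCS-managed packages."""
    managed = sum(counts.values())
    return {name: percent(n, managed) for name, n in counts.items()}


if __name__ == "__main__":
    # The per-VCS numbers should add up to the Vcs-* total.
    assert sum(VCS_COUNTS.values()) == WITH_VCS_FIELD
    print("VCS coverage: %.1f%%" % percent(WITH_VCS_FIELD, TOTAL_SOURCE_PACKAGES))
    by_share = sorted(vcs_shares(VCS_COUNTS).items(), key=lambda kv: -kv[1])
    for name, share in by_share:
        print("%-6s %5.1f%%" % (name, share))
```

The assertion doubles as a sanity check that the manually corrected CVS number still adds up with the rest.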
I’ve retrieved the numbers from the Ultimate Debian Database (UDD). Sadly there’s a bug in UDD regarding the Vcs-Type information, see #637524. Therefore I’ve extracted a list of 80 packages where a Vcs-Browser header is available but the Vcs-Type entry is empty in UDD. 29 of those packages are managed in CVS but don’t appear as such in UDD, so I manually corrected the CVS count in the numbers above. The remaining 51 packages have a Vcs-Browser field set but lack the corresponding Vcs-* entry: some of them point to the upstream VCS instead of the corresponding Debian package repository, some of them result in 404 errors, etc. As a result I’ve reported bugs where applicable (#638466, #638468, #638469, #638470, #638471, #638472, #638474, #638475, #638476, #638477, #638479, #638482, #638486, #638488, #638493, #638497, #638501, #638475, #638475, #638502, #638503, #638505, #638506, #638508, #638509, #638510, #638511, #638512, #638513, #638516, #638518, #638519, #638520, #638522, #638523, #638524, #638525, #638526, #638527, #638528, #638529, #638530, #638516, #638531).
Disclaimer: I found Debian’s Statistics wiki page and Zack’s VCS usage stats after starting to play with my own stats. AFAICT Zack’s slightly higher numbers are the result of counting multiple versions of the same source package, as you’ll see when comparing numbers from UDD’s sources_uniq view (which I used) with either 1) UDD’s sources table, 2) the source table count from projectb or 3) the package count from http://$DEBIAN_MIRROR/debian/dists/unstable/{main,contrib,non-free}/source/Sources.bz2.
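To illustrate why counting stanzas and counting unique source packages give different totals, here is a rough Python sketch that parses Sources-style text. It is not how the numbers in this post were produced (those came from UDD). The parser is deliberately simplified: it assumes stanzas are separated by blank lines and ignores continuation lines. The sample packages and URLs are invented, and treating any `Vcs-*` field (including `Vcs-Browser`) as "has VCS" is an assumption of this sketch.

```python
# Invented sample in the style of a Debian Sources file; "foo" appears twice
# to mimic two versions of the same source package.
SAMPLE = """\
Package: foo
Version: 1.0-1
Vcs-Svn: svn://example.org/foo/trunk

Package: foo
Version: 1.1-1
Vcs-Svn: svn://example.org/foo/trunk

Package: bar
Version: 2.3-4
Vcs-Git: git://example.org/bar.git
Vcs-Browser: http://example.org/bar.git

Package: baz
Version: 0.9-2
"""


def parse_stanzas(text):
    """Yield one dict of field name -> value per stanza.

    Simplified: stanzas are split on blank lines, and continuation
    lines (those starting with whitespace) are skipped.
    """
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if not line or line[0].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if fields:
            yield fields


def count_packages(text):
    """Return (stanzas, unique packages, unique packages with a Vcs-* field)."""
    stanzas = 0
    names = set()
    with_vcs = set()
    for fields in parse_stanzas(text):
        stanzas += 1
        name = fields.get("Package")
        if name is None:
            continue
        names.add(name)
        # Field names are matched case-insensitively.
        if any(key.lower().startswith("vcs-") for key in fields):
            with_vcs.add(name)
    return stanzas, len(names), len(with_vcs)


if __name__ == "__main__":
    print(count_packages(SAMPLE))
```

For a real run one could read the decompressed text of a Sources.bz2 file (for example via Python's `bz2.open` in text mode) and pass it to `count_packages`. The gap between the first and second number is then the "multiple versions" effect described above.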
Conclusion: 9316 packages are officially managed with either Subversion or Git as of today, representing ~93% of the VCS managed packages. This means ~55% of all the Debian (source) packages are available through either a Git or Subversion repository – and that’s actually the number I was originally interested in.
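The two percentages in the conclusion can be recomputed from the counts given earlier with a few lines of Python. This is just a sketch, and the variable names are chosen for the example.

```python
# Counts taken from the lists earlier in this post.
SVN = 4939
GIT = 4377
VCS_MANAGED = 9977            # packages with a Vcs-* field
ALL_SOURCE_PACKAGES = 16935   # unique source packages in Debian/sid

svn_git = SVN + GIT
share_of_managed = 100.0 * svn_git / VCS_MANAGED
share_of_all = 100.0 * svn_git / ALL_SOURCE_PACKAGES

print("Svn + Git packages: %d" % svn_git)
print("Share of VCS managed packages: %.1f%%" % share_of_managed)
print("Share of all source packages: %.1f%%" % share_of_all)
```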
Thanks to Alexander Wirt, Christian Hofstaedter, Gerfried Fuchs, Jörg Jaspert and Michael Renner for hints that helped shape the final stats.