A three part think

Fri Oct 12 19:46:11 BST 2007

On Oct 12, 2007, at 12:58 PM, Simon Wistow wrote:

> Part the first: I know that it's impossible to accurately gauge the
> absolute number of downloads for a given CPAN module but we can probably
> guess the relative popularity if we get the download figures from, say,
> search.cpan.org and cpan.org.
>
> Accuracy would be improved if we could get the logs for several of the
> bigger mirrors.

This has been done for Phalanx, no? Andy Lester (petdance) should have
a reasonable rough number and a way to get this metric.

>
> Part the second: What percentage of modules on CPAN use XS?

On a local minicpan (current revs of everything)

1337 of 13829 dists appear to contain XS: 4052 files ending in .xs,
holding 1787122 lines by my count. Of course, about 10% of that (177356
lines) is P5NCI.xs, and another 171594 lines are in Graphics-VTK.

About 2000 of the .xs files are 100 lines or shorter.
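
For anyone who wants to reproduce a count like this, here is a minimal
sketch in Python. It assumes the minicpan tarballs have already been
unpacked so that each immediate subdirectory of a root directory is one
dist; the function names and that layout are illustrative, not what was
actually used to get the numbers above.

```python
import os

def xs_line_counts(root):
    """Map each dist directory under root to a list of per-file .xs line counts.

    Assumption: every immediate subdirectory of root is one unpacked dist.
    Dists without any .xs file are left out of the result.
    """
    counts = {}
    for dist in sorted(os.listdir(root)):
        dist_path = os.path.join(root, dist)
        if not os.path.isdir(dist_path):
            continue
        per_file = []
        for dirpath, _dirs, names in os.walk(dist_path):
            for name in names:
                if name.endswith(".xs"):
                    # Binary mode avoids tripping over odd encodings.
                    with open(os.path.join(dirpath, name), "rb") as fh:
                        per_file.append(sum(1 for _ in fh))
        if per_file:
            counts[dist] = per_file
    return counts

def summarize(counts, short=100):
    """Collapse per-dist counts into the totals quoted in the post."""
    all_files = [n for per_file in counts.values() for n in per_file]
    return {
        "xs_dists": len(counts),
        "xs_files": len(all_files),
        "xs_lines": sum(all_files),
        "short_files": sum(1 for n in all_files if n <= short),
    }
```

Calling summarize(xs_line_counts("/path/to/unpacked")) gives the four
totals in one go; sorting the per-file counts is a quick way to spot
outliers like the two mentioned above.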

> Additionally - Is there any way to programmatically work out how
> complicated that XS is - maybe by getting total lines of code, removing
> obvious boilerplate, sub declarations and argument marshalling and then
> looking at the remainder? Would it be possible (or, more accurately,
> easy) to work out how much is calls out to external libraries rather
> than hairy XS code?
>
>

This is left as an exercise for the reader. Or at least someone with  
more time to kill than me.
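
For whoever does pick it up, a crude starting point might look like the
Python sketch below. It only strips the most obvious boilerplate (C
comments, blank lines, preprocessor lines, lone braces and XS section
keywords) and returns what is left. The keyword list is hand-picked and
incomplete, and the function name is made up; it does not attempt to
recognise sub declarations, argument marshalling or calls into external
libraries.

```python
import re

# XS section keywords treated as boilerplate (hand-picked, incomplete list).
XS_KEYWORDS = re.compile(
    r"^\s*(MODULE|PROTOTYPES|PREINIT|INIT|INPUT|CODE|PPCODE|OUTPUT|"
    r"CLEANUP|BOOT|ALIAS)\s*[:=]"
)
C_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def residual_lines(xs_source):
    """Return the stripped lines of xs_source that survive boilerplate removal."""
    text = C_COMMENT.sub("", xs_source)
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue                      # blank line
        if stripped.startswith("#"):
            continue                      # preprocessor line or XS comment
        if XS_KEYWORDS.match(line):
            continue                      # MODULE =, CODE:, OUTPUT:, ...
        if stripped in ("{", "}"):
            continue                      # lone brace
        kept.append(stripped)
    return kept
```

Comparing len(residual_lines(src)) with the raw line count gives a very
rough "how much is left" ratio per file; anything finer would need a
real understanding of the XS grammar.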

> Would this even be useful?
>
>
> Part the third: Given a list of features in Perl - AUTOLOAD, formats,
> Exporter, pseudohashes, globs, typeglobs, BEGIN blocks, string eval,
> etc - how easy would it be to work out what percentage of modules on
> CPAN use those features?

I'd recommend talking to Tatsuhiko about his cpan codesearch tool.
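
Failing that, a first approximation can be had with plain pattern
matching. The Python sketch below is only that: the regexes are rough
guesses that will produce false positives and negatives (they know
nothing about POD, strings or heredocs), the feature list covers just a
few of the items asked about, and the function names are invented for
illustration. A real parser would do better.

```python
import re

# Rough, illustrative patterns; none of these is a reliable detector.
FEATURES = {
    "AUTOLOAD": re.compile(r"\bsub\s+AUTOLOAD\b"),
    "format": re.compile(r"^\s*format\s+\w*\s*=", re.MULTILINE),
    "Exporter": re.compile(r"\bExporter\b"),
    "BEGIN block": re.compile(r"\bBEGIN\s*\{"),
    # eval followed by a quote, a variable, or q/qq; deliberately not "eval {".
    "string eval": re.compile(r"\beval\s*(?:[\"'$]|qq?\b)"),
}

def features_used(perl_source):
    """Return the set of feature names whose pattern matches the source."""
    return {name for name, pat in FEATURES.items() if pat.search(perl_source)}

def usage_percentages(sources_by_module):
    """Given {module_name: source_text}, return {feature: percent of modules}."""
    total = len(sources_by_module)
    counts = {name: 0 for name in FEATURES}
    for source in sources_by_module.values():
        for name in features_used(source):
            counts[name] += 1
    return {
        name: (100.0 * n / total if total else 0.0)
        for name, n in counts.items()
    }
```

Feeding it the .pm files from an unpacked mirror gives per-feature
percentages that are good for a ballpark and not much more.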

> Presumably something like PPI and MAD could help with the last two.
>