The (Unwritten) Rules of Perl 5 Minor Releases

When a new major version of Perl 5 comes out, history suggests a new minor release will follow.

Some of this reasoning is pragmatic. For all of the requests of the Perl 5 Porters for people to test development snapshots (5.11.0 through 5.11.5) and the inevitable release candidates (in this case, 5.12.0 RC1 through... hopefully not RC2 and RC3), nothing gets more testing or bug reports than new major releases. Bugs get reported. Changes get requested. Changes occur.

The traditional view of new major releases is that they're somewhat unstable. "Wait for the patch release," people say. This was true of 5.6.0. (The ancient Camel third edition describes an unreleased version of Perl 5 somewhere between 5.6.0 and 5.6.1.) This was true of 5.8.0. This was even more true of 5.10.0, where the CPAN itself suggested that 5.10.0 was a "testing" release for bleeding edge users.

Given the size of the Perl 5 test suite and the daily and weekly test reports produced from the bleeding edge of Perl 5 itself as well as the monthly releases, most of the obvious bugs appear and get corrected quickly. Even so, bugs happen. It's software. Changes occur and people notice only in odd or complex situations. Sometimes a new compiler warning appears, or underlying libraries change. Sometimes a few updates help get Perl 5 building on a platform which itself has changed.

A patch release is inevitable.

However, the Perl 5 Porters make no promises about when a point release will occur. Nor do they promise how many point releases will occur in a family.

The Perl 5.8.0 family had nine point releases between July 2002 and December 2008. If Perl 5.10 hadn't taken five and a half years, Perl 5.8.0 might have had fewer point releases.

22 months passed between the release of Perl 5.10.0 and Perl 5.10.1. There may never be a 5.10.2.

A point release needs two things: a steady stream of bug fixes in the core which do not break source compatibility, and someone to identify the appropriate patches and to make the release. In other words, one or more people need to be able to cherry-pick patches from the development track of the next release of Perl 5 to the branch which will become the new point release. The weight of history and expectations is sufficient to assume that p5p will be able to find or herd enough volunteer effort to make a Perl 5.12.1 and a Perl 5.14.1 and so on, but if monthly Perl 5 releases continue and produce a new major release every 12-18 months, the likelihood of a Perl 5.12.2 and a Perl 5.14.2 decreases.

The single unpredictable factor is the presence of a major bug discovered in that release family; a major security bug or a data loss bug is one possibility. In that case, a single-patch minor release is likely. Beyond that, minor releases have diminishing returns.

CPAN Testers 2.0: Interim milestone

This is just a quick note to say that I’ve successfully configured a test CT2.0 server hosted entirely on Amazon Web Services. I’ve also successfully sent test reports to it from my regular CPAN::Reporter client using Test::Reporter::Transport::Metabase. It’s the first end-to-end test of the target architecture for CT2.0.

There are some tweaks I need to make before I’m ready to open it up to beta testing, and all the updated modules are still in git and not yet on CPAN, but there is a light at the end of the tunnel and (so far) I don’t think it’s a train.

Perl 5, Version Numbers, and Binary Compatibility

As mentioned in What Perl 5's Version Numbers Mean, the written Perl 5 support policy must explain several guidelines and their implications.

If you've ever upgraded between major versions of Perl 5 on the same machine, you've likely noticed that you have to install new versions of modules. Various resources spread across the Internet suggest the use of CPAN autobundles, but even that's likely enough to make you curse a little bit as you babysit a CPAN shell for an hour or two to get back to where you started.

This is due to the binary compatibility guidelines to which the Perl 5 Porters adhere. While there's a strong desire to ensure that programs and modules written in the past several years will continue to work unmodified on new major versions of Perl 5, it's almost impossible to ensure that compiled XS code remains compatible across major versions of Perl 5.

Certainly the porters attempt to maintain source code compatibility of XS wherever possible, but ensuring that an XS module compiled in 2000 for Perl 5.6.0 will continue to work unchanged on Perl 5.12 in 2010 requires a great deal of foresight, plenty of tolerance for workarounds in the core, and no small amount of luck. There's a limit to what's practical to provide for how long, and the price of reinstalling extensions (and recompiling a few) is worthwhile.

While this choice may seem odd, considering the degree to which Perl 5 retains backwards compatibility, it reflects a philosophy from the earliest days of Unix: source compatibility is important, but everyone should have access to the source code to recompile as necessary.

Within a release family—Perl 5.8.0 through 5.8.9, for example—the porters attempt to maintain binary compatibility. If you installed DBI on a fresh Perl 5.8.0, it should continue to work even if you install 5.8.9 and remove 5.8.0. That's the intent.

Note that the Perl 5 configuration and installation process enforces this by default; minor releases within a major release family reuse the same module installation paths. Major releases have new installation paths. You can reconfigure Perl 5 to use paths of your choosing, but you do so at your own risk.
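
If you want to see which paths your own perl uses, the core Config module exposes them. This is a minimal sketch; the helper name install_paths and the choice of keys are mine, picked for illustration, and the values you see depend entirely on how your perl was configured.

```perl
use strict;
use warnings;
use Config;

# Collect a few standard %Config entries which describe where this perl
# looks for core and site-installed modules. The selection of keys is
# illustrative, not exhaustive.
sub install_paths {
    return map { $_ => $Config{$_} } qw(version privlib archlib sitelib sitearch);
}

my %paths = install_paths();
printf "%-9s %s\n", $_, defined $paths{$_} ? $paths{$_} : '(unset)'
    for qw(version privlib archlib sitelib sitearch);
```

Running this under two different perls on the same machine is a quick way to check whether they share module directories.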

What Perl 5’s Version Numbers Mean

Perl 5.11.5 comes out tomorrow and Perl 5.12 should be out soon. (Much credit goes to people such as Jesse Vincent and David Golden, to name two, for getting Perl 5 on a regular release cycle.) I've long promised to write about the Perl 5 support and deprecation policy and how that affects users.

Perl 5.10.1 was, by definition, a minor revision. Perl 5.12 is a major revision. The nominal difference is which component of the version number increases. By intent, users of 5.10 (actually 5.10.0, but often abbreviated) should be able to upgrade that installation in place to any subsequent minor release in the 5.10 family. The upgrade isn't always completely transparent, but the intent is that, modulo bugfixes, it should be.
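
As a rough illustration of how the components of the version number divide releases into families, here is a small sketch. The function names describe_version and same_family are hypothetical, and the code assumes the current numbering convention, in which odd-numbered series such as 5.11 are development series and even-numbered ones such as 5.10 and 5.12 are stable families.

```perl
use strict;
use warnings;

# Split a dotted Perl 5 version string into its parts and classify it.
# Assumes the current convention: odd series (5.11, 5.13) are development
# series, even ones (5.10, 5.12) are stable release families.
sub describe_version {
    my ($string) = @_;
    my ($revision, $family, $patch) = $string =~ /\A(\d+)\.(\d+)(?:\.(\d+))?\z/
        or die "Unrecognized version string: $string\n";
    $patch = 0 unless defined $patch;    # "5.12" is shorthand for "5.12.0"

    return {
        family      => "$revision.$family",
        patch       => $patch,
        development => $family % 2 ? 1 : 0,
    };
}

# Two versions belong to the same release family when only the final
# component differs.
sub same_family {
    my ($left, $right) = @_;
    return describe_version($left)->{family} eq describe_version($right)->{family};
}

for my $version (qw(5.10.0 5.10.1 5.11.5 5.12)) {
    my $info = describe_version($version);
    printf "%-7s family %s, %s\n", $version, $info->{family},
        $info->{development} ? 'development' : 'stable';
}
```

By this reckoning, 5.10.0 and 5.10.1 share a family (an in-place upgrade is the intent), while 5.10.1 and 5.12.0 do not.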

When 5.10.0 came out, work started on a new Perl 5 release family called 5.11 (that's not entirely true, but it's sufficiently true for this explanation). This is the unstable series intended for development and testing which will become 5.12 in the next couple of months. You are welcome to download, configure, build, test, and even install 5.11, but you should be comfortable without support from p5p for upgrades and changes.

The monthly releases in the 5.11 (and soon, the 5.13) series represent points of stability and review so that the Perl 5 developers can concentrate on the quality of what will become 5.12.0.

When 5.12.0 comes out, you will notice changes from 5.10.0 in terms of new features, removed features, and upgrades to the standard library. While most code should work unmodified with 5.12.0 as it did with 5.10.0, some modules will need updates. You will likely also have to recompile any modules with XS components.

In subsequent entries, I'll write more about the implications of all of this, when you should upgrade, how deprecations and changes work, and the binary compatibility policies of Perl 5.

Why SDL Perl Matters

I read a book proposal years ago on the subject of teaching kids to program with C++. "After a week," it said, "children will know enough to create their own simple text games and animations."

I was perhaps six years old when I saw my first minicomputer. I flipped open the first page of the manual and typed in the lines verbatim—except I left off the line numbers, likely thinking that they were merely a convenience for readers. Perhaps I've had good taste from the beginning.

My typing skills were, as you might expect, abysmal. Even so, I had feedback from the computer within fifteen minutes or less. If I'd had to spend a week learning things to move characters around on the screen, I'd have given up.

I like games. I enjoy thinking about how they work. I like writing stories. I play games. The mechanics of rules and balance and design and enjoyment and player participation and perception are fascinating. Even more important is the idea that games can have a didactic purpose.

I spent a lot of time in my childhood years playing games but also breaking games. A bit of work with a hex editor could give my party more experience points so that two or three well-placed fireball spells would clean out the kobold lair. (Any role-playing system which starts magic users with four hit points won't have them surviving the tetanus shot before they get their passports.)

Because I could only get time on computers at school if they had an educational purpose, I taught myself how to write programs so I could write games. I don't suggest my experience is representative of all children, but it's not so far different from that of many of my friends.

A few years ago, I tried to help revive SDL Perl when the maintainer retired. The experience was difficult; it's a big wad of XS code that needs plenty of probing and configuration for a handful of somewhat-optional libraries. I don't even want to think about everything required to detect which version of OpenGL you have installed and available in a cross-platform fashion.

Fortunately, Kartik Thakore is everyone's hero (and plenty of other people are helping too).

I've heard the arguments that "Kids these days are too busy texting each other!" or "It's okay that kids make YouTube mashups of pop songs and clips of their favorite anime characters, that's creativity!" and "You can teach a kid PHP and HTML and call him a programmer, and that's super fun!" I don't believe any of them.

I think instead that you can plop your smart seven year old in front of a real computer with a real keyboard and show her that typing something makes a picture appear and typing something else makes it move and give her a few other commands and boom she'll play with that for a while. Not everyone's suited to the deep, dark logic of understanding the bindings from a high level language to a shared library and memory management techniques thereof, but what a privilege to teach a younger generation that a computer isn't merely an appliance to read Wikipedia and text their friends, but a general purpose device they can control.

Show a few of them how to make pretty graphics move around on screen per their command—per textual instructions they have to reason about and maintain themselves—and you just might have something. Sure, Pygame and Pyglet are great. I've used them productively. Even so, more options for free software and free environments can only help.

CPAN Testers 2.0 mid-February update

February is a short month and the last couple weeks have flown by. Since my last update, there have been a couple of significant milestones that bring us closer to CT2.0:

  • I successfully created a test Metabase backed by Amazon S3+SimpleDB and have created, retrieved and deleted some user profile facts. This built on some of the earlier work by Leon, but reflected the work I’ve been doing to revise the guts of the Metabase libraries.
  • I wrote a program that converts a given NNTP report into a CPAN::Testers::Report object (which is itself a Metabase Fact subclass). This wound up being trickier than I expected due to the potential for ambiguous mapping of distribution names to actual distribution files. (Thank you to Tokuhirom, Andreas, Takesako and Offer Kaye for reviewing and fixing my exception list.)
  • I generated 900+ Metabase user profiles for all the known CPAN Testers, which will be used to link them with their reports during the conversion process. We still need to figure out a way to distribute these to testers so that new records will link up and other logistics, but this was a necessary step to prepare for conversion.
  • I’ve started configuring Amazon EC2 instances: one for the parallel conversion of reports and one for the CT2.0 server itself. I’m still coming up the learning curve on EC2, but see no obstacles (except time).

My immediate goals are (1) to get a test CT2.0 server up and receiving reports so we can start testing it and (2) prep and run the parallel conversion process. I’ve already gotten some help on the #catalyst channel for #1 (thank you, hobbs) and #2 is now mostly a matter of programming it up now that all the foundational prep work is done.

Hitting the March 1 deadline is going to be tight. We’re still holding at about 2 weeks behind the original estimate (where we’ve been since mid-January). But I think progress is coming fast and furious and I hope to get a solid beta launch by March 1 and negotiate with the Perl NOC for an orderly transition as we ramp up the new service.

A Decade of Lexical Filehandles

Perl 5.6.0 is almost a decade old; perldoc perlhist gives a release date of 22 March 2000.

My favorite feature of Perl 5.6.0 is lexical filehandles. Instead of having to access the IO slot of package global typeglobs, I could use lexical variables to contain filehandles -- without having to muck about localizing symbol tables or worrying about action at a distance or lifetimes of global symbols.

Yet to this day, almost a decade later, I still see the old way with all of its disadvantages in new code. (Tell the truth; do you understand every word of "the IO slot of package global typeglobs"? Do you want to explain that to novice programmers?)

Perl 5.6.2 is long dead. Perl 5.8.9 is the last of its release series too. The argument for running new code on old installations of Perl 5 is awfully thin, in that light.

Likewise I can't make a simplicity argument for the old approach. Making old-style filehandles work like people might expect is anything but simple. Throw in a local here or there and the typeglob sigil and maybe a gensym() call for good measure. Fun!
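
For contrast, here is a minimal sketch of the lexical approach, combined with three-argument open. The subroutine names write_lines and read_lines are mine, and the temporary file exists only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write lines to a file through a lexical filehandle. The handle lives in
# $out, an ordinary lexical variable, so nothing outside this subroutine
# can see it or clobber it.
sub write_lines {
    my ($path, @lines) = @_;
    open my $out, '>', $path or die "Cannot write '$path': $!\n";
    print {$out} "$_\n" for @lines;
    close $out or die "Cannot close '$path': $!\n";
    return scalar @lines;
}

# Read the lines back. Perl closes $in when the variable goes out of scope.
sub read_lines {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read '$path': $!\n";
    chomp(my @lines = <$in>);
    return @lines;
}

# A scratch file which File::Temp removes when the program exits.
my (undef, $path) = tempfile(UNLINK => 1);
write_lines($path, 'first', 'second');
my @lines = read_lines($path);
print "Read back: @lines\n";
```

There is no local, no typeglob sigil, and no gensym() call; the filehandle follows the same scoping rules as any other lexical variable.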

Reasonable people differ on style and technique, but I wonder what makes a feature such as pseudohashes or 5.005-style threads so hated that it eventually gets deleted, while difficult-to-use-correctly features superseded by better replacements stick around far longer than necessary. My guess is that the Perl 5 world suffers here, as usual, from a questionable abundance of old code, old tutorials, old books, and copy and paste coding from ancient sources of dubious wisdom. (This probably means I should submit patches to perldoc perluniintro and other offenders in the core documentation.)

Perhaps it's time to consider a gradual, intentional, well-tested and well-reviewed campaign to update tutorials and example code with somewhat more modern examples of maintainable Perl.

(For fun, imagine a world where the canonical printed Perl 5 reference covered a version of Perl 5 released this millennium. Then again, Perl.com thinks that 5.6.2 is "the previous version of Perl" 5.)

Chunking, Subtlety, and Whitespace

I delayed writing about references in Perl 5 in the Modern Perl book for a long time. References in Perl 5 are useful. They have their warts. They're not as difficult as most people believe, however. Novices have trouble learning how to use references effectively because most tutorials and introductions explain them poorly.

I had to think about explanations for a long time before I found a way to explain them well.

Of course, the syntax for dereferencing gets complex very quickly—but it's also an effective example of what I've been discussing this week. Perl has a handful of subtle design consistencies that, if you understand them, help you read and skim code very effectively. If you don't learn them, you'll get lost in a sea of punctuation soup.

Consider an array reference $monkeys_ref. You can get the number of monkeys by evaluating that reference as an array in scalar context in one of two ways:

# the short way
my $count = @$monkeys_ref;

# the disambiguatey way
my $count = @{ $monkeys_ref };

The former way is shorter and more idiomatic. Anyone familiar with Perl 5 references should understand what the additional sigil means ("I want a list from the following reference"). The latter syntax has the same effect, but it means instead "I want a list coerced from the expression evaluated within this block." The difference is subtle and you don't have to understand the subtleties for this example.
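
One place where the difference starts to matter is when the reference is not sitting in a plain scalar variable. A short sketch, in which the monkey names and the %zoo hash are invented for illustration:

```perl
use strict;
use warnings;

my $monkeys_ref = [qw(capuchin howler spider tamarin)];
my %zoo         = (monkeys => $monkeys_ref);

my $short_count = @$monkeys_ref;         # sigil applied directly to a scalar variable
my $block_count = @{ $monkeys_ref };     # sigil applied to the block's result
my $deep_count  = @{ $zoo{monkeys} };    # the block form accepts any expression

print "$short_count $block_count $deep_count\n";
```

All three counts are the same. The short form has no way to express the third case: writing @$zoo{monkeys} would instead ask for a slice of a hash reference stored in $zoo, which is a different thing entirely.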

Trouble arrives when you deal with nested data structures or more complex expressions, such as slices:

# the short way
my $monkeys = join ',', @$monkeys_ref[@indices];

# the clearer way
my $monkeys = join ',', @{ $monkeys_ref }[@indices];

The first expression is somewhat more difficult to parse; which takes precedence, the indexing operation represented by the square brackets or the dereferencing operation indicated by the leading sigil? The second expression works because the intended order of operation is clear, at least to anyone who understands how curly-brace grouping works with complex references.
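
To answer the precedence question concretely: the leading sigil binds to the reference first, and the square brackets then slice the resulting array. Here is a small runnable check, with sample data invented for the purpose:

```perl
use strict;
use warnings;

my $monkeys_ref = [qw(capuchin howler spider tamarin)];
my @indices     = (0, 2);

# Both forms dereference $monkeys_ref first, then slice the resulting array.
my $short   = join ',', @$monkeys_ref[@indices];
my $clearer = join ',', @{ $monkeys_ref }[@indices];

print "$short\n$clearer\n";
```

Both lines of output read capuchin,spider. The parser treats the two spellings identically; only the reader's effort differs.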

The whitespace is unnecessary, of course, but I find that it adds clarity.

A little bit of disambiguation isn't necessary to help the Perl 5 parser in this case, but it does help the reader. Students of compiler design might argue that nested expressions this complex belong on separate lines. I can imagine how this would read in a pseudo assembly language (I work on Parrot, after all). There's definitely a balance between the complexity of nested expressions and dereferencing... but this is a place where I consider the idiomatic use of Perl 5 sufficiently expressive that spreading the list slice out over multiple lines would obfuscate the intent of the code.

Certainly it's possible to perform even more complex dereferences of data structures, but when it's difficult to identify individual chunks of the desired behavior, it's time to simplify the code or the expression or the design. Even still, readability of this code should not depend on the desire to avoid teaching novices about references.
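
As one possible illustration of where that line might fall, consider a slice taken two levels deep in a nested structure. The %zoo data here is invented, and whether the dense form or the chunked form reads better is a judgment call rather than a rule.

```perl
use strict;
use warnings;

my %zoo = (
    primates => {
        monkeys => [qw(capuchin howler spider tamarin)],
    },
);
my @indices = (1, 3);

# One dense expression: look up, dereference, slice, and join all at once.
my $dense = join ',', @{ $zoo{primates}{monkeys} }[@indices];

# The same work split into named chunks.
my $monkeys_ref = $zoo{primates}{monkeys};
my @chosen      = @{ $monkeys_ref }[@indices];
my $chunked     = join ',', @chosen;

print "$dense\n$chunked\n";
```

Both produce howler,tamarin. If the path to the array grew any longer, naming the intermediate reference would probably be the kinder choice.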

Chunks and Syntax Highlighting

If I'm right—if reading source code requires identifying parts of speech—then familiarity with syntax and grammar is important to programming as an adept.

Consider Damian Conway's SelfGOL. As an experienced Perl programmer, I can pick out various pieces of the code at a glance. There's an assignment. There's quoting. That's a variable. That's a list slice.

If you've never encountered Perl before (or programming in general), you might recognize some English words, such as print and die, and that's all.

One of Perl's design ideas borrowed from linguistics is that "different things should look different". To novices, everything looks different. $name isn't obviously a single chunk. It's an English identifier and one of several punctuation symbols apparently sprinkled at random throughout the program.

Good use of whitespace helps. So does the good use of parentheses as grouping constructs (though as in prose, they often get overused by novices).

One of the most subtle mechanisms to identify individual chunks floating in a sea of code is with syntax highlighting. I can't prove this. I haven't studied it in repeatable situations. Even so, I hypothesize that (modulo color choice concerns) merely highlighting different types of terms in the grammar in different ways will help novices understand how to pick out individual chunks in code.

This requires training. This demands practice. Unless you spend time reading code, you won't understand how expressions fit together, and you have little hope of understanding code. I believe it's impossible to skip this step, and thus I don't care if someone who's used C or ML has trouble reading Perl 5 code. Of course people have trouble reading when they don't know the grammar.

(Don't worry, Lisp fans. Homoiconicity—apart from additional complexity of quoting forms and reader macros—means that novices have to spend their time learning to recognize idioms and abstractions at a level higher than tokens and chunks without the benefit of patterns of chunk types as mnemonics to idioms. Then again, I think in patterns, rarely words.)