<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Perlblogs &#187; modernperl</title>
	<atom:link href="http://perlblogs.com/category/modernperl/feed/" rel="self" type="application/rss+xml" />
	<link>http://perlblogs.com</link>
	<description>Posts from selected Perl bloggers</description>
	<lastBuildDate>Fri, 18 May 2012 19:03:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
<atom:link rel="hub" href="http://pubsubhubbub.appspot.com"/><atom:link rel="hub" href="http://superfeedr.com/hubbub"/>		<item>
		<title>Separating Presentation from Content in Templates</title>
		<link>http://www.modernperlbooks.com/mt/2012/05/separating-presentation-from-content-in-templates.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/05/separating-presentation-from-content-in-templates.html#comments</comments>
		<pubDate>Mon, 14 May 2012 18:47:11 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[templating]]></category>
		<category><![CDATA[webprogramming]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=4f5d73f0d2b0265d5fdbbecd96a82b97</guid>
		<description><![CDATA[A couple of comments on Simple Attribute-Based Template Exporting have asked for an example. I'll show off more of this code in my YAPC::NA 2012 and Open Source Bridge 2012 talk about how to write the wrong code (along with...]]></description>
			<content:encoded><![CDATA[
        <p>A couple of comments on <a
href="http://www.modernperlbooks.com/mt/2012/05/simple-attribute-based-template-exporting.html">Simple
Attribute-Based Template Exporting</a> have asked for an example. I'll show off
more of this code in my <a href="http://act.yapcna.org/2012/talk/50">YAPC::NA
2012</a> and <a href="http://opensourcebridge.org/proposals/796">Open Source
Bridge 2012</a> talk about how to write the wrong code (along with a handful of
other techniques).</p>

<p>(I assume some knowledge of <a
href="http://search.cpan.org/perldoc?Template">Template Toolkit</a> (besides
far too many books about finance, accounting, and investing, the Template
Toolkit book is always within reach these days); I've set up a wrapper template
which provides the standard look and feel of my application and I
include/process other templates liberally. If you understand that much, you'll
be able to follow along.)</p>

<p>One of the interesting templates in the system displays a list of chapters
of a book in progress. A cron job rebuilds a static page from this template
once a day. The template looks something like:</p>

<pre><code>[% USE Bootstrap -%]
[%- canonical_url = 'http://sitename.example.com/book/' _ link -%]

[%- add_og_properties({
    'fb:admins'      =&gt; '436500086365356',
    'og:title'       =&gt; title _ ' | sitename.example.com',
    'og:type'        =&gt; 'article',
    'og:image'       =&gt; 'http://static.sitename.example.com/images/logo.png',
    'og:url'         =&gt; canonical_url,
    'og:description' =&gt; text.chunk(300).0,
    'og:site_name'   =&gt; 'Sitename: site tag line',
   })
-%]
[%- add_meta(
    'pagetitle'     =&gt; title _ ' | sitename.example.com',
    'feed_url'      =&gt; 'http://static.sitename.example.com/book/atom.xml',
    'canonical_url' =&gt; canonical_url
) -%]

[% article_text = BLOCK -%]
&lt;article&gt;
&lt;h2&gt;[% title | html %]&lt;/h2&gt;
&lt;p&gt;Published: &lt;time datetime="[% date %]"&gt;[% nice_date %]&lt;/time&gt;&lt;/p&gt;
[% text %]
&lt;/article&gt;

&lt;ul class="pager"&gt;
[%- IF prev -%]
    &lt;li&gt;&lt;a href="[% prev.link %].html"&gt;&larr; [% prev.title | html %]&lt;/a&gt;&lt;/li&gt;
[%- END -%]
    &lt;li&gt;&lt;a href="/onehourinvestor"&gt;index&lt;/a&gt;&lt;/li&gt;
[%- IF next -%]
    &lt;li&gt;&lt;a href="[% next.link %].html"&gt;[% next.title | html %] &rarr;&lt;/a&gt;&lt;/li&gt;
[%- END -%]
&lt;/ul&gt;

[% INCLUDE 'components/social_links.tt', title =&gt; title %]
[%- END -%]

<strong>[%- row(
    maincontent( article_text ),
    sidebar(
        sideblock( process( 'components/cached/book_latest_chapters.tt' ) ),
        sideblock( process( 'components/cached/book_drafts.tt'          ) )
    )
) -%]</strong></code></pre>

<p>The emboldened lines are most important; they put all of the
<em>content</em> produced or assembled by this template in the HTML structure
the site needs. That is to say, everything on the site needs to fit into
something I call a <code>row</code>. A <code>row</code> can contain multiple
elements, such as <code>maincontent</code> and a <code>sidebar</code>, or
<code>fullcontent</code> by itself with no <code>sidebar</code>. A
<code>sidebar</code> can contain multiple <code>sideblock</code>s.</p>

<p>(You can ignore the other functions; they put metadata in the right places
to pass to wrapper templates.)</p>

<p>Within my template plugin (called <code>Bootstrap</code>), each of these
elements is a simple Perl function which takes one or more arguments and
interpolates them into some HTML:</p>

<pre><code>sub row :Export
{
    return &lt;&lt;END_HTML;
&lt;div class="row"&gt;
    @_
&lt;/div&gt;
END_HTML
}

sub sidebar :Export
{
    return &lt;&lt;END_HTML;
&lt;div class="span4"&gt;
    @_
&lt;/div&gt;
END_HTML
}</code></pre>
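
<p>The other helpers named in the template presumably follow the same pattern.
As a rough sketch only: the class names below (<code>span8</code> and
<code>well</code>) are assumptions borrowed from Bootstrap 2's grid, not taken
from the actual plugin, and the <code>:Export</code> attribute is left off so
that the snippet runs on its own:</p>

<pre><code>use strict;
use warnings;

# Hypothetical helpers in the style of row() and sidebar().
# The class names are assumptions; the real plugin would also
# mark each function with :Export.
sub maincontent
{
    return &lt;&lt;END_HTML;
&lt;div class="span8"&gt;
    @_
&lt;/div&gt;
END_HTML
}

sub sideblock
{
    return &lt;&lt;END_HTML;
&lt;div class="well"&gt;
    @_
&lt;/div&gt;
END_HTML
}

# @_ interpolates into the heredoc, joined with spaces
print maincontent( '&lt;p&gt;Hello&lt;/p&gt;' );</code></pre>

<p>Because every helper returns a plain string, the calls nest naturally, which
is what lets <code>row( maincontent( ... ), sidebar( ... ) )</code> read like a
description of the layout.</p>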

<p>(I initially tried to write these functions as templates within Template
Toolkit itself, but there comes a point at which you want a real language. That
point came very early for me.)</p>

<p>I lose no love over the <code>varname = BLOCK</code> pattern necessary to
populate variables to pass to these plugin functions, but it works for now. In
some of my templates&mdash;usually those with lots of text I might end up
changing later&mdash;I extract that text into a separate template under
<em>components/content/</em> to make it easy to edit. (This idea came up during
a client project where the client wanted to edit the legal clickthrough
arrangement after users create accounts. I didn't want lawyers or anyone to
have the ability to mess up the templating language, so I said "Edit this
single file as plain HTML and you'll be fine." It worked great.)</p>

<p>While my programmer brain says "This is ugly, and you're a horrible person
for committing this hack upon the world&mdash;you're calling Perl from your
template system to generate HTML you're stuffing into a template and that puts
your presentation elements in Perl code, you awful human being!", it keeps the
presentation code in a single place where I can update it infrequently (being
that I don't change the layout of the site dramatically) without having to
change the divs and classes of multiple templates.</p>

<p>I'm not arguing that this technique as expressed here is <em>right</em>.
It's probably not optimal; there may be easier approaches to achieve the same
effects.</p>

<p>I am saying that this currently works very well for me. I'm not typing the
same HTML over and over and over again, and I can tweak it much more easily
than I did before when I was refining the look and feel. In fact, I've even
<em>forgotten</em> the exact details of the layout, from the HTML/CSS point of
view, and now think only in terms of rows, maincontent, and sidebars.</p>

<p>Working abstractions are very nice.</p>
        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/05/14/separating-presentation-from-content-in-templates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Simple Attribute-Based Template Exporting</title>
		<link>http://www.modernperlbooks.com/mt/2012/05/simple-attribute-based-template-exporting.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/05/simple-attribute-based-template-exporting.html#comments</comments>
		<pubDate>Fri, 11 May 2012 20:29:01 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[cpan]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[webprogramming]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=8072073cb45a2548172a881d305c3f74</guid>
		<description><![CDATA[If you're like me and your design skills are sufficient to modify something decent to look nice but insufficient to create something from first principles, you can do a lot worse than to play with Twitter Bootstrap for your next...]]></description>
			<content:encoded><![CDATA[
        <p>If you're like me and your design skills are sufficient to modify something
decent to look nice but insufficient to create something from first principles,
you can do a lot worse than to play with <a
href="http://twitter.github.com/bootstrap/">Twitter Bootstrap</a> for your next
web site.</p>

<p>I've used it successfully for a few projects and it's been great.</p>

<p>It's a lot better now that I've written my own silly little <a
href="http://template-toolkit.org/">Template Toolkit</a> plugin to reduce the
need for writing lots of repetitive HTML in my templates. (It's like <a
href="http://haml-lang.com/">Haml</a> but less ugly and more Perlish and easier
to extend.)</p>

<p>Writing a TT2 plugin is relatively easy. Of course I do it the wrong way;
when you initialize your plugin, you have the ability to manipulate TT2's
stash. This is the data structure representing the variables in scope in your
templates. Where a well-behaved template should use object methods to perform
its operations, my code stuffs function references in the stash. Here's the
relevant code:</p>

<pre><code>sub new
{
    my ($class, $context, @params) = @_;

    $class-&gt;add_functions( $context );

    return $class-&gt;SUPER::new( $context, @params );
}

sub add_functions
{
    my ($class, $context) = @_;
    my $stash             = $context-&gt;stash;

    while (my ($name, $ref) = each %exports)
    {
        $stash-&gt;set( $name, $ref );
    }

    $stash-&gt;set( process =&gt; sub { $context-&gt;process( @_ ) } );
}</code></pre>

<p>I'll fix this eventually, but the process of making this work was
interesting.</p>

<p>In my first attempt (see <a
href="http://www.modernperlbooks.com/mt/2012/05/write-the-wrong-code-first.html">Write
the Wrong Code First</a> for the justification), I'd write the function I
needed, like <code>row()</code>, which creates a new Bootstrap row or
<code>maincontent()</code> which creates the main content area of the page.
Then I'd add that function to the <code>%exports</code> hash and everything
would work.</p>
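
<p>That first attempt is not shown in the post. A minimal sketch of what such a
hand-maintained registry might look like, with simplified function bodies and
an assumed <code>span8</code> class name:</p>

<pre><code>use strict;
use warnings;

sub row         { return qq{&lt;div class="row"&gt;@_&lt;/div&gt;\n}   }
sub maincontent { return qq{&lt;div class="span8"&gt;@_&lt;/div&gt;\n} }

# Every new helper needs a second edit here; forget one and the
# template never sees the function.
my %exports = (
    row         =&gt; \&amp;row,
    maincontent =&gt; \&amp;maincontent,
);

print join( ', ', sort keys %exports ), "\n";</code></pre>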

<p>After the sixth function, keeping that list up to date was tedious. Then I
kept forgetting it. After all, any time you have to update the same data in two
places, you're doing something wrong.</p>

<p>Now the code looks more like:</p>

<pre><code>sub row <strong>:Export</strong>
{
    return &lt;&lt;END_HTML;
&lt;div class="row"&gt;
    @_
&lt;/div&gt;
END_HTML
}</code></pre>

<p>... with a single code attribute marking those functions which I want to
stuff into the template stash. I've used <a
href="http://search.cpan.org/perldoc?Attribute::Handlers">Attribute::Handlers</a>
before, but I always end up reading the manual and playing with things to get
them to work correctly. (Something about the way you have to write another
package and inherit from it to get your attributes to work correctly always
confuses me.)</p>

<p>My second attempt lasted no longer than ten minutes. I switched to <a href="http://search.cpan.org/perldoc?Attribute::Lexical">Attribute::Lexical</a>. This is almost as trivial to use as to explain:</p>

<pre><code>use Attribute::Lexical 'CODE:Export' =&gt; \&amp;export_code;</code></pre>

<p>Whenever any function has the <code>:Export</code> attribute, Perl will call
my <code>export_code()</code> function:</p>

<pre><code>my %exports;

sub export_code
{
    my $referent = shift;
    my $name     = Sub::Identify::sub_name( $referent );

    return unless $name;
    $exports{$name} = $referent;
}</code></pre>

<p>The first argument to this function is a reference to the exported function.
I use <a href="http://search.cpan.org/perldoc?Sub::Identify">Sub::Identify</a>
to get the name of the function reference. (That wouldn't work for anonymous
functions, but I can control that here.) Then I store the name of the function
and the function reference in a hash.</p>
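
<p>The kind of lookup <code>Sub::Identify</code> performs can be approximated
with the core <code>B</code> module. This sketch is for illustration only;
<code>name_of()</code> is a hypothetical helper, not part of any module:</p>

<pre><code>use strict;
use warnings;
use B ();

# Ask the compiled code reference which glob (and so which name)
# it was installed under.
sub name_of
{
    my $coderef = shift;
    return B::svref_2object( $coderef )-&gt;GV-&gt;NAME;
}

sub row { return "&lt;div class=\"row\"&gt;@_&lt;/div&gt;" }

print name_of( \&amp;row ), "\n";        # row
print name_of( sub { 1 } ), "\n";    # __ANON__</code></pre>

<p>With this lookup, an anonymous function reports the placeholder name
<code>__ANON__</code> rather than an empty string, so a registry built this way
may want to skip that name explicitly.</p>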

<p>It took as long to write as it does to explain.</p>

<p>A lot of people dislike the use of attributes. Used poorly, they create
weird couplings and plenty of action at a distance.
<code>Attribute::Handlers</code> can be confusing.</p>

<p>I like to think that I'm using attributes well here (even if I'm abusing TT2
more than a little), and that they've simplified my code so that I can avoid
repeating myself and performing manual busywork that I'm likely to forget. Even
better, the code to use them isn't magical at all: it's all hidden behind the
pleasant interfaces of <code>Attribute::Lexical</code> and
<code>Sub::Identify</code>.</p>
        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/05/11/simple-attribute-based-template-exporting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NYTProf, File IO, and an Optimization Gone Awry</title>
		<link>http://www.modernperlbooks.com/mt/2012/05/nytprof-file-io-and-an-optimization-gone-awry.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/05/nytprof-file-io-and-an-optimization-gone-awry.html#comments</comments>
		<pubDate>Mon, 07 May 2012 21:56:41 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[cpan]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[profiling]]></category>
		<category><![CDATA[softwaredevelopment]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=ce8793e9ef2fc7b17ac9464159a79cc7</guid>
		<description><![CDATA[One of my projects performs a lot of web scraping. Once every n units of time (where n can be days or weeks), a batch process fetches several web pages and extracts information from them. It's a problem solved very...]]></description>
			<content:encoded><![CDATA[
        <p>One of my projects performs a lot of web scraping. Once every <em>n</em>
units of time (where <em>n</em> can be days or weeks), a batch process fetches
several web pages and extracts information from them. It's a problem solved
very well.</p>

<p>I designed this system around the idea of a pipeline of related processes,
where each component is as independent and idempotent as possible. This has
positives and negatives; it's an abstraction like any other.</p>

<p>I initially wrote the "fetch remote web page" and "analyze data from that
page" as a single step, because I thought "analyze" was the main goal and
"fetch" was a dependent task. I separated them a couple of weeks ago to
simplify the system: analysis now expects data to be there, while fetching can
be parallel on a single machine or across multiple machines. (Testing the analysis step
is also much easier because feeding in dummy data is now trivial.)</p>

<p>I use the filesystem as a cache for these fetched files. That's easy to
manage. I modified the role I use to grab data for the analysis stage to look
in the cache first, then fall back to a network request. That was easy too. The
<code>get_formatted_data_for_analysis()</code> method looked something like:</p>

<pre><code>sub get_formatted_data_for_analysis
{
    my ($self, $type, $key) = @_;

    my $cached_path         = $self-&gt;get_cached_path( $type, $key );
    if (-e $cached_path)
    {
        my $text = read_file( $cached_path );
        return $self-&gt;formatter-&gt;format_string( $text ) if $text;
    }

    return $self-&gt;formatter-&gt;format_string( $self-&gt;fetch_by_url( $type, $key ) );
}</code></pre>
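
<p>The post does not show <code>get_cached_path()</code>. One plausible shape,
written here as a plain function rather than a method, maps a type and key to a
stable file name. The directory layout, the <code>.html</code> suffix, and the
use of an MD5 digest are all assumptions made for this sketch:</p>

<pre><code>use strict;
use warnings;
use File::Spec;
use Digest::MD5 'md5_hex';

# Hypothetical: the same (type, key) pair always produces the same
# path, so the fetch and analysis stages agree on where data lives.
sub get_cached_path
{
    my ($cache_dir, $type, $key) = @_;
    return File::Spec-&gt;catfile( $cache_dir, $type, md5_hex( $key ) . '.html' );
}

my $path = get_cached_path( 'cache', 'quotes', 'http://www.example.com/page' );
print "$path\n";</code></pre>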

<p>I thought I was done. This trivial caching layer took five minutes to write and gave my project a lot of flexibility.</p>

<p>I thought this would speed up the processing stage, because I was able to
make the fetching stage embarrassingly parallel so that more than one fetch
could block on network IO simultaneously. My rough benchmark didn't show any
speed improvement, but it was fast enough, so I moved on.</p>

<p>On Friday I decided to profile the slowest stage of the application with <a
href="http://search.cpan.org/perldoc?Devel::NYTProf">Devel::NYTProf</a>. The
slowest stage was the processing stage. I isolated it so that it performed no
network fetching. It was still slow.</p>

<p>One of the formatter modules used to extract data from web pages is <a
href="http://search.cpan.org/perldoc?HTML::FormatText::Lynx">HTML::FormatText::Lynx</a>.
It allows me to run <code>lynx --dump</code> to strip out all of the HTML and
other formatting of a document. The formatter allows you to pass in the name of
a file or the contents of a file as a string.</p>

<p>For some reason, most of the time in the processing stage in the profile was
spent in file IO. That wasn't too surprising; these aren't all small files and
there may be thousands of them. I dug deeper.</p>

<p>Most of the time in the processing stage in the profile was spent in reading
the files in my method and reading files in the formatter&mdash;reading files,
even though I was passing the contents of those files to the formatter as
strings.</p>

<p>I poked around at a few other things, but came back to the source code of
the formatter. A comment in <a
href="http://search.cpan.org/perldoc?HTML::FormatExternal">HTML::FormatExternal</a>
says:</p>

<blockquote><code>format_string()</code> takes the easy approach of putting the
string in a temp file and letting <code>format_file()</code> do the real work.
The formatter programs can generally read stdin and write stdout, so could do
that with <code>select()</code> to simultaneously write and read
back.</blockquote>
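
<p>To make the cost concrete, here is a sketch of the approach that comment
describes. It is not the module's actual source: the stand-in
<code>format_file()</code> only reads the file back, where a real formatter
would run an external program on it:</p>

<pre><code>use strict;
use warnings;
use File::Temp ();

# Stand-in for a formatter's format_file(): read the whole file
# and return its contents.
sub format_file
{
    my $path = shift;
    open my $fh, '&lt;', $path or die "Cannot read $path: $!";
    local $/;
    return scalar &lt;$fh&gt;;
}

# The "easy approach": write the string to a temp file, then let
# format_file() read it back in.
sub format_string
{
    my $text = shift;
    my $tmp  = File::Temp-&gt;new( SUFFIX =&gt; '.html' );
    print {$tmp} $text;
    close $tmp or die "Cannot close temp file: $!";
    return format_file( $tmp-&gt;filename );
}

print format_string( "&lt;p&gt;cached page&lt;/p&gt;\n" );</code></pre>

<p>A caller that reads a cached file into a string and then calls
<code>format_string()</code> pays for one read, one write, and a second read of
the same data.</p>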

<p>In other words, all of the work I was doing to read in files was busy work,
duplicating what the formatter was about to do anyway. (Okay, I stared at the
code for a couple of minutes, thinking about various approaches of rewriting it
and submitting a patch or monkey patching it. Then I turned lazier and wiser.)
I rewrote my code:</p>

<pre><code>sub get_formatted_data_for_analysis
{
    my ($self, $type, $key) = @_;

    my $cached_path         = $self-&gt;get_cached_path( $type, $key );
    return $self-&gt;formatter-&gt;format_file( $cached_path ) if -e $cached_path;

    return $self-&gt;formatter-&gt;format_string( $self-&gt;fetch_by_url( $type, $key ) );
}</code></pre>

<p>The result was a 25% performance improvement.</p>

<p>Three things jumped out at me in this process. First, how nice is it to have
a working tool like NYTProf and a community that distributes source code, so
that I could examine the whole stack of my application to isolate performance
problems? Second, how interesting that an assumption and an admitted shortcut
in a dependency could have such an effect on my own code. Third, how much more
I like my new code with all of the file handling gone; pushing that
responsibility elsewhere is a nice simplification even without the performance
improvement.</p>

<p>Perhaps the two tools I miss most from my C programming days are
Valgrind/Callgrind and KCachegrind, but NYTProf goes a long way toward filling
that gap. Besides, I'm at least 20 times more productive with a language like
Perl.</p>

        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/05/07/nytprof-file-io-and-an-optimization-gone-awry/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Smoothing the Condescending Onramp</title>
		<link>http://www.modernperlbooks.com/mt/2012/05/smoothing-the-condescending-onramp.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/05/smoothing-the-condescending-onramp.html#comments</comments>
		<pubDate>Wed, 02 May 2012 21:42:28 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[codingstandards]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[novices]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[training]]></category>
		<category><![CDATA[tutorials]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=d0bec1efeae94194d5b907c840f6b5c5</guid>
		<description><![CDATA[If you ever need a dose of humility, solve a non-trivial problem and then watch a Real Actual User try to figure out how to use it. In my second professional job, when I was a system administrator at HP,...]]></description>
			<content:encoded><![CDATA[
        <p>If you ever need a dose of humility, solve a non-trivial problem and then
watch a Real Actual User try to figure out how to use it.</p>

<p>In my second professional job, when I was a system administrator at HP, I
worked in the laser printer group. One afternoon, someone walked by my desk and
asked me to do a user interaction study. I followed her to a little lab area,
where she handed me a list of tasks, and asked me to complete them.</p>

<p>I did, except that I misread the icon on the copier and put in the source
pages upside down, and made ten warm and blank pieces of paper. As soon as that
happened, I <em>understood</em> the icon and why I'd misinterpreted it.</p>

<p>I never heard the results of the study, but I hope my stubborn confusion
ended up improving the product.</p>

<p>User experience (and <em>real</em> user experience, not the fake user
experience stuff that says users are clueless and incapable of all of the
complexity of navigating the cereal aisle of an American grocery store and thus
interfaces must degenerate to a single beveled button which says "DO IT", do
you like my black turtleneck?) is fascinating. What's clear to you, you who
understand the internal model of the software, is perfectly opaque to users. <a
href="http://www.modernperlbooks.com/mt/2011/11/promoting-perls-features-versus-benefits.html">Users
know the results they want</a>, but not necessarily how to achieve them.</p>

<p>Making things easy for novices&mdash;for people who don't have a correct
internal model of the software&mdash;can be compatible with making powerful
software. Consider the <a
href="http://www.perl.com/pub/2012/04/perlunicook-standard-preamble.html">Perl
5 standard Unicode preamble</a> necessary to convince Perl to use the defaults
you probably want to handle anything-but-Latin-1 correctly.</p>
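
<p>A preamble along these lines is commonly recommended. Treat it as a sketch;
the exact set of lines in the linked article may differ:</p>

<pre><code>use utf8;                        # string literals in this file are UTF-8
use v5.14;                       # enables strict, say, and unicode_strings
use warnings;
use warnings  qw(FATAL utf8);    # encoding problems die instead of warn
use open      qw(:std :encoding(UTF-8));    # UTF-8 on the standard streams
use charnames qw(:full :short);  # \N{...} character names

my $word = "caf\N{LATIN SMALL LETTER E WITH ACUTE}";

say length $word;    # 4 characters, not 5 bytes
say uc $word;        # uppercases the accented letter too</code></pre>

<p>That is six lines of setup before the first line of real work, none of which
a novice is likely to discover unaided.</p>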

<p>(When user complaints of "My code doesn't work!" get met on PerlMonks and
the Perl Beginners List and elsewhere with "What's the error message?", you
know the languages, libraries, tools, and ecosystem could do more to help
people debug their own code.)</p>

<p>You see the problem when books and other tutorial materials say "Error
checking is left as an exercise for the reader", as if the burden of writing
correct code or the increased page count is far more important than the desire
to help new programmers learn how to code well.</p>

<p>I'm not only talking about better defaults (like <code>strict</code> enabled with <code>use 5.014;</code>). I'm not only talking about writing and collecting <a href="http://perl-tutorial.org/">good Perl tutorials</a>. (Part of the reason <a href="http://modernperlbooks.com/books/modern_perl/">Modern Perl: the Book is available for free online</a> is to continue to cultivate the culture of making great tutorial material available to anyone and everyone.)</p>
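
<p>A small sketch shows what that default does. The file deliberately has no
<code>use strict</code> of its own, so the only difference between the two
snippets is the <code>use 5.014;</code> line:</p>

<pre><code>use warnings;

# Compile two snippets at runtime. Both assign to an undeclared
# variable; only the one with "use 5.014;" has strictures enabled.
my $lax     = eval q{ $total = 1; 1 };
my $checked = eval q{ use 5.014; $total = 1; 1 };
my $error   = $@;

print "without: ", ( $lax     ? "compiled" : "failed" ), "\n";
print "with:    ", ( $checked ? "compiled" : "failed: $error" ), "\n";</code></pre>

<p>The second snippet fails to compile with a "Global symbol ... requires
explicit package name" error, which is exactly the kind of early feedback a
typo in a variable name deserves.</p>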

<p>With that said, I do despise the attitude of "You have to be clueful enough
to use the proper incantation at the start of your programs before you'll get
help on PerlMonks". Sure, those of us who know Perl <em>now</em> had to learn
the hard way that symbolic references and global variables make our code harder
to manage, that a unified testing system can only improve the CPAN, and that
agreeing on an interoperable OO syntax (if not implementation) lets us
concentrate on solving problems, not rebuilding Greenspun frameworks, but
that's no reason to force the same learning curve on novices.</p>

<p>We'll never remove the essential complexity from programming (to do so would
require us to remove the essential complexity from the problems we're trying to
solve). We <em>can</em> smooth out the onramp for new programmers. That
requires us to think like new programmers and to understand what they're trying
to do and why.</p>

<p>Sometimes that requires those of us who see a question and think
"Wow, everyone knows how to use a hash! What's <em>wrong</em> with you for not
understanding this?" to shut up. (Sometimes the best person to help a new
programmer is someone who was recently new.) Often that requires us to
listen and look for the deeper question.</p>

<p>That <em>probably</em> means being a little gentler on the audiences
we reach when we publish text and code. As Tom Dale wrote in <a
href="http://tomdale.net/2012/04/best-practices-exist-for-a-reason/">Best
Practices Exist for a Reason</a>:</p>

<blockquote>writing code before you have an expert-level understanding is okay.</blockquote>

<p>(The whole post and its comments are... enlightening.)</p>

<p>Ultimately I expect the real point is to know who you're writing for. If
you're only ever writing for your own amusement and you're willing to cut off
everyone who doesn't share your level of knowledge, that's one thing. If you're
writing to help other people&mdash;even if they have just started using Perl
today&mdash;perhaps there are ways you can smooth the onramp for them a little
bit more. After all, the things we think are easy now are easy because we
understand the intricacies of lexical binding and scope and default
topicalization and eager versus iterative file reading and so on.</p>
        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/05/02/smoothing-the-condescending-onramp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Picking Functional Programming&#8217;s Pockets</title>
		<link>http://www.modernperlbooks.com/mt/2012/04/picking-functional-programmings-pockets.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/04/picking-functional-programmings-pockets.html#comments</comments>
		<pubDate>Mon, 30 Apr 2012 19:49:43 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[functionalprogramming]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[laziness]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[purity]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=4665bcfaeb4de72be73c62ebd1976eb2</guid>
		<description><![CDATA[In all of the debates over whether pair programming is exclusively 100% good or exclusively 100% evil or whether test-driven design is exclusively 100% beneficial or exclusively 100% silly, people sometimes miss the nuances of the polemic &#34;if it's hard...]]></description>
			<content:encoded><![CDATA[
        <p>In all of the debates over whether pair programming is exclusively 100% good
or exclusively 100% evil or whether test-driven design is exclusively 100%
beneficial or exclusively 100% silly, people sometimes miss the nuances of the
polemic "if it's hard to test, it's hard to use".</p>

<p>In practice, that means that good programmers with good taste built from
painful experiences have the ability to write better code if they exercise good
taste when building tests.</p>

<p>(I know, this is the Internet of 21st century culture; the law of the
excluded middle suggests that nuance, like irony, is deader than 19th century
utopian cults. Doesn't mean that 10,000 volts of CPR are <em>always</em>
wasted.)</p>

<p>That's what I had in mind when I wrote <a
href="http://www.modernperlbooks.com/mt/2012/04/mock-objects-despoil-your-tests.html">Mock
Objects Despoil Your Tests</a>. (See also Martin Fowler's <a
href="http://martinfowler.com/articles/mocksArentStubs.html">Mocks Aren't
Stubs</a>.)</p>

<p>The more gyrations your code has to undergo before you're confident that it
does what you intend it to do, no more and no less, the less confidence you
have overall. In highfalutin' architecture astronaut terms, the more tightly coupled your tests are to the internals of your code, the worse your tests are. They could be fragile. They could make too many assumptions. They could be exercising things that no real code would exercise. They could be hard to write and overspecific.</p>

<p>In short, the likelihood that you've built yourself a maintenance burden is
higher when you know far too much about the internals of a thing outside of
that thing, even if the thing on the outside is a test intended to give you
confidence.</p>

<p>(That's why I distrust putting code and tests in the same file, thank you
very much Java. It's too tempting to cheat when the clear lines of demarcation
aren't there.)</p>

<p>I only realized what I've been doing lately when I read Buddy Burden's <a
href="http://blogs.perl.org/users/buddy_burden/2012/04/lazy-cache.html">Lazy ==
Cache?</a>. He describes Moose lazy attributes the way I see them: as a promise
to provide information when you need it. That laziness is a hallmark of
Haskell. If you take laziness as far as Haskell does, you can build amazing
things where things just happen when you need them.</p>
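
<p>The idea of an attribute as a promise can be sketched in plain Perl. This
imitates the behavior of a Moose attribute declared as lazy with a builder; it
is not Moose itself, and the <code>Report</code> class and its method names are
hypothetical:</p>

<pre><code>package Report;

use strict;
use warnings;

sub new { return bless { builds =&gt; 0 }, shift }

# Lazy accessor: compute the value on first request, then reuse it.
sub summary
{
    my $self = shift;
    return $self-&gt;{summary} //= $self-&gt;_build_summary;
}

sub _build_summary
{
    my $self = shift;
    $self-&gt;{builds}++;    # count how often the expensive work runs
    return 'expensive result';
}

package main;

my $report = Report-&gt;new;
$report-&gt;summary for 1 .. 3;

print "built $report-&gt;{builds} time(s)\n";    # built 1 time(s)</code></pre>

<p>Callers never ask whether the value exists yet; they ask for it, and the
object does the work the first time only.</p>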

<p>Haskell, of course, goes a long way to encourage you to write programs in a
pure style, where functions don't have side effects. Data comes into a function
and data goes out, and the state of the world stays unchanged. Sure, you can't
write any interesting program without at least performing IO, but Haskell
encourages you to embrace purity as much as possible such that you minimize the
places you update global state.</p>

<p>In my recent code, this has also just sort of happened, even in that code
which isn't Haskell.</p>

<p>Consider an application which tracks daily stock market information, such as
price and market capitalization. Each stock is a row in a table modeled by <a
href="http://search.cpan.org/perldoc?DBIx::Class">DBIx::Class</a>. Each stock
has an associated state, like "fetch daily price" or "write yearly free cash
graph" or "invalid name; review".</p>

<p>No one would fault you for updating the stock price, market cap, and state
on a successful fetch from the web service which provides this information.
That's exactly what I used to do.</p>

<p>Now I don't.</p>

<p>I've separated the fetching of data from the parsing of data from the
updating of data. Fetching and updating are solved problems; they happen at the
boundaries of my code and I can only control so much about them. Either the
database works or it doesn't. Either the remote web service is up or it isn't.
(I still test them, but I've isolated them as much as possible.)</p>

<p>The interesting thing is always in the parsing and analysis. This is where
all of the assumptions appear. (Is Berkshire Hathaway's A class
<code>BRK.a</code> or <code>BRK-A</code> or something else? Are abbreviations
acceptable in sector and industry classifications?) This is where I want to
focus my testing&mdash;even my ad hoc testing, when I've found an assumption
but need to research what's gone wrong and why before I can formalize my
solution in test cases and code.</p>

<p>This means the daily analysis method looks something like:</p>

<pre><code>sub analyze_daily
{
    my ($self, $stock, $updates) = @_;
    my $stats                    = $self-&gt;get_daily_stats_for( $stock-&gt;symbol );

    return unless $stats-&gt;{current_price};
    $updates-&gt;{current_price} = $stats-&gt;{current_price};

    return unless $stats-&gt;{market_capitalization};
    $updates-&gt;{market_capitalization} = $stats-&gt;{market_capitalization};

    $updates-&gt;{PK} = $stock-&gt;symbol;
    return 1;
}</code></pre>

<p>Any code that wants to test this can pass in a hash reference for
<code>$updates</code> and a stock object (or equivalent) in <code>$stock</code>
and test that the results are sane by exploring the hash reference directly,
rather than poking around in <code>$stock</code>.</p>
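
<p>As a minimal sketch of such a test (not the real test suite): the
<code>FakeStock</code> and <code>FakeAnalyzer</code> packages below are
hypothetical stand-ins, and <code>get_daily_stats_for()</code> is assumed to
return a plain hash reference of fixture data. Only
<code>analyze_daily</code> itself comes from the code above.</p>

<pre><code>use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a DBIx::Class row: it only knows its symbol.
{
    package FakeStock;
    sub new    { my ($class, %args) = @_; return bless { %args }, $class }
    sub symbol { return $_[0]{symbol} }
}

# Hypothetical analyzer whose fetcher returns fixture data, not live data.
{
    package FakeAnalyzer;
    sub new { my ($class, %args) = @_; return bless { %args }, $class }

    sub get_daily_stats_for
    {
        my ($self, $symbol) = @_;
        return $self-&gt;{fixtures}{$symbol} || {};
    }

    # Copied from the method shown above.
    sub analyze_daily
    {
        my ($self, $stock, $updates) = @_;
        my $stats = $self-&gt;get_daily_stats_for( $stock-&gt;symbol );

        return unless $stats-&gt;{current_price};
        $updates-&gt;{current_price} = $stats-&gt;{current_price};

        return unless $stats-&gt;{market_capitalization};
        $updates-&gt;{market_capitalization} = $stats-&gt;{market_capitalization};

        $updates-&gt;{PK} = $stock-&gt;symbol;
        return 1;
    }
}

my $analyzer = FakeAnalyzer-&gt;new(
    fixtures =&gt; {
        ABC =&gt; { current_price =&gt; 10.5, market_capitalization =&gt; 1_000_000 },
        XYZ =&gt; { current_price =&gt; 3 },    # no market cap in this fixture
    },
);

# Complete data: the method succeeds and fills in every field.
my %updates;
ok $analyzer-&gt;analyze_daily( FakeStock-&gt;new( symbol =&gt; 'ABC' ), \%updates ),
    'complete stats should succeed';
is_deeply \%updates,
    { current_price =&gt; 10.5, market_capitalization =&gt; 1_000_000, PK =&gt; 'ABC' },
    '... and record price, market cap, and key';

# Incomplete data: the method returns false after recording only the price.
my %partial;
ok !$analyzer-&gt;analyze_daily( FakeStock-&gt;new( symbol =&gt; 'XYZ' ), \%partial ),
    'missing market cap should fail';
is_deeply \%partial, { current_price =&gt; 3 },
    '... having recorded only the price';

done_testing;</code></pre>

<p>The second case never touches a database or a stock row. The hash
reference alone shows how far the method got before it gave up.</p>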

<p>(The data fetcher itself uses dependency injection and fixture data so that
all expected values are known values and that network errors or transient
failures don't affect this test; obviously other tests must verify that the
remote API behaves as expected. While I could make <code>$stats</code> a
parameter here, I haven't had the need to go that far yet. There's a point
beyond which removing dependencies from inside a discrete unit of code makes
little sense.)</p>

<p>This code is also much more reusable; it's trivial to create a <em>bin/</em>
or <em>script/</em> directory full of little utilities which use the same API
as the tests and help me debug or clean up or inspect all of this wonderful
data.</p>

<p>Better yet, I find myself needing fewer tests, because each unit under test
does less and has fewer loops and conditionals and edge cases. The problem
becomes "What's the right fixture data to exercise the interesting behavior of
this code?" My tests care less about managing the state of the objects and
entities under test than they do about the transformations of data.</p>

<p>Perhaps it's not so strange that that's exactly what my programs care
about too.</p>
        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/04/30/picking-functional-programmings-pockets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Make a DBIC Schema from DDL</title>
		<link>http://www.modernperlbooks.com/mt/2012/04/make-a-dbic-schema-from-ddl.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/04/make-a-dbic-schema-from-ddl.html#comments</comments>
		<pubDate>Fri, 27 Apr 2012 20:05:45 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[cpan]]></category>
		<category><![CDATA[dbixclass]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=89dba20bb9817607a15c78b55adb195f</guid>
		<description><![CDATA[For some reason, creating DBIx::Class schemas by hand has never made sense to me. I like to write my CREATE TABLE statements instead. DBIx::Class::Schema::Loader works really well for this. I keep this schema DDL in version control. I also keep...]]></description>
			<content:encoded><![CDATA[
        <p>For some reason, creating <a
href="http://search.cpan.org/perldoc?DBIx::Class">DBIx::Class</a> schemas by
hand has never made sense to me. I like to write my <code>CREATE TABLE</code>
statements instead. <a
href="http://search.cpan.org/perldoc?DBIx::Class::Schema::Loader">DBIx::Class::Schema::Loader</a>
works really well for this.</p>

<p>I keep this schema DDL in version control. I also keep a SQLite database
around with some test data (but the database isn't in version control).</p>

<p>I usually find myself writing a little shell script or other program to
regenerate the DBIC schema from that test database. That usually requires me to
make manual changes to the test database representing the changes I've just
made to the DDL.</p>

<p>After doing this one too many times, I decided to combine <a
href="http://search.cpan.org/perldoc?DBIx::RunSQL">DBIx::RunSQL</a> with the
schema loader. By creating a SQLite database from my DDL <em>in memory</em>, I
can create a schema without modifying any databases manually.</p>

<p>This was easier than I thought:</p>

<pre><code>#!/usr/bin/env perl

use Modern::Perl;

use DBIx::RunSQL;
use DBIx::Class::Schema::Loader 'make_schema_at';

my $test_dbh = DBIx::RunSQL-&gt;create(
    dsn     =&gt; 'dbi:SQLite:dbname=:memory:',
    sql     =&gt; 'db/schema.sql',
    force   =&gt; 1,
    verbose =&gt; 1,
);

make_schema_at( 'MyApp::Schema',
    {
        components =&gt; [ 'InflateColumn::DateTime', 'TimeStamp' ],
        debug =&gt; 1,
        dump_directory =&gt; './lib' ,
    },
    [ sub { $test_dbh }, {} ]
);</code></pre>

<p>The next step is to connect everything to <a href="http://search.cpan.org/perldoc?DBIx::Class::Migration">DBIx::Class::Migration</a>&mdash;but first things first.</p>
        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/04/27/make-a-dbic-schema-from-ddl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Embrace the Little Conveniences</title>
		<link>http://www.modernperlbooks.com/mt/2012/04/embrace-the-little-conveniences.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/04/embrace-the-little-conveniences.html#comments</comments>
		<pubDate>Wed, 25 Apr 2012 18:55:09 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[codereuse]]></category>
		<category><![CDATA[cpan]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=ff38d1c7a63fad035dc6c3c1098e8a60</guid>
		<description><![CDATA[When Perl 6 introduced say (like print, but appends a newline) I had some skepticism. Yes, the Modern::Perl module was as much a polemic as it was a convenience. I know File::Slurp exists, but my fingers by now know how...]]></description>
			<content:encoded><![CDATA[
        <p>When Perl 6 introduced <code>say</code> (like <code>print</code>, but
appends a newline) I had some skepticism.</p>

<p>Yes, the <a href="http://search.cpan.org/perldoc?Modern::Perl">Modern::Perl</a> module was as much a polemic as it was a convenience.</p>

<p>I know <a href="http://search.cpan.org/perldoc?File::Slurp">File::Slurp</a>
exists, but my fingers by now <em>know</em> how to read from a file in a single
line of (impenetrable to the uninitiated) code:</p>

<pre><code>my $text = do { local (@ARGV, $/) = $file; &lt;&gt; };</code></pre>

<p>... and in each case, my initial feeling of "Why bother? What does that
offer? How silly!" was wrong. In every one of these cases, the ability to
write (and the requirement to <em>read</em>) less code has made my code
better.</p>

<p>With <code>say</code> I don't have to worry about single- versus
double-quotes, or even quoting at all sometimes. With <code>use
Modern::Perl;</code>, I don't have to worry about enabling various features and
pragmas. With <code>File::Slurp</code>, all I have to care about when reading
from a file is typing <code>read_file( $path )</code>.</p>
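
<p>For comparison, here is a small sketch of the two approaches side by side.
It reads the running script itself (<code>$0</code>) purely so that the
example needs no other file; that choice, and the variable names, are
assumptions for illustration.</p>

<pre><code>use strict;
use warnings;
use feature 'say';

use File::Slurp 'read_file';

# Read this script itself so the example is self-contained.
my $path = $0;

# The hand-rolled idiom: localize @ARGV and $/ so that the diamond
# operator reads the whole named file in one go.
my $by_hand = do { local (@ARGV, $/) = $path; &lt;&gt; };

# The CPAN version: in scalar context, read_file() returns the contents.
my $by_module = read_file( $path );

say $by_hand eq $by_module ? 'same contents' : 'different contents';</code></pre>

<p>Both lines do the same job. Only one of them explains itself to the next
reader.</p>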

<p>None of these are big deals on their own, but they're little details I don't
have to worry about anymore. The same principle which says that <a
href="http://search.cpan.org/perldoc?Proc::Fork">Proc::Fork</a> is easier to
manage than writing your own forking code (I've written far too much of my own
forking code) applies.</p>

<p>Sometimes getting the little nuisances out of the way makes me more
productive and ready to tackle the big nuisances. Maybe saving my brainpower
for complicated problems (what's the standard deviation from a least squares
fit?) is a better approach than typing my own <code>read_file()</code> function
on every project.</p>

<p>As silly as it once seemed to use a CPAN module for a one-liner, I've
realized that <em>not</em> reusing good code is even sillier.</p>

        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/04/25/embrace-the-little-conveniences/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dependencies, Minimizers, and Regressing to JavaScript</title>
		<link>http://www.modernperlbooks.com/mt/2012/04/dependencies-minimizers-and-regressing-to-javascript.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/04/dependencies-minimizers-and-regressing-to-javascript.html#comments</comments>
		<pubDate>Fri, 20 Apr 2012 18:29:54 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[cpan]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[modernperl]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=87d8f71e8be6ad2f75438fd9553c24bd</guid>
		<description><![CDATA[JavaScript is Perl 4 with first class functions, slightly better lexicals, better implementations, and more users. (If that hasn't offended you yet, note that that sentence doesn't include &#34;a better type system&#34;, on purpose.) While you can do some amazing...]]></description>
			<content:encoded><![CDATA[
        <p>JavaScript is Perl 4 with first class functions, slightly better lexicals,
better implementations, and more users.</p>

<p>(If that hasn't offended you yet, note that that sentence doesn't include "a
better type system", <em>on purpose</em>.)</p>

<p>While you can do some amazing things with modern JavaScript (see also <a
href="http://clubcompy.com/">ClubCompy</a>, a retro-style programming
environment designed for kids of all ages, for which we have a compiler and
interpreter written in JavaScript), its flaws of language and ecosystem are
obvious. The latter are obviously products of its environment.</p>

<p>Consider: you don't have anything like the CPAN for JavaScript in the
browser. (Yes, I'm aware of <a href="http://npmjs.org/">NPM</a>. No, it doesn't
count. The <em>point</em> of client-side JavaScript delivered from a web page
is that you don't have to have anything other than a web browser installed.)</p>

<p>Consider: this means you have two choices. You can make lots of requests
for your dependent libraries (jQuery, any plugins you use, the JavaScript
you've written). That's good in that, if you use these libraries unmodified and
load them from a public CDN, there's a chance some cache in the middle will
already have them cached, but you still pay the network penalty for loading all
of those libraries <em>n</em> at a time. Or you can glue them all together on
the server side somehow and send the client only one thing, except that it's
only cached for your site.</p>

<p>Also, if you find a bug in a dependency, you get to regenerate that big blob
of code. (If you <em>don't</em> find that bug, you get to live with it.)</p>

<p>I think about these things when I see <a
href="http://blogs.perl.org/users/ovid/2012/04/the-price-of-cleverness-yaml-is-not-safe.html">a
big lump of code stuffed into YAML.pm</a>. Because we've left 1994 behind in
the Perl world, we're able to take advantage of an amazing library distribution
and dependency management system in the CPAN, where installing dependencies
(and knowing they pass their tests) is so well understood that it's an
exceptional condition when it <em>doesn't</em> work. In the past couple of
years, installations have become so easy thanks to newer tools like perlbrew
and cpanm that (if you're in the know) it's easier to manage code this way than
to consider not.</p>

<p>... except for when you <a
href="https://github.com/schwern/test-more/blob/Test-Builder1.5/lib/TB2/Mouse.pm">stuff
generated code in your repository instead of as a dependency</a>.
(Test::Builder is a strange case. You want your underlying test library to be
as stupidly simple as possible and not to rely on anything else, so that it's
as unlikely as possible to fail or to interfere with what you're
testing.)</p>

<p>Now when there's a bug in the dependency in the generated code, everything
which uses the dependency has to be updated too. (Read carefully. You can't
merely update the dependency. You have to know everything which depends on it
and wait for the authors to get around to updating their generated code.)</p>

<p>I admit, I'll probably never understand the mindset which says "I'm
distributing software for end-users to install in the worst possible way so
that they won't have to install software." I understand the use of things like
<a href="http://search.cpan.org/perldoc?App::FatPacker">App::FatPacker</a> to
make one-file installations possible, but actively distributing generated code
in CPAN distributions? Where CPAN has a working dependency resolution model
already in place? Where your distribution is already an upstream dependency of
thousands of other distributions?</p>

<p>I just don't understand it. I understand that the business of shipping
software is the art of managing competing needs, but I can't see how optimizing
for fragility helps anyone.</p>
        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/04/20/dependencies-minimizers-and-regressing-to-javascript/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Method-Function Equivalence Strikes Again!</title>
		<link>http://www.modernperlbooks.com/mt/2012/04/method-function-equivalence-strikes-again.html</link>
		<comments>http://www.modernperlbooks.com/mt/2012/04/method-function-equivalence-strikes-again.html#comments</comments>
		<pubDate>Wed, 18 Apr 2012 16:51:05 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[debugging]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://perlblogs.com/?guid=06b7e7b93ed59b8e92c81f1ba367214d</guid>
		<description><![CDATA[One of the satisfying aspects of writing an opinionated book like Modern Perl is writing a section like Avoid Method-Function Equivalence. Explaining to a novice programmer a potential pitfall and how to avoid it always seems to me like reducing...]]></description>
			<content:encoded><![CDATA[
        <p>One of the satisfying aspects of writing an opinionated book like <a href="http://www.modernperlbooks.com/books/modern_perl">Modern Perl</a> is writing a section like <a href="http://modernperlbooks.com/books/modern_perl/chapter_11.html#TWV0aG9kLUZ1bmN0aW9uRXF1aXZhbGVuY2U">Avoid Method-Function Equivalence</a>. Explaining to a novice programmer a potential pitfall and how to avoid it always seems to me like reducing the amount of potential misery in the world.</p>

<p>That's satisfying.</p>

<p>I've been revising a proof of concept document categorization system into
shape for the past year, by adding tests and refactoring and cleaning things up
and even adding features. Every week it gets a little bit better, and it's
fascinating to discover the patterns of this style of programming. (It's
related to <a
href="http://www.modernperlbooks.com/mt/2012/04/debuggability-driven-design.html">debuggability-driven
design</a>.) I've enjoyed the experience of watching code get more general and
useful and powerful even as that's meant shuffling around code and concepts far
beyond the initial design. While there are still messes (what working code
doesn't have a mess somewhere?), the code has a goodness to it.</p>

<p>Just when you get a big head, the universe punishes you for your unwarranted
hubris. (Annie Dillard once wrote "I no longer believe in divine playfulness."
Sometimes "divine antiauthoritarianism" is more like it.)</p>

<p>Monday night, my business partner found a bug. We have a categorization
system and several topics into which these documents could find themselves. We
added several new categories last month, and I had to revise the sharing system
such that documents in one cluster of categories never appeared in other
clusters. (Think of it this way: you have a newspaper and want to group
articles about food, television, movies, and books in a Life and Culture
section and articles about basketball, lacrosse, and hockey into a Sports
section, but you never accidentally want an article about food to show up in
the Sports section or an article about the felonious tax evasion of Kenny Mauer
to show up in the Life and Culture section.)</p>

<p>One line of filtering that's easy to explain to users is keyword filtering.
Any article in this topic (food, television, books) must contain one of these
keywords: food, television, cuisine, literature, novel, bestseller, author. You
get the picture.</p>

<p>Monday's bug was that documents in a single cluster which obviously belonged
to a single topic ("Which Television Shows Won't Be Back Next Season", for a
fake example) within a cluster showed up as belonging to the cluster as a whole
("Life and Culture") and not the topic within the cluster ("Television").</p>

<p>Fortunately I had most of the necessary scaffolding to build in debugging support. I expected that the keyword filtering was to blame, whether missing the appropriate keywords or not applying them appropriately. (I wondered if the system used a case-sensitive regular expression match or didn't stem noun phrases for comparison appropriately.)</p>

<p>Turns out it was my silly mistake.</p>

<p>All of this filtering for validity and cross-topic intra-cluster association
is in the single module <code>MyApp::Filter</code>. This started life as a
couple of <em>functions</em> that didn't belong elsewhere. As I moved more and
more code around and defined the filtering behavior more concretely, it grew
until it made more sense to treat these functions as methods. It's not an
object yet. It may never become an object; it manages no state. Yet I changed
its invocation mechanism from:</p>

<pre><code>
=head2 make_bounded_regex

Turns a list of arguments into an optimized, case-insensitive regex which
matches any of them and requires boundaries at their ends.

=cut

sub make_bounded_regex
{
    return unless @_;

    my @keywords = map { s/\s/./; $_ } @_;
    my $ra       = Regexp::Assemble-&gt;new( flags =&gt; 'i' );
    my $re       = $ra-&gt;add( map { '\b' . $_ . '\b' } @keywords )-&gt;re;

    return qr/$re/;
}</code></pre>

<p>... to:</p>

<pre><code>sub make_bounded_regex
{
    <strong>my $class = shift;</strong>
    return unless @_;
    ...
}</code></pre>

<p>I made all of these functions into methods in one fell refactoring swoop.
(Why not? Be consistent! Do more than the bare minimum! Eat your vegetables!) I missed one place which called <code>make_bounded_regex()</code>:</p>

<pre><code>sub _build_keyword_filter
{
    my $self     = shift;
    my $keywords = $self-&gt;keywords;
    return unless @$keywords;

    # BUG: this is still a function call, not a method call
    return MyApp::Filter::make_bounded_regex( @$keywords );
}</code></pre>

<p>... such that the first keyword (and usually the most important, because
that's what users put in first) becomes the <code>$class</code> parameter to
the method. Because it's a class method, nothing ever uses <code>$class</code>,
so there's no error message about wrong package names.</p>
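
<p>The effect is easy to reproduce in isolation. This is a sketch, not the
real module: <code>Demo::Filter</code> is a hypothetical stand-in, and it
builds its regex with a plain <code>join</code> rather than Regexp::Assemble so
that it needs nothing beyond core Perl.</p>

<pre><code>use strict;
use warnings;
use feature 'say';

{
    package Demo::Filter;    # hypothetical stand-in for the real filter module

    sub make_bounded_regex
    {
        my $class = shift;   # swallows the first keyword in a function call
        return unless @_;

        my $alternation = join '|', map { quotemeta } @_;
        return qr/\b(?:$alternation)\b/i;
    }
}

my @keywords = qw( television novel author );
my $title    = "Which Television Shows Won't Be Back Next Season";

# Correct: the class name fills $class and all three keywords survive.
my $as_method = Demo::Filter-&gt;make_bounded_regex( @keywords );

# Buggy: 'television' fills $class, leaving only 'novel' and 'author'.
my $as_function = Demo::Filter::make_bounded_regex( @keywords );

say 'method call matches:   ', ( $title =~ $as_method   ? 'yes' : 'no' );
say 'function call matches: ', ( $title =~ $as_function ? 'yes' : 'no' );</code></pre>

<p>The method call matches the title and the function call does not, yet both
run without a single warning under <code>strict</code> and
<code>warnings</code>.</p>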

<p>The tests don't catch this either, because of the distribution of test data.
(Obviously a mistake to rectify.)</p>

<p>Sure, a language with integrated refactoring support (you don't even need an
early binding language with a static type system to get this) could have shown
me the error right away. (That's one thing I <em>do</em> like about Java. Sure,
you need that scaffolding to get anything done, but it does occasionally help
you not write bugs.)</p>

<p>What bothers me most of all is that Perl itself has no means by which it
could even give an <em>optional</em> warning when you treat a method as a
function or vice versa. You don't have even a runtime safety net here.</p>

<p>Warnings will never replace the need for programmer caution, but bugs
happen. Bugs always happen. I keep the error log as squeaky clean as possible,
and warnings have caught a lot of bugs and potential bugs even during testing,
sometimes in our deployed software.</p>

<p>In lieu of warnings though, the best I can do is document my mistakes and
explain why I made them in the hope that I won't make them again and you'll
be more cautious than I was. (At least this one was easy to fix.)</p>

        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2012/04/18/method-function-equivalence-strikes-again/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

