<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Perlblogs &#187; apis</title>
	<atom:link href="http://perlblogs.com/category/apis/feed/" rel="self" type="application/rss+xml" />
	<link>http://perlblogs.com</link>
	<description>Posts from selected Perl bloggers</description>
	<lastBuildDate>Mon, 06 Feb 2012 20:47:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<atom:link rel="hub" href="http://pubsubhubbub.appspot.com"/>
	<atom:link rel="hub" href="http://superfeedr.com/hubbub"/>
		<item>
		<title>Eliminating Errors with Little Languages</title>
		<link>http://www.modernperlbooks.com/mt/2010/07/eliminating-errors-with-little-languages.html</link>
		<comments>http://www.modernperlbooks.com/mt/2010/07/eliminating-errors-with-little-languages.html#comments</comments>
		<pubDate>Tue, 20 Jul 2010 17:51:01 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[apis]]></category>
		<category><![CDATA[languagedesign]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[perl6]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Jamie McCarthy made an interesting point about type safety in embedded SQL on String-Plus: SQL is a great example for this. Relational databases are more useful with strong typing, so EMPLOYEE_ID is incompatible with PRODUCT_ID even if they are both...]]></description>
			<content:encoded><![CDATA[
        <p><a
href="http://www.modernperlbooks.com/mt/2010/07/string-plus.html#comment-500">Jamie
McCarthy made an interesting point about type safety in embedded SQL</a> on <a
href="http://www.modernperlbooks.com/mt/2010/07/string-plus.html">String-Plus</a>:</p>

<blockquote>SQL is a great example for this. Relational databases are more
useful with strong typing, so EMPLOYEE_ID is incompatible with PRODUCT_ID even
if they are both implemented as INT. It'd be a great idea to see those
constraints implemented at the perl level, presumably by giving perl more
knowledge of the database schema than even the database engine
has.</blockquote>

<p>Imagine that you have, or can write, a little language parser for a SQL-like language.  My simple example was:</p>

<pre><code>SQL {{
    UPDATE users SET address = { Address $address } WHERE user = { User $user }
}}</code></pre>

<p>This can decompose into several operations:</p>

<ul>

<li>Get the value of the <code>$address</code> variable.</li>

<li>Get the primary key of the <code>$user</code> variable.</li>

<li>Prepare a database query with a rewritten query string which uses placeholders for the <code>$address</code> and <code>$user</code> variables to avoid SQL injection and other interpolation errors.</li>

<li>Execute the query.</li>

</ul>
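
<p>Here's a minimal sketch of the rewriting step in plain Perl 5, assuming the little language has already been captured as a string.  The <code>rewrite_query</code> helper and its marker regex are my own inventions for illustration, not part of any real module, and the sketch assumes the caller has already reduced <code>$user</code> to its primary key:</p>

<pre><code>use strict;
use warnings;

# Hypothetical helper: turn each "{ Type $name }" marker into a "?"
# placeholder and collect the matching values in order.  The type name
# is matched but ignored at this stage.
sub rewrite_query {
    my ($template, %vars) = @_;
    my @binds;

    my $bind = sub {
        my ($name) = @_;
        die "No value for \$$name\n" unless exists $vars{$name};
        push @binds, $vars{$name};
        return '?';
    };

    (my $sql = $template) =~ s/\{\s*\w+\s+\$(\w+)\s*\}/$bind->($1)/ge;

    return ($sql, @binds);
}

my ($sql, @binds) = rewrite_query(
    'UPDATE users SET address = { Address $address } WHERE user = { User $user }',
    address => '4616 NW Washington Place',
    user    => 42,    # assumed to be the primary key already
);

print "$sql\n";    # UPDATE users SET address = ? WHERE user = ?</code></pre>

<p>The values never touch the query string; they travel separately in <code>@binds</code>, ready for whatever executes the prepared statement.</p>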

<p>That's a nice interface, but you can do better.  As I suggested, you can add error checking if you know the structure of the database:</p>

<ul>

<li><em>Get the metadata which describes the <code>users</code> table.</em></li>

<li><em>Verify that the required fields (<code>address</code> and <code>user</code>) exist.</em></li>

<li>Get the value of the <code>$address</code> variable.</li>

<li>Get the primary key of the <code>$user</code> variable.</li>

<li>Prepare a database query with a rewritten query string which uses placeholders for the <code>$address</code> and <code>$user</code> variables to avoid SQL injection and other interpolation errors.</li>

<li>Execute the query.</li>

</ul>

<p>You can take advantage of type checking too:</p>

<ul>

<li>Get the metadata which describes the <code>users</code> table.</li>

<li>Verify that the required fields (<code>address</code> and <code>user</code>) exist.</li>

<li><em>Verify that the type of <code>$address</code> is compatible with the type of the <code>address</code> field.  Repeat for <code>$user</code> and <code>user</code>.</em></li>

<li>Get the value of the <code>$address</code> variable.</li>

<li>Get the primary key of the <code>$user</code> variable.</li>

<li>Prepare a database query with a rewritten query string which uses placeholders for the <code>$address</code> and <code>$user</code> variables to avoid SQL injection and other interpolation errors.</li>

<li>Execute the query.</li>

</ul>
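
<p>The two verification steps might look something like this at runtime.  This is only a sketch: the <code>%schema</code> hash stands in for real database metadata, and mapping each column to a Perl class name is an assumption I made to keep the example small:</p>

<pre><code>use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical schema metadata: column name => required Perl class.
my %schema = (
    users => { address => 'Address', user => 'User' },
);

# Returns a list of problems; an empty list means every value passed.
sub check_values {
    my ($table, %values) = @_;
    my $columns = $schema{$table}
        or return "No metadata for table '$table'";

    my @errors;
    for my $column (sort keys %values) {
        my $type = $columns->{$column};
        if (not defined $type) {
            push @errors, "No column '$column' in '$table'";
            next;
        }
        my $value = $values{$column};
        push @errors, "Value for '$column' is not a $type"
            unless blessed($value) and $value->isa($type);
    }
    return @errors;
}

my $address = bless { text => '4616 NW Washington Place' }, 'Address';
my $user    = bless { id => 42 }, 'User';

print "ok\n" unless check_values('users', address => $address, user => $user);
print "$_\n" for check_values('users', address => $user, phone => '555-0100');</code></pre>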

<p>If you know the structure of the database when the program starts, you can start to push some of this type checking to the point of compilation.  (You may not be able to perform <em>all</em> of the type checking at compilation time, but you can do as much as possible as early as possible to prevent as many errors as possible.)</p>

<p>That's simple and easy.  Now imagine something more interesting:</p>

<pre><code>SQL {{
    SELECT name, address FROM users, addresses GIVEN { User $user }
}}</code></pre>

<p>It's obvious from the syntax of the query language that the database needs to perform a join operation, and it's obvious that the primary key of the <code>$user</code> object is the important key of the operation.  If the program knows the relationship of the <code>users</code> and <code>addresses</code> tables, it can join them effectively as well.</p>

<p>Don't get caught up in the syntax or the semantics of the remaining examples here; they exist to demonstrate possibilities, not the final form of battle-tested code.  Even so, imagine a dynamic query:</p>

<pre><code>SQL {{
    SELECT @fields FROM { Table $table_one }, { Table $table_two }
}}</code></pre>

<p>Again the structure and intent of the code are obvious.  The operations are now:</p>

<ul>

<li>Find the primary keys for <code>$table_one</code> and <code>$table_two</code>.</li>

<li>Verify that they're joinable.</li>

<li>Verify that all members of <code>@fields</code> are present in either <code>$table_one</code> or <code>$table_two</code>.</li>

<li>Construct the query.</li>

</ul>
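
<p>Those four steps can be sketched in Perl 5 as well.  Everything named here (<code>%tables</code>, its <code>references</code> entries, and <code>build_join</code>) is hypothetical, and I've assumed that "joinable" means one table has a foreign key which points at the other:</p>

<pre><code>use strict;
use warnings;

# Hypothetical metadata; a real program would read this from the database.
my %tables = (
    users     => { key => 'id', columns => [qw( id name address_id )],
                   references => { address_id => 'addresses' } },
    addresses => { key => 'id', columns => [qw( id street city )] },
);

sub build_join {
    my ($fields, $one, $two) = @_;
    exists $tables{$_} or die "Unknown table $_\n" for $one, $two;

    # Joinable means one table has a foreign key that points at the other.
    my ($from, $to, $fk);
    for my $pair ([$one, $two], [$two, $one]) {
        my ($left, $right) = @$pair;
        my $refs = $tables{$left}{references} || {};
        ($fk) = grep { $refs->{$_} eq $right } sort keys %$refs;
        next unless defined $fk;
        ($from, $to) = ($left, $right);
        last;
    }
    die "Cannot join $one and $two\n" unless defined $fk;

    # Every requested field must exist in one of the two tables.
    my %known   = map { $_ => 1 } map { @{ $tables{$_}{columns} } } $one, $two;
    my @unknown = grep { not $known{$_} } @$fields;
    die "Unknown fields: @unknown\n" if @unknown;

    return sprintf 'SELECT %s FROM %s JOIN %s ON %s.%s = %s.%s',
        join(', ', @$fields), $from, $to, $from, $fk, $to, $tables{$to}{key};
}

print build_join([qw( name city )], 'users', 'addresses'), "\n";
# SELECT name, city FROM users JOIN addresses ON users.address_id = addresses.id</code></pre>

<p>Every identifier in the generated SQL comes from the metadata, never from user input, which is what makes the string construction defensible here.</p>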

<p>If I were to implement this, I'd make a <code>join_tables</code> multimethod.  It takes two arguments (generalizable to more, but follow along with two for now).  Imagine that it looks something like this:</p>

<pre><code>multi join_tables( Table $t1, Table $t2 ) { ... }

multi join_tables( Any, Any ) { fail() }</code></pre>

<p>Given two <code>Table</code> arguments, the first multi candidate matches and gets called.  Given any other combination of arguments, the second candidate matches and produces an error.</p>

<p>Knowing that you have two <code>Table</code> objects isn't enough, however.  The tables might have no relationship to each other.  Imagine if you somehow <em>could</em> verify that the tables have an appropriate relationship.  If I were to implement this, I might check that the keys of the tables matched types, perhaps with a syntax something like:</p>

<pre><code>multi join_tables ( Table $t1, Table $t2 where { $t1.primary_key eqv $t2.foreign_key( $t1 ) } ) { ... }</code></pre>

<p>That is, the keys must be of equivalent types.  If one key is a
<code>user_id</code> and the other is an <code>Integer</code>, the where clause
won't match for this candidate, so a different multi will get called.</p>
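
<p>Perl 5 has no built-in multiple dispatch, but you can approximate the idea by hand to see how the pieces fit.  In this sketch the <code>Table</code> class, its <code>primary_key_type</code> and <code>foreign_key_type</code> methods, and the <code>@candidates</code> list are all assumptions of mine; the guard plays the role of the signature plus the <code>where</code> clause:</p>

<pre><code>use strict;
use warnings;
use Scalar::Util 'blessed';

{
    package Table;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub primary_key_type { $_[0]{primary_key_type} }
    sub foreign_key_type {
        my ($self, $other) = @_;
        return $self->{foreign_keys}{ $other->{name} };
    }
}

# Candidates are tried in order; the first whose guard accepts the
# arguments runs.
my @candidates = (
    {
        guard => sub {
            my ($t1, $t2) = @_;
            return unless 2 == grep { blessed($_) and $_->isa('Table') } $t1, $t2;
            my $fk = $t2->foreign_key_type($t1);
            return 0 unless defined $fk;
            return $fk eq $t1->primary_key_type;
        },
        body => sub { "join $_[0]{name} to $_[1]{name}" },
    },
    { guard => sub { 1 }, body => sub { die "Cannot join these arguments\n" } },
);

sub join_tables {
    for my $candidate (@candidates) {
        return $candidate->{body}->(@_) if $candidate->{guard}->(@_);
    }
}

my $users     = Table->new( name => 'users', primary_key_type => 'user_id' );
my $addresses = Table->new(
    name             => 'addresses',
    primary_key_type => 'address_id',
    foreign_keys     => { users => 'user_id' },
);

print join_tables($users, $addresses), "\n";    # join users to addresses
print "rejected\n" unless eval { join_tables($users, 'addresses'); 1 };</code></pre>

<p>A real multi dispatcher picks the narrowest matching candidate for you; this one depends on the order of the list, which is exactly the kind of bookkeeping a language should handle.</p>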

<p>Now imagine that for those embedded SQL minilanguage statements where the table names are available at compilation time and sufficient type information exists to verify the statements themselves at compilation time:</p>

<pre><code>SQL {{
    SELECT name, address FROM { User users }, { Address addresses }
}}</code></pre>

<p>... then everyone who uses this minilanguage (and has set up the table information appropriately) gets safety and correctness by default.  Some of that can even occur <em>before the program runs</em>.  The rest of it can occur as the program runs.</p>

<p>(A really, really good type checker and optimization system could infer that some errors are impossible even if it can't prove the use of a single type in every case.)</p>

<p>Now imagine that you have a language which allows you to build minilanguages like this, to build APIs which specify correct operations and fall back to good error reporting on incorrect operations, and which do so without interfering with other code and other extensions.</p>

<p>Welcome to Perl 6.</p>
        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2010/07/20/eliminating-errors-with-little-languages/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>String-Plus</title>
		<link>http://www.modernperlbooks.com/mt/2010/07/string-plus.html</link>
		<comments>http://www.modernperlbooks.com/mt/2010/07/string-plus.html#comments</comments>
		<pubDate>Fri, 16 Jul 2010 17:59:43 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[apis]]></category>
		<category><![CDATA[languagedesign]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[perl5]]></category>
		<category><![CDATA[perl6]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[What does this variable represent? my $thingie = It's obviously an address, but what does Perl know about it? Perl knows it's a string. Perl knows it's some 60 characters long. Perl may even know that it's a valid string...]]></description>
			<content:encoded><![CDATA[
        <p>What does this variable represent?</p>

<pre><code>my $thingie = &lt;&lt;'END';
Thaddeus Droit
4616 NW Washington Place
Beaverton, OR 97006
END</code></pre>

<p>It's <em>obviously</em> an address, but what does Perl know about it?  Perl knows it's a string.  Perl knows it's some 60 characters long.  Perl may even know that it's a valid string of Latin-1 characters.</p>

<p>Perl doesn't know where the string came from, nor that it contains a street
address, a legal name, or a zip code (and not a zip + 4).  Any meaning to the
program beyond "It's a string of some 60 characters and is valid in the Latin-1
encoding" is far beyond what Perl knows about it.  That's why the name of the
variable is <code>$thingie</code>; even though Perl doesn't care about variable
names, calling it <code>$address</code> instead could have led you to believe
there's more structural meaning to this chunk of memory than actually
exists.</p>

<p>Names are important, at least to people maintaining source code.  This code is obviously wrong:</p>

<pre><code>$user-&gt;set_address( $birthday );</code></pre>

<p>... but to Perl it might as well be:</p>

<pre><code>$foo-&gt;bar( $baz );</code></pre>

<p>... for all of the semantic meaning it understands.  There's no obvious
intent.</p>

<p>I know you're smart and you're way ahead of me and you think "If I wanted a
good static type system, I know where to find Haskell or OCaml and I'd never
let code that bad get out of code review and why aren't you writing tests." but
that's not the point.  You can be super careful or <a
href="http://www.modernperlbooks.com/mt/2010/07/strings-and-security-and-designing-away-bugs.html">make
APIs which restrict the most natural way to write code in the host language in
favor of extra security</a>.  That may be the right approach.  (You have to be
careful, though: the ease of interpolating untrusted user input into a raw
string or the use of <code>register_globals</code> in PHP seems analogous to the <a
href="http://definitions.uslegal.com/a/attractive-nuisance/">attractive
nuisance doctrine</a>, where people who don't know any better can't analyze the
risk appropriately.)</p>

<p>There may be another way.</p>

<p>Suppose I annotated the address:</p>

<pre><code>my Address $thingie = &lt;&lt;'END';
Thaddeus Droit
4616 NW Washington Place
Beaverton, OR 97006
END</code></pre>

<p>It's still a chunk of memory with certain characteristics, but now it has an
extra piece of metadata related to the program itself (and not merely Perl
itself).  A clever compiler could detect certain places where the semantics of
an operation don't match:</p>

<pre><code>method set_address(Address $addy) { ... }</code></pre>

<p>... though you do have to be able to resolve this kind of dispatch at
compilation time to prove the type safety of the entire program at compilation
time.  (I've seen suggestions that even Smalltalk programs can resolve some
85-90% of dispatch targets in a static fashion.)</p>

<p>You don't have to go that far; runtime verification with a good test suite
is effective, can be fairly cheap, and is available right now in Perl 5 with <a
href="http://moose.perl.org/">Moose</a>.</p>
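
<p>Moose lets you declare that check with an <code>isa</code> constraint on an attribute.  To show what happens underneath, here's a hand-rolled sketch which uses only core modules; the <code>Address</code> and <code>User</code> classes are stand-ins I made up for the example:</p>

<pre><code>use strict;
use warnings;
use Carp ();
use Scalar::Util ();

{
    package Address;
    sub new { my ($class, $text) = @_; return bless { text => $text }, $class }
}

{
    package User;
    sub new { return bless {}, shift }

    # Reject anything that is not an Address object, at run time.
    sub set_address {
        my ($self, $addy) = @_;
        Carp::croak('set_address() needs an Address')
            unless Scalar::Util::blessed($addy) and $addy->isa('Address');
        $self->{address} = $addy;
        return $self;
    }
}

my $user     = User->new;
my $birthday = '1970-01-01';

$user->set_address( Address->new('4616 NW Washington Place') );
print "caught: $@" unless eval { $user->set_address($birthday); 1 };</code></pre>

<p>The mistake still gets through compilation, but it fails loudly at the call site and not three layers later in the database.</p>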

<p>There's still another way.  Consider again the untrusted input example.  If
you enable tainting, you might read user input into an address:</p>

<pre><code>my Address $untrusted_addy = $req-&gt;get( 'address' );</code></pre>

<p>You don't see it in the declaration, but the "This is tainted!" metadata is present in <code>$untrusted_addy</code>.  How do you deal with that?</p>

<p>You could be picky about always untainting untrusted data, but can you do
that accurately and effectively?  Can you rely on everyone always getting it
right?</p>

<p>What if you could write:</p>

<pre><code>SQL {{
    UPDATE users SET address = { Address $address } WHERE user = { User $user }
}}</code></pre>

<p>... and Perl could verify that <code>$address</code> is an appropriate
Address (and <code>$user</code> is an appropriate User), could quote and escape and validate both of
them effectively, could extract the primary key from <code>$user</code>, and
could untaint any tainted <code>$address</code> or <code>$user</code>?</p>

<p>If your language supports multiple dispatch, lets you define your own types,
lets you override stringification, and can override interpolation for cases
like these, you can do such things.</p>

<p>In other words, you could turn what would otherwise be a raw string into an
embedded little language with its own syntax and semantics, interoperate with
native data structures in the host language, and provide composable
safety&mdash;and users don't have to know much of anything about how this
works, as it pretty much does what they expect.</p>

<p>I can imagine a language like that.</p>

        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2010/07/16/string-plus/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Strings and Security and Designing Away Bugs</title>
		<link>http://www.modernperlbooks.com/mt/2010/07/strings-and-security-and-designing-away-bugs.html</link>
		<comments>http://www.modernperlbooks.com/mt/2010/07/strings-and-security-and-designing-away-bugs.html#comments</comments>
		<pubDate>Wed, 14 Jul 2010 16:23:45 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[apis]]></category>
		<category><![CDATA[languagedesign]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[perlprogramming]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Some people believe that security problems and other severe bugs are inevitable. Some of these people believe that conscientious design and clear thinking about how languages and APIs work is irrelevant; bad code is possible in every language. Bad code...]]></description>
			<content:encoded><![CDATA[
        <p>Some people believe that security problems and other severe bugs are inevitable.  Some of these people believe that conscientious design and clear thinking about how languages and APIs work is irrelevant; bad code is possible in every language.</p>

<p>Bad code <em>is</em> possible in any language and wrong code is possible with any API.  Even so, it's possible to create languages and APIs which make the right thing so much easier than the wrong thing that only the most incompetent (or dangerously malicious) write bad code.</p>

<p>Imagine, for example, a database access layer which forbids the use of raw strings to create SQL queries.  You might have to write:</p>

<pre><code>my $sth = $dbh->select( @tables )->join( %relations )->where( %conditions );</code></pre>

<p>That's not necessarily a <em>beautiful</em> interface dashed off after a moment of thinking, but it has an important security property: it avoids the interpolation of untrusted user input.  All data sent to the database may go through a quoting or untainting process without the user having to remember to do so.</p>

<p>A similar library could help prevent malicious user input from interfering
with the display or operation of a web site, for example.  These are both
specific cases of a general principle: <a
href="http://www.modernperlbooks.com/mt/2010/07/dont-parse-that-string.html">replace
unstructured string data with structured data</a>.  In both cases, the
structure of the data makes the intent of the data clear, which allows the
library to ensure as much safety as possible.</p>
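
<p>To make that concrete, here's a tiny sketch of such a layer.  The <code>Query</code> class is hypothetical and covers only <code>select</code> and <code>where</code> (no <code>join</code>), but it shows the important property: values only ever become placeholders, and names must look like identifiers:</p>

<pre><code>use strict;
use warnings;

{
    package Query;

    # Table and column names must be plain identifiers.
    sub _identifier {
        my $name = shift;
        die "Bad identifier '$name'\n" unless $name =~ /\A\w+\z/;
        return $name;
    }

    sub select {
        my ($class, @tables) = @_;
        my @names = map { _identifier($_) } @tables;
        return bless { tables => \@names, where => [], binds => [] }, $class;
    }

    # Values never reach the SQL text; they are collected as bind values.
    sub where {
        my ($self, %conditions) = @_;
        for my $column (sort keys %conditions) {
            push @{ $self->{where} }, _identifier($column) . ' = ?';
            push @{ $self->{binds} }, $conditions{$column};
        }
        return $self;
    }

    sub sql {
        my $self = shift;
        my $sql  = 'SELECT * FROM ' . join ', ', @{ $self->{tables} };
        $sql .= ' WHERE ' . join ' AND ', @{ $self->{where} }
            if @{ $self->{where} };
        return ($sql, @{ $self->{binds} });
    }
}

my ($sql, @binds)
    = Query->select('users')->where( name => "Robert'); DROP TABLE users;--" )->sql;

print "$sql\n";    # SELECT * FROM users WHERE name = ?</code></pre>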

<p>This principle has other implications as well; more on that next time.</p>

        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2010/07/14/strings-and-security-and-designing-away-bugs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Don&#8217;t Parse That String!</title>
		<link>http://www.modernperlbooks.com/mt/2010/07/dont-parse-that-string.html</link>
		<comments>http://www.modernperlbooks.com/mt/2010/07/dont-parse-that-string.html#comments</comments>
		<pubDate>Wed, 07 Jul 2010 19:50:39 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[apis]]></category>
		<category><![CDATA[languagedesign]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[perl5]]></category>
		<category><![CDATA[perl6]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Defensive programmers anticipate what might go wrong. Robust code handles the unexpected, partly by minimizing the surface area of potential problems. The fewer things that can go wrong, the fewer things that will go wrong. (Things will still go wrong,...]]></description>
			<content:encoded><![CDATA[
        <p>Defensive programmers anticipate what might go wrong.  Robust code handles the unexpected, partly by minimizing the surface area of potential problems.  The fewer things that can go wrong, the fewer things that will go wrong.  (Things will still go wrong, but you can write safer code if you're clever.)</p>

<p>Yuval Kogman asked <a href="http://blog.woobling.org/2010/07/are-we-ready-to-ditch-string-errors.html">Are we ready to ditch string errors?</a>  I am; there's a general principle of API design beyond his question.</p>

<p>One problem with <code>die "Some error!"</code> is how to identify what error that represents&mdash;not to a programmer or user, who ostensibly speaks enough English and problem domain jargon to have some idea of what the error means&mdash;but to the rest of the program.  How does your code catch this error and distinguish it from some other type of error?  Can you determine which of the two you can handle and which you must delegate?</p>

<p>Break out <code>split</code> or the regular expression engine and prepare to write heuristics which guess, and woe to you if someone someday internationalizes your error messages or runs all of your exceptions through a logging mechanism which changes their formatting slightly or....</p>
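
<p>Compare that to throwing an object.  Perl 5's <code>die</code> accepts a reference, so a minimal sketch needs only core features; the <code>My::Error</code> hierarchy and <code>fetch_record</code> are names I invented for the example:</p>

<pre><code>use strict;
use warnings;
use Scalar::Util 'blessed';

{
    package My::Error;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub throw   { my $class = shift; die $class->new(@_) }
    sub message { $_[0]{message} }

    package My::Error::NotFound;
    our @ISA = ('My::Error');

    package My::Error::Permission;
    our @ISA = ('My::Error');
}

sub fetch_record {
    my $id = shift;
    My::Error::NotFound->throw( message => "No record $id", id => $id );
}

my $result = eval { fetch_record(7) };
if (my $err = $@) {
    if (blessed($err) and $err->isa('My::Error::NotFound')) {
        print "recoverable: ", $err->message, "\n";
    }
    else {
        die $err;    # not ours to handle; delegate upward
    }
}</code></pre>

<p>The handler asks about the class of the error and never looks at the wording of the message, so translating or reformatting the message can't break it.</p>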

<p>The problem is that you can't take advantage of the structure of the exception data because it's not present in the string.  The same goes for <a href="http://search.cpan.org/perldoc?DBI">DBI</a>'s connection strings:</p>

<pre><code>my $dbh = DBI-&gt;connect( 'dbi:DriverName:database=database_name;host=hostname;port=port' );</code></pre>

<p>As the documentation suggests in the very next sentence:</p>

<blockquote>There is <em>no standard</em> for the text following the driver name. Each driver is free to use whatever syntax it wants.</blockquote>

<p>Compare this to a keyword argument form:</p>

<pre><code>my $dbh = DBI-&gt;connect(
    driver   =&gt; 'DriverName',
    database =&gt; 'database_name',
    host     =&gt; 'hostname',
    port     =&gt; 'port',
    extra    =&gt; 'arguments',
);</code></pre>

<p>This has several advantages.  The method doesn't have to guess (or
<em>parse</em>) the string.  The layout and vertical alignment make the
keyword form easier to read and to modify.  DBDs can decorate and augment this
argument list without parsing and recreating a string.  Verification and
default arguments are much easier.</p>
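
<p>DBI doesn't offer that keyword form, so treat the following as a sketch of a wrapper you might write yourself.  The <code>connection_string</code> function, its list of allowed keys, and the default host are all assumptions, and the string it builds merely mirrors the example above; each driver decides what it really accepts:</p>

<pre><code>use strict;
use warnings;
use Carp 'croak';

my %defaults = ( host => 'localhost' );
my %allowed  = map { $_ => 1 } qw( driver database host port );

# Hypothetical wrapper: validate keyword arguments, apply defaults, and
# only then build the driver-specific string, in one place.
sub connection_string {
    my %args = ( %defaults, @_ );

    my @unknown = grep { not $allowed{$_} } sort keys %args;
    croak "Unknown arguments: @unknown" if @unknown;

    for my $required (qw( driver database )) {
        croak "Missing '$required'" unless defined $args{$required};
    }
    croak "Port must be a number"
        if defined $args{port} and $args{port} !~ /\A\d+\z/;

    my $driver = delete $args{driver};
    return "dbi:$driver:" . join ';', map { $_ . '=' . $args{$_} } sort keys %args;
}

print connection_string(
    driver   => 'DriverName',
    database => 'database_name',
    port     => 5432,
), "\n";
# dbi:DriverName:database=database_name;host=localhost;port=5432</code></pre>

<p>A misspelled key such as <code>prot</code> fails immediately with a useful message, which a hand-assembled string can't promise.</p>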

<p>The same argument goes for using a module such as <a href="http://search.cpan.org/perldoc?File::stat">File::stat</a> instead of parsing the output of <code>`ls -l filename`</code>.</p>
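
<p>For instance, File::stat replaces the builtin <code>stat</code> with a version which returns an object, so you ask for fields by name.  This sketch creates a temporary file with the core File::Temp module only so that it has something to inspect:</p>

<pre><code>use strict;
use warnings;
use File::stat;
use File::Temp 'tempfile';

# Create a small scratch file; it is removed when the program exits.
my ($fh, $file) = tempfile( UNLINK => 1 );
print {$fh} "hello\n";
close $fh or die "Cannot close $file: $!\n";

# stat() now returns an object with named accessors.
my $st = stat($file) or die "Cannot stat $file: $!\n";

printf "%d bytes, modified %s\n", $st->size, scalar localtime($st->mtime);</code></pre>

<p>There are no columns to count and no date formats to guess; the structure is already there.</p>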

<p>The same argument goes for... you get the point.  It's far too easy to
unfold the regex widget from the swiss-army chainsaw when a little bit of
caution decomposing data into structured data makes your programs safer, easier
to use, more flexible, and more robust.</p>

<p>(I consider sometimes how a language would look if it had only keyword arguments and how you could optimize them with immutable, internable strings and cached call sites and a zero-copy register allocation mechanism, but I made it as far as writing a self-hosting garbage collector before I had real work to do.)</p>




        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2010/07/07/dont-parse-that-string/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Luck and the Class Struct API</title>
		<link>http://www.modernperlbooks.com/mt/2010/07/luck-and-the-class-struct-api.html</link>
		<comments>http://www.modernperlbooks.com/mt/2010/07/luck-and-the-class-struct-api.html#comments</comments>
		<pubDate>Fri, 02 Jul 2010 20:01:01 +0000</pubDate>
		<dc:creator>chromatic</dc:creator>
				<category><![CDATA[apis]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[modernperl]]></category>
		<category><![CDATA[oo]]></category>
		<category><![CDATA[oop]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[perl5]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Class::Struct has been a core module for ages. (Previously it was Class::Template, but a great renaming occurred 13 years ago.) If you've never seen it before, it might remind you a little bit of Moose: package Cat; use Class::Struct; struct(...]]></description>
			<content:encoded><![CDATA[
        <p><a href="http://search.cpan.org/perldoc?Class::Struct">Class::Struct</a> has been a core module for ages.  (Previously it was <a href="http://search.cpan.org/perldoc?Class::Template">Class::Template</a>, but a great renaming occurred 13 years ago.)  If you've never seen it before, it might remind you a little bit of <a href="http://moose.perl.org/">Moose</a>:</p>

<pre><code>package Cat;

use Class::Struct;

struct( name =&gt; '$', age =&gt; '$', diet =&gt; '$' );</code></pre>

<p>You don't get all of the benefits of Moose, but you do get attributes and accessors.  You also get a default constructor.</p>

<p>Of course, the default constructor reads something like:</p>

<pre><code>{
    package Cat;
    use Carp;

    sub new
    {
        my ($class, %init) = @_;
        <strong>$class = __PACKAGE__ unless @_;</strong>
        ...
    }

    ...
}</code></pre>

<p>If that emboldened line is curious to you, it's curious to me too.  I saw a note in one of the test files somewhere suggesting that the purpose of this was to allow you to write:</p>

<pre><code>package Cat;

my $cat = new();</code></pre>

<p>I don't know why you'd do that, however.  In what kind of object design does
it make sense to create objects of a class from within that class?  (That seems
like a violation of responsibilities to me.)  You can also write:</p>

<pre><code>package NotCat;

my $cat = Cat::new();</code></pre>

<p>... though that's exceedingly fragile.  For one thing, it implies that you could also write <code>RobotCat::new()</code>&mdash;assuming that <code>RobotCat</code> extends <code>Cat</code>, but avoiding method dispatch for calling a constructor means that <code>RobotCat</code> had better provide its own <code>new()</code> which behaves as a function as well as a method.  (Even if you somehow convinced the subclass to inherit the superclass's function through some sort of exporting scheme, the hardcoded <code>__PACKAGE__</code> would hurt.)</p>

<p>Hardcoding a method dispatch as a function dispatch means that the maintainers of <code>Cat</code> are not free to change <em>how</em> <code>Cat</code> provides its constructor, for much the same reason.</p>

<p>Woe unto you if there's an inherited <code>AUTOLOAD</code> somewhere.</p>
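
<p>You can see the difference between the two kinds of dispatch in a few lines.  This sketch uses hand-written classes and not Class::Struct, so the constructor here is my own simplification:</p>

<pre><code>use strict;
use warnings;

{
    package Cat;
    sub new { my ($class, %init) = @_; return bless {%init}, $class }

    package RobotCat;
    our @ISA = ('Cat');
}

# Method dispatch finds the inherited constructor and passes the right class.
my $robot = RobotCat->new( name => 'Unit 7' );
print ref($robot), "\n";    # RobotCat

# A function call ignores @ISA, so there is no RobotCat::new to find.
print "failed: $@" unless eval { RobotCat::new(); 1 };</code></pre>

<p>The method call works without <code>RobotCat</code> doing anything special.  The function call works only if every subclass cooperates.</p>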

<p>I realize that in 1994 or 1995, people who wrote OO code in Perl 5 might
have had familiarity with OO in C++ (where this syntax makes a little more
sense) or, perhaps, Java, where the indirect constructor call (<code>my Cat
$cat = new Cat;</code>) is prevalent, but the benefit of hindsight is that
experienced Perl 5 programmers can look back on this API a decade later and
cringe at its potential for misuse.</p>

<p>If you're lucky, everything will go right&mdash;but what kind of a defensive programmer relies on <em>luck</em> when designing an API?</p>

        
    ]]></content:encoded>
			<wfw:commentRss>http://perlblogs.com/2010/07/02/luck-and-the-class-struct-api/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

