AJS's Software Development Blog

Wednesday, September 12, 2018

Yadda: How should exceptions and abstract methods be combined?

Perl 6 is many things, but I have always felt that its most important role is exposing other languages to modern concepts in programming language design, and doing so in a more real-world environment than many research languages, which tend not to be shaped by the kind of production use that Perl has seen for decades.

One of those areas where modern concepts have influenced Perl 6 is the use of the "yadda" operator. At first, this was only in the form of the ellipsis (three periods, if you are not using a Unicode-enhanced editor, or just prefer ASCII equivalents). This operator is used in situations where an unstated piece of code is suggested, but not implemented.

In the most common use-case, this is because the compiler is expected to intuit the appropriate code by assessing a list of values and providing a generator that will continue the sequence, for example:


1,2,3 ... 100

This sequence is the same as the explicit range (note the double-dot range operator):


1 .. 100

But the compiler can intuit several different kinds of successor function. This article is not about that usage, though; it is instead about the use of the yadda operator in abstract method definitions, as such:


class SomeThing {

    # Sub-class should define this:
    method do-stuff { ... }

}

In this case, we are defining an abstract base-class called "SomeThing" which contains a method called "do-stuff". But the do-stuff method is not defined. The yadda operator tells the compiler that, should a programmer attempt to call do-stuff directly, without going through a subclass that defines it properly, an exception should be packaged and returned (see below).

This is common in many OO languages, but Perl 6 takes this one step further, adding two additional operators which differ only in how exceptions are handled:


class SomeThing {
    method do-stuff { ... } # fail() is called when executed
    method do-other-stuff { ??? } # warning is generated
    method do-new-stuff { !!! } # die() is called when executed
}

The difference is in how exception-handling is managed. In the first case, a call to the abstract method returns a "failure" which is something like an unexploded bomb. It is intended to improve debugging and error-reporting by identifying both the offending method invocation and the use of its returned value. It does this by wrapping an exception object in a semi-safe failure object that is benign unless or until its value is examined, and then a new exception with both contexts is raised, and the compiler can present a detailed history of the offending code-paths to the user.

If you desire a more straightforward approach, and/or are writing an abstract method where any access would be problematic immediately, you can use the !!! form, which immediately raises an exception upon use.

On the other hand, if you are writing a method whose use might be acceptable (e.g. replacing an obsolete API where an old method becomes irrelevant) you can use the ??? form and a warning will be generated, which the caller can suppress if needed.
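For readers coming from other languages, here is a rough Python analogy for the three forms above. It is a sketch only, not how Perl 6 implements them: the Failure and StubCalled classes and the method names are hypothetical, and treating a truth test as the "safe" way to inspect a failure is an assumption of this sketch.

```python
import warnings


class StubCalled(Exception):
    """Raised when a stubbed (unimplemented) method is actually used."""


class Failure:
    """A deferred exception: harmless until its value is used."""

    def __init__(self, exc):
        self.exc = exc          # the original, not-yet-raised exception
        self.handled = False

    def __bool__(self):
        # Testing the result for truth marks the failure as handled.
        self.handled = True
        return False

    def value(self):
        # Using the result "detonates" the failure, chaining the original
        # exception so both contexts show up in the traceback.
        raise RuntimeError("used the result of a failed call") from self.exc


class SomeThing:
    def do_stuff(self):          # like { ... }: hand back a Failure
        return Failure(StubCalled("do_stuff"))

    def do_other_stuff(self):    # like { ??? }: warn and carry on
        warnings.warn("do_other_stuff is a stub")

    def do_new_stuff(self):      # like { !!! }: raise immediately
        raise StubCalled("do_new_stuff")


result = SomeThing().do_stuff()   # nothing is raised yet
if not result:
    print("do_stuff failed quietly:", result.exc)
```

Calling result.value() instead of testing it would raise, with the original StubCalled attached as the cause.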

These lessons are important for other languages to consider, not just in a "that's not the way we do it," sense, but in a serious way. Language design is an ever-changing landscape, and any language that gets too entrenched in its original design is measuring its own mortality. One way to avoid such long-term failure is to evaluate new ideas, absorb what works and reject what does not.

Perl 6 offers many opportunities to perform such evaluations (as I've covered previously) and language designers and maintainers would be remiss in not availing themselves of the opportunity to learn from such ambition.

Thursday, September 15, 2016

The not so killer features of Perl 6

Perl 6 doesn't really lack for killer features. If you ask the typical user what they love about it, they'll cite grammars, metaoperators, slangs and many other game-changing features. But we rarely have to write code that just happens to take advantage of all of the best features of a language. So what is it that makes Perl 6 a pleasure to just... use?

Here are a few of those features that I find myself using casually.

Trivial class construction

Here's a class:

class Server {
    has $.port;
    has @.active;
    has $!socket;

    method listen { ($!socket = IO::SomeService.new(self.port)).listen }
    method connect {
        my $connection = $!socket.connect;
        self.active.push: $connection => start { self.service: $connection };
    }
}

Now, you might look at this and think, "that's all well and good, but it's an example." This is true. There's no code for servicing the asynchronous Promise initiated by the "start" command, nor is there a definition of what this "SomeService" is, which presumably is doing the interesting work.

But here's what you can write immediately, without any other infrastructure:

    my $server = Server.new(:port(2000));

Where's the constructor that knows how to take a "port" parameter? Built for you. Where are the accessor functions, used in the class, "port" and "active"? Built for you.

All you have to do in writing a Perl 6 class is write the code that does what you want. If you have some special needs in a constructor, sure you can get under the hood (boy can you get under the hood!) but you don't have to.

As a small side-feature, notice that I don't declare parameter lists for my methods, above. The invocant is always implied but unnamed, accessible through the meta-variable "self". To be explicit, I probably should define an empty parameter list (which actually has different semantics when someone tries to pass a parameter). But either way, I never have to waste my time writing out "self" on every method definition.

List semi-flattening

Something that I find myself doing all the time in other languages that feels wrong is constructing a list out of single items in order to be able to do list concatenation with them. Here's an example from Python:

    list2 = ['apple'] + list1

In Perl 6, we can flatten just the first layer of an array into a list expression:

    @list2 = 'apple', |@list1;

This also happens to work well with any sort of datastructure that can be flattened to a list, so you don't have to worry about the type of a thing, just what it can do.
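In fairness to Python, versions from 3.5 onward allow something similar with the star-unpacking syntax inside a list display. A small sketch, with made-up list contents:

```python
list1 = ['banana', 'cherry']

# Wrapping the single item in a list just to concatenate:
list2 = ['apple'] + list1

# Unpacking list1 in place, one level deep, much like 'apple', |@list1:
list3 = ['apple', *list1]

# Only the first layer is flattened; nested lists stay nested.
nested = ['apple', *[['banana'], 'cherry']]

# Any iterable can be unpacked, not just lists.
from_range = ['apple', *range(3)]

print(list3)
print(nested)
print(from_range)
```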

Context markers

Code gets much easier to read in Perl 6 because you can mark an expression as having a specific type context. Here's an example:

    sub binary_digit_of($n, $m){ ~$n.base(2).substr($m,1) }

Now, you don't need that "~" there. It imposes a string context, and substr only returns strings. So why use it? Because putting it there clearly tells anyone looking at this statement that it's operating in a string context. Even if it's calling some obscure method that might return an integer on Tuesdays, this expression is going to give a string.

Sure, you could call .Str on the result and get the same effect, but it's not right up there in front, telling you the context at first glance, it's just another method call in the chain. This is optional, of course, but it's nice to have that tool handy when there might be ambiguity or you want to enforce your assumptions.

Note: in this example, we could have just declared a return type for the function. The context imposed by "~" or "+" or "?" works on any expression, but there are definitely alternatives in specific situations.

Named variable passing

One of the features that I find myself using all the time and absolutely loving is the ability to pass a variable to a function as a named parameter, using the variable name as the parameter name. Here's an example:

    sub firstline($file, $enc) { open($file, :r, :$enc).get }

Normally, you would pass a named parameter with the ":param(value)" syntax, but when the "param" name is the same as the name of the variable, you can abbreviate to this shorter ":$variable" syntax. This makes quite a lot of named parameter passing much simpler. Where, in other languages, you often find yourself writing "name=name, size=size, duration=duration..." Perl 6 lets you avoid all of the redundancy.

Tuesday, August 30, 2016

Six reasons to try out Perl 6

I hate list articles, but sometimes they're a necessary evil. We humans are more inclined to enumerated lists of information rather than open-ended discussion, I suspect, because we feel like there is less investment. When we see "five reasons to jog," we assume that we'll get a reason right off the bat that we can evaluate and just stop reading if we don't agree.

But with an essay or other generic prose format, it can be difficult to tell just when the thesis will be delivered, and therefore there is more cognitive load in investigating the content.

To that end, I thought I would provide a simple, easily read (and probably easily dismissed) list of the reasons that Perl 6 is a programming language that you should be exploring.

Before we start, though, let's determine what constitutes a valid reason to learn a new programming language. Certainly necessity is quite common. I learned BASIC in the 1980s because it was the only language I had available to me on my Tandy Color Computer (dubbed "CoCo" by its users). I didn't choose BASIC, it chose me. Today, this is commonly the case when people take a job or attend a course which uses a particular language. Perl 6 is obviously too new a language to impose itself on people in this way, so that's certainly not on the list.

If you are not forced to use a particular language, then it comes down to utility, but utility functions are complicated beasts, especially when talking about ways of expressing ideas. As an excellent example, let's look at Perl's (and many of its predecessor languages') use of a prefix (or suffix) character to indicate a data type. In Perl this has traditionally been a dollar sign for scalar (singular) values, at-sign for arrays and percent for hashes (sometimes called dictionaries or more technically unordered associative lists in other languages). This affords Perl a great deal of flexibility. Variables can be called out in strings (e.g. "I am $name") and there is no such thing as a reserved word for variables (e.g. $while is perfectly valid). It also gives someone reading code the opportunity to quickly scan for variable usage and visually pattern-match the "shape" of the code.

But to non-Perl programmers, this feature often induces frustration. It's harder to read code initially when you're not used to this convention. So, does this make Perl a good or bad language to learn?

For the sake of this article, I'm going to say that it has no impact. I'm only interested in language features which will either improve the quality of code or the capabilities of the programmer. For example, learning any Algol-derived language from C to Perl to Java, etc. will teach a programmer to understand code in any language that mixes procedural function calling with mathematically-inspired infix operators. This is an important skill in the world of modern programming languages, and it will improve the ability of the programmer to learn nearly any modern language. Getting used to variables with prefixes won't really change the way you code or enable you to learn other languages, so it wouldn't be nearly as useful as this.

Okay, so on to the list...

Number 6: Grammars

While many people tout Perl 6 grammars as the primary innovation of the language, I'm going to place it at number six, here, because it's not really the most fundamental shift you will experience in the language. Still, it needs to be on this list. Just about every language has access to a parser generator, but Perl 6 places the parser directly into the language as a first-class feature. In fact, the language uses its own grammar feature to parse itself!

Essentially this is the ability to define a parser much the way you would define a class. Grammars can derive from each other like classes, and they can even have attributes much as a class would (this is because, under the hood, they use the same infrastructure in the language as a class).

One thing that I think this will improve is the reliability of programs that need to access user-provided data. All too often, user data is handled with poorly crafted regular expressions that either fail to express the possible complexity of user input or don't sufficiently constrain it due to the limitations of regexes.

Number 5: Modern OO

In most modern OO languages, features such as a metaobject protocol, automatic initializers and accessors and the like are commonplace. This is certainly the case with Perl 6, but the language goes far beyond this. Trait-like roles that can be generics, monkey-patching objects, indirect method invocation, redispatch, and dozens of other advanced features are combined in Perl 6's object system making it a joy to use in a diverse array of use-cases.

Number 4: Laziness

If you are used to a language like Haskell, you will know the power of lazy evaluation being the default. Unlike languages that have token access to generators and iterators, Perl 6, like Haskell, provides laziness as a primary mode of accessing sequential data. It's almost always the case that laziness is involved in any such transaction, often in a way that is transparent to the user. For example, constructing a range from a variable start point to a variable end point is fairly common in most languages, but you usually have to use an alternate mechanism if either of those endpoints is going to be infinite. In Perl 6, there is no such distinction. Ranges behave the same no matter what their endpoints might be.
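To make the "alternate mechanism" point concrete, here is how it tends to look in Python, where a bounded range and an unbounded one come from different tools. The squares helper is a made-up example.

```python
from itertools import count, islice


def squares(numbers):
    # A generator: each square is computed only when it is asked for.
    for n in numbers:
        yield n * n


finite = range(1, 6)     # bounded range: the usual mechanism
infinite = count(1)      # unbounded: a different tool is needed

print(list(squares(finite)))                # the whole finite sequence
print(list(islice(squares(infinite), 5)))   # first five of an infinite one
```

Both lines print the same five squares, but the caller has to know in advance which kind of sequence it is holding, because calling list() directly on the infinite one would never finish.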

Number 3: Deep Gradual Typing

It's possible to construct a function in Perl 6 which takes a scalar variable and does something with it, regardless of its type. In fact, that's the default. However, if you want to specify a type for a parameter, you can. The type of an input parameter is checked and if it matches, you're good to go.

But that's only the start. The "where" clause allows the programmer to further constrain a parameter in a dynamic way (e.g. requiring that a numeric parameter falls within a specific range) and the "subset" declarator allows the declaration of a type-like name that includes such a constraint. This allows a mix of static and duck typing in a convenient syntax.

This typing goes further, allowing constraints on the contents of sequential and associative data types, class interface assertions (through "roles") and many other advanced type management features.
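For comparison, a language without this feature has to bolt the idea of "a type plus a runtime predicate" on by hand. The following Python sketch imitates the flavor of "where" and "subset"; the subset helper, Percentage and set_volume are all hypothetical names, and the check here is an ordinary function call rather than anything the language enforces.

```python
from typing import Callable


def subset(base: type, where: Callable[[object], bool]):
    """Build a checker for 'a base type plus a runtime predicate'."""
    def check(value):
        if not isinstance(value, base) or not where(value):
            raise TypeError(f"{value!r} fails the constraint")
        return value
    return check


# Hypothetical named constraint, in the spirit of a "subset" declaration.
Percentage = subset(int, lambda n: 0 <= n <= 100)


def set_volume(level):
    level = Percentage(level)   # checked when the function is called
    return f"volume set to {level}"


print(set_volume(40))
```

Passing 140 or "loud" to set_volume raises a TypeError. The appeal of the Perl 6 version is that the same constraint lives in the signature itself.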

Number 2: Slangs

Slangs are variations on Perl 6's parser. They can be defined at runtime in Perl 6 and can be used lexically like a variable definition. A slang allows anything from a trivial feature change (perhaps you want a conditional block that uses the keyword "maybe") to more substantial additions (say, adding a quoting operator for JSON data that validates the data at compile-time) to fundamental replacements of the parser (e.g. to lexically scope a block of Perl 5 code).

Slangs offer the ultimate in language flexibility, and because the underlying runtime is so flexible and full-featured, nearly anything you might want to do is possible. Want to define a Python object with Python's method call semantics? It's just a matter of getting in there and writing the code!

Number 1: Access to Perl 5 and C

Through the "Inline::" modules, Perl 6 has access to Perl 5 and C code. This immediately gives the language access to two of the largest suites of libraries in the world. Essentially, from day 1, Perl 6 has more library support than any language I have ever seen at this stage of development.

While new features are all well and good, one of the crippling problems with many new languages is that they have very few features available to them through library code. This is why the JVM is such a popular platform for new languages, but constraining implementation to one VM is hardly the right solution to such problems.

Honorable mentions

There are too many great features in Perl 6 to mention in one document, but a few that didn't quite make the list are:

Automatic use of rationals for integer division
The most advanced Unicode handling I have ever seen
True concurrency without a GIL

Tuesday, July 5, 2016

Perl 6: What's that you've got there? On stringification.

In Perl 5 and Python, the two languages I'm most familiar with outside of Perl 6, at this point, strings are pretty simple things. When you have an object and you want to print it, for example, here's the Perl 5:


    say $thing;

And here's the Python:


    print thing

Notice that there's no hinting being given to the compiler as to what this "thing" is or how to render it to the output. In Perl 5, this is an implicit feature of all scalar values. They have a string representation which can always be fetched. It might not be useful (and in many cases isn't) but it'll be there.

In Python, print automatically converts its parameters to strings as if you had written:


    print str(thing)

though it can be more complex than that, due to encoding issues. The default stringification for objects is the __repr__ method on the object's class, which is (as in Perl 5) not terribly useful, but good object authors know to override it and/or the stringification method (__str__) with something more useful.
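A short sketch of those two Python hooks, written for Python 3 (where print is a function). The Plain and Point classes are made up for illustration.

```python
class Plain:
    pass


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        # Unambiguous form, aimed at developers.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Friendlier form, used by print() and str().
        return f"({self.x}, {self.y})"


plain = Plain()
print(str(plain) == repr(plain))   # with no overrides, str() falls back to repr()

p = Point(1, 2)
print(p)          # uses __str__
print(repr(p))    # uses __repr__
print([p])        # containers show their items with __repr__
```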

In Perl 6, the picture is both more complicated and more functional out of the box.

There are three kinds of stringification:

Coercing to Str (usually via the ~ unary or binary operator)
The .gist method
The .perl method

The first is a straightforward "what are you as a string?" It has no implication of preserving all aspects of an object. So, for example, when matching a regular expression and getting back a match object, you might want to print the whole match. This is done by coercing the match to a string:


    given "fool" {
        say ~ m/fo+/
    }

Will print "foo". Notice that there's quite a lot of information thrown away by this operation. There's the position in the base string, any sub-matches, etc. all available in the match object, but when stringified, that's all thrown out the window, and the "most salient" string elements are printed.

The second form of stringification is called a "gist" and it's accessed by using the .gist method. Here's an example:


    given "fool" {
        say m/f(o+)/.gist
    }

prints:


｢foo｣
 0 => ｢oo｣

In a gist, the stringified version of an object tries to strike a balance between preserving internal information necessary for debugging or other analysis and readable text. The general rule appears to be:

Throw away internal state that's not relevant to the textual understanding of the object (e.g. the "from" and "to" attributes of a regular expression match)
Use the corner brackets (｢｣) from Asian scripts to enclose internal string value(s)
Recursively descend data structures, calling each of their gist methods

Finally there's the third option, which is exactly like Perl 5's Data::Dumper and a bit like Python's repr except that it aggressively attempts to determine how to represent code as executable Perl. Unless there's ephemeral state involved, evaluating the output of the .perl method on an object should yield a copy of the object.
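The round-trip idea can be imitated in Python when a class writes its repr as valid constructor code. This is a sketch of the general technique, not of how .perl works internally; the Span class is hypothetical.

```python
class Span:
    def __init__(self, text, start, end):
        self.text, self.start, self.end = text, start, end

    def __repr__(self):
        # Emit a string that is itself valid Python for rebuilding the object.
        return f"Span({self.text!r}, {self.start!r}, {self.end!r})"

    def __eq__(self, other):
        return isinstance(other, Span) and vars(self) == vars(other)


original = Span("foo", 0, 3)
dumped = repr(original)
copy = eval(dumped)      # only reasonable on trusted input

print(dumped)
print(copy == original, copy is original)
```

The copy compares equal to the original but is a distinct object, which is the same promise the post describes for evaluating .perl output.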

Languages in general are moving towards this sort of "structured intuition" about the representation of objects, rather than some more on-demand way of representing data. It allows authors to carefully control how their objects are represented, but also gives simple objects all of the tools that they need out of the box.

Thursday, August 27, 2015

Using my stardock image on Amazon AWS free tier

Amazon's AWS service is great. I love the fact that it provides a free server to develop on and if I ever need more, I can pay for it.

Just one catch: many of the things I enjoy working with are small enough to fit in RAM to run, but not to build. Perl 6 is a great example. See my previous posting on building a "stardock" container for Perl 6. Anyway, that won't build on Amazon's AWS free tier, so what do I do? Well, I can spin up a local VM under a Windows box to do the build, so here's what I do:

docker build --tag=stardock .
docker save -o stardock.img stardock
scp stardock.img <awshost>:.

And then on the target system:

docker load -i stardock.img
docker run -it stardock perl6 -v

Now you have a build of Perl 6 Rakudo Star from the most recent HEAD, but without having to have enough RAM in your free dev environment on Amazon!

Thursday, July 9, 2015

Building bleeding edge Perl 6 as a Docker container

I got tired of re-building Perl 6 on every system I touch, but the public Rakudo Star build of Perl 6 that's available as a Docker image is pretty badly out of date. If you want to get and build the latest and greatest, here's how to do it in a container.

First, create your "Dockerfile" with this as its body:

FROM ubuntu:15.04

RUN apt-get update && apt-get -y upgrade && apt-get clean
RUN apt-get -y install perl && apt-get clean
RUN apt-get -y install git-core && apt-get clean
RUN apt-get -y install build-essential && apt-get clean

RUN git clone https://github.com/rakudo/star.git
RUN cd star && perl -i.bak -pE \
  's/git\@github.com\:/git\:\/\/github.com\//g' .gitmodules
RUN cd star && git submodule sync && git submodule init && git submodule update
RUN cd star && git clone https://github.com/rakudo/rakudo.git
RUN cd star && perl Configure.pl \
  --force --prefix=/usr --backends=moar --gen-moar --gen-nqp
RUN cd star && make && make install

CMD [ "/bin/bash" ]

Now build your image by running docker build:

$ docker build --tag=stardock .

I call my image stardock. You can call yours whatever you like. Note that the build is too large to fit in some minimal environments. It may fail, for example, on a free tier AWS EC2 instance.

Next, you want to test your build:

$ docker run -it stardock
root@5bf4853308cf:/# perl6 -v
This is perl6 version 2015.06-225-g3bdd0af built on MoarVM version 2015.06-88-g647df11
root@5bf4853308cf:/# perl6 -e 'say "Hello, world!"'
Hello, world!

And there you go, a working Perl 6 built from the latest revision of Rakudo, per the Rakudo Star release packaging.

Because the point of this is to be up-to-date, you might want to add this line before the "RUN git clone" of Rakudo:

ADD update.txt update.txt

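and just "touch update.txt" in your Dockerfile directory before you do the build to force it to pick up the latest Rakudo and rebuild from scratch. If your version of Docker ignores file modification times when checking its build cache, change the file's contents instead (for example, "date > update.txt").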

Tuesday, January 21, 2014

Sand: Rules

Sand is a programming language that I introduced in a previous article. Sand rules are not completely fleshed out, but here are the primary design goals:

To be as similar to Perl 6 rules as possible
To reduce complexity where it does not yield substantial benefit and where reduction in complexity does not come at tremendous cost to compatibility
Speed of parsing and execution

Specifically, these are the top-level elements:

A regex is a sequence of atomic assertions about patterns very similar to Perl 5 regular expressions.
An assertion is embedded code within a regex which returns true or false and matches on true.
A subexpression is a reference from one named regex to another.
A token is a regex which matches in a single pass (no backtracking) or fails.
A rule is a regex which defaults to considering whitespace significant.
Tokens, rules and bare regexes are all optionally named, allowing subexpressions to use those names for reference.
All of the above may be referred to as elements of a grammar and are technically methods. They must either be declared as part of a class that implements (does) the role, "grammar" or must be implicitly associated with the global "_regex" grammar (e.g. this code defines a method on _regex implicitly and invokes it: if "foo" ~~ regex{f} {...})

Here are some examples:

class Sand :does(grammar) {
    method same_category($letter, $category by reference) {
        my $letter_cat = $letter.unicode_block_category();
        if $category {
            assert($letter_cat == $category);
        } else {
            $category = $letter_cat;
        }
    }
    method is_token_unicode() -> (!bool) {
        my $number = undef;
        my $other = undef;
        for self.match(0).str().list() -> ($letter) {
            if $letter.isdigit() {
                self.same_category($letter, $number);
            } else {
                self.same_category($letter, $other);
            }
        }
        return True;
    }
    token identifier {
        [ <alpha> | '_' ] <alphanum>* { .is_token_unicode() }
    }
    rule scalar { '$'<identifier> }
}

This code defines the "identifier" and "scalar" regexes that are part of Sand's grammar. Notice that a grammar class can define methods like any other class and those methods can be invoked from within the body of a regex.