Alan Knight's blog: 2011

Wednesday, December 7, 2011

Bugs are like prime numbers...

"Bugs are like prime numbers. You can never really find the last one. But after the first billion or so, they start to thin out a little bit..." - Brian Foote

Tuesday, December 6, 2011

STIC submission deadlines coming soon

The Smalltalk Industry Conference 2012 is coming up March 19-21, and submission deadlines are coming up sooner.

This year there are two parts to the conference, so there could be some confusion.

- Smalltalk Industry Conference: This is the traditional Smalltalk Solutions program. To submit, you just need summary information for the talk. The presentations are, at least sometimes, recorded, but there aren't published papers. The deadline for those submissions is VERY SOON - December 15th. Call for participation is here.

- Smalltalk Directions: This is the academic part of the conference, new this year. It accepts academic papers, which will be refereed and a selection of which will be submitted to the Journal of Object Technology. The deadline for those submissions is January 6th, 2012. The call for participation is here.

Monday, December 5, 2011

Some nice words for Smalltalk

As part of a comment on an earlier post, Bob Calco writes some nice things about Smalltalk...

Now I'm getting into Smalltalk 'for real' and finding that the OO-ness of it is not even the main thing I find compelling: it's the live-ness of it. It's just easier to think about the abstractions 'in the present' as it were.

I think Smalltalk takes the idea of live objects to such a level of sophistication that most people can't quite grok the Platonic Forms of domain modelling that swirl around the mind of an accomplished Smalltalk developer.

and also

But folks who have come to think OO is more buzzword than reality could not possibly have tried Smalltalk, not 'for real,' let alone tried to get good at it. It's not just a language or even a platform but a way of thinking about reduction of a problem to its essence, as this article makes clear.

I did leave out the bit in the middle where he has some thoughts for improvements like pattern matching, but they're at the bottom of the article. For myself, I've never quite seen pattern matching as an especially valuable feature. To my mind the biggest gain is that it's a terse way of extracting elements of a list - either a variable-size list of arguments or, if you're in a language where linked lists are the primary data structure, a list that is passed in and automatically expressed as two variables representing first and rest. Other than that it just seems like syntactic sugar for a case statement. But maybe I'm missing something.

Friday, December 2, 2011

Dave Ungar on Massive Parallelism

Another one of the Microsoft Channel 9 videos from the Splash conference. This features Dave Ungar talking about Self and his current work with massive parallelism, using Smalltalk and C++, and how we can get our answers much much faster if we're not quite so hung up on them having to be exactly right...

Thursday, December 1, 2011

Inheritance hierarchies

In an IRC discussion the comment came up that "deep hierarchies mean you're doing good OO.. um... right?"

That put me in mind of one of my favourite comments on the subject, from Richard Gabriel's 1996 book "Patterns of Software" (in PDF). In the first chapter (Reuse versus Compression) he talks about inheritance not as code re-use, but as code compression. You can express a lot very succinctly by sharing code that way, but you're tightly coupling those things together. Whether that's worthwhile or not depends on the circumstances.

Wednesday, November 30, 2011

Lego Nativity Scene

An annual tradition at our house

A nice quote

A nice quote from Malte Ubl on Google+

Optional typing might be the worst or best idea ever. I'm more & more leaning toward the best idea ever direction. Also +Gilad Bracha is promising Smalltalk-style edit-debug-edit-debug-cycles (without restarting the program). If you've ever worked with VisualWorks or another Smalltalk environment that supports this, you will agree that every other programing environment, including every single one in mainstream use today, feels like the stone age.

Nice both to see that Dart is promised to support that style of development, and to get some praise for VisualWorks and Smalltalk environments in general.

Thursday, November 17, 2011

Dave Thomas Splash Video

Microsoft's Channel 9 did interviews of a number of people at Splash. Here's the one with Dave Thomas of OTI/VisualAge/Eclipse/Bedarra fame. Dave is always worth listening to. Teaser quote

"Q: ...what's the state of object-oriented programming today in your mind?

A: Well, I think that the state is that it's commercially immensely successful, but practically I think it's a disaster. ... I don't think we really understood how difficult it is for people to do abstraction"

Tuesday, November 15, 2011

Algorithm Animation via Hungarian Dance

This is awesome! An entire series of sort algorithms animated via different dances.

Sunday, October 30, 2011

Optional typing

Back home from Splash and recovering. I'll post some of the things I found interesting as I find time over the next few days.

Dart was definitely a big deal at the conference. One smaller thing that was interesting for me was agreeing to participate in a usability test for a Dart tool. I discovered how stupid those kinds of tests can make you feel. I did talk to someone who does such usability tests for a different product and they said that a lot more of your brainpower is used up trying to vocalize your thought processes than you realize. So that's my story, and I'm sticking to it.

But the most surprising thing I found was that I was finding the optional types useful. My normal style would be not to put them in at all, but in a couple of places where I had them (because I'd copied the code from somewhere else) they did give me very quick feedback on the errors I was making - much faster and clearer than if I'd left them out.

Partly that's because I wasn't in my normal toolset, and what I had wasn't as good on runtime error reporting. What I'm used to is a Smalltalk debugger immediately popping up telling you that the method doesn't exist and letting you fix it right there. It's a bit different having your code compiled into Javascript and run in a browser. When there's an error it just quietly does nothing, but if you go into the developer tools in the browser and scroll down far enough in the generated code you can see an error indicating that the method doesn't exist (by which it may just mean that you forgot one of the parameters).

I don't think this is going to turn me into a static typing advocate, but it was an interesting experience.

Gilad Bracha talked about the type system, and one of the good lines from that talk was

"Didn't you do all this 18 years ago? Yes, we did this in Strongtalk in 1993 and nobody paid attention. They will pay attention now"

Wednesday, October 26, 2011

Refactoring now possible for dynamic languages

There's an interesting post here from Bob Nystrom about getting used to the optional typing in Dart. But it did have one bit in particular that irked me.

"If it knows the type, then thanks to the previous point, it knows what you can do with it. Ta-da: auto-complete and refactoring are now possible for a dynamic language."

I don't want to pick on Bob; this seems to be just one example of the widespread belief that you can't do refactoring in a dynamically typed language, despite the fact that much of the early work on it was done in Smalltalk. The term was actually coined with respect to Forth, as Brian Foote points out here, but it was popularized by the work of Bill Opdyke, John Brant and Don Roberts. The Brant and Roberts Refactoring Browser is currently the standard browser in VisualWorks and ObjectStudio and is the first example I know of automated refactoring support. Thanks to Don Roberts and Brian Foote, who happen to be here at the Splash conference and so were available to provide the historical information.

It's true that in a dynamic language you have a bit less information to use during refactoring. If we have a polymorphic message and we want to refactor it to, say, rename the method, but only some occurrences, then in a dynamic language we don't have a reliable way to know which senders refer to the ones we want to rename and which refer to the other implementations. So, if we wanted to rename MyClass>>printString we don't have a way to know reliably which senders of printString mean MyClass.

The problem is that we don't have a way to know that reliably in a statically typed language either. We will have more information that might be helpful in some circumstances. But suppose that we use a generic collection List. If we send any messages to the objects in that collection we don't know who the receiver is. So if we want to refactor, it's hard to make any assumptions about who that might be sent to.

Even with inheritance this sort of situation can arise. Suppose I want to rename the method printString in a subclass B whose superclass A also defines that method. If I find that message sent to something whose static type is B, I can change the sender. But if that message is sent to something whose static type is A, what do I do? The problem is that in the presence of multiple polymorphic implementations of the same method, renaming may not be a behaviour-preserving transformation.
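The inheritance case can be sketched in a few lines of Python. This is only an illustration under assumed names (A, B, BRenamed, describe, print_string and display_string are all hypothetical): the method is renamed in the subclass only, while a sender that sees its argument as an A is left untouched.

```python
class A:
    def print_string(self):
        return "an A"


class B(A):
    # Original subclass: overrides the inherited method.
    def print_string(self):
        return "a B"


class BRenamed(A):
    # The same subclass after renaming print_string to display_string
    # in B only. The override is gone, so print_string is inherited from A.
    def display_string(self):
        return "a B"


def describe(thing: A) -> str:
    # A sender whose declared type is A. A rename tool that only rewrites
    # senders known to be talking to a B leaves this call alone.
    return thing.print_string()


before = describe(B())        # dispatches to B.print_string
after = describe(BRenamed())  # silently falls back to A.print_string
```

The call site is identical in both cases, but its result changes, which is exactly why the rename is not behaviour-preserving regardless of how much type information the tool has.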

I suspect that people using refactoring tools in statically typed languages don't notice these issues because in practice refactoring works fine for them in most normal circumstances. But the same thing is true for people using dynamically typed languages.

And in closing I'll add one comment from Don Roberts, that when he and John Brant looked at refactoring in Java they found that although the static types did give you some more information, the difficulty of satisfying the bookkeeping of the static type system ended up making it more difficult.

Monday, October 24, 2011

Nice line from Splash/OOPSLA

I'm at the Splash conference in Portland. In this morning's Lars Bak talk about the Dart VM, there was a question/answer sequence that went roughly like

Q: What about tail call optimization?
A: Not too likely. I don't know why you even want that. The biggest problem for me with it is that if you end up in a debugger you can't see where you came from.
Q: Well, you shouldn't be debugging in the first place

That seems to nicely sum up some of the differences in approach of different programming communities.

Saturday, October 22, 2011

Number sequence questions on tests

IQ or aptitude tests, the sort of things you end up writing when you're in school, often feature number sequence questions where you have to fill in the next number in sequence. There's a simple trick that makes an enormous number of them much easier, that I learned when I was quite young and always make a point of teaching to children that I know. It's just applying successive differences. So, for example,

1  4  9  16 25 ...
  3  5  7  9

16  22  34  58 106 ...
  6  12   24  48

Mathematically, for any polynomial, taking the differences reduces the degree of the polynomial by one, so most of the polynomials you're likely to encounter in test contexts will reduce to a constant within two or three iterations.

So polynomials can be done mechanically, but the technique works better than that. A Fibonacci sequence reduces to itself

1  1  2  3  5  8 ...
 0  1  1  2  3

and so do exponentials, with any constant or polynomial terms reducing as they would have on their own

1   2   4   8   16   32 ...
 1    2   4   8    16

and in general, sequences that aren't very tricky rapidly become obvious. I think I remember learning about this trick by reading a Mathematical Games column that talked about a computer program designed to take IQ tests that only knew how to do number sequences and shape similarities.

To make sure I haven't been deluding myself and needlessly bothering young relatives I found a set of questions at fibonicci.com and ran through them all with that method. These were the ones ranked as medium difficulty. There were 13 questions, and there were 3 of them that this technique didn't get.

The trickiest one that it did get was

7  21  14  42  28 ...
 14  -7  28  -14

so two alternating sequences, each doubling. Or at least that's the interpretation I took, and it agreed with the test writers. There are an infinite number of patterns that fit any particular sequence, so agreeing with the test writer as to the simplest one is the important criterion. I'm sure there's a point to be made about such tests there.

The sequences it didn't get were

75  15  25   5  15...
 -60  10  -20  10

which I wasn't sure of the proper answer to, but the internet suggests an alternating series of divide by 5 and add 10.

1   2   6  24  120...
  1   4  18  96
    3   14  78

where each term is the previous term multiplied by n. And

183  305  527  749  961...
  122  222  222  212

where the trick is not to consider the numbers as integers but rather to consider each digit separately, so the next term is 183 again as all the digits wrap back around to their original values.

Obviously there are a lot of more complex sequences that this technique won't help with, but for test questions it's extraordinarily useful.
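For anyone who would rather let a computer do the subtraction, here is a minimal Python sketch of the successive-differences trick. The function names are made up for this illustration, and next_term_polynomial assumes a non-empty sequence whose pattern really is a polynomial; for anything else the table is just a way to make the pattern visible.

```python
def differences(seq):
    """Return the differences between neighbouring terms."""
    return [b - a for a, b in zip(seq, seq[1:])]


def difference_table(seq):
    """Take differences repeatedly until a row is constant or has one entry."""
    rows = [list(seq)]
    while len(rows[-1]) > 1 and len(set(rows[-1])) > 1:
        rows.append(differences(rows[-1]))
    return rows


def next_term_polynomial(seq):
    """Guess the next term, assuming the bottom row of the table stays constant.

    Extending the table by one column means adding up the last entry
    of every row, from the bottom row to the original sequence.
    """
    return sum(row[-1] for row in difference_table(seq))


squares = [1, 4, 9, 16, 25]
for depth, row in enumerate(difference_table(squares)):
    # Indent each row a little more, like the hand-written tables above.
    print(" " * (2 * depth) + "  ".join(str(n) for n in row))
print("next:", next_term_polynomial(squares))
```

For the squares the table bottoms out at a constant row of 2s and the guessed next term is 36. For sequences like the doubling or alternating ones, the interesting part is the printed table rather than the polynomial guess.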

Sunday, October 16, 2011

Dart: Block return revisited

One of the things that was in my initial list of things I missed in Dart was the ability to return from a closure and return to the original outer scope, as Smalltalk blocks do. That's important if you're defining all your control structures in terms of closures. e.g.

   input isEmpty ifTrue: [^self]

That doesn't do you a lot of good as a guard clause if the return only returns from inside the block and just continues execution on the next line. Another place non-local return is useful is to short-circuit collection iteration. So to find the first element in a collection that satisfies some condition we can write it as

   detect: aBlock
      aCollection do: [:each | (aBlock value: each) ifTrue: [^each]].
      ^'not found'

The if statement motivation isn't so important in Dart because they have "if" as a syntactic construct and if you return from within the action clause it's part of the same method, not in a closure.
For iteration, Dart has three mechanisms, two of them syntactic, and one that's just a method. The first is the old-style "for" loop

   for (var i = 0; i <= something; i++) { print(i); }

The second is also a "for" loop, but with special syntax that iterates over each element of a collection

   for (var each in aCollection) { print(each); }

Finally, there's a forEach method listed in the Collection interface that takes a closure and applies it for each element

    aCollection.forEach((each) => print(each));

With the first two forms they're part of the syntax, so a return statement will break out of the loop. But if you're invoking the forEach method then there isn't a way to break out of it.

The interesting bit I discovered today is that if you write your own collection classes, you can still use any of the forms. The implementation of the second form is that it sends the iterator() message to the collection and then uses that iterator to loop. So I could write a trivial binary tree class, define an Iterator (two methods: hasNext() and next()) for it and write

   for (var x in tree) { print(x); }

and it works fine.

I have to say both that I'm impressed that that works and that this takes a bit of the edge off my wanting non-local return. There are other uses, but I do have to admit that it's a complicated feature with some difficult edge cases, and being able to use it this way does take the air out of the most obvious motivating use case. Wanting to have an at:ifAbsent: where in the ifAbsent case I return from the original outer scope might be useful, but it's not nearly as good an example.
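The same shape can be sketched outside Dart. The Python below is an analogy only, not Dart's Iterator interface: the Tree class and the detect function are hypothetical names, and Python's __iter__ plays the role of iterator(). Once the collection supplies an iterator, the built-in loop syntax works on it, and an ordinary return inside the loop does the short-circuiting that non-local return provides in the Smalltalk detect: example.

```python
class Tree:
    """A trivial binary tree that can be walked in order."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def __iter__(self):
        # In-order traversal: left subtree, this node, right subtree.
        if self.left is not None:
            yield from self.left
        yield self.value
        if self.right is not None:
            yield from self.right


def detect(collection, predicate, if_none="not found"):
    """Return the first element satisfying predicate, stopping right there."""
    for each in collection:
        if predicate(each):
            # The loop is syntax, not a closure call, so this return
            # leaves both the loop and the function.
            return each
    return if_none


tree = Tree(4, Tree(2, Tree(1), Tree(3)), Tree(6))
first_big = detect(tree, lambda x: x > 2)
```

Because the traversal is lazy, elements after the first match are never visited.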

Dart keyword arguments

Coming from a Smalltalk background, one of the things I like in it is the message format, with keywords with colons separating the arguments rather than mathematical function syntax. So, e.g.

    aDictionary at: 12 ifAbsent: ['default'].

Dart has operators you can define as messages, so their default syntax would be

    aMap[12]='something';

But that's not the same operation, as it doesn't have the ability to say what to do if the key is missing. It does define

    aMap.putIfAbsent(12,'default');

but that's not the same as at:ifAbsent:, both because it always adds something to the map/dictionary, and because the thing you're adding is a value, not the result of evaluating a block.
However, Dart also has named keyword arguments to methods. They're not shown, as far as I saw, in the introductory materials on the site, and I didn't notice them in the sample code that I looked at, but they're in the spec. The form is

   at(var key, [ifAbsent]) {
     if (containsKey(key)) return this[key];
     if (ifAbsent is Function) 
        return ifAbsent(); 
     else 
        return ifAbsent;
    }

and to invoke it we'd do something like

  Map aMap = { "one": 1, "two":2};
  aMap.at("one",ifAbsent: "default");

or, if we felt like formatting it slightly differently, it seems that there can be whitespace between the dot and the method, so

  aMap.
     at("one",
     ifAbsent: "default");

  aMap.
     at('one',
     ifAbsent: () => throw SomeException);

Both of those latter forms, while they have some additional bits of punctuation, look interestingly like syntax I'm more used to. Named parameters can also have default values, which could also be useful, though I tried

  someFunction(var a, [b= (var x)=>x * 2])

and while it didn't complain of a syntax error, variable b was null if not specified. That might just be a compiler bug, or it may be that a closure literal isn't allowed there, I'm not sure.
I haven't tried using these on any kind of scale to see how usable they would be in practice, and they obviously aren't the preferred style in the examples, but I found it interesting. And as a minor point, it's odd that for normal parameters I have to specify either a type or "var" but that doesn't seem to be necessary for the named parameters.
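For comparison, the at/ifAbsent idea reads much the same with keyword arguments in Python. This is a sketch with a hypothetical free function named at, not anything from Dart's library: the fallback is passed by name and, as in the Dart version above, is either a plain value or something to call only when the key is missing.

```python
def at(mapping, key, if_absent=None):
    """Look up key; if it is missing, fall back to if_absent.

    if_absent may be a plain value or a zero-argument callable.
    A callable is only invoked when the key is absent, and the
    mapping itself is never modified.
    """
    if key in mapping:
        return mapping[key]
    if callable(if_absent):
        return if_absent()
    return if_absent


a_map = {"one": 1, "two": 2}
found = at(a_map, "one", if_absent="default")
fallback = at(a_map, "three", if_absent="default")
computed = at(a_map, "three", if_absent=lambda: len(a_map))
```

Unlike putIfAbsent, nothing is added to the map, and the callable form keeps the lazy evaluation that a Smalltalk block gives you.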

Friday, October 14, 2011

STIC 2012 Call For Participation

The Smalltalk Industry Conference (previously known as Smalltalk Solutions) 2012 Call For Participation is now out. The web site appears to be in a poor state right now, so pasting the whole thing here.

Smalltalk Solutions is now called STIC – Smalltalk Industry Conference.

STIC is a forum where Smalltalk professionals, researchers, and enthusiasts can meet and share ideas and experiences. We are currently accepting proposals for talks involving Smalltalk technology and other areas of innovation in the software industry. We're looking forward to an excellent conference, and need your participation to maintain the high technical level of the conference!

The conference will take place in Biloxi, Mississippi, March 19 – 21, 2012.

Presentations will have 45 minutes time slots including discussion. They may be in the form of

Technical Presentations
Experience Reports
Technology Demonstrations
Panel Discussions
Workshops
...

Proposals should be submitted by email to STS_Speakers@stic.st and should include the following information:

Name
Contact Information
Type of Presentation
Title of Presentation
Brief Abstract
Short Biography of the Presenter(s)
Any constraints on date/time
Any other information of importance in evaluating the proposal

If you cannot discuss the internal application you are working on because of corporate restrictions, perhaps you can discuss the application's component usage or development process. We will also be reserving time for short presentations, of the form of Lightning Talks and (very) short technology demonstrations, but these will be available for sign-up at the conference rather than as advance proposals.

Submissions should be received by December 15, 2011. Note that submissions with incomplete information may be rejected - particularly if bio or abstract information is missing.

Presenters will qualify for a significantly discounted registration. This year that will be $200 US, or $160 US for STIC members.

For announcements of the conference see www.stic.st and http://www.smalltalksolutions.com/.

Georg Heeg

STIC - Smalltalk Industry Council

Executive Director

Phone +49-3496-214328, Fax +49-3496-214712

Monday, October 10, 2011

Google+ Useful

I got an invite to Google+ fairly early on, but up until now I haven't found it all that useful. Right now my ratio of technical to non-technical people in circles stands at 42:1. So it acts a lot like yet another technical news service. But the other day I was looking at some of the posts people made and there were several about talks at the upcoming Splash conference, which combines a number of other conferences, including OOPSLA. I hadn't been to OOPSLA in a few years, but there were some very interesting talks, including Daniel Weinreb on language extensibility in Object-Relational mapping (I don't care if you don't find O/R mapping interesting - I do :-), interesting massively multi-processor research in Smalltalk with Dave Ungar, Ivan Sutherland on getting away from the "prison" of sequential computing, and a talk from the always-interesting Dave Thomas (OTI / Bedarra) on why modern application development sucks. Lars Bak of newfound Dart fame will also be in attendance.

So now I'm going to go, and it's more or less because of Google+.

Dart

Today is Thanksgiving in Canada, but the relatives were all over yesterday, so that gave me a lot of time to spend looking at Google's new "Dart" language, introduced this morning at the GOTO conference in Aarhus.

So here are my initial thoughts and questions, given that I've really only just read stuff and written a few code snippets. I'll try to keep it a little beyond just complaining that it isn't my favourite language :-) Overall, it's clearly very early days for it, and there are many things yet to be done, but it looks like it does a lot of things right and has some serious potential.

Open Questions

What sort of tool support is there? I've seen that talked about, but not being at the keynote I haven't seen it demoed yet.
How exactly does the exception model work? One of the things I find most useful in Smalltalk is that exception handling is done in two phases. First, we find the handler and run it, and only after that do we unwind the stack. That means that when developing, an unhandled exception can put us right into the debugger at the point of the exception and with the ability to see the code and modify it. There are hints in the spec that suggest Dart is able to do this. The exception handler definitely gets a strong representation of the stack which "becomes undefined" after it's finished. I'd like to know more. Exceptions seem to be able to be any object, though many of the exceptions are classes. That seems like it might run into problems when you start wanting to be able to consistently ask certain things of an exception, but maybe the things you want to ask aren't really the exception object but the other things that come along into the handler.

Can I do some equivalent of block return? So, for example, the Dart collection library includes some basic iteration methods. There's a forEach, and there's a filter (think #select:) but there isn't a detect:. How would I write an equivalent of detect: in terms of a provided forEach method, since the return from a function just returns from the inner block. Maybe I could do it with an exception, but that seems awfully ugly. Or I could save the value and let it just keep going through the rest of the collection, but I don't want to do that. And that leads into...

Can I add my own control structures? And if so, how pleasant will they be to use? This has a few pieces. Can I extend existing classes? Javascript lets me do this easily, as does Smalltalk. Newspeak doesn't, on philosophical grounds. But to me this seems awfully useful for being able to define my own control structures and other language elements, and that's an important piece of being able to extend the language into its own DSL.

In general, I find myself thinking about typical Smalltalk tools and wondering how doable it would be to write them in Dart. Maybe I'll give it a try and find out.

Things I Like

Everything's an object. No primitive types and no mismatched operations because of it.
Most things are message sends. There's still too much syntax for my taste, but most of the important stuff goes through message sends.
Unlimited size integers

There's a doesNotUnderstand: equivalent (noSuchMessage)

There's a fairly significant type system. You can completely ignore it.
The setters and properties are nice. You can reference things as properties, and they go through the get/set methods if you've written them.
Also, there's operator overloading, but in the sense that operators are really just methods, and it lets you write some nice things like

things.stuff[1]="Foo"

Types like "int" are actually interfaces. Though some people were arguing that it's confusing to have them as lowercase.
A late addition - no separate compile step.

Things I'm Not Quite Sure About

The concurrency model with single-threading, but the ability to spawn actor-style isolates sounds quite interesting. That'll be interesting to see.
The default return value is null, not self. This is philosophically because if you don't explicitly return something and you try to make use of the return value it's probably a mistake and you want it to fail fast. But I'm not sure this wouldn't lead to a lot of extra statements in methods if you do tend to make use of the return values.
There's an interpolated string form that they seem to favour, where you can embed ${expression} into a string. But one comment that I read argued that this wasn't as useful as being able to put in things more like positional arguments and then bind that to a set of values, and I think they have a point.
Reflection is still a to-do list item.

The syntax for functions is pretty lightweight. In the simplest case, something like (excuse my formatting)

     () => 3;

     (aNumber) => aNumber + 1;

and the more complex cases where it's not a single expression in the body are

     (aNumber) { print(aNumber); return aNumber + 1;};

That's nicely short. I find the parentheses to be a bit of syntactic noise that bothers me more than the very simple [3], but I might be able to live with it.

Things I Could Wish For

There's no become:, but then I wouldn't really have expected to see one. And I'd have rather had keyword argument syntax, but I didn't really expect that either.

Classes aren't objects. There are "static" methods but they aren't really class methods, and you can't have an expression that evaluates to a class. They argue that the class/instance method distinction has been shown to be confusing for users. There is a fairly sophisticated mechanism for constructors which might be enough for the most common uses of class methods, but that's not clear to me yet.

The reserved words seem a bit more intrusive than they need to be. Or maybe it's just that I got bitten trying to write a method named "do" :-)

It seems like only certain things are Hashable and thus eligible as Map keys. That seems restrictive. But I might just be confused about that one.

There's a lot of spec space and presumably a lot of mental cycles spent on the static typing system. I'd just as soon that energy went on more worthwhile things :-) Mostly you can ignore it, but I did run across one interesting case. There's a for loop construct that will let you loop over a collection. It requires you to put in a type, but ignores it. So I can't leave out the "int" in

main() {
  var a = [1,2,3];
  for(int x in a) print(x);
}

but since the type is ignored, this version is accepted just as well

main() {
  var a = [1,2,3];
  for(Exception x in a) print(x);
}

Thursday, October 6, 2011

Google sues itself, more or less

This is an interesting twist, a Google-backed "non-practicing entity" patent holder suing Motorola, which Google is currently trying to acquire. In the current environment, where big companies build up big portfolios of patents to deter infringement suits, it seems to me that it's a real advantage to be a non-practicing entity so that you're immune to that defence. You can't be countersued for the patents your own products infringe, because you don't have any products. So ultimately it's a win for pure parasitism over anyone actually trying to produce something.

The last few years I've been involved with a few different patent lawsuits where things that were done years ago in TOPLink or in Smalltalk might serve as prior art. When you talk to the lawyers they're generally quite open about the brokenness in the current patent system, but until something changes what can you do but play along and charge lots of money for doing so?

Monday, October 3, 2011

Principles of OO Design, or Everything I Know About Programming, I Learned from Dilbert

Back in the last century, I wrote a column for the magazine "The Smalltalk Report". One of the articles I wrote has been enduringly popular and I still get the occasional person asking about it. So here it is yet again.

Principles of OO Design

or
Everything I know about programming, I learned from Dilbert

Everyone knows that objects and object-oriented design are the hottest things since sliced bread (and of course, slices of bread are objects). The problem is that it’s hard to agree on what exactly they are. There have been many attempts to define the principles of OO design and coding, with varying degrees of success. In my opinion, most of them suffer from two flaws. The first is that they don’t tell me enough about how to code. Reading a definition of polymorphism doesn’t tell me how to exploit it in my programs. The second, and more important problem, is that they’re dull. Even if the definition of polymorphism did tell me how to code, it’s hard to stay awake long enough to finish reading it.

Therefore, I modestly present some of my own principles of OO-ness, which I hope address both of these flaws. Furthermore, I believe that these principles relate well to the corporate environments that are currently adopting OO principles.

1) Never do any work that you can get someone else to do for you

This is always good advice, but it’s particularly applicable in OO. In fact, I consider it the fundamental principle of OO. As an object, my responsibilities are very clearly defined, and so are those of my co-workers. If something is (or ought to be) one of their responsibilities, then I shouldn’t try to do that work myself.

Let’s look at a concrete example

   total := 0.
   aPlant billings do: [:each |
      (each status == #paid and: [each date > startDate])
         ifTrue: [total := total + each amount]].

versus

   total := aPlant totalBillingsPaidSince: startDate.

In the first case we’re asking the plant for all of its billings, figuring out for ourselves which ones qualify, and computing the total. That’s a lot of work, and almost none of it is our job. Far better to use the second option, where we simply ask for something to be done and get a result back. In real-world terms, the conversation might look like

“Excuse me Smithers. I need to know the total bills that have been paid so far this quarter. No, don’t trouble yourself. If you’ll just lend me the key to your filing cabinet I’ll go through the records myself. I’m not that familiar with your filing system, but how complicated can it be? I’ll try not to make too much of a mess.”

Smithers actually understands his filing system, so he can probably do the work faster than we can, and he’s much less likely to mess everything up. In seeking to do his job for him, we’re just making things worse. They’ll get a lot worse when he switches over to that new filing system next week. We’d be far better off with the stereotypical tyrant boss.

“SMITHERS! I need the total bills that have been paid since the beginning of the quarter. No, I’m not interested in the petty details of your filing system. I want that total, and I’ll expect it on my desk within the next half millisecond.”
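
For what it's worth, Smithers's side of that exchange might be sketched as follows. This assumes the plant's class keeps its billings in an instance variable called billings (a hypothetical name), and it reuses the status, date, and amount messages from the first example. It's one possible shape, not the only way to do it.

```smalltalk
totalBillingsPaidSince: startDate
   "Hypothetical method on the plant's class. Assumes an instance
    variable named billings that holds the billing objects."
   ^billings
      inject: 0
      into: [:sum :each |
         (each status == #paid and: [each date > startDate])
            ifTrue: [sum + each amount]
            ifFalse: [sum]]
```

When the filing system changes next week, this is the only method that has to know about it.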

Let’s look at a simpler example, which is all too common.

   somebody clients add: Client new.

versus

   somebody addClient: Client new.

There’s always a temptation to choose the first, since it saves writing a couple of methods that do nothing but adds and deletes on the other class. But deep down you know it’s wrong. You’re trying to do somebody’s work for them, and in the end it’s only going to cause problems. Writing those extra methods puts the responsibility where it belongs, and will make the code cleaner in the long run.

This principle is close to the more conventional idea of encapsulation, but I like to think it makes the idea a bit clearer. I often see people who are happily manipulating the internal state of another object, but think it’s OK because they’re doing it all through messages. Encapsulation is not just about accessing state, it’s about responsibilities. Responsibility is about who gets stuck doing the real work.

2) Avoid responsibility

If responsibilities are about getting stuck with work, it’s important to avoid them. This has some important corollaries

If you must accept a responsibility, keep it as vague as possible.
For any responsibility you accept, try to pass the real work off to somebody else.

Our first principle tells us to take advantage of other objects when writing code. We also have to avoid being taken advantage of. Any time I (as an object) am tempted to accept a responsibility, I should ask myself “Is this really my job?” and “Can’t I get someone else to do this?”

If I do accept a responsibility, it’s important to keep it as vague as possible. If I'm lucky, this vagueness will help me get out of really doing the work later. Even if I do have to do the work, it may let me take some shortcuts without anybody else noticing.

For example, I’ve seen objects with responsibilities described as

Maintain a collection of the whosits to be framified

This is much too specific. My job isn’t to maintain a collection, it’s to be able to report, when necessary, which whosits need framification. That may be implemented by maintaining a collection, it may be implemented by asking one or more other objects for their collection(s), it may be hard-coded, or it may be computed dynamically as

   Whosit allInstances select:

No matter which of these options I choose, there shouldn’t be any impact on my responsibilities.

My preference for phrasing a responsibility of this kind is

“Know which ...”

but I’m flexible as long as the phrasing is suitably vague. I’d probably be even happier with

“Be able to report which ...”

Now, this is all very well, but carried to the extreme, it seems that this could lead to the situation where everyone passes information around and nothing ever gets done. Exactly. Object bureaucracy at its finest.

Seriously, a good OO system can actually approach this state. Each object will do a seemingly insignificant amount of work, but somehow they add up to something much larger. You can end up tracing through the system, looking for the place where a certain calculation happens, only to realize that the calculation has been done and you just didn’t notice it happening.

3) Postpone decisions

The great virtue of software is flexibility. One of the ways we achieve flexibility is through late binding. We most often talk about late binding between a method name and the method it invokes, but it’s also important in other contexts. When faced with a decision, we can gain flexibility by postponing it. The remaining code just needs to be made flexible enough to deal with any of the possible outcomes.

The ideal is when we can avoid making the decision at all, leaving it up to someone else (the end-user, other objects). For example, consider the question of how to implement dictionaries. The standard thing to do is use a hash table. That works well for medium-sized collections, but it’s a waste of space and effort for very small collections. For very large collections, it may also be wasteful, particularly if the number of elements exceeds the resolution of our hash function. We have to make a decision here, so we’d like to postpone it or pass it off to someone else.

Some implementations of the collection classes do precisely this. The collections pass off much of their behavior to an implementation collection which actually does the work. Depending on the size, the nature of that collection can change. In VisualAge 2.0, small dictionaries would be stored as arrays since the overhead of hashing was more than the cost of a linear search. Larger dictionaries could be represented as either normal or bucketed hash tables. Unfortunately, postponing this particular decision ended up leading to worse performance on the average cases, and the scheme was abandoned in VisualAge 3.0. This makes it not only a good example of the principle, but an illustration of when you can take it too far. Postponing decisions can have a performance cost.

There are other possible costs to this kind of principle. Decisions aren’t just sources of problems, they give us the power to solve our own problems. Since we can’t solve all the problems of the world at once, we make the decision to limit ourselves, and we make assumptions about the problems we’ll be given. This makes our code simpler, easier to write, and faster. The problem arises when it turns out our decisions were bad, or our assumptions don’t hold any more. The trick is to make enough decisions to be able to work, but few enough that our code doesn’t become brittle. That’s one of the things that makes software hard, and makes the ability to change software important. That’s one of the reasons I like Smalltalk, because it makes changing software much easier than any other environment I’ve seen.

4) Managers don’t do any real work

The subject of “manager” or “control” objects can provoke a lot of debate in OO circles, much as the subject of “managers” does in other work environments. Some argue that they are inherently un-productive and should be eliminated. Others argue that, although they may represent a throwback to outdated ways of thinking, they can be very useful under the right circumstances.

I definitely believe that managers can be useful, but it’s important to distinguish between good ones and bad ones. For example, consider a program in which most of my classes are “record objects” (objects whose only behaviours are get and set methods). The real work is done by a control class which manipulates these objects, with full access to all their data. At this point I have a procedural program dressed up in an OO disguise. The control object is in the most complete possible violation of the fundamental principle, since it’s trying to do all the work itself.

On the other hand, consider a window class like the VisualWorks ApplicationModel or the WindowBuilder WbApplication class. These are manager objects that coordinate the interactions between user interface widgets and the domain model. They serve as a vital “glue” layer (although I prefer to think of it as duct tape) and it would be much harder to get a clean design without them. This pattern is less clear in VisualAge, since the glue lies in a number of different “Part” classes, but it’s there.

People who are vehemently opposed to any kind of manager object are often stuck in the trap of trying to precisely model the world, taking the OO paradigm much too literally. One of my favourite quotes on this subject (from several years back) is from Jeff Alger, who wrote:

"The real world is the problem; why would you want to just simulate it?"

How can we tell a good manager object from a bad one? We apply the principle that managers don’t do real work. A manager object should manage interactions between other objects. It should not be trying to do work itself, unless it’s legitimate management work.

An example of legitimate management work is an ApplicationModel figuring out which menu items need to be disabled. An example of non-legitimate work would be doing (non-trivial) calculations of values to be displayed in its fields. Those values should be calculated by the domain objects.

This rule can be tricky to apply in practice. It can be difficult to decide if something is legitimate management work or not. Always remember that this is just a specific application of the fundamental principle. If the manager can plausibly get someone else to do the work, it should do so.

Another difficulty is that the word “Manager” is sometimes tacked on to the end of a class name even though what it describes is not a manager at all. In one comp.object discussion Robert Cowham (cowhamr@logica.com) described a DiscountPolicyManager object, and worried about the desirability of introducing a manager object, even though it seemed to make the design cleaner. The description was as follows:

A Discount Policy Manager is going to be passed, say, an Invoice object, and will calculate the appropriate discount to be applied to that Invoice (using methods on the Invoice to find out about it) and then use a method on Invoice to add the discount to it.

Reading this description, it’s clear that the DiscountPolicyManager is really just a policy object as described in the previous section. It isn’t a manager at all, and should be called DiscountPolicy instead.

5) Premature optimization leaves everyone unsatisfied

The most fun you can have as a programmer is optimizing code. There’s nothing quite so satisfying as taking some little piece of functionality and making it run 50 times faster than it used to. When you’re deep in the middle of meaningless chores like commenting, testing, and documenting, the temptation to let go and optimize is almost irresistible. You know it’s got to be done sometime, and you feel like you just can’t put it off any longer. Sometimes you’re right and the time has come to make this piece of code really scream. More often than not, you’ll be happier in the long run if you can just hold off a little longer.

There are several reasons. First of all, time spent on optimization isn’t being spent on those “meaningless” chores, which are often more important to the success of the project. If testing and documentation are inadequate, most people won’t notice or care how fast a particular list box updates. They’ll have given up on the program before they ever got to that window.

That’s not the worst of it. Premature optimization is usually in direct violation of the principle of postponing decisions. Optimization often involves thoughts like “if we restrict those to be integers in the range from 3 to 87, then we can make this a ByteArray and replace these dictionary lookups with array accesses”. The problem is that we’ve made our code less clear and we’ve greatly reduced its flexibility. It may have felt really good at the time, but the other people involved in the project may not be entirely satisfied.

Of course this rule doesn’t apply to all optimizations. Most programs will need some optimization sometime, and this is particularly true in Smalltalk. As a very high-level language, Smalltalk makes it very easy to write very inefficient programs very quickly. A little bit of well-placed optimization can make the code enormously faster without harming the program.

There’s also a large class of optimizations that I call “stupidity removal” which can be profitably done at just about any time. These include things like using the right kind of collection for the job and avoiding duplicated work. Their most important characteristic is that they should also result in improvements to the clarity and elegance of the code. Using better algorithms (as long as their details don’t show through the layers of abstraction) can also fall into this category.

Other Rules To Live By

There are a lot of other rules of life that can be extended to the OO design and programming domains. Here are a few more examples. Feel free to make up more and send them to me. Make posters out of them and put them up on your office wall. It’ll make a nice counterpoint to those insipid posters about “Teamwork” and “Quality” that seem to be everywhere these days.

· Try not to care - Beginning Smalltalk programmers often have trouble because they think they need to understand all the details of how a thing works before they can use it. This means it takes quite a while before they can master Transcript show: 'Hello World'. One of the great leaps in OO is to be able to answer the question “How does this work?” with “I don’t care”.

· Just do it! - An excellent slogan for projects that are suffering from analysis paralysis, the inability to do anything but generate reports and diagrams for what they’re eventually going to do.

· Avoid commitment - This is another way of expressing the principle of postponing decisions, but one which might strike a chord with younger or unmarried programmers.

· It’s not a good example if it doesn’t work - This one comes from Simberon's David Buck (david@simberon.com), who’s fed up with looking at example and test methods that haven’t been properly maintained as the code evolved. I can’t think of a way to apply this one to life, but it’s good advice anyway.

· Steal everything you can from your parents - A principle for those trying to make effective use of inheritance or moving into their first apartment.

· Cover your ass - Like in a bureaucracy, the most important thing is to make sure that it isn’t your fault. Make sure your code won’t have a problem even if things are going badly wrong elsewhere.

My original byline for this stated that "Alan Knight avoids responsibility with The Object People", but nowadays I'm, well, um, a manager at Cincom Systems. I'm trying not to think about the OO design implications of that.

Friday, September 30, 2011

"Go" Smalltalk

Here's a random thought that I haven't done a lot of homework on but thought I'd throw out there. People are often looking for places that can host web sites written in Smalltalk. Lots of places will give you full access to a machine, but the places that provide a lot of infrastructure tend to restrict things to a couple of platforms.

Google's App Engine is one such platform. It provides a lot of services and scalability, and is free for up to 5 million or so page views per month, which is probably enough for many purposes. But the applications have to be written in Java, Python, or Google's own "Go" language. Go is intended as a system programming language, so presumably has good performance on low-level operations, but offers garbage collection and run-time reflection.

So it ought to be possible to write a Smalltalk VM in Go that wouldn't have the same sort of performance difficulties that you'd get trying to write one on top of, say, Java. And if the garbage collection is at all reasonable it might be possible to just delegate the GC to it instead of writing it as part of the VM. I would think it ought to be not that difficult to, say, adapt Squeak's Slang to emit Go code instead of C, or even to write a simple VM directly in Go.

There are a few difficulties. If you actually want to run on App Engine, it's a restricted environment. You can't write files, you have to use their datastore APIs for anything that persists. You're only allowed to run code in response to HTTP requests. So it's not really what Smalltalk expects, and it certainly wouldn't run the normal development environment easily. The restriction to running code only in response to HTTP requests might be an issue, though I'm not sure exactly what it means. Since this is supposed to scale automatically using Google's infrastructure, it might mean that any two requests might go to different running instances and you might get shut down once the request is done. So if you want to save any information between requests it would have to be put into some sort of data store. That's probably not so good for Seaside continuations, although it's possible to serialize processes using VisualWorks BOSS or the new Squeak/Pharo Fuel serializer. Aside: I know I saw something talking about using Fuel for this but can't find the link; this is something that BOSS has supported for many years.

Nevertheless, a Smalltalk VM in Go running on App Engine might make quite an interesting way of deploying at least some types of Smalltalk web applications with excellent scalability and free up to quite a large usage. I wonder if anyone else has thought about this or looked into how difficult it would be or how severe the limitations are in practice.

Monday, September 12, 2011

MacBook Air

The recent trip to Germany with a lot of city-hopping started making me very weight-conscious, so when we decided to replace the 2006 MacBook Pro we went with the new MacBook Air that had just come out. And I have to say it's remarkably nice. The difference in weight really makes a difference, not just when you're carrying it around in a bag, but even in normal use, and it's a very nice and responsive machine. I did run into one issue I hadn't thought about when one of the hotels on a recent vacation only had wired internet in the room (the Air has no Ethernet jack - it's too thin to have room for one) but on the whole I've been extremely happy with it so far.

Monday, August 29, 2011

Confusing hotel icon

Was taking pictures off my phone and came across this one, which I took in one of our hotels in Germany. It seems to be one of those language-independent icons intended to tell you, well, something, but I still have no idea what it means, and even a web search doesn't seem to help.

Any ideas?

Tuesday, August 23, 2011

Smalltalk Industry Conference (aka Smalltalk Solutions) 2012

Smalltalk Solutions has been renamed as the Smalltalk Industry Conference, and today we've announced that the 2012 conference will be held at the Beau Rivage resort in Mississippi. It's supposed to be a really nice facility (certainly the people who run it seem to have generally nice facilities), and it's an area I've never been to, so I'm looking forward to it.

Tuesday, August 9, 2011

Air Canada??? Really?

Yesterday I got an email from Air Canada which included in its headlines "Skytrax Award: Best International Airline in North America" and if you follow the link, it further explains

Just a few of the reasons we were ranked “Best International Airline in North America” in a worldwide survey of 18.8 million global air travelers**, took home five top honours in Business Traveler’s “Best in Business Travel” award program*** and was named Global Traveler Magazine’s Best International Airline in North America****

with the footnote

**The survey was conducted by independent research firm Skytrax between July 2010 and May 2011 using over 38 different aspects of passenger satisfaction to rank airlines’ product and service standards. This annual survey is regarded in the air transportation industry as a primary benchmarking tool for passenger satisfaction levels of airlines throughout the world.

This made me think. Has someone perfected the art of rigging a survey that much? Or are we led to the unbelievable conclusion that other North American airlines are actually *WORSE* than Air Canada?

But maybe I'm just bitter after years of having essentially no choice, and just having to put up with things like having the first leg of your flight silently dropped, leaving you stranded in Italy without a flight, with airlines apparently completely unable to communicate with each other so that you're left standing down the corridor at a pay phone on hold for eons at international long distance rates to try to get it fixed, and then Air Canada's idea of recompense being a modest discount on your next flight with them that if you took a suitably expensive flight would just about cover the additional expenses you incurred just to get home, and that's not counting the long distance in that. @#$@#%@.

Friday, July 29, 2011

Fun with Mail (part 2)

Some more details on the mail filtering I'd talked about here. To recap, I'd set up a Linux server in the corner of the laundry room, and was using it for IMAP, with Thunderbird on a Mac as the client. But I was finding Thunderbird's filtering very unreliable. I guessed that might be because it was IMAP rather than local folders. So I decided to do some filtering in Smalltalk. I set up a cron job as

export VISUALWORKS=/home/aknight/vw7.7.1nc
/home/aknight/vw7.7.1nc/bin/linux86/visual /home/aknight/bin/imap.im -nogui -evaluate "10 seconds wait. Net.Filter new run"

To do the filtering, I wrote a simple class called Filter. I put it in the Net namespace because that way it would see all the Net classes I wanted to use, and because I was too lazy to make my own namespace for just one class.

When one of these filter objects is created we also set up an IMAP client, as

"Security.X509.X509Registry default 
   addTrusted: Security.X509.AlansGlobal."
"client useSecureConnection."
client := IMAPClient host: '192.168.1.5'.
[client connect] 
   on: Security.SSLWarning 
   do: [:ex | ex proceed].
client user: (Net.Settings defaultIdentity).
client login.
client select: 'Inbox'.

You'll notice the first two lines are commented out, because after I'd set up the security I decided I didn't really need it for a process running on the same machine, within my home network. But I left the code there because it might be important in other circumstances. The remaining lines create an IMAPClient, tell it which host to use, tell it to use the identity that I'd entered in the settings, have it log in to the server, and issue the select: command to look at the Inbox.

One thing that's important is that when we're done, we should be careful to close the connection, doing

client close.
client disconnect.

or else the server gets too many connections after a while and complains. Not everything has a nice garbage collector to clean up for us.
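
One way to make sure that cleanup happens even when a filter blows up on a malformed message is to wrap the work in ensure:. The workspace sketch below is only an illustration: three blocks stand in for the real IMAP calls (connecting, filtering, and the close/disconnect pair), and the block names and the log are invented for the example.

```smalltalk
| log connect filterAll disconnect |
log := OrderedCollection new.
"Stand-ins for the real work; none of these talk to a server."
connect := [log add: #connected].
filterAll := [log add: #filtering. Error signal: 'malformed message'].
disconnect := [log add: #disconnected].

connect value.
"The ensure: block runs whether filterAll finishes normally or is
 unwound because the handler returns from the error."
[[filterAll value] ensure: [disconnect value]]
   on: Error
   do: [:ex | log add: #failed. ex return: nil].
```

In the filter itself this would presumably take the form [self runFilters] ensure: [client close. client disconnect], where runFilters is a made-up name for the filtering loop.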

Once we're connected, we need to get the messages.

messages
  | unseen tempMessages result notDeleted |

  unseen := client searchMessages: 'UNSEEN'.
  notDeleted := client searchMessages: 'NOT DELETED'.
  notDeleted isEmpty ifTrue: [^Dictionary new].
  tempMessages := client fetchMessages: notDeleted.
  client markAsUnSeen: unseen.
  result := Dictionary new.
  tempMessages do: [:each |
    result at: 
      (Integer readFrom: each key readStream) 
      put: each value].
  ^result.

The searchMessages: API will let you search on the server for particular criteria. The criteria are pretty self-evident. One thing that I'm working around here is that using these APIs marks the messages as read. So what I'm doing is finding all the unread messages and keeping a list of their ids, then fetching all of the messages, and then marking the ones that were previously unread as unread again. Not very elegant, but it worked ok. There's probably a race condition there if new messages arrive in between the steps, but the worst thing that happens is the messages show up as having been read when it's not true.

Once we've got all the messages, we loop over them and run filters. Much of the code for that is actually error handling. Martin Kobetic, who wrote a lot of our Net code, says that spam is a wonderful source of edge cases for the various protocols and formats. The main part of the loop looks like

messages keysAndValuesDo: [:key :eachMessage |
   message := [[MailMessage readFrom: eachMessage first readStream]
      on: KeyNotFoundError
      do: [:ex |ex receiver = StreamEncoder encoderDirectory
         ifTrue: [#undecodeablejunk]
         ifFalse: [ex pass]]]
      on: ParsingSimpleBodyError
      do: [:ex | #undecodeablejunk].
   message = #undecodeablejunk ifTrue: [
      Transcript cr; show: index printString, ' is undecodeable'.
      WindowingSystem isHeadless ifFalse: [eachMessage inspect]].

Messages are keyed by integers (message number in the particular mailbox) on the server. So we loop over the key (message number) and the message itself. Well, the message is actually an array with one element with the message body. We need to read that and extract the various header fields. But the message might be in an encoding we don't have. That comes up as a KeyNotFoundError, meaning we didn't find the encoding name, say, Big10. I chose to interpret that as meaning the message wasn't important, so I just return the special symbol #undecodeablejunk and log it. If I'm running interactively, I inspect the message, so I can validate that. I did have some valid messages get flagged as junk that way, but not a lot.

Even if we've got the encoding, the message may be malformed in interesting ways, and we may get a ParsingSimpleBodyError, so I catch that and also mark things non-decodeable.

Then we want to actually run the filters. I defined a pragma for filters, so what I have is a bunch of methods that look like

filtervwnc
   "self new run"
   
   ^self matchRecipient: 'vwnc@cs.uiuc.edu' andMoveTo: 'INFO.vwnc'

Where matchRecipient:andMoveTo: looks like

matchRecipient: recipient andMoveTo: mailbox
   message to, message cc do: [:eachRecipient |
      ('*', recipient, '*' match: eachRecipient) ifTrue: [
         ^self moveTo: mailbox]]
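
The wildcard test relies on String>>match:, where the receiver is the pattern and * matches any run of characters. Here's a quick workspace check, using made-up address strings as stand-ins for whatever message to and message cc actually answer:

```smalltalk
| pattern hit miss |
"Same pattern construction as in matchRecipient:andMoveTo:."
pattern := '*', 'vwnc@cs.uiuc.edu', '*'.
"The address appears somewhere inside the string, so this matches."
hit := pattern match: 'VWNC list <vwnc@cs.uiuc.edu>'.
"No occurrence of the address, so this does not."
miss := pattern match: 'someone@example.com'.
```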

The pragmas are run by iterating over the collection we get from

filters
   ^(Pragma allNamed: #filter: from: self class to: self class)
      sorted: [:a :b | (a argumentAt: 1) <= (b argumentAt: 1)].

If any filter returns the symbol #stop then we don't run any other filters; otherwise we keep going until the end. So, for example, I put in a filter that checks whether an email was directly addressed to one of my email addresses and, if so, stops the other filters from running and leaves the message in the Inbox. And once any filter has tagged a particular message, we move the message to the appropriate place and then stop.
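
The stop-at-the-first-#stop behaviour can be illustrated without any of the mail machinery. In this workspace sketch, plain blocks stand in for the filter methods (the real ones are found through the pragma lookup above), and all the names are invented for the example.

```smalltalk
| ran filters |
ran := OrderedCollection new.
filters := OrderedCollection new.
"Each stand-in filter records that it ran, then answers #stop or nil."
filters add: [ran add: #directlyAddressed. nil].
filters add: [ran add: #mailingList. #stop].
filters add: [ran add: #neverReached. nil].
"detect:ifNone: evaluates the filters in order and quits at the
 first one that answers #stop, so the third block is never run."
filters detect: [:each | each value == #stop] ifNone: [nil].
```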

Finally, there's moving the messages. The actual move is just a copy and delete in terms of IMAP operations.

move: messageIdentifier to: mailbox

 | result1 result2 |
 result1 := client copy: messageIdentifier to: mailbox.
 result2 := client markForDelete: messageIdentifier.

but just to make doubly sure I'm not running into trouble I put in some checking ahead of that.

moveTo: mailbox

 | checkMessage |
 "First, check that the message is what we thought it was."
 checkMessage := (client fetchMessages: (Array with: index)) first value first.
 client markAsUnSeen: (Array with: index).
 (messages at: index) first = checkMessage ifFalse: [^#stop].
 self move: (Array with: index) to: mailbox.
 Transcript cr; show: 'Moving ', index printString, ' to ', mailbox.
 message isSymbol 
     ifTrue: [Transcript cr; show: message] 
     ifFalse: [ 
  Transcript cr; show: (message from first, '   ', message subject)].
 ^#stop.

In the end, with a bit of fighting with things that are hard to debug when the right thing just doesn't happen, I got this pretty much working. It had a few issues. One is that even though I was carefully running the filters a few seconds after each fetch, there was often some delay in the filters running, so I'd have things that should get redirected to mailing lists left in the Inbox for a couple of minutes. Another was that every once in a while it'd get stuck on a message that was malformed in a new and interesting way, and I had to go look at the error processing again. Some messages did get falsely caught - there are people sending legitimate emails who used some very peculiar encodings or header formats.

The biggest issue, though, is that in the end this proved mostly unnecessary because I switched to an email client where the filters work on the Mac (Postbox) and that has a number of other advantages over Thunderbird as well. I'm still using the Smalltalk filtering. It has the advantage that it doesn't require the mail client on my main computer to be running in order for filtering to happen. But I've switched some of the most common filters to just use Postbox's filtering, mostly the ones that are for mailing lists that generate a lot of traffic. But I've still got most of my filters in Smalltalk, and nowadays the need for me to check them is pretty rare. And it was definitely an interesting experience writing it.

Thursday, July 28, 2011

OS X Lion

One thing I'm finding annoys me with Lion is the removal of a feature that it turns out I used all the time - what I'll call mini-exposé. In Snow Leopard, if you held down the mouse button on a dock icon, it would show you all the windows associated with that application. Now if you do that it gives you a text list of them, and if you want to see the windows it's a separate menu item. Especially for something like a web browser with tabs, the textual representation isn't nearly as useful. Sigh.

On the plus side, following Travis' instructions it looks like I have syntax highlighting for Smalltalk code working here.

Wednesday, June 22, 2011

Weltmeisterschaft

I'm off for the next while to see the women's football world cup in Germany, so I'm unlikely to be posting much, and if I do, it probably won't be technical. See you all after the final.

Interesting Bugs

There are lots of interesting ways for software to go wrong. Here's one recent one that we uncovered at Cincom. And I should say that by "we" I mostly mean Tom Robinson, from the Store group.

I first noticed this bug in the middle of a demo in Frankfurt. I was showing some interesting Glorp and StoreGlorp capabilities. I'd run an interesting expression in a Store workbook, inspect the result, then run another expression and get an error. Force a reconnect, and it was fine again, but it occurred several times.

Travis Griggs noticed an even odder manifestation of it in trying out some Store expressions. He wrote an expression to find the head of the 7.9 trunk for a particular package or bundle, and it ran fine, returning the expected result. Then he ran it to find the head of the 7.8 trunk, and it returned the exact same package as before, from the 7.9 trunk. Running it again produced the right answer.

Investigation revealed nothing obvious going on at the Glorp caching level. The database appeared to be genuinely returning results that were clearly incorrect for the query that was issued. Sometimes they were obviously for a different query. You might ask for a StorePackage and get back a single column that was obviously for the tw_databaseidentifier. This only happened when using Postgresql. And it apparently only happened when using the Store Workbook. Normal Store operations never showed this.

After much examination, Tom tracked down the cause, an interaction of several causes. First, when you open an inspector, there may be code specific to the type of object being inspected. In particular, the inspector shows the icon for things. The icon for packages shows differently depending if it's the version that's resident in the image or not. Asking that, especially on a completely new connection, could end up doing a database query. That's not really great, but should still work. But...

The inspectors try to be robust. If you write a printString that goes into infinite recursion, raises an exception, or otherwise doesn't return, it tries to stop it and just print an indication that it didn't work. So one of the mechanisms there is a timeout. If it takes too long, it just terminates the process. And in addition...

The Postgresql driver is written purely in Smalltalk. So there's Smalltalk code that manages a socket and the communication on it. This is in contrast to most other database drivers that call out to a C library, which may be just communicating on a socket underneath, but the protocol and details are hidden.

So, what happens is that we issue a query, and open an inspector on the resulting object(s). The inspector is the reason that this didn't affect normal Store operations. The inspector wants to know if this is the current version in the image, so it issues a query to the database. That query will run in a separate Glorp session, but because we're trying to be careful of resources, it'll re-use the underlying database connection. And there's more setup code than is really necessary that runs for each new session (or did, up until more recent builds). If the database is Postgresql, and isn't very close by on a fast machine, the inspector will time out before it gets the answer, so it will terminate the process. The database driver doesn't clean up properly when the process is terminated, so the previous results are left in a buffer. The next query that's issued on that connection will start looking for results, and get the results of the previous query out of the buffer. If the shape of those results matches what we're expecting, we'll just get a wrong answer. If the shape doesn't match, we'll get a confusing error. For example, if it's running the initial setup query to get the tw_databaseidentifier, we might see the result of that.
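
The stale-buffer effect is easy to reproduce in miniature. The following workspace sketch is a toy model, not the actual driver code: an OrderedCollection stands in for the socket buffer, and the send and receive blocks are invented for the illustration.

```smalltalk
| buffer send receive answer |
buffer := OrderedCollection new.
"Sending a query eventually puts its reply at the end of the buffer."
send := [:query | buffer addLast: 'result of: ', query].
"Receiving just takes whatever reply is first in the buffer."
receive := [buffer removeFirst].

"The inspector's query goes out, but its process is terminated
 before anyone reads the reply, so the reply stays in the buffer."
send value: 'is this version in the image?'.

"The next query on the same connection picks up the stale reply."
send value: 'head of the 7.8 trunk'.
answer := receive value.
```

Here answer holds the reply to the inspector's question rather than the 7.8 trunk query, which is the miniature version of getting the tw_databaseidentifier column back when you asked for a StorePackage.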

Fortunately, this only affects operations that use inspectors, so it has no effect on normal Store operations. And databases other than Postgresql keep the individual query results separate by themselves, so that won't happen. But it's a nice example of some very interesting interactions that normally don't show up on their own.

Friday, May 20, 2011

Checking overrides when upgrading

Here's a question that came up in the vwnc mailing list. Suppose that you have an application which has overrides of a number of system methods. I never liked the way VisualWorks uses the term "override" for replacing an existing definition, because it's ambiguous with overriding a method in a subclass. So I'll try calling them redefines here, even though it makes for a lot of backspacing.

Anyway, let's say that this was built in VisualWorks 7.6, but now you're upgrading to 7.8, and you want to check which of these methods no longer apply. That can be complicated in general, but a basic first check is if the method we were redefining changed between VisualWorks 7.6 and 7.8. That's a bit tedious to check manually, but we can script it.

Of course, this depends on how we've organized things. The original question came up in the context of having grouped redefinitions by the package that contained the thing being redefined. So, if we redefined a method in "Assets", we'd create a package "Assets patches" and put the redefinition there.

So here's an example of a workspace script for finding these. It assumes that we've published the old version of the base into our local database, and that we've loaded our code into a new image.

mySession := StoreLoginFactory currentStoreSession.
versionToUpgradeFrom := '7.6 '.
patchPackages := Store.Registry allPackages select: [:each |
    each name like: '% patches'].

So, first we need to get a StoreGlorp session which we'll use to do our queries, and we define a variable for how we'll find the old versions. We'll also need the list of patch packages that we have in the image. I used the #like: method to do the matching, which does it in SQL style, rather than the more traditional matches:, or even regular expressions, but they'd all work.

patchPackages do: [:eachPatchPackage |
    basePackageName := eachPatchPackage name readStream upTo: Character space.
    currentOverriddenPackage := Store.Registry packageNamed: basePackageName.

Now we start a loop over each of our patch packages. The first thing we need to know is what the basic package is, which we do based on the simple naming convention described above.
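As a quick check of that naming convention, the snippet below can be evaluated on its own in a workspace. The package name is just the "Assets patches" example from above, and the variable names are only for illustration.

```smalltalk
| patchName baseName |
patchName := 'Assets patches'.
"Everything before the first space is taken as the base package name."
baseName := patchName readStream upTo: Character space.
"baseName is now 'Assets'."
```

Note that this only works when the base package name itself contains no spaces. For a patch package called 'My Package patches' it would answer 'My', so a different way of stripping the suffix would be needed there.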

    baseQuery := Query readOneOf: StorePackage where: [:each |
        (each name = basePackageName) &
            (each version like: '%', versionToUpgradeFrom, '%')].
    baseQuery orderBy: [:each | each timestamp descending].
    baseVersion := mySession execute: baseQuery.

Now we do a query to find the appropriate old version. So this involves a Glorp query to find packages by that name, whose version string matches the variable we set at the beginning. If there are multiples, we take only the latest, by sorting them in descending order and just taking the first one. I'm dealing with the Cincom internal repository, so there will be lots and lots of versions of base code. In a project repository there are probably only a few, making it fairly simple to find this.

Once we've done that, we'll loop over the methods in the patch package, and get the three different versions of the method: ours, the new base image version, and the old base image version. These will be three different kinds of objects, and the APIs for manipulating them and getting them are, well, let's just say not as polymorphic as they might be. Our method will be a CompiledMethod. The new version in the image will be an OverriddenMethod. And the one we read from the database will be a StoreMethodInPackage, mapped to the database. Getting that one is a bit fussy, in that we need to make sure we're asking for the class by name, and the name should be exactly the way it will be in the database.

    eachPatchPackage methods do: [:eachMethodDescription |
        imageMethod := eachMethodDescription method.
        oldOverridden := baseVersion
            method: imageMethod selector
            forClassNamed: imageMethod mclass instanceBehavior absoluteName
            meta: imageMethod mclass isMeta.
        newOverridden := Override
            selector: imageMethod selector
            class: imageMethod mclass
            in: currentOverriddenPackage.

Finally, we get the source code for the old and the new versions and check them. If the new base version doesn't exist, then that method was either deleted or moved to another package, and we definitely need to think about our redefinition. And if the source code is different, we also want to think about it. Then there's the question of what to do if there's something to think about. We could easily write that to a file or to the Transcript, but in this case what I've done is open a simple text comparison window on the two. That could get ugly if there are a large number of them, but it's nice to work with for just a few.

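        (newOverridden isNil
            or: [oldOverridden sourceCode ~= newOverridden sourceCode])
                ifTrue: [
                    | view |
                    "If the new base version is gone, compare against empty text rather than sending sourceCode to nil."
                    view := SideBySideTextComparisonView new
                        leftText: oldOverridden sourceCode
                        rightText: (newOverridden isNil
                            ifTrue: ['']
                            ifFalse: [newOverridden sourceCode]).
                    ScheduledWindow new component: view; openWithExtent: 800@600]]].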

This would need to be tweaked for particular environments, but seems like it might be a good start towards making that sort of migration a bit easier. And if I can figure out how to get the syntax highlighting on this blog working, it might get prettier to read here.

Wednesday, May 18, 2011

Looking at the public repository easily

A quick note. Today someone was wishing there was a way to see what was in the public repository easily without having to fire up Smalltalk. And there is. There's an index of it that's searchable with Google. It's fairly rough, and it omits things that it thinks aren't interesting, including packages without comments. But it's still quite useful.

Monday, May 16, 2011

Fun with Mail (part 1)

A description of switching around with email clients, and writing some IMAP code in Smalltalk.

I've been a Eudora user for many years. I started using it back when I was a student, and mostly stuck with it, with a bit of time off using VM in Emacs as my primary client. I never liked Outlook, and since I was able to, I just kept using Eudora in preference to it for corporate email. One of the very nice things in the later versions of Eudora was the search system. The UI was slightly awkward, but it was fast and gave good results. And I keep a lot of email, so that's important to me. I'm at about 5GB right now, with some of it going as far back as 1990.

Unfortunately, Eudora has been abandonware for quite a while now. I resisted switching for a long time, but there were starting to be enough problems that it had to happen.

I wasn't sure what client I'd end up with. The various webmail solutions seemed to be out because none of them seemed to have ways for me to upload huge archives of old mail. So I figured I'd end up with an actual mail client on my machine. In order to be able to try out different ones, and also because I thought it'd be fun, I set up a small Linux server with IMAP (using dovecot) and had it fetch the messages from my various accounts to one location.

Getting the email over was a bit of an adventure. It was possible, in Eudora, to set up an account on the IMAP server, and then to drag folders over onto it, copying their contents. But that was very slow, and tended to crash. Eventually I found a Mac program called Eudora Mailbox Cleaner that can also convert the format, and I was able to use that to import both my old Eudora mail, and some other stuff that wasn't in Eudora, but was saved in more or less Unix mailbox format. It crashed a couple of times part-way through, but with a bit of babysitting I got it all converted into local folders in Thunderbird. Thunderbird would let you move stuff from local folders into IMAP as well, and it wasn't quite as slow and didn't crash quite as often.

So ultimately I had my mail converted, and was able to try out some clients. Apple's Mail was the main other one I tried, but it was missing features I wanted, so I ended up mostly using Thunderbird, particularly with the QuickFolders and Archive This extensions, which were helpful for quick filing and finding folders. But Thunderbird had some problems. The search was not particularly fast, at least not on the kind of volume of mail that I had. Worse, the filters didn't seem to work reliably. I don't know if that was just an issue with Thunderbird and IMAP, but the spam filtering wasn't moving things out of the Inbox reliably, and mailing list messages weren't reliably going into the right folders.

I decided to take this as an opportunity to write some Smalltalk code, and made myself a little cron job that would run a Smalltalk program to filter the messages. Along the way, I learned a bit about IMAP and the VisualWorks libraries for using it. So in part 2, I'll talk about some of the code that I ended up with.

Wednesday, May 4, 2011

New blog address

Hello World,
I'm Alan Knight, the engineering manager for Cincom Smalltalk and principal developer of the GLORP object-relational mapping framework for Smalltalk. I did have a blog on the Cincom Smalltalk site for a while, which I posted to, um, occasionally, but without Jim Robertson around to maintain that infrastructure I've decided to use a more conventional service. And since Blogger seems to have at least the potential for automatic Smalltalk syntax highlighting that seemed like as good a reason as any to choose it.

Once, many years ago, when I was a columnist for The Smalltalk Report someone asked me at a conference if I was "the" Alan Knight. And I really wasn't sure if I was or not. But now I know - I'm not "the" Alan Knight, but rather this is - LEGEND! I, on the other hand, am just a computer guy, live in Ottawa, play and referee soccer, and am married to Kirsten Carlson, a flute player and teacher.

Before I worked at Cincom I was with The Object People, a training and consulting company in Ottawa, Canada. They ended up best known for developing TOPLink first in Smalltalk and then in Java, and for a while I was the chief architect. TOPLink eventually ended up owned by Oracle, but TOPLink/Smalltalk is still interesting as an early example of O/R mapping software that is clearly prior art to any number of patents. For that matter, so is the ObjectLens software (which patent lawyers seem to mostly know by its very early and short-lived name of Infobase) that my employer Cincom inherited from ObjectShare/ParcPlace-Digitalk/ParcPlace and still sells today.

In this space I hope to blog about Smalltalk and programming in general, Cincom Smalltalk in particular, Glorp and other O/R mapping issues, and anything else that seems interesting enough to post. And hopefully post things with more actual content and not quite so many links. Wish me luck.