Selective cluelessness

In his OOPSLA keynote, Martin Rinard, associate professor at MIT, talked about how we should make systems more resistant to errors, rather than putting a lot of effort into trying to create error-free systems.

Professor Rinard went through the great schools of thought: rationalism, empiricism, and cluelessness (the latter perhaps not as renowned as the others, but known to be practiced by golden retrievers and various blonde people in the entertainment industry). Finding that none of the three was sufficient on its own when working with computer systems, he concluded that an approach of selective cluelessness would be the way to go.

The point of selective cluelessness was that our cognitive ability is a limited resource – our brains can only understand so much, so we have to be selective about what we focus on. Hence, we have to choose some things to have no clue about. In this respect, programming languages should focus on reducing our need to know everything that’s going on, so that we can stay focused on the problem at hand.

One of the inherent problems we face today when programming computers could be formulated like this:

  • Programs are unforgiving and must therefore be correct.
  • To make a program correct, we must completely understand the problem.
  • Programming is difficult, therefore simplicity and elegance are keys to success.
  • Unfortunately, simplicity and elegance are hard to come by.
  • To be simple and elegant, you need to know what’s going on.

As systems get larger, you have to focus on subsystems, and then you lose the ability to know what’s going on in other parts of the system.

Hence, simplicity and elegance become harder to achieve, and the systems get even more complicated.

Another point Professor Rinard made was that brute force is often the better approach. If you use brute force, there is a good chance things will work; if you try to be clever, it will all come crashing down. In practice, simplicity is a nonstarter and elegance is largely irrelevant. Applications such as Windows, Linux, and Microsoft Office are hardly simple and elegant, yet they are very successful.

In order to make better systems, Professor Rinard argued that software should be made acceptable, not necessarily correct. The cost and difficulty of developing software is roughly proportional to the degree of correctness. Hence, systems should be built so that errors are acceptable, rather than trying to make them error-free. The term he used for this was failure-oblivious computing.

The people at MIT conducted a study where they applied failure obliviousness to existing programs to test the result. They used applications such as Pine, Apache, and Sendmail, and focused on some well-known problem areas in these applications:

  • Reading/writing outside of arrays (a typical C/C++ problem)
  • Memory does not get deallocated, resulting in memory leaks (also a typical C/C++ problem)
  • Infinite loops

The researchers identified the problem areas in the applications and changed them so that they would simply ignore accesses outside an array: out-of-bounds writes were discarded, and out-of-bounds reads returned a random value. They also changed the programs to overwrite old memory slots instead of allocating new ones: they allocated k chunks of memory for the program, and when the program needed space for object k+1, it was written to slot 1 instead, overwriting whatever was there before. Furthermore, loops were not allowed to run forever; each loop was only allowed a finite number of iterations.
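
To make the idea concrete, here is a minimal sketch of my own, in Java (the MIT work rewrote C programs at the compiler level), of what failure-oblivious array access and the wrap-around memory pool might look like; the class and its behaviour are an illustration, not their implementation:

```java
import java.util.Random;

/**
 * Toy illustration of failure-oblivious computing: out-of-bounds writes are
 * silently discarded, out-of-bounds reads return an arbitrary value, and
 * "allocation" wraps around a fixed pool of k slots instead of growing.
 */
public class FailureObliviousStore {
    private final Object[] slots;
    private final Random random = new Random();
    private int next = 0;

    public FailureObliviousStore(int k) {
        slots = new Object[k];
    }

    /** Out-of-bounds writes are ignored instead of crashing the program. */
    public void write(int index, Object value) {
        if (index >= 0 && index < slots.length) {
            slots[index] = value;
        }
        // otherwise: discard the write and keep going
    }

    /** Out-of-bounds reads return some value so execution can continue. */
    public Object read(int index) {
        if (index >= 0 && index < slots.length) {
            return slots[index];
        }
        return random.nextInt(); // manufactured value, as in the study
    }

    /** Allocation never grows: object k+1 overwrites slot 1, and so on. */
    public int allocate(Object value) {
        int index = next % slots.length;
        slots[index] = value;
        next++;
        return index;
    }
}
```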

The result of the study was that the failure-oblivious versions were in fact more stable than the original versions, and still behaved as expected. The lesson to be learned was that this software could never crash; it would always continue and produce something, and that something was often good enough. In some cases, he argued, programs could just swallow exceptions and continue rather than halt. Thus, correctness was traded for stability.

Agile panel debate at OOPSLA

I attended a panel debate at OOPSLA on agile methodologies. The panelists (including Alistair Cockburn, Dave Thomas, Jutta Eckstein, Randy Miller, and Mark Striebeck) shared a lot of insightful information derived from their experiences in the field.
One question to the panel was how to introduce agile in a large corporation that relies heavily on the waterfall approach and is unwilling to change. One suggestion (from Dave Thomas) was that the development team could start focusing on continuous integration, testing, and perhaps test-driven development. As the code handed over to the QA department would have higher quality, the role of the QA department would decrease, and a more agile model would emerge. Jutta Eckstein pointed out that as long as you have no buy-in from management, you will not get the full value proposition of agile, but you can still use some of the agile tools.
Another important topic that came up was how to be agile when faced with usability issues (and, I think, user interface design). It was pointed out that experience showed that user interface development had to start before the (implementation) iterations. There was some controversy in the panel on this issue, but one point that was made was that the iterations should not focus on delivery to the customer, but on end-usage of the system, in which user interface design and usability are important components.
What is the greatest advance in agile practices in the last five years? What new things should we teach our students? To this question, Dave Thomas replied that the most important thing is testing: unit testing and acceptance testing. I very much agree with this, and I feel that we still have a long way to go, especially on acceptance testing.
All in all, the panel discussion touched on many very interesting topics:

  • Everything should go in the backlog (Dave Thomas). Projects tend to keep things out of the backlog, which makes them lose the overview and leads to large variations in velocity. Things like vacations and defect stories should go in the backlog. Management should have a backlog, too, consisting of things like risks.
  • Focus more on deliverables (Alistair Cockburn). Focus on deliverables is the key to agile processes, not iteration planning, iterations, and velocity. If you have the latter three without focus on the former, you are not being agile.
  • There is no ideal iteration length. Iteration length varies from project to project. There have been (advanced!) teams that focus on continuous delivery with no iterations at all. The most important thing is to have short feedback loops.
  • Pair programming (Dave Thomas). Pair programming means four eyes looking at the code at the same time; it does not mean two hands holding the mouse.

Aspects versus modularity

Today at OOPSLA, I sat in on a panel discussion on aspect-oriented programming. It was about whether aspects are a good thing, when to use them, and whether they can help achieve modularity in our applications. One comment, aimed more at aspect tools than at the principle of aspects, was that aspects are a fix for something that should have been included in the language or platform in the first place. Another discussion was about what exactly aspects should be used for. The people opposed to aspects challenged the advocates to name the killer apps of aspects. The best killer app they came up with was Spring, where aspects are a central component in providing an alternative enterprise container for Java.
A lot of the discussion was about whether aspects make the application code unreadable, because functionality implemented in aspects is “hidden” when looking at the code. As a conclusion, the panelists argued that aspects should never change the core application functionality. Whether this is the case when it comes to Spring, I have some doubts.
My conclusion from the panel discussion is that it is not at all clear what aspects should be used for, except maybe for logging purposes. Furthermore, I am not convinced by the argument from the aspect advocates that aspects bring something to the table that cannot be implemented in programming languages.
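
For the logging case, which seems to be the least controversial use, a minimal sketch of what it might look like with Spring’s @AspectJ-style support follows; the package name and pointcut are my own hypothetical example, and the aspect would still have to be registered with the Spring container:

```java
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

/**
 * Hypothetical logging aspect: wraps every public method in the (made-up)
 * com.example.orders.service package and logs entry, exit, and execution
 * time, without touching the core application code.
 */
@Aspect
public class LoggingAspect {

    @Around("execution(public * com.example.orders.service..*(..))")
    public Object logInvocation(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.currentTimeMillis();
        System.out.println("Entering " + joinPoint.getSignature());
        try {
            return joinPoint.proceed(); // run the actual method
        } finally {
            long elapsed = System.currentTimeMillis() - start;
            System.out.println("Leaving " + joinPoint.getSignature()
                    + " after " + elapsed + " ms");
        }
    }
}
```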

How to design a good API and why it matters

The most useful talk I have been to so far at OOPSLA was Joshua Bloch’s talk about designing APIs. I believe that many of the examples of dos and don’ts that he went through are a gold mine for most programmers, whether or not they will ever create a public API. APIs are basically interfaces, and good interfaces are key to good modularity in the code.
The points he made include:

  • When in doubt, leave it out. If you are in doubt whether a feature should be included in the API, leave it out. It’s better to add it later if you need it than to have functionality that you would like to remove but cannot, because others have started using it.
  • Boilerplate code. A clean API will not force you to write a lot of boilerplate code. My interpretation of this is that you should avoid inappropriate intimacy between your code and the client code.
  • Don’t violate the principle of least astonishment. Your functionality, behaviour, and conventions should always be the ones that astonish your users least.
  • Fail fast. Report errors to the programmer as soon as possible.
  • Avoid the fragile base class problem. If you allow users to extend your classes, document well how your methods are related. If extending the classes in the API is something you do not want users to do, make the classes final.
  • Overload with care
  • Do not use floating point to represent monetary values. The point here is that you will often not get the values and behaviour you expect.
  • Never use strings if a better option exists.
  • Use the builder pattern to reduce the number of method arguments. A method should have three or fewer arguments; with more than that, you forget what they are. Create a builder to gather the arguments into a holder class and pass that to the method (see the sketch after this list).
  • Return empty objects instead of null. For instance, take a method that returns a set of objects. If the method has nothing to return, return an empty set instead of a null reference. This way, callers do not always have to check for null before using the return value.
  • Don’t use exceptions for flow control. This is a personal “favorite” of mine. I have seen this on several occasions, and it upsets me every time.
  • Only use checked exceptions if there is some way to recover. If the system cannot recover from the error, there is no point in spending time catching the exception in your code; it just clutters the code.
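
To illustrate a couple of these points, here is a small sketch of the builder idea combined with returning an empty collection instead of null; the class and method names are made up for the example, not taken from the talk:

```java
import java.util.Collections;
import java.util.Set;

/**
 * Hypothetical example: instead of findBooks(String, String, int, ...),
 * the arguments are gathered in a builder and passed as one object.
 */
public class BookQuery {
    private final String author;
    private final String title;
    private final int maxResults;

    private BookQuery(Builder builder) {
        this.author = builder.author;
        this.title = builder.title;
        this.maxResults = builder.maxResults;
    }

    public static class Builder {
        private String author;
        private String title;
        private int maxResults = 10;

        public Builder author(String author) { this.author = author; return this; }
        public Builder title(String title)   { this.title = title; return this; }
        public Builder maxResults(int max)   { this.maxResults = max; return this; }

        /** Fail fast: reject an invalid query when it is built, not when it is used. */
        public BookQuery build() {
            if (author == null && title == null) {
                throw new IllegalStateException("Either author or title must be set");
            }
            return new BookQuery(this);
        }
    }
}

class BookService {
    /** Return an empty set instead of null when nothing is found. */
    Set<String> findBooks(BookQuery query) {
        // ... real lookup omitted ...
        return Collections.<String>emptySet();
    }
}
```

The build() method is also a natural place to fail fast on an invalid combination of arguments.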

Growing a language

My third day at OOPSLA started off with Dr. Guy Steele’s (Sun Fellow) keynote entitled “A Growable Language”. It was a very interesting talk on how Sun Research is building a new language for scientific computation that is supposed to replace FORTRAN by raising development productivity.
Although the Fortress language is not anything I expect to touch in my professional work, there were many interesting aspects in the talk, including:

  • An important design principle was to create a language that can grow over time. To accomplish this, the focus is on keeping the language and the corresponding compiler lean and relying on libraries to extend the language.
  • Support for unit testing, make (build tool), and Subversion is built into the language (or libraries, I don’t remember which).
  • It uses notation from mathematics as much as possible, since this is what mathematicians know. To find notational constructs, they studied what mathematicians wrote on whiteboards and tried to incorporate this into the syntax. Furthermore, they based the syntax on Unicode to support the special characters that mathematicians use. To make the code more readable, it can be run through LaTeX to get full mathematical notation. In this respect, Fortress seems to be a kind of domain-specific language, or DSL.

Databases and objects

Today at OOPSLA, I attended a panel discussion on databases and objects, which was basically about object databases vs. relational databases with object-relational mapping. This was a very interesting panel discussion that focused on where we are today and where the technologies are heading. Panelists included Erik Meijer from Microsoft, Patrick Linskey from BEA, Bob Walker from object database vendor GemStone Systems, Craig Russell from Sun Microsystems, and Christof Wittig from db4objects.

Why do we need two languages?

One question raised by the evangelists of object databases is why we need two languages – one for application programming and one for data manipulation. Of course, their answer is that we don’t. This is also a trend we see in object-relational mapping approaches – trying to spare application developers from having to deal with database languages, first and foremost SQL. Another example is Microsoft’s LINQ technology, which tries to bring data manipulation features into their programming languages, C# and VB.
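
As a concrete illustration of the “two languages” problem, here is a plain JDBC sketch (the connection URL, table, and column names are hypothetical) where the data manipulation lives in an embedded SQL string that the Java compiler cannot check:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class TwoLanguagesExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL and schema; driver registration omitted.
        Connection connection =
                DriverManager.getConnection("jdbc:hsqldb:mem:testdb", "sa", "");

        // Language number two: SQL embedded as an opaque string.
        // A typo in the column name is only discovered at runtime.
        String sql = "SELECT name FROM customers WHERE city = ?";
        PreparedStatement statement = connection.prepareStatement(sql);
        statement.setString(1, "Oslo");

        ResultSet results = statement.executeQuery();
        while (results.next()) {
            System.out.println(results.getString("name"));
        }
        connection.close();
    }
}
```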

Why can’t databases be as agile?

A question from the audience was why databases cannot be agile and support refactoring the way programming languages do. Supporters of object databases thought, big surprise, that object databases are a good enabler for agility. Furthermore, one of the panelists pointed out that dynamic languages are the best solution for handling change; Smalltalk, together with object databases, was given as an example that enables refactoring.

Where are we in 10 years?

The panelists were also challenged to share their thoughts on where databases will be in 10 years. Here are some of the answers:

  • Objects will be available for the application in memory at all times. (Bob Walker)
  • Data will be decentralized across the network (Linskey)
  • One central question is who owns the data (Russell)

Object databases seem like a very interesting concept, and they have been around for 20 years. So why are they not more widely used? According to one of the panelists, object databases scale very well because they enable dividing and distributing data across different servers. This made me very eager to try out object databases, to get a hands-on feeling for how they work. Furthermore, there is an open source object database out there, db4objects, that seems very interesting.

OOPSLA Workshop

I attended the 4th International Workshop on SOA & Web Services Best Practices at OOPSLA today. The workshop brings together people with theoretical and practical backgrounds with respect to SOA and Web Services.

Services versioning

One of the things discussed was versioning of (web) services, i.e. how to handle web service upgrades and discontinuance. What we found, with good help from Nico Josuttis, who attended the workshop, can be summarized in these three points:

  • Each modified service is a new service
  • Each modified data type is a new data type
  • Don’t use service types in your application

Basically, these points describe how to handle creating new versions of a service while still providing older versions to “old” clients. The point is that when you create a new version of the service, that new version gets deployed alongside the old one, and clients that are aware of it can connect to it. Old clients can still, at least for a certain period, connect to the old version. The same goes for data types, typically defined using XML Schema Definitions.
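
The third point, not using service types directly in your application, can be sketched like this (all class names are hypothetical): the types generated from each service version are translated into a stable domain type at the boundary, so a new service version only affects the adapter:

```java
/** Stable domain type used throughout the application. */
class Customer {
    final String id;
    final String name;

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Hypothetical type generated from the v1 service contract. */
class CustomerV1Dto {
    String customerId;
    String fullName;
}

/** Hypothetical type generated from the v2 contract (a new data type). */
class CustomerV2Dto {
    String customerId;
    String firstName;
    String lastName;
}

/** The only place that knows about service types; the rest of the code sees Customer. */
class CustomerAdapter {
    Customer fromV1(CustomerV1Dto dto) {
        return new Customer(dto.customerId, dto.fullName);
    }

    Customer fromV2(CustomerV2Dto dto) {
        return new Customer(dto.customerId, dto.firstName + " " + dto.lastName);
    }
}
```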
I think that versioning is a very important architectural decision that you need to make when you create services in a large organization, or between organizations, where services and clients have different owners, funding, and life-cycles.

Automated robustness testing

One of the interesting papers submitted to the workshop presented an approach for generating robustness tests for a web service from its WSDL. Basically, the WSDL was used to generate tests that could be run with a unit testing tool such as JUnit, while a tool like JCrasher generated different kinds of inputs to the web service in order to test that the service can handle them. I think this is a good idea with a lot of potential for automatically assessing the robustness of a service.
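
As I understood it, the generated tests would look roughly like the following hand-written JUnit sketch; the client class and the malformed inputs are made-up placeholders for what a generator would derive from the WSDL:

```java
import junit.framework.TestCase;

/**
 * Hypothetical robustness test: feed the service boundary values and
 * malformed input and assert that it fails gracefully instead of hanging
 * or returning garbage. A generator would derive these cases from the WSDL.
 */
public class CustomerServiceRobustnessTest extends TestCase {

    // Placeholder for a generated web service client stub.
    private final CustomerServiceClient client = new CustomerServiceClient();

    public void testNullCustomerIdIsRejected() {
        try {
            client.getCustomer(null);
            fail("Expected the service to reject a null customer id");
        } catch (IllegalArgumentException expected) {
            // graceful failure is exactly what we want
        }
    }

    public void testOverlongCustomerIdIsRejected() {
        StringBuilder longId = new StringBuilder();
        for (int i = 0; i < 100000; i++) {
            longId.append('x');
        }
        try {
            client.getCustomer(longId.toString());
            fail("Expected the service to reject an oversized customer id");
        } catch (IllegalArgumentException expected) {
            // acceptable: the service refused the input instead of crashing
        }
    }
}

/** Stand-in for the generated client; a real one would call the deployed service. */
class CustomerServiceClient {
    String getCustomer(String customerId) {
        if (customerId == null || customerId.length() > 64) {
            throw new IllegalArgumentException("Invalid customer id");
        }
        return "customer-" + customerId;
    }
}
```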

Selenium with support for cookie-management

I blogged earlier about using Selenium for security testing. One of the shortcomings I pointed out was that session handling (i.e. cookie handling) was needed, and I went ahead and created extensions for this. As of version 0.8.0, Selenium supports this out of the box. This means that Selenium is now able to test scenarios such as logging in and out of web applications, in addition to other cookie-based functionality. Great!
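
A sketch of what a cookie-aware login/logout test might look like with the Selenium RC Java client follows; the URL, locators, and cookie name are hypothetical, and the cookie-related calls reflect my reading of the release rather than a verified API, so treat them as an assumption:

```java
import com.thoughtworks.selenium.DefaultSelenium;
import junit.framework.TestCase;

/**
 * Sketch of a cookie-aware Selenium RC test. The application URL, element
 * locators, and cookie name are made up for the example.
 */
public class LoginLogoutTest extends TestCase {
    private DefaultSelenium selenium;

    protected void setUp() {
        selenium = new DefaultSelenium("localhost", 4444, "*firefox",
                "http://localhost:8080/");
        selenium.start();
    }

    public void testLoginSetsSessionCookieAndLogoutClearsIt() {
        selenium.open("/login");
        selenium.type("username", "alice");
        selenium.type("password", "secret");
        selenium.click("loginButton");
        selenium.waitForPageToLoad("30000");

        // After login, a session cookie should be present.
        assertTrue(selenium.getCookie().contains("JSESSIONID"));

        selenium.click("logoutLink");
        selenium.waitForPageToLoad("30000");

        // This sketch assumes the application deletes the session cookie on logout.
        assertFalse(selenium.getCookie().contains("JSESSIONID"));
    }

    protected void tearDown() {
        selenium.stop();
    }
}
```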