Software and I: February 2012

Sunday, February 26, 2012

Proving Pair Programming: How and Why it Works

In the first two posts in the Demystifying Agile, I wrote about how to plan release with Scrum and how to use agile estimation techniques in a Scrum project. Delving deeper into the agile software development process, there is one practice that is by far the least popular with traditional managers, and even with many developers. It is the practice of pairing on development tasks. In this post I will attempt to clear the air a bit, explaining why it works and how to make it work for you.

First, a story…

Roughly five years ago I had to rewrite a project from scratch. The details that led to that point are of lesser import, but what is important is that it was a codebase that took three developers eight months to develop and it had four months worth of ugly patches that sorely needed to be refactored. And a few breaking changes in the requirements. And I had 30 days. And two developers (myself included).

Rather than splitting the work, we decided to pair. We paired for 6-7 hours out of eight working hours every day.

The result was more robust, extensible and better performing than before.

And we did it in 20 days.

Why did it work?

From a strictly mathematical standpoint, it shouldn’t have been done. I’ve rewritten projects before, and while it often doesn’t take as much time as it originally did, it doesn’t take a lot less time. We took the work of 24 man-months, and completed it in 40 man-days.

Here are what I believe to be the reasons we achieved the results:

With two developers, equally participating on the project, we had immediate code-review done as we develop. It is a (should be) well known fact that the shorter the feedback cycle, the easier it is to act on the feedback. With near-instantaneous feedback, the improved quality is simply automatically baked in.
Two developers have two different sets of experiences, bringing in a different point of view. Any iffy idea is immediately examined. You don’t get to to sip your own cool-aid when you pair.
With another developer near you all day (or most of the day), you are more focused on your work. You don’t go off on a tangent, reading up on a new technique or gizmo, and you don’t get lost in the quagmire of email: You explicitly decide which distractions to acknowledge and which to postpone. And if you don’t, your pair will help you.
It’s much more difficult to cut corners with someone else (your pair) looking over your shoulder; your guilt will stop you, or his will. Any corners cut (or conversely, taking too long to complete a task) will only be the result of an explicit decision by both of you. Explicitness tends to cut down on occurrences.
Two-legged-distractions (a.k.a corporate-drive-by incidents, a.k.a. managers and coworkers) tend to bother you less when you’re sitting with someone else. They often go look for someone easier to single out if possible. There’s strength in numbers. If that fails, the fact that you’re already sitting with someone will give you a good valid excuse to ignore the distraction.

As you can see, these observations, my experiences, are not easily measurable. They are, however, no less real.

How to make it work for you

First, forgive me for the following copout: YMMV (Your Mileage May Vary); nothing I can write is guaranteed to work for everybody. But I do expect the following ideas to work for most developers who are willing to give it a shot:

Give it a shot. Seriously, it won’t bite. And you can always quit (though winners don’t quit, and quitters don’t win).
Don’t do it the whole day, and don’t work a long day. Pairing is very draining because it demands a focus far greater than you are likely to achieve on your own. Work for no more than 8-9 hours a day. Pair for no more than 6-7 of them. Leave an hour or so for email, meetings, easy no-brainer tasks, and other nuisances.
Pair with someone that could review your work. Offer to pair with that developer on his task after that.
Don’t hog the keyboard. Switch every now and then. Otherwise you have one developer and a bored observer who’s wasting his or her time.
Don’t declare it – just do it. Pairing is a hard sell. Results aren’t. Besides, it is usually easier to get forgiveness than it is to get permission.

I hope this post helps you add a trick to your agile bag. It worked wonders for me.

Assaf.

Technorati Tags: Agile,Pair Programming,Scrum

Sunday, February 19, 2012

Do Agile Estimation Techniques Really Account for Scrum Projects’ Successes?

After several years of experience working with, training and coaching teams on Scrum, I have no doubt that agile (empirical) estimation techniques are better than classic, predictive estimation techniques. One question that kept nagging me over time is whether or not using story points, T-shirt sizes, or ideal hours truly explains the increased success of Scrum based techniques?

It took me a while, but I finally figured it out. And the answer is a definite ‘NO’!

All Estimation Techniques are Merely Guesses

At the end of the day (or at least, at the end of the planning session), any estimation is just a guess: sometimes its more, and sometimes its less. This means that the distribution of the actual time versus the estimation should be pretty much normal: In large numbers, about half of the actual results will be less than the estimation, and half will be more. On average, according to the Law of Large Numbers, it evens up.

The better the estimation technique is, the steeper the curve will be, i.e. the closer actual results will be to the estimation.

My experience is that agile techniques are in fact better at predicting the actual results, but the fact remains that if you make enough of these predictions, then regardless of the technique, the variance should cancel itself out and the sum should settle on the average!

This means that the quality of the technique used to estimate each task has little or no impact of the success of estimating the time to complete the entire project!

Then Why Do Estimations Fail?

If estimations behave “normally” whole project estimations should not fail. I don’t buy into all that “programmers are notoriously optimistic” sentiment; I’m a developer myself and there is not a single soul alive that will call me optimistic, and yet, non-agile projects I worked on overran their deadlines, like everybody else’s. Something must be out there, that either changes the normal distribution, or invalidates the law of large numbers or both. And it has to be something that is present in classic waterfall-ish projects but not in agile ones.

It Is All About the Independence!

The main difference, in my opinion between agile and non-agile projects is how they are planned and constructed. An agile project plan is basically a stack of user stories, with no* dependencies between them, whereas a waterfall project plan is a Gantt chart describing tasks and their interdependencies.

Why Is This Important?

Let’s assume a task that has two dependencies. Each were estimated, and each have a normal distribution of the actual amount of time to complete vs. the estimated time. The following figure describes such a “plan”:

Each task has a 50% chance to be completed before time and 50% to be completed after (There is of course no statistical chance of being completed “on time”). Now, let’s say we’re interested in calculating the probability of Task C to be completed before time, or not to finish late. For the sake of ease, we’ll assume all three tasks to have the same estimated duration. For Task C to begin early, it needs both of its dependencies to complete early. Since the chance of each to complete early is 50%, the chance for both is (0.5 * 0.5) = 0.25, or 25%. The chance for this not to happen, i.e. for at least one of them to be completed late is 1-0.25 or 75%. this means that there’s a 75% for Task C to complete late, and of course it has its own probability of being late, which means that the chance of Task C completing late (and making the whole project run late) is more than 75%!

It is easy to see that the more complex a project is, the more dependencies each task has the greater the chance of being late is; The distribution of the actual duration of tasks with dependencies is a negatively skewed distribution.

Scrum User Stories Have No Dependencies

With this in mind, Scrum user stories have no* interdependencies. If two stories do have a dependency, they are should be joined to be treated as one (and if they are too big, they can and should be split again, though in a way that doesn’t create dependencies). With no dependency (or, if you wish, a single dependency that is the completion of the prior story), we get a plan that looks like this:

The chance of B starting early, which is the same chance of A ending early, is 50%. The chance of B completing early is still 50%! Even better the variance is reduced, which means that there is an increased probability of the total project duration being close to the the estimate (i.e. a steep probability distribution).

What Does This Mean?

This means, that while a project with multi-dependent tasks has an increased skew towards lateness with each added task, a project with no dependency (beyond sequence) has an increased tendency towards the mean, or the estimate.

This means that by the very nature of what we are estimating, agile projects tend to succeed (i.e. be on budget and on time), than non-agile.

Moreover, the actual estimation technique doesn’t matter – it is the estimation of independent units of work that makes the difference.

Therefore, my conclusion, and advice is that the very first change a team needs to make is to start working on independent features – call them user stories, features, or MMFs – it doesn’t matter; just keep them independent of each other!

Hope this helps,

Assaf.

Technorati Tags: Agile,Estimation,Scrum,Sprint Planning,User Story

Monday, February 6, 2012

How to Write Automated Tests for Console Applications

I don’t know about the rest of you, but I still find myself writing a lot of console applications. Most of the time these are utilities or services or some kind of customization or process automation scripts. Until recently, if I wanted to write automated tests for these, I would have to separate the business logic from the program itself and test the functions. While this is obviously a good practice for any moderately complex project, it sometimes might be overly complicated for what is needed. Moreover, this doesn’t work for black-box user acceptance tests.

The System.Console Class

In the .NET Framework console apps revolve around, well, the Console class. You get user input with Console.Read() (and ReadLine(), and ReadKey(), and so on). Conversely, you send output to the user with Console.Write() and Console.WriteLine().

So let’s take a look at the code inside the .NET Framework. By the way, I used Telerik’s JustDecompile to browse the code. If you don’t already have a decompiler of choice, I suggest you take a look at it. It’s simple, elegant and free (and no, I don’t own stock).

So, here's the code behind Console.ReadLine():

   1: public static string ReadLine()

   2: {

   3:     return Console.In.ReadLine();

   4: }

And here’s Console.WriteLine():

   1: public static void WriteLine()

   2: {

   3:     Console.Out.WriteLine();

   4: }

As it turns out, Console’s read and write methods are merely pass-through methods that call the same methods on the In and Out TextReader and TextWriter properties (again - respectively).

The text reader and writer classes, as so eloquently put in the class descriptions on MSDN, “represent a reader [and writer] that can read [and write] a sequential series of characters”. Well, that still is more helpful than some descriptions I’ve seen... Anyway, they are the abstract parents of the string and stream reader (and writer) classes. Why is this important? Well, the In and Out properties are by default set to the keyboard input and screen output streams.

Now, if only there was a way to redirect those streams…

Enter Console.SetIn() and Console.SetOut() respectively. Hmm, that’s a lot more respect than I usually show in a blog post…

With SetIn, I can redirect the In text-reader to a StringReader, and with SetOut, I redirect the output to a StringWriter.

I can now, quite easily initialize the string reader to a value that contains whichever input, or set of inputs I want to use, and the console application will read characters or lines as needed. I can then read the output of the string writer, and compare it with expected results.

And the most beautiful part of it is that there is (almost) no need to modify the console application in any way for this to work.

Example

The following example was written in C#, using MS-Test (forgive me) to test the code. The application is a simple one that plays the game 7-Boom (a local variation of the game BizzBuzz). As you will see, except for making the Program class and Main() method public in lines 1 and 3, so that I can reach the code from another (test) assembly, I did nothing I wouldn’t do in any other console application:

   1: public class Program

   2: {

   3:     public static void Main()

   4:     {

   5:         Console.Write("Enter upper limit: ");

   6:         var range = int.Parse(Console.ReadLine());

7:

   8:         for (var i = 0; i < range; i++)

   9:         {

  10:             var num = ((i % 7 == 0) || i.ToString().Contains('7')) ? "BOOM" : i.ToString();

  11:             Console.Write("{0}, ", num);

  12:         }

  13:     }

  14: }

And here is a test:

   1: // Arrange

   2: using (var sw = new StringWriter())

   3: {

   4:     using (var sr = new StringReader("100"))

   5:     {

   6:         Console.SetOut(sw);

   7:         Console.SetIn(sr);

8:

   9:         // Act

  10:         SevenBoom.Program.Main();

11:

  12:         // Assert

  13:         var result = sw.ToString();

  14:         Assert.IsFalse(result.Contains('7'));

  15:     }

  16: }

That’s it! I set the input that the user would have entered in line #7, and redirect the I/O in lines 6-7. All that remains is to assert that the output in line #14 meets the expectations.

By the way, this works exactly the same in Java as it does in C#. The difference is that in Java you will use System.setIn() and System.setOut() to set the PrintStreams (the In and Out properties used in Console.Read & Console.Write).

So now you can write test-driven (and behavior driven) console applications with greater ease.

Hope this helps,

Assaf.

Technorati Tags: Agile,BDD,C#,Java,TDD,Test,Unit Tests