Episode 1 of Deep Dive Revisited — Setting the stage with customer tests

So let’s begin our reimplementation of the sample payroll system introduced previously. Briefly, the problem stated that I had a batch payroll system that I needed written to pay the few employees that I had right now for my small company. They are all salaried as of now, but I can imagine hiring some hourly employees later.

Story number one said that, “As the owner, I want to be able to run payroll so that I can pay my employees”. Further clarification said that I have an input file that consists of single-line commands:


When reading this input file, the system should create a single output file containing all paychecks to be written to my employees for each date in the input file that is the first of the month. Having this batch payroll input file allows me, the customer, to run and rerun payroll to my heart’s content.

I also have an output file that I pass off to some other company who creates checks for me. The format of the output file looks like this:

Check|100|Frank Furter|$1000|05/01/2007
Check|101|Howard Hog|$2000|05/01/2007
Check|102|Frank Furter|$1000|06/01/2007
Check|103|Howard Hog|$2000|06/01/2007

Check is a keyword specifying that this is a check payment line, followed by a check number, the name of the person to whom the check should be written, the amount, and the paydate.

So that’s what I need. (Incidentally, as I went back and reviewed the first installment of this from 2004, I found a bug in the specifications. The check number (100-103 above) is supposed to be strictly increasing. I had it restarting with each pay period in the original write-up.) I’d like to have a command-line program that will read the given input file and create the given output file.
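
To make the output format concrete, here is a minimal sketch of how a single check line could be built. CheckLineFormatter and its members are hypothetical names used only for illustration and are not part of the design developed below. The only things taken from the spec are the field order, the $ prefix on the amount, the MM/dd/yyyy date, and the strictly increasing check number.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the design in this post: formats one
// output line and hands out strictly increasing check numbers.
public class CheckLineFormatter
{
    private int nextCheckNumber;

    public CheckLineFormatter(int firstCheckNumber)
    {
        nextCheckNumber = firstCheckNumber;
    }

    // Builds "Check|<number>|<payee>|$<amount>|<MM/dd/yyyy>".
    public string Format(string payee, int amount, DateTime payDate)
    {
        // The counter never resets, so numbers keep rising across pay periods.
        int checkNumber = nextCheckNumber++;
        return string.Format(
            CultureInfo.InvariantCulture,
            "Check|{0}|{1}|${2}|{3:MM/dd/yyyy}",
            checkNumber, payee, amount, payDate);
    }
}
```

Creating one formatter with a starting number of 100 and calling Format once per paycheck would number the checks 100, 101, 102 and so on, regardless of which pay date they belong to.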

We begin by specifying Customer Tests

One change in my development practices since 2004 is that I’ve become committed to having customer tests to define all work that I do. Since I’ve specified some behavior for my system, I’ll need to define a few customer tests to verify that the system is doing the right thing. Fortunately for us, this system is completely command-line driven, so it will be easy to write these customer tests using NUnit. Here are those tests:

using System.IO;
using NUnit.Framework;

[TestFixture]
public class ATFixture
{
    [Test]
    [Ignore("AT not implemented yet")]
    public void All_employees_are_paid_on_first_of_month()
    {
        StringReader inputs = new StringReader("Payday|01/01/2007" + System.Environment.NewLine);
        StringWriter outputs = new StringWriter();
        PayrollSystem payrollSystem = new PayrollSystem(inputs, outputs);

        // Run the whole batch against the in-memory input.
        payrollSystem.Run();

        Assert.AreEqual("Check|100|Billy Bob|$10000|01/01/2007" + System.Environment.NewLine +
                        "Check|101|Sally Jo|$20000|01/01/2007" + System.Environment.NewLine, outputs.ToString());
    }

    [Test]
    [Ignore("AT not implemented yet")]
    public void No_employees_are_paid_if_provided_date_is_not_the_first_of_the_month()
    {
        StringReader inputs = new StringReader("Payday|01/02/2007" + System.Environment.NewLine);
        StringWriter outputs = new StringWriter();
        PayrollSystem payrollSystem = new PayrollSystem(inputs, outputs);

        payrollSystem.Run();

        // The second of the month is not a payday, so nothing is written.
        Assert.AreEqual("", outputs.ToString());
    }
}

Now, I’m cheating ever so slightly here, as I’m not reading and writing files, but using StringReaders and StringWriters. I’m just assuming that I can plug in the files at a later date, and things will just work. Email me if you find this cheat offensive, and I’ll consider adding other tests that really do touch the file system. I’m reluctant to do that, because touching the file system brings its own set of worries with it:

  • Where are the files actually located at runtime?
  • Do I have read and write permission to the directory where they’re located?
  • Who is going to clean up the output file after the test is finished?
  • Touching the file system is much slower than doing things in memory

and so on.
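
As a rough illustration of why this cheat is probably safe: code written against TextReader and TextWriter does not care whether the text lives in memory or on disk. The sketch below uses a hypothetical CopyingSystem that simply copies its input to its output. It stands in for the real payroll logic, which does not exist yet, and the file paths are whatever the caller supplies.

```csharp
using System.IO;

// Hypothetical stand-in for the payroll system: it copies its input verbatim,
// which is enough to show that the caller decides where the text lives.
public static class CopyingSystem
{
    public static void Run(TextReader inputs, TextWriter outputs)
    {
        outputs.Write(inputs.ReadToEnd());
    }

    // In memory, the way the customer tests drive the system.
    public static string RunInMemory(string input)
    {
        StringWriter outputs = new StringWriter();
        Run(new StringReader(input), outputs);
        return outputs.ToString();
    }

    // Against real files; both paths are supplied by the caller.
    public static void RunOnFiles(string inputPath, string outputPath)
    {
        using (StreamReader inputs = new StreamReader(inputPath))
        using (StreamWriter outputs = new StreamWriter(outputPath))
        {
            Run(inputs, outputs);
        }
    }
}
```

Run is identical in both cases. Only the construction of the reader and writer changes, which is the part a command-line entry point would own.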

These customer tests are pure and simple State Based Tests, because it seems to me that this is the right technology to use at this level. They are state-based because the state of the system is what the user is concerned with — if I pass this into the system, this is what comes out. Makes sense to me. So I have two tests for right now, one specifying what happens on the first of the month, and another specifying what happens on any other day of the month. Between those two tests, this should characterize what this story does (absent error checking, of course, which is left as an exercise for the reader).

One final note: these tests are written assuming a very simple API for accessing our program. It is entirely possible that we’ll learn more about how we use our system as we write our programmer tests, which will necessitate us changing our customer tests as we go. This will become less and less frequent the further we go into the project, but we’re at the very start now, and our architecture is very likely to change.

Implementing our first programmer tests

And now we’re off. This is going to mark another change to my development style as practiced 3 years ago. Back then, when I started this problem, I recognized the fact that there was an input section, a processing section, and an output section. I assumed that I could get the input and output sections working in some way when I needed to, so I focused on solving the critical business problem in the middle. My thinking was that by doing this I would be fleshing out the requirements of how the inputs and outputs communicated through the middle. I could then write main after I finished the middle, and then write the input or output code and be finished.

Now, I want to write a single test that goes through the system as a whole first, and use that to tease out the important abstractions earlier than I did before. So I am going to start with an interaction-based test that flows directly from the customer tests. It will take the single PayrollSystem.Run() method and begin to flesh out how it works. This clearly becomes an act of design. My first test:

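[TestFixture]
public class PayrollSystemFixture
{
    [Test]
    public void CreateControlFlowThroughSystem()
    {
        MockRepository mocks = new MockRepository();

        IInputReader reader = mocks.CreateMock<IInputReader>();
        IOutputWriter writer = mocks.CreateMock<IOutputWriter>();
        IProcessor processor = mocks.CreateMock<IProcessor>();
        PayrollSystem payrollSystem = new PayrollSystem(reader, processor, writer);
        List<BatchInput> inputs = new List<BatchInput>();
        List<ProcessOutput> outputs = new List<ProcessOutput>();

        using (mocks.Record()) // the calls Run() is expected to make
        {
            Expect.Call(reader.ReadAllInputs()).Return(inputs);
            Expect.Call(processor.Process(inputs)).Return(outputs);
            writer.WriteAllOutputs(outputs);
        }

        using (mocks.Playback()) // the code under test
        {
            payrollSystem.Run();
        }
    }
}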

Just writing this test has already taught me something new about my PayrollSystem class. In my customer tests, I had just the TextReader and TextWriter passed into the PayrollSystem. Now, in thinking more about the problem from a design point of view, I’m finding that there are other pieces that should be passed in. This doesn’t mean that the constructor defined in the customer test is wrong, just that I may have discovered another one.

Let’s look at this test in more detail. This is an interaction based test, as I described a couple of blog postings ago. Its purpose in life is to design the interactions of this class with the classes that are its immediate neighbors. This means that we’re going to have to do some design here, and make some guesses about other interfaces and methods on them. Writing IBTs is truly an act of design 🙂 (I have a few words to say about this process and YAGNI in a moment)

I find that these kinds of tests are much easier to understand if I read them backwards. The code in the playback section of the test is the code that is being run by the test. In this case, we’re trying to design the interactions of the PayrollSystem.Run() method.

In the record section, we see what we expect this method to do. In our case, it is going to call reader.ReadAllInputs(), which is going to return inputs, which is a List<BatchInput>. These inputs will be passed along to our processor, which has a Process method that returns outputs, as a List<ProcessOutput>.

This is finally passed to the writer.WriteAllOutputs() method.

This is my first guess at a high-level design for this system, but I’m certainly open and eager to change this if needed. One thing I’m suspicious about is the way that the problem is broken down into three physically separate processing steps. I like this, because it really does keep the inter-step coupling down as much as possible, but it doesn’t quite feel right to me. When I’ve solved this problem in the past, I’ve had a main method, like Run(), that had a loop that called the input side, got a single record, processed it, and the act of processing caused the writing to happen. I guess this is the difference between a streaming design and a procedural, step-by-step design. I’m going to keep an eye out for anything that will guide me towards either of these choices as I go. For right now, I’m going with what I have written.
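
For comparison, the streaming alternative might look roughly like the sketch below: one loop that reads a record, processes it, and lets processing do the writing before the next record is read. StreamingPayroll is a hypothetical name, and its Process method is a placeholder that only echoes the command name. No real payroll rules are implied.

```csharp
using System.IO;

// Hypothetical streaming variant of Run(): read one record, process it, and
// let processing write its result before the next record is read.
public class StreamingPayroll
{
    private readonly TextReader inputs;
    private readonly TextWriter outputs;

    public StreamingPayroll(TextReader inputs, TextWriter outputs)
    {
        this.inputs = inputs;
        this.outputs = outputs;
    }

    public void Run()
    {
        string line;
        while ((line = inputs.ReadLine()) != null)
        {
            Process(line);
        }
    }

    // Placeholder processing: writes the command name found before the first
    // '|'. Real payroll behavior is deliberately left out of this sketch.
    private void Process(string line)
    {
        string command = line.Split('|')[0];
        outputs.WriteLine("Processed " + command);
    }
}
```

The trade-off is visible even at this size: the streaming version never holds the whole batch in memory, but reading, processing and writing are now interleaved inside one class instead of being three separable steps.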

Above the recording section is where you can define objects that will be used throughout the test. In our case, we’re defining our PayrollSystem and the two lists that we’ve mentioned previously.

Finally, or firstly if you look at it right-side up, we define the mock objects that we’re going to need in our system.

One thing that I’m finding that I like about writing tests this way is that I get to defer implementation decisions about things until the Last Responsible Moment. For example, to get this test to compile, I had to create three interfaces, add a single method to each of them, define empty BatchInput and ProcessOutput classes, and my PayrollSystem class. I made as few decisions about anything as possible while writing this test, and I like that. I have a feeling that I would have had to at least define something about my BatchInput and ProcessOutput classes if I had been doing a state-based test, as well as defined hand-coded stubs for my interfaces.

Let’s look at the code that implements this test.

First attempt at implementation

One complaint about interaction-based testing is that you create the same code twice, once in the expectations of the test, and again in your source code. There is definitely some truth to this. If you look at the code in the Run() method, it is exactly the same as what was in the record section of the test. I don’t think that this is a problem, though. IBTs force you to think at a really high level, forcing out high-level abstractions very early. These high-level abstractions are unlikely to change much once you get into your system a bit, so the duplication doesn’t really hurt you. I think it also helps a reader understand how data and control flow through your system, just through reading the tests.

So here is the code I wrote to make this test pass:

public class PayrollSystem
{
    private readonly IInputReader reader;
    private readonly IProcessor processor;
    private readonly IOutputWriter writer;

    // Used by the customer tests, which are still ignored; not wired up yet.
    public PayrollSystem(TextReader inputs, TextWriter outputs)
    {
    }

    public PayrollSystem(IInputReader reader, IProcessor processor, IOutputWriter writer)
    {
        this.reader = reader;
        this.processor = processor;
        this.writer = writer;
    }

    // Read everything, process it, then write everything.
    public void Run()
    {
        List<BatchInput> inputs = reader.ReadAllInputs();
        List<ProcessOutput> outputs = processor.Process(inputs);
        writer.WriteAllOutputs(outputs);
    }
}
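
The same control flow can also be checked without a mocking library, which may make the duplication argument easier to judge. The sketch below is self-contained and uses a hand-rolled RecordingCollaborator, a hypothetical fake that plays all three roles and logs each call. The interface shapes are inferred from the calls described above, so treat them as assumptions rather than the definitive declarations.

```csharp
using System.Collections.Generic;

public class BatchInput { }
public class ProcessOutput { }

// Interface shapes inferred from the calls made in Run().
public interface IInputReader { List<BatchInput> ReadAllInputs(); }
public interface IProcessor { List<ProcessOutput> Process(List<BatchInput> inputs); }
public interface IOutputWriter { void WriteAllOutputs(List<ProcessOutput> outputs); }

public class PayrollSystem
{
    private readonly IInputReader reader;
    private readonly IProcessor processor;
    private readonly IOutputWriter writer;

    public PayrollSystem(IInputReader reader, IProcessor processor, IOutputWriter writer)
    {
        this.reader = reader;
        this.processor = processor;
        this.writer = writer;
    }

    public void Run()
    {
        List<BatchInput> inputs = reader.ReadAllInputs();
        List<ProcessOutput> outputs = processor.Process(inputs);
        writer.WriteAllOutputs(outputs);
    }
}

// Hypothetical hand-rolled fake: plays all three collaborators, logs each
// call in order, and checks that the same list instances are passed along.
public class RecordingCollaborator : IInputReader, IProcessor, IOutputWriter
{
    public readonly List<string> Calls = new List<string>();
    public readonly List<BatchInput> Inputs = new List<BatchInput>();
    public readonly List<ProcessOutput> Outputs = new List<ProcessOutput>();
    public bool ArgumentsMatched = true;

    public List<BatchInput> ReadAllInputs()
    {
        Calls.Add("ReadAllInputs");
        return Inputs;
    }

    public List<ProcessOutput> Process(List<BatchInput> inputs)
    {
        Calls.Add("Process");
        ArgumentsMatched &= object.ReferenceEquals(inputs, Inputs);
        return Outputs;
    }

    public void WriteAllOutputs(List<ProcessOutput> outputs)
    {
        Calls.Add("WriteAllOutputs");
        ArgumentsMatched &= object.ReferenceEquals(outputs, Outputs);
    }
}
```

A test would pass one RecordingCollaborator as all three constructor arguments, call Run(), and then assert that Calls contains ReadAllInputs, Process, WriteAllOutputs in that order and that ArgumentsMatched is still true. The fake is noticeably longer than the mock-based expectations, which is one reason a mocking library is attractive for this style.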

What does this test NOT address?

Note that I haven’t done anything at all about date-specific processing. I don’t even have a place to put it yet. This is another one of those things I like about IBT. It seems to me that writing tests in this style is driving me towards putting responsibilities for lower level details down at a lower level. It is doing this by making it difficult for me to write tests for implementation details like date processing. I have other thoughts about how these tests are making me follow the Law of Demeter much more closely and reducing the amount of refactoring that I’m having to do as I write this code. In fact, I’m finding that I’m refactoring my code less now, typically only when I’m changing my mind about how things are structured. The IBTs create systems that are already OO in nature very early, which means that I’m not feeling the need for the post-test-working refactoring step anywhere nearly as much. We’ll see if it continues to work this way as we get further into this.

What about YAGNI?

One concern I have about fleshing out these abstractions so early in my project is that I’m not sure I need them yet. Maybe I could have driven out different, less abstract abstractions (?) by focusing on just payroll behavior, and if so, that would be my fault. But this did feel like the shape of this problem to me, so I went with it. I’m going to watch for premature generalizations throughout this exercise, in an effort to comply with YAGNI.

Next step

OK, so we have the basic high-level design of our system in place. Given our design, I can now start to pick out any one of the three major legs of the design to start working on first, since they’re really totally independent. I’m going to do the input side first, just because it’s easy, and knock that out in the next installment, within the next few days.

Please ask questions

Readers, I beg you. I’m learning this stuff, too, and I learn best when people challenge me on what I’ve done. Ask me questions if something seems strange or silly. There is every chance that I’ve made some choice that could be better, and we’ll all learn from it as we go.

Thanks for sticking with this post for so long, and I hope it was useful.

— bab

4 thoughts to “Episode 1 of Deep Dive Revisited — Setting the stage with customer tests”

  1. Hi Brian,

    Great post and looking forward to seeing how IBT work out for you. Coincidentally I’m having another go at getting my head around IBT.

    I say another, because the last time I tried a small MVP spike I became quickly disillusioned by the problem you’ve already touched on – duplicating the knowledge of interaction in both the tests and the class under test. Now I wouldn’t have had a problem with this duplication if the duplication was contained within a *single* test method. The BIG problem I started observing was too many tests having to set up expectations and/or canned values (using SetupResult.For) to satisfy interactions that precede the interaction that the current test wanted to verify. This made refactoring and adding new features that changed (added, modified, or removed) the existing interactions rather unpleasant. I tried to factor out this duplication by moving the setup of expectations into a helper expectation class. So for example I would write:



    Each of these helper methods would contain one or more expectations that satisfy what the method name specified. I then reused these factored out expectation methods as "setup" in other tests.

    At the time I gave up mid way through the spike thinking that these IBT’s are *much* more complex than the production code they are testing!!

    Over the next couple of weeks I’m going to revisit my spike and try and identify ways of avoiding this expectation duplication. For a start I’m going to see whether using a stub (_mocks.Stub<T>) will help out. Also I’m going to look to see whether the code in the Presenters is causing a problem.

    As a side issue have you thought about using Behave# (http://www.codeplex.com/BehaveSharp) to write your customer facing story tests?


  2. Good stuff.

    Regarding the test and Christian’s note about using a "stub".

    I’m a big proponent of IBT. That said, one very important distinction that did not become obvious for quite some time is how to use "expectations" and "stubs" appropriately to keep even IBTs honest on specifying "behavior" vs. "implementation".

    What is the "behavior" of your PayrollSystem vs. its "implementation"? Tough question, particularly when we’re talking in the context of an IBT! Let’s look at it from the perspective of the 3 collaborations:

    1. Use the Reader – I consider this part of the ‘implementation’. It’s used as a query object, only to help me get some data that I’ll use for later steps. It’s not contributing fundamentally to the desired "end state" of the business problem. It just happens to be what we have chosen to get the data we need to process.

    2. Use the Processor – I consider this part of the "behavior". It’s used as a command object, and actually contributes to the desired "end state" of business problem.

    3. Use the Writer – I also see this as part of the "behavior", as it’s also used as a command object in this scenario.

    In summary, I see the behavior from an IBT POV as: "get the input data, tell the Processor to process it, then tell the Writer to print the output." I see "use the Reader" only as an implementation byproduct.

    One rule of thumb I use that you may have noticed above is the "query" vs. "command" test – for the most part, I consider use of a "query object/method" as an implementation detail, and use of a "command object/method" as part of the behavior. This litmus test (no pun intended) seems to work pretty well for me.

    Back to "Expectations" vs. "Stubs" – I use "expectations" when specifying "behavior-driven" interactions, and "stubs" when specifying "implementation-driven" interactions. Do I want my tests to break if I change the implementation in a way that does not affect the behavior? I say no, thus I try as best I can to not set "expectations" for anything that is not truly part of the "behavior" of the object under test.

    Undoubtedly not a "perfect science" or a trivial thing to get your head around, but I have found paying attention to this has not only made my IBT’s more expressive ("specifying behavior" instead of "verifying implementation"), but has made them a great deal more resilient and useful in avoiding "broken windows".


  3. I think I’ll follow along — in Java, with EasyMock.

    And it certainly bothers me that my expectations…




    seem to almost be a statement of the exact method code:

    final List<BatchInput> inputs = reader.readAllInputs();

    final List<ProcessOutput> outputs = processor.process(inputs);


    If I’m writing method code in my expectations, then what’s to make me think it’s correct? Where’s the test that forces me to write my expectation code?!?

    On the other hand, maybe the problem is that we’re only dealing with one example of calling the run() method. Maybe with multiple tests, each being an example of a set of expected calls, the implementation will end up being an algorithm, while each of the tests is just an example execution path.
