TDD Dive 4 — Refactoring away External Loops

[Note: I’m using Copy as HTML to transfer code from VS.Net to BlogJet. This tool embeds style sheets into the posted blog entry to provide the nice formatting for the source code. Unfortunately, the version of .Text used by Dot Net Junkies won’t let me post the embedded style sheets. This means that viewing this article in an aggregator will look different, and worse, than viewing it on the DotNetJunkies site. I recommend viewing it there, as the code will be more easily readable than in an aggregator — bab]

This is the 4th part of an evolving series showing a TDD-built Payroll system in all its glory. Here are the previous parts:

Now, on to our show…

Jay Bazuzi had a very interesting post on the Refactoring list (membership required to read posts). Anyhow, Jay made three really good points:

Methods should be 1 line long
You should share loop code by using internal iterators and delegates
You should have no private methods

I’d like to throw my 2 cents in on these ideas:

1 line methods

I’d sure like to get to the point where I could write one line methods. I need to work harder to get to this point, but I’d like to try. Jay’s point was that the interesting thing about objects was not what each one knew how to do by itself — the interesting thing is how they collaborate with their neighbors to get something done. I’ve heard this referred to in the past as RavioliCode, not to be confused with SpaghettiCode.

A lot of the ability to do this gets down to your choice of language, as some languages (C-based) are more verbose than others (Smalltalk).

Prefer Internal Iterators and Delegates to External Iteration

I have to agree with this point. I used to do this all the time in C++, but I’ve gotten really far away from this habit in Java and C#. There are really two reasons to do this. The first is to avoid exposing your internal collection to outsiders. For example, in our Payroll example, we had the class EmployeeList, which was basically a strongly typed Employee collection.

     1: public class EmployeeList

     2: {

     3:     private ArrayList employees = new ArrayList();

4:

     5:     public void Add(string employeeName, int yearlySalary)

     6:     {

     7:         employees.Add(new Employee(employeeName, yearlySalary));

     8:     }

9:

    10:     public int Count

    11:     {

    12:         get { return employees.Count; }

    13:     }

14:

    15:     public IList Employees

    16:     {

    17:         get { return employees; }

    18:     }

    19: }

And in our Payroll class, we had our Pay method that calculated the pay for all employees.

     1: public class Payroll

     2: {

     3:     private EmployeeList employees;

4:

     5:     public Payroll(EmployeeList employees)

     6:     {

     7:         this.employees = employees;

     8:     }

9:

    10:     public IList Pay(string payDate)

    11:     {

    12:         ArrayList payRecords = new ArrayList();

    13:         foreach(Employee employee in employees.Employees)

    14:         {

    15:             employee.Pay(payDate, payRecords);

    16:         }

17:

    18:         return payRecords;

    19:     }

    20: }

The problem is that now our collection is vulnerable to changing by someone outside its encapsulation boundary, its class, EmployeeList. Exposing our naked collection like that makes it possible for someone to do bad things to it and to us.

Another issue with this is that we’re going to have this external iteration loop (lines 13-16) repeated all over our codebase, as we do different things to the collection of Employees. We can refactor this, however, to reuse the existing loop plumbing and allow our clients to specify the body of that loop.

What we can do is to create a method in EmployeeList called Apply() that takes a special delegate the we’ll write. This delegate will accept an Employee as its parameter, and the Apply() method will have the one and only one looping construct over an EmployeeList in our system. And the body of that loop will just call the delegate, passing in each Employee object in turn. Here is what it looks like in code:

     1: public class EmployeeList

     2: {

     3:     private ArrayList employees = new ArrayList();

4:

     5:     public delegate void ApplyAction(Employee employee);

6:

     7:     public void Apply(ApplyAction action)

     8:     {

     9:         foreach(Employee employee in employees)

    10:         {

    11:             action(employee);

    12:         }

    13:     }

14:

    15:     public void Add(string employeeName, int yearlySalary)

    16:     {

    17:         employees.Add(new Employee(employeeName, yearlySalary));

    18:     }

19:

    20:     public int Count

    21:     {

    22:         get { return employees.Count; }

    23:     }

    24: }

Note the new Apply() method in lines 7-13. This method does as described above — it takes a delegate and applies it to each Employee object in turn. It is the responsibility of the function to which the call is delegated to maintain what ever kind of state or context that is applicable per call. We were also able to get rid of our Employees property that exposed our collection to the world.

Here is the Payroll class after refactoring it to use this new Apply() method.

     1: public class Payroll

     2: {

     3:     private EmployeeList employees;

     4:     private ArrayList payRecords;

     5:     private string payDate;

6:

     7:     public Payroll(EmployeeList employees)

     8:     {

     9:         this.employees = employees;

    10:     }

11:

    12:     public IList Pay(string payDate)

    13:     {

    14:         this.payDate = payDate;

    15:         payRecords = new ArrayList();

16:

    17:         employees.Apply(new EmployeeList.ApplyAction(PayEmployee));

18:

    19:         return payRecords;

    20:     }

21:

    22:     private void PayEmployee(Employee employee)

    23:     {

    24:         employee.Pay(payDate, payRecords);

    25:     }

    26: }

Look at the Pay method in lines 12-20. Now, instead of having the explicit looping logic here, I have a call to the EmployeeList.Apply() method, passing a delegate to my new PayEmployee method. The end result of this is that I get rid of the looping logic at the price of another private method, PayEmployee. Even this cost is gone, however, in Whidbey, where you can have anonymous delegates. Finally, line 17 to me does not communicate its intention very well, so I did an ExtractMethod on it and created a PayAllEmployees private method. This is left as an exercise for the reader 🙂

I’m still not entirely happy with the Pay method, however, as its first two lines seem a little strange. Since I have to maintain state across calls to the PayEmployee method, I have to promote our argument, payDate, and our Collecting Parameter payRecords into member variables. This allows them to carry the state through the multiple calls to PayEmployee. I’m going to just acknowledge that I don’t like it right now, but I figure I’ll be back to clear that up later. I’m hoping future changes will inform me of what that stuff needs to turn into.

No Private Methods

Another point Jay makes is that private methods are really public methods of another, new class trying to find their way out. I actually teach this same concept, but I don’t start invoking it until a private methods gets rich enough for someone to want to test them. At that point, I advocate pulling it out into another class.

But perhaps Jay has an interesting point, and I’d like to try it. But I can’t see making all private methods into methods on other classes. I admit that I haven’t tried it as aggressively as Jay describes, so my opinion shouldn’t carry as much weight as his here, but it seems that there are some methods that are introduced just to help explain other code. I mean those one-liners that make code more easily readable. For example, the PayAllEmployees() method from above is introduced just to help make our code more readable. And the private PayEmployee() method is just an artifact of the language.

On the other hand, maybe there is something interesting going on here. If I moved the PayEmployees() method to another class, called PayrollAction or something like that, then the PayEmployee() method could come along, too. It would still remain private there, however, since it is an artifact of the language, and not strictly something we would want to write. I’m not sure that this would make things communicate better at this point, and it would give me more classes and methods. That moves me further away from The Simplest Thing That Could Possibly Work, which is always my underlying goal. I think I’ll keep this one under my hat for now, but I’m ready to spring to it should the need arise.

But what this piece of advice does to for me is to make me even more vigilant against creeping private methods. I think I’m going to be more proactive about moving them out into their own classes earlier than I have before.

Conclusion

I think I’m happier with the code now that I’ve gotten rid of the external iteration. I’ve encapsulated my collection, yet still maintained the generality I had before by introducing the Apply() method on EmployeeList. And I’ve made my Payroll class simpler, since it doesn’t have to understand as much about the internals of the EmployeeList class, which is always a good thing.

In the next episode, I promise we’ll finish this first story. Thanks for your patience, but I just keep discovering new things about this code I’ve never noticed before. By the way, this is one of the most powerful arguments for pair programming — the act of explaining your design to someone as it is taking shape leads to a better design. Just working in the same room together isn’t the same thing — you have to be intimately connected, as in Pair Programming.

— bab

Brian's Agile Musings

agile | software | cloud

TDD Dive 4 — Refactoring away External Loops