TDD Dive 4 — Refactoring away External Loops

[Note: I’m using Copy as HTML to transfer code from VS.Net to BlogJet. This tool embeds style sheets into the posted blog entry to provide the nice formatting for the source code. Unfortunately, the version of .Text used by Dot Net Junkies won’t let me post the embedded style sheets. This means that viewing this article in an aggregator will look different, and worse, than viewing it on the DotNetJunkies site. I recommend viewing it there, as the code will be more easily readable than in an aggregator — bab]

This is the 4th part of an evolving series showing a TDD-built Payroll system in all its glory. Here are the previous parts:

  1. Deep Dive into Test Driven Development
  2. TDD Dive — Part Deux
  3. Diving into TDD — Take 3

Now, on to our show…


Jay Bazuzi had a very interesting post on the Refactoring list (membership required to read posts). Anyhow, Jay made three really good points:

  • Methods should be 1 line long
  • You should share loop code by using internal iterators and delegates
  • You should have no private methods

I’d like to throw my 2 cents in on these ideas:

1 line methods

I’d sure like to get to the point where I could write one line methods. I need to work harder to get to this point, but I’d like to try. Jay’s point was that the interesting thing about objects was not what each one knew how to do by itself — the interesting thing is how they collaborate with their neighbors to get something done. I’ve heard this referred to in the past as RavioliCode, not to be confused with SpaghettiCode.

A lot of the ability to do this gets down to your choice of language, as some languages (C-based) are more verbose than others (Smalltalk).

Prefer Internal Iterators and Delegates to External Iteration

I have to agree with this point. I used to do this all the time in C++, but I’ve gotten really far away from this habit in Java and C#. There are really two reasons to do this. The first is to avoid exposing your internal collection to outsiders. For example, in our Payroll example, we had the class EmployeeList, which was basically a strongly typed Employee collection.

     1: public class EmployeeList
     2: {
     3:     private ArrayList employees = new ArrayList();
     4:  
     5:     public void Add(string employeeName, int yearlySalary)
     6:     {
     7:         employees.Add(new Employee(employeeName, yearlySalary));
     8:     }
     9:  
    10:     public int Count
    11:     {
    12:         get { return employees.Count; }
    13:     }
    14:  
    15:     public IList Employees
    16:     {
    17:         get { return employees; }
    18:     }
    19: }

And in our Payroll class, we had our Pay method that calculated the pay for all employees.

     1: public class Payroll
     2: {
     3:     private EmployeeList employees;
     4:  
     5:     public Payroll(EmployeeList employees)
     6:     {
     7:         this.employees = employees;
     8:     }
     9:  
    10:     public IList Pay(string payDate)
    11:     {
    12:         ArrayList payRecords = new ArrayList();
    13:         foreach(Employee employee in employees.Employees)
    14:         {
    15:             employee.Pay(payDate, payRecords);
    16:         }
    17:  
    18:         return payRecords;
    19:     }
    20: }

The problem is that now our collection is vulnerable to changing by someone outside its encapsulation boundary, its class, EmployeeList. Exposing our naked collection like that makes it possible for someone to do bad things to it and to us.

Another issue with this is that we’re going to have this external iteration loop (lines 13-16) repeated all over our codebase, as we do different things to the collection of Employees. We can refactor this, however, to reuse the existing loop plumbing and allow our clients to specify the body of that loop.

What we can do is to create a method in EmployeeList called Apply() that takes a special delegate the we’ll write. This delegate will accept an Employee as its parameter, and the Apply() method will have the one and only one looping construct over an EmployeeList in our system. And the body of that loop will just call the delegate, passing in each Employee object in turn. Here is what it looks like in code:

     1: public class EmployeeList
     2: {
     3:     private ArrayList employees = new ArrayList();
     4:  
     5:     public delegate void ApplyAction(Employee employee);
     6:  
     7:     public void Apply(ApplyAction action)
     8:     {
     9:         foreach(Employee employee in employees)
    10:         {
    11:             action(employee);
    12:         }
    13:     }
    14:  
    15:     public void Add(string employeeName, int yearlySalary)
    16:     {
    17:         employees.Add(new Employee(employeeName, yearlySalary));
    18:     }
    19:  
    20:     public int Count
    21:     {
    22:         get { return employees.Count; }
    23:     }
    24: }

Note the new Apply() method in lines 7-13. This method does as described above — it takes a delegate and applies it to each Employee object in turn. It is the responsibility of the function to which the call is delegated to maintain what ever kind of state or context that is applicable per call. We were also able to get rid of our Employees property that exposed our collection to the world.

Here is the Payroll class after refactoring it to use this new Apply() method.

     1: public class Payroll
     2: {
     3:     private EmployeeList employees;
     4:     private ArrayList payRecords;
     5:     private string payDate;
     6:  
     7:     public Payroll(EmployeeList employees)
     8:     {
     9:         this.employees = employees;
    10:     }
    11:  
    12:     public IList Pay(string payDate)
    13:     {
    14:         this.payDate = payDate;
    15:         payRecords = new ArrayList();
    16:  
    17:         employees.Apply(new EmployeeList.ApplyAction(PayEmployee));
    18:  
    19:         return payRecords;
    20:     }
    21:  
    22:     private void PayEmployee(Employee employee)
    23:     {
    24:         employee.Pay(payDate, payRecords);
    25:     }
    26: }

Look at the Pay method in lines 12-20. Now, instead of having the explicit looping logic here, I have a call to the EmployeeList.Apply() method, passing a delegate to my new PayEmployee method. The end result of this is that I get rid of the looping logic at the price of another private method, PayEmployee. Even this cost is gone, however, in Whidbey, where you can have anonymous delegates. Finally, line 17 to me does not communicate its intention very well, so I did an ExtractMethod on it and created a PayAllEmployees private method. This is left as an exercise for the reader 🙂

I’m still not entirely happy with the Pay method, however, as its first two lines seem a little strange. Since I have to maintain state across calls to the PayEmployee method, I have to promote our argument, payDate, and our Collecting Parameter payRecords into member variables. This allows them to carry the state through the multiple calls to PayEmployee. I’m going to just acknowledge that I don’t like it right now, but I figure I’ll be back to clear that up later. I’m hoping future changes will inform me of what that stuff needs to turn into.

No Private Methods

Another point Jay makes is that private methods are really public methods of another, new class trying to find their way out. I actually teach this same concept, but I don’t start invoking it until a private methods gets rich enough for someone to want to test them. At that point, I advocate pulling it out into another class.

But perhaps Jay has an interesting point, and I’d like to try it. But I can’t see making all private methods into methods on other classes. I admit that I haven’t tried it as aggressively as Jay describes, so my opinion shouldn’t carry as much weight as his here, but it seems that there are some methods that are introduced just to help explain other code. I mean those one-liners that make code more easily readable. For example, the PayAllEmployees() method from above is introduced just to help make our code more readable. And the private PayEmployee() method is just an artifact of the language.

On the other hand, maybe there is something interesting going on here. If I moved the PayEmployees() method to another class, called PayrollAction or something like that, then the PayEmployee() method could come along, too. It would still remain private there, however, since it is an artifact of the language, and not strictly something we would want to write. I’m not sure that this would make things communicate better at this point, and it would give me more classes and methods. That moves me further away from The Simplest Thing That Could Possibly Work, which is always my underlying goal. I think I’ll keep this one under my hat for now, but I’m ready to spring to it should the need arise.

But what this piece of advice does to for me is to make me even more vigilant against creeping private methods. I think I’m going to be more proactive about moving them out into their own classes earlier than I have before.

Conclusion

I think I’m happier with the code now that I’ve gotten rid of the external iteration. I’ve encapsulated my collection, yet still maintained the generality I had before by introducing the Apply() method on EmployeeList. And I’ve made my Payroll class simpler, since it doesn’t have to understand as much about the internals of the EmployeeList class, which is always a good thing.

In the next episode, I promise we’ll finish this first story. Thanks for your patience, but I just keep discovering new things about this code I’ve never noticed before. By the way, this is one of the most powerful arguments for pair programming — the act of explaining your design to someone as it is taking shape leads to a better design. Just working in the same room together isn’t the same thing — you have to be intimately connected, as in Pair Programming.

— bab

Different keyboards promote different priorities

Scott Densmore and Peter Provost are both keyboard junkies. They live their lives finding new keyboard shortcuts to do stuff. If it doesn’t have a shortcut for it, they ain’t gonna use it.

On the other hand, I don’t have an obsession with keyboard shortcuts. I have no problem using a mouse, right clicks, etc, to get the job done. And it is not because I am in love with the WIMP style of interfaces. Heck, I’m an old-time Emacs user, and can contort my fingers into any position necessary to hit Ctrl-x4b<ret> to make something happen.

I think its because of our keyboards. Scott and Peter both use desktop machines and standard keyboards. I use a laptop exclusively. My thinkpad has that little pointy stick in the middle and has mouse buttons I can reach with my thumbs. So, to use the mouse, I never have to move my fingers off the home row. They have to take their hand off their keyboard and put it on the mouse to use it.

Just a (semi) interesting observation that I made.

— bab

 

Speaking at .Net Users Group in St. Louis, 11/29

For those of you interested, and in St. Louis, I’ll be speaking at the .Net Users Group meeting this month. My topic will be “Refactoring: Adding Design to your Code After It Is Written”. My underlying intent in this talk is to convince people that refactoring is a worthwhile activity, as it adds value to your code.

I intend to walk through some simple refactorings to improve a piece of procedural code and begin to flesh out its dumb data objects into living, breathing OO classes.

Outline:

  1. Introduction of me (boring!)
  2. Imagine yourself walking into work and being asked to make a change to some code you’ve never seen before…
  3. Maybe if we improve it a bit, we can read it better…
  4. Good, now I can read it, maybe I can start thinking about changing it
  5. This is refactoring!
  6. New buzzword, not reworking. Can stop in middle
  7. Let’s improve the code a bit more and see where that goes…
  8. Why can I do this? (tests, better code, etc)
  9. These are my tools (brain, patterns, refactoring book)
  10. Here are the reasons why I can’t do this…
  11. What else can we improve in the code
  12. Other resources
  13. GIVEAWAY!!!

We have 3 signed copies of Jim Newkirk’s Test Driven Development in Microsoft .Net and may have a copy or two of the Resharper refactoring tool to give away.

Hope to see lots of you there.

— bab

 

The Psychology of the Extreme Programmer

There has been an interesting thread going on in the Extreme Programming Yahoo group. I haven’t read the whole thing, but I’ve cherry picked out some good posts from the list.

Steve Gordon posted one that I find particularly interesting. He posits:

I am finding that the most productive pairings are when the knowledge of the more junior developer is not a proper subset of the knowledge of the more senior developer, but complements it in some way. For example, when the junior developer is more familiar with a new domain or a new technology, the senior developer may still dominate the design, but the junior developer can make a contribution, which then creates a true partnership that allows each of them to learn from the other. The junior then picks up the more general development and design skills.

Therefore, when starting a new project, I would recommend selecting senior developers based on general skills rather than knowledge of the specific domain or technologies and selecting junior developers based on knowledge the specific domain or technology (as well as growth potential, of course). This recommendation is the exact opposite of what most people do.

I find this advice very interesting, and as he says, completely contradictory to what most projects do.

While I’ve always believed that knowledge of a specific technology is important, knowledge of how to build software is even more important. And this knowledge is much harder to find. But its the engine that makes projects go.

I’m not running down technology knowledge at all, as I know that’s critical for a project’s success — I just believe it is a lot easier to find someone well versed in J2EE or ASP.Net, for example, than it is to get someone who understands the tradeoffs that have to be made when designing and implementing a system. And someone who can gently mentor someone into being a more senior developer.

This fits in well with the profile of an expert agile developer as being someone who is really, really good at a lot of tasks, but perhaps not expert at any of them. What they have developed is the ability to pick up new technical topics as needed, very quickly. But they can always apply their software knowledge to any task, integrating the pieces into a coherent whole.

Companies would do a lot better to hire a bunch of these expert generalists and subsidize them with a sprinkling of young bucks, to bring the new tech expertise and to be groomed as the next generation of experts.

— bab

UPDATED — to fix formatting under IE. It read just fine in BlogJet before posting, and it reads just fine in Firefox, but IE runs the quoted stuff all into one line. Readjusted the BlogJet-produced HTML to make it flow properly. Thanks for the comment, Mark

Diving into TDD — Take 3

[Note: I’m using Copy as HTML to transfer code from VS.Net to BlogJet. This tool embeds style sheets into the posted blog entry to provide the nice formatting for the source code. Unfortunately, the version of .Text used by Dot Net Junkies won’t let me post the embedded style sheets. This means that viewing this article in an aggregator will look different, and worse, than viewing it on the DotNetJunkies site. I recommend viewing it there, as the code will be more easily readable than in an aggregator — bab]

Well, we’ve written two tests so far, and we’ve refactored this thing to within an inch of its life. We have a bit more to go before we get the Payroll portion of this first story finished up, after which we have to deal with reading the input batch file and writing the output check list, and also write a main. I’m much less concerned about those functions, because I know I can handle them. They’re part of the plumbing that makes the program work, rather than the meat of the program, so I generally leave those kinds of things for last.

As of now, we can pay one person, as long as it is always the first of the month. We should probably expand on both of those things, so that we can pay people or not, based on the day of the month, and we should also try to pay more than one person. Going back to our test list, those are the two remaining tests for this class.

Paying multiple employees

I’m going to choose to implement paying multiple people next. As a simplifying assumption, I’m always going to pass in a valid pay date when calling Payroll.Pay, since I haven’t implemented the logic to make paydays conditional on date yet. This is a pretty standard trick when implementing a system using TDD — You know you have some complex conditional behavior to be implemented soon, but you haven’t implemented it yet. What you can do is to write your tests now to reflect that condition always being true. This lets you make progress now, and also provides a safety net to prove that your code still works for the true case later, when you do implement the conditional.

So, as our first step, we need to write a test that tries to pay multiple people. That test might look like this:

     1: [Test]
     2: public void PayAllEmployeesOnFirstOfMonth()
     3: {
     4:     EmployeeList employees = new EmployeeList();
     5:     employees.Add("Bill", 1200);
     6:     employees.Add("Tom", 2400);
     7:  
     8:     Payroll payroll = new Payroll(employees);
     9:  
    10:     IList payRecords = payroll.PayAll("05/01/2004");
    11:  
    12:     Assert.AreEqual(2, payRecords.Count);
    13:  
    14:     PayRecord billsPay = payRecords[0] as PayRecord;
    15:     Assert.AreEqual("Bill", billsPay.Name);
    16:     Assert.AreEqual(100, billsPay.Pay);
    17:  
    18:     PayRecord tomsPay = payRecords[1] as PayRecord;
    19:     Assert.AreEqual("Tom", tomsPay.Name);
    20:     Assert.AreEqual(200, tomsPay.Pay);
    21: }

It is very similar to the previous tests. The only real differences are that we add two employees to the EmployeeList, and we have a new method, seen on Line 10, called Payroll.PayAll. This is a little tool-centric and language-imposed trick I’m using here. The existing Payroll.Pay method takes a single date and returns a single PayRecord. In an ideal world, I’d really like to just change that method to return an IList containing PayRecords, but language rules don’t let us override a method purely on return type. So, to have another Pay-like method that returns an IList of PayRecords, I have to have a differently named method.

I do have other options, rather than creating this newly named method.

  1. I could write the test to use the IList-returning Payroll.Pay method. This is where I eventually intend to be, after all. My objection to this is that it would require me to change multiple tests all at the same time to get this to compile and work. That’s nowhere near the smallest step I could make, and I’d rather find another path.
  2. I could comment out this newest test and refactor the previous tests to use the new version of the Payroll.Pay method. To do this the right way, I’d use a ChangeMethodSignature refactoring, which would involve these steps:
    1. Create new method with signature I want.
    2. Make the old method call the new method
    3. Retarget all calls to the old method to call the new one

    The problem with this is that to do step 1, we’d have to override the existing method with a new method differing only in return type, getting us back to right where we are now with the PayAll method, at least temporarily.

  3. We could proceed as we are now, make PayAll work, then convert all existing tests to call PayAll, one at a time. Once that is finished, no one should be using the existing Pay method, so we can get rid of it, and rename the PayAll method back to Pay, as we wanted. The reason I’ll choose this path is that my toolset (ReSharper) is really good at things like RenameMethod, so I can do that easily.

So at this point, I’ll go ahead and make the above test work as written. The first thing I need to do is to expand the EmployeeList class to actually have a list of employees, instead of just one. This is a simple refactoring, involving replacing the scalar Employee object with an ArrayList of employees, and adapting the existing methods to use the first item in the ArrayList as the single employee previously defined in the class. After doing this, I reran all my tests to confirm that all tests other than the one I’m implementing now still work, and they did. Here is the new code, along with a new method, Employees, that returns the entire list of employees:

     1: public class EmployeeList
     2: {
     3:     private ArrayList employees = new ArrayList();
     4:  
     5:     public void Add(string employeeName, int yearlySalary)
     6:     {
     7:         employees.Add(new Employee(employeeName, yearlySalary));
     8:     }
     9:  
    10:     public int Count
    11:     {
    12:         get { return employees.Count; }
    13:     }
    14:  
    15:     public Employee Employee
    16:     {
    17:         get { return employees[0] as Employee; }
    18:     }
    19:  
    20:     public IList Employees
    21:     {
    22:         get { return employees; }
    23:     }
    24: 

Now the Payroll.PayAll method:

     1: public IList PayAll(string payDate)
     2: {
     3:     ArrayList payRecords = new ArrayList();
     4:     foreach(Employee employee in employees.Employees)
     5:     {
     6:         payRecords.Add(employee.Pay(payDate));
     7:     }
     8:  
     9:     return payRecords;
    10: }

All tests work at this point, so its time for some refactoring. Let’s start by changing the first test to use the new PayAll method:

     1: [Test]
     2: public void NoOnePaidIfNoEmployees()
     3: {
     4:     EmployeeList employees = new EmployeeList();
     5:     Payroll payroll = new Payroll(employees);
     6:  
     7:     IList payrollOutput = payroll.PayAll("05/01/2004");
     8:  
     9:     Assert.AreEqual(0, payrollOutput.Count);
    10: }

And now, the last test to change:

     1: [Test]
     2: public void PayOneEmployeeOnFirstOfMonth()
     3: {
     4:     EmployeeList employees = new EmployeeList();
     5:     employees.Add("Bill", 1200);
     6:  
     7:     Payroll payroll = new Payroll(employees);
     8:  
     9:     IList payrollOutput = payroll.PayAll("05/01/2004");
    10:  
    11:     Assert.AreEqual(1, payrollOutput.Count);
    12:             
    13:     PayRecord billsPay = payrollOutput[0] as PayRecord;
    14:     Assert.AreEqual("Bill", billsPay.Name);
    15:     Assert.AreEqual(100, billsPay.Pay);
    16: }

And now, the old Pay method is no longer used, so I can go ahead and get rid of it, followed by a RenameMethod on PayAll to Pay, and we’re right where we wanted to be.

What if it’s not the first of the month?

Our original spec said that we shouldn’t pay an employee if payroll gets run on any day but the first of the month. We’ve avoided coding that so far, but we kind of need that to complete the Payroll class. So, let’s write a test. First question is, now that we’re thinking about day of the month, do we need to write any tests to confirm that we can pay on the first? Well, all the tests that we’ve written previously have assumed that the date was the first of the month, so each of those test cases should still function. That’s our proof that we can pay if we need to. Now we need to not pay if we don’t need to. Here is that test:

     1: [Test]
     2: public void NoEmployeePaidIfNotFirstOfMonth()
     3: {
     4:     EmployeeList employees = new EmployeeList();
     5:     employees.Add("Johnny", 3600);
     6:  
     7:     Payroll payroll = new Payroll(employees);
     8:     IList emptyList = payroll.Pay("05/02/2004");
     9:  
    10:     Assert.AreEqual(0, emptyList.Count);
    11: }

Run it, watch it fail, and now implement this functionality. Now the question is where should it go? Here is one implementation of it:

     1: public IList Pay(string payDate)
     2: {
     3:     ArrayList payRecords = new ArrayList();
     4:     foreach(Employee employee in employees.Employees)
     5:     {
     6:         if(employee.IsPayday(payDate))
     7:         {
     8:             payRecords.Add(employee.Pay());                  
     9:         }
    10:     }
    11:  
    12:     return payRecords;
    13: }

This would certainly work, and it is a solution I see a lot, but there is a philosophical issue with it. There is this OO design principle called Tell, Don’t Ask. What it says is that you shouldn’t ask an object a question about its state and then do something based on the response. If you think about it, that’s really exposing something about the internals of the object you’re talking to. The whole point of objects is that they are supposed to encapsulate data and behavior. If we let them do this, we end up with an interface that is made up of interesting methods, each of which has some meaning and context. If we don’t follow this rule, we end up with a lot of dumb data classes and a few god classes in our system that hold all the interesting behavior. I say, move the behavior and data to the same place, and get rid of simple accessor/setter methods!

To implement this change following Tell, Don’t Ask required a few more changes than the original change. First of all, the Employee.Pay method is changed:

     1: public PayRecord Pay(string payDate)
     2: {
     3:     if(IsPayday(payDate))
     4:     {
     5:         PayRecord payRecord = new PayRecord(Name, CalculatePay());
     6:         return payRecord;
     7:     }
     8:     else return null;
     9: }

And the Payroll.Pay method has to be changed as well. We need to look out for the null PayRecord coming back from the Employee.Pay method, and only add the returned PayRecord to our collection of PayRecords if it is not null. I’m not so sure this code looks any better than the first attempt we made by exposing the Employee.IsPayday method as public… 

The real solution to this dilemma is to use another pattern, called a CollectingParameter, to get rid of the null checking. Basically, if you hold your head just right, calling a method on another object and getting back a value, and doing something different based on whether that value is null or not, is really kind of a Tell, Don’t Ask violation. The same logic that is being done in the calling method through a check for null could realy be moved into the called method, provided the called method has someplace to put the result. That someplace is the CollectingParameter. This is how the refactoring turns out…

Inside Payroll.Pay, we get back to a very simple interface, with no extra state-based code:

     1: public IList Pay(string payDate)
     2: {
     3:     ArrayList payRecords = new ArrayList();
     4:     foreach(Employee employee in employees.Employees)
     5:     {
     6:         employee.Pay(payDate, payRecords);
     7:     }
     8:  
     9:     return payRecords;
    10: }

Note how we create our CollectingParameter payRecords at the top of the Pay method and pass it into the Employee.Pay method each time it is called. This allows the logic about whether or not to add a PayRecord into the Employee class, which is where the behavior that controls Pay logic lives. This is a better distribution of responsibilities. Here is the Employee.Pay code:

     1: public void Pay(string payDate, IList payRecords)
     2: {
     3:     if(IsPayday(payDate))
     4:     {
     5:         PayRecord payRecord = new PayRecord(Name, CalculatePay());
     6:         payRecords.Add(payRecord);
     7:     }
     8: }

And finally, here is all the code, as far as we’ve implemented so far. We’re still left with reading and writing to and from our input and output files, but we’ll cover that next time. There are some interesting design decisions in there that could spark some good conversation (if anyone is actually reading this!)

     1: using System.Collections;
     2: using NUnit.Framework;
     3:  
     4: namespace PayrollExercise
     5: {
     6:     [TestFixture]
     7:     public class PayrollFixture
     8:     {
     9:         [Test]
    10:         public void NoOnePaidIfNoEmployees()
    11:         {
    12:             EmployeeList employees = new EmployeeList();
    13:             Payroll payroll = new Payroll(employees);
    14:  
    15:             IList payrollOutput = payroll.Pay("05/01/2004");
    16:  
    17:             Assert.AreEqual(0, payrollOutput.Count);
    18:         }
    19:  
    20:         [Test]
    21:         public void PayOneEmployeeOnFirstOfMonth()
    22:         {
    23:             EmployeeList employees = new EmployeeList();
    24:             employees.Add("Bill", 1200);
    25:  
    26:             Payroll payroll = new Payroll(employees);
    27:  
    28:             IList payrollOutput = payroll.Pay("05/01/2004");
    29:  
    30:             Assert.AreEqual(1, payrollOutput.Count);
    31:             
    32:             PayRecord billsPay = payrollOutput[0] as PayRecord;
    33:             Assert.AreEqual("Bill", billsPay.Name);
    34:             Assert.AreEqual(100, billsPay.Pay);
    35:         }
    36:  
    37:         [Test]
    38:         public void PayAllEmployeesOnFirstOfMonth()
    39:         {
    40:             EmployeeList employees = new EmployeeList();
    41:             employees.Add("Bill", 1200);
    42:             employees.Add("Tom", 2400);
    43:  
    44:             Payroll payroll = new Payroll(employees);
    45:  
    46:             IList payRecords = payroll.Pay("05/01/2004");
    47:  
    48:             Assert.AreEqual(2, payRecords.Count);
    49:  
    50:             PayRecord billsPay = payRecords[0] as PayRecord;
    51:             Assert.AreEqual("Bill", billsPay.Name);
    52:             Assert.AreEqual(100, billsPay.Pay);
    53:  
    54:             PayRecord tomsPay = payRecords[1] as PayRecord;
    55:             Assert.AreEqual("Tom", tomsPay.Name);
    56:             Assert.AreEqual(200, tomsPay.Pay);
    57:         }
    58:  
    59:         [Test]
    60:         public void NoEmployeePaidIfNotFirstOfMonth()
    61:         {
    62:             EmployeeList employees = new EmployeeList();
    63:             employees.Add("Johnny", 3600);
    64:  
    65:             Payroll payroll = new Payroll(employees);
    66:             IList emptyList = payroll.Pay("05/02/2004");
    67:  
    68:             Assert.AreEqual(0, emptyList.Count);
    69:         }
    70:     }
    71:  
    72:     public class Payroll
    73:     {
    74:         private EmployeeList employees;
    75:  
    76:         public Payroll(EmployeeList employees)
    77:         {
    78:             this.employees = employees;
    79:         }
    80:  
    81:         public IList Pay(string payDate)
    82:         {
    83:             ArrayList payRecords = new ArrayList();
    84:             foreach(Employee employee in employees.Employees)
    85:             {
    86:                 employee.Pay(payDate, payRecords);
    87:             }
    88:  
    89:             return payRecords;
    90:         }
    91:     }
    92:  
    93:     public class Employee
    94:     {
    95:         private string name;
    96:         private int salary;
    97:  
    98:         public Employee(string name, int yearlySalary)
    99:         {
   100:             this.name = name;
   101:             this.salary = yearlySalary;
   102:         }
   103:  
   104:         private string Name
   105:         {
   106:             get { return name; }
   107:         }
   108:  
   109:         private int CalculatePay()
   110:         {
   111:             return salary/12;
   112:         }
   113:  
   114:         public void Pay(string payDate, IList payRecords)
   115:         {
   116:             if(IsPayday(payDate))
   117:             {
   118:                 PayRecord payRecord = new PayRecord(Name, CalculatePay());
   119:                 payRecords.Add(payRecord);
   120:             }
   121:         }
   122:  
   123:         private bool IsPayday(string payDate)
   124:         {
   125:             string [] dateParts = payDate.Split(new char[] {'/'});
   126:             return dateParts[1] == "01";
   127:         }
   128:     }
   129:  
   130:     public class PayRecord
   131:     {
   132:         private string name;
   133:         private int pay;
   134:  
   135:         public PayRecord(string name, int pay)
   136:         {
   137:             this.name = name;
   138:             this.pay = pay;
   139:         }
   140:  
   141:         public string Name
   142:         {
   143:             get { return name; }
   144:         }
   145:  
   146:         public int Pay
   147:         {
   148:             get { return pay; }
   149:         }
   150:     }
   151:  
   152:     public class EmployeeList
   153:     {
   154:         private ArrayList employees = new ArrayList();
   155:  
   156:         public void Add(string employeeName, int yearlySalary)
   157:         {
   158:             employees.Add(new Employee(employeeName, yearlySalary));
   159:         }
   160:  
   161:         public int Count
   162:         {
   163:             get { return employees.Count; }
   164:         }
   165:  
   166:         public Employee Employee
   167:         {
   168:             get { return employees[0] as Employee; }
   169:         }
   170:  
   171:         public IList Employees
   172:         {
   173:             get { return employees; }
   174:         }
   175:     }
   176: }

Happiness is…

Spambayes.

A long time ago, in a blog far, far away, I pined for something to free me from this prison of spam. My good friend, Peter Provost, turned me onto Spambayes.

This single tool has given me back my inbox. I’ve been targeted these past few days by a particularly nasty spam that has been sent to me over 300 times in the last 24 hours. Spambayes caught each and every one of them, and I didn’t even know they were there until I went back to look at the spam that had been filtered for me.

Ahhh, heaven 🙂 I have my inbox back again!

Here are the stats over the past week or so:

SpamBayes has processed 7238 messages –
1705 (24%) good, 4568 (63%) spam and 422 (5%) unsure.
1619 messages were manually classified as good (0 were false positives).
5249 messages were manually classified as spam (99 were false negatives).
2 unsure messages were manually identified as good, and 415 as spam.

— bab

Now playing: Dream Theater – Six Degrees Of Inner Turbulence – The Glass Prison

TDD Dive — Part Deux

When last I left you, we had just about started implementing a simple payroll system using TDD. We had written two tests and done some refactoring, but there were still some code smells that I didn’t care for that we were going to look at before proceeding.

This is the current Payroll.Pay method:

     1: public string Pay(string payDate)
     2: {
     3:     if(employees.Count == 0) return null;
     4:     Employee theEmployee = employees.Employee;
     5:     return "Check,100," + theEmployee.Name + ",$" + theEmployee.CalculatePay() + "," +payDate;
     6: }

My biggest problems here are that this method, inside the Payroll class, knows how to dip inside the Employee class to get names and pay amounts, and it know how to format the output according to what our check writing company needs. Let’s try to fix both of these, and see where this takes us.

My first instinct is that we could move this behavior inside Employee. This would remove the knowledge of the primitive accessors from the Payroll class, and would move all logic about creating pay information inside the Employee. This leaves us with:

     1: public string Pay(string payDate)
     2: {
     3:     if(employees.Count == 0) return null;
     4:     Employee theEmployee = employees.Employee;
     5:     return Pay(payDate, theEmployee);
     6: }
     7:   
     8: private string Pay(string payDate, Employee theEmployee)
     9: {
    10:     return "Check,100," + theEmployee.Name + ",$" + theEmployee.CalculatePay() + "," +payDate;
    11: }

Step one was to do an ExtractMethod refactoring to move the creation of the string into its own method. Now that its here, we can look at it more closely, and see if we can’t figoure out where the code really wants to be. It sure looks like all the data that this method is using comes from the Employee class. This is a code smell called FeatureEnvy, where code that exists in one place is accessing private parts of another class, and the code really wants to be in that other class. Let’s do a MoveMethod and move it there…

     1: private class Payroll
     2: {
     3:     private EmployeeList employees;
     4:
     5:     public Payroll(EmployeeList employees)
     6:     {
     7:         this.employees = employees;
     8:     }
     9:  
    10:     public string Pay(string payDate)
    11:     {
    12:         if (employees.Count == 0) return null;
    13:         Employee theEmployee = employees.Employee;
    14:         return theEmployee.Pay(payDate);
    15:     }
    16: }
    17:  
    18: private class Employee
    19: {
    20:     private string name;
    21:     private int salary;
    22:  
    23:     public Employee(string name, int yearlySalary)
    24:     {
    25:         this.name = name;
    26:         this.salary = yearlySalary;
    27:     }
    28:  
    29:     private string Name
    30:     {
    31:         get { return name; }
    32:     }
    33:  
    34:     private int CalculatePay()
    35:     {
    36:         return salary/12;
    37:     }
    38:  
    39:     public string Pay(string payDate)
    40:     {
    41:         return "Check,100," + Name + ",$" + CalculatePay() + "," + payDate;
    42:     }
    43: }

After this change, line 14 has been changed to call the Pay method of Employee, and Payroll is ignorant of any other behavior or implementation detail of Employee. This is a Good Thing.

The remaining smell in our code at this point is that the Pay method now knows about check formatting, and it probably shouldn’t. What I’d really like to have happen is that this method would return some dumb data object that another class could use to format the output at some later point. That way I’ve separated production of the payroll data from the formatting and reporting of that data. Each of these concepts could vary independently, so they really don’t belong in the same place. This leads me to create a PayRecord class, whose responsibility it is to transport pay data between the Employee class and the still to-be-designed check formatting class.

     1: private class Employee
     2: {
     3:     public string Pay(string payDate)
     4:     {
     5:         PayRecord payRecord = new PayRecord(Name, CalculatePay());
     6:         return "Check,100," + payRecord.Name + ",$" + payRecord.Pay + "," + payDate;
     7:     }
     8: }
     9:  
    10: private class PayRecord
    11: {
    12:     private string name;
    13:     private int pay;
    14:  
    15:     public PayRecord(string name, int pay)
    16:     {
    17:         this.name = name;
    18:         this.pay = pay;
    19:     }
    20:  
    21:     public string Name
    22:     {
    23:         get { return name; }
    24:     }
    25:  
    26:     public int Pay
    27:     {
    28:         get { return pay; }
    29:     }
    30: }

Sometimes it gets challenging to imagine what the smallest step along the way could be. In this case, it would have been pretty easy to create a PayRecord class, change the signature of the Pay method to return one of these buggers, and then change my tests to match. But I think I can do it is a smaller step. I can do as I show above — I can create the PayRecord class inside the Pay method and still return the string. I can now test this,and I’m certain that my code all still works. The next refactoring is the ChangeMethodSignature, where I’ll change this method to return a PayRecord and change the tests to match. Remember, refactoring is about making small, measured changes to your source code, and verifying their correctness before moving on. Taking big steps increases the risk that something can go wrong, and you’ll have to take extra time to figure out what that was and fix it.

Now here is the final version of the code as far as we are now. This last step was to change the method signature of the Pay method in Employee and Payroll to both return a PayRecord, and I updated the tests to look inside the PayRecord to confirm that the right data was produced.

     1: using NUnit.Framework;
     2:  
     3: namespace PayrollExercise
     4: {
     5:     [TestFixture]
     6:     public class PayrollFixture
     7:     {
     8:         [Test]
     9:         public void NoOnePaidIfNoEmployees()
    10:         {
    11:             EmployeeList employees = new EmployeeList();
    12:             Payroll payroll = new Payroll(employees);
    13:  
    14:             PayRecord payrollOutput = payroll.Pay("05/01/2004");
    15:  
    16:             Assert.IsNull(payrollOutput);
    17:         }
    18:  
    19:         [Test]
    20:         public void PayOneEmployeeOnFirstOfMonth()
    21:         {
    22:             EmployeeList employees = new EmployeeList();
    23:             employees.Add("Bill", 1200);
    24:  
    25:             Payroll payroll = new Payroll(employees);
    26:  
    27:             PayRecord payrollOutput = payroll.Pay("05/01/2004");
    28:  
    29:             Assert.AreEqual("Bill", payrollOutput.Name);
    30:             Assert.AreEqual(100, payrollOutput.Pay);
    31:         }
    32:     }
    33:  
    34:     public class Payroll
    35:     {
    36:         private EmployeeList employees;
    37:  
    38:         public Payroll(EmployeeList employees)
    39:         {
    40:             this.employees = employees;
    41:         }
    42:  
    43:         public PayRecord Pay(string payDate)
    44:         {
    45:             if (employees.Count == 0) return null;
    46:             Employee theEmployee = employees.Employee;
    47:             return theEmployee.Pay(payDate);
    48:         }
    49:     }
    50:  
    51:     public class Employee
    52:     {
    53:         private string name;
    54:         private int salary;
    55:  
    56:         public Employee(string name, int yearlySalary)
    57:         {
    58:             this.name = name;
    59:             this.salary = yearlySalary;
    60:         }
    61:  
    62:         private string Name
    63:         {
    64:             get { return name; }
    65:         }
    66:  
    67:         private int CalculatePay()
    68:         {
    69:             return salary/12;
    70:         }
    71:  
    72:         public PayRecord Pay(string payDate)
    73:         {
    74:             PayRecord payRecord = new PayRecord(Name, CalculatePay());
    75:             return payRecord;
    76:         }
    77:     }
    78:  
    79:     public class PayRecord
    80:     {
    81:         private string name;
    82:         private int pay;
    83:  
    84:         public PayRecord(string name, int pay)
    85:         {
    86:             this.name = name;
    87:             this.pay = pay;
    88:         }
    89:  
    90:         public string Name
    91:         {
    92:             get { return name; }
    93:         }
    94:  
    95:         public int Pay
    96:         {
    97:             get { return pay; }
    98:         }
    99:     }
   100:  
   101:     public class EmployeeList
   102:     {
   103:         private int employeeCount = 0;
   104:         private Employee employee;
   105:  
   106:         public void Add(string employeeName, int yearlySalary)
   107:         {
   108:             employee = new Employee(employeeName, yearlySalary);
   109:             this.employeeCount++;
   110:         }
   111:  
   112:         public int Count
   113:         {
   114:             get { return employeeCount; }
   115:         }
   116:  
   117:         public Employee Employee
   118:         {
   119:             get { return employee; }
   120:         }
   121:     }
   122: }

The point that I really want to make is that refactoring is really important. Test Driven Development is the process of adding behavior incrementally, driven through mini-use cases of the software you’re building. Each test forces some new behavior to be added to the system, but it doesn’t say anything at all about the design of the system. The tests help you get the interface right between the objects you’re talking to, but that’s at far as they go. Refactoring is the design engine in the TDD process. It is the process through which other classes are born, code is made simple and readable, and systems evolve and grow.

If you are doing TDD and you are not spending a significant amount of your time refactoring, you are doing yourself and your team a disservice.

I’ll leave you on that cheery note 🙂 More of this example to come shortly.

— bab

Now playing: Dio – The Last In Line – One Night In The City