Sometimes you accomplish nothing but you learn everything. That’s what happened to me yesterday in the Triangle Lounge. Quick background — I’m working on the Enterprise Library project in the Microsoft patterns & practices group, where we’ve been consolidating many of the current Microsoft Application Blocks into a single, coherent offering. We all sit in one room, called the Triangle Lounge: the three remaining developers and the three testers.

This post centers on our relationship with the testers. They sit in the same room as us, laugh with us, and talk with us, but I don’t feel like they’re really part of the team. They’re the testers. While we’re busy developing code and writing tests, they are still manually inspecting and testing everything that we do, clearly apart from our culture. This bothers me on several levels.

First of all, there is a natural tension in our relationship, since they are busy attacking everything that we do. That’s their job, and they are the best I’ve ever met at doing it. But I feel like I’m on the defensive when I’m talking to them, justifying my design and code. The words I most hate hearing are, “Brian, can I ask you a question?”, because that always means they’ve found an issue in the code, and it leaves me feeling defensive. I don’t like that, and I’m not sure what to do about it.

Second, I don’t like that they manually test everything. We spend all this time writing automated tests for our software, and they write small test programs that they drive manually. All of their knowledge about what they’ve tested and how they’ve found bugs is wrapped up either in their heads or bug reports, and we can’t easily get at it or use it. This is how they were trained, and I’m not seeing any inclination to change. The problem with it is that it makes any changes to the software harder to justify, since there has to be a separate, manual QA pass through the code once we’re finished. If their tests were automated… (Side note — would I feel less defensive about my code if I were presented with an automated test that I could run and inspect, rather than just a bug report? Interesting question)

Third, they write too many bugs. I mean this seriously. I feel that the testing team is an island, cut off from input from our customer, the Product Manager. They write up bugs for any issue they see, whether it is major or exceedingly minor, and the onus is on someone else to figure out whether the bug is worth fixing. I guess this is probably the role of back-end QA on a project, and I need to look at how we can improve our process to keep us from spending time fixing things that don’t need to be fixed. But that additional step feels aggravating to me. And that’s where this story begins (finally!).

This post is really about what happens on a team when two cultures interact. I’m trying really hard to get the developers to walk into our customer’s office whenever we have a question about how something should work, and get the answer straight from the horse’s mouth (sorry, Tom, I’m not calling you a horse!). After all, he is the guy who understands the context where our product will live, and he is ultimately responsible for its content. And we, as a development team, are doing a really good job of that. To be honest, the job has become easier over time, as the development team’s makeup has changed and we’ve gotten smaller and filled with people who are more devoutly agile. Our team is now three developers: me, Scott Densmore, and Peter Provost. We just lost TSHAK, who was a bit younger, but easily test-infected and really good. With Scott and Peter on the team, we’re definitely feeling pretty agile about ourselves, so we’re doing a lot better at involving Tom.

But the testing team just seems different. They are an island. There have been questions raised before where interaction with our customer would have sped things up tremendously, but I’ve never seen it happen. The testers log issues in our bug tracking tool and wait until someone notices them before anyone starts talking. And this is what got me yesterday.

There was a bug logged that, in certain exceptional circumstances, would cause an exception to be logged to the event log twice. It would happen exceedingly rarely, there was a pretty easy workaround, it was minimally harmful, and it was only found by manually injecting exception-causing code into the application code. Finding it was a great catch, but I’m not sure it was handled properly. I hope it doesn’t seem that I’m dumping on the testers, because I don’t want to. The same tester who wrote this bug also helped me tremendously in getting the Caching Application Block tested to the level of quality where it is today — I respect the work that our testers do tremendously (Prashant, Rohit, and Mani), I just really want to change how they work.

Anyhow, this bug was written, and it was given a severity of 2, which is pretty high. In fact, we’re not supposed to ship with any Sev 2s, so I spent all day yesterday trying to write a test to reproduce it, so I could fix it. (Remember TDD? Write a failing test, then make it pass. Words to live by.) The best way I could figure out to write a test to expose this bug involved creating some Code Access Security (CAS) modifying code, so that when the application code tried to access the EventLog, it would fail and send me down the appropriate error chain to cause the bug. I wrote this test, I was all proud of it, cuz I was being really clever, and then I tried to run it.

Security Exception at line XXX. Additional information: Request failed.

Well, that was helpful. I spent the next five hours trying to figure out why this was happening, asked questions of everyone I could find, had Peter call in favors from guys he knew across the country, and involved the Program Manager who is in charge of CAS for the entire .NET Framework, along with his test team. Finally, after hours of struggling, it came out that I was seeing an interaction between strongly named assemblies, explicit CAS Deny requests, and reflection. It turns out that strongly named assemblies have an implicit LinkDemand for FullTrust in front of all their public interfaces. When you instantiate a type from such an assembly through reflection, the LinkDemand is converted to a regular Demand for FullTrust; that Demand’s stack walk ran into the Deny in my test, failed, and gave me that lovely error message. But it took most of the combined resources of Microsoft to figure this out. And the way to solve it was to add [assembly: AllowPartiallyTrustedCallers] to the AssemblyInfo file of any assembly whose types were being created through reflection. Without this attribute, the creation process will always fail unless the calling code has FullTrust.
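To make the interaction concrete, the failing test had roughly this shape. This is an illustrative reconstruction, not the project’s actual code: the type and assembly names are made up, and the CAS APIs shown (Deny, RevertDeny) exist only on the classic .NET Framework, not on modern .NET.

```csharp
// Illustrative sketch only -- reconstructs the CAS interaction described
// above. Type and assembly names are hypothetical; this targets the
// classic .NET Framework, where CAS still exists.
using System;
using System.Diagnostics;
using System.Security;
using System.Security.Permissions;

public class DoubleLoggingTest
{
    public void ExceptionIsLoggedToTheEventLogOnlyOnce()
    {
        // Deny EventLog access to everything called below this frame,
        // forcing the application code down the error-handling path
        // that produced the double-logging bug.
        new EventLogPermission(PermissionState.Unrestricted).Deny();
        try
        {
            // Creating a type from a strongly named assembly through
            // reflection turns its implicit LinkDemand for FullTrust
            // into a regular Demand. The Demand's stack walk hits the
            // Deny above and fails with "Request failed" -- unless the
            // target assembly is marked
            // [assembly: AllowPartiallyTrustedCallers].
            Type handlerType = Type.GetType(
                "SomeBlock.ExceptionHandler, SomeBlock"); // hypothetical
            object handler = Activator.CreateInstance(handlerType);

            // ...exercise the handler here and assert that exactly one
            // entry lands in the event log...
        }
        finally
        {
            // Undo the Deny so later tests see normal permissions.
            CodeAccessPermission.RevertDeny();
        }
    }
}
```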

So, the serendipity in all of this (remember the title of this lengthy post?) was that it raised the issue of how our project would interact with partially trusted code. No one had considered this before, but my day spent on that bug surfaced a number of issues and problems, which I talked over with Tom and the test team, and now we are deciding what to do about partially trusted code.

The other part of this, the part that really bothers me and ties this whole post together, is that when I explained to Tom the bug I had just spent an entire day fixing, he was amazed that I had put so much effort into something that was clearly not worth it. He said to resolve it as “Won’t fix”; he had better things for me to do with an entire day’s worth of work. So I reverted all of my code, closed the bug, and moved on.

And why did this bother me? Because, due to a lack of communication on all parts, I wasted a day’s work. Our bugs go through a triage process, and Tom needs to be involved with it, to make sure that bug prioritization matches his business prioritization. The test team needs a better definition of what they should and should not prioritize as important. And I need to be more cognizant of getting input whenever I have a question about a bug. I can try to affect the first two, but I’m definitely going to change the last one through my own actions.

I wrote this post because I care about agile development processes, and we clearly were not being agile here. We followed our process to the letter, but it ended up wasting time because we didn’t interact as people. That, to me, was the most important bug of the day.

Sorry for the rambling.

— bab


8 thoughts to “Serendipity”

  1. Am I understanding this right:

    They write code to find bugs. But when they find a bug, they don’t show you the code.

    (If they showed you the code (and maybe gave you a copy), you could turn it into a test, rather than starting over from scratch.)

    When working with third party vendors, I find that the best way to report bugs is to include a simple sample program, where possible, that reproduces the problem. I’ve logged dozens of bugs, quite smoothly and successfully, with certain major relational database vendors. ;->

    An example bug report:
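    A sketch of the shape such a report might take (the product, version, and every detail here are invented purely for illustration):

    ```
    Summary:  Nested outer join returns too few rows
    Product:  ExampleDB 7.3 on Windows 2000 (hypothetical)
    Expected: 3 rows
    Actual:   1 row
    Repro:    self-contained 20-line program attached; compile and run
              against an empty schema, and compare its output to the
              Expected line above
    ```

    The point is that the attached program does the arguing: the developer can run it, watch it fail, and often drop it straight into the automated test suite.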


  2. > Second, I don’t like that they manually test everything. We spend all this time writing automated tests for our software, and they write small test programs that they drive manually. All of their knowledge about what they’ve tested and how they’ve found bugs is wrapped up either in their heads or bug reports, and we can’t easily get at it or use it. This is how they were trained, and I’m not seeing any inclination to change. The problem with it is that it makes any changes to the software harder to justify, since there has to be a separate, manual QA pass through the code once we’re finished. If their tests were automated…

    Interesting. They have to write code to do testing. You’re writing subroutine libraries, so the testers must write "application" code in order to test your library. But they don’t seem to like the automation idea? Or the tool???

    "Are the tests you’re doing useful enough to run again someday, or should they all be thrown away?"

    A good rule of thumb in non-XP test automation is that if you intend to run a test three or more times, then it’s worthwhile to automate it.

  3. Although the work was not (yet) put into the product, I don’t think your day was wasted. Rather, you (and thus your team) gained an understanding of how your stuff interacts with the security infrastructure, and of how that infrastructure works. It gave you a slightly increased depth of knowledge of how .NET works. Given that you are working on a "patterns & practices" project, it is a very good thing for you to have such deep knowledge; it could pay off in any number of ways.

  4. Sounds like a communication problem; perhaps the testers are not giving you the kind of feedback you need. People trained in the "Quality Assurance" school of thought are usually taught to log everything they see and let the developers figure out what went wrong. Developers on Agile teams prefer collaboration and richer communication, in my experience. I prefer not to throw bug reports "over the wall" until I have spent as much time investigating as I can.

    When I find a bug like the one you describe, I will first work on it to make sure I can reproduce it on command, and then pair with a developer. Especially if it feels like a high severity, I will at least let them know what I’ve seen so they know what’s coming. If we pair and you say to me that you aren’t sure what is going on with that, but someone else on the team is more familiar, I’ll bring that person over so we can do some work together and get to the bottom of it quickly.

    If the developers say "that’s a corner case, it’s not worth fixing" or something along those lines, I will spend the time investigating the bug. Bugs usually cluster, so I will spend the time to be sure that this really is a corner case that isn’t worth fixing. Often, such a bug is merely a symptom of a much larger problem, which may take me a couple of days to get to the bottom of. At that point, after I’ve exhausted my own investigation and asked for help along the way to get the information the developers need, it’s time to pass it off to the developers to fix. I’ve had many "won’t fix" bugs turn into showstoppers because the original bug was a symptom of something lurking beneath that unit tests and customer tests didn’t reveal.

    If I realize that it is a corner case and isn’t worth fixing, I can give the developers more confidence in their assessment. If the customer finds something like this, I can also give them more confidence in the developer’s assessment.

    My job as a tester is to be sure I’m giving developers the feedback they need and find useful, and to explore areas of the software that seem to be suspicious so that the developers can program, and I can give them and the customer more confidence in the software being developed.

  5. I like that your testers are writing application code to test your framework, because that’s how it will be used by your customers.

    I’d want them to automate their applications, but I wouldn’t want them to automate directly against your library; otherwise, their tests would be no more real-world than your unit tests. Apps are inherently more varied than the uniformity of environment provided by a test framework.

    Sounds like your triage process is what’s broken here. Triage is as intensely human an activity as I’ve come across in Microsoft and needs the right level of commitment and participation by everyone to work well.

    If Tom thought the bug wasn’t worth fixing then he should have vetoed it in triage. You guys shouldn’t have been working on a bug until it had been through triage.

  6. ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ??? ???? ???? ???? ???? ???? ???? ??? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ???? ????
