Showing posts with label dev practices. Show all posts

Saturday, 21 June 2008

Yet another way of raising events from mocks

Update 2008-06-26: This is now in the Rhino Mocks trunk, so it should be available as part of the 3.5 release.

I've been playing around with raising events from mocks for the last couple of nights, and think I've finally come up with an approach that works for me. Finding a nice way of raising these events is particularly tricky for mock object frameworks, as the C# compiler is really picky about how you can use event references. For example, let's look at a very useful interface:

public interface IDoSomething {    
    event EventHandler SomethingDone;
}

Outside of a class that implements IDoSomething, the only time we can reference SomethingDone is when we are adding or removing listeners (x.SomethingDone += someEventHandler; or x.SomethingDone -= someEventHandler;). (C#'s lack of real support for System.Void is partly to blame here, as both these operations are void.)

To raise an event on a mock object, it would be lovely to be able to code something like this:

var mock = mocks.DynamicMock<IDoSomething>();
mock.Raise(mock.SomethingDone, mock, EventArgs.Empty);

Unfortunately, due to the aforementioned constraint, the mock.SomethingDone argument in that call will give a compiler error stating "The event 'IDoSomething.SomethingDone' can only appear on the left hand side of += or -=".
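For contrast, here is a minimal sketch of why a hand-written test double doesn't hit this problem: inside the class that declares the event, we are allowed to invoke it like a delegate. FakeDoSomething, RaiseSomethingDone and Demo are names I've made up for illustration; none of this is part of Rhino Mocks.

```csharp
using System;

public interface IDoSomething {
    event EventHandler SomethingDone;
}

// Hypothetical hand-written test double, not generated by any mocking framework.
public class FakeDoSomething : IDoSomething {
    public event EventHandler SomethingDone;

    // Inside the declaring class the event can be invoked like a delegate.
    public void RaiseSomethingDone() {
        var handler = SomethingDone;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class Demo {
    public static bool HandlerWasCalled() {
        var fake = new FakeDoSomething();
        var wasCalled = false;
        fake.SomethingDone += (sender, e) => wasCalled = true;

        // fake.SomethingDone(fake, EventArgs.Empty); // would not compile out here
        fake.RaiseSomethingDone();
        return wasCalled;
    }
}
```

The catch is that we have to write and maintain a class like this for every interface, which is exactly the chore a mock object framework is meant to remove.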

There are a few workarounds for this. Let's start with the standard Rhino Mocks approach. (I'm using Rhino Mocks 3.5 beta and xUnit.net here -- feel free to translate from [Fact] to [Test] if you use NUnit, MBUnit et al.)

[Fact]
public void Raise_event_old_style() {
    var mock = mocks.DynamicMock<IDoSomething>();
    mock.SomethingDone += null;
    IEventRaiser eventRaiser = LastCall.IgnoreArguments().GetEventRaiser();
    mocks.ReplayAll();

    var wasCalled = false;
    mock.SomethingDone += (sender, e) => wasCalled = true;
    eventRaiser.Raise(mock, EventArgs.Empty);

    mocks.VerifyAll();  
    Assert.True(wasCalled);
}

Rhino Mocks records the expectation that an event handler is added, then uses LastCall to ignore the argument and gets an IEventRaiser for the last event referenced. That IEventRaiser can be used later on to raise our event. Phil Haack has a helpful post which explains a bit more about this approach.

When I first saw this I must admit it seemed like a lot of noise that obscured what I was really trying to do. This got worse when I started playing around with the new Arrange - Act - Assert syntax and I didn't want to go through the whole replay / verify cycle. So I started looking at the Rhino Mocks implementation of IEventRaiser, the EventRaiser class. This class lets us do this:

[Fact]
public void Raise_event_using_string_for_event_name() {
    var mock = mocks.DynamicMock<IDoSomething>();
    var wasCalled = false;
    mock.SomethingDone += (sender, e) => wasCalled = true;

    var eventRaiser = EventRaiser.Create(mock, "SomethingDone");
    eventRaiser.Raise(mock, EventArgs.Empty);
    
    Assert.True(wasCalled);
}

Here we can specify the relevant event using a string. This works nicely and is easy to read, but causes problems when refactoring and means we don't get intellisense or compiler assistance. Ayende has written about this approach, comparing it with the LastCall.GetEventRaiser() approach we used last time.

I wasn't overjoyed about either of these, and while searching around for other options I found another of Ayende's posts (I think his blog is about 30% of the web... great stuff :)), asking for feedback on a more natural syntax for raising events from mocks. This looked a bit like this:

mock.MyEvent += EventRaiser.Raise(this, EventArgs.Empty);

I quite liked this, but there were a few complaints in the comments about subscribing to and raising the event at the same time. The post was from about 12 months prior to me writing this and, as I'm using a recent Rhino Mocks build and couldn't find it, it looks like nothing came of this. Let's look for a compromise that also fits in nicely with our Arrange - Act - Assert approach. First we'll see what we can get working based on the first, LastCall.GetEventRaiser() approach used:

[Fact]
public void Raise_event_with_new_arrange_act_assert_syntax() {
    //Arrange
    var mock = MockRepository.GenerateMock<IDoSomething>();
    var wasCalled = false;
    mock.SomethingDone += (sender, e) => wasCalled = true;
    
    var eventRaiser = 
        mock
        .Stub(x => x.SomethingDone += null)
        .IgnoreArguments()
        .GetEventRaiser();
    
    //Act
    eventRaiser.Raise(mock, EventArgs.Empty);

    //Assert
    Assert.True(wasCalled);
}

Here we are specifying a fairly useless stub so we can get an IEventRaiser. We are still using ye olde x.SomethingDone += null trick (albeit with a lambda to neaten it up), but we are pretty much stuck with that if we want strong typing on this as discussed at the beginning of this post.

I think this looks a bit more cohesive now we are using the lambda. We have one statement that is fairly obviously getting an IEventRaiser, rather than a null event handler floating around on its own confusing poor people like me :). Beyond aesthetics, this cohesion can let us pull out this functionality and start getting closer to a neater syntax. For now we'll just whack this in a .NET 3.5 extension method, but we could probably find a better home for it (it can go in a standalone class but the final syntax doesn't read quite as well to me).

public static class EventRaiserExtensions {
    private static IEventRaiser GetEventRaiserFromSubscription<TEventSource>(
        this TEventSource mock, Action<TEventSource> eventSubscription) {
        return mock
            .Stub(eventSubscription)
            .IgnoreArguments()
            .GetEventRaiser();
    }
    
    public static void Raise<TEventSource>(this TEventSource mock, Action<TEventSource> eventSubscription, object sender, EventArgs args) {
        var eventRaiser = GetEventRaiserFromSubscription(mock, eventSubscription);
        eventRaiser.Raise(sender, args);
    }

    public static void Raise<TEventSource>(this TEventSource mock, Action<TEventSource> eventSubscription, params object[] args) {
        var eventRaiser = GetEventRaiserFromSubscription(mock, eventSubscription);
        eventRaiser.Raise(args);
    }        

    public static void Raise<TEventSource>(this TEventSource mock, Action<TEventSource> eventSubscription) {
        var eventRaiser = GetEventRaiserFromSubscription(mock, eventSubscription);
        eventRaiser.Raise(mock, EventArgs.Empty);
    }    
}

The private GetEventRaiserFromSubscription method is the stub call we did last time, but this time pulled out into one method. The main bits are the Raise<TEventSource> extension methods, which combine all the steps and give us an easy syntax for raising an event on a mock based on an event subscription delegate. So our example now looks like this:

[Fact]
public void Suggestion_for_raising_events() {
    var mock = MockRepository.GenerateMock<IDoSomething>();
    var wasCalled = false;
    mock.SomethingDone += (sender, e) => wasCalled = true;

    mock.Raise(x => x.SomethingDone += null, mock, EventArgs.Empty);

    Assert.True(wasCalled);
}

The implementation itself might need work, but I reckon that syntax is pretty neat considering the limitations of C#. Of course, you're welcome to think otherwise, so please leave a comment expressing your outrage and/or contempt :).

Disclaimer: I am fairly new to Rhino Mocks (have tended to stick to manual test doubles) and especially to Arrange - Act - Assert (it's only in beta at present), so this might fail pretty hard in other circumstances. Still, I thought I'd post the syntax in case it gave more knowledgeable people some good ideas :)

Thursday, 19 June 2008

DI and cross-cutting concerns

Just came across some great guidance from Casey Charlton on the ALT.NET mailing list. My last post on dependency injection mentioned the benefit of documenting class dependencies via the constructor used for DI. But what about cross cutting concerns like logging? From Casey's mailing list post:

"Constructor properties shouldnt be used for cross cutting concerns like logging ... Put ILogger as a public property on the class, use the nulllogger as the default value, register the logging facility in Windsor, and all will magically happen"

I like this -- unique dependencies in your constructor, others set via property-injection. I'm also pretty keen on the idea of AOP for doing a lot of basic logging, but I haven't had a good look at this in .NET yet. The ALT.NET thread has a bit more of a discussion on logging.
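As a rough sketch of what Casey is describing (minus the Windsor registration, which I haven't shown), the class below takes its unique dependency through the constructor and exposes the logger as a settable property that defaults to a do-nothing logger. The ILogger here is a stand-in interface I've defined for the example rather than Castle's own, and ListLogger, CountingEmailService and SendReminder are equally made up.

```csharp
using System;
using System.Collections.Generic;

// Stand-in logging abstraction (an assumption, not Castle's ILogger).
public interface ILogger {
    void Log(string message);
}

// Null Object: safe default that silently discards messages.
public class NullLogger : ILogger {
    public void Log(string message) { }
}

// Hypothetical logger that keeps messages in memory.
public class ListLogger : ILogger {
    public readonly List<string> Messages = new List<string>();
    public void Log(string message) { Messages.Add(message); }
}

public interface IEmailService {
    void Send(string message);
}

// Hypothetical email service that only counts what it was asked to send.
public class CountingEmailService : IEmailService {
    public int SentCount;
    public void Send(string message) { SentCount++; }
}

public class InvoiceReminderService {
    private readonly IEmailService emailService;
    private ILogger logger = new NullLogger();

    // Unique dependency: documented by the constructor.
    public InvoiceReminderService(IEmailService emailService) {
        this.emailService = emailService;
    }

    // Cross-cutting concern: optional, set via property injection.
    public ILogger Logger {
        get { return logger; }
        set { logger = value ?? new NullLogger(); }
    }

    public void SendReminder(string message) {
        Logger.Log("Sending: " + message);
        emailService.Send(message);
    }
}
```

Because of the NullLogger default, the class works without anyone setting Logger, and a container (or a test) can swap in a real logger when it wants one.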

Thursday, 12 June 2008

Attempting to understand Dependency Injection

Watching Rob Conery and Jeremy Miller work through Dependency Injection for the MVC Storefront project really helped me to piece together a few ideas that had been rattling around in my generally deserted cranium. Dependency injection (DI) is sometimes dismissed as "just" a tool for unit testing and mocking dependencies. But I think DI, and DI containers, offer a bit more to software design than simply providing a way to use mocks for interaction testing. So as usual I'm polluting the blogosphere with my ramblings in an effort to solidify my understanding. Let's work through an example and see what we can find.

Here is how we could write a class to send invoice reminder emails without any thought given to design or testability:

public class InvoiceReminderService {
  public void SendReminders() {
    var emailService = new EmailService();
    var invoiceRepository = new InvoiceRepository();  
    var invoicesDueForReminderEmail = invoiceRepository.GetInvoicesDueForReminders();
    foreach (var invoice in invoicesDueForReminderEmail) {
      var reminderMessage = createReminder(invoice);
      emailService.Send(reminderMessage);
    }
  }
  //...
}

Here we have dependencies on InvoiceRepository, for getting invoices due for reminder emails, and EmailService, for sending out the emails over SMTP. These dependencies could be instantiated at many different places all over our code base. Without worrying about design principles or testability, let's simply pull out these dependencies and rely on constructor injection to instantiate them.

public class InvoiceReminderService {
  private EmailService emailService;
  private InvoiceRepository invoiceRepository;

  public InvoiceReminderService(InvoiceRepository invoiceRepository, EmailService emailService) {
    this.invoiceRepository = invoiceRepository;
    this.emailService = emailService;
  }

  public void SendReminders() {
    var invoicesDueForReminderEmail = invoiceRepository.GetInvoicesDueForReminders();
    foreach (var invoice in invoicesDueForReminderEmail) {
      var reminderMessage = createReminder(invoice);
      emailService.Send(reminderMessage);
    }
  }
  //...
}

Here we've just created a constructor that takes the required dependencies and adds them to private fields. SendReminders() has been updated so that it is no longer responsible for instantiating the dependencies. I can then instantiate an InvoiceReminderService by creating the dependencies and passing them to the constructor, or I can use a DI container (I'm using a pre-release build of StructureMap 2.5 for this post) to automatically create my dependencies for me:

var invoiceReminderService = ObjectFactory.GetInstance<InvoiceReminderService>();

By default our DI container looks at the greediest constructor (the one with the most parameters) to see what dependencies it needs to create. Because InvoiceRepository and EmailService are both concrete classes (i.e. not interfaces or abstract), we don't need to give StructureMap any additional information; it can simply instantiate the required objects and pass them through to the InvoiceReminderService constructor.
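To make the "greediest constructor" idea a bit more concrete, here is a toy resolver that does the same kind of thing with reflection. This is only a sketch of the general technique, not how StructureMap is actually implemented; TinyContainer is a made-up name, the dependency classes are empty placeholders, and I've exposed the dependencies as properties purely so they can be inspected.

```csharp
using System;
using System.Linq;

public class EmailService { }
public class InvoiceRepository { }

public class InvoiceReminderService {
    public InvoiceRepository InvoiceRepository { get; private set; }
    public EmailService EmailService { get; private set; }

    public InvoiceReminderService(InvoiceRepository invoiceRepository, EmailService emailService) {
        InvoiceRepository = invoiceRepository;
        EmailService = emailService;
    }
}

// Toy resolver for concrete types only (hypothetical, for illustration).
public static class TinyContainer {
    public static T GetInstance<T>() {
        return (T)GetInstance(typeof(T));
    }

    public static object GetInstance(Type type) {
        // Interfaces and abstract classes would need explicit configuration.
        if (type.IsInterface || type.IsAbstract)
            throw new InvalidOperationException("No default configured for " + type.Name);

        // Pick the public constructor with the most parameters.
        var constructor = type.GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .FirstOrDefault();
        if (constructor == null)
            throw new InvalidOperationException(type.Name + " has no public constructor");

        // Recursively build each parameter (no cycle detection in this sketch).
        var arguments = constructor.GetParameters()
            .Select(p => GetInstance(p.ParameterType))
            .ToArray();
        return constructor.Invoke(arguments);
    }
}
```

Calling TinyContainer.GetInstance<InvoiceReminderService>() builds an InvoiceRepository and an EmailService first, then hands them to the two-parameter constructor.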

So what has this accomplished? Well, SendReminders() is now a lot clearer without the dependencies in the way. And we also have an easy way to tell exactly what the dependencies of the class are (the constructor). We are also skirting dangerously close to having a testable class that could easily be modified to follow the Dependency Inversion Principle (DIP) (by extracting interfaces from our dependencies and relying on those instead of on concrete classes) and the Open/Closed Principle (OCP) (by allowing us to change the behaviour of our class by providing different implementations for the dependencies).

But the main, obvious thing this has accomplished is that our class is no longer responsible for instantiating its dependencies. I guess you could almost link this back to the Single Responsibility Principle (SRP) -- we are limiting this class' responsibilities and giving it one less reason to change. Why is this good? Well, let's imagine our EmailService class is used by a number of classes throughout our application. We want to change our EmailService so that it no longer sends emails directly via SMTP. Now it simply queues up the email in a database, and another process is responsible for firing off emails from the queue. To create an EmailService without a DI container I now need to do this in every class that builds an EmailService:

emailService = new EmailService(new EmailRepository());

Or we can just let our DI container handle it for us, and keep our original ObjectFactory call unchanged. This reduces the friction experienced whenever we need to factor out a new class to provide new functionality or to avoid an SRP violation.

//No change needed here, even though the EmailService constructor has changed...
var invoiceReminderService = ObjectFactory.GetInstance<InvoiceReminderService>();

(Yes, this is slightly circular logic as I am assuming we are using DI for the new EmailRepository class too, but the logic is equally valid if we want to change the constructor to include a connection string or some other parameter. The point is that we can vary the instantiation without affecting dependent classes.)

Now I'm not suggesting for a moment that you want to globally replace new with a DI container, but for me the realisation that DI containers were all about abstracting the responsibility of dependency creation, rather than simply providing a testing seam, was a bit of an A-HA! moment. Of course, I am a bit slower than your average developer, so sometimes things like this take me a while :-)

From abstracting dependency creation to SOLID designs

Now there are still a number of SOLID design principles the code above is violating. This doesn't mean we need to instantly refactor to be compliant (you're never going to be, they are general guidelines that can be somewhat contradictory when taken to the n'th degree), but it is something we can look at.

First up is the DIP. We are depending on concrete types rather than abstractions. This also limits our options of altering the behaviour of the class via its dependencies to subclassing those dependencies, which is a minor strike against the OCP. This is easily fixed by extracting interfaces from the dependencies:

public class InvoiceReminderService {
  private IEmailService emailService;
  private IInvoiceRepository invoiceRepository;

  public InvoiceReminderService(IInvoiceRepository invoiceRepository, IEmailService emailService) {
    this.invoiceRepository = invoiceRepository;
    this.emailService = emailService;
  }
  //...
}

StructureMap will auto-wireup these dependencies, but if we are using our EmailService implementation that depends on an extracted IEmailRepository interface, then we'll need to tell StructureMap's configuration about it, either at compile time in code or using a configuration file:

public static class Bootstrapper {
  public static void ConfigureIoC() {            
    StructureMapConfiguration.AddRegistry(new DiSampleRegistry());
  }  
}
public class DiSampleRegistry : Registry {
  protected override void configure() {
    ForRequestedType<IEmailRepository>()
      .TheDefaultIsConcreteType<EmailRepository>();                
  }
}

We are now looking at a more traditional DI design. Our InvoiceReminderService is now completely independent of the specific implementations of its dependencies. All the code in the class is cohesive -- working at one responsibility with minimal background noise. This project doesn't even need a reference to the DLL containing the dependency implementations. This means we are fine changing implementations (although not interface contracts!) and we know our InvoiceReminderService is safe. If you were given this class to reuse, or simply to maintain, you wouldn't have to go pulling in a whole bunch of concrete dependencies. As a result and as a nice bonus, we have some good seams for running automated tests. We can easily stub or mock these dependencies and isolate the behaviour under test.
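Here is a small sketch of what those test seams buy us, using hand-written test doubles rather than a mocking framework. The post never shows Invoice, the reminder message type or createReminder, so the versions below (an Invoice with a Number, string messages) are assumptions made for the example, as are StubInvoiceRepository and RecordingEmailService.

```csharp
using System;
using System.Collections.Generic;

// Assumed shapes: the original post does not define these.
public class Invoice {
    public string Number;
}

public interface IInvoiceRepository {
    IEnumerable<Invoice> GetInvoicesDueForReminders();
}

public interface IEmailService {
    void Send(string reminderMessage);
}

public class InvoiceReminderService {
    private readonly IEmailService emailService;
    private readonly IInvoiceRepository invoiceRepository;

    public InvoiceReminderService(IInvoiceRepository invoiceRepository, IEmailService emailService) {
        this.invoiceRepository = invoiceRepository;
        this.emailService = emailService;
    }

    public void SendReminders() {
        foreach (var invoice in invoiceRepository.GetInvoicesDueForReminders()) {
            emailService.Send(createReminder(invoice));
        }
    }

    // Placeholder implementation, purely for the example.
    private string createReminder(Invoice invoice) {
        return "Reminder for invoice " + invoice.Number;
    }
}

// Stub: returns canned invoices instead of hitting a database.
public class StubInvoiceRepository : IInvoiceRepository {
    private readonly List<Invoice> invoices;
    public StubInvoiceRepository(params Invoice[] invoices) {
        this.invoices = new List<Invoice>(invoices);
    }
    public IEnumerable<Invoice> GetInvoicesDueForReminders() { return invoices; }
}

// Records messages instead of sending real email.
public class RecordingEmailService : IEmailService {
    public readonly List<string> Sent = new List<string>();
    public void Send(string reminderMessage) { Sent.Add(reminderMessage); }
}
```

A test can then build the service with new StubInvoiceRepository(...) and a RecordingEmailService, call SendReminders(), and check the Sent list, all without a database or SMTP server in sight.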

Aside: DI containers are an obvious way to get plugin and provider-style implementations, by loading specific implementations of an interface at runtime. I'm focusing on more general design effects here -- if you're looking for a plugin/component/provider model you'll obviously be designing more explicitly for loose coupling and will choose your approach based on that.

Not all beer and Skittles?

That's the DI theory, as far as I can tell. But as always there is no silver bullet. It is now tougher for us to navigate from the InvoiceReminderService to the current implementations of its dependencies (as we've abstracted them away). We are also looking at a potential interface explosion. Why extract an interface for virtually every class you write? That's two units of code for every responsibility. And we now have a DI container to learn and troubleshoot when debugging. If we're happy with tight coupling between this class and its dependencies, then why go to these lengths? If you want to run unit tests why not just use TypeMock to break into tightly coupled dependencies? Or at the very least provide default implementations in the class' default constructor (a.k.a. the poor man's dependency injection, see Nikola's post for an interesting variation on this approach):

public class InvoiceReminderService {
  private IEmailService emailService;
  private IInvoiceRepository invoiceRepository;

  public InvoiceReminderService() {
    this.invoiceRepository = new InvoiceRepository();
    this.emailService = new EmailService();
  }

  public InvoiceReminderService(IInvoiceRepository invoiceRepository, IEmailService emailService) {
    this.invoiceRepository = invoiceRepository;
    this.emailService = emailService;
  }  
  //...
}

And how much of all this is just working around limitations of statically typed languages? To me these are all quite reasonable questions (some of these raised by Kevin Berridge in some nice posts here and here, as well as a couple of posts from Jacob Proffitt starting here).

As always the key seems to lie in achieving a balance. The benefits of a loosely coupled design always need to be traded off with any additional complexity of that design. After watching the MVC Storefront DI screencast and having a play around with StructureMap I can see a lot of very obvious benefits to DI containers, but at the same time it's not something to dive into without having a vague idea of the theory and problems it aims to solve (which is why I've spent so much time reading about DI containers, but generally sticking to poor man's DI until I had a pressing reason to move to a container). After all, we don't go around blindly applying the GOF patterns to every situation under the sun... right? :-)

Some interesting DI reads

As opposed to this post, here are a few interesting takes on DI.

Tuesday, 27 May 2008

Top-down vs. bottom-up design

I've been having a think about top-down (a.k.a. outside-in) design during my recent iterative development exercise. In the series I've been putting off testing from the client layer down, primarily because GUIs have a reputation for being hard to test and harder to test-drive, and I wanted to make some early, easy progress on the core logic of the game.

I started to think that this approach might be a mistake. I'm dealing with the model that I think we'll need, not one demanded from the primary client of the model -- the GUI. Chad and Ben mentioned in their recent screencast that bottom-up implementation tended to lead to mistaken assumptions about infrastructure required by the top layers. I saw a similar point made on the BDD mailing list by Pat Maddox. Pat wrote (emphasis mine):

"I find an outside-in style of development to be very helpful... It forces you to think of your objects at a high level, so your design is driven by real need, and then you apply your design skills as you go on. When I use a pure bottom-up style, I write more speculative code and go down the wrong path far more often than I'd like. That's not to say that it's a problem inherent with that style, but rather a problem that I've personally experienced, and have more or less solved by using an outside-in approach."

This opinion is echoed in an unrelated post to the TDD mailing list by Olof Bjarnason:

"I've been using TDD [bottom-up] for 2 years now, and it's been mostly a _great_ experience. The thing that bothers me most with "classic TDD" is that sometimes I build too much functionality in my classes, which isn't used in the end application after all. Even whole objects are wasted in the worst case."

The view here is that bottom-up design can lead to speculation and waste. By having a design driven directly by the overall, required behaviour, you only implement (and test) things that directly serve that behaviour. This can help eliminate speculative implementations of lower-level behaviour based on what you think the overall required behaviour will be.

Not so fast...

Sounds great! So what about inside-out / bottom-up / middle-out design? Ron Jeffries recently stated on the TDD mailing list that he generally prefers to start with the model (unless the project is simply to build a viewer). Maybe there's a bit more to it?

Digging further into that thread on the TDD list, there are a number of great points of view on the topic. Some TDD-ists argue that bottom-up design lets you build in small, easy steps, and refactor your way to the required behaviour that you would otherwise start with in top-down design. Others state that this leads to waste -- writing code and tests that just get refactored away. Which resulted in a couple of great quotes on the difference between refactored code and waste:

"I suppose we could also call the scaffold we use when constructing a large building as waste, or the safety harnesses as waste." -- John Roth to TDD list
"The analogy with scaffolding for a house is an excellent one - there is a lot of "stuff" constructed when building a house, *just* to support the construction - it is then discarded." -- Casey Charlton to TDD list

Top-down design can also lead to a "mockist" approach to TDD, where you need to mock all the required dependencies to implement the high level behaviour. This isn't necessarily a bad thing, but over-reliance on mocking can result in fragile tests. Martin Fowler has a great article on the pros and cons of "classic" and "mockist" TDD.

Enough rambling already!

While planning for part 3 of my recent development exercise I was coming to the conclusion that top-down was the way to go. After looking into it some more I was reminded of a whole host of advantages of bottom-up design. Even more importantly it reminded me that there is no silver bullet, and there are times when either, or a mix of both, approaches are fine. All this started to sound familiar, so firing up Google I noticed that I had read something to this effect in Jeremy Miller's excellent (as usual) post on the topic (search for "Bottom Up versus Top Down", although the whole post is worth reading).

I think the most important conclusion I've reached during this ramble is that if you are working in iterations to deliver a complete slice of the application (top and bottom) then you're never going to go too far wrong. Any "waste" from a bottom-up approach will be minimal as you'll be working toward and implementing the top almost immediately. And you'll still end up with higher-level behaviour specified with unit tests. Likewise starting top-down you'll still get the advantages of designing in small steps, particularly as you drive down into the design.

Thursday, 15 May 2008

Garden Race Series: Basics of iterative development and TDD

This page is an index page for the Garden Race series. The intention behind this series of posts is to teach myself a bit about the basics of TDD and iterative development by working through an example of building a Snakes and Ladders-style game. Hopefully this example strikes a balance between being simple enough to write up as blog posts and being involved enough for me to learn some things about iterative development that can translate to real work.

Note: This post has been back-dated to appear before the other posts in the series. It was written after Part 2.

Friday, 19 October 2007

Link: Refactoring walkthrough

Jeremy Miller recently posted a "Best of..." compendium that included the following post:

Composed Method Pattern

This post takes a simple code example and applies a series of refactoring steps to make it easier to read and maintain. The comments point out a bug in the initial example (transferred to the final snippet) that is made more obvious by the refactoring (I won't point it out in case you want to find it yourself :)).

Friday, 28 September 2007

Free articles every developer should read

There are lots of great articles available from Object Mentor. Here is a selection that I feel are essential for any developer. Caution: ahead there be PDF links!

First up, the "SOLID" design principles:

Then some general practices and patterns:

At the very least I would read SRP, OCP and Coffee Maker if you go anywhere remotely close to Object Oriented Design in your work. Closely followed by The Humble Dialog Box for information on the MVC design pattern, and The Bowling Game to get an idea of what TDD is all about.

Tuesday, 25 September 2007

Learning from project failures with Raganwald

Found this 2-and-a-bit year old article from Reg Braithwaite on the lessons he has learned about software projects from previous failures. Great read, covering many warning signs that can indicate your project is in trouble.

raganwald: What I've learned from failure

Wednesday, 12 September 2007

Misusing fluent interfaces

Fluent interfaces are a way of making your code more readable. The NUnit 2.4+ assertion syntax is a good example of this, as is this example from Jon Galloway. For a really comprehensive example see Anders' post*.

It is also very easy to get carried away with this (ok, maybe that's just me) and misuse the technique by reducing it to simple method chaining and applying it everywhere. Ayende has a good post on the difference. To borrow Ayende's example (I'll give it back), this:

string userDetails = new StringBuilder()
	.Append("Name: ")
	.Append(user.Name)
	.AppendLine()
	.Append("Email: ")
	.Append(user.Email)
	.AppendLine()
	.ToString();

is not a fluent interface, and does not offer much over the standard approach:

var builder = new StringBuilder();
builder.Append("Name: ");
builder.Append(user.Name);
builder.AppendLine();
builder.Append("Email: ");
builder.Append(user.Email);
builder.AppendLine();
string userDetails = builder.ToString();

The second example is hardly more difficult to read than the first (just a bit more noise), and has the added advantage of making debugging easier by giving you a specific line number if an exception is thrown. The first example is not a fluent interface, it is just method chaining.

The main feature of a fluent interface is not the use of method chaining (in fact this is completely optional), but mainly providing an easy to use, easy to understand interface to your class or library. A fluent interface over Ayende's StringBuilder example might look more like this:

string userDetails = new UserDisplayer()
  .AddName(user.Name)
  .AddEmail(user.Email)
  .ToString();
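For what it's worth, a UserDisplayer like that could be a very thin wrapper over StringBuilder. The class below is just my guess at a minimal implementation (it isn't from Ayende's post): each method names the thing being added, hides the label and line-break details, and returns this so the calls can be chained if you want.

```csharp
using System;
using System.Text;

// Hypothetical implementation of the UserDisplayer used above.
public class UserDisplayer {
    private readonly StringBuilder builder = new StringBuilder();

    public UserDisplayer AddName(string name) {
        builder.Append("Name: ").Append(name).AppendLine();
        return this; // returning this makes chaining possible, but optional
    }

    public UserDisplayer AddEmail(string email) {
        builder.Append("Email: ").Append(email).AppendLine();
        return this;
    }

    public override string ToString() {
        return builder.ToString();
    }
}
```

The readability win comes from the intention-revealing method names, not the chaining: calling AddName and AddEmail as separate statements on a local variable reads just as well.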

As a more detailed example, the changes Jon Galloway suggested to the C# Image Enhancement Filters Library transformed this (these code snippets taken straight from his post):

Image myImg = Bitmap.FromFile("cat.jpg");
Image transformedImage;
ZRLabs.Yael.BasicFilters.TextWatermarkFilter watermark = new TextWatermarkFilter();
watermark.Caption = "Test";
watermark.AutomaticTextSize = true;
transformedImage = watermark.ExecuteFilter(myImg);
transformedImage.Save("cat_watermark.png", System.Drawing.Imaging.ImageFormat.Png);

...to this...

        ZRLabs.Yael.Pipeline pipeline = new ZRLabs.Yael.Pipeline("cat.jpg");
        pipeline.Rotate(90)
            .Watermark("Monkey")
            .RoundCorners(100, Color.Bisque)
            .Save("test.png");

He used method chaining here, but the code is still an improvement without it:

        ZRLabs.Yael.Pipeline pipeline = new ZRLabs.Yael.Pipeline("cat.jpg");
        pipeline.Rotate(90);
        pipeline.Watermark("Monkey");
        pipeline.RoundCorners(100, Color.Bisque);
        pipeline.Save("test.png");

I recently wrote a fluent interface around sending commands to a database via a custom database controller class. To make it easier I added a number of methods for adding parameters to the command (yes, there are libraries to do this already, but this was to interface with existing code). I admittedly got a bit carried away with this and chained all the calls to have commands created in pseudo-natural language, at the cost of making debugging harder if an exception was thrown. The interface was pretty neat, but the method chaining was definitely overkill.

So the moral of the story is that fluent interfaces can be great, but method chaining is probably best left alone unless it makes a drastic improvement to code readability.

Footnotes:
* I frequently link to my own posts that link to another post, rather than straight to the original source. The reason is Blogger gives me backlinks (which is Blogger's substitute for trackbacks) that help me to navigate through my related posts.

Wednesday, 11 July 2007

Anders Norås on DSLs with C#

Anders has a fantastic post about creating DSLs using C#. He covers the normal design rules you need to break when programming your DSL (like changing state in the get{} accessor), as well as other tips on building the grammar such as selectively overriding the implicit cast operator. A great read for anyone even remotely interested in fluent interfaces.

Monday, 18 June 2007

The Ivory Tower of effective development practices

Jeremy D. Miller has posted a train of thought that includes an interesting section on a prevalent attitude towards things like TDD, IoC, MVP and other practices and patterns that can be used to improve the quality of software (aka ALT.NET). I have quoted the relevant sections here, as it is an attitude I have faced on many occasions, and it's nice to share the pain. :-)

This idea that things like TDD/DDD/ORM/IoC/Continuous Integration/Test Automation or whatever else is just academic hot air or merely shiny toys for the alpha geeks really worries me. I frequently see somebody say something like "we're too busy solving business problems to spend time on this ivory tower [crud]." It's a self-defeating attitude. It's the I don't have time to sharpen the saw, I'm too busy chopping wood! syndrome.

Do you know what's a really big problem that impacts the business? Technical debt. Code and software ecosystems that are hard to maintain and change. If I can create an enterprise system that is easy to change, I've opened up more opportunities for the business to ask for more functionality. If I write code that's hard or risky to change, I've closed off opportunities to the business. All of the acronyms listed above are techniques that people are developing and using in an attempt to create code more efficiently with better qualities for maintainability. Maintainability is crucial for your ability to keep delivering value to the business over anything but very short timelines. Think on that Mr. "I'm too busy delivering business value."

Monday, 11 June 2007

Persistence ignorance and TDD with LINQ to SQL

Ian Cooper has written an article on LINQ to SQL covering persistence ignorance and exercising LINQ to SQL code using unit and integration tests.

Staccato Signals: Being Ignorant with LINQ to SQL

Found courtesy of Jeremy Miller.

Tuesday, 5 June 2007

Explaining good code to non-geeks

I commonly encounter a lot of blank looks when I try to explain what a programmer does. Most people know it involves making the computer do stuff, but it pretty much goes downhill from there unless you have a geeky audience. This can pose a problem when explaining concepts like good quality code to non-IT business representatives and managers. This post is an effort to provide an understandable example of good code for non-geeks.

First up some disclaimers. This is a contrived example for illustrative purposes only. It is not meant to be a shining example of fantastic design, but to show some of the considerations that go into producing good code. I also provide some approximate definitions (approxinitions?) of some concepts. These may not be very accurate, but are designed to give the reader an idea of a concept without getting drowned in the details. Lastly, try not to get bogged down in the details (How does that work? How does it connect to a database? What's a database? What language is that? Why am I reading this strange person's blog?). Instead, concentrate on the concepts being communicated. You don't need the details to get the gist. Right? Ok, let's go.

A problem and first pass at a solution
We are writing an application that needs to send emails to a group of people. To do this we will write a class. Think of a class as a group of related behaviours (functions) and data (fields, properties). Let's see some code:

class EmailSender {
    function SendEmails() {
        listOfPeople = database.LoadPeople();
        foreach person in listOfPeople {
            emailService.SendEmail(welcomeEmail, person);
        }
    }
}
Even if this looks a bit daunting, you can probably still follow through what we are doing. Here we are creating a new class called EmailSender that contains a SendEmails() function. We can call a function through a class. This is usually expressed in a className.functionName() format, so in this case we could run our function by writing EmailSender.SendEmails().

Within our function we load a list of people from the database, and then for every person in that list we send them an email by calling the emailService.SendEmail(welcomeEmail, person) function. The values in the parentheses are information we are passing to the function. In this case we are telling the function which type of email to send (a welcomeEmail, whatever that is) to a particular person. In this example we are assuming that the emailService and database classes are already written somewhere for us to use.

Adding more features
We now have a function that will send welcome emails to all the people in some database. Hooray! Now we realise that our application does not just need to send a generic email to everyone, but we also need to send emails to everyone that is about to go on holidays. Let's add a new function to do this:
class EmailSender {
    function SendEmails() {
        listOfPeople = database.LoadPeople();
        foreach person in listOfPeople {
            emailService.SendEmail(welcomeEmail, person);
        }
    }
    function SendEmailsToHolidayers() {
        listOfPeople = database.LoadPeopleGoingOnHolidays();
        foreach person in listOfPeople {
            emailService.SendEmail(haveANiceTripEmail, person);
        }
    }
}
This new function is very similar to the first. It loads a different list of people (people going on holidays), and then sends each person a haveANiceTripEmail instead of a welcome email. Finally, let's send some nasty emails to people who have not paid their bills to our company.
class EmailSender {
    function SendEmails() {
        listOfPeople = database.LoadPeople();
        foreach person in listOfPeople {
            emailService.SendEmail(welcomeEmail, person);
        }
    }
    function SendEmailsToHolidayers() {
        listOfPeople = database.LoadPeopleGoingOnHolidays();
        foreach person in listOfPeople {
            emailService.SendEmail(haveANiceTripEmail, person);
        }
    }
    function SendEmailsToLatePayers() {
        listOfPeople = database.LoadPeopleWithOutstandingDebts();
        foreach person in listOfPeople {
            emailService.SendEmail(weAreSendingSomeGoonsEmail, person);
        }
    }
}
And we are finished. Simple, right?

Problems with our class
Problems? Our class is perfect! We just wrote it, it's beautiful! The last code sample may work, but it has a lot of duplication. All three functions load a list of people from the database, then send an email to each person. This duplication is a warning signal to the programmer that something is wrong with the code.

We wrote more functional code than was required, repeating the same logic in several places. Functional code is the lines that do stuff, rather than the semantics of defining classes and functions. The code is also hard to maintain. For example, say we want to record in the database that an email is being sent before we send it. We then need to change the following lines in 3 places:
    foreach person in listOfPeople {
        database.RecordEmailIsBeingSent(weAreSendingSomeGoonsEmail, person);
        emailService.SendEmail(weAreSendingSomeGoonsEmail, person);
    }
It is easy to miss one of these lines once our class grows to have more functions, which encourages bugs. It also means if there is one bug in a repeated part of the code it will be faithfully reproduced in the duplicated parts of the code. This kind of duplication can also make it difficult to understand the intent of the code, as the important logic is obscured by other bits of functionality.

A good indicator of good code is a lack of duplication. This makes for smaller amounts of functional code (which means less work as we are solving each problem only once, and fewer places for bugs to hide), and code that is easier to change and maintain (faster and cheaper to add features and support).


So let's take a giant leap and see if we can turn this code into good code. This process is known as refactoring (changing the design without modifying the behaviour). To non-technical people refactoring can seem a waste of time. Why spend time changing something behind-the-scenes when the overall output remains the same? If it ain't broke, why fix it? To technical people, refactoring is essential and will reap great rewards the second someone finds a bug or wants to change the application slightly. In reality the code is broken, unless part of the initial requirements was for our application to cost a huge amount to support and be nearly impossible to change.

Refactoring to good code
To eliminate this duplication in our functions, let's think about exactly what each function is trying to achieve.
  • Load a list of people from a database. The list will be different for different email types, but the end result will still be a list of people.
  • We need to send an email to every person in the list.
  • The email content will be different for each email type.
So if we can extract the functionality which gets the list of people, and the email content, we will be left with the main logic behind sending these emails.
class EmailSender {
    function LoadRelevantPeople();
    function EmailContent();
    function SendEmails() {
        listOfPeople = LoadRelevantPeople();
        foreach person in listOfPeople {
            emailService.SendEmail(EmailContent(), person);
        }
    }
}
How does that look? We don't have any duplication now, but we have lost our multiple email types. Notice how clearly the SendEmails() function now expresses our intention: we load a list of relevant people, then send an email to each one. We also have some blank functions, LoadRelevantPeople() and EmailContent(). These functions represent the parts of our original code that differed between each function. We will take advantage of some programming magic called inheritance to inject the relevant functionality.

Inheritance is a relation between a parent class and a child class. The child class automatically gets the functions and properties of the parent, but can also add new functions, and override the behaviour of the parent's functions. We will use this technique to implement our different types of emails.
class EmailSender {
    function LoadRelevantPeople();
    function EmailContent();
    function SendEmails() {
        listOfPeople = LoadRelevantPeople();
        foreach person in listOfPeople {
            emailService.SendEmail(EmailContent(), person);
        }
    }
}
class WelcomeEmailSender inheritsFrom EmailSender {
    function LoadRelevantPeople() {
        return database.LoadPeople();
    }
    function EmailContent() { return welcomeEmail; }
}
class HolidayEmailSender inheritsFrom EmailSender {
    function LoadRelevantPeople() {
        return database.LoadPeopleGoingOnHolidays();
    }
    function EmailContent() { return haveANiceTripEmail; }
}
class LatePayersEmailSender inheritsFrom EmailSender {
    function LoadRelevantPeople() {
        return database.LoadPeopleWithOutstandingDebts();
    }
    function EmailContent() { return weAreSendingSomeGoonsEmail; }
}
We can now send emails like this: HolidayEmailSender.SendEmails(). This will use the SendEmails() function from our parent EmailSender class, which will fill-in-the-blanks with HolidayEmailSender's implementation of LoadRelevantPeople() and EmailContent().
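For readers who want to run the refactored design, here is a minimal sketch in Python. It is only one possible translation of the pseudocode: FakeDatabase, FakeEmailService, the email addresses and the snake_case names are all hypothetical stand-ins, since the post assumes the real database and emailService already exist somewhere.

```python
from abc import ABC, abstractmethod


class FakeDatabase:
    """Hypothetical stand-in for the post's `database` object."""

    def load_people(self):
        return ["alice@example.com", "bob@example.com"]

    def load_people_going_on_holidays(self):
        return ["bob@example.com"]


class FakeEmailService:
    """Hypothetical stand-in for `emailService`; it only records what was sent."""

    def __init__(self):
        self.sent = []

    def send_email(self, content, person):
        self.sent.append((content, person))


class EmailSender(ABC):
    """Parent class: owns the main logic and leaves two blanks for child classes."""

    def __init__(self, database, email_service):
        self.database = database
        self.email_service = email_service

    @abstractmethod
    def load_relevant_people(self):
        """Blank 1: which people should receive this email?"""

    @abstractmethod
    def email_content(self):
        """Blank 2: which email should they receive?"""

    def send_emails(self):
        # The shared logic: identical for every email type.
        for person in self.load_relevant_people():
            self.email_service.send_email(self.email_content(), person)


class WelcomeEmailSender(EmailSender):
    def load_relevant_people(self):
        return self.database.load_people()

    def email_content(self):
        return "welcomeEmail"


class HolidayEmailSender(EmailSender):
    def load_relevant_people(self):
        return self.database.load_people_going_on_holidays()

    def email_content(self):
        return "haveANiceTripEmail"


email_service = FakeEmailService()
HolidayEmailSender(FakeDatabase(), email_service).send_emails()
print(email_service.sent)  # [('haveANiceTripEmail', 'bob@example.com')]
```

Unlike the pseudocode, this sketch creates an object from the class before calling send_emails(), and hands it the database and email service explicitly. That is a choice made for the sketch, not something the post prescribes.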

What do we gain by doing this?
  • We have separated the logic of sending an email to a group of people from the particulars of getting that list of people and email content. This makes it easier to understand each discrete block of logic.
  • We can now add new email types without modifying the original EmailSender class. This helps us to avoid introducing bugs into that class. It also means we do not need to understand all the details of the EmailSender class, so our application is easier to modify.
  • We can change the EmailSender class to record all the emails being sent without affecting the child classes. This example was used in the previous section and required 3 modifications. Now we only require one.
  • We can separately test each unit of logic. This approach is more amenable to automated unit testing techniques.
  • We have eliminated duplication.
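Two of the gains listed above, recording emails with a single change and testing each unit separately, can be sketched together in Python. This is an illustration under assumptions rather than the post's own code: RecordingFake is a hypothetical test double that plays the role of both database and emailService, and "carol" is made-up data.

```python
class EmailSender:
    """Parent class from the post, with the record-keeping step added once."""

    def __init__(self, database, email_service):
        self.database = database
        self.email_service = email_service

    def load_relevant_people(self):
        raise NotImplementedError

    def email_content(self):
        raise NotImplementedError

    def send_emails(self):
        for person in self.load_relevant_people():
            # The one new line: every child class picks it up automatically.
            self.database.record_email_is_being_sent(self.email_content(), person)
            self.email_service.send_email(self.email_content(), person)


class LatePayersEmailSender(EmailSender):
    def load_relevant_people(self):
        return self.database.load_people_with_outstanding_debts()

    def email_content(self):
        return "weAreSendingSomeGoonsEmail"


class RecordingFake:
    """Hypothetical test double standing in for `database` and `emailService`."""

    def __init__(self):
        self.log = []

    def load_people_with_outstanding_debts(self):
        return ["carol"]

    def record_email_is_being_sent(self, content, person):
        self.log.append(("recorded", content, person))

    def send_email(self, content, person):
        self.log.append(("sent", content, person))


# A tiny automated check: each email is recorded before it is sent.
fake = RecordingFake()
LatePayersEmailSender(fake, fake).send_emails()
assert fake.log == [
    ("recorded", "weAreSendingSomeGoonsEmail", "carol"),
    ("sent", "weAreSendingSomeGoonsEmail", "carol"),
]
```

The child class did not change at all to gain the new behaviour, and the check runs without a real database or mail server.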
What do we lose?
  • In this case, but not always, brevity. By removing duplication we have actually increased the number of lines we wrote from 20 to 28. We are using a new class for each function rather than the 3 line functions we were using initially. This is more to do with this example than the refactoring process in general.
  • Directness. Before we could look at a function and see every step. Now we have introduced a layer of indirection or abstraction to the process. It is now easier to understand each discrete block of logic, but takes more brain power to understand the entire behaviour.
Which is better? Well, ending up with a bit more code is an unfortunate side effect of using a small example for this post. You can probably see that as the size and number of duplicated areas increase, this new approach will actually require fewer lines than the original.

Even so, the extra lines are not really an issue in this case. The extra lines are not functional code; they are simple class definitions and so forth, rather than code that actually does stuff. Normally these non-functional lines are written for programmers by their code editing programs. They are also the kind of things that are automatically checked by the code compiler, so unlike functional lines of code they will generally not be harbouring bugs.

So the balance lies in the gains versus introducing indirection. Bear in mind that this is a contrived example, and that most software is significantly more complicated. As things get more complicated, abstraction and indirection become compulsory. The human brain can only handle so much complexity, so the natural response is to divide and conquer these problems by using abstractions. Programmers are used to these abstractions as they work with them every day, which further reduces the problems caused by the indirection.

A key point is that changing software can be very costly. Anything we can do to mitigate that cost will pay huge dividends over the life of the software. Programmers can do this by removing duplication and creating high quality code.

To me the gains seem to significantly outweigh the shortcomings of our changes. On balance, I would always choose our revised example.

Parting thoughts...
The structure and design of the code itself can have a big impact on the cost of supporting, maintaining and extending software. Refactoring is sometimes dismissed by non-geeks as an optional or low-value activity. In reality it is essential. Skipping it incurs a technical debt, which you will need to pay for next time the application needs to be modified.

The technique we used is an example of the Template Method design pattern. This technique relies on a main function that contains the logic of what we are trying to achieve (the template), and uses child classes to fill in the blanks on variable behaviour. There are many other ways of removing duplication from code, each with varying applicability to individual situations, and each offering specific strengths and weaknesses.
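As one illustration of those other ways (not something the example above uses), the blanks can be passed in as ordinary values and functions instead of being filled in by child classes. In this Python sketch, the send_emails signature, the fake_send_email helper and the names "dave" and "erin" are all hypothetical.

```python
def send_emails(load_people, email_content, send_email):
    """Send `email_content` to every person returned by `load_people()`."""
    for person in load_people():
        send_email(email_content, person)


sent = []


def fake_send_email(content, person):
    # Hypothetical stand-in for emailService.SendEmail; it only records the call.
    sent.append((content, person))


# The caller supplies the two things that vary: the list and the content.
send_emails(lambda: ["dave", "erin"], "haveANiceTripEmail", fake_send_email)
print(sent)  # [('haveANiceTripEmail', 'dave'), ('haveANiceTripEmail', 'erin')]
```

This needs fewer class definitions, but the caller must now wire the pieces together for each email type. It is a different trade-off rather than a strictly better one.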

There is no way to mandate or standardise on a particular technique for all situations. It would be disastrous to declare that every time a programmer writes an email feature into software they must use a Template Method. Each case needs to be evaluated on its merits. Writing good code depends on the skills and experience of the programmer. Good code can not be rubber-stamped and produced en masse, not by standardisation nor by graphical tools, pre-built platforms, advanced IDEs, or any method other than applying good professional practices to every individual case.

Coding is sometimes considered a low-value part of software development (read: monkey work), but good programmers can contribute a huge amount to a project by making effective decisions at this level.

I hope this has helped to show some of the features of good code, why it is worth the effort, and also highlighted some of the issues programmers face when trying to write good code.

Tuesday, 8 May 2007

RAD or Refactor?

Found this old, satirical post by Jeffrey Palermo. It reflects one part of a train of thought I have been entertaining over the past couple of months regarding future directions of software development.

I have been trying to resolve two contrasting approaches to development: RAD tools vs. clean code. While there can be some overlap between the terms themselves (both approaches can use code generation for example, but differ in how they use it), I am concentrating on the fundamental difference of approaches.

According to my definitions, RAD tools are tools that focus on quickly, easily and cheaply getting an application together using largely visual tools which generate code or use existing platform code. The "rapid" part of RAD comes from the ease of use of the tool, and the ability to concentrate on high level concerns and abstractions without being bogged down in coding details. A worthy goal :-).

Clean code (a.k.a. "pragmatic programming", "maintainable code" or "sustainable software") focuses on creating simple, correct code that is easy to test and maintain. In contrast to RAD tools, the clean code approach focuses on eliminating duplication (not just in the code itself, see the DRY Principle) to achieve rapid progress. The impact of the coding details that RAD tools avoid is minimised through reuse and automation (e.g. automating persistence using NHibernate, rather than repeating yourself by continually plumbing your objects into a relational database).

So is one approach generally better than the other? Very rarely is there a silver bullet that will solve all your problems. However, there are sometimes general directions that are positive without being a cure all, perhaps like the development of high level languages like Java and C# helping to prevent wannabe-C++ coders from hanging themselves. So is one of these approaches helpful in a majority of cases? How do you make the choice between the two options?

My main problem when considering this question is that I am biased. I love coding. I love producing a beautifully crafted piece of software that just works (well, I'm pretty sure I'd like it if it happened ;-) ). I can see how beneficial the clean code approach can be in the long term, but I can also see how valuable it is for non-IT colleagues to drag-and-drop a custom report or automate a basic business process as required. Products like SharePoint (a.k.a. MOSS 2007) now make it fairly easy to drag-and-drop an approval workflow for documents, intranet posts etc. So as these tools continue to improve will the benefits of the clean code approach diminish? Will "traditional" coding disappear in favour of RAD tools?

So far my pondering of this has led me to the following ideas, primarily in favour of clean code. First, we have seen this phenomenon before with products like Excel, Access and VB/VBA, which provide a very low technical barrier-to-entry to produce an application that can have enormous business value. This ease of producing applications can cause huge problems when it fails to scale, or a bug that wasn't hit when the app was used by 3 users causes 50 users to lose data later on. These applications can sometimes end up being reworked or modified by more experienced developers, where layers of hacks and duplication obfuscate the actual logic of the application that is so valuable. How about migrating data to a new back-end or incorporating it in a larger system?

I have a feeling that all RAD tools fundamentally suffer from the same drawbacks as these early RAD tools - tool generated code may work in an original context, but that context is fairly fixed. Try and change it and you end up paying a large amount of technical debt.

What about more visually-oriented designers, like the Workflow Designer in Visual Studio? I really like Jeremy Miller's take on this (as quoted by Ayende), "Drawing workflow's with pictures is coding -- with every bit of the risk and danger that coding brings". I have seen this first hand when using a workflow designer (mandated for a project), and then jumping through hoops to get it integrated with an ASP.NET application and an application specific database. For all the ease of dragging and dropping, the effort to turn that into a workable solution was much higher than a clean code approach.

I have also found designers slow me down in some situations. When you need to have similar actions, a designer (well, maybe not a *good* designer) may require you to copy and paste actions and then manually set differing properties. This is a nice way for errors to creep in, and it also violates the DRY principle that is so important to clean code. For a more obvious example, compare using an application with a mouse versus with a keyboard once you have memorised the shortcuts for your most commonly used features.

How about if we assume you have a "perfect" RAD tool? Fast, easy to use but infinitely flexible? As flexibility increases my unsubstantiated and groundless assumption is that the complexity increases until you have implemented a programming language in itself. Regardless of whether you are dealing with text or boxes and arrows, you are still coding. Use the best method of speeding the process of coding while reducing duplication and repeated effort.

Gee, the way I tell it I've almost convinced myself that my future of coding up classes is a given! But realistically there are advantages to a good RAD tool. Ease of use and comprehension is a great advantage to help develop quickly, but you still need a certain amount of knowledge to use it wisely. It really becomes a trade off between how fast you want to go for the basics and how important it is to maintain and extend that base.

I've probably rambled more than enough for one post, so I'll attempt a semi-logical wrap up. RAD tools can be great for a one-shot application, but all end up generating code artefacts of one type or another that are subject to the usual risks of code. You are going fast for a while, but incurring technical debt. The initial investment required to create debt-free, clean code may slow things down initially, but theoretically should accelerate quickly as repetition is avoided and reuse encouraged. Now just don't get me started on software factories!

UPDATE 2007-06-04: Phil Haack has a post along the same lines, so apparently I'm in good company :-)
UPDATE 2007-06-07: Very good company :-)
UPDATE 2007-07-06: David Hayden makes three and a crowd :-)
 

Friday, 27 April 2007

The Way of Testivus

From Artima: The Way of Testivus - Unit Testing Wisdom From An Ancient Software Start-up.

"Some good advice on developer and unit testing, packaged as twelve fake, pretentious, and somewhat cryptic bits of ancient Eastern wisdom in the hope of getting your attention."

Reminiscent of The Tao of Programming (described on Wikipedia).

Wednesday, 10 January 2007

Jeremy Miller on Orthogonal Code

Another helpful post from Jeremy Miller, this time on Orthogonal Code. He gives a real example of unmaintainable code, and then illustrates the effects of applying the following principles to the code:

  • encapsulation
  • the Law of Demeter
  • Single Responsibility Principle
  • loose coupling
  • dependency inversion
  • open/closed principle
  • testability

Thursday, 4 January 2007

MartinFowler.com - articles and pattern catalogues

Martin Fowler, author of "Patterns of Enterprise Application Architecture" and others, maintains a web site of programming articles, references and patterns.

Refactoring Catalogue

Martin Fowler's catalogue of refactoring patterns is a helpful reference for various refactoring operations. The refactoring.com web site also has some links to other references and refactoring tools.