Choosing a professional training course has always seemed like a bit of a minefield to me. Most courses have hefty price tags and it's hard to judge beforehand whether they actually represent good value. Although I find that you can learn pretty much anything online now with a combination of videos, blog posts, ebooks and open source documentation – I really wanted an in-person learning experience, to be in the same room as a master and hear directly from them what makes great software.
Enter Sandi Metz, 20+ years of software development experience and author of the excellent Practical Object Oriented Design in Ruby. Lucky for me she had decided to bring her accompanying Practical Object Oriented Design course to London with assistance from the insightful Matt Wynne, one of the authors of Cucumber. For 20 of us it would be 3 full days of pair programming, code reviews and spirited group discussions.
I jumped at the chance to take part and after attending the course this past June, I wanted to share some of the core concepts with you. Hopefully, this post will give you a few new ideas to consider and try out the next time you are in front of your editor.
One of the tasks we were given during the course was to programmatically generate the lyrics to the song 99 Bottles of Beer. We were given a set of tests, of which only the first was being executed; the rest were skipped for the time being. We were asked to make the first one pass before doing anything else. Once that test passed we could unskip the next test and try to make that one pass, repeating the process until all tests passed.
Duplication Is Better Than The Wrong Abstraction
The next thing we were told to do goes against all intuition.
Write shameless code, full of duplication just to make the tests green.
Hang on, isn't duplication the first thing we learn not to do?
Well yes, it's true that ultimately you want DRY code but Sandi advised that you are setting yourself up for failure when you try to make your code DRY and full of abstractions before you really understand the problem you are solving.
So this is the first test:
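The test listing is not reproduced here, so what follows is only a minimal sketch of what it might look like. It assumes Minitest and the traditional wording of the song; the class name `BottlesTest` and the exact strings are assumptions rather than the course material. Run before a `Bottles` class exists, it fails, which is the red starting point.

```ruby
require 'minitest/autorun'

class BottlesTest < Minitest::Test
  def test_the_first_verse
    # The two lines of the song we expect back for verse 99
    # (wording assumed from the traditional lyrics).
    expected = <<~VERSE
      99 bottles of beer on the wall, 99 bottles of beer.
      Take one down and pass it around, 98 bottles of beer on the wall.
    VERSE
    assert_equal expected, Bottles.new.verse(99)
  end
end
```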
Here we expect that two lines from the song should be returned when we create an instance of the Bottles class and call the verse method, passing in the number 99.
How would you normally approach this test? Would you get distracted at the prospect of having to generate the whole song and start thinking about writing some clever method to do that? I know I would have in the past. It's very easy to fall into that trap. I think it's because as problem solvers, we are always so eager to reach that moment where we 'get' the pattern, that we rush ahead and remove duplication too soon or skip it altogether.
Although writing the minimum code to pass the test is a well known technique from TDD, it's very hard to write 'shameless' code, even when explicitly told to do so. Only a couple of pairs in our group managed to meet this goal.
To get the first test to pass, we can start with something simple like this:
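A minimal sketch of that first step, assuming the traditional lyrics (the exact strings used on the course may differ):

```ruby
class Bottles
  # Shameless first pass: ignore `number` and return the text
  # the first test expects, verbatim.
  def verse(number)
    "99 bottles of beer on the wall, 99 bottles of beer.\n" \
    "Take one down and pass it around, 98 bottles of beer on the wall.\n"
  end
end
```

Calling `Bottles.new.verse(99)` returns the two lines, and so would any other number at this point.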
We just defined the verse method in Bottles with a parameter of number and just returned the exact same string from the test. We didn't even use the number. Pretty shameless.
Then, removing the skip from the next test case, we have a similar scenario, but this time the number passed in to the verse method is 89.
So now we are forced to do something with the number but we can start the process of duplication by adding a case statement which just returns the full string based on the number passed in:
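Something along these lines would do it. This is a sketch under the same assumptions about the lyrics, not necessarily the code written on the course:

```ruby
class Bottles
  def verse(number)
    # One hard-coded string per number the tests have asked for so far.
    case number
    when 99
      "99 bottles of beer on the wall, 99 bottles of beer.\n" \
      "Take one down and pass it around, 98 bottles of beer on the wall.\n"
    when 89
      "89 bottles of beer on the wall, 89 bottles of beer.\n" \
      "Take one down and pass it around, 88 bottles of beer on the wall.\n"
    end
  end
end
```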
You are probably itching to clean that up already, but we are not ready to start abstracting yet. Sandi advised that code with duplication will be easier to handle than the wrong abstraction, so we are better off gathering more information and adding it to the solution until, at some point, an abstraction naturally occurs. The cost of waiting for more information is low.
If we do skip ahead to writing a super smart abstraction too soon, we drastically increase the risk of having to untangle a mess later on.
Why is it easier and cheaper to handle? Although duplication looks ugly, it has far less mental overhead because the input cases are right there in front of you and there is less logic to keep track of. Adding a new input to the solution becomes a matter of adding to the duplication and you will see shortly that we have a neat technique for eventually DRY-ing out this code.
So we can continue in this vein to get the next 3 tests to green. They also just pass in different numbers (2, 1 and 0) to the verse method and each return a different verse string. To make these tests pass we add them to our case statement and return the strings directly:
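As a sketch, the case statement might now look like this. The wording of the last three verses is assumed from the traditional song, where they each differ slightly from the rest:

```ruby
class Bottles
  def verse(number)
    case number
    when 99
      "99 bottles of beer on the wall, 99 bottles of beer.\n" \
      "Take one down and pass it around, 98 bottles of beer on the wall.\n"
    when 89
      "89 bottles of beer on the wall, 89 bottles of beer.\n" \
      "Take one down and pass it around, 88 bottles of beer on the wall.\n"
    when 2
      # "1 bottle" is singular on the second line.
      "2 bottles of beer on the wall, 2 bottles of beer.\n" \
      "Take one down and pass it around, 1 bottle of beer on the wall.\n"
    when 1
      # Singular throughout, "it" instead of "one", and "no more" at the end.
      "1 bottle of beer on the wall, 1 bottle of beer.\n" \
      "Take it down and pass it around, no more bottles of beer on the wall.\n"
    when 0
      # The final verse has a completely different second line.
      "No more bottles of beer on the wall, no more bottles of beer.\n" \
      "Go to the store and buy some more, 99 bottles of beer on the wall.\n"
    end
  end
end
```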
Yuck. But our tests are green and it means we can keep moving forward with the challenge. The next test requires us to implement a verses method. This takes two numbers which define the range of verses in the song to be generated.
In this case it's just 99 down to 98. We don't yet have a case to handle 98 bottles, so we can add that to our verse method the same as we did for 99. Then we can define a new verses method that takes an upper_bound and lower_bound to determine the verses that must be generated. Within the verses method we can call our existing verse method for 99 and 98:
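A sketch of that step, with the other cases left out to keep it short. Separating the two verses with a single extra newline is an assumption about what the test expects:

```ruby
class Bottles
  def verse(number)
    case number
    when 99
      "99 bottles of beer on the wall, 99 bottles of beer.\n" \
      "Take one down and pass it around, 98 bottles of beer on the wall.\n"
    when 98
      "98 bottles of beer on the wall, 98 bottles of beer.\n" \
      "Take one down and pass it around, 97 bottles of beer on the wall.\n"
    # ... the 89, 2, 1 and 0 cases are omitted here for brevity
    end
  end

  # Shameless again: the bounds are accepted but not used yet.
  def verses(upper_bound, lower_bound)
    verse(99) + "\n" + verse(98)
  end
end
```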
The tests pass and we can move to the next one which requires us to return 3 verses:
So now we need to be a bit smarter about how we generate the verses. We can do this by iterating over the number range with Ruby's .downto, then using the collect method to get each verse and finally joining them all with newlines:
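Roughly like this. The sketch only includes the cases for 2, 1 and 0, on the assumption that those are the three verses the test asks for:

```ruby
class Bottles
  def verse(number)
    case number
    when 2
      "2 bottles of beer on the wall, 2 bottles of beer.\n" \
      "Take one down and pass it around, 1 bottle of beer on the wall.\n"
    when 1
      "1 bottle of beer on the wall, 1 bottle of beer.\n" \
      "Take it down and pass it around, no more bottles of beer on the wall.\n"
    when 0
      "No more bottles of beer on the wall, no more bottles of beer.\n" \
      "Go to the store and buy some more, 99 bottles of beer on the wall.\n"
    # ... other cases omitted for brevity
    end
  end

  # Walk from the upper bound down to the lower bound, build each verse,
  # and separate the verses with a newline.
  def verses(upper_bound, lower_bound)
    upper_bound.downto(lower_bound).collect { |number| verse(number) }.join("\n")
  end
end
```

`downto` without a block returns an enumerator, so `collect` can be chained straight onto it.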
The final test requires us to implement a song method, that should return the full song from 99 down to 0.
This is actually fairly easy for us to pass: we can just call our ready-made verses method, passing in 99 and 0 as the range.
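In sketch form, with only the first and last verses included so the snippet stays short:

```ruby
class Bottles
  def verse(number)
    case number
    when 99
      "99 bottles of beer on the wall, 99 bottles of beer.\n" \
      "Take one down and pass it around, 98 bottles of beer on the wall.\n"
    when 0
      "No more bottles of beer on the wall, no more bottles of beer.\n" \
      "Go to the store and buy some more, 99 bottles of beer on the wall.\n"
    # ... other cases omitted for brevity
    end
  end

  def verses(upper_bound, lower_bound)
    # Numbers without a case come back as nil, which join renders as "".
    upper_bound.downto(lower_bound).collect { |number| verse(number) }.join("\n")
  end

  # The whole song is just the full range of verses.
  def song
    verses(99, 0)
  end
end
```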
Great, now all our tests are shamelessly passing! You can view the solution here. You may have noticed one snag, though: our song method doesn't actually generate the full song, because our verse method only returns a string when the number is 0, 1, 2, 89, 98 or 99. Don't worry, we'll soon put that right when we start refactoring.
I think some programmers may argue that this example is trivial enough that you could potentially start abstracting sooner, however, this problem was used to introduce the shameless technique and Sandi made it clear that this approach will serve you well even when faced with harder problems, where you have no idea what the end solution looks like.
To summarise the advice so far, resist the urge to leap ahead to an abstraction. Start breaking the problem down with a simple, shameless solution and don't be afraid of duplication when starting out.
Refactoring Is Not An Afterthought
One of the most interesting ideas I took away from the course is that refactoring is not really the icing on the cake, it is the process of making the cake.
Instead of spending a long time in the red while we write a complicated method, eventually getting to green, and then maybe doing a bit of refactoring if we have enough energy left, we quickly got to green with our shameless solution. That gives us a platform to begin refactoring immediately.
How To Refactor
Refactoring is rearranging code without changing behaviour and the approach Sandi recommended was to make tiny, tiny changes in a technique she perfected with Katrina Owen. The technique is to always stay one CTRL-Z (or ⌘-Z) away from green tests using a 4 step process:
- Compile: Add the new code to the same file without calling it yet, so that running the tests catches any syntax errors
- Execute: Run your new code but don't use the result
- Use: Replace the old code with your new implementation
- Clean: Clean up and remove any old code you have now replaced
After each step you should run your tests to make sure you are still green. I'd never seen an approach like this before but I did experience a certain sense of 'flow' when following it during the course and it really forces you to stay on the baby steps path.
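To make the four steps concrete, here is a sketch of one small change: replacing the hard-coded verse for 99 with a method that builds it from the number. The method name `generic_verse` is hypothetical and this is my own illustration, not an example from the course. The code is shown as it stands after the Clean step, with comments marking where each step happened.

```ruby
class Bottles
  def verse(number)
    case number
    when 99
      # 2. Execute: first call generic_verse(number) on its own line above
      #    the old literal and ignore the result. Run the tests.
      # 3. Use: return its result instead of the literal. Run the tests.
      generic_verse(number)
      # 4. Clean: delete the old literal string. Run the tests.
    when 89
      # Left alone until the change has proven itself on 99.
      "89 bottles of beer on the wall, 89 bottles of beer.\n" \
      "Take one down and pass it around, 88 bottles of beer on the wall.\n"
    end
  end

  private

  # 1. Compile: add this method without calling it, then run the tests
  #    to catch syntax errors.
  def generic_verse(number)
    "#{number} bottles of beer on the wall, #{number} bottles of beer.\n" \
    "Take one down and pass it around, #{number - 1} bottles of beer on the wall.\n"
  end
end
```

If the tests go red at any step, one undo takes you back to the previous green state.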
It still feels a bit unnatural for me to work in increments this small and I often tend to combine some of them but I have been making an effort to try it out. The idea is that by doing less and being able to CTRL-Z when red, it's always cheap to go back to a safe place and it prevents you from spending long periods of time stuck with failing tests, hoping it will come right in the end.
What To Refactor
Now that we know the process of refactoring (making frequent small changes without changing behaviour), the question remains: what should we refactor? If we think our code has enough duplication and we have enough information, then we can start abstracting.
The process for abstracting is to find the two lines of code that are most similar then to make them more alike.
The important thing to note is that we don’t want to take the things that are in common and extract them; for example, "bottles of beer on the wall" is duplicated throughout, but it adds no value to extract that into a method call or variable. Instead we find the 2 lines of code with the smallest differences and make them more alike or the same. By doing this we gradually chip away at the duplication and end up with a number of small methods that can later be refactored into classes.
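As a small illustration of "make them more alike", suppose the verse for 2 and a general verse differ only in "bottle" versus "bottles" on the second line. The sketch below names that difference with a hypothetical `container` method so that both branches take the same shape. It is my own example under those assumptions and may not match the steps taken on the course.

```ruby
class Bottles
  def verse(number)
    case number
    when 2
      # Previously ended in the literal "1 bottle of beer on the wall."
      "2 bottles of beer on the wall, 2 bottles of beer.\n" \
      "Take one down and pass it around, 1 #{container(1)} of beer on the wall.\n"
    else
      # Same shape as the branch above; only the numbers differ now.
      "#{number} bottles of beer on the wall, #{number} bottles of beer.\n" \
      "Take one down and pass it around, #{number - 1} #{container(number - 1)} of beer on the wall.\n"
    end
  end

  private

  # The one real difference between the two lines, given a name.
  def container(number)
    number == 1 ? "bottle" : "bottles"
  end
end
```

Once the two branches produce the same text for 2, the `when 2` branch can be deleted in a further tiny step. The sketch covers only verses of 2 and above, since the verses for 1 and 0 differ in more ways.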
The best way I can explain this technique really is to demonstrate it. Watch the video below and I will take you through the process of refactoring the code we have written so far:
Hopefully that gave you a flavour of how easy it is to create abstractions once you have followed the path of duplication. The next stage in this code base is to start extracting some of these methods into a separate class but I will leave that until next time.
Happy hacking :-)