Scotland on Rails Presentation Online
May 26, 2009
I gave a talk at Scotland on Rails titled "Working Effectively with Legacy Rails." It's online now for your viewing pleasure. Please take a look and share your thoughts here.
I just smashed a bug
I gave a talk at Scotland on Rails titled "Working Effectively with Legacy Rails." It's online now for your viewing pleasure. Please take a look and share your thoughts here.
By now, most programmers are familiar with the law of demeter. I think there's still a great deal of misunderstanding of what it means and how to use it. It's often tersely explained as a call chain with two dots, but the "two dots" rule is merely a heuristic for identifying possible demeter violations. Seeing a call chain with two dots doesn't necessarily mean demeter is being violated. To understand why, you have to go beyond that simple explanation and think about what it means and why it matters.
In following the law of demeter, our goal is to write encapsulated code. In writing encapsulated code, our goal is to expose an object's behavior while hiding its internal structure. The key thing to understand is that if your object doesn't have behavior, and you're primarily interested in the structure, then the law of demeter doesn't apply. Let's look at an example.
Say you're writing an application that simulates an auto race. You've got a car domain object, and it has some components to it. A race track object tells the cars when to start their engines.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
class Car def initialize(engine) @engine = engine end def start_engine @engine.start end end class Race def initialize(*cars) @cars = cars end def green_light @cars.each {|c| c.start_engine } end end |
In this case, you absolutely benefit by applying the law of demeter. Let's say down the road there's more work that needs to be done when starting the engine, such as charging a turbofan (I have no clue if that makes sense. I don't know anything about car racing). If you violate demeter, which looks like
1 2 3 4 5 |
class Race def green_light @cars.each {|c| c.engine.start; c.turbofan.charge } end end |
you see that you have to change the Race class. First off, that's odd, as you would expect changes to a Car's behavior to be made in Car's definition, not Race. Secondly, you're guaranteeing bugs somewhere down the road, because when you have a different object start a car's engine, you have to maintain this in two spots. If you don't think you'll ever have another object that tells a car to start its engine, consider your tests and think again.
Following demeter is valuable when you have behavioral objects. In the previous example, a Race deals with a Car abstraction, but doesn't care anything about the Car's engine. An engine is an internal object that helps the Car do something interesting.
Let's look at another example, in this case it's going to be a Rails app that lets you build a car, selecting the options that you want. The confirmation page looks something like
<td>Engine type</td>
<td><%= @car.engine.name %></td>
<td>CD player</td>
<td><%= @car.stereo.has_cd_player? %></td>
<td>iPod hookup</td>
<td><%= @car.stereo.has_ipod_hookup? %></td>
There's no behavior, it's all structure. And in this case, structure is what we care about. Because of this, following demeter would only result in needless indirection and code without providing any benefits.
The law of demeter is a useful guideline, but as with any other programming guideline you need to understand its subtleties in order to use it effectively. Demeter should be used as an encapsulation technique, when you want a behavioral object that hides its structural details. When you're interested in an object's structure, you're better off leaving demeter in your toolbox.
Raganwald doesn't like Array#sum. He says that #sum should not be on Array because it's not semantically valid for all Arrays.
While #sum doesn't apply to all Arrays, it applies to a lot of them, and it's friggin useful. Does it hurt the arrays that it doesn't apply to? Of course not. Because you don't call #sum where it doesn't make sense.
1 2 |
refrigerator.wag_tail # boom [1,2,'three'].sum # boom! |
I don't see a big difference between those two lines. The first is obviously nonsensical because refrigerators don't have tails to wag. Ruby will still let you send a refrigerator that message though. It may handle it or it may not. That's the essence of dynamic typing. Same with the second line, only this time the object's ability to handle it is a combination of the method signature and internal state. That's nothing to be scared of, we see it all the time.
1 2 3 |
5 / 0 [1,2,3].freeze.clear Account.new(100).withdraw 200 |
An object may respond to a message, and you can call it, and it can still blow up. For a language that literally lets you rewrite it, sending messages that may not be semantically valid is hardly the weirdest thing you can do.
You wouldn't call refrigerator.wag_tail, so don't call sum on an Array that can't handle it.
In Ruby, interfaces are fluid. You can say that an object's interface is the collection of messages it responds to, but it's more correct to say that it's the intersection of messages it handles and messages you send it. For example, Array can behave many different ways:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
queue = Array.new queue << :foo << :bar queue.shift # => :foo stack = Array.new stack.push :foo stack.push :bar stack.pop # => :bar album_db = Array.new album_db << {:name => "Legend", :artist => "Bob Marley"} album_db << {:name => "Welcome to Jamrock", :artist => "Damian Marley"} album_db << {:name => "Abbey Road", :artist => "The Beatles"} album_db.select {|h| h[:artist].include?("Marley")}.map {|a| a[:name] } # => ["Legend", "Welcome to Jamrock"] |
Array is a chameleon class, there's no single concept of Array in Ruby. I think that's a good thing thing, because whenever you need to use a collection, the question is usually "do I use an Array or Hash?" The alternative looks like 
You really shouldn't be passing around simple collection classes often anyway. They're data structures that should be used to implement your behavioral objects. Quick example.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
class Account def initialize(balance) @transactions = [balance] end def credit(amount) validate_amount amount @transactions << amount end def debit(amount) validate_amount amount @transactions << -amount end def validate_amount(amount) raise "Must be a number" if amount <= 0 end def balance @transactions.sum end end |
You know it's safe to call sum in this case because you've controlled access to its internal data.
Want to provide access to the transaction list? Write a little class that provides the functionality you need:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
class TransactionList def initalize @transactions = [] end def <<(t) @transactions << t end include Enumerable def each(&block) @transactions.each &block end end |
This basic wrapper expresses that a transaction list is additive-only. You don't let clients subvert the contract by giving them a reference to the internal array.
There are even simpler cases where you can/should do this. As an example, everyone has written an Address class that used a string for the zip code. But that leaves open the possibility of zip_code.upcase. What the hell does that mean? You can create value types very cheaply:
1 2 3 4 5 |
ZipCode = Struct.new :zip sf = ZipCode.new 94102 lb = ZipCode.new 92651 sf == lb # => false sf == ZipCode.new(94102) # => true |
It won't work if you introduce subclasses, but it's a great technique if you want to build meaningful domain types on the cheap. Most of the time we don't need it though - intention-revealing names suggest how to use an object, ignore the naming at your peril. But if you do want added safety, create a new encapsulated type.
Specs help you write robust code and give other developers clear examples of how to use it. Write specs.
Every once in a while my Dock will just disappear, and along with it goes command+tab. In order to get it back, open Activity Monitor (using quicksilver or spotlight) and look for the process named Dock. Click "Quit Process" in the upper left and choose "Force quit" to restart Dock and get your dock / command+tab back.
Posted here mainly so the next time I google "os x dock and command+tab disappear" I'll be able to find it :)
This year at RailsConf, my good friend/quasi-business partner BJ Clark and I will be giving a presentation titled "Working Effectively With Legacy Rails." The oldest Rails systems are almost 5 years old now, which is an eternity given the rapid pace of change in the Rails world. BJ and I talk about technical debt, getting your gnarly codebase under test, and making excellent use of Ruby and Rails to be insanely productive and have fun to boot. Here's the full abstract over at the RailsConf site.
I gave this talk solo at Scotland on Rails and got a lot of good feedback. RailsConf attendees will benefit from me having learned from my mistakes in that one. This is a kick ass talk, so please join BJ and I at RailsConf!
There's a pattern in Domain-Driven Design called the Aggregate, a cluster of objects with tightly-interrelated behavior. At the head of this cluster is the Aggregate Root which controls access to the objects within the Aggregate. Aggregates are useful for several different things such as designing transactional boundaries or data partitioning schemes, but most importantly they serve as consistency boundaries. One property of Aggregates is that they can hand out references to internal objects, but clients should not hold those references - they may use them only within the scope of a single operation.
You should be careful not to expose internal objects too frequently. Clients will make assumptions and that leads to restrictive coupling. Here's an example of someone who found that his tests exposed a design flaw in his code. As it turns out, there is a design flaw, but he came to the wrong conclusion as to what it is.
I've rewritten the example in Ruby:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
class Order attr_reader :items def initialize @items = [] end def add_item(item) @items << item end def remove_all_items @items = [] end end describe Order do it "should remove all items" do order = Order.new order.add_item :rock_band order.add_item :game_guitar order.should have(2).items order.remove_all_items order.should have(0).items end end |
So far so good.
He later runs into a problem where calling order.items doesn't return the same object once the order items have been removed. The problem shows up when we restructure the test:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
describe Order do it "should remove all items" do order = Order.new order_items = order.items order.add_item :rock_band order.add_item :game_guitar order_items.should have(2).items order.remove_all_items order_items.should have(0).items end end |
Chris concludes that he should call @items.clear in Order#remove_all_items as opposed to assigning a new Array object to @items. That way clients holding references to it won't run into nasty surprises like the above.
The problem here is that clients are making assumptions about how Order is implemented, and that creates a tight coupling. You could allow that and make it a part of Order's contract, but it places a higher burden on both Order and its clients. You'll also need to document it, ideally in the form of tests as done above. This is the short-term maintenance overhead of tight coupling. Not surprisingly, people tend to skip that work, leading to bigger problems down the road as you try to maneuver within a rigid design.
Basic encapsulation fixes this problem. Instead of handing out the order items array, you place a query on the order itself. The tests and code become:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
describe Order do it "should remove all items" do order = Order.new order.add_item :rock_band order.add_item :game_guitar order.num_items.should == 2 order.remove_all_items order.num_items.should == 0 end end class Order def initialize @items = [] end def num_items @items.size end def add_item(item) @items << item end def remove_all_items @items = [] end end |
As you can see, we're able to use the original implementation with no issues [1]. By thinking of the Aggregates in your system, you can design your objects to eliminate needless coupling and avoid bugs. That opens up a whole world of exciting possibilities, but you have to get to this basic point first.
I've only met a few Rails programmers that are familiar with Domain-Driven Design, and don't know of any that apply the Aggregate pattern in their projects. ActiveRecord just isn't designed for it. Let's take a look at a problem that results from not drawing Aggregate boundaries.
Here's an action which takes an item and adds it to the user's cart.
1 2 3 4 5 6 7 8 |
def create if @item = current_car.items.create(params[:item]) redirect_to @cart else @product = Product.find params[:item][:product_id] render :template => "products/show" end end |
A problem occurs when we want to render the contents of the cart in a widget on each page. If we execute this action and the save fails, current_cart.items still contains a reference to this invalid item, and will be rendered along with the other items. Simply put, the cart and its items are in an inconsistent state. The solution is to reload the items before rendering them:
1 2 3 4 5 6 7 8 9 |
def create if @item = current_car.items.create(params[:item]) redirect_to @cart else @product = Product.find params[:item][:product_id] current_cart.items true render :template => "products/show" end end |
An alternative is to use << instead of build, which will attempt to save the associated object and will raise an error if it fails. This solves the immediate problem, but (a) doesn't solve the larger one and (b) often doesn't work with libraries such as resource_controller which typically follow a build/save pattern that's usable with an ActiveRecord class or association proxy.
We can solve this problem by making the cart and its items an aggregate, with the cart as the root. Now we place operations on the cart and let it manage its internal objects:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
class Cart < ActiveRecord::Base has_many :items def add_item(options) item = items.build options if item.valid? item elsif items.delete item false end end end def create if @item = current_car.add_item(params[:item]) redirect_to @cart else @product = Product.find params[:item][:product_id] render :template => "products/show" end end |
I don't actually do this much of the time :) I've learned to just kind of suck it up and make effective use of ActiveRecord and the various Rails-related libraries, at least when dealing with relatively simple stuff. Once the constraints get complex and interrelated, it becomes awkward to manage through Rails associations alone, and that's when you can make your life easier by defining aggregates in your domain model.
[1] Note that there were even bigger potential problems by handing out direct references to the internal order list. A client could do @order.items.clear and bypass any business rules on removing items from an Order
I'm watching Francis Hwang's talk from RubyConf 2008 titled "Testing Heresies." He discusses his reasons for not using mock objects, and commits a common error when trashing mocks: he shows crappy examples which evidently serve as proof of why the technique sucks.
To start, let's look at the two main arguments people make for why mocks suck:
Here's a code example Francis gives that uses mocks:
1 2 3 4 5 6 7 8 9 10 11 12 |
describe ShowController, "list" do describe "when a TV show has no public videos" do it "should not show that TV show" do Show.should_receive(:find). with(:all, :select => "id, nam, public_videos_count, full_episodes_count", :conditions => "shows.public_videos_count>0"). and_return([]) get 'list' response.body.should_not match(/#{@show.name}/) end end end |
Before he gets into it, he says something along the lines of "Please let me know if I'm mischaracterizing the approach." Too bad I wasn't there. Yes, you COMPLETELY mischaracterized the approach.
If you're using mocks like this, you're doing it wrong. Very, very wrong. Here's how you would really use mocks:
1 2 3 4 5 6 7 8 9 |
describe ShowController, "list" do describe "when a TV show has no public videos" do it "should not show that TV show" do Show.should_receive(:with_public_videos).and_return([]) get 'list' response.body.should_not match(/#{@show.name}/) end end end |
Better?
Then you would have a model level spec:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
describe Show do describe "with_public_videos" do before(:each) do @with_vids = create_show @with_vids.videos << new_video(:public => true) @private_vids = create_show @private_vids.videos << new_video(:public => false) @no_vids = create_show end it "should find the show that has a public video" do Show.with_public_videos.should include(@with_vids) end it "should not find a show with only private videos" do Show.with_public_videos.should_not include(@private_vids) end it "should not find a show with no videos" do Show.with_public_videos.should_not include(@no_vids) end end end |
This is one example where I'll cave in to the whole "mocks duplicate the code under test" argument. If your mocking code sucks, and your mocking code duplicates your production code, then guess what? Your production code sucks!
The value that mocking provides in this case is writing the code you wish you had. Do you really wish you had a find(:select => ..., :conditions => ...) in your controller code? Of course not! (I hope not, anyway). You don't have to use mocks to get the APIs you want (refactoring is still an option ;) but it's very, very helpful.
Bottom line: mocks aren't inherently evil. I could take Francis's example as proof of why Ruby, Rails, and RSpec all suck, and it would be just as valid as concluding that mocks suck. The only things that sucked here are the example and the production code.
No it doesn't. Let me give you a very quick rundown of the code you would write to implement this feature.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
Feature: List shows
In order to watch their favorite shows
Users will need to view shows with available videos
Scenario: Show has a public video
Given a show named "The Office"
And "The Office" has a public video named "Sensitivity Training"
When I view the list of videos
Then I should see "The Office"
Scenario: Show has a private video
Given a show named "The Office"
And "The Office" has a private video named "Sensitivity Training"
When I view the list of videos
Then I should not see "The Office"
Scenario: Show has no videos
Given a show named "The Office"
When I view the list of videos
Then I should not see "The Office"
|
Now when someone says "but if I rename the finder, the controller spec will continue to pass! Mocks are horrible!!" You can point to the acceptance test that fails. Holy quality software, batman!
I'm hosting a workshop on behavior-driven development with RSpec next week in San Francisco. It's a 1-day workshop being offered on 2 separate days, so hopefully one of them works out for you. Hope to see you there!
Sign up for February 28th
Sign up for March 1st
A big thank you to Mark Carranza for helping set this up.
Some details:
This session is a real-world introduction to behavior-driven development (BDD) with RSpec. Attendees will learn:
In this workshop we will first discuss each of these topics in detail. Then we'll pull them all together by building a feature the true BDD way–beginning with a conversation with the customer, writing acceptance test scenarios, and test-driving the implementation. The goal of this workshop is for you to come away with a strong enough grasp on RSpec and BDD to begin using it at work (or home!) later that day.
Apress has a daily deal going on where they sell a PDF version of one of their books for $10 over a 24-hour period. They have minimal DRM, just prompting you for your email address.
They don't have a feed for it, so I made one that updates daily: Apress Deal of the Day atom/rss feed
Take the following very simple controller that changes a user's address:
1 2 3 4 5 6 |
class UserController def change_address @user.city = @view.city_field @user.state = @view.state_field end end |
It gets the job done. Until you do a little validation:
1 2 3 4 5 6 7 8 9 10 11 12 |
class UserController def change_address city = @view.city_field state = @view.state_field if States[state].cities.include?(city) @user.city = city @user.state = state else @view.display_error("#{city} is not in #{state}") end end end |
While the system has behavior, the objects don't. A different piece of code could change the city and/or state but not make this check, leaving the user in an invalid state.
So instead of a user being an object with properties to set, it's an object with behavior, one of which is "change address."
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
class UserController def change_address begin @user.change_address @view.city_field, @view.state_field rescue RuleError => e @view.display_error e.message end end end class User def change_address(city, state) raise RuleError("#{city} is not in #{state}") unless States[state].cities.include?(city) @city = city @state = state end end |
There's another word you've probably heard for this: encapsulation. A common explanation of encapsulation includes computed vs stored values. Does the circle calculate its area on the fly or does it return an instance variable? Clients don't have to know!
That leads to a very incomplete understanding of encapsulation. Encapsulation really means to expose behaviors, not data slots.
It's obviously very important to test your code, and writing specs can help to provide some of that safety net. But you'll hear sayings such as "test-driven development isn't about testing, it's about design."
Unit testing (before or after) helps keep your design simple. If you can't instantiate and use an object in a barebones test harness, you're going to have bigger problems down the road.
Exposing behaviors instead of data is the keystone of good OO design. If you're not exposing behavior, nothing will help you, not even SOLID.
The enlightenment of TDD is that you invent a good design via objects that are easy to create and use. The enlightenment of BDD is that good design is made even better by focusing on behavior. You get all the benefits of TDD because you go through the same steps (in fact, BDD is TDD done right).
How do you focus on behavior? Well, that's a mental shift you must make when programming. You can use tools to keep you on track. Two that I can think of off the top of my head are descriptive strings and mock objects.
Example descriptions should point to the behavior:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
it "should change the user's address" do @view.city_field = 'Grand Junction' @view.state_field = 'CO' @controller.change_address @user.city.should == 'Grand Junction' @user.state.should == 'CO' end it "should show a message when changing the user's address fails" do @view.city_field = 'Grand Junction' @view.state_field = 'AK' @controller.change_address @view.error_message.should == "Grand Junction is not in AK" @user.city.should == @original_city @user.state.should == @original_state end |
Both examples mention "changing the user's address" suggesting that there's a "change address" behavior on the User object.
Of course you can do it with Test::Unit
1 2 3 |
def test_should_change_users_address ... end |
Mock objects also help you focus on the behavior:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
it "should change the user's address" do @view.city_field = 'Grand Junction' @view.state_field = 'CO' @user.should_receive(:change_address).with('Grand Junction', 'CO') @controller.change_address end it "should show a message when changing the user's address fails" do @view.city_field = 'Grand Junction' @view.state_field = 'AK' @user.should_receive(:change_address).with('Grand Junction', 'AK').and_raise_error('Grand Junction is not in AK') @controller.change_address @view.error_message.should == "Grand Junction is not in AK" end |
With example descriptions, you have to make the connection between between that text and the behavior it implies, including the correct placement for it. Mock objects are more explicit in pointing to where behavior should go.
Anyway, those are my basic thoughts on behavior, why BDD is valuable, and how tools like RSpec can help out.