Four common fixes when you’re upgrading to Ruby 1.9

With Ruby 2.0 right around the corner, it might seem somewhat…behind the times to do a post on upgrading your project to Ruby 1.9. Nevertheless, it’s something I’ve been working on for a few days now for one of 12 Spokes’s clients, and I think there are a few common issues I can comment on to help others making the transition. Although this is not an exhaustive list, these are the 4 things I ran into most often when trying to make all the little red dots turn green in my 1.9 branch.

1) relative path require statements

Much Ruby code written before 1.9 relies on the fact that the current directory (‘.’) was always on the load path. It was removed in 1.9 because of security and specificity concerns about resolving requires relative to whatever directory something happens to be run from. This is not a new idea (http://www.faqs.org/faqs/unix-faq/faq/part2/section-13.html), and it’s not really the purpose of this article to argue for or against that change. The reality of 1.9 is that every require statement that depends on a path relative to the current directory will break. You could just add ‘.’ back to the load path, but that does nothing about the security concerns that got it removed in the first place. If you’re committing all the way to 1.9.2 or higher, the easiest thing to do is to use the new require_relative method, which takes a path relative to the current file. If, however, you still need your code to be backwards compatible with 1.8 for a time (as I do), you may wish to instead expand to the absolute path in your require statements (it’s a bit verbose, but it makes everything work and it’s hard to mistake its intent).

Thus this statement:


require File.dirname(__FILE__) + '/some_helper'

Would become:


require File.expand_path(File.dirname(__FILE__) + '/some_helper')

And things start working again!
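If you have many of these to convert, the pattern can be wrapped up once. The following is only a sketch: require_sibling is a hypothetical helper name (not part of Ruby or of our codebase), and the temporary directory and file names exist purely so the example runs on its own. On 1.9.2 and later, require_relative 'some_helper' does the same job without a helper.

```ruby
require 'tmpdir'

# Hypothetical helper (not a Ruby built-in): require a file that sits
# next to `from_file`, using an absolute path so that the result does
# not depend on '.' being on the load path.
def require_sibling(from_file, name)
  require File.expand_path(name, File.dirname(from_file))
end

# Demo setup: write a throwaway helper into a temp directory.
Dir.mktmpdir do |dir|
  File.open(File.join(dir, 'some_helper.rb'), 'w') do |f|
    f.puts 'SOME_HELPER_LOADED = true'
  end

  # Pretend we are a file living in the same directory as the helper.
  # In real code you would pass __FILE__ here.
  pretend_caller = File.join(dir, 'caller.rb')
  require_sibling(pretend_caller, 'some_helper')
end
```

In a real file the call would simply be require_sibling(__FILE__, 'some_helper'), which behaves the same on 1.8 and 1.9.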

2) encoding issues for string literals

I ran into a number of problems with exotic string literals in the upgrade process because, frankly, 1.8 didn’t really care one way or another: what you did with your string was your business. Ruby 1.9 is a little more conservative. It demands to know what the encoding of a string is and tries to interpret it fairly strictly. I agree that this is for the better and will prevent a lot of weird data output, but it does require some grunt work when you’re making the transition. By far the most common error I ran into was something like this:

invalid multibyte char (US-ASCII)

Although a full treatment of this issue is beyond the scope of what I want to provide here (dive deep if you wish), the basic problem here is that unless you tell ruby otherwise it will assume that your source files are written with US-ASCII encoding.

If you then have a string literal in your code that has a NON-ascii character:

language = "Español"

Then Ruby won’t even finish parsing the file before it chokes and gives you the “invalid multibyte char” error shown above. At its simplest, ASCII is a character set that was designed around English. It defines only 128 characters (7 bits, so never more than one byte each), and with all the many characters in the world there just aren’t enough bit patterns to encode them all. If you want to use those characters, you need a character encoding that can handle bigger character data; most commonly this is going to be UTF-8, and if you don’t know why that is then I hereby refer you to the venerable Joel Spolsky on the topic.

In any case, if you tell ruby to use UTF-8 instead of ASCII for that file, then it won’t have any trouble processing a character outside the normal ASCII character set, and you can do this with a simple ‘magic comment’ at the top of the file:


# encoding: UTF-8
language = "Español"

No problem.
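To see what the magic comment actually changes, you can ask Ruby about the string. This is a small sketch rather than anything from our codebase; the variable names are made up for illustration, and it assumes the file itself is saved as UTF-8.

```ruby
# encoding: UTF-8

language = "Español"

# With the magic comment, literals in this file are tagged as UTF-8.
source_encoding = __ENCODING__
string_encoding = language.encoding

# Seven characters, but eight bytes: the ñ takes two bytes in UTF-8.
char_count = language.length
byte_count = language.bytesize

# Relabelling the same bytes as US-ASCII shows why the parser complained:
# the two bytes of the ñ are not valid ASCII.
as_ascii = language.dup.force_encoding("US-ASCII")
ascii_ok = as_ascii.valid_encoding?

puts "#{source_encoding} / #{string_encoding}"
puts "#{char_count} characters, #{byte_count} bytes, valid as ASCII? #{ascii_ok}"
```

Note that force_encoding only changes the label on the bytes and does not convert anything, which makes it handy for diagnosing this kind of error.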

3) yielding splats

This was a sneaky one! Follow me here, as it’s a little indirect. We had a chunk of code that found a small set of values and yielded them to a block with the splat operator. For example:

def aggregator_method
  values = calculate_array_of_values
  yield *values
end

In an area of the codebase that utilized this method, we had something like the following:

def heavy_lifting
  massive_setup
  aggregator_method do |values|
    much_processing(values.first, values.last)
  end
end

If it’s not obvious what the issue is here, the splat operator (*) takes an array and spreads it out, feeding one element per parameter to the block/proc/method it’s being passed to. This really was an example of us using the operator in the wrong place: since we were expecting to work with an array, it would have been fine to just pass the array through without a splat. It turns out, though, that Ruby 1.8 was more than happy to oblige us. If your block takes just one argument, 1.8 gathers the splatted values back into an array and passes the whole thing as that argument. 1.9, on the other hand, sticks to its guns: if you yield a splatted array and your target block only takes one argument, it just passes the first element of the array. In my opinion that is the correct behavior, but it borked the code above because the first element of the array was not itself an array and thus had no first or last methods. So, although this is an edge case, it’s useful to know this gotcha is out there.
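Here is a stripped-down sketch of the 1.9 behavior and two ways out of it. The method names and the [1, 2, 3] values are invented for the example and stand in for our real aggregator.

```ruby
# Yields the values splatted, as our original code did.
def aggregator_with_splat
  values = [1, 2, 3]
  yield(*values)
end

# Fix A: yield the array itself, no splat.
def aggregator_without_splat
  values = [1, 2, 3]
  yield values
end

# On 1.9+, a one-parameter block only receives the first splatted value.
received_single = nil
aggregator_with_splat { |values| received_single = values }

# Fix B: keep the splat, but have the block collect the arguments.
received_collected = nil
aggregator_with_splat { |*values| received_collected = values }

# Fix A in use: the block gets the whole array again.
received_array = nil
aggregator_without_splat { |values| received_array = values }

puts received_single.inspect
puts received_collected.inspect
puts received_array.inspect
```

We went with the first fix (dropping the splat), since the callers wanted an array all along.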

4) String as an Enumerable

There were many places in our codebase where a string was being treated as something to be iterated over (.map, .each, etc). Ruby 1.8 is cool with this and assumes you mean a division by lines, but really a string is a sequence of a lot of things: bytes, characters, words, lines. Ruby 1.9 removes the ambiguity: String itself is no longer enumerable, and you need to specify what it is you want to iterate over. In the codebase I was working on, this feature was mostly being used to accept either a string (single line) or an ARRAY of strings as a parameter to a method, treating them equally without doing any checking:


def transform(strings)
  strings.map { |s| s.upcase }
end

In 1.8 the above works with a string or with an array of strings. The quick fix for this was to simply wrap any single string passed in on its own in a single-element array (although in some places we did an explicit check of what the object responded to and provided an array wrapper around the object if necessary, depending on what made the API subjectively more pleasant to read).
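One way to write that wrapper so it behaves the same on 1.8 and 1.9 is sketched below. The [strings].flatten idiom is my choice for the example, not necessarily what every call site ended up using, and the sample strings are made up.

```ruby
# Accepts a single string or an array of strings.
# Wrapping in an array and flattening normalizes both cases.
def transform(strings)
  [strings].flatten.map { |s| s.upcase }
end

from_string = transform("hello")
from_array  = transform(["hello", "world"])

# When you really do want to walk through a string, 1.9 makes you say
# which kind of sequence you mean.
lines = "one\ntwo".each_line.to_a
chars = "hi".each_char.to_a
bytes = "hi".each_byte.to_a

puts from_string.inspect
puts from_array.inspect
puts lines.inspect, chars.inspect, bytes.inspect
```

Be aware that flatten will also flatten nested arrays, which was fine for our methods but may not be what every API wants.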

Just Do It

Honestly, although the task seemed large at the outset, so many of the problems were variations on the above themes that a few days of dedication ground through the vast majority of them, even though our client had a relatively large codebase compared to your average Rails app. The performance gains alone are making this shift worth it for us, and with Ruby 2.0 launching soon, I wouldn’t want to be too far behind when common libraries start dropping support for Ruby 1.8 entirely. Good luck, and good (bug) hunting!