Mutating and Non-Mutating Methods in Ruby

This is the second in a series of three articles that discuss how Ruby manipulates variables and objects, and, in particular, how objects are passed around in a Ruby program. In the Understand Variable References and Mutability article, we explored how Ruby uses variables — variables don’t actually contain values, but instead serve as references to objects. We also discussed the concepts of object mutability and immutability, and introduced the concepts of pass by value and pass by reference.

In this article, we discuss methods and how they can be mutating or non-mutating with respect to certain arguments. We focus special attention on assignment and concatenation, two operations that cause a lot of confusion for new Rubyists.

Mutating and Non-Mutating Methods

Methods can be either mutating or non-mutating. As you might expect, mutating methods change something; non-mutating methods do not. Whether a method is mutating or non-mutating always depends on which object we are talking about: the question is whether that particular object is changed. For example, the method String#sub! is mutating with respect to the String receiver object, but non-mutating with respect to its other arguments. (Note that the receiver is itself a method argument.)
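To make the String#sub! case concrete, here is a minimal sketch. The variable names are ours, chosen only for illustration, and we call .dup on the literal so the example does not depend on whether string literals are frozen in your Ruby version.

```ruby
greeting    = 'hello world'.dup   # .dup guarantees an unfrozen String
pattern     = 'world'
replacement = 'Ruby'
id_before   = greeting.object_id

greeting.sub!(pattern, replacement)

puts greeting                          # hello Ruby
puts greeting.object_id == id_before   # true: the receiver was changed in place
puts pattern                           # world (argument untouched)
puts replacement                       # Ruby  (argument untouched)
```

The receiver changes while keeping its identity; the two arguments are only read.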

Non-Mutating Methods

A method is said to be non-mutating with respect to an argument (including its receiver) if it does not modify that argument. Most methods you will encounter do not mutate their arguments or receiver. Some mutate their receiver, but few mutate any other arguments.

All methods are non-mutating with respect to immutable objects. A method simply can’t modify an immutable object. Thus, any method that operates on numbers and boolean values is guaranteed to be non-mutating with respect to that value.
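As a small illustration, consider the sketch below. The increment method is a hypothetical helper, not part of Ruby. It can rebind its own local variable, but it has no way to change the Integer that the caller passed in.

```ruby
# Hypothetical helper used only for this illustration.
def increment(number)
  number += 1   # rebinds the local variable; the caller's object is untouched
  number
end

count  = 5
result = increment(count)

puts count           # 5
puts result          # 6
puts count.frozen?   # true: Integers are always frozen
puts true.frozen?    # true: so are true, false, and nil
```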

Assignment is Non-Mutating

Of particular interest when discussing non-mutating methods is assignment with =. As we saw in the Variable References and Mutability article, assignment merely tells Ruby to bind an object to a variable. This means that assignment does not change an object; it merely connects the variable to a new object. While = is not an actual method in Ruby, it acts like a non-mutating method, and should be treated as such.

Take a moment to study this code:

def fix(value)
  value.upcase!
  value.concat('!')
  value
end

s = 'hello'
t = fix(s)

When this code runs, what values do s and t have?

We start by passing s to fix; this binds the String represented by “hello” to value. In addition, s and value are now aliases for the String.

Next, we call #upcase! which converts the String to uppercase. A new String is not created; the String that is referenced by both s and value now contains the value "HELLO".

We then call #concat on value, which also modifies value instead of creating a new String; the String now has a value of "HELLO!", and both s and value reference that object.

Finally, we return a reference to the String and store it in t.

The only place we create a new String in this code is when we assign “hello” to s. The rest of the time, we work directly with that object, modifying it as needed. Thus, both s and t reference the same String, and that String has the value 'HELLO!'. You can verify this yourself by running this code in irb:

>> def fix(value)
--   value.upcase!
--   value.concat('!')
-- end
=> :fix

>> s = 'hello'
=> "hello"

>> s.object_id
=> 70363946430440

>> t = fix(s)
=> "HELLO!"

>> s
=> "HELLO!"

>> t
=> "HELLO!"

>> s.object_id
=> 70363946430440

>> t.object_id
=> 70363946430440

Let’s modify the original code slightly:

def fix(value)
  value = value.upcase
  value.concat('!')
end

s = 'hello'
t = fix(s)

Now what happens with s and t?

In this modified code, we assign the return value of value.upcase back to value. Unlike #upcase!, #upcase does not modify the String referenced by value; instead, it creates a new copy of the String referenced by value, modifies the copy, and then returns a reference to the copy. We then bind value to the returned reference.

The rest of the program runs as before, but if you look at the results in irb, you’ll see that things are quite different:

>> def fix(value)
--   value = value.upcase
--   value.concat('!')
-- end
=> :fix

>> s = 'hello'
=> "hello"

>> s.object_id
=> 70349169469400

>> t = fix(s)
=> "HELLO!"

>> s
=> "hello"

>> t
=> "HELLO!"

>> s.object_id
=> 70349169469400

>> t.object_id
=> 70349169435840

s and t now reference different objects, and the String referenced by s has not been modified. What happened here?

Let’s modify our code again:

def fix(value)
  puts "initial object #{value.object_id}"
  value = value.upcase
  puts "upcased object #{value.object_id}"
  value.concat('!')
end

s = 'hello'
puts "original object #{s.object_id}"
t = fix(s)
puts "final object #{t.object_id}"

If you run this code, you will see something like this:

original object 70349169469400
initial object 70349169469400
upcased object 70349169435840
final object 70349169435840

This shows that value = value.upcase bound the return value of value.upcase to value; value now references a different object than it did before. Prior to the assignment, value referenced the same String as referenced by s, but after the assignment, value references a completely new String: the String referenced by #upcase’s return value.

Assignment always binds the target variable on the left hand side of the = to the object referenced by the right hand side. The object originally referenced by the target variable is never modified.
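The same rule can be seen outside of a method. In this sketch (the variable names are ours), two variables start out as aliases for one String, and an assignment then rebinds one of them without touching the object they shared.

```ruby
original = 'hello'
value    = original           # value and original are aliases for one String

value = value.upcase          # rebinds value; the 'hello' String is untouched

puts original                 # hello
puts value                    # HELLO
puts original.equal?(value)   # false: the variables now reference different objects
```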

However, be aware that any mutating operations prior to the assignment may still take place:

def fix(value)
  value << 'xyz'
  value = value.upcase
  value.concat('!')
end
s = 'hello'
t = fix(s)

This program modifies the original string so its value is helloxyz. However, thanks to the assignment, it is not changed to HELLOXYZ or HELLOXYZ!; those changes occur in a different object that gets returned by the method.
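If you want to check this yourself, the following sketch prints both Strings after the call. It is the same method with comments added; the only change is the .dup on the literal, which guarantees an unfrozen String regardless of Ruby version.

```ruby
def fix(value)
  value << 'xyz'         # mutates the caller's String
  value = value.upcase   # rebinds value to a new String
  value.concat('!')      # mutates only the new String
end

s = 'hello'.dup          # .dup guarantees an unfrozen String
t = fix(s)

puts s                   # helloxyz
puts t                   # HELLOXYZ!
puts s.equal?(t)         # false
```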

These types of issues arise not only with assignment, but also with assignment operators like *=, +=, and %=. These are all implemented in terms of assignment, and that assignment always causes the target to reference a possibly different object. None of these operations mutate their operands.

This can be confusing at times. For instance:

>> s = 'Hello'
=> "Hello"

>> s.object_id
=> 70101471465440

>> s += ' World'
=> "Hello World"

>> s
=> "Hello World"

>> s.object_id
=> 70101474966820

Though it looks as if we are modifying s when we write s += ' World', we are actually creating a brand-new String with a new object id, and then binding s to that new object. We can see by looking at the object ids that a new object is created.
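Keeping a second reference to the original object makes the effect easier to see. In this sketch (variable names are ours), the alias still sees the old value after +=, and the same holds for Arrays.

```ruby
s         = 'Hello'
same_as_s = s                # a second reference to the same String

s += ' World'                # shorthand for s = s + ' World'

puts s                       # Hello World
puts same_as_s               # Hello

numbers      = [1, 2]
same_numbers = numbers       # a second reference to the same Array

numbers += [3]               # shorthand for numbers = numbers + [3]

puts numbers.inspect         # [1, 2, 3]
puts same_numbers.inspect    # [1, 2]
```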

If you are new to Ruby, this will trip you up. It’s guaranteed. It’s probably already happened; it’s likely why you are reading this article.

Note the word “possibly” in “causes the target to reference a possibly different object”. The reason for this can be seen by running yet another variation on our #fix method:

>> def fix(value)
--   value = value.upcase!
--   value.concat('!')
-- end
=> :fix

>> s = 'hello'
=> "hello"

>> s.object_id
=> 70363946430440

>> t = fix(s)
=> "HELLO!"

>> s
=> "HELLO!"

>> t
=> "HELLO!"

>> s.object_id
=> 70363946430440

>> t.object_id
=> 70363946430440

This time, though we assigned a reference to value, we end up with both s and t referring to the same object. The reason for this is that String#upcase! returns a reference to the original receiver, value, whenever it actually changes the String (it returns nil when there is nothing to change). Since the reference returned by value.upcase! is the same (albeit modified) String we started with, the assignment effectively rebinds value back to the object it was previously bound to; nothing is changed by the assignment.
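One caveat with this variation: String#upcase! returns nil when the String is already uppercase, so assigning its return value is fragile. The sketch below reuses the same fix method and shows both outcomes; we don't show the exact error message because its wording varies between Ruby versions.

```ruby
def fix(value)
  value = value.upcase!   # nil when value is already all uppercase
  value.concat('!')
end

puts fix('hello'.dup)     # HELLO!

begin
  fix('HELLO'.dup)        # upcase! returns nil here, so nil.concat is attempted
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```

This is one reason to call mutating methods for their side effects rather than for their return values.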

Mutating Methods

A method is said to be mutating with respect to an argument or receiver if it modifies the argument or receiver.

Consider the String#strip! method that removes leading and trailing whitespace from a String object:

>> s = '   hey   '
=> "   hey   "

>> s.object_id
=> 70101479494960

>> s.strip!
=> "hey"

>> s.object_id
=> 70101479494960

Here, we mutate the original String object; s references the same object both before and after #strip! is called. Only the state of the object has been changed.

Many, but not all, methods that mutate their receiver use ! as the last character of their name. However, this is not guaranteed to be the case. For instance, String#concat is a mutating method, but it does not include a !.
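String#concat is not alone. The sketch below exercises a few other core methods that mutate their receiver despite having no ! in their names; it is a small sample, not a complete list, and the variable names are ours.

```ruby
text    = 'abc'.dup          # .dup guarantees an unfrozen String
text_id = text.object_id

text.concat('def')           # text is now "abcdef"
text.insert(0, '>')          # text is now ">abcdef"
text.replace('xyz')          # text is now "xyz"

puts text                           # xyz
puts text.object_id == text_id      # true

list    = [3, 1, 2]
list_id = list.object_id

list.push(4)                 # list is now [3, 1, 2, 4]
list.delete(1)               # list is now [3, 2, 4]
list.unshift(0)              # list is now [0, 3, 2, 4]

puts list.inspect                   # [0, 3, 2, 4]
puts list.object_id == list_id      # true
```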

There are several common methods that sometimes cause confusion: #[]=, #<<, and setter methods.

Indexed Assignment is Mutating

Indexed assignment, such as that used by String, Hash, and Array objects, can be confusing:

str[3] = 'x'
array[5] = Person.new
hash[:age] = 25

This looks exactly like assignment, which is non-mutating, but is, in fact, mutating. #[]= modifies the original object (the String, Array, or Hash). It doesn’t change the binding of each variable.

Consider this code:

def fix(value)
  value[1] = 'x'
  value
end

s = 'abc'
t = fix(s)
p s            # "axc"
p t            # "axc"
p s.object_id  # 70349153406320
p t.object_id  # 70349153406320

Earlier, we saw similar code that merely assigned to value, and we saw that performing assignment bound value to a completely new String. Thus, s and t referenced different objects.

Here, though, we are using indexed assignment instead, and, perhaps surprisingly, the binding does not change. Even after the assignment to value[1], value still references the same (albeit mutated) String object.

The reason for this is that indexed assignment is a method that a class must supply if it needs indexed assignment. This method is named #[]=, and #[]= is expected to mutate the object to which it applies. It does not create a new object.
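To see that #[]= is an ordinary method, we can define it ourselves. The Scoreboard class below is hypothetical and exists only for this sketch; it stores scores in a Hash and supports both indexed reads and indexed assignment.

```ruby
# Hypothetical class used only to illustrate a user-defined #[]=.
class Scoreboard
  def initialize
    @scores = {}
  end

  # Indexed read: board[:alice]
  def [](name)
    @scores[name]
  end

  # Indexed assignment: board[:alice] = 10
  def []=(name, score)
    @scores[name] = score   # changes this object's state; no new Scoreboard is made
  end
end

board     = Scoreboard.new
id_before = board.object_id

board[:alice] = 10          # syntactic sugar for board.[]=(:alice, 10)

puts board[:alice]                     # 10
puts board.object_id == id_before      # true
```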

Let’s examine this with an Array:

>> a = [3, 5, 8]
=> [3, 5, 8]

>> a.object_id
=> 70240541515340

>> a[1].object_id
=> 11

>> a[1] = 9
=> 9

>> a[1].object_id
=> 19

>> a
=> [3, 9, 8]

>> a.object_id
=> 70240541515340

Here, we can see that we have mutated the Array a by assigning a new value to a[1], but have not created a new Array. a[1] = 9 isn’t assigning anything to a; it is assigning 9 to a[1]; that is, this assignment changes a[1] so that it references the new object 9. You can see this by looking at a[1].object_id both before and after the assignment. Despite this change, though, a itself still points to the same (now mutated) Array we started with.

This is normal behavior when working with objects that support indexed assignment: the assignment does cause a new reference to be made, but it is the collection element (e.g., a[1]) that is bound to the new object, not the collection (enclosing object) itself.
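A short sketch with an Array of Strings shows the same thing from the element's side (the variable names are ours). The object that used to sit at index 0 is not modified; the Array simply stops referencing it.

```ruby
words      = ['apple', 'pear']
first_word = words[0]     # first_word and words[0] reference the same String

words[0] = 'fig'          # mutates the Array: element 0 now references a new String

puts words.inspect        # ["fig", "pear"]
puts first_word           # apple
```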

Concatenation is Mutating

The #<< method used by collections like Arrays, as well as the String class, implements concatenation; this is very similar to the += operator. However, there is a major difference: += is non-mutating, but #<< is mutating. Let’s look at an example that uses String#<<:

>> s = 'Hello'
=> "Hello"

>> s.object_id
=> 70101471465440

>> s << ' World'
=> "Hello World"

>> s
=> "Hello World"

>> s.object_id
=> 70101471465440

This example is nearly identical to our earlier example using +=, but with one major difference: we use #<< instead of +=. The #<< method is mutating with respect to its receiver (s here), so the object referenced by s is modified; no new objects are created, so s still references the same object it did prior to the #<< call.

The << operator is actually a method that is defined for some classes. It is usually used as a shorthand for appending new values to a collection or String. Such classes define << to mutate their left-hand operand (the receiver object).
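Because << is a method, our own classes can define it too. The Playlist class below is hypothetical and only sketches the usual convention: mutate the receiver, then return it so that calls can be chained.

```ruby
# Hypothetical class used only to illustrate a user-defined #<<.
class Playlist
  attr_reader :songs

  def initialize
    @songs = []
  end

  def <<(song)
    @songs << song   # Array#<< mutates the underlying Array
    self             # returning the receiver allows chaining
  end
end

playlist  = Playlist.new
id_before = playlist.object_id

playlist << 'Intro' << 'Outro'

puts playlist.songs.inspect               # ["Intro", "Outro"]
puts playlist.object_id == id_before      # true
```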

Setters are Mutating

Setters are very similar to indexed assignment; they are methods that are defined to modify the state of an object. Both employ the something = value syntax, so they superficially look like assignments. With indexed assignment, the elements of a collection (or the characters of a String) are replaced; with setters, the state of the object is modified, usually by modifying an instance variable.

Setter invocation looks like this:

person.name = 'Bill'
person.age = 23

This looks exactly like assignment, which is non-mutating, but, since these are setter calls, they actually mutate the object bound to person.

We won’t go into a lot of detail to illustrate this; suffice to say that a detailed discussion would be nearly identical to the discussion of indexed assignment.

It’s possible to define setter methods that don’t mutate the original object. Such setters should still be treated as mutating since they don’t create new copies of the original object.
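A brief sketch contrasts a setter call with plain assignment inside a method. The Person class here is a minimal stand-in that we assume for illustration, and rename and replace are hypothetical helpers.

```ruby
# Minimal stand-in class, assumed for this illustration.
class Person
  attr_accessor :name, :age   # defines name, name=, age, and age=
end

def rename(person, new_name)
  person.name = new_name      # setter call: mutates the caller's object
end

def replace(person)
  person = Person.new         # plain assignment: rebinds the local variable only
  person.name = 'Ted'         # mutates the new object, not the caller's
end

bill = Person.new
bill.name = 'Bill'

rename(bill, 'William')
puts bill.name                # William

replace(bill)
puts bill.name                # William
```

The setter call reaches the caller's object, while the assignment in replace only affects the method's own local variable.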

Refining the Mental Model

What does this have to do with whether Ruby is pass by value or pass by reference? The mere fact that Ruby can have methods that mutate its arguments would seem to say that Ruby must use pass by reference in some circumstances. Arguments that are passed by copy cannot be mutated, so Ruby must use pass by reference when a method can mutate its arguments.

More importantly, the question of whether Ruby is pass by value or pass by reference usually concerns whether a method will mutate its arguments or receiver. With this discussion, we’re better equipped to determine whether a method will mutate its arguments or receiver.

The presence of a ! at the end of a method name is a pretty good indicator that a method mutates its receiver. However, not all mutating methods use the ! convention. In such cases, you need to look at the source code of the method to see what operations are performed. Certain operations, like setters and indexed assignments, should always be treated as mutating methods; others, like assignment and the assignment operators (+=, *=, etc.), are always non-mutating.
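When reading the source isn't convenient, one rough experiment is to call the method on a frozen copy of an object and see whether Ruby raises FrozenError. The mutates_receiver? helper below is hypothetical, and the check is only a heuristic: it looks at the receiver alone, and says nothing about other arguments or about objects nested inside the receiver.

```ruby
# Hypothetical helper: returns true if calling method_name on a frozen copy
# of object raises FrozenError, which suggests the method mutates its receiver.
def mutates_receiver?(object, method_name, *args)
  object.dup.freeze.public_send(method_name, *args)
  false
rescue FrozenError
  true
end

puts mutates_receiver?('hello', :upcase)        # false
puts mutates_receiver?('hello', :upcase!)       # true
puts mutates_receiver?('hello', :concat, '!')   # true
puts mutates_receiver?([3, 1, 2], :sort)        # false
puts mutates_receiver?([3, 1, 2], :<<, 4)       # true
```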

While none of this modifies our mental model for object passing, it is all consistent with that mental model. Immutable objects still seem to be passed by value, while mutable objects seem to be passed by reference. What we have done, though, is show that assignment can break the binding between an argument name and the object it references. This is important to keep in mind when examining the relationships between variables and objects.

Conclusion

In this article, we’ve seen that methods in Ruby can be mutating or non-mutating with respect to individual arguments, including the receiver object. A method that does not modify its arguments or receiver is non-mutating with respect to those objects; a method that does modify its arguments or receiver is mutating with respect to the modified objects.

We’ve also learned that assignment in Ruby acts like a non-mutating method — it doesn’t modify any objects, but does alter the binding for the target variable. However, the syntactically similar indexed assignment and object setter operations are mutating. We’ve also seen that the #<< operator — when used for concatenation operations — is mutating, while the very similar operation performed by += is non-mutating.

We’re now ready to dive more deeply into the topic of whether Ruby uses pass by value or pass by reference. Continue reading in the Object Passing in Ruby – by Reference or by Value? article.