Classes, Hashes, Structs and OpenStructs in Ruby

Intro

If I want to maximize the performance of a Ruby app, what should I be using: Hashes, Structs, OpenStructs, or Classes? New Ruby programmers love hashes. I love hashes. They are very flexible, and it is tempting to add them everywhere, even when refactoring a slow part of a system built on Kafka microservices so that the whole feature can go fast, fast, fast! They sneak in everywhere, and I am supposed to be coding for fun, right? No? OK. Back to making stuff fast.

Short Answer

If you are just looking for the speed leaderboard:

  1. Classes
  2. Structs
  3. Hashes
  4. OpenStructs

OpenStructs are very much in last place, and that tortoise isn’t becoming a hare anytime soon. If you are concerned with speed, you can go ahead and remove them from your codebase. But why? What is behind this, and are there any trade-offs that compensate? The same questions apply to the rest of the rankings. Let’s dive in.
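If you would rather measure than take the leaderboard on faith, here is a minimal sketch using Ruby’s standard Benchmark module. The names PointClass, PointStruct, WORKLOADS and time_containers are made up for this example, and the workload (create one object, read one attribute) is an assumption about what "fast" means. Absolute numbers, and possibly the gaps between entries, will vary with your Ruby version and machine.

```ruby
require 'benchmark'
require 'ostruct'

# Hypothetical containers used only for this sketch.
class PointClass
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

PointStruct = Struct.new(:x, :y)

# Each workload creates one container and reads one attribute from it.
WORKLOADS = {
  "class"      => -> { PointClass.new(1, 2).x },
  "struct"     => -> { PointStruct.new(1, 2).x },
  "hash"       => -> { h = { x: 1, y: 2 }; h[:x] },
  "openstruct" => -> { OpenStruct.new(x: 1, y: 2).x }
}

# Returns a hash of workload name => elapsed seconds for n iterations.
def time_containers(n = 50_000)
  WORKLOADS.transform_values do |work|
    Benchmark.realtime { n.times { work.call } }
  end
end

# Print the results, fastest first.
time_containers.sort_by { |_, seconds| seconds }.each do |name, seconds|
  puts format("%-10s %.4fs", name, seconds)
end
```

This only times creation plus a single read. If your code creates objects once and reads them many times, adjust the lambdas to match before drawing conclusions.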

Hashes

Hashes are simple containers for your data. On the upside, they compare by value: two hashes with the same keys and values are equal, which can be useful.

d = {name: "J"}
# => {:name=>"J"}

r = {name: "J"}
# => {:name=>"J"} 

d == r
# => true 

On the downside, hashes don’t complain when you typo a key. That sucks, because an obvious error is easy to fix while a subtle typo can take hours to find and debug.

d = {name: "J"}
d[:nome]
# => nil
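If you are stuck with a hash, one way to make typos loud is Hash#fetch, which raises a KeyError for a missing key instead of returning nil. This is a small sketch of that idea, not something the rest of this post depends on.

```ruby
d = { name: "J" }

# Bracket access hides the typo.
missing = d[:nome]
# => nil

# fetch raises KeyError for a key that is not there.
begin
  d.fetch(:nome)
rescue KeyError => e
  puts "Caught: #{e.message}"
end

# fetch also accepts a default for keys that are truly optional.
fallback = d.fetch(:nome, "unknown")
# => "unknown"
```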

Hashes are for storing data and passing it around, not for behavior. Hashes can’t tell you when their data is wrong because they have no validations of any kind.

api_response = {status_code: 500, result: "success"}
# => {:status_code=>500, :result=>"success"} 

api_response[:result]
# => "success"  # NOPE! INTERNAL SERVER ERROR!!

api_response.success?
Traceback (most recent call last):
        6: from /home/user/.rvm/gems/ruby-2.7.2/bin/ruby_executable_hooks:24:in `<main>'
        5: from /home/user/.rvm/gems/ruby-2.7.2/bin/ruby_executable_hooks:24:in `eval'
        4: from /usr/share/rvm/rubies/ruby-2.7.2/bin/irb:23:in `<main>'
        3: from /usr/share/rvm/rubies/ruby-2.7.2/bin/irb:23:in `load'
        2: from /usr/share/rvm/rubies/ruby-2.7.2/lib/ruby/gems/2.7.0/gems/irb-1.2.6/exe/irb:11:in `<top (required)>'
        1: from (irb):8
NoMethodError (undefined method `success?' for {:status_code=>500, :result=>"success"}:Hash)

Classes

Defining a class is super simple, and a class can be expanded later.

class Book 
  attr_accessor :title, :author 

  def initialize(args) 
    @title = args.fetch(:title)
    @author = args.fetch(:author)
  end
end

colbook = Book.new(title: "I am America and so can you.", author: "Stephen Colbert")
colbook.author 
# => "Stephen Colbert"

Classes can also have behavior, which makes them superior to hashes when you need to interpret the data they encapsulate in a consistent manner.

class Book 
  attr_accessor :title, :author 

  def initialize(args) 
    @title = args.fetch(:title)
    @author = args.fetch(:author)
  end

  def title_page
    "#{title} by #{author}"
  end
end

colbook = Book.new(title: "I am America and so can you.", author: "Stephen Colbert")
colbook.author
# => "Stephen Colbert"

colbook.title_page
# => "I am America and so can you. by Stephen Colbert"

This means that if our API response from before had been a class instead of a hash, it could have inspected its own status code and returned false from a success? method. So a class is generally better than a hash, but we also don’t want to define one for every little thing.
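As a rough sketch of what that could look like, here is the API response written as a class. The name ApiResult and the choice of keyword arguments are made up for this example. One thing to notice: unlike a hash, a plain class compares by object identity, so value equality has to be written by hand.

```ruby
# Hypothetical class for illustration only.
class ApiResult
  attr_reader :status_code, :result

  def initialize(status_code:, result:)
    @status_code = status_code
    @result = result
  end

  # Trust the status code, not the free-text result field.
  def success?
    status_code == 200
  end

  # Without this method, == would only be true for the very same object.
  def ==(other)
    other.is_a?(ApiResult) &&
      status_code == other.status_code &&
      result == other.result
  end
end

response = ApiResult.new(status_code: 500, result: "success")
response.success?
# => false

response == ApiResult.new(status_code: 500, result: "success")
# => true
```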

Struct

Structs are quick classes. They are useful when you won’t need the data or behavior to be passed out to the whole app but you need something more resilient than a hash.

ApiResponse = Struct.new(:status_code, :result) do
  def success?
    status_code == 200
  end
end
response = ApiResponse.new(500, "success")
response.result
# => "success" 

response.status_code
# => 500

response.success?
# => false 

Structs are a lot faster than everything else on this list besides classes, and you can also compare two Structs to determine equality.

response1 = ApiResponse.new(200, "success")
response2 = ApiResponse.new(200, "success")

response1 == response2
# => true


response3 = ApiResponse.new(500, "Internal Server Error")
response1 == response3
# => false
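Beyond equality, a Struct also knows its own members and includes Enumerable, so it can be iterated over or turned back into an array or hash. A short sketch, reusing the ApiResponse shape from above without the success? method:

```ruby
ApiResponse = Struct.new(:status_code, :result)
response = ApiResponse.new(200, "success")

# The attribute names are fixed when the Struct is defined.
ApiResponse.members
# => [:status_code, :result]

# Values come back in member order.
response.to_a
# => [200, "success"]

# Enumerable methods iterate over the values.
response.map(&:to_s)
# => ["200", "success"]

# to_h gives you a plain hash when you need to pass the data along.
as_hash = response.to_h
as_hash[:status_code]
# => 200
```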

OpenStruct

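OpenStructs and Structs work differently. Struct.new defines a reusable class with a fixed set of attributes, an equality method (==), and enumerable behavior. OpenStruct.new skips the class and directly creates a single object whose attributes are whatever you pass in. Two OpenStructs with the same attributes still compare as equal.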

require 'ostruct'
paul1 = OpenStruct.new(name: "Paul", age: 12)
paul2 = OpenStruct.new(name: "Paul", age: 12)
paul1 == paul2
# => true

While an OpenStruct gives you no class body in which to define methods, one advantage it has is that you can add a new attribute after it has been created. You can’t do this with a Struct.

require 'ostruct'
person = OpenStruct.new(name: "Paul", age: 12)
person.name
# => "Paul"

person.age
# => 12 

person.hair_color
# => nil 
person.hair_color = "red"
person.hair_color
# => "red"
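For contrast, here is a small sketch of the same move against a Struct. The Person struct is made up for this example. Because the members are fixed at definition time, both reading and assigning an unknown attribute raise NoMethodError instead of quietly working.

```ruby
# Hypothetical struct for illustration only.
Person = Struct.new(:name, :age)
person = Person.new("Paul", 12)

# Assigning an attribute that was never declared fails loudly.
begin
  person.hair_color = "red"
rescue NoMethodError => e
  puts "Caught: #{e.class}"
end

# Reading one fails the same way, where the OpenStruct returned nil.
begin
  person.hair_color
rescue NoMethodError => e
  puts "Caught: #{e.class}"
end
```

That strictness is exactly the typo protection that hashes and OpenStructs lack.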

So OpenStructs come off pretty much as fancy hashes. They are extremely slow and should not be used anywhere you can avoid them.

Conclusion

So, OpenStructs look like a wrapper around a hash, and they are much slower than hashes, classes or structs. They are pretty much the worst option. If you just want to pack some data in without any behavior, then hashes are an OK option.

However, Structs are better than OpenStructs and Hashes not only because they are faster but because of their option to add behavior. Defining a struct means creating a small class that can be reused without writing up a whole new class file (or in Rails creating a new class that might be backed by a database table).

So the TL;DR of this whole post is that while hashes may be very familiar and OpenStructs very convenient, you should convert them to structs whenever performance or data integrity is important.
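One possible way to do that conversion with little friction is the keyword_init: true option of Struct.new (available since Ruby 2.5), which lets an existing hash be splatted straight into the struct. The name StrictApiResponse is made up for this sketch.

```ruby
# Hypothetical struct for illustration only.
StrictApiResponse = Struct.new(:status_code, :result, keyword_init: true) do
  def success?
    status_code == 200
  end
end

# An existing hash can be passed in with a double splat.
raw = { status_code: 500, result: "success" }
response = StrictApiResponse.new(**raw)

response.success?
# => false

# Converting back gives the original hash.
response.to_h == raw
# => true

# A misspelled key is rejected at construction time.
begin
  StrictApiResponse.new(status_code: 200, reslt: "success")
rescue ArgumentError => e
  puts "Caught: #{e.message}"
end
```

The hash-shaped call sites stay almost the same, but typos now fail at the boundary instead of turning into nil later on.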

Some good links:

https://palexander.posthaven.com/ruby-data-object-comparison-or-why-you-should-never-ever-use-openstruct
https://www.rubyguides.com/2017/06/ruby-struct-and-openstruct/
