Working with Ruby
Hi, I am Jan. This is my old Ruby blog. I still post about Ruby, but I now do it on idiosyncratic-ruby.com. You should also install Irbtools to improve your IRB.

Oh, this sweet and tasty syntactic sugar!

This article is written for people with experience in programming in general, but who are new to Ruby.
A German version is published in the offline magazine #2, a magazine by some students of TU Dresden.

The intention is to demonstrate some features of Ruby and to show what is so great about it:

A clean syntax combined with the possibility to adapt the language to given requirements flexibly.

This leads to very readable code. In larger projects, you often create a nice DSL (“Domain Specific Language”) “on the fly”. Furthermore, the interpreter takes care of common tasks for you, for example memory allocation or cumbersome iteration over a complex object. You should be able to concentrate on the essentials: solving the problem and writing maintainable code.

Ruby does this by following the “Principle of Least Surprise”. Language designer Yukihiro Matsumoto explains how this is to be understood: “The principle of least surprise means principle of least my surprise. And it means the principle of least surprise after you learn Ruby very well. For example, I was a C++ programmer before I started designing Ruby. I programmed in C++ exclusively for two or three years. And after two years of C++ programming, it still surprised me.” (source)

The Basics

Let us begin with the basic types (which are actually not so basic: they also come with a variety of methods). One of Ruby’s strengths is the comfortable handling of these frequently used data structures and classes. For example, you do not have to worry about integer limits – Ruby creates integers of any size for you (see listing 1).
The class String offers elegant ways for strings to inspect and modify themselves. Listing 2 gives some examples.

Listing 1
>> 2**2222 # raise 2 to 2222
=> 773838557787812127092191329117736395767580535204436915072674675545116664333902256211182137926950211751902449516673226756472839779783938771158021368882276356848070794209358070102043443325352136545050000110534739684220468927252571854522863758511196723438562910609190153214669067072101789621542172105425151134765107554703780036162346002513302884046411618787327146549168697301079918047772519772786759323456323386160178483395107203321577048282851869205647641463310698242417312882919566526471326406798360010017941481024087519190750874466247021692054677462758217679031488509940948129354330970633083278012074165038033940046145844629478331113175706858291419937603163400966242304

Listing 2
input_string = 'Hello'
input_string << ' Universe'
input_string.delete! 'o'

puts input_string # Hell Universe
puts input_string.empty? # false

Another “simple” data type is the symbol. Symbols look like identifiers that begin with a colon, for example :name. Unlike strings, they offer hardly any of the typical string operations. And they do not need to, because their purpose is to be identifier tokens. Every occurrence of the same symbol refers to exactly the same object (in contrast to strings). For this reason they work well as keys for hashes (see below) or for writing readable case statements.
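The identity difference between symbols and strings can be checked directly. The following is only a small sketch; the helper name same_object? is made up for this illustration.

```ruby
# Hypothetical helper: true if both arguments are the very same object.
def same_object?(a, b)
  a.equal?(b)
end

puts same_object?(:status, :status)            # true: one shared Symbol object
puts same_object?('status'.dup, 'status'.dup)  # false: two distinct String objects

# Converting between the two is always possible
puts :status.to_s               # status
puts 'status'.to_sym.inspect    # :status
```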

The most commonly used complex data types are arrays and hashes. Arrays are “collections” of objects that are indexed by numbers. They are dynamic: you can easily append new elements with the << operator. Some of the operations are demonstrated in listing 3. Hashes, however, take any object as a key, normally strings or symbols. In Ruby 1.8 the key-value pairs have no defined order; since Ruby 1.9 a hash remembers the order in which its pairs were inserted. The hash syntax can be seen in listing 4. If the last parameter of a method is a hash, you are also allowed to omit the curly braces of the hash when calling the method, as shown in listing 7.

Listing 3
>> a = []  # empty array
=> []
>> b = [1,4,7,10]  # non-empty array
=> [1, 4, 7, 10]
>> a << 2 << 4 << 6  # add some elements
=> [2, 4, 6]
>> a + b  # unite them
=> [2, 4, 6, 1, 4, 7, 10]
>> a - b  # a without the elements of b
=> [2, 6]
>> a & b # intersection
=> [4]

Listing 4
a = {}  # empty hash
b = {:a_key => 'is here', :another_key => 'is there'}  # a hash literal
b[:a_key] = 'invalid'
a[6] = 15
a.merge! b
a[6] = 99
p a.values.join ','  # "99,invalid,is there" 

c = {this: 'looks', almost: 'like json'} # a ruby 1.9 literal

A more specialized class is, for example, Range. An example is shown in listing 10.
Regular expressions are also easy to use, as can be seen in listing 5.

Listing 5
input_string = '...<a href="https://offline.ifsr.de">Offline Online</a> some more text.
And another link: <a href="https://heise.de/developer">So startet der Tag</a>...'

expr = /<a\s+?href="(\S*?)"/i

# one way to check for first occurrence
match = input_string.match expr
puts match[1] if match

# ...another way using a thread-local, Perl-like special variable and the match operator...
if input_string =~ expr
  puts $~[1]
end

# ...or get every match!
matches = input_string.scan(expr)  # scan always returns an array, possibly an empty one
puts matches unless matches.empty?

# in ruby 1.9 it's also possible to name your groups
if input_string =~ /<a\s+?href="(?<address>\S*?)"/i
  puts $~[:address]
end

Another important class is NilClass, which has only one instance: the nil object. A little quiz about it: what does the expression 0.nil? return? The answer: false. Why? Because 0 is an ordinary number and does not stand for an unknown or false value. The nil object, however, means something like ‘undefined’. Only nil and the false object count as false in boolean contexts; every other object, including 0 and the empty string, counts as true. This strict concept eliminates some possible mistakes, for example when analysing user input.

It is worth mentioning that you are the master of the type of a value, so you have to convert it yourself. The expression "5"*5 does not evaluate to 25. The method * is also implemented for the class String, but in a different way: the actual result is "55555" (also see “Duck Typing”).
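Both points, truthiness and manual conversion, can be tried out quickly. This is a small sketch; the helper truthy? is made up for this illustration.

```ruby
# Hypothetical helper: reports how Ruby treats a value in a boolean context.
def truthy?(value)
  value ? true : false
end

p truthy?(nil)    # false
p truthy?(false)  # false
p truthy?(0)      # true: 0 is an ordinary object
p truthy?('')     # true: so is the empty string

p '5' * 5           # "55555" (String#* repeats the string)
p '5'.to_i * 5      # 25 (explicit conversion first)
p Integer('5') * 5  # 25 (Integer() raises an ArgumentError for invalid input)
```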

When working with methods, you can also focus on solving the problem:

  • Parentheses around the arguments of simple method calls – not needed
  • return statement for the last statement of a method – optional
  • Type declarations – there are none

Methods are defined with the simple syntax def method_name...end. When you use the prefix self. within a class, you define class methods, which are called on the class itself. It is also possible to use an existing object, followed by a dot, as the prefix. That is the way to define methods on single objects (so-called singleton methods).
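The three kinds of definitions side by side, as a minimal sketch. The class Greeter and its methods are invented for this example.

```ruby
# Hypothetical class, only for illustration.
class Greeter
  # Class method: called on the class itself
  def self.default_greeting
    'Hello'
  end

  # Instance method: called on objects of the class
  def greet(name)
    "#{Greeter.default_greeting}, #{name}!"
  end
end

polite = Greeter.new
casual = Greeter.new

# Singleton method: defined on this one object only
def casual.greet(name)
  "Hi #{name}"
end

puts polite.greet('Sabine')  # Hello, Sabine!
puts casual.greet('Jan')     # Hi Jan
```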

Let us look at listing 2 again. By convention, methods that return a boolean value end with a question mark. “Dangerous” methods, for example those that modify the receiver object in place, end with an exclamation mark. You should follow this convention, because it helps to see quickly what a method does. Another handy feature: a method whose name ends with an equals sign can be called like an assignment. Listing 7 shows an example.

It is also good to know that most operators are implemented as ordinary methods, which are simply called without the dot and the brackets. For example, when you need a good name for a method of a custom class that adds something essential to it, why not just take <<? Square brackets can also be redefined, as listing 6 shows.

Furthermore, there are conventions and rules for choosing the names of methods and variables:

  • Constant and class names are capitalized
  • Method and variable names begin with a lowercase letter
  • Instance variables of an object are prefixed with @, class variables of a class with @@
  • Global variables (bad!) are prefixed with $

So the first character already tells you the kind and the scope of a variable. Another note on scope: unlike methods, the instance variables of an object are always private; there are no public ones. But it is very easy to define ‘getter’ and ‘setter’ methods, as listing 7 shows.

Listing 6
module UserStatus # define hash-like access for your own module/class
  @settings = {}
  @current_users = []

  def self.[](key)
    @settings[key]
  end

  def self.[]=(key,val)
    @settings[key] = val
  end

  def self.<<(another_user)
    @current_users << another_user
  end
end

UserStatus << 'Sabine'  # => ["Sabine"]
UserStatus << 'Jan'     # => ["Sabine", "Jan"]

UserStatus[:registration_method] = :email
puts UserStatus[:registration_method]  # :email

Listing 7
class Hello
  def initialize(input='',params = {})
    @string,@params = input,params
  end

  # a simple getter for @string
  def string
    @string
  end

  # a simple setter for @string
  def string=(value)
    @string = value
  end

  # an even simpler way of doing both for @params ;)
  attr_accessor :params # attr_accessor takes a symbol with the same name as the instance variable
end

some_var = Hello.new 'my_input',
                     :admin => true,
                     :user => 'Stefan'

puts some_var.params[:user]  if some_var.params[:admin]  # "Stefan"

Let us now take a look at some more complex program structures.

Exceptions

While coding, you want to concentrate on the main aspects. Nobody enjoys constantly checking for invalid values and then somehow having to leave the current methods, which are often nested as well. Like many other languages, Ruby lets you raise exceptions. This is easily done with raise 'message' (which raises a RuntimeError) or raise ExceptionClass, 'message'. When the interpreter reaches this line, it walks back up the call stack looking for a matching rescue clause. If none is found, the program exits. Examples can be found in listing 8. Note that, to keep the structure clear, a rescue clause has to belong to an enclosing construct such as a method definition or a section delimited by the keywords begin...end.

Listing 8
def divide(x,y)
  raise ArgumentError, "divided by 0"  if y == 0
  x / y
end

begin
  print 'Please enter the dividend: '; a = gets.to_f
  print 'Please enter the divisor: ';  b = gets.to_f
  puts divide a, b
rescue ArgumentError => e
  puts 'Please check your input! You ' << e.to_s
  retry
rescue
  puts "There's an unknown error! Exiting now..."
else  # There were no errors
  puts 'All calculations were successful :)'
ensure  # Always output this!
  puts 'Thank you for using this program.'
end

Iterators

Lots of (Java) programmers like using iterators to visit all the objects of a collection. So do Ruby programmers – they even prefer them to typical loop statements. Ruby has two kinds of iterators, which are shown below.

Take the array [1,2,3,4,5,6,7], over which we would like to iterate. The usual Ruby way is an internal iterator, as used, for example, in listing 9. The each method of the array is called with a Ruby block, whose “block parameter” holds the current object at each step.
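As a minimal sketch of such an internal iterator, here is each with a block. The helper even_numbers is made up for this illustration.

```ruby
# Hypothetical helper: collects the even numbers using an internal iterator.
def even_numbers(numbers)
  result = []
  numbers.each do |number|   # 'number' is the block parameter
    result << number if number.even?
  end
  result
end

p even_numbers([1, 2, 3, 4, 5, 6, 7])  # [2, 4, 6]
```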

With an internal iterator, the iteration is controlled by the interpreter. If you want to control it yourself, you need an external iterator. You get one by writing iterator_variable = [1,2,3,4,5,6,7].each without a block. Now you decide when to fetch the next value, simply by calling iterator_variable.next. Another application of this approach is parallel iteration over two objects. External iterators are only available in newer Ruby versions.
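Parallel iteration with two external iterators could look roughly like this. It is a sketch that assumes a Ruby version with Enumerator support; the helper pair_up is made up for this illustration.

```ruby
# Hypothetical helper: walks two collections in step using external iterators.
def pair_up(left, right)
  left_iter  = left.each   # without a block, each returns an Enumerator
  right_iter = right.each
  pairs = []
  loop do
    # next raises StopIteration at the end; loop rescues it and stops
    pairs << [left_iter.next, right_iter.next]
  end
  pairs
end

p pair_up([1, 2, 3], %w[one two three])  # [[1, "one"], [2, "two"], [3, "three"]]
```

The iteration stops as soon as one of the two collections is exhausted.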

Functional Style

There is also a lot of fun in store for fans of functional programming. A functional style is not always the best choice, but some of its ideas fit easily into an imperative programming language like Ruby. One example is the map method, which maps every element of an Enumerable to a new one with the help of an attached block. Another useful iterator method is inject: the attached block accumulates a single value. Both are demonstrated in listing 9.

The blocks mentioned so often are essentially anonymous functions. They are an essential part of Ruby and are easy to use in your own methods. Any method can be called with a block; inside the method, the block is invoked with the keyword yield. If you need the block as an object, declare the last method parameter with the prefix &: the block is then available as a Proc under that name.
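Both ways of using a block in your own method, as a small sketch. The method names twice and apply_to_all are invented for this example.

```ruby
# Hypothetical helper: calls the attached block twice via yield.
def twice
  yield 1
  yield 2
end

# Hypothetical helper: captures the block as a Proc object named 'block'.
def apply_to_all(items, &block)
  items.map { |item| block.call(item) }
end

twice { |n| puts "call number #{n}" }
p apply_to_all([1, 2, 3]) { |x| x * 10 }  # [10, 20, 30]
```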

Because object-oriented programming is an important element of Ruby, blocks can also be stored as objects: either as a plain Proc or as a lambda. Both are instances of the class Proc, but they differ in some details, for example in how strictly they check their arguments and in how return behaves. More information is available in the documentation; listing 9 briefly shows the basic use. Also keep in mind that the behaviour of block objects changed a lot between Ruby 1.8 and 1.9.

Listing 9
>> a = [1,2,3,4,5].map{|ele| ele*ele}
=> [1, 4, 9, 16, 25]
>> a.inject{|product,ele| product*ele}
=> 14400
>> a = lambda {|x| x%2 == 0 ? 'even' : 'odd'}
=> #<Proc:0xb7ab4cf0@(irb):21>
>> a.call 7
=> "odd"
>> b = ->(x) {x%2 == 0 ? 'even' : 'odd'}  # the 'stabby' syntax is new in 1.9
=> #<Proc:0x952e0b4@(irb):27 (lambda)>
>> b.call 4
=> "even"

You have lambdas, you can redefine operators, and you can change all kinds of classes: enough tools to program in a very functional way. Advanced examples are available, e.g., in the book B1, chapter 6.8.
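Two differences between a plain Proc and a lambda are easy to observe: the behaviour of return and the strictness of argument checking. The sketch below assumes Ruby 1.9 or newer; all method names are made up for this illustration.

```ruby
# Hypothetical helpers showing how 'return' behaves inside each kind.
def return_from_lambda
  checker = lambda { return :from_lambda }
  checker.call       # return only leaves the lambda
  :method_finished
end

def return_from_proc
  checker = Proc.new { return :from_proc }
  checker.call       # return leaves the whole method
  :method_finished   # never reached
end

p return_from_lambda  # :method_finished
p return_from_proc    # :from_proc

# Argument checking differs as well
loose  = Proc.new { |a, b| [a, b] }
strict = lambda   { |a, b| [a, b] }
p loose.call(1)       # [1, nil]
begin
  strict.call(1)
rescue ArgumentError => e
  puts "lambda complained: #{e.message}"
end
```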

Mixins

Let us now look at an alternative to typical inheritance. Instead of classes you can also define modules. These can be seen as slimmed-down classes (to be accurate: the class Class is a subclass of the class Module). It is not possible to instantiate objects from modules. On the one hand, they are used to create a new namespace. On the other hand, they can be mixed into other classes with the method include, or into single, existing objects with the method extend. The methods defined in the module are then available in the class or on the object!
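A minimal sketch of both uses of a mixin follows. The module Greeting and the classes User and Robot are invented for this example.

```ruby
# Hypothetical module used as a mixin.
module Greeting
  def greet
    "Hello from #{name}"
  end
end

# Hypothetical class that mixes the module in
class User
  include Greeting   # every User instance gets greet
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Hypothetical plain class without the mixin
Robot = Struct.new(:name)

puts User.new('Sabine').greet             # Hello from Sabine

robot = Robot.new('R2')
robot.extend Greeting                     # only this one object gets greet
puts robot.greet                          # Hello from R2
puts Robot.new('C3').respond_to?(:greet)  # false
```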

A good example is the module Enumerable. Many “collection objects” need similar operations, for example testing whether a specific element is included. Instead of implementing this functionality for each of these classes, the methods are defined in the module Enumerable. It can be included in any class that implements the iterator method each – in one stroke you possess about 20 useful methods. Listing 10 shows some.

This is also an illustration of Ruby’s “Duck Typing”. In Ruby, how an object behaves is more important than which class it belongs to. This means Ruby expects a specific method to react in a specific way, regardless of how it works internally or which type the receiver object has. The Enumerable methods do not care whether they are used on a hash or an array, as long as it has a working each method.
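To see that each really is all Enumerable needs, here is a small sketch with a self-written collection. The class Playlist is invented for this example.

```ruby
# Hypothetical collection class: it only implements each and includes Enumerable.
class Playlist
  include Enumerable

  def initialize(*titles)
    @titles = titles
  end

  # The single method Enumerable relies on
  def each
    @titles.each { |title| yield title }
  end
end

list = Playlist.new('Intro', 'Ruby Tuesday', 'Outro')
p list.include?('Outro')                 # true
p list.map { |title| title.length }      # [5, 12, 5]
p list.select { |title| title =~ /ro/ }  # ["Intro", "Outro"]
p list.sort.first                        # "Intro"
```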

Listing 10
>> r = 1..42
=> 1..42
>> r.each {|ele| print "#{ele} "}
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 => 1..42
>> r.member? 13
=> true
>> r.max
=> 42
>> r.grep 13..31
=> [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
>> r.select{ |ele| ele % 11 == 0 }
=> [11, 22, 33]
>> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89].select{ |ele| ele % 11 == 0 }
=> [55]

Concurrency

An aspect of growing importance in programming is concurrency. Again, the Ruby syntax is intuitive. You create a new Thread object and attach a block to it. The commands of the block are executed while the main program continues. In addition, the thread instance offers various methods to control the thread. There are also other tools for concurrent programming (keyword: Fiber). Take a look at the documentation if you want to know more.

However, pay attention to which Ruby implementation is used. While Ruby 1.8 only simulates concurrency using green threads, Ruby 1.9 has a native implementation of threads, but with a Global Interpreter Lock, so only one thread runs Ruby code at a time. To avoid this limitation, you can use, for example, JRuby, because the JVM provides real parallel threads.
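Creating threads and collecting their results can be sketched as follows. The helper squares_in_threads is made up for this illustration.

```ruby
# Hypothetical helper: squares each number in its own thread.
def squares_in_threads(numbers)
  threads = numbers.map do |number|
    Thread.new(number) { |n| n * n }   # the block runs concurrently
  end
  # value waits for the thread to finish and returns the result of its block
  threads.map { |thread| thread.value }
end

p squares_in_threads([1, 2, 3])  # [1, 4, 9]
```

Passing the number as an argument to Thread.new gives every thread its own copy of the value instead of a shared outer variable.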

Meta Programming

If this flexibility is still not enough for you, take a deeper look at Ruby’s powerful metaprogramming features. Normally you should stay away from them, because careless use can lead to unexpected results very quickly. Besides, the power of metaprogramming comes with a performance cost. A detailed description of all the possibilities is beyond this article; again, I refer to the Ruby documentation.

Instead of that, here is a short summary of the main aspects:

  • Objects/classes can analyse themselves.
  • Methods and classes can be defined and changed at runtime. This is known as “Monkey Patching”.
  • Evaluation of code in the context of classes and objects is possible.
  • Access to private methods and variables is possible.
  • There are the magic methods method_missing and const_missing, which are called when no method or constant (for example a class) with the given name exists.
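Some of these points in a compact sketch. The class Settings and its contents are invented for this example, and the code assumes Ruby 1.9.2 or newer (for respond_to_missing?).

```ruby
# Hypothetical class, only for illustration.
class Settings
  def initialize
    @values = { :theme => 'dark' }
  end

  # Called when an unknown method is sent to the object
  def method_missing(name, *args)
    if @values.key?(name)
      @values[name]
    else
      super
    end
  end

  # Keeps respond_to? consistent with method_missing
  def respond_to_missing?(name, include_private = false)
    @values.key?(name) || super
  end

  private

  def secret
    'hidden'
  end
end

settings = Settings.new
p settings.theme                                 # "dark" (handled by method_missing)
p settings.respond_to?(:theme)                   # true
p settings.send(:secret)                         # "hidden" (send bypasses private)
p settings.instance_variable_get(:@values).keys  # [:theme]

# Classes stay open: add a method at runtime (monkey patching)
class Settings
  define_method(:shout) { theme.upcase }
end
p settings.shout                                 # "DARK"
```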

Ruby 1.9

Ruby 1.9 is not fully compatible with the widespread version 1.8. There are changes to the encoding support (e.g. UTF-8), a change to the scope of block variables, many deep, specific changes, and also small syntax changes. Here are some examples:

  • Changes in the handling of strings. "Hello"[1] gives you the ASCII value 101 in 1.8; in 1.9 it gives you the letter ‘e’ instead.
  • A new way of writing hash literals when the keys are symbols. This is demonstrated at the end of listing 4. In method calls, hashes without braces feel like named parameters.
  • Named groups in regular expressions. See the example in listing 5.
  • A short way to define lambdas, see listing 9.

StdLib, Gems & Rails

Ruby contains a rich standard library. Listing 11 shows an example of its use. Furthermore, many other libraries are available; in the Ruby world they are called “Gems”. These are easy to install with the command line tool “gem”, a package manager that also resolves dependencies. The most famous gem is Rails, a well-designed, popular web framework. Web sites using Rails are, for example, twitter.com or xing.com. More information on Ruby on Rails is available at rubyonrails.org or in B2. Many more projects are hosted on GitHub.

Listing 11
require 'yaml'
hash = {7 => 'seven', 1000 => ['one thousand','thousand']}
puts hash.to_yaml # puts out the hash in yaml-format (to_yaml also works outside of irb)

Tools

Now the “insider tip” (actually not so inside ;) ) for every Ruby programmer: the program irb, which is included in a complete Ruby installation. The abbreviation stands for “Interactive Ruby” (or, if you prefer, “I love Ruby”). It is an interactive Ruby console in which you can spontaneously try out your Ruby code snippets. It feels a bit unusual at first, but after some time you will not want to miss it!

The included program ri, which shows the Ruby documentation for any class or method, is also quite helpful.

Conclusions

The given freedom can – of course – easily be misused. So it is advisable to follow conventions.

Ruby does not suit everyone. For example, if you always want to be sure which type a parameter of a specific method has, you should look for alternatives. Everyone else: just try it :)!

Finally, there is a felicitous quotation by Ruby designer matz: “Language designers want to design the perfect language. They want to be able to say, ‘My language is perfect. It can do everything.’ But it’s just plain impossible to design a perfect language, because there are two ways to look at a language. One way is by looking at what can be done with that language. The other is by looking at how we feel using that language, how we feel while programming.” (source)

Resources

  • [B1] David Flanagan, Yukihiro Matsumoto: “The Ruby Programming Language”
  • [B2] Sam Ruby, Dave Thomas, David Heinemeier Hansson: “Agile Web Development With Rails”
Creative Commons License