Ruby is better than Python for Unix-like system administration
Disclaimer. Python is a very good general-purpose scripting language, especially for scientists and non-programmers, but we are going to show here that Ruby is hands down superior as a Unix scripting language and power tool. When Bash/Ksh/X-shell constrains your thinking and slows down your coding, but there is still no reason to write it in C++, just do it in Ruby. It will be faster and more enjoyable to write and, if you write it well, the code will also be easier to read and maintain than Python code.
Ruby Culture is the Unix culture, your culture
Ruby was conceived by a man who worked a lot with Perl and Unix. This is very apparent from the language itself. If you are a Unix professional you will know at least half of these names/symbols by heart:
` `, $?, STDIN, STDOUT, STDERR, ${}, ARGV, $$ ... # shell-isms
%Q, %q, %x, -e, -F, -wnlp, / /, $VAR, @VAR, HEREDOC ... # perl-awk-isms
All these names/symbols are in use in Ruby and they mean what you already know (a few exceptions exist, e.g. the use of $ and @ for variables).
What does this prove to you? That Ruby speaks the language you know and was made to solve the problems you face every day. For Python this is not true: Python was born as a teaching language, not as an admin power tool. Python comes with its own simple graphical IDE that is great for Windows and beginners in general, but it is not immediately useful for a Unix administrator, who does most of the work on the command line. Ruby, on the contrary, can be driven entirely from the command line. I would say Ruby3 here also beats Perl5, since it has a very powerful built-in REPL, called irb.
One-liners
A Unix system manager works most of the time at the shell, rarely with GUI tools, and for very good reasons: [1] GUI tools are hard to document [2] GUI tools are prone to frequent restyling that makes documentation useless [3] GUI tools are hard to automate [4] GUI tools often have heavy library dependencies [5] GUI tools are not practical to operate on a remote server, since they require low latency [6] GUI tools require X installed on the server, something you rarely want. We use instead a command shell, e.g. Bash/Ksh/Csh etc.
A one-liner is a program that can be written in a single line at the shell prompt. With Ruby you can write simple and complex one-liners with ease. In Python you can’t write complex one-liners, because to do anything useful and readable you must start a new line and indent the code.
Are one-liners of any use? In my experience, yes, because you can copy the one-liner into your documentation (an *.odt file, a Google Document or whatever else) and then drop it on the shell prompt when needed. It just works. No tmp.sh file to write, no tmp.sh file to delete; you don’t need write access in the location where your prompt is, and you don’t need your favorite editor on the remote system.
Example of a one-liner. This comes from my notes about configuring Samba. I keep the original language (Italian) and post a picture to make it clear these are notes in real use. This script finds and removes all *.tdb and *.tlb files in the directory specified by the output of smbd -b, filtered with grep.
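The sketch below is not the script from my notes; it is a minimal Ruby illustration of the same idea, written as a small script so that each step is visible. The sample text standing in for the smbd -b output, the LOCKDIR/STATEDIR filter and all the method names are assumptions made for this example, and the files are only listed, not removed.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for the text printed by `smbd -b`; the real
# output is much longer and its paths differ from system to system.
def sample_smbd_output(dir)
  <<~TEXT
    Paths:
       SBINDIR: /usr/sbin
       LOCKDIR: #{dir}/lock
       STATEDIR: #{dir}/state
  TEXT
end

# Keep only the lines a grep for LOCKDIR or STATEDIR would keep and
# return the directory named on each of them.
def samba_dirs(text)
  text.lines.grep(/LOCKDIR|STATEDIR/).map { |line| line.split(":", 2).last.strip }
end

# List the *.tdb and *.tlb files found in the given directories.
def stale_files(dirs)
  dirs.flat_map { |d| Dir.glob(File.join(d, "*.{tdb,tlb}")) }.sort
end

# Build a throw-away directory tree, then report what would be removed.
def demo
  Dir.mktmpdir do |root|
    %w[lock state].each { |d| FileUtils.mkdir_p(File.join(root, d)) }
    FileUtils.touch(File.join(root, "lock", "gencache.tdb"))
    FileUtils.touch(File.join(root, "state", "keep.txt"))
    files = stale_files(samba_dirs(sample_smbd_output(root)))
    files.each { |f| puts "would remove #{f}" } # File.delete(f) would really remove it
    files.map { |f| File.basename(f) }
  end
end

demo
```

On a real server the text would come from the smbd -b command itself, for instance through backticks, and the whole chain is short enough to be squeezed into a single ruby -e line.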
My language is my World: Regexp
I never used Fortran but I remember a physicist friend told me it is possible to index arrays starting from 0 or 1, depending on your choice and convenience. This can be very useful, sometimes indexing things at zero just adds a step of complexity. Fortran was indeed a language born for numerical calculations, it is natural it developed good features for that use case.
One of the principles of Unix is that its configuration files are textual data, readable by the human eye. Data, when possible, is textual. System scripts for system automation are in textual form, the output of programs is text if possible, and the most classic Internet protocols are text based. Text is the unifying element of Unix: it is text that flows in pipes, passes through processes and eventually reaches a file or the network. In some sense text lines are to Unix systems what lists are to Lisp-like programming: the central unifying pattern. The graphic below comes from an excellent article I read recently; it represents well the pipes we are talking about. You can’t see it, but if you could open those pipes you would most probably see a text stream.
The most powerful existing tool to describe a block of text is the regular expression. As a descendant of Perl, Ruby has regular expressions as first-class citizens. This means you don’t need to load any external package nor use any other data structure to define a Regexp. They are built in, like numbers, strings, arrays, dictionaries, files etc.
Let me prove it with a code picture, you don’t even need to know Ruby to understand what I write, just the very basics of Regexp syntax and the Object Oriented Programming dot notation.
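A minimal sketch of the idea, typed as plain text. The pattern and the sample text are invented for this example; only the Regexp literal syntax and the String and Regexp methods are standard Ruby.

```ruby
# A Regexp literal is a value like any other: no require, no string wrapping.
INET = /inet (\d+\.\d+\.\d+\.\d+)/ # hypothetical pattern for this sketch

# Sample text, invented for the example.
SAMPLE = "lo0: inet 127.0.0.1\nem0: inet 192.168.1.10\n"

p INET.class                            # => Regexp
p SAMPLE.match?(INET)                   # => true
p SAMPLE.scan(INET).flatten             # => ["127.0.0.1", "192.168.1.10"]
p SAMPLE.lines.grep(/em0/).map(&:strip) # => ["em0: inet 192.168.1.10"]
p SAMPLE[INET, 1]                       # => "127.0.0.1"
```

The regexp is written between slashes, stored in a constant and passed around with the dot notation, exactly as one would do with a number or a string.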
For comparison, how is the situation in Python? In Python, to use a regexp you need to load the module re, then you need to create the regexp through a String, and they don’t even call it Regexp, they call it a “Pattern”.
Suppose you were writing numerical analysis software: would you use a programming language where, to define a Float, you must use a String and do something like import fl and then pi = Float("3.14"), and finally, if you ask for the type, you get type(pi) = NumberWithDot? I guess you wouldn’t. That is why somebody working with text, as all Unix administrators do, should prefer Ruby to Python: it has the right tools for the job.
Completely Object Oriented and regular
In my experience the only two widely used programming languages more regular than Ruby are the Lisp-like family of languages and the Smalltalk-like family. Those are extremely regular languages, to the point that even the C-like if-then-else construct or the for-loop, which nowadays we consider obvious and a must-have for every decent language, do not exist. You must phrase those concepts in the “Object Receiving a Message” model or as a List; nothing escapes.
Ruby is a bit more tolerant/friendly than Lisp and Smalltalk, sloppy if you wish, and defines some syntactic sugar for “if”, to define functions, classes etc. Its most basic control structures are similar to what we are used to from the C/C++ family of languages.
If you disregard a few exceptions, Ruby is fully object oriented: everything is an object and you do all the programming by sending messages to objects. You must think in terms of Classes, Objects and Methods. If you don’t like that, you had better not use this language; in that case I recommend you use Perl for Unix scripting.
The benefit of the regularity of the language is mostly in the ease of reading and memorization. I don’t write Ruby code all the time. I write it when I need to automate something in my servers. For me it is very important the language is regular enough for it to be easy to remember and to guess.
Example. Nowadays it is easier for me to get the epoch with ruby -e 'p Time.now.to_i' than to remember the date formatting parameter. Why is it easier? Because ruby -e is the same syntax I used in Perl5 in the past, Time.now is obvious, to_i means convert to integer (and this is also obvious after a bit of coding in Ruby), and p at the beginning is the equivalent of the omnipresent print and is hard to forget.
Python, on the other side, is more like C++: you can program imperatively, functionally, put in some OO, but for sure it does not look at all like a full OO language, to the point that len([1,2,3]) is defined but you can’t do [1,2,3].len(). The second syntax is the only reasonable one for OO people. In Ruby we do "hello".size() as we do [1,2,3].size(); this all makes sense, it is regular and clean. We expect all objects that can have “a length” to be able to answer to the size method.
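A small sketch of this regularity. The list of sample objects is an assumption made for the example; every call is a standard method of the built-in classes.

```ruby
# Each of these built-in objects "has a length" and answers to size.
SIZED = ["hello", [1, 2, 3], { a: 1, b: 2 }, (1..10), "a b c".split]

SIZED.each { |obj| puts "#{obj.class}: #{obj.size}" }

# The same message can be sent without knowing the class in advance.
p SIZED.map(&:size)                        # => [5, 3, 2, 10, 3]
p SIZED.all? { |o| o.respond_to?(:size) }  # => true
```

The conversion methods follow the same spirit: to_i, to_s and to_a mean the same thing on every class that defines them.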
Ruby is a fully hackable language
In the same vein as the Lisp-like and Smalltalk-like languages, Ruby is fully hackable. By that I mean you can expand and modify the language as you wish (of course, maintaining the overall conceptual framework: Lisp code will have function calls, and Smalltalk and Ruby code will have objects). If the application is complex enough, you may expand and transform the language to talk about the application objects instead of twisting your problem to fit the language.
Example. Let’s see an extreme and perilous example: let’s redefine the concept of equality between numbers. It is not so absurd; sometimes this is exactly what you want. In numerical analysis, for example, you never test directly if a floating point number is zero, like this f == 0.0; instead you do something like abs(f) < epsilon. The code below shows how you can redefine == in Ruby to make it behave as in numerical analysis.
# . make sure we don't lose the original '==' method
#   by giving it an alias, a synonym
class Float
  alias origEq ==
end
# . define "small" with the global variable $epsilon
$epsilon = 0.0001
# . change the behaviour of 'A == B' in the case where
#   A and B are Float and B = 0.0.
class Float
  def ==(other)
    if (other.origEq(0.0)) then
      return (self.abs < $epsilon)
    else
      return (self.origEq(other))
    end
  end
end
# EXERCISE. if you write == 0 instead of 0.0 it won't work, why?
0.1 == 0.0      # => false
0.001 == 0.0    # => false
0.000001 == 0.0 # => true !
It is NOT recommended to code like this and you are warned to think at least seven times before doing something similar. But it is NOT forbidden! Ruby, like Unix, assumes you are a reasonable being who can read and think before typing.
Example. Extending builtin classes with new methods. This is far less dangerous than redefining existing methods. I do this very often. Let’s extend the class String. I want every string to be able to tell if its content looks like an IPv4 address.
If s.class is String, I wish to be able to say s.ipv4? and get a boolean as output. E.g. I want "10.11.12.123".ipv4? => true and "foobar".ipv4? => false. Also, I wish the procedure to be tolerant of leading and trailing spaces in the string to test.
I took the regexp for matching IPv4 from here, doing the rest was a breeze.
# . Adding a method to the String builtin class
class String
  # . check if the string looks like an IPv4 address, spaces
  #   at the beginning and end of the string are ignored.
  def ipv4?
    # \A and \z anchor the whole string; ^ and $ would anchor single lines
    self.strip.match?(/\A((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}\z/)
  end
end
"foo".ipv4?          # => false
"781.123.12.4".ipv4? # => false
"187.123.12.4".ipv4? # => true
In Python all of this is science fiction: it is simply impossible, because you are not allowed to add methods to builtin classes such as str.
Parentheses: less fog, more active code
Parentheses are especially annoying to me because they require pressing two keys: Shift and ‘[‘. If they were needed once in a while it would be OK, but in Python parentheses are everywhere. In Python3 they had the silly idea to require them also for the most typed thing ever: print. Now it is print("hello world"). In Python2 it was print "Hello world". Python2 was better.
In Ruby we are even better than Python2. Not only can you write puts "hello world", but there are shortcuts such as p "hello world". Isn’t that cool? Imagine how much faster you can go: don’t type, think. Print is ubiquitous in debugging.
We can go much further than that. In Ruby, most of the time, where it is not ambiguous, you can call methods without parentheses. This is really nice and makes chained methods look very clean. Let’s see a trivial example: in the next line we test if a string contains a certain word. We chain 4 methods and we use parentheses only when it makes sense; you can see how readable this is.
" Sopra la panca la capra campa ".strip.downcase.
split(/\s+/).include?("capra")
A multi-line REPL and its package weight
Sometimes it is faster to solve a problem in the Ruby (or Python) REPL than in the Bash/Ksh/X shell; they are just more powerful. There is a limitation, though, that hinders the Python3 REPL that comes with the language (at least up to version 3.12): it does not have multi-line editing. This is quite a problem because Python is heavily based on spaces and newlines; trying to write code without them is just plain impossible or indecent. If you want to do any readable programming from the REPL in Python you must use iPython3, but there comes another problem: iPython3 is big and has a lot of dependencies, and system managers don’t like that. More dependencies mean more problems, longer upgrade times and less secure systems overall.
In Ruby3 the story is completely different: Ruby3 ships with its own multi-line REPL, called irb, which is all you need to write multi-line code interactively.
Is the REPL of any use in Unix? For me yes, absolutely: it is the best solution when you must run the code only once. [1] It is easier to write than a one-liner, since you can start a new line [2] it can be developed progressively, watching intermediate values and experimenting along the way [3] when you close the REPL it is all gone, there are no garbage temporary files to remove [4] you are not constrained by the Shell quoting mechanism: the one-liner is always a Shell String and it must be delimited, usually with "..." or '...'. This is less of a problem for Ruby than for Python, since Ruby, like Perl, has a very flexible extra string quoting mechanism like %q{ hello ' " }, where the delimiter {..} is of your choice.
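A short sketch of these quoting forms. The constant names and the sample values are assumptions made for the example; %q, %Q and %w are standard Ruby syntax.

```ruby
USER = "root" # hypothetical value used for interpolation

# %q behaves like single quotes: nothing is interpolated, and both kinds
# of quote can appear inside without escaping.
RAW = %q{ hello ' " }

# %Q behaves like double quotes: #{...} is interpolated. The delimiter is
# free, here square brackets instead of braces.
INTERP = %Q[user="#{USER}" home='/#{USER}']

# %w builds an array of words with no quotes and no commas at all.
TOOLS = %w[grep sed awk]

p RAW    # => " hello ' \" "
p INTERP # => "user=\"root\" home='/root'"
p TOOLS  # => ["grep", "sed", "awk"]
```

Inside a one-liner this means the Ruby code can avoid both ' and " entirely, leaving them free for the shell.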
Example. In my backup system I keep a weekly dump of a few important databases in the company. When I wrote the db backup system it seemed to me a good idea to use the DD-MMM-YYYY-HH-MM format to name files, for human readability reasons. After a while I became convinced this is not practical: it is better to have the files listed by creation time by default in the shell. So we want to rename a bunch of directories, and we want the new names to be of the form backup-YYYY-mm-DD-HH-MM. Part of the original name list appears in the picture below; how to proceed?
This is an example of a one-shot problem: once solved, it is gone forever. There is no need to write a one-liner and store it in my notes. It is easy enough that I do not need to fire up an editor, create a file for the script and so on; we solve it at the REPL.
I start the REPL with irb, then I get the list of directories to rename in the current one with dirList0 = Dir.glob("*"). Observe that Ruby has the concept of globbing: Unix culture. Then I do some experiments to establish how to parse the existing string into a DateTime object (DateTime needs require "date"), getting dt0 = DateTime.strptime(dirList0.first, "%d-%b-%Y-%H-%M"), and finally I find how to print out the DateTime object as I wish: dt0.strftime("%Y-%m-%d-%H-%M"). Observe I put in a lot of unneeded parentheses to make the code more natural to read for non-Rubyists. Good, we have all the tools to write our loop in the multi-line REPL editor.
In the picture below you can see what I typed into my REPL. As a result I print the command I should run to rename each directory. If I want to run the command I just add system(cmd) or `cmd`, as in Perl, so again our Unix culture is not wasted. I am not executing cmd right now because to write in the backup directory I need elevated privileges, and that is something I do only when I am fully focused on the issue. Finally, I am using system and mv to rename the directories to keep things familiar for Unix people; it goes without saying that Ruby has its own file manipulation tools.
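The loop can be sketched as follows. This is a reconstruction under assumptions, not a copy of what I typed: the method names and the two sample directory names are hypothetical, while the two format strings are the ones found in the experiments above.

```ruby
require "date"
require "shellwords"

# Turn one old name (DD-MMM-YYYY-HH-MM) into the new one
# (backup-YYYY-mm-DD-HH-MM).
def new_name(old_name)
  dt = DateTime.strptime(old_name, "%d-%b-%Y-%H-%M")
  "backup-" + dt.strftime("%Y-%m-%d-%H-%M")
end

# Build, but do not run, the mv command for each directory name.
def rename_commands(names)
  names.map do |old|
    "mv #{Shellwords.escape(old)} #{Shellwords.escape(new_name(old))}"
  end
end

# Hypothetical directory names; at the REPL they would come from Dir.glob("*").
sample = ["26-Feb-2024-03-15", "04-Mar-2024-03-15"]
rename_commands(sample).each { |cmd| puts cmd } # add system(cmd) to execute
```

Printing the commands first and executing them only in a second pass keeps the operation reviewable, which matters when the target is a backup directory.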
Having argued that a multi-line REPL can be of good use, let’s see how many external packages you need to install to have one. Below is a comparison of the number of packages required in OpenBSD 7.4 to install either iPython3 or Ruby3. The image speaks for itself, there is no comparison. From the administrator’s perspective the only language that can beat Ruby3 here is Perl5, since it comes with the OS and there is nothing to install. Still, Perl5 does not have a full-grown default REPL; it was not born with that in mind. It has a simple debugger mode you can run with perl -d -e '1'.
Cross OS and cross decades portability with JRuby
If you move your script from computer A to computer B there is a good probability it will not run on B. Ruby may not be installed on B, the Ruby PATH may be different, RUBY_VERSION may be different, some gems could be missing or different, some gems could have incompatible versions, some code could depend on compilation and libraries, some code could be OS dependent … These are the first possible reasons coming to my mind. Take into account that, if you have a “version problem”, changing your software to run on a different version of the interpreter can be far from trivial: moving from Python2 to Python3 was a very long nightmare. I know more than one person whose job was changed to “Python2 to Python3 conversion guy”.
Possible solution. There exists a Ruby version written in Java which is just a *.jar file. It is called JRuby. If you move your *.jar file and your script from computer A to computer B, both having Java properly installed, then the script is going to behave in the same way (at least in theory). There is nothing to install, just files to transport! This is really fantastic.
It is true that Java has its own issues, but moving your scripts to it ensures a lot more portability than standard Ruby. If your Ruby language and user code are just a bunch of files that must be fed into the JVM, it is up to the Java environment to ensure the ability to run the code even several years in the future. It is up to the Java environment to ensure a certain degree of portability across different OSes. Java is backed and used by large corporations; it is not going to be dismissed in the next decade.
Extra bonus. From JRuby you can use directly the huge amount of libraries available for Java and you can use Java threads.
What about Python? In Python, portability is a huge unresolved mess. There was once an active Jython, but it is dormant and still stuck at Python2. You should by all means not use Python2 any more.
Conclusions
As of today Ruby is mostly associated with the Rails web framework. I never used Rails. I use Ruby because it is the best scripting language for Unix among the ones I tried (Perl5, Python2–3, Awk, Tcl, Bash, Ksh, Csh). It replaces Perl, Awk and Python, and several Unix power tools (e.g. xargs, grep, sed) are a pale copy of what I can achieve with Ruby one-liners.
While Python’s main motto is “There should be one, and preferably only one, obvious way to do it”, Ruby’s motto is “A programmer’s best friend”. Help yourself.
Happy hacking !
Revisions and clarifications ex-post
- [26-feb-2024]. p is not a synonym for puts. It is an inspecting tool. Still, on simple objects like instances of Float, String, Array, Time, Hash (these I tested) it will print out what you expect; therefore, to some extent, you can use it like a puts shortcut. Check the difference in ruby-doc, Kernel#p. [thanks to the Reddit reader @rubyrt for making me notice]
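A minimal sketch of the difference. The sample values are assumptions made for the example; the behaviour shown is that of Kernel#p and Kernel#puts.

```ruby
WORD = "hello"
LIST = [1, "a"]

# p writes obj.inspect and returns obj; puts writes obj.to_s and returns nil.
p WORD    # prints "hello" (with the quotes)
puts WORD # prints hello

p LIST    # prints [1, "a"]
puts LIST # prints 1 and a, one element per line

# Because p returns its argument, it can be dropped into the middle of an
# expression while debugging without changing the result.
total = p(LIST.size) + 1
puts total # prints 3
```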