In this post I'm going to use a very small program that I wrote recently as a little bit of a joke for a friend to explain and demonstrate how the famous Unix pipes work.
Prerequisite knowledge: A little command line fu, a little Ruby knowledge and an understanding of the terms STDIN and STDOUT.
So, first of all: pipes. What are they? What can you do with them? Pipes are, in a word, awesome. They allow you to use the output of one program as the input to another. For example:
$ history | grep ssh
This will list every line of your command history that contains "ssh". The history command shows a list of your command history (it may not exist on your system depending on what you're working with; have a Google around if your shell complains about the history command not existing). grep is the quintessential searching tool. It reads through text and returns only the lines that contain the given search term. It has many options and many hidden treasures, and if you get the time, I strongly recommend learning what you can about grep. It is very powerful.
What the pipe is doing, basically, is linking the STDOUT from history to the STDIN of grep: grep is reading history's STDOUT as if it were its own STDIN. That's all that pipes do! But through this simple, genius idea there are a phenomenal number of possibilities. There are some great examples of the possibilities in Gary Bernhardt's fantastic talk, The Unix Chainsaw.
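To make the "one program's STDOUT becomes another's STDIN" idea a bit more concrete, here is a rough sketch using Ruby's IO.pipe. It is not how the shell wires up history and grep internally, just an illustration of a pipe having a write end and a read end. The history lines are made up, and the select call is only a stand-in for grep.

```ruby
# A pipe has two ends: whatever is written to one can be read from the other.
reader, writer = IO.pipe

# Stand-in for the first program writing to its STDOUT.
# These "history" lines are invented for the example.
writer.puts "ssh example-host"
writer.puts "ls -la"
writer.puts "ssh other-host"
writer.close # closing the write end tells the reader there is no more input

# Stand-in for grep: read the other end line by line and keep the matches.
matches = reader.each_line.select { |line| line.include?("ssh") }
reader.close

puts matches
```

Only the two lines containing "ssh" come out of the far end, which is the same shape as what history | grep ssh does.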
› Command line input
Writing Ruby programs that accept command line input is very simple. I'm going to use a program I wrote as a joke called "Gommoize" as an example. All Gommoize does is replace all vowels in a given input with the letter "O". The description of its origin can be found on its GitHub page.
Here's a simple first iteration of the program:
#!/usr/bin/env ruby
input = ARGV.shift
puts input.gsub(/[aeiou]/, 'o').gsub(/[AEIOU]/, 'O')
The first line is what's called a "shebang". It tells the operating system which program to use to run the script. This is cool because it lets us run the script just by its name instead of needing to specify what program to run it with. For example, we could just do this:
$ chmod +x script
$ ./script
Instead of having to do this:
$ ruby script.rb
All because of the shebang.
The input = ARGV.shift line simply takes the first command line argument from the list of command line arguments. ARGV is an array of the arguments that were supplied to our program, and shift is a method of the Array class in Ruby that removes the first element of an array and returns it.
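If shift is new to you, this little sketch shows what it does. The args array is a hypothetical stand-in for ARGV so the example can run without any real command line arguments.

```ruby
# Hypothetical stand-in for ARGV, as if we had run: ./gommoize Gemma extra
args = ["Gemma", "extra"]

first = args.shift # removes the first element and returns it
p first            # "Gemma"
p args             # ["extra"]

# On an empty array there is nothing to remove, so shift returns nil.
nothing = [].shift
p nothing          # nil
```

That nil for an empty array matters later on, when the program is run without any arguments.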
The last line in the script outputs the input but with all vowels switched with the letter "O". I've written a naive implementation of case sensitivity by using two global substitutions (the gsub method). There's probably a more elegant way of doing this but I feel this method will suffice for now.
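For what it's worth, one possible alternative to the two gsub calls is String#tr, which translates characters rather than matching a regular expression. This is only a sketch of one option, not a claim that it is better, and the gommoize method name is something I've made up here for the example.

```ruby
# Hypothetical helper; the script itself does this inline with two gsub calls.
def gommoize(text)
  # tr pads the replacement with its last character, so every character
  # listed in the first argument maps to the single letter in the second.
  text.tr('aeiou', 'o').tr('AEIOU', 'O')
end

puts gommoize("Gemma")     # Gommo
puts gommoize("APPLE pie") # OPPLO poo
```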
That's it. We can now use this program like so:
$ ./gommoize "Gemma"
And the output would be "Gommo" (assuming our file is called "gommoize" and has been made executable via chmod). However, we can't pipe to the program at the moment. If we try this:
$ touch somefile.txt
$ echo "Gemma" >> somefile.txt
$ cat somefile.txt | ./gommoize
It will result in an error about a method not being found on nil. Bummer. That means the program tried to get an argument from the command line and didn't find one. Fortunately, this is a really easy problem to solve.
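The failure can be reproduced without a pipe at all. In this sketch an empty array stands in for an empty ARGV, and the error is rescued so we can look at it. The exact wording of the message differs between Ruby versions, so I haven't shown it here.

```ruby
# Stand-in for ARGV.shift when no arguments were given: shift returns nil.
input = [].shift

raised = nil
begin
  # nil has no gsub method, so this raises instead of substituting anything.
  input.gsub(/[aeiou]/, 'o')
rescue NoMethodError => e
  raised = e
  puts "#{e.class}: #{e.message}"
end
```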
› Modifying our program for piping
The modification we need to apply to allow our program to be piped to is both simple and elegant:
#!/usr/bin/env ruby
input = ARGV.shift || $stdin.read
puts input.gsub(/[aeiou]/, 'o').gsub(/[AEIOU]/, 'O')
Only the line that reads the input needs to change. We're using a little trick that Ruby lets us do with the logical or operator, ||. If the first expression in the or statement ARGV.shift || $stdin.read is truthy (which is anything apart from nil or false in Ruby), then the second part is not evaluated. For example:
true || puts("Hello")
"Hello" will never be printed by the code above, because the first part of the or statement is true and so puts("Hello") is never evaluated. Conversely, if our program gets no command line arguments, ARGV.shift returns nil and the program looks to STDIN for its input. If we've piped to the program, STDIN will contain the output of the program on the left of the pipe. So now:
$ cat somefile.txt | ./gommoize
Should work :)
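If you'd like to poke at the fallback without setting up a real pipe, here is a rough sketch. The read_input helper and its parameters are hypothetical (the script just does this inline with ARGV and $stdin), and a StringIO stands in for a STDIN that has been fed by a pipe.

```ruby
require "stringio"

# Hypothetical helper mirroring the script's input line, with the argument
# list and the input stream passed in so neither has to be the real thing.
def read_input(args, stream)
  args.shift || stream.read
end

piped = StringIO.new("Gemma\n") # stand-in for STDIN fed by a pipe

# With an argument present the stream is never read, thanks to ||.
from_args = read_input(["Gemma"], piped)

# With no arguments, shift returns nil and we fall back to the stream.
from_pipe = read_input([], piped)

p from_args # "Gemma"
p from_pipe # "Gemma\n"
```

Notice that the piped input keeps its trailing newline. puts doesn't add a second one when the string already ends with a newline, so the output looks the same either way.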
› Summing up
Hopefully this post has gone some way to demystifying pipes for you. It's easy to write your command line programs so that they take input either directly from the command line as arguments or from pipes. Doing so will make your program more usable and a better member of the Unix ecosystem.
Also, don't be fooled. Reading STDIN for pipe input will work regardless of the programming language that you use, not just Ruby. Give it a try!
If you have any questions, feel free to email me! firstname.lastname@example.org