Fun with Bash: The Tee Replicator
By Dylan Thinnes, Last Edited Fri 15 Nov 22:03:29 GMT 2019
Anyone who’s met me will know I delight in taking small tools or concepts with odd corner cases and exploiting them to do things that they are otherwise unsuited for. Today, we’ll be going over a simple task and covering a fun way to implement it using nothing but one common Unix shell command, tee, and Bash to glue it together.
The Task
Let us suppose the following basic task: design a program that takes its standard input and copies it n times to standard output. The count n is passed as an argument.
$ echo "a" | my-program 5
a
a
a
a
a
There are many ways to implement such a program, with varying levels of performance. For example, a Bash for loop is extremely simple but correspondingly slow. GNU yes does the trick, and is quick about it. We’ll be using tee, which sits somewhere in the middle on performance.
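For comparison, here is a minimal sketch of the for-loop approach mentioned above. The function name repeat_loop is made up for illustration, and the sketch assumes text input (Bash variables cannot hold NUL bytes).

```bash
# repeat_loop N: hypothetical baseline that reads all of stdin once,
# then prints it N times with a plain Bash for loop.
repeat_loop() {
    local n=$1 input i
    # Append a sentinel so $(...) does not strip trailing newlines.
    input=$(cat; printf x)
    input=${input%x}
    for ((i = 0; i < n; i++)); do
        printf '%s' "$input"
    done
}
```

Running echo "a" | repeat_loop 5 prints the five lines shown in the task description.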
Intro to Tee
tee is a very simple program. It takes its standard input, prints it to standard output, and also writes it to each of the files named in its arguments.
For example:
$ echo a | tee ./my-file
would both print “a” to standard output and write “a” into “./my-file”.
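To see both destinations at once, the following sketch captures what tee prints and what it writes. It uses a temporary file from mktemp in place of ./my-file; the variable names are only for illustration.

```bash
tmp=$(mktemp)                    # scratch file standing in for ./my-file
printed=$(echo a | tee "$tmp")   # what tee sent to standard output
saved=$(cat "$tmp")              # what tee wrote into the file
rm -f "$tmp"
echo "stdout: $printed, file: $saved"
```

Both variables end up holding the same text, "a".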
The Fun Part
In keeping with the Unix philosophy, since tee simply writes to the files you tell it to, it can also write to files that aren’t actually files, such as… /dev/stdout.
Thus, the program tee /dev/stdout will copy its standard input to both standard output and, again, standard output.
$ echo a | tee /dev/stdout
a
a
Furthermore, if you specify /dev/stdout several times, tee will write one extra copy for each time it is named.
$ # /dev/stdout specified twice
$ echo a | tee /dev/stdout /dev/stdout
a # The original output
a # The first copy
a # The second copy
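Taken on its own, this already gives a single-process way to get n copies: name /dev/stdout n-1 times. The sketch below does that. The name repeat_flat is hypothetical, and it assumes n is at least 1 and that the input is small enough for tee to read in one chunk (larger inputs may come out interleaved).

```bash
# repeat_flat N: hypothetical helper; one tee process, with
# /dev/stdout named N-1 times, yields N copies (for N >= 1).
repeat_flat() {
    local n=$1 i
    local -a targets=()
    for ((i = 1; i < n; i++)); do
        targets+=(/dev/stdout)
    done
    tee "${targets[@]}"
}
```

This needs an argument list that grows linearly with n, which is the cost the doubling trick in the next section avoids.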
Arbitrary Powers of Two
Quite obviously, pipe tee /dev/stdout into itself j times and you get 2^(j+1)-fold duplication.
$ # Pipe it into itself twice, j = 2 -> 2^(j+1) = 8
$ echo a | tee /dev/stdout | tee /dev/stdout | tee /dev/stdout
a
a
a
a
a
a
a
a
This gives us duplication by any power of two, 2^(j+1) copies from j+1 tee processes, which will prove useful in a moment.
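The chain does not have to be typed out by hand. As a rough sketch, a small recursive function (the name pow2 is an assumption, not part of the final script) can build it, using one tee per doubling:

```bash
# pow2 K: hypothetical helper that pipes stdin through K doubling
# stages, producing 2^K copies on stdout.
pow2() {
    local k=$1
    if (( k > 0 )); then
        # Each stage doubles whatever reaches it, then recurses.
        tee /dev/stdout | pow2 $((k - 1))
    else
        # No doublings left: pass the stream through unchanged.
        cat
    fi
}
```

With this, echo a | pow2 3 produces the same eight lines as the three-stage pipeline above.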
Standard Error Deserves Recognition Too
tee can also write to /dev/stderr, which allows us to send the output of a command to both /dev/stdout and /dev/stderr.
From then on, whatever was written to stderr will, as it always does, “pass through” any later tee in the pipeline, since a pipe only carries stdout.
By copying program output to stderr and stdout, we can operate on the stdout stream independently of what was originally copied, transforming it, and then merge it back with its original self using the Bash redirect 2>&1.
In essence, the following two snippets produce the same lines (though the order in which the two copies arrive may differ):
(tee /dev/stderr | my_command) 2>&1
and
A=$(cat)
echo "$A"
echo "$A" | my_command
Of course, the former solution is what we’ll be using today for its
- Wanton abuse of tools never meant for the purpose.
- Lack of cattiness.
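A minimal sketch of the tee /dev/stderr pattern in action, with tr standing in for my_command. The wrapper name keep_and_transform is made up for this example.

```bash
# keep_and_transform CMD [ARGS...]: hypothetical wrapper around the
# pattern; emits the untouched input plus CMD's output on stdout.
keep_and_transform() {
    (tee /dev/stderr | "$@") 2>&1
}

# Capture the merged stream; the order of the two copies is not guaranteed.
merged=$(echo a | keep_and_transform tr a A)
```

Here merged holds one line "a" (the saved original) and one line "A" (the transformed copy).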
Tying All Our Components Together
Those of you who’ve fiddled with bits before will likely anticipate the solution now.
First, we take our n and decompose it into its binary representation:
n binary
4 = 00100
10 = 01010
12 = 01100
29 = 11101
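The decomposition itself is just repeated division by two. A short sketch, with to_binary as a hypothetical helper name (it prints the digits without the leading zero padding used in the table):

```bash
# to_binary N: hypothetical helper that prints N in binary by
# peeling off the lowest bit until nothing is left.
to_binary() {
    local n=$1 bits=""
    if (( n == 0 )); then
        echo 0
        return
    fi
    while (( n > 0 )); do
        bits=$((n % 2))$bits   # prepend the current lowest bit
        n=$((n / 2))           # shift to the next bit
    done
    echo "$bits"
}
```

For example, to_binary 12 prints 1100 and to_binary 29 prints 11101.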
Then, we take our initial input, which can be considered as a single (2^0) occurrence. We double our input, creating 2^1, then 2^2, then 2^3 copies, and so on, until we reach a digit that is set in the original number.
So, if we have n = 12, we double twice, until reaching 2^2 = 4 copies:
objective: 12 = 01100
start double double
stdout: 00001 => 00010 => 00100
stderr: 00000 => 00000 => 00000
Where lines “stdout” and “stderr” above denote how many copies of the original input are in stdout and stderr at any given time.
Then, we copy the current duplicates to stderr, “saving” it.
objective: 12 = 01100
start double double copy
stdout: 00001 => 00010 => 00100 => 00100
stderr: 00000 => 00000 => 00000 => 00100
Then, we continue doubling until we reach the next set digit, copy again, and repeat this process until there are no set digits remaining.
objective: 12 = 01100
start double double copy double copy
stdout: 00001 => 00010 => 00100 => 00100 => 01000 => 01000
stderr: 00000 => 00000 => 00000 => 00100 => 00100 => 01100
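Finally, we clear stdout (using >/dev/null) and swap stderr to stdout (using 2>&1). In the script this is written as 2>&1 >/dev/null: stderr is first pointed at the original stdout, and only then is stdout sent to /dev/null.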
objective: 12 = 01100
copy double copy clear swap
stdout: ... => 00100 => 01000 => 01000 => 00000 => 01100
stderr: ... => 00100 => 00100 => 01100 => 01100 => 00000
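The walkthrough above can be checked without running tee at all. The sketch below only simulates the bookkeeping: it tracks how many copies sit on stdout and stderr after each step. The name trace_counts and its output format are assumptions made for this example.

```bash
# trace_counts N: hypothetical dry run of the double/copy procedure.
# Prints the step sequence and the number of copies saved on stderr.
trace_counts() {
    local n=$1 out=1 err=0
    local -a steps=()
    while (( n > 0 )); do
        if (( n % 2 == 1 )); then
            steps+=(copy)             # lowest bit set: save stdout to stderr
            err=$((err + out))
            n=$((n - 1))
        else
            steps+=(double)           # lowest bit clear: double stdout
            out=$((out * 2))
            n=$((n / 2))
        fi
    done
    echo "${steps[*]} -> $err copies on stderr"
}
```

For the running example, trace_counts 12 prints "double double copy double copy -> 12 copies on stderr", matching the sequence in the diagrams.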
If we take this sequence of steps and systematically turn them into a shell script, we get a little something like this:
dupe () {
    local N=$1
    # Avoid any processing if N is below 1.
    # (( )) compares numerically; inside [[ ]], > compares strings.
    if (( N > 0 )); then
        if [[ $N == 1 ]]; then
            # If the current bit is the last bit, copy to
            # stderr and stop duplicating
            tee /dev/stderr
        elif (( N % 2 == 1 )); then
            # If current bit is one, copy to stderr, set
            # current bit to zero, and continue duplicating
            tee /dev/stderr | dupe $((N-1))
        else
            # If current bit is zero, duplicate input once,
            # shift N to next bit, and continue duplicating
            tee /dev/stdout | dupe $((N/2))
        fi
    fi
}
# Throw away stdout & redirect stderr to stdout
dupe "$1" 2>&1 >/dev/null
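It can help to see what the recursion amounts to once it is flattened. The following sketch prints the plain pipeline for a given n and can run it via eval. The helper names unrolled and run_unrolled are hypothetical, and for n = 0 the sketch substitutes cat so that the command stays well-formed.

```bash
# unrolled N: hypothetical helper that prints the flat pipeline
# which the recursive dupe N expands into.
unrolled() {
    local n=$1 pipeline="" stage
    while (( n > 0 )); do
        if (( n % 2 == 1 )); then
            stage="tee /dev/stderr"; n=$((n - 1))   # save current copies
        else
            stage="tee /dev/stdout"; n=$((n / 2))   # double current copies
        fi
        pipeline+="${pipeline:+ | }$stage"
    done
    # For N = 0 there are no stages; cat keeps the command well-formed.
    echo "${pipeline:-cat}"
}

# run_unrolled N: evaluate that pipeline, discarding stdout and
# redirecting stderr to stdout as the script does.
run_unrolled() {
    eval "( $(unrolled "$1") ) 2>&1 >/dev/null"
}
```

For instance, unrolled 12 prints tee /dev/stdout | tee /dev/stdout | tee /dev/stderr | tee /dev/stdout | tee /dev/stderr, which is the double, double, copy, double, copy sequence from the diagrams.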
A not-so-small aside: the behaviour expressed in the first clause of the innermost if statement, [[ $N == 1 ]], can be removed and expressed in an else branch on the outermost if statement.
dupe () {
    local N=$1
    if (( N > 0 )); then
        if (( N % 2 == 1 )); then
            # If current bit is one, copy to stderr, set
            # current bit to zero, and continue duplicating
            tee /dev/stderr | dupe $((N-1))
        else
            # If current bit is zero, duplicate input once,
            # shift N to next bit, and continue duplicating
            tee /dev/stdout | dupe $((N/2))
        fi
    else
        # No bits left: pass stdin through unchanged
        # (tee with no file arguments just copies to stdout)
        tee
    fi
}
# Throw away stdout & redirect stderr to stdout
dupe "$1" 2>&1 >/dev/null
Furthermore, with the final step of the recursion (either with the innermost if statement of the former implementation, or the outermost else branch of the latter implementation), clearing stdout can be moved inside the final call.
dupe () {
    local N=$1
    if (( N > 0 )); then
        if (( N % 2 == 1 )); then
            # If current bit is one, copy to stderr, set
            # current bit to zero, and continue duplicating
            tee /dev/stderr | dupe $((N-1))
        else
            # If current bit is zero, duplicate input once,
            # shift N to next bit, and continue duplicating
            tee /dev/stdout | dupe $((N/2))
        fi
    else
        # Throw away stdout
        tee >/dev/null
    fi
}
# Redirect stderr to stdout
dupe "$1" 2>&1
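Whichever variant you pick, a quick sanity check is cheap. The sketch below counts output lines for n = 0 up to some maximum. The name check_copies is hypothetical, and it assumes the command under test reads stdin and takes n as its last argument, like my-program in the task description.

```bash
# check_copies MAX CMD [ARGS...]: hypothetical harness that feeds a
# single line to "CMD ARGS... n" for n = 0..MAX and counts the copies.
check_copies() {
    local max=$1 n got
    shift
    for ((n = 0; n <= max; n++)); do
        got=$(echo a | "$@" "$n" | wc -l)
        if (( got != n )); then
            echo "n=$n: got $got copies"
            return 1
        fi
    done
    echo "ok: 0..$max"
}
```

Assuming the script is saved as an executable named my-program, check_copies 20 ./my-program should report ok: 0..20.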
The distinction is largely one of “neether or naither”. Which one you use is up to you; using the else branch on the outermost if statement adds one extra call to the recursive function.
Thus, we’ve reached the end of this mini-post! Congratulations! Using nothing but tee and built-in bash features, you can now duplicate any possible input any number of times!