IBM LPI exam prep, LPI 101 Topic 103: GNU and Unix commands (4): Streams, pipes, and redirects

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 03:23

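Summary:

If you think streams and pipes make a Linux expert sound like a plumber, here is a chance to find out what they really are, and to learn about redirecting and splitting output. You will also learn how to use a stream as the arguments to a command.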

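Overview:

This article covers the basics of redirecting the standard I/O streams. Specifically, you will learn to:

Redirect the standard I/O streams: standard input, standard output, and standard error
Use a pipe to feed the output of one command into the input of another
Send output to both standard output and a file
Use the output of one command as the arguments to another command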

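Creating the directory structure for the examples:

This article continues to use the files created in the previous article. If you did not work through that article, or did not keep the files, you can simply create them again. First create a directory named lpi103-4 in your home directory, then create the required files inside it. The following commands do both. The third echo line relies on bash history expansion (!#), which by default works only at an interactive prompt; the line after it is bash printing the expanded command.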

[root@localhost ~]# mkdir -p lpi103-4 && cd lpi103-4 && {
> echo -e "1 apple\n2 pear\n3 banana" > text1
> echo -e "9\tplum\n3\tbanana\n10\tapple" > text2
> echo "This is a sentence. " !#:* !#:1->text3
echo "This is a sentence. " "This is a sentence. " "This is a sentence. ">text3
> split -l 2 text1
> split -b 17 text2 y; }

After the commands finish, the directory contains the following files:

[root@localhost lpi103-4]# ll
total 28
-rw-r--r-- 1 root root 24 Jun 26 14:21 text1
-rw-r--r-- 1 root root 25 Jun 26 14:21 text2
-rw-r--r-- 1 root root 63 Jun 26 14:21 text3
-rw-r--r-- 1 root root 15 Jun 26 14:21 xaa
-rw-r--r-- 1 root root 9 Jun 26 14:21 xab
-rw-r--r-- 1 root root 17 Jun 26 14:21 yaa
-rw-r--r-- 1 root root 8 Jun 26 14:21 yab

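Redirecting standard I/O

A Linux shell such as bash receives input and sends output as sequences, or streams, of characters. Each character is independent of the ones before and after it. The characters are not organized into structured records or fixed-size blocks. Streams are accessed with file I/O techniques, whether the stream actually comes from or goes to a file, a keyboard, a window on a display, or some other device. Linux shells use three standard I/O streams, each associated with a well-known file descriptor:

stdout is the standard output stream, which displays output from commands. It has file descriptor 1.
stderr is the standard error stream, which displays error output from commands. It has file descriptor 2.
stdin is the standard input stream, which provides input to commands. It has file descriptor 0.

Input streams provide input to programs, usually from terminal keystrokes. Output streams print text characters, usually to the terminal. The terminal was originally an ASCII typewriter or display terminal, but now it is more often a text window on a graphical desktop.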

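Redirecting output

There are two ways to redirect output:

n> redirects output from file descriptor n to a file. You must have write permission for the file. If the file does not exist, it is created. If it does exist, its existing contents are usually lost without any warning.

n>> also redirects output from file descriptor n to a file, but appends to the existing contents instead of replacing them.

The n in n> or n>> is the file descriptor. If it is omitted, it defaults to 1, standard output. Some examples: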

[root@localhost lpi103-4]# ls x* z*
ls: cannot access z*: No such file or directory
xaa xab
[root@localhost lpi103-4]# ls x* z* >stdout.txt 2>stderr.txt
[root@localhost lpi103-4]# ls w* y*
ls: cannot access w*: No such file or directory
yaa yab
[root@localhost lpi103-4]# ls w* y* >>stdout.txt 2>>stderr.txt
[root@localhost lpi103-4]# cat stdout.txt
xaa
xab
yaa
yab
[root@localhost lpi103-4]# cat stderr.txt
ls: cannot access z*: No such file or directory
ls: cannot access w*: No such file or directory
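The difference between n> and n>> can be checked with a short script. This is a minimal sketch that assumes bash and a throwaway directory from mktemp; the file names log.txt, out.txt, and err.txt are placeholders chosen for illustration, not part of the original exercise.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so no real files are touched.
work=$(mktemp -d)
cd "$work" || exit 1

echo "first"  > log.txt    # fd 1 is implied: same as 1> log.txt
echo "second" > log.txt    # truncates log.txt, so "first" is lost
echo "third" >> log.txt    # appends after "second"

# One line goes to stdout (fd 1) and one to stderr (fd 2);
# each stream is captured in its own file.
{ echo "to stdout"; echo "to stderr" >&2; } > out.txt 2> err.txt

log_contents=$(cat log.txt)
out_contents=$(cat out.txt)
err_contents=$(cat err.txt)
printf '%s\n' "$log_contents" "$out_contents" "$err_contents"

cd / && rm -rf "$work"
```

After the script runs, log.txt holds only the second and third lines, which shows that > replaced the file while >> extended it.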

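As noted above, redirecting with n> overwrites existing files. You can change this behavior with set -o noclobber. With noclobber set, you can still force an overwrite with n>|.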

[root@localhost lpi103-4]# set -o noclobber
[root@localhost lpi103-4]# ls x* z* >stdout.txt 2>stderr.txt
bash: stdout.txt: cannot overwrite existing file
[root@localhost lpi103-4]# ls x* z* >|stdout.txt 2>|stderr.txt
[root@localhost lpi103-4]# cat stdout.txt
xaa
xab
[root@localhost lpi103-4]# cat stderr.txt
ls: cannot access z*: No such file or directory
[root@localhost lpi103-4]# set +o noclobber # restore the original setting
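The noclobber behavior can also be exercised in a script. The sketch below assumes bash; data.txt and the variable names are hypothetical. The shell's own "cannot overwrite" message is silenced by redirecting the stderr of a brace group.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1

echo "original" > data.txt
set -o noclobber

# With noclobber set, a plain > refuses to replace an existing file.
# The redirection fails, so the command returns a non-zero status.
if { echo "replaced" > data.txt; } 2>/dev/null; then
    plain_status="overwrote"
else
    plain_status="refused"
fi
after_plain=$(cat data.txt)

# >| overrides noclobber for this one redirection.
echo "forced" >| data.txt
after_forced=$(cat data.txt)

set +o noclobber   # restore the default behavior
echo "$plain_status / $after_plain / $after_forced"

cd / && rm -rf "$work"
```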

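Sometimes you want to redirect both stdout and stderr to the same file, for example in automated processes or background jobs, so that you can review the output later. Use &> or &>> for this (&>> requires bash 4 or later). An alternative is to redirect file descriptor n to the file and then point file descriptor m at the same place with m>&n. The order of the redirections matters. For example: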

command 2>&1 >output.txt

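is not the same as

command >output.txt 2>&1

In the first case, stderr is redirected to wherever stdout currently points (normally the terminal), and then stdout is redirected to output.txt. The second redirection affects only stdout, not stderr.

In the second case, stdout is redirected to output.txt first, and stderr is then redirected to the same place as stdout, so stderr also ends up in output.txt.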

[root@localhost lpi103-4]# ls x* z* &>output.txt
[root@localhost lpi103-4]# cat output.txt
ls: cannot access z*: No such file or directory
xaa
xab
[root@localhost lpi103-4]# ls x* z* >output.txt 2>&1
[root@localhost lpi103-4]# cat output.txt
ls: cannot access z*: No such file or directory
xaa
xab
[root@localhost lpi103-4]# ls x* z* 2>&1 >output.txt # stderr does not go to output.txt
ls: cannot access z*: No such file or directory
[root@localhost lpi103-4]# cat output.txt
xaa
xab
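The effect of redirection order can be reproduced without depending on the wording of ls error messages. In this sketch, emit is a hypothetical helper function that writes one line to each stream, and the file names are placeholders.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
    echo "out"
    echo "err" >&2
}

# stdout goes to the file first, then stderr is pointed at the same place.
emit > both.txt 2>&1

# stderr is pointed at the current stdout (captured here by the command
# substitution), and only afterwards is stdout sent to the file.
leaked=$(emit 2>&1 > only_out.txt)

both=$(cat both.txt)
only_out=$(cat only_out.txt)
printf 'both.txt:\n%s\nonly_out.txt:\n%s\nleaked: %s\n' "$both" "$only_out" "$leaked"

cd / && rm -rf "$work"
```

In the second call the "err" line never reaches only_out.txt, because stderr was duplicated from stdout before stdout was moved to the file.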

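Sometimes you want to ignore stdout or stderr entirely. To do so, redirect the stream to the special file /dev/null, which discards anything written to it and is always empty when read: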

[root@localhost lpi103-4]# ls x* z* 2>/dev/null
xaa xab
[root@localhost lpi103-4]# cat /dev/null
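As a small sketch of discarding streams, the following assumes bash and uses a hypothetical emit function in place of a real command:

```shell
#!/usr/bin/env bash
# Hypothetical helper: one useful line on stdout, one noisy line on stderr.
emit() {
    echo "kept"
    echo "noise" >&2
}

visible=$(emit 2>/dev/null)        # discard stderr, keep stdout
silent=$(emit >/dev/null 2>&1)     # discard both streams

# Reading /dev/null yields end-of-file immediately, so it contains 0 bytes.
null_bytes=$(wc -c < /dev/null | tr -d ' ')

echo "visible=[$visible] silent=[$silent] null_bytes=$null_bytes"
```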

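Redirecting input

Just as stdout and stderr can be redirected, stdin can be redirected with the < operator. Earlier articles used the form cat file | command to make file the input of command. The cat is not necessary: redirecting stdin achieves the same result. The following example uses tr to replace each space in text1 with a tab: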

[root@localhost lpi103-4]# tr ' ' '\t'<text1
1 apple
2 pear
3 banana
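To confirm that the two forms are equivalent, the sketch below runs both and compares the results. It assumes bash and recreates text1 in a temporary directory so that it is self-contained.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
printf '1 apple\n2 pear\n3 banana\n' > text1

via_pipe=$(cat text1 | tr ' ' '\t')     # extra cat process
via_redirect=$(tr ' ' '\t' < text1)     # tr reads the file on stdin directly

if [ "$via_pipe" = "$via_redirect" ]; then
    echo "identical output"
fi

cd / && rm -rf "$work"
```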

Shells, including bash, also support the here-document, which is another form of input redirection. It is written as << followed by a word, such as END, that marks the end of the input:

[root@localhost lpi103-4]# sort -k2 <<END
> 1 apple
> 2 pear
> 3 banana
> END
1 apple
3 banana
2 pear

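You may wonder whether you could simply type the input and press Ctrl-d to signal the end of input instead of using a marker such as END. At a terminal you can, but not in a script file, because there is no way to type Ctrl-d in a script. Because shell scripts make heavy use of tabs for indentation, there is a variant of the here-document, <<-, which strips leading tabs from every line.

The following example creates a script file that uses <<-. The variable ht holds a tab character, so that the tabs are written into the script reliably: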

[root@localhost lpi103-4]# ht=$(echo -en "\t")
[root@localhost lpi103-4]# cat<<END>ex-here.sh
> cat <<-EOF
> apple
> EOF
> ${ht}cat <<-EOF
> ${ht}pear
> ${ht}EOF
> END
[root@localhost lpi103-4]# cat ex-here.sh
cat <<-EOF
apple
EOF
cat <<-EOF
pear
EOF
[root@localhost lpi103-4]# bash ex-here.sh
apple
pear
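Both here-document forms can be tried in one script. This is a sketch that assumes bash; sorted.txt and ex-here.sh are written to a temporary directory, and printf is used to generate the tab-indented script so that the tabs are explicit rather than invisible.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1

# Plain here-document: everything up to END becomes sort's stdin.
sort -k2 > sorted.txt <<END
1 apple
2 pear
3 banana
END
sorted=$(cat sorted.txt)

# <<- variant: leading tabs are stripped from the body and from the
# closing marker. \t in the printf format produces a real tab character.
printf 'cat <<-EOF\n\tpear\n\tEOF\n' > ex-here.sh
stripped=$(bash ex-here.sh)

printf '%s\n---\n%s\n' "$sorted" "$stripped"

cd / && rm -rf "$work"
```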

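Creating pipelines

The article "Text streams and filters" showed that pipes can connect several text-processing commands. Pipes are not limited to text streams, although that is where they are used most often.

Piping stdout to stdin

The | (pipe) operator connects the stdout of the first command to the stdin of the second. You build longer pipelines by adding more commands and more pipe operators. Any command in a pipeline may have options or arguments, and many commands accept a single hyphen (-) in place of a file name to mean that input comes from stdin. In outline: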

command1 | command2 parameter1 | command3 parameter1 - parameter2 | command4

Note that a pipe connects only stdout to stdin. You cannot use 2| to pipe stderr alone. If stderr has been redirected to stdout, however, both streams go through the pipe. For example:

[root@localhost lpi103-4]# ls y* x* z* u* q*
ls: cannot access z*: No such file or directory
ls: cannot access u*: No such file or directory
ls: cannot access q*: No such file or directory
xaa xab yaa yab
[root@localhost lpi103-4]# ls y* x* z* u* q* 2>&1 | sort
ls: cannot access q*: No such file or directory
ls: cannot access u*: No such file or directory
ls: cannot access z*: No such file or directory
xaa
xab
yaa
yab
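The same point can be shown with output that does not depend on ls. In this sketch, emit is a hypothetical helper that writes to both streams; the variable names are also illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: "b-out" on stdout, "a-err" on stderr.
emit() {
    echo "b-out"
    echo "a-err" >&2
}

# Without 2>&1, stderr bypasses the pipe. sort's output is thrown away,
# and the group's stderr is captured to show what never entered the pipe.
bypassed=$( { emit | sort > /dev/null; } 2>&1 )

# With 2>&1, both lines travel through the pipe and are sorted together.
through_pipe=$(emit 2>&1 | sort)

printf 'bypassed: %s\nthrough pipe:\n%s\n' "$bypassed" "$through_pipe"
```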

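One advantage of pipes on Linux and UNIX systems is that no intermediate file is involved: the stdout of the first command is not written to a file and then read back by the second command. An earlier article described using tar to archive and compress in a single step. If you are working on a UNIX system whose tar does not support compression options such as -z or -j, a pipeline does the job just as well:

bunzip2 -c somefile.tar.bz2 | tar -xvf -

Starting a pipeline with a file instead of stdout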

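In the examples above, each pipeline starts with a command that generates output. A pipeline can equally start from an existing file: use < to redirect the stdin of the first command to that file. (Translator's note: starting from a file also fits the filter philosophy, in which a filter only processes data and does not produce it.)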
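A pipeline that starts from a file might look like the following sketch. It assumes bash and recreates text1 in a temporary directory; the choice of sort and tr as filters is only for illustration.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
printf '1 apple\n2 pear\n3 banana\n' > text1

# The redirection may be written before the command name, which makes
# the data flow read left to right: file, then sort, then tr.
result=$(< text1 sort -k2 | tr '[:lower:]' '[:upper:]')
echo "$result"

cd / && rm -rf "$work"
```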

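Using output as arguments

The discussion of pipelines showed how to use the output of one command as the input of another. Suppose instead that you want the output of a command, or the contents of a file, to become the arguments of a command rather than its input. A pipe cannot do that. Three common approaches are:

The xargs command
The find command with the -exec option
Command substitution

This article covers the first two. Command substitution can be used on the command line, but it appears more often in scripts.

Using the xargs command

The xargs command reads standard input, then builds and executes commands with the input as arguments. If no command is given, xargs uses echo. For example: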

[root@localhost lpi103-4]# cat text1
1 apple
2 pear
3 banana
[root@localhost lpi103-4]# xargs<text1
1 apple 2 pear 3 banana

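Why is there only one line of output? By default, xargs splits its input on whitespace, and each resulting token becomes one argument. When xargs builds the command, however, it passes as many arguments as it can at once. You can change this with -n or --max-args, which limits the number of arguments per invocation: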

[root@localhost lpi103-4]# xargs<text1 echo "args >"
args > 1 apple 2 pear 3 banana
[root@localhost lpi103-4]# xargs --max-args 3 <text1 echo "args >"
args > 1 apple 2
args > pear 3 banana
[root@localhost lpi103-4]# xargs -n 1 <text1 echo "args >"
args > 1
args > apple
args > 2
args > pear
args > 3
args > banana
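The batching behavior can be captured in variables for comparison. This sketch assumes bash and an xargs that supports -n; it recreates text1 in a temporary directory.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
printf '1 apple\n2 pear\n3 banana\n' > text1

# Default: all six tokens are passed to a single echo.
all_at_once=$(xargs echo "args >" < text1)

# -n 3: at most three tokens per echo, so echo runs twice.
three_each=$(xargs -n 3 echo "args >" < text1)

printf '%s\n---\n%s\n' "$all_at_once" "$three_each"

cd / && rm -rf "$work"
```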

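If whitespace in the input is enclosed in single or double quotes, or escaped with a backslash, xargs does not split the input at that point. For example: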

[root@localhost lpi103-4]# echo '"4 plum"' | cat text1 -
1 apple
2 pear
3 banana
"4 plum"
[root@localhost lpi103-4]# echo '"4 plum"' | cat text1 - | xargs -n 1
1
apple
2
pear
3
banana
4 plum
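The three cases (unprotected, quoted, and backslash-escaped whitespace) can be compared side by side. This is a sketch that assumes bash; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Unprotected space: xargs sees two tokens, so echo runs twice with -n 1.
split_words=$(printf '4 plum\n' | xargs -n 1 echo)

# Double quotes in the input protect the space: one token.
quoted=$(printf '"4 plum"\n' | xargs -n 1 echo)

# A backslash before the space protects it as well.
# In the printf format, \\ produces a single literal backslash.
escaped=$(printf '4\\ plum\n' | xargs -n 1 echo)

printf '%s\n---\n%s\n---\n%s\n' "$split_words" "$quoted" "$escaped"
```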

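So far, all the arguments have been added to the end of the command. If you need them elsewhere among the arguments, use the -I option with a placeholder string. With -I, each input line is treated as a single argument and the command runs once per line: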

[root@localhost lpi103-4]# xargs -I XYZ echo "START XYZ REPEAT XYZ END" <text1
START 1 apple REPEAT 1 apple END
START 2 pear REPEAT 2 pear END
START 3 banana REPEAT 3 banana END
[root@localhost lpi103-4]# xargs -IX echo "<X><X>" <text1
<1 apple><1 apple>
<2 pear><2 pear>
<3 banana><3 banana>
[root@localhost lpi103-4]#
[root@localhost lpi103-4]# cat text1 text2 | xargs -L2
1 apple 2 pear
3 banana 9 plum
3 banana 10 apple

The -L 2 option tells xargs to use up to two input lines for each invocation of the command.
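A compact sketch of -I and -L together, assuming bash and an xargs that supports both options. ITEM is an arbitrary placeholder string chosen for this example.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
printf '1 apple\n2 pear\n3 banana\n' > text1

# -I ITEM: each input line replaces ITEM, one command per line.
wrapped=$(xargs -I ITEM echo "<ITEM>" < text1)

# -L 2: each command receives the tokens from up to two input lines.
paired=$(xargs -L 2 echo < text1)

printf '%s\n---\n%s\n' "$wrapped" "$paired"

cd / && rm -rf "$work"
```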

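Although these examples use simple text files, you will seldom use xargs that way in practice. The input usually comes from the output of commands such as ls or grep. For example: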

[root@localhost lpi103-4]# ls | xargs grep "1"
text1:1 apple
text2:10 apple
xaa:1 apple
yaa:1

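What happens if one or more of the file names contains a space? The command fails, because xargs splits the name into separate arguments.

For ls, the --quoting-style option can force file names to be quoted or escaped. A better solution, where available, is the -0 option of xargs, which makes xargs use the null character (\0) as the separator between arguments. Although ls has no option that produces null-terminated file names, many other commands do.

The following examples show these techniques.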

[root@localhost lpi103-4]# cp text1 "text 1"
[root@localhost lpi103-4]# ls *1 | xargs grep "1" # error
text1:1 apple
grep: text: No such file or directory
grep: 1: No such file or directory
[root@localhost lpi103-4]# ls --quoting-style escape *1
text1 text\ 1
[root@localhost lpi103-4]# ls --quoting-style shell *1
text1 'text 1'
[root@localhost lpi103-4]# ls --quoting-style shell *1 | xargs grep "1"
text1:1 apple
text 1:1 apple
[root@localhost lpi103-4]# ls *1 | tr '\n' '\0' | xargs -0 grep "1"
text1:1 apple
text 1:1 apple
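The difference between newline-separated and null-separated input can be checked as follows. This sketch assumes bash (for printf with \0) and an xargs that supports -0, as GNU and BSD versions do; grep -l is used so that the output is just the list of matching file names.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
printf '1 apple\n' > text1
cp text1 "text 1"

# Newline-separated names: "text 1" is split into "text" and "1",
# so grep only finds text1 (its errors are discarded here).
broken=$(printf '%s\n' text1 "text 1" | xargs grep -l "apple" 2>/dev/null)

# Null-separated names: the space is preserved and both files are found.
safe=$(printf '%s\0' text1 "text 1" | xargs -0 grep -l "apple")

printf 'broken:\n%s\nsafe:\n%s\n' "$broken" "$safe"

cd / && rm -rf "$work"
```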

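xargs does not build arbitrarily long commands. Until Linux kernel 2.6.23, the maximum size of a command line was limited. A command such as rm somepath/*, run against a directory that contains many files with long names, could fail with a message that the argument list was too long. The limit may still apply on older Linux systems and on UNIX systems.

Use the --show-limits option of xargs to display the default limits, and the -s option to set the maximum size of the commands it builds.

Using find -exec, or find together with xargs

The article "File and directory management" showed how to use find to locate files by name, modification time, size, or other attributes. Once you have such a set of files, you usually want to do something with them: delete them, copy them, rename them, and so on. The -exec option of find provides functionality similar to xargs.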

[root@localhost lpi103-4]# find text[12] -exec cat text3 {} \;
This is a sentence. This is a sentence. This is a sentence.
1 apple
2 pear
3 banana
This is a sentence. This is a sentence. This is a sentence.
9 plum
3 banana
10 apple

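Compared with xargs, there are several differences:

You must use {} to mark where the file name goes in the command. It is not added to the end automatically.
You must terminate the command with a semicolon (;). Because the semicolon is also a shell metacharacter, it must be escaped or quoted, for example as \; or ';'.
The command is executed once for each input file.

The equivalent using xargs is: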

[root@localhost lpi103-4]# find text[12] |xargs cat text3
This is a sentence. This is a sentence. This is a sentence.
1 apple
2 pear
3 banana
9 plum
3 banana
10 apple
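The "once per file" versus "once for all files" difference is easier to see with very short files. In this sketch, which assumes bash, the one-line contents A, B, and HEADER are placeholders rather than the original exercise data.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
printf 'A\n' > text1
printf 'B\n' > text2
printf 'HEADER\n' > text3

# -exec ... \; runs cat once per file, so text3 is printed twice.
per_file=$(find text1 text2 -exec cat text3 {} \;)

# xargs appends both names to one cat command, so text3 is printed once.
batched=$(find text1 text2 | xargs cat text3)

printf 'per file:\n%s\nbatched:\n%s\n' "$per_file" "$batched"

cd / && rm -rf "$work"
```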

Now return to the problem of file names that contain spaces:

[root@localhost lpi103-4]# find . -name "*1" -exec grep "1" {} \;
1 apple
1 apple

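So far, so good. But is something missing? Which files contained the lines that grep found? The file names are gone, because find runs grep once per file, and grep does not print the file name when it searches only a single file. What about xargs instead? The problem with spaces in file names has already been shown above.

There are two solutions. The first is the -print0 option of find, which separates the file names in its output with null characters for use with xargs -0. The second is that modern versions of find allow the -exec command to be terminated with + instead of ;. This makes find pass as many file names as possible to a single invocation of the command, much as xargs does.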

[root@localhost lpi103-4]# find . -name "*1" -print0 | xargs -0 grep "1"
./text 1:1 apple
./text1:1 apple
[root@localhost lpi103-4]# find . -name "*1" -exec grep "1" {} +
./text 1:1 apple
./text1:1 apple
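The three variants can be compared in one script. This sketch assumes bash and a find and xargs that support -print0, -0, and -exec ... +; the results are passed through LC_ALL=C sort only to make the order deterministic, since find does not guarantee one.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
printf '1 apple\n' > text1
cp text1 "text 1"

# One grep per file: matches are printed without file names.
one_by_one=$(find . -name '*1' -exec grep "1" {} \; | LC_ALL=C sort)

# Null-separated names into xargs -0: one grep for both files.
with_print0=$(find . -name '*1' -print0 | xargs -0 grep "1" | LC_ALL=C sort)

# -exec ... + batches the names in the same way, without xargs.
with_plus=$(find . -name '*1' -exec grep "1" {} + | LC_ALL=C sort)

printf '%s\n---\n%s\n---\n%s\n' "$one_by_one" "$with_print0" "$with_plus"

cd / && rm -rf "$work"
```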

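In general, either method works and the choice is a matter of preference. Remember that piping unprotected whitespace to another command causes problems, so use find -print0 with xargs -0 to handle it. Other commands can also read null-delimited input, for example GNU tar with --null together with -T. If a command supports null-delimited input, use it.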

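One final point: when you operate on a large number of files, make sure you have a backup, and check the command thoroughly before you run it for real.

Splitting output with tee

This section closes the article with a brief look at one more command. Sometimes you want to see output on the screen and also save a copy in a file for later. You could redirect the output of the command to a file and then watch that file with tail -f, but the tee command is simpler.

You use tee in a pipeline. Its arguments are one or more file names that receive a copy of the data arriving on its stdin. The -a option appends to the files rather than overwriting them. As with any pipeline, you must redirect stderr to stdout before the pipe if you want to capture both. For example: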

[root@localhost lpi103-4]# ls text[1-3] | tee f1 f2
text1
text2
text3
[root@localhost lpi103-4]# cat f1
text1
text2
text3
[root@localhost lpi103-4]# cat f2
text1
text2
text3
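The -a option and the capture of stderr are described above but not shown in the transcript. The following sketch, assuming bash, covers both; emit is a hypothetical helper that writes to both streams, and f1 and f2 are created in a temporary directory.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical helper: one line on stdout, one on stderr.
emit() {
    echo "out"
    echo "err" >&2
}

# 2>&1 before the pipe sends both streams to tee, which copies them
# to f1, to f2, and to its own stdout (captured in $shown).
shown=$(emit 2>&1 | tee f1 f2)

# -a appends to f1 instead of truncating it; f2 is left alone.
echo "more" | tee -a f1 > /dev/null

f1_contents=$(cat f1)
f2_contents=$(cat f2)
printf 'shown:\n%s\nf1:\n%s\nf2:\n%s\n' "$shown" "$f1_contents" "$f2_contents"

cd / && rm -rf "$work"
```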