Perl Command-Line Usage: Chinese-English Parallel Text

Perl Command-Line Options

by Dave Cross
August 10, 2004

Perl has a large number of command-line options that can help to make your programs more concise and open up many new possibilities for one-off command-line scripts using Perl. In this article we'll look at some of the most useful of these.

Safety Net Options

There are three options I like to think of as a "safety net," as they can stop you from making a fool of yourself when you're doing something particularly clever (or stupid!). And while they aren't ever necessary, it's rare that you'll find an experienced Perl programmer working without them.

The first of these is -c. This option compiles your program without running it. This is a great way to ensure that you haven't introduced any syntax errors while you've been editing a program. When I'm working on a program I never go more than a few minutes without saving the file and running:

  $ perl -c <program>

This makes sure that the program still compiles. It's far easier to fix problems when you've only made a few changes than it is to type in a couple of hundred lines of code and then try to debug that.

The next safety net is the -w option. This turns on warnings that Perl will then give you if it finds any of a number of problems in your code. Each of these warnings is a potential bug in your program and should be investigated. In modern versions of Perl (since 5.6.0) the -w option has been replaced by the use warnings pragma, which is more flexible than the command-line option so you shouldn't use -w in new code.

The final safety net is the -T option. This option puts Perl into "taint mode." In this mode, Perl inherently distrusts any data that it receives from outside the program's source -- for example, data passed in on the command line, read from a file, or taken from CGI parameters.

Tainted data cannot be used in an expression that interacts with the outside world -- for example, you can't use it in a call to system or as the name of a file to open. The full list of restrictions is given in the perlsec manual page.

In order to use this data in any of these potentially dangerous operations you need to untaint it. You do this by checking it against a regular expression. A detailed discussion of taint mode would fill an article all by itself so I won't go into any more details here, but using taint mode is a very good habit to get into -- particularly if you are writing programs (like CGI programs) that take unknown input from users.

Actually there's one other option that belongs in this set and that's -d. This option puts you into the Perl debugger. This is also a subject that's too big for this article, but I recommend you look at "perldoc perldebug" or Richard Foley's Perl Debugger Pocket Reference.

Command-Line Programs

The next few options I want to look at make it easy to run short Perl programs on the command line. The first one, -e, allows you to define Perl code to be executed by the compiler. For example, it's not necessary to write a "Hello World" program in Perl when you can just type this at the command line.

  $ perl -e 'print "Hello World\n"'

You can have as many -e options as you like and they will be run in the order that they appear on the command line.

  $ perl -e 'print "Hello ";' -e 'print "World\n"'

Notice that like a normal Perl program, all but the last line of code needs to end with a ; character.

Although it is possible to use a -e option to load a module, Perl gives you the -M option to make that easier.

  $ perl -MLWP::Simple -e'print head "http://www.example.com"'

So -Mmodule is the same as use module. If the module has default imports you don't want imported then you can use -m instead. Using -mmodule is the equivalent of use module(), which turns off any default imports. For example, the following command displays nothing as the head function won't have been imported into your main package:

  $ perl -mLWP::Simple -e'print head "http://www.example.com"'

The -M and -m options implement various nice pieces of syntactic sugar to make using them as easy as possible. Any arguments you would normally pass to the use statement can be listed following an = sign.

  $ perl -MCGI=:standard -e'print header'

This command imports the ":standard" export set from CGI.pm and therefore the header function becomes available to your program. Multiple arguments can be listed using quotes and commas as separators.

  $ perl -MCGI='header,start_html' -e'print header, start_html'

In this example we've just imported the two methods header and start_html as those are the only ones we are using.

Implicit Loops

Two other command-line options, -n and -p, add loops around your -e code. They are both very useful for processing files a line at a time. If you type something like:

  $ perl -n -e 'some code' file1

Then Perl will interpret that as:

  LINE:
  while (<>) {
    # your code goes here
  }

Notice the use of the empty file input operator, which will read all of the files given on the command line a line at a time. Each line of the input files will be put, in turn, into $_ so that you can process it. As an example, try:

  $ perl -n -e 'print "$. - $_"' file

This gets converted to:

  LINE:
  while (<>) {
    print "$. - $_"
  }

This code prints each line of the file together with the current line number.

The -p option makes that even easier. This option always prints the contents of $_ each time around the loop. It creates code like this:

  LINE:
  while (<>) {
    # your code goes here
  } continue {
    print or die "-p destination: $!\n";
  }

This uses the little-used continue block on a while loop to ensure that the print statement is always called.

Using this option, our line number generator becomes:

  $ perl -p -e '$_ = "$. - $_"'

In this case there is no need for the explicit call to print as -p calls print for us.

Notice that the LINE: label is there so that you can easily move to the next input record no matter how deep in embedded loops you are. You do this using next LINE.

  $ perl -n -e 'next LINE unless /pattern/; print $_'

Of course, that example would probably be written as:

  $ perl -n -e 'print if /pattern/'

But in a more complex example, the next LINE construct could potentially make your code easier to understand.

If you need to have processing carried out either before or after the main code loop, you can use a BEGIN or END block. Here's a pretty basic way to count the words in a text file:

  $ perl -ne 'END { print $t } @w = /(\w+)/g; $t += @w' file.txt

Each time round the loop we extract all of the words (defined as contiguous runs of \w characters) into @w and add the number of elements in @w to our total variable $t. The END block runs after the loop has completed and prints out the final value in $t.

Of course, people's definition of what constitutes a valid word can vary. The definition used by the Unix wc (word count) program is a string of characters delimited by whitespace. We can simulate that by changing our program slightly, like this:

  $ perl -ne 'END { print $x } @w = split; $x += @w' file.txt

But there are a couple of command-line options that will make that even simpler. Firstly the -a option turns on autosplit mode. In this mode, each input record is split and the resulting list of elements is stored in an array called @F. This means that we can write our word-count program like this:

  $ perl -ane 'END {print $x} $x += @F' file.txt

The default value used to split the record is one or more whitespace characters. It is, of course, possible that you might want to split the input record on another character and you can control this with the -F option. So if we wanted to change our program to split on all non-word characters we could do something like this:

  $ perl -F'\W' -ane 'END {print $x} $x += @F' file.txt

For a more powerful example of what we can do with these options, let's look at the Unix password file. This is a simple, colon-delimited text file with one record per user. The seventh column in this file is the path of the login shell for that user. We can therefore produce a report of the most-used shells on a given system with a command-line script like this:

  $ perl -F':' -ane '$s{$F[6]}++;' \
  > -e 'END { print "$_ : $s{$_}" for keys %s }' /etc/passwd

OK, so it's longer than one line and the output isn't sorted (although it's quite easy to add sorting), but perhaps you can get a sense of the kinds of things that you can do from the command line.

Record Separators

In my previous article I talked a lot about $/ and $\ -- the input and output record separators. $/ defines how much data Perl will read every time you ask it for the next record from a filehandle, and $\ contains a value that is appended to the end of any data that your program prints. The default value of $/ is a new line and the default value of $\ is an empty string (which is why you usually explicitly add a new line to your calls to print).

Now in the implicit loops set up by -n and -p it can be useful to define the values of $/ and $\. You could, of course, do this in a BEGIN block, but Perl gives you an easier option with the -0 (that's a zero) and -l (that's an L) command-line options. This can get a little confusing (well, it confuses me) so I'll go slowly.

Using -0 and giving it a hexadecimal or octal number sets $/ to that value. The special value 00 puts Perl in paragraph mode and the special value 0777 puts Perl into file slurp mode. These are the same as setting $/ to an empty string and undef respectively.

Using -l and giving it no value has two effects. Firstly, it automatically chomps the input record, and secondly, it sets $\ equal to $/. If you give -l an octal number (and unlike -0 it doesn't accept hex numbers) it sets $\ to the character represented by that number and also turns on auto-chomping.

To be honest, I rarely use the -0 option and I usually use the -l option without an argument just to add a new line to the end of each line of output. For example, I'd usually write my original "Hello World" example as:

  $ perl -le 'print "Hello World"'

If I'm doing something that requires changing the values of the input and output record separators then I'm probably out of the realm of command-line scripts.

In-Place Editing

With the options that we have already seen, it's very easy to build up some powerful command-line programs. It's very common to see command line programs that use Unix I/O redirection like this:

  $ perl -pe 'some code' < input.txt > output.txt

This takes records from input.txt, carries out some kind of transformation, and writes the transformed record to output.txt. In some cases you don't want to write the changed data to a different file; it's often more convenient if the altered data is written back to the same file.

You can get the appearance of this using the -i option. Actually, Perl renames the input file and reads from this renamed version while writing to a new file with the original name. If -i is given a string argument, then that string is appended to the name of the original version of the file. For example, to change all occurrences of "PHP" to "Perl" in a data file you could write something like this:

  $ perl -i -pe 's/\bPHP\b/Perl/g' file.txt

Perl reads the input file a line at a time, making the substitution, and then writing the results back to a new file that has the same name as the original file -- effectively overwriting it. If you're not so confident of your Perl abilities you might take a backup of the original file, like this:

  $ perl -i.bak -pe 's/\bPHP\b/Perl/g' file.txt

You'll end up with the transformed data in file.txt and the original file backed up in file.txt.bak. If you're a fan of vi then you might like to use -i~ instead.

Further Information

Perl has a large number of command-line options. This article has simply listed a few of the most useful. For the full list (and for more information on the ones covered here) see the "perlrun" manual page.

 

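An Introduction to Perl Command-Line Usage

Author: Dave Cross
Published: August 10, 2004
Original title: Perl Command-Line Options
Original article: http://www.perl.com/pub/a/2004/08/09/commandline.html
Translator: Qiang

Perl has many command-line options. With them we can write simpler programs. In this article we look at some of the commonly used ones.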

Safety Net Options

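When you try out clever (or stupid) ideas in Perl, mistakes are bound to happen. Experienced Perl programmers routinely use three options to catch them early.

-c is the first. This option compiles a Perl program without actually running it, which checks it for syntax errors. Every time I modify a Perl program I use it straight away to find any syntax errors.

  $ perl -c program.pl

-w is the second option. It warns you about potential problems. Since Perl 5.6.0 the use warnings pragma has replaced -w, and you should prefer use warnings because it is more flexible than -w.

-T is the third option. It puts Perl into taint mode. In this mode Perl distrusts any data that comes from outside the program, for example data read from the command line, read from external files, or passed in by a CGI request. All such data is marked as tainted under -T.

Tainted data cannot be used to interact with the outside world, for example in a call to system or as the file name passed to open. The perlsec documentation has more examples of which data is tainted.

To use tainted data you need to untaint it, which is done with a regular expression. I won't say much more about taint mode here. If the program you are writing (a CGI program, for example) has to accept unknown input from users, I recommend using taint mode.

-d, the Perl debugger, deserves a mention here, but it is beyond the scope of this article. I recommend reading 'perldoc perldebug' or Richard Foley's book Perl Debugger Pocket Reference.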

Command-Line Programs

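The next few options make it possible to run short Perl programs on the command line. -e lets you supply the Perl code on the command line itself. For example, we can run a "Hello World" program from the command line without writing it to a file first.

  $ perl -e 'print "Hello World\n"'

Several -e options can be used together; they run in the order in which they appear.

  $ perl -e 'print "Hello ";' -e 'print "World\n"'

As in any Perl program, only the last line of the program can omit the trailing ;.

Although you can load a module in the usual way, -M makes it easier.

  $ perl -MLWP::Simple -e 'print head "http://www.example.com"'

-Mmodule is the same as use module. If you don't want the module's default imports, use -m instead: -mmodule is the same as use module(). In the following example, head is one of the default imports, so with -m it is not imported and the command prints nothing.

  $ perl -mLWP::Simple -e 'print head "http://www.example.com"'

With both -m and -M, an = sign lets you import specific functions from a module.

  $ perl -MCGI=:standard -e 'print header'

Here the ":standard" export set of CGI.pm is imported, which makes the header function available. To pass several arguments, use quotes and commas.

  $ perl -MCGI='header,start_html' -e 'print header, start_html'

Here we have imported the header and start_html functions.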

Implicit Loops

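-n and -p add a loop around your code so that you can process files one line at a time.

  $ perl -n -e 'some code' file1

This is the same as the following program.

  LINE:
  while (<>) {
    # your code goes here
  }

<> opens the files named on the command line and reads them one line at a time. Each line is stored in $_ by default.

  $ perl -n -e 'print "$. - $_"' file

The line above can be written as

  LINE:
  while (<>) {
    print "$. - $_"
  }

It prints the current line number $. and the current line $_.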

-p makes the program above even easier. -p prints $_ for you, like this

  LINE:
  while (<>) {
    # your code goes here
  } continue {
    print or die "-p destination: $!\n";
  }

The continue block guarantees that print is called on every iteration of the loop.

With -p, our line-numbering program becomes

  $ perl -p -e '$_ = "$. - $_"'

Notice the LINE: label? We can use it to skip to the next iteration with next LINE.

  $ perl -n -e 'next LINE unless /pattern/; print $_'

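If you want to do some processing before or after the loop, you can use a BEGIN or END block. The following line counts the words in a file.

  $ perl -ne 'END { print $t } @w = /(\w+)/g; $t += @w' file.txt

All the words matched on each line go into the array @w, and the number of elements in @w is added to $t. The print in the END block outputs the total word count for the file at the end.

Two more options make this program simpler still. -a turns on autosplit mode, with whitespace as the default separator. Each input record is split on the separator and stored in the default array @F. We can therefore rewrite the program above as

  $ perl -ane 'END {print $x} $x += @F' file.txt

You can also use -F to change the default separator to whatever you want, for example to non-word characters:

  $ perl -F'\W' -ane 'END {print $x} $x += @F' file.txt

Let's use the Unix password file for a more complex example. The Unix password file is a text file in which each line is one user record, with fields separated by colons. The seventh field is the path of the user's login shell. We can work out how many users use each distinct shell:

  $ perl -F':' -ane '$s{$F[6]}++;' \
  > -e 'END { print "$_ : $s{$_}" for keys %s }' /etc/passwd

Although it no longer fits on one line, you can see the kinds of problems these options can solve.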

Record Separators

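Earlier I mentioned $/ and $\ -- the input and output record separators. $/ separates the data read from a filehandle; its default is \n, so each read from a filehandle returns one line. $\ defaults to an empty string and is automatically appended to whatever you print. That is why print calls so often need \n added at the end.

$/ and $\ can be used together with -n and -p. Their command-line counterparts are -0 (zero) and -l (the letter L). -0 can be followed by a hexadecimal or octal number, which is assigned to $/. -00 turns on paragraph mode and -0777 turns on slurp mode (the whole file is read in one go); these have the same effect as setting $/ to an empty string and to undef respectively.

Using -l on its own has two effects: first, it automatically chomps the input record separator, and second, it assigns the value of $/ to $\ (so that print automatically appends \n).

I often use -l to add \n to every line of output. For example

  $ perl -le 'print "Hello World"'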

In-Place Editing

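With the options we have seen so far we can write very effective command-line programs. Unix I/O redirection is common:

  $ perl -pe 'some code' < input.txt > output.txt

This program reads data from input.txt, processes it, and writes the result to output.txt. Sometimes, though, it is more convenient to write the changed data back to the same file.

The -i option makes this possible. -i renames the source file and reads from the renamed copy, then writes the processed data to a file with the original name. If -i is followed by a string, that string is appended to the original file name to form a new file name, and the original file is kept under that name so that -i does not overwrite it.

This example replaces every occurrence of PHP with Perl:

  $ perl -i -pe 's/\bPHP\b/Perl/g' file.txt

The program reads each line of the file, performs the substitution, and writes the processed data back to (that is, overwrites) the source file. If you don't want to lose the original, you can use

  $ perl -i.bak -pe 's/\bPHP\b/Perl/g' file.txt

Here the processed data is written to file.txt, and file.txt.bak is a backup of the original file.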

 

Source: http://bbs.chinaunix.net/viewthread.php?tid=499434