Perl Details

Source: Internet  Editor: 程序博客网  Date: 2024/05/16 06:40

Functions:

    You can think of functions as terms in an expression, along with literals and variables. Or you can think of them as prefix operators that process the arguments after them.


    Some of these operators, er, functions take a LIST as an argument. Elements of the LIST should be separated by commas (or by =>). The elements of the LIST are evaluated in list context, so each element will return either a scalar or a list value, depending on its sensitivity to list context.


    Each returned value, whether scalar or list, will be interpolated as part of the overall sequence of scalar values.


    Functions may be used either with or without parentheses around their arguments. If you do use parentheses, the simple but occasionally surprising rule is this: if it looks like a function, then it is a function, so precedence doesn't matter. Otherwise, it's a list operator or unary operator, and precedence does matter. Be careful, because putting whitespace between the keyword and its left parenthesis doesn't keep it from being a function.


   Examples:
      print (1+4)*6;     # prints 5, not 30: print looks like a function here, so it takes only the parenthesized (1+4); the *6 is applied to print's return value
      print(1+4)*6;      # also prints 5: the whitespace before the parenthesis makes no difference
      $a=print(3+4)+5;   # prints 7; $a is 6, because print returns 1 on success (a false value on failure) and 1+5 is 6
      print +(1+2)*4;    # prints 12: the unary plus keeps the parenthesis from looking like the start of an argument list, so the whole expression is the argument. Any unary operator changes the parsing this way, but only unary plus leaves the value alone (a unary minus would print -12)
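
The rule above can be checked with a minimal sketch. It assumes an in-memory filehandle (the names $buf and $mem are illustrative, not from the original notes) so that whatever print writes can be inspected afterwards. Warnings are left off on purpose; with use warnings, Perl would flag the first print as being interpreted as a function.

```perl
use strict;

# Redirect print into an in-memory buffer so the results can be inspected.
my $buf = '';
open(my $mem, '>', \$buf) or die "cannot open in-memory handle: $!";
my $old = select($mem);

print (1+4)*6;               # parsed as (print(1+4)) * 6, writes "5"
print "\n";
print +(1+4)*6;              # the unary plus makes (1+4)*6 the argument, writes "30"
print "\n";
my $status = print("") + 5;  # print returns 1 on success, so this is 6

select($old);
close($mem);

my @lines = split /\n/, $buf;
print "function form: $lines[0], unary plus: $lines[1], status: $status\n";
```
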


    In general, functions in Perl that serve as wrappers for syscalls of the same name (like chown(2), fork(2), closedir(2), etc.) all return true when they succeed and undef otherwise, which is different from C (where they return -1 on failure). Exceptions to this rule are wait, waitpid, and syscall. Syscalls also set the special $! ($OS_ERROR) variable on failure. Other functions do not, except accidentally.

    There is no rule that relates the behavior of a function in list context to its behavior in scalar context, or vice versa.
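
As an illustration of how different the two contexts can be, here is a small sketch using the built-in reverse; the variable names are made up for the example.

```perl
use strict;

my @words = ("hello", "world");

# List context: reverse returns the elements in the opposite order.
my @list_result = reverse @words;

# Scalar context: reverse concatenates its arguments and reverses the characters.
my $scalar_result = reverse @words;

print "list: @list_result\n";
print "scalar: $scalar_result\n";
```
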

 

   Verbs: many of the verbs in Perl are commands: they tell the Perl interpreter to do something. Some verbs translate their input parameters into return values; we tend to call these verbs "functions". Verbs are also sometimes called operators (when built-in) or subroutines (when user-defined). But call them whatever you like; they all return a value.

 

    Subroutines may be named with an initial &, although the funny character is optional when calling the subroutine.

 

    If any list operator (such as print) or any named unary operator (such as chdir) is followed by a left parenthesis as the next token (ignoring whitespace), the operator and its parenthesized arguments are given highest precedence, as if it were a normal function call. The rule is: if it looks like a function call, it is a function call. You can make it look like a nonfunction by prefixing the parentheses with a unary plus, which does absolutely nothing, semantically speaking; it doesn't even coerce the argument to be numeric.
    Examples:

      chdir +($foo) || die;   # (chdir $foo) || die
      chdir +($foo) * 20;     # chdir($foo * 20)
      rand +(10) * 20;        # rand(10 * 20)
      print $foo, exit;       # never prints: exit is evaluated as one of print's arguments
      print($foo), exit;      # prints first, then exits

 

Variables:

Type         Character   Example     Is a name for
scalar       $           $cents      an individual value (number or string)
array        @           @large      a list of values, keyed by number
hash         %           %interest   a group of values, keyed by string
subroutine   &           &how        a callable chunk of Perl code
typeglob     *           *struck     everything named struck

 

typeglob:

One use of typeglobs is for passing or storing filehandles.
   Examples:

     $fh = *STDOUT;   # save away a filehandle
     $fh = \*STDOUT;  # or save a reference to the typeglob
Another use of typeglobs is to alias one symbol table entry to another symbol table entry.
   Examples:

     *foo = *bar;   # everything named foo becomes an alias for the corresponding thing named bar
     *foo = \$bar;  # alias just one variable from a typeglob by assigning a reference: $foo becomes an alias for $bar, but @foo does not become an alias for @bar
All of these affect global (package) variables only.
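
A short sketch of both kinds of aliasing, assuming package variables declared with our (the values are illustrative only):

```perl
use strict;

our ($foo, $bar, @foo, @bar);
$bar = 'original';
@bar = (1, 2, 3);

*foo = \$bar;                    # alias only the scalar slot
$foo = 'changed via $foo';       # this also changes $bar
my $array_count_before = @foo;   # still 0: @foo is not yet an alias for @bar

*foo = *bar;                     # now alias every slot
push @foo, 4;                    # this also changes @bar

print "$bar\n";
print scalar(@bar), "\n";
```
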

  You can use different quoting mechanisms to make different kinds of values. Double quotation marks (double quotes) do variable interpolation and backslash interpolation, while single quotes suppress interpolation. Backquotes execute an external program and return its output, so you can capture it as a single string containing all the lines of output.
  Examples:
   $exit = system("vim $file");  # numeric status of a command
   $cwd = `pwd`;                 # string output from a command

reference:
   Examples:
   $ary = \@myarray;               # reference to a named array
   $hsh = \%myhash;                # reference to a named hash
   $sub = \&mysub;                 # reference to a named subroutine
   $ary = [1,3,5,7];               # reference to an unnamed array
   $hsh = {Na=>19, Cl=>35};        # reference to an unnamed hash
   $sub = sub { print $state; };   # reference to an unnamed subroutine
   $fido = new Camel "Amelia";     # reference to an object
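
Once you hold a reference, you get back to the data by dereferencing it. The following is a minimal sketch; @myarray, %myhash and mysub are given placeholder contents here purely for illustration.

```perl
use strict;

my @myarray = (1, 3, 5, 7);
my %myhash  = (Na => 19, Cl => 35);
sub mysub { return "called with @_" }

my $ary = \@myarray;
my $hsh = \%myhash;
my $sub = \&mysub;

my $third  = $ary->[2];        # 5, same as ${$ary}[2]
my $sodium = $hsh->{Na};       # 19, same as ${$hsh}{Na}
my $result = $sub->(1, 2);     # "called with 1 2"
my @copy   = @{$ary};          # dereference the whole array
my $kind   = ref($hsh);        # "HASH"
```
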
  
array:

    If you use an array in a conditional, the array returns the number of elements in the array. Do not be tempted to use defined @files; it doesn't work, because the defined function asks whether a scalar is equal to undef, and an array is not a scalar.

    @whatever = (); $#whatever = -1;   # two equivalent ways to empty an array
    Assigning to $#whatever changes the length of the array. Truncating an array does not recover its memory; you have to undef(@whatever) to free its memory back to your process's memory pool.

  Example:  ($a,$b)=($b,$a);  # swap two variables; the assignments happen in parallel
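
The array behaviors described above, put together in one small sketch (the array contents and variable names are invented for the example):

```perl
use strict;

my @days = qw(Mon Tue Wed Thu Fri);

my $count = @days;          # scalar context: 5 elements
my $last  = $#days;         # index of the last element: 4
my $state = @days ? "non-empty" : "empty";   # an array used in a conditional

$#days = 1;                 # truncate to two elements
my @short = @days;          # ("Mon", "Tue")

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);        # parallel swap: $x is 2, $y is 1
```
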

hash:
  The => operator is just a synonym for a comma, but it's more visually distinctive and also quotes any bare identifier to its left.
   Example:

     $field = radio_group(
             name      => 'animals',
             values    => ['camel','llama','ram','wolf'],
             default   => 'camel',
             linebreak => 'true',
             labels    => \%animal_names,
             );  # use named parameters to invoke complicated functions
   When you evaluate a hash variable in scalar context, it returns a true value only if the hash contains any key/value pairs whatsoever. In older versions of Perl the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash (Perl 5.26 and later return the number of keys instead). To find the number of keys in a hash, use the keys function in scalar context: scalar(keys(%hash));

    You can emulate a multidimensional hash by specifying more than one key within the braces, separated by commas. The listed keys are concatenated together, separated by the contents of $; ($SUBSCRIPT_SEPARATOR), which has a default value of chr(28).
   Examples:

       $people{$state,$country} = $census_results;
       $people{join $; => $state, $country} = $census_results;  # equivalent to the line above
       $hash{$x,$y,$z}   # a single value
       @hash{$x,$y,$z}   # a slice of three values
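
A minimal sketch of the multi-key emulation and of counting keys, assuming made-up state and country values:

```perl
use strict;

my %people;
my ($state, $country) = ('Utah', 'USA');

$people{$state, $country} = 42;          # one key built from two values
$people{'Ohio', 'USA'}    = 7;

my $sep  = $;;                           # the subscript separator, chr(28) by default
my $same = $people{join $sep, $state, $country};   # the same element: 42

my $count = keys %people;                # number of keys: 2
my @parts = split /\Q$sep\E/, (sort keys %people)[0];   # ("Ohio", "USA")
```

Because the combined key is an ordinary string, splitting it on the separator recovers the original parts, provided none of them contains the separator character itself.
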
   Examples:

       $wife{"Jacob"} = ["Leah","Rachel","Bilhah","Zilpah"];  # a reference to an unnamed array
       $wife{"Jacob"}[0] = "Leah";
       $kids_of_wife{"Jacob"} = {
            "Leah"   => ["Reuben","Simeon","Levi","Judah"],
            "Zilpah" => ["Gad","Asher"],
           };
       $kids_of_wife{"Jacob"}{"Leah"}[0] = "Reuben";


LIST:
  (LIST): list literals are denoted by separating individual values with commas and enclosing the list in parentheses where precedence requires it. It almost never hurts to use extra parentheses.

  In scalar context, a list literal merely evaluates each of its arguments in scalar context and returns the value of the final element.

   A list value is different from an array. A real array variable also knows its context: in list context it returns its internal list of values, just like a list literal, but in scalar context it returns only the length of the array.

  A list value may also be subscripted like a normal array; you must put the list in parentheses (real ones) to avoid ambiguity.
   Examples:

      @stuff = ("one","two","three");
      $stuff = @stuff;                    # $stuff is 3
      $stuff = ("one","two","three");     # $stuff is "three"
      $modification_time = ( stat($file) )[9];
      # $modification_time = stat($file)[9];   # WRONG: syntax error
      () = funkshun();   # calls your function in list context. If you had just called the function without an assignment, it would instead have been called in void context, which is a kind of scalar context, and might have caused the function to behave completely differently.
      $x = ( ($a,$b) = (7,7,7) );   # $x is 3, not 2: list assignment in scalar context returns the number of elements produced by the expression on the right side of the assignment
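
The counting rule for list assignment is what makes a common idiom work: assigning to an empty list and then reading the result in scalar context. A small sketch, with an invented string and invented variable names:

```perl
use strict;

my $text = "banana";

# Assigning to the empty list () puts the match in list context; the scalar
# assignment around it then receives the number of elements produced.
my $count = () = $text =~ /a/g;              # 3

# The same counting rule with named targets: three values on the right.
my $produced = (my ($p, $q) = (7, 7, 7));    # 3, although only two are stored

# Subscripting a list value requires real parentheses around the list.
my $third = ("one", "two", "three")[2];      # "three"
```
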

 

Construct            Meaning
$days->[28]          29th element of the array pointed to by reference $days
@days[3..5]          array slice containing ($days[3],$days[4],$days[5])
@days{'Jan','Feb'}   hash slice containing ($days{'Jan'},$days{'Feb'})

Perl provides two kinds of namespace:
 (1) symbol tables: global hashes that happen to contain symbol table entries for global variables, including the hashes for other symbol tables. A symbol table in Perl is also known as a package.
     Example:

      $Santa::Helper::Reindeer::Rudolph::nose   # all the leading identifiers are the names of nested symbol tables. The symbol table is named Santa::Helper::Reindeer::Rudolph::, and the actual variable within that symbol table is $nose.
 (2) lexical scopes: unnamed scratchpads that don't live in any symbol table, but are attached to a block of code in your program. They contain variables that can only be seen by the block. Variables attached to a lexical scope are not in any package, so lexically scoped variable names may not contain the :: sequence.
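
A brief sketch contrasting the two kinds of namespace. The shortened package name Santa::Helper and the values are assumptions made for the example.

```perl
use strict;

package Santa::Helper;
our $nose = 'red';             # lives in the symbol table %Santa::Helper::

package main;

my $qualified = $Santa::Helper::nose;             # reach it by its full name
my $in_table  = exists $Santa::Helper::{nose};    # the symbol table is a hash

my $outside;
{
    my $lexical = 'scratchpad';   # visible only inside this block
    $outside = $lexical;
}
# $lexical is no longer in scope here, and it never had a package name.
```
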

 

Constants:

Because Perl uses the comma as a list separator, you cannot use it to separate the thousands in a large number. Perl does allow you to use an underscore character instead. The underscore only works within literal numbers specified in your program, not for strings functioning as numbers or data read from somewhere else. Similarly, the leading 0x for hexadecimal, 0b for binary, and 0 for octal work only for literals.
Examples:

    $x = 6.02e23;              # scientific notation
    $x = 4_294_967_296;        # underscores for legibility
    $x = 0377;                 # octal
    $x = 0xffff;               # hexadecimal
    $x = 0b1100_0000;          # binary
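
To see that the prefixes and underscores are a property of literals only, here is a minimal sketch that compares literals with strings and uses the built-in hex and oct for explicit conversion. Warnings are left off, since numifying these strings would otherwise produce "isn't numeric" warnings.

```perl
use strict;

my $literal_hex = 0x1F;              # 31: the prefix is understood in a literal
my $string_hex  = "0x1F" + 0;        # 0: the string is not parsed as hexadecimal
my $converted   = hex("0x1F");       # 31: convert explicitly with hex
my $octal       = oct("0377");       # 255: oct handles octal strings
my $binary      = oct("0b11000000"); # 192: oct also understands a 0b prefix

my $literal_us  = 1_000;             # 1000
my $string_us   = "1_000" + 0;       # 1: numeric conversion stops at the underscore
```
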

 

interpolation
  You can only interpolate expressions that begin with $ or @; a complete hash specified with a % may not be interpolated into a string. You can put braces around an identifier to distinguish it from following alphanumerics. An identifier within such braces is forced to be a string, as is any single identifier within a hash subscript.
     Examples:

        $days{'Feb'}   # same as $days{Feb}
        print "The price is ${price}good\n";
        @days{Jan,Feb} # kind of wrong (it breaks under use strict): anything more complicated than a single identifier in the subscript is interpreted as an expression, so in slices you should put in the quotes yourself, as in @days{'Jan','Feb'}
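
The interpolation rules can be tried out with a sketch like the following; the variables and their values are invented for the example.

```perl
use strict;

my $price = 5;
my @list  = (1, 2, 3);
my %days  = (Feb => 28);

my $braced = "The price is ${price}good";      # braces end the name before "good"
my $array  = "list: @list";                    # elements joined with a space
my $elem   = "Feb has $days{Feb} days";        # a single hash element interpolates
my $hash   = "hash: %days";                    # a whole hash does not
my $single = 'no $price here';                 # single quotes suppress interpolation
```
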

Quotes:
Quote constructs; any nonalphanumeric, nonwhitespace delimiter may be used in place of /.
Customary   Generic   Meaning
''          q//       literal string (no interpolation)
""          qq//      literal string (with interpolation)
``          qx//      command execution
()          qw//      word list
//          m//       pattern match
s///        s///      pattern substitution
y///        tr///     character translation
""          qr//      regular expression

package:
    package Camel; Perl will assume from this point on that any unspecified verbs or nouns are about Camel. It does this by automatically prefixing any global name with the module name "Camel::".
    $fido = new Camel "Amelia"; We are actually invoking the &new verb in the Camel package, and $fido remembers that it is pointing to a Camel. This is how object-oriented programming works.
    use Camel; not only borrows verbs from another package, but also checks that the module you name is loaded in from disk. In fact, you must say something like use Camel; before you say $fido = new Camel "Amelia"; because otherwise Perl wouldn't know what a Camel is.
    pragmas: some of the built-in modules don't actually introduce verbs at all, but simply warp the Perl language in various useful ways. These special modules we call pragmas. For instance, strict.
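
The notes do not show what the Camel package contains, so the following is only a hypothetical sketch: the constructor body, the name field and the name method are all assumptions. The package is defined in the same file, which is why no use Camel; line is needed here.

```perl
use strict;

package Camel;

# Constructor: the class name arrives as the first argument.
sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

# Accessor for the (hypothetical) name field.
sub name { my ($self) = @_; return $self->{name} }

package main;

my $fido  = Camel->new("Amelia");   # same call as: new Camel "Amelia"
my $class = ref($fido);             # "Camel": the object remembers its package
my $name  = $fido->name;            # "Amelia"
```
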


Run perl:

  (1) If you are trying to cram everything Perl needs to know into 80 columns or so, you can simply call perl explicitly from the command line with the -e switch: perl -e 'print "hello, world!\n";'
  (2) Put all your commands into a file and then run: perl filename   # you invoke the Perl interpreter explicitly
  (3) Let the operating system find the interpreter for you. On some systems, there may be ways of associating various file extensions or directories with a particular application. On Unix systems that support the #! "shebang" notation, the first line of the script can name the interpreter to run.
     
    
Operators:
   Terms: any term is of highest precedence in Perl. Terms include variables, quote and quotelike operators, most expressions in parentheses, or brackets or braces, and any function whose arguments are parenthesized.

    Assignment in Perl returns the actual variable as an lvalue, so that you can modify the same variable more than once in a statement:
   Examples:

       ($temp -= 32) *= 5/9;      # subtract 32 from $temp, then multiply $temp by 5/9
       chomp($number = <STDIN>);  # read a line into $number, then chomp $number
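
A runnable variant of the two examples above, as a sketch only: the starting temperature is made up, and a fixed string stands in for the line that would come from STDIN.

```perl
use strict;

my $temp = 212;              # degrees Fahrenheit
($temp -= 32) *= 5/9;        # subtract, then scale the same variable

my $raw = "42\n";            # stands in for a line read from <STDIN>
chomp(my $number = $raw);    # assign, then chomp the variable just assigned

printf "%.1f %s\n", $temp, $number;
```
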

   Logical operators:

     (1) C's punctuational operators (&&, ||, !) work well when you want your logical operators to bind more tightly than commas.

     (2) BASIC's word-based operators (and, or, not) work well when you want your commas to bind more tightly than your logical operators.

 

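 Comparison operators:
   Numeric   String   Return value
   ==        eq       true if $a is equal to $b
   !=        ne       true if $a is not equal to $b
   <         lt       true if $a is less than $b
   >         gt       true if $a is greater than $b
   <=        le       true if $a is not greater than $b
   >=        ge       true if $a is not less than $b
   <=>       cmp      0 if equal, 1 if $a is greater, -1 if $b is greater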
   
 File test operators:
      Example    Name
      -e $a      exists
      -r $a      readable
      -w $a      writable
      -d $a      a directory
      -f $a      a regular file
      -T $a      a text file
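
A minimal sketch of the file tests in use. It assumes the core File::Temp module to create a scratch file and directory, so it does not depend on any particular path existing on your system.

```perl
use strict;
use File::Temp qw(tempfile tempdir);

my $dir = tempdir(CLEANUP => 1);           # scratch directory, removed at exit
my ($fh, $file) = tempfile(DIR => $dir);   # scratch file inside it
print {$fh} "just some text\n";
close($fh) or die "close failed: $!";

my $exists  = -e $file ? 1 : 0;            # the file exists
my $is_file = -f $file ? 1 : 0;            # and is a regular file
my $is_dir  = -d $file ? 1 : 0;            # but is not a directory
my $dir_ok  = -d $dir  ? 1 : 0;            # the directory is one
my $is_text = -T $file ? 1 : 0;            # the contents look like text
my $missing = -e "$dir/no-such-file" ? 1 : 0;
```
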

Truth in Perl is always evaluated in a scalar context. Other than that, no type coercion is done:
      (1) any string is true except for "" and "0";
      (2) any number is true except for 0;
      (3) any reference is true;
      (4) any undefined value is false;
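
The four rules can be checked against a handful of edge cases. This is a sketch with an arbitrary list of candidate values; note that the strings "0.0" and "00" are true, because only "" and "0" are false strings.

```perl
use strict;

my @candidates = ("", "0", "0.0", "00", " ", 0, 0.0, "a", undef, []);
my @truth = map { $_ ? "true" : "false" } @candidates;

print join(",", @truth), "\n";
```
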


2**3**4   # parsed as 2**(3**4), not (2**3)**4: exponentiation is right-associative

 

The binary -> operator is an infix dereference operator. If the right side is a [] array subscript, a {} hash subscript, or a () subroutine argument list, the left side must be a reference (either hard or symbolic) to an array, a hash, or a subroutine, respectively.
  Examples:

    $aref->[42];
    $href->{"corned beef"};
    $sref->(1,2,3);
    Bear->new("yogi");   # otherwise, the right side is a method name and this is a method call of some kind; the left side must evaluate to either an object or a class name


The autoincrement operator: if the variable has only been used in string contexts since it was set, has a value that is not the null string, and matches the pattern /^[a-zA-Z]*[0-9]*$/, the increment is done as a string, preserving each character within its range, with carry. The autodecrement operator is not magical.
    Examples:

      print ++($foo='a0'); # prints a1
      print ++($foo='Az'); # prints Ba
      print ++($foo='zz'); # prints aaa
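
A small sketch that runs the magical increment over a few starting strings and contrasts it with the non-magical decrement. The starting values are chosen for illustration, and warnings are left off because decrementing a non-numeric string would otherwise warn.

```perl
use strict;

my @results;
for my $start ('a0', 'Az', 'zz', 'a9') {
    my $value = $start;
    $value++;                  # magical string increment with carry
    push @results, "$start becomes $value";
}

my $down = 'ab';
$down--;                       # not magical: 'ab' is treated as the number 0

print "$_\n" for @results;
print "decrement gives $down\n";
```
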

$a % $b: if $b is positive, then the result of $a % $b is $a minus the largest multiple of $b that is not greater than $a (which means the result will always be in the range 0 .. $b-1). If $b is negative, then the result of $a % $b is $a minus the smallest multiple of $b that is not less than $a (which means the result will be in the range $b+1 .. 0).
   Examples:

     print +(-9 % 5), "\n";    # 1
     print +(-9 % -5), "\n";   # -4
     print +(9 % 5), "\n";     # 4
     print +(9 % -5), "\n";    # -1

The && and || operators return the last value evaluated, and the left argument is always evaluated in scalar context.
  Example:

       $home = $ENV{HOME}
                   || $ENV{LOGDIR}
                   || (getpwuid($<))[7]
                   || die "you're homeless!\n";

Bear in mind that the conditional operator binds more tightly than the various assignment operators.
  Example:

      $a % 2 ? $a += 10 : $a += 2;   # this is parsed as (($a % 2) ? ($a += 10) : $a) += 2;
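
To make the consequence concrete, here is a sketch with an odd starting value (the variable names $n and $m are illustrative). Without parentheses the trailing += 2 is applied to whatever the conditional selected, so the odd case ends up with both additions.

```perl
use strict;

my $n = 3;
$n % 2 ? $n += 10 : $n += 2;       # parsed as (($n % 2) ? ($n += 10) : $n) += 2
# $n is now 15: 10 was added by the true branch, then 2 by the outer +=

my $m = 3;
$m % 2 ? ($m += 10) : ($m += 2);   # parentheses give the intended result
# $m is now 13

print "$n $m\n";
```
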

 

Target OP= Expr;   # First, assignment operators always parse at the precedence level of ordinary assignment, regardless of the precedence that OP would have by itself. Second, Target is evaluated only once. Usually that doesn't matter unless there are side effects, such as an autoincrement:
   Example:

       $var[$a++] += $value;   # $a is incremented only once

$xyz = $x or $y or $z;   # Assignment binds more tightly than or but less tightly than ||, so this is parsed as ($xyz = $x) or $y or $z. To assign the first true value, write $xyz = $x || $y || $z;
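
A sketch of the difference, using made-up values where only the last one is true. Warnings are left off; with them enabled, Perl would point out that part of the first statement is evaluated in void context.

```perl
use strict;

my ($x, $y, $z) = (0, '', 'fallback');

my $xyz;
$xyz = $x or $y or $z;        # parsed as ($xyz = $x) or $y or $z
my $low = $xyz;               # 0: only $x was assigned

$xyz = $x || $y || $z;        # || binds more tightly than =
my $high = $xyz;              # 'fallback'

$xyz = ($x or $y or $z);      # parentheses make "or" give the same result
my $grouped = $xyz;           # 'fallback'
```
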

 

  Operators like eval {}, do {}, and sub {} all look like compound statements, but they really aren't. From the outside, those operators are just terms in an expression, and thus they need an explicit semicolon if used as the last item in a statement.
Any simple statement may optionally be followed by a single modifier, just before the terminating semicolon. The possible modifiers are:
    if expr
    unless expr
    while expr
    until expr
    foreach list   # evaluates once for each element in its list, with $_ aliased to the current element
  Example:
    shutup() unless $you_want_me_to_leave;

 

The scope of a variable declared in the controlling condition extends from its declaration through the rest of that conditional only, including any elsifs and the final else clause if present, but not beyond:
  if ( (my $color = <STDIN>) =~ /red/i )
  {
     $value = 0xff0000;
  } elsif ($color =~ /green/i)
  {
     $value = 0x00ff00;
  } else
  {
     $value = 0x000000;
  } # after the else, the $color variable is no longer in scope
  
The while or until statement can have an optional extra block: the continue block. This block is executed every time the block is continued, either by falling off the end of the first block or by an explicit next.
   while (my $line = <STDIN>)
   {
       $line = lc $line;
   } continue
   {
       print $line;   # still visible
   }
       
The foreach loop iterates over a list of values by setting the control variable (VAR) to each successive element of the list:
   foreach VAR (LIST) {
      ...
   } # the foreach keyword is just a synonym for the for keyword; you can use foreach and for interchangeably. If VAR is omitted, the global $_ is used.
VAR is an implicit alias for each item in the list. If you modify VAR, you also modify the corresponding item in the list.
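
The aliasing is easy to observe in a short sketch (the arrays here are invented for the example): changing the loop variable changes the array itself.

```perl
use strict;

my @prices = (1, 2, 3);
for my $price (@prices) {
    $price *= 2;               # modifies the array element itself
}
# @prices is now (2, 4, 6)

my @names = ('a', 'b');
for (@names) {                 # no loop variable given: $_ is the alias
    $_ = uc;
}
# @names is now ('A', 'B')

print "@prices @names\n";
```
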

 

last, next, redo:

LINE: while (<STDIN>)
{
    last LINE if /^$/;
    ...
} # exit when done with the mail header. The last operator immediately exits the loop in question. The continue block, if any, is not executed.

LINE: while (<STDIN>)
{
   next LINE if /^#/;  # skip comments
   next LINE if /^$/;  # skip blank lines
   ...
} continue {
    $count++;
} # the next operator skips the rest of the current iteration of the loop and starts the next one. If there is a continue clause on the loop, it is executed just before the condition is re-evaluated.

while (<>)
{
   chomp;
   if (s/\\$//)
   {
     $_ .= <>;
     redo unless eof; # don't read past each file's eof
   }
} # the redo operator restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. This operator is often used by programs that want to fib to themselves about what was just input. Here, the file being processed sometimes has a backslash at the end of a line to continue the record on the next line.
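
The continuation-line loop can be tried without any input files. The sketch below assumes an in-memory filehandle with made-up contents in place of <>, and collects the joined records in an array.

```perl
use strict;

# Two logical records; the first is split across lines with a trailing backslash.
my $input = "first \\\ncontinued\nsecond\n";
open(my $in, '<', \$input) or die "cannot open in-memory handle: $!";

my @records;
while (<$in>) {
    chomp;
    if (s/\\$//) {                # the line ended with a backslash
        $_ .= <$in>;              # append the next physical line
        redo unless eof($in);     # reprocess without reading a new line
    }
    push @records, $_;
}
close($in);

print "$_\n" for @records;
```
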

Control:
    if () {} elsif () {} else {}: the if statement evaluates a truth condition and executes a block if the condition is true. The braces are required by definition.

 

   The foreach statement provides a list context to the expression in parentheses. Each element of the list is aliased to the loop variable in turn; the loop variable refers to the element itself, rather than to a copy of the element. Hence, modifying the loop variable also modifies the original array.


Regular Expressions:


  (1)   Perl quantifiers are by default greedy: they will attempt to match as much as they can as long as the whole pattern still matches.
  (2)   The regular expression engine tries to match as early as possible, and this takes precedence over being greedy. Since scanning happens left to right, the pattern will match as far left as possible, even if there is some other place where it could match longer.
      Examples:

         $_ = "fred xxxxxx barney";
         s/x*//;  # has absolutely no effect! The x* (meaning zero or more "x" characters) is able to match the nothing at the beginning of the string, since the null string happens to be zero characters wide and there's a null string just sitting there plain as day before the "f" of "fred".
         /bam{2}/ matches bamm; /(bam){2}/ matches bambam.
  (3)   Use minimal matching by placing a question mark after any quantifier.
         Example:

           /.*?:/ # stops at the first :
  (4)   An anchor is something that matches a "nothing", but a special kind of nothing that depends on its surroundings. It tries to match something of zero width.
         \b  matches at a word boundary
         ^   matches at the beginning of the string
         $   matches at the end of the string
  (5)   A backreference is a backslash followed by an integer. The integer corresponding to a given pair of parentheses is determined by counting left parentheses from the beginning of the pattern, starting with one. Outside the regular expression itself, such as in the replacement part of a substitution, you use a $ followed by an integer.
         Examples:

            /<(.*?)>.*?<\/\1>/        # matches <b>aaa</b>
            s/(\S+)\s+(\S+)/$2 $1/    # swaps the first two words of a string
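
The points above can be exercised together in one sketch. The test strings other than "fred xxxxxx barney" and the tag example are made up for illustration.

```perl
use strict;

my $text = "fred xxxxxx barney";

(my $star = $text) =~ s/x*//;     # matches the empty string before "f": unchanged
(my $plus = $text) =~ s/x+//;     # needs at least one x, so the run of x's goes

my ($greedy) = "a:b:c" =~ /(.*):/;    # "a:b": as much as possible
my ($lazy)   = "a:b:c" =~ /(.*?):/;   # "a": stops at the first colon

# \1 must repeat whatever the first group captured.
my $tag = "<b>aaa</b>" =~ /<(.*?)>.*?<\/\1>/ ? $1 : '';

# $1 and $2 refer to the groups in the replacement part.
(my $swapped = "hello world again") =~ s/(\S+)\s+(\S+)/$2 $1/;
```
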
