Accustomed awk users should take special note of the following:
Semicolons are required after all simple statements in perl (except at the end of a block).
Newline is not a statement delimiter.
Curly brackets are required on ifs and whiles.
Variables begin with $ or @ in perl.
Arrays index from 0 unless you set $[.
Likewise string positions in substr() and index().
You have to decide whether your array has numeric or string indices.
Associative array values do not spring into existence upon mere reference.
You have to decide whether you want to use string or numeric comparisons.
Reading an input line does not split it for you. You get to split it yourself
into an array.
And the split operator has different arguments.
The current input line is normally in $_, not $0.
It generally does not have the newline stripped.
($0 is the name of the program executed.)
$<digit> does not refer to fields--it refers to substrings matched by the last
match pattern.
The print statement does not add field and record separators unless you set
$, and $\.
You must open your files before you print to them.
The range operator is "..", not comma.
(The comma operator works as in C.)
The match operator is "=~", not "~".
("~" is the one's complement operator, as in C.)
The exponentiation operator is "**", not "^".
("^" is the XOR operator, as in C.)
The concatenation operator is ".", not the null string.
(Using the null string would render "/pat/ /pat/" unparsable,
since the third slash would be interpreted as a division operator--the
tokener is in fact slightly context sensitive for operators like /, ?, and <.
And in fact, . itself can be the beginning of a number.)
The next, exit and continue statements work differently.
The following variables work differently:

    Awk         Perl
    ARGC        $#ARGV
    ARGV[0]     $0
    FILENAME    $ARGV
    FNR         $. - something
    FS          (whatever you like)
    NF          $#Fld, or some such
    NR          $.
    OFMT        $#
    OFS         $,
    ORS         $\
    RLENGTH     length($&)
    RS          $/
    RSTART      length($`)
    SUBSEP      $;

When in doubt, run the awk construct through a2p and see what it gives you.
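
Several of the awk traps above can be seen together in one short script. What follows is only a sketch, assuming a Perl 5 interpreter (it uses "my", "use strict" and "use warnings", which the text above does not mention); the input records and the names @lines, @fld, @names and $total are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input records, standing in for lines read from a file.
my @lines = ("alice 30 admin\n", "bob 4 user\n");

my @names;
my $total = 0;
foreach my $line (@lines) {
    # awk strips the newline and splits the record for you; Perl does neither.
    chomp($line);
    my @fld = split(' ', $line);   # pattern first, then the string to split

    push(@names, $fld[0]);         # arrays index from 0: awk's $1 is $fld[0]
    $total += $fld[1];
}

# awk's NF corresponds to the element count, which is $#fld + 1.
my @fld = split(' ', 'x y z');
my $nf = $#fld + 1;

# $1 holds a substring captured by the last match, not a field.
my $first_word = '';
if ('carol 7 guest' =~ /^(\w+)/) {   # "=~" is the match operator
    $first_word = $1;
}

# Numeric and string comparisons use separate operators.
my $numeric_less = (4 < 30)      ? 1 : 0;   # 4 is numerically less than 30
my $string_less  = ('4' lt '30') ? 1 : 0;   # but "4" sorts after "30"

my $power  = 2 ** 3;          # exponentiation is "**"; "^" would be XOR
my $joined = 'foo' . 'bar';   # concatenation needs an explicit "."

# print adds separators only when $, and $\ are set.
{
    local $, = ':';
    local $\ = "\n";
    print @names;             # the names joined by ":" plus a newline
}
```

The special case split(' ', ...) is used here because it splits on runs of whitespace and ignores leading whitespace, which is the closest match to awk's default field splitting.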
Cerebral C programmers should take note of the following:
Curly brackets are required on ifs and whiles.
You should use "elsif" rather than "else if".
Break and continue become last and next, respectively.
There's no switch statement.
Variables begin with $ or @ in perl.
Printf does not implement *.
Comments begin with #, not /*.
You can't take the address of anything.
ARGV must be capitalized.
The "system" calls link, unlink, rename, etc. return nonzero for success, not 0.
Signal handlers deal with signal names, not numbers.
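
A few of the C traps above, sketched in code. This assumes a Perl 5 interpreter and the File::Temp module that ships with it, neither of which is mentioned in the text; the scratch file and the variable names are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Collect the even numbers up to 10 using last, next and elsif.
my @evens;
foreach my $n (1 .. 20) {
    if ($n > 10) {           # curly brackets are required, even for one statement
        last;                # C's break
    } elsif ($n % 2) {       # "elsif", not "else if"
        next;                # C's continue
    }
    push(@evens, $n);
}

# Files must be opened before printing to them; tempfile() hands back an
# already opened handle together with the name of a scratch file.
my ($fh, $file) = tempfile();
print $fh "scratch data\n";
close($fh);

# unlink returns nonzero on success (the number of files removed), so the
# test reads the opposite way from the C system call, which returns 0.
my $removed = unlink($file);
if ($removed) {
    print "removed $removed file\n";
} else {
    print "could not remove $file: $!\n";
}
```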
Seasoned sed programmers should take note of the following:
Backreferences in substitutions use $ rather than \.
The pattern matching metacharacters (, ), and | do not have backslashes in front.
The range operator is .. rather than comma.
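
The three sed traps above might look like this in practice. It is a sketch only, assuming a Perl 5 interpreter; the sample strings and the names $name, $is_pet, @input and @picked are hypothetical.

```perl
use strict;
use warnings;

# Backreferences in the replacement are $1 and $2, and the grouping
# parentheses carry no backslashes.
# In sed this would be roughly: s/\(John\) \(Smith\)/\2, \1/
my $name = 'John Smith';
$name =~ s/(\w+) (\w+)/$2, $1/;

# Alternation is a bare "|" inside bare parentheses.
my $is_pet = ('dog' =~ /^(cat|dog)$/) ? 1 : 0;

# The range operator is "..", where sed would write /^begin/,/^end/.
my @input = ('a', 'begin', 'b', 'end', 'c');
my @picked;
foreach (@input) {
    push(@picked, $_) if /^begin/ .. /^end/;
}

print "$name\n";
print join(' ', @picked), "\n";
```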
Sharp shell programmers should take note of the following:
The backtick operator does variable interpretation without regard to the
presence of single quotes in the command.
The backtick operator does no translation of the return value, unlike csh.
Shells (especially csh) do several levels of substitution on each command line.
Perl does substitution only in certain constructs such as double quotes,
backticks, angle brackets and search patterns.
Shells interpret scripts a little bit at a time.
Perl compiles the whole program before executing it.
The arguments are available via @ARGV, not $1, $2, etc.
The environment is not automatically made available as variables.
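
The shell traps above can be illustrated as follows. This is a sketch under several assumptions: a Perl 5 interpreter, a Unix-like system whose shell provides echo, and made-up names throughout ($word, DEMO_VAR, and the stand-in argument list).

```perl
use strict;
use warnings;

# Backticks interpolate Perl variables even inside single quotes, so the
# shell is handed the command: echo 'hello'
my $word = 'hello';
my $out  = `echo '$word'`;

# The return value is not translated: the trailing newline is still there.
my $has_newline = ($out =~ /\n$/) ? 1 : 0;
chomp($out);

# Arguments arrive in @ARGV, not in $1, $2 and so on.
# A stand-in list is used here when the script is run without arguments.
@ARGV = ('alpha', 'beta') unless @ARGV;
my $first_arg = $ARGV[0];

# Environment variables are not turned into Perl variables automatically;
# they are looked up by name in the %ENV associative array.
$ENV{'DEMO_VAR'} = 'demo value';   # hypothetical variable, set for the example
my $demo = $ENV{'DEMO_VAR'};

print "$out $first_arg $demo\n";
```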