Introducing UNIX and Linux


Perl

String editing

Just as the Perl command tr is similar to the shell command tr, the Perl command s is similar to the substitute command of sed. Its syntax follows the same pattern as that of tr:

s/search-string/replacement-string/options

For example, to replace the first occurrence of Hello or hello in the variable $greeting with Howdy, we might have:

$greeting =~ s/[Hh]ello/Howdy/;
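The effect of the options field can be seen in a short sketch. The sample string and the variable names below are assumptions chosen only for illustration; the g option (replace every match) and the i option (ignore case) are standard Perl substitution options.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $greeting = "Hello world, hello again";

# Without options, only the first match is replaced.
(my $first_only = $greeting) =~ s/[Hh]ello/Howdy/;

# With the g option, every match is replaced.
(my $all = $greeting) =~ s/[Hh]ello/Howdy/g;

# The i option ignores case, so the character class is not needed.
(my $ignore_case = $greeting) =~ s/hello/Howdy/gi;

print "$first_only\n";   # Howdy world, hello again
print "$all\n";          # Howdy world, Howdy again
print "$ignore_case\n";  # Howdy world, Howdy again
```

Copying the string before substituting, as in (my $copy = $original) =~ s/.../.../, leaves $greeting itself unchanged, which makes the three variants easy to compare.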

Worked example

Write a Perl script which takes one argument, assumed to be the name of a file which is itself a Perl program, and copies that file to standard output with all comments removed. Assume that # is not contained in any quoted string in the argument program.
Solution: A comment commences with # and continues to the end of the line, so a search string matching a comment is #.*$ (note the use of $ to anchor the search string to the end of the line, and .* to match an arbitrary-length sequence of characters). Replacing each match with the empty string removes the comment. The script then becomes:

open(FILE, $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
while (<FILE>) {
  $_ =~ s/#.*$//;   # delete from # to the end of the line
  print "$_";
}
close(FILE);
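To see what the substitution does to individual lines without needing an input file, the same pattern can be wrapped in a small subroutine. This is only a sketch: strip_comment is a hypothetical helper name and the sample lines are invented for illustration.

```perl
use strict;
use warnings;

# strip_comment is a hypothetical helper name, not part of Perl.
sub strip_comment {
    my ($line) = @_;
    $line =~ s/#.*$//;   # delete from the first # to the end of the line
    return $line;
}

# Invented sample lines: a trailing comment, a whole-line comment, no comment.
my @samples = (
    "print 1;  # say one\n",
    "# whole-line comment\n",
    "print 2;\n",
);

print strip_comment($_) for @samples;
```

The newline at the end of each line survives, because . does not match a newline and $ matches just before it. A line consisting only of a comment therefore becomes an empty line rather than disappearing, and any spaces that preceded the # are left in place.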

Worked example

Write a Perl script to read text from standard input, and send to standard output the text with all leading/trailing whitespace removed, and all sequences of tabs/spaces replaced with a single space character.
Solution: This is another use of the s command, with a simple enclosing loop. The substitution is performed in three stages: first remove the leading whitespace, then the trailing whitespace, and finally squash each remaining sequence of whitespace to a single space. Note that a tab is whitespace, and is entered as \t.

while (<STDIN>) {
  $_ =~ s/^[ \t]*//;       # remove leading spaces
  $_ =~ s/[ \t]*$//;       # remove trailing spaces
  $_ =~ s/[ \t][ \t]*/ /g; # squash internal whitespace
  print "$_" ;
}
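The three stages can also be tried on a single string. The following is a minimal sketch, assuming a hypothetical helper named squash_whitespace and an invented input line; it writes [ \t]+ (one or more) in place of the equivalent [ \t][ \t]* used in the script.

```perl
use strict;
use warnings;

# squash_whitespace is a hypothetical helper name, not part of Perl.
sub squash_whitespace {
    my ($line) = @_;
    $line =~ s/^[ \t]+//;     # remove leading spaces and tabs
    $line =~ s/[ \t]+$//;     # remove trailing spaces and tabs
    $line =~ s/[ \t]+/ /g;    # squash each internal run to one space
    return $line;
}

# Invented input containing leading, trailing and internal whitespace.
print squash_whitespace("  \tone \t two   three \t\n");   # one two three
```

The g option matters only in the last stage: a line has at most one leading and one trailing run of whitespace, but it may contain many internal runs.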

Copyright © 2002 Mike Joy, Stephen Jarvis and Michael Luck