Ranges by line number

You can specify a range on line numbers by inserting a comma between the numbers. To restrict a substitution to the first 100 lines, you can use:

sed '1,100 s/A/a/'

If you know exactly how many lines are in a file, you can explicitly state that number to perform the substitution on the rest of the file. In this case, assume you used wc to find out there are 532 lines in the file:

sed '101,532 s/A/a/'

An easier way is to use the special character "$," which means the last line in the file.

sed '101,$ s/A/a/'
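Here is a small sketch of the "$" address on a five-line sample, so you can see the effect without a 532-line file. The sample input is made up for illustration; only the address form comes from the text above.

```shell
# Five-line sample input; the substitution starts at line 3 and runs to the last line
out=$(printf 'A1\nA2\nA3\nA4\nA5\n' | sed '3,$ s/A/a/')
printf '%s\n' "$out"
```

Lines 1 and 2 should come through unchanged, while lines 3 through 5 have their "A" changed to "a."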

The "$" is one of those conventions that mean "last" in utilities like cat -e, vi, and ed. Line numbers are cumulative if several files are edited. That is,

sed '200,300 s/A/a/' f1 f2 f3 >new

is the same as

cat f1 f2 f3 | sed '200,300 s/A/a/' >new
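If you want to convince yourself of this, the following sketch may help. The two scratch files and the temporary directory are hypothetical, created only for the test, and it assumes your system has mktemp.

```shell
# Hypothetical scratch files f1 and f2, three lines each, in a temporary directory
dir=$(mktemp -d)
printf 'A1\nA2\nA3\n' > "$dir/f1"
printf 'A4\nA5\nA6\n' > "$dir/f2"

# Lines 3-4 span the file boundary: the last line of f1 and the first line of f2
multi=$(sed '3,4 s/A/a/' "$dir/f1" "$dir/f2")
piped=$(cat "$dir/f1" "$dir/f2" | sed '3,4 s/A/a/')

printf '%s\n' "$multi"
[ "$multi" = "$piped" ] && echo "same result"
rm -r "$dir"
```

Both forms should change only the third and fourth lines of the combined input, because sed keeps counting when it moves from one file to the next.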

Ranges by patterns

You can specify two regular expressions as the range. Assuming a "#" starts a comment, you can search for one keyword and remove all comments until you see a second keyword. In this case the two keywords are "start" and "stop:"

sed '/start/,/stop/ s/#.*//'

The first pattern turns on a flag that tells sed to perform the substitute command on every line. The second pattern turns off the flag. If the "start" and "stop" pattern occurs twice, the substitution is done both times. If the "stop" pattern is missing, the flag is never turned off, and the substitution will be performed on every line until the end of the file.

You should know that if the "start" pattern is found, the substitution occurs on the same line that contains "start." Finding the pattern turns on a switch, which is line oriented. That is, the next line is read, the substitute command is applied, and the line is checked for "stop." If it contains "stop" the switch is turned off after that line is processed, so the line containing "stop" is modified as well. Switches are line oriented, and not word oriented.
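A short experiment makes the switch behavior easier to see. The sample input below is invented for illustration; the sed command is the one shown above.

```shell
# Sample input; "start" and "stop" are the keywords, "#" begins a comment
input='a#1
start#2
b#3
stop#4
c#5'

# Both keyword lines are inside the range, so their comments are removed too
both=$(printf '%s\n' "$input" | sed '/start/,/stop/ s/#.*//')
printf '%s\n' "$both"

# Same input with the "stop" line dropped: the switch stays on until end of file
nostop=$(printf '%s\n' "$input" | grep -v stop | sed '/start/,/stop/ s/#.*//')
printf '%s\n' "$nostop"
```

In the first run only the first and last lines keep their comments. In the second run every line from "start" onward loses its comment, because nothing ever turns the switch off.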

You can combine line numbers and regular expressions. This example will remove comments from the beginning of the file until it finds the keyword "start:"

sed -e '1,/start/ s/#.*//'

This example will remove comments everywhere except the lines between the two keywords:

sed -e '1,/start/ s/#.*//' -e '/stop/,$ s/#.*//'

The last example has a range that overlaps the "/start/,/stop/" range, as both ranges operate on the lines that contain the keywords. I will show you later how to restrict a command up to, but not including, the line containing the specified pattern. That is covered in "Operating in a pattern range except for the patterns." But first I have to cover some more basic principles.
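To see the overlap for yourself, you might try the two-expression command on a small made-up input like this one:

```shell
# Sample input, invented for this test
input='x#1
start#2
y#3
stop#4
z#5'

# Comments are removed from line 1 through "start", and from "stop" through the last line
out=$(printf '%s\n' "$input" | sed -e '1,/start/ s/#.*//' -e '/stop/,$ s/#.*//')
printf '%s\n' "$out"
```

Only the line strictly between the keywords keeps its comment. The keyword lines themselves are stripped, which is the overlap described above.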

Before I start discussing the various commands, I should explain that some commands cannot operate on a range of lines. I will let you know when I mention the commands. In this next section I will describe three commands, one of which cannot operate on a range.

Delete with d

Using ranges can be confusing, so you should expect to do some experimentation when you are trying out a new script. A useful command deletes every line that matches the restriction: "d." If you want to look at the first 10 lines of a file, you can use:



sed '11,$ d' <file

which is similar in function to the head command. If you want to chop off the header of a mail message, which is everything up to the first blank line, use:

sed '1,/^$/ d' <file
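As a quick check, here is the header-stripping command run on a tiny made-up message. The header fields and body text are placeholders, not a real mail format test.

```shell
# A made-up message: two header lines, a blank line, then the body
msg='From: alice
Subject: hi

body line 1

body line 2'

# Delete from line 1 through the first blank line, inclusive
body=$(printf '%s\n' "$msg" | sed '1,/^$/ d')
printf '%s\n' "$body"
```

Notice that the blank line inside the body survives. Once the range has ended at the first blank line, it is not started again, because the first address is line 1.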

You can duplicate the function of the tail command, assuming you know the length of a file. Wc can count the lines, and expr can subtract 10 from the number of lines. A Bourne shell script to look at the last 10 lines of a file might look like this:

#!/bin/sh
# Print the last 10 lines of a file
# First argument is the filename
lines=`wc -l < "$1"`
start=`expr $lines - 10`
# Only delete when the file has more than 10 lines
if [ "$start" -gt 0 ]; then
    sed "1,$start d" "$1"
else
    cat "$1"
fi


The range for deletions can be a pair of regular expressions that mark the beginning and end of the operation. Or it can be a single regular expression. Deleting all lines that start with a "#" is easy:

sed '/^#/ d'

Removing comments and blank lines takes two commands. The first removes every character from the "#" to the end of the line, and the second deletes all blank lines:

sed -e 's/#.*//' -e '/^$/ d'

A third one should be added to remove all blanks and tabs immediately before the end of line:

sed -e 's/#.*//' -e 's/[ ^I]*$//' -e '/^$/ d'

The character "^I" is a CTRL-I or tab character. You would have to explicitly type in the tab. Note the order of operations above, which is in that order for a very good reason. Comments might start in the middle of a line, with white space characters before them. Therefore comments are first removed from a line, potentially leaving white space characters that were before the comment. The second command removes all trailing blanks, so that lines that are now blank are converted to empty lines. The last command deletes empty lines. Together, the three commands remove all lines containing only comments, tabs or spaces.
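If typing a literal tab is awkward, one possible workaround is to put the tab in a shell variable and let the shell insert it. The variable name "tab" and the sample input are my own choices for this sketch.

```shell
# Put a literal tab in a variable so it does not have to be typed inside the script
tab=$(printf '\t')

# Sample input: code with a trailing comment, a comment-only line,
# a line of only a space and a tab, and a plain line
out=$(printf 'keep  # comment\n# only comment\n \t\nplain\n' |
    sed -e 's/#.*//' -e "s/[ $tab]*\$//" -e '/^$/ d')
printf '%s\n' "$out"
```

Only "keep" and "plain" should remain. If you swap the last two expressions, the line that held only white space is no longer deleted, which is why the order matters.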

This demonstrates the pattern space sed uses to operate on a line. The actual operation sed uses is:

 

· Copy the input line into the pattern space.

· Apply the first sed command on the pattern space, if the address restriction is true.

· Repeat with the next sed expression, again operating on the pattern space.

· When the last operation is performed, write out the pattern space and read in the next line from the input file.
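The steps above mean that each expression sees whatever the previous one left in the pattern space, not the original input line. A minimal sketch, using made-up words:

```shell
# The second expression operates on the result of the first
chained=$(printf 'cat\n' | sed -e 's/cat/dog/' -e 's/dog/bird/')
printf '%s\n' "$chained"

# Reversed order: "dog" is not in the pattern space yet when the first expression runs
reversed=$(printf 'cat\n' | sed -e 's/dog/bird/' -e 's/cat/dog/')
printf '%s\n' "$reversed"
```

The first command should print "bird" and the second "dog," even though both scripts contain the same two substitutions.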

Printing with p

Another useful command is the print command: "p." If sed wasn't started with an "-n" option, the "p" command will duplicate the input. The command

sed 'p'

will duplicate every line. If you wanted to double every empty line, use:

sed '/^$/ p'

Adding the "-n" option turns off printing unless you request it. Another way of duplicating head's functionality is to print only the lines you want. This example prints the first 10 lines:

sed -n '1,10 p' <file

Sed can act like grep by combining the print operator to function on all lines that match a regular expression:

sed -n '/match/ p'

which is the same as:

grep match
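You can compare the two commands directly. The input lines here are invented for the test:

```shell
# Sample input with two lines that contain "match"
input='no hit
a match here
another line
match again'

s=$(printf '%s\n' "$input" | sed -n '/match/ p')
g=$(printf '%s\n' "$input" | grep match)

printf '%s\n' "$s"
[ "$s" = "$g" ] && echo "sed and grep agree"
```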

Reversing the restriction with !

Sometimes you need to perform an action on every line except those that match a regular expression, or those outside of a range of addresses. The "!" character, which often means "not" in UNIX utilities, inverts the address restriction. You remember that

sed -n '/match/ p'

acts like the grep command. The "-v" option to grep prints all lines that don't contain the pattern. Sed can do this with

sed -n '/match/ !p' </tmp/b
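The same kind of comparison works for the inverted form. Again, the sample input is made up:

```shell
# Sample input with two lines that contain "match"
input='no hit
a match here
another line
match again'

s=$(printf '%s\n' "$input" | sed -n '/match/ !p')
g=$(printf '%s\n' "$input" | grep -v match)

printf '%s\n' "$s"
[ "$s" = "$g" ] && echo "sed and grep -v agree"
```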

Relationships between d, p, and !

As you may have noticed, there are often several ways to solve the same problem with sed. This is because print and delete are opposite functions, and it appears that "!p" is similar to "d," while "!d" is similar to "p." I wanted to test this, so I created a 20 line file, and tried every different combination. The following table, which shows the results, demonstrates the difference:

Relations between d, p, and !

Sed      Range   Command   Results
-------  ------  --------  ---------------------------------------------------
sed -n   1,10    p         Print first 10 lines
sed -n   11,$    !p        Print first 10 lines
sed      1,10    !d        Print first 10 lines
sed      11,$    d         Print first 10 lines

sed -n   1,10    !p        Print last 10 lines
sed -n   11,$    p         Print last 10 lines
sed      1,10    d         Print last 10 lines
sed      11,$    !d        Print last 10 lines

sed -n   1,10    d         Nothing printed
sed -n   1,10    !d        Nothing printed
sed -n   11,$    d         Nothing printed
sed -n   11,$    !d        Nothing printed

sed      1,10    p         Print first 10 lines twice, then next 10 lines once
sed      11,$    !p        Print first 10 lines twice, then last 10 lines once

sed      1,10    !p        Print first 10 lines once, then last 10 lines twice
sed      11,$    p         Print first 10 lines once, then last 10 lines twice

This table shows that the following commands are identical:

sed -n '1,10 p'

sed -n '11,$ !p'

sed '1,10 !d'

sed '11,$ d'

It also shows that the "!" command "inverts" the address range, operating on the other lines.
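If you want to repeat the experiment, a sketch along these lines should do. It assumes the seq utility is available to generate a 20-line input; any 20-line file would work just as well.

```shell
# 20-line input: the numbers 1 through 20, one per line (seq is assumed to exist)
want=$(seq 10)

a=$(seq 20 | sed -n '1,10 p')
b=$(seq 20 | sed -n '11,$ !p')
c=$(seq 20 | sed '1,10 !d')
d=$(seq 20 | sed '11,$ d')

# Each variant should match the first 10 lines exactly
for got in "$a" "$b" "$c" "$d"; do
    if [ "$got" = "$want" ]; then echo same; else echo different; fi
done
```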


Date: 2016-01-14; view: 799

