Working with Multiple Lines

There are three new commands used in multiple-line patterns: "N," "D," and "P." I will explain their relation to the matching "n," "d," and "p" single-line commands.

The "n" command will print out the current pattern space (unless the "-n" flag is used), empty the current pattern space, and read in the next line of input. The "N" command does not print out the current pattern space and does not empty the pattern space. It reads in the next line, but appends a new line character along with the input line itself to the pattern space.
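To make the difference between "n" and "N" concrete, here is a small sketch. The four-line input (a, b, c, d) and the variable names are made up for illustration; they do not come from any real file.

```shell
#!/bin/sh
# Hypothetical sample input: four lines, "a" through "d".
input=$(printf 'a\nb\nc\nd')

# "n" with -n: the current line is dropped silently, the next line
# replaces it, and "p" prints that replacement.
with_n=$(printf '%s\n' "$input" | sed -n 'n
p')

# "N": the next line is appended after an embedded new line, so the
# substitute command can see both lines and join them with a "+".
with_N=$(printf '%s\n' "$input" | sed 'N
s/\n/+/')

printf 'using n:\n%s\n' "$with_n"
printf 'using N:\n%s\n' "$with_N"
```

The "n" version prints only "b" and "d"; the "N" version prints "a+b" and "c+d".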

The "d" command deletes the current pattern space, reads in the next line, puts the new line into the pattern space, aborts the current command, and starts execution at the first sed command. This is called starting a new "cycle." The "D" command deletes the first portion of the pattern space, up to the new line character, leaving the rest of the pattern space alone. Like "d," it stops the current command and starts the command cycle over again. However, it will not print the current pattern space. You must print it yourself, a step earlier. If the "D" command is executed with a group of other commands in a curly brace, commands after the "D" command are ignored. The script is then executed again from the top on whatever remains in the pattern space, without reading a new line, unless the pattern space is emptied. If this happens, the cycle is started from the top and a new line is read.
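The difference between "d" and "D" shows up once the pattern space holds more than one line. The sketch below uses made-up input (the numbers 1 to 5) and the address "$!", meaning "every line but the last," so that "N" is never asked to read past the end of the input. The variable names are only for illustration.

```shell
#!/bin/sh
# Hypothetical sample input: the numbers 1 through 5, one per line.
input=$(printf '1\n2\n3\n4\n5')

# "$!N" appends the next line (except on the last line). "$!D" then
# drops the first line of the pair and restarts the script WITHOUT
# reading new input, so the two-line window slides down one line.
with_D=$(printf '%s\n' "$input" | sed '$!N
$!D')

# With "d" the whole pair is thrown away and a fresh line is read,
# so the input is consumed two lines at a time instead.
with_d=$(printf '%s\n' "$input" | sed '$!N
$!d')

printf 'using D:\n%s\n' "$with_D"
printf 'using d:\n%s\n' "$with_d"
```

The "D" version ends up printing the last two lines, "4" and "5". The "d" version prints only "5", because the pairs 1-2 and 3-4 were deleted whole.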

The "p" command prints the entire pattern space. The "P" command only prints the first part of the pattern space, up to the NEWLINE character. Neither the "p" nor the "P" command changes the pattern space.
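A quick way to see "p" and "P" side by side is to load two lines into the pattern space with "N" and print with each command in turn. The two-line input below is made up for the purpose.

```shell
#!/bin/sh
# Hypothetical sample input: two lines, "AB" and "CD".
input=$(printf 'AB\nCD')

# -n turns off the default printing, so only "p" or "P" produces output.
# "p" prints the whole pattern space, embedded new line included.
with_p=$(printf '%s\n' "$input" | sed -n 'N
p')

# "P" prints only the part before the first embedded new line.
with_P=$(printf '%s\n' "$input" | sed -n 'N
P')

printf 'using p:\n%s\n' "$with_p"
printf 'using P:\n%s\n' "$with_P"
```

Here "p" prints both "AB" and "CD", while "P" prints only "AB".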

Some examples might demonstrate that "N" by itself isn't very useful. The filter

sed -e 'N'

doesn't modify the input stream. Instead, it combines the first and second line, then prints them, combines the third and fourth line, and prints them, etc. It does allow you to use a new "anchor" character: "\n." This matches the new line character that separates multiple lines in the pattern space. If you wanted to search for a line that ended with the character "#," and append the next line to it, you could use
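A caveat, as far as I know: when "N" runs on the last line and there is nothing left to read, GNU sed prints the pattern space before exiting, while POSIX specifies that sed quits without printing it. Guarding the command as "$!N" (every line but the last) sidesteps the difference. A minimal sketch, using made-up three-line input so that one line is left over:

```shell
#!/bin/sh
# Hypothetical sample input with an odd number of lines.
input=$(printf 'a\nb\nc')

# "$!N" appends the next line unless we are already on the last line.
# The substitute then replaces the embedded new line with a space.
joined=$(printf '%s\n' "$input" | sed '$!N
s/\n/ /')

printf '%s\n' "$joined"
```

This prints "a b" followed by "c": the first two lines are joined, and the leftover third line passes through untouched.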

#!/bin/sh
sed '
# look for a "#" at the end of the line
/#$/ {
# Found one - now read in the next line
	N
# delete the "#" and the new line character,
	s/#\n//
}' file
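Here is the same idea run against made-up input instead of a file, so you can see what comes out. The three input lines and the variable name are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical input: the first line ends with "#", the others do not.
joined=$(printf 'first#\nsecond\nthird\n' | sed '
/#$/ {
	N
	s/#\n//
}')

printf '%s\n' "$joined"
```

The output is "firstsecond" followed by "third": the "#" and the new line after it are gone, so the first two lines become one.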

You could search for two lines containing "ONE" and "TWO" and only print out the two consecutive lines:

#!/bin/sh
sed -n '
/ONE/ {
# found "ONE" - read in next line
	N
# look for "TWO" on the second line
# and print if there.
	/\n.*TWO/ p
}' file
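A small test run, on made-up input, shows which lines survive. One "ONE" line is followed by a "TWO" line, and a second "ONE" line is not.

```shell
#!/bin/sh
# Hypothetical input: only the first "ONE" is followed by a "TWO" line.
pairs=$(printf 'x\nONE a\nTWO b\nONE c\ny\n' | sed -n '
/ONE/ {
	N
	/\n.*TWO/ p
}')

printf '%s\n' "$pairs"
```

Only "ONE a" and "TWO b" are printed. The pair "ONE c" / "y" is read in but never printed, because nothing after the embedded new line matches "TWO".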

The next example would delete everything between "ONE" and "TWO:"

#!/bin/sh
sed '
/ONE/ {
# append a line
	N
# search for TWO on the second line
	/\n.*TWO/ {
# found it - now edit making one line
		s/ONE.*\n.*TWO/ONE TWO/
	}
}' file

You can either search for a particular pattern on two consecutive lines, or you can search for two consecutive words that may be split on a line boundary. The next example will look for two words which are either on the same line or one is on the end of a line and the second is on the beginning of the next line. If found, the first word is deleted:

#!/bin/sh
sed '
/ONE/ {
# append a line
	N
# "ONE TWO" on same line
	s/ONE TWO/TWO/
# "ONE
# TWO" on two consecutive lines
	s/ONE\nTWO/TWO/
}' file
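To check both cases at once, here is a sketch run on made-up input: one line with "ONE TWO" together, and one pair of lines where the words are split across the line boundary.

```shell
#!/bin/sh
# Hypothetical input: "ONE TWO" on one line, then "ONE" at the end of
# a line with "TWO" at the start of the next.
result=$(printf 'a ONE TWO b\nc\nd ONE\nTWO e\n' | sed '
/ONE/ {
	N
	s/ONE TWO/TWO/
	s/ONE\nTWO/TWO/
}')

printf '%s\n' "$result"
```

The output is "a TWO b", "c", and "d TWO e". Note that in the split case the new line is deleted along with "ONE", so the two input lines come out as a single line.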

Let's use the "D" command, and if we find a line containing "TWO" immediately after a line containing "ONE," then delete the first line:

#!/bin/sh
sed '
/ONE/ {
# append a line
	N
# if TWO found, delete the first line
	/\n.*TWO/ D
}' file

If we wanted to print the first line instead of deleting it, and not print every other line, change the "D" to a "P" and add a "-n" as an argument to sed:

#!/bin/sh
sed -n '
# by default - do not print anything
/ONE/ {
# append a line
	N
# if TWO found, print the first line
	/\n.*TWO/ P
}' file

It is very common to combine all three multi-line commands. The typical order is "N," "P" and lastly "D." This one will delete everything between "ONE" and "TWO" if they are on one or two consecutive lines:

#!/bin/sh
sed '
/ONE/ {
# append the next line
	N
# look for "ONE" followed by "TWO"
	/ONE.*TWO/ {
# delete everything between
		s/ONE.*TWO/ONE TWO/
# print
		P
# then delete the first line
		D
	}
}' file
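Here is the "N," "P," "D" combination run on made-up input, with "ONE" and "TWO" on consecutive lines and some junk in between. The input lines and the variable name are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical input: "ONE" and "TWO" sit on consecutive lines.
result=$(printf 'before\nONE junk\nmore TWO\nafter\n' | sed '
/ONE/ {
	N
	/ONE.*TWO/ {
		s/ONE.*TWO/ONE TWO/
		P
		D
	}
}')

printf '%s\n' "$result"
```

The output is "before", "ONE TWO", and "after". The substitute removes the embedded new line along with the junk, so "P" prints the whole pattern space, and "D", finding no new line left, behaves like "d" and starts a normal new cycle with the next input line.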

Earlier I talked about the "=" command, and using it to add line numbers to a file. You can use two invocations of sed to do this (it is possible to do it with one, but that must wait until the next section). The first sed command will output a line number on one line, and then print the line on the next line. The second invocation of sed will merge the two lines together:

#!/bin/sh
sed '=' file | \
sed '{
	N
	s/\n/ /
}'
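The two-step numbering can be tried without a file by piping in made-up input. The words "alpha" and "beta" and the variable name are placeholders.

```shell
#!/bin/sh
# Hypothetical two-line input. The first sed prints each line number
# on its own line; the second sed joins number and text with a space.
numbered=$(printf 'alpha\nbeta\n' | sed '=' | sed 'N
s/\n/ /')

printf '%s\n' "$numbered"
```

This prints "1 alpha" and "2 beta". Because "=" always emits a number line for every text line, the second sed always sees an even number of lines, so "N" never runs out of input.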

If you find it necessary, you can break one line into two lines, edit them, and merge them together again. As an example, if you had a file that had a hexadecimal number followed by a word, and you wanted to convert the first word to all upper case, you can use the "y" command, but you must first split the line into two lines, change one of the two, and merge them together. That is, a line containing

0x1fff table2

will be changed into two lines:

0x1fff

table2

and the first line will be converted into upper case. I will use tr to convert the space into a new line:

#!/bin/sh
# tr reads only standard input, so the file is redirected in
tr ' ' '\012' <file |
sed ' {
	y/abcdef/ABCDEF/
	N
	s/\n/ /
}'
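Here is the split, edit, and merge sequence run on made-up input. The first line is the example from above; the second line, "0xbeef cafe", is my own addition to show that the second word is left alone even when it contains the letters a through f.

```shell
#!/bin/sh
# Hypothetical input: a hexadecimal number followed by a word.
# tr splits each line at the space; sed upper-cases the hex digits on
# the first half only, then "N" pulls in the second half and the
# substitute glues the two halves back together.
converted=$(printf '0x1fff table2\n0xbeef cafe\n' | tr ' ' '\012' | sed '
y/abcdef/ABCDEF/
N
s/\n/ /')

printf '%s\n' "$converted"
```

The result is "0x1FFF table2" and "0xBEEF cafe". The "x" stays lower case because the "y" command here maps only the letters a through f.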

It isn't obvious, but sed could be used instead of tr. You can embed a new line in a substitute command, but you must escape it with a backslash. It is unfortunate that you must use "\n" in the left side of a substitute command, and an embedded new line in the right hand side. Heavy sigh. Here is the example:

#!/bin/sh
sed '
s/ /\
/' file | \
sed ' {
	y/abcdef/ABCDEF/
	N
	s/\n/ /
}'

Sometimes I add a special character as a marker, and look for that character in the input stream. When found, it indicates the place a blank used to be. A backslash is a good character, except it must be escaped with a backslash, and makes the sed script obscure. Save it for that guy who keeps asking dumb questions. The sed script to change a blank into a "\" followed by a new line would be:

#!/bin/sh
sed 's/ /\\\
/' file

Yeah. That's the ticket. Or use the C shell and really confuse him!

#!/bin/csh -f
sed '\
s/ /\\\\
/' file

A few more examples of that, and he'll never ask you a question again! I think I'm getting carried away. I'll summarize with a chart that covers the features we've talked about:

Pattern Space	Next Input	Command	Output	New Pattern Space	New Next Input
AB	CD	n	<default>	CD	EF
AB	CD	N	-	AB\nCD	EF
AB	CD	d	-	CD	EF
AB	CD	D	-	CD	EF
AB	CD	p	AB	AB	CD
AB	CD	P	AB	AB	CD

AB\nCD	EF	n	<default>	EF	GH
AB\nCD	EF	N	-	AB\nCD\nEF	GH
AB\nCD	EF	d	-	EF	GH
AB\nCD	EF	D	-	CD	EF
AB\nCD	EF	p	AB\nCD	AB\nCD	EF
AB\nCD	EF	P	AB	AB\nCD	EF

Date: 2016-01-14; view: 1084
