The sed command

The sed command is another tool that uses Regular Expressions, see the page on the grep command for more information on these. sed is used for performing edits to a text file. Probably the most common use of sed is to perform find-and-replace type operations, but it can also be used for many other things, in particular printing only specific lines of files.

For a more detailed introduction to sed, you might be interested in reading the world’s best introduction to sed.

Find-replace using sed

Let’s create a simple text file

echo -e "lineA\nlineB\nlineC" > lines.txt

Let’s say we want to replace “line” on each line with “row”. We would use the following sed expression:

sed -e "s/line/row/" lines.txt > rows.txt
cat rows.txt

The -e option tells sed that we want to use a regular expression for editing, and the regular expression in this case takes the form s/find/replace/. The s tells sed that we’re doing a substitution, and the / characters delimit our search and replacement. As you can imagine, we could get much more complicated with the regular expressions used to specify what we want to find and how we want to replace it, but we won’t go into this here.

Note that if you want to edit a file directly, rather than edit a file and send the edited version to standard output, you would use the -i (inplace) argument. Be careful with this: you should always test out your regular expression before running it, in case it does things you don’t intend! If you have a large file, you can use head to see just the first 10 lines, then pipe this to sed:

head file.txt | sed -e "s/find/replace/"

This will allow you to test out your regular expression on a subset of the file. When you’re happy, you can run it on the whole file in place:

sed -i -e "s/find/replace/" file.txt

A final note: by default, sed replaces the first instance of your search string per line. If you want to replace every instance, use s///g (global).

Selecting lines from a file with sed

To print a specific line using sed, you use the p instruction. You pair this with the -n option, which does not return any lines unless explicitely told to. So, to print the 10th line of a file, you’d do:

sed -n '10p' file.txt

To print lines 56 to 61:

sed -n '56,61p' file.txt

To print every second line, starting with the second line, you’d do:

sed -n '2~2p' file.txt

This last one is a handy way of extracting just the sequences from an (unwrapped) FASTA file.