The Linux sed command

2017-07-14|Categories: external-cmd|

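Regex matching has no "lazy mode"

sed supports POSIX-compatible regular expressions, not Perl-style ones, so lazy quantifiers such as .*? are not available. See the note "Regular Expression Study Notes" (《正则表达式学习笔记》) for details.

The example below comes from http://coolshell.cn/articles/9104.html

Example: strip the tags from a piece of HTML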

<b>This</b> is what <span style="text-decoration: underline;">I</span> meant. Understand?

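If you do it like this, you run into a problem, because .* is greedy and matches from the first < all the way to the last >: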

$ sed 's/<.*>//g' html.txt
 meant. Understand?

To fix this, write it as below, where [^>]* matches zero or more characters other than >, so each match stops at the nearest >:

$ sed 's/<[^>]*>//g' html.txt
This is what I meant. Understand?
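
The same workaround applies whenever the match should stop at the first occurrence of a single terminating character. A minimal sketch, assuming a hypothetical colon-separated input string that is not part of the original example:

```shell
# Hypothetical input: three colon-separated fields.
line='a:b:c'

# Greedy: .* runs to the LAST colon, so everything up to it is removed.
greedy=$(printf '%s\n' "$line" | sed 's/^.*://')

# Negated class: [^:]* cannot cross a colon, so the match stops at the FIRST one.
first_only=$(printf '%s\n' "$line" | sed 's/^[^:]*://')

printf 'greedy:     %s\n' "$greedy"
printf 'first only: %s\n' "$first_only"
```

The negated character class only emulates lazy matching when the terminator is a single character; it does not generalize to multi-character terminators.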

Custom delimiters for the s subcommand

http://backreference.org/2010/02/20/using-different-delimiters-in-sed/

What if, in sed, you have lots of slashes in the pattern and/or replacement?

One solution is to escape them all (the so-called sawtooth effect):

sed 's/\/a\/b\/c\//\/d\/e\/f\//'    # change "/a/b/c/" to "/d/e/f/"

but that is ugly and unreadable. It's a little-known fact that sed can use almost any character as the separator for the "s" command (any character other than a backslash or a newline). Basically, sed takes whatever character follows the "s" as the separator. So our code above can be rewritten, for example, in any of the following ways:

sed 's_/a/b/c/_/d/e/f/_'
sed 's;/a/b/c/;/d/e/f/;'
sed 's#/a/b/c/#/d/e/f/#'
sed 's|/a/b/c/|/d/e/f/|'
sed 's /a/b/c/ /d/e/f/ '       # yes, even space
# etc.
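
To check that the escaped and the custom-delimiter forms behave identically, here is a small sketch. The sample path /a/b/c/file.txt is a made-up input for illustration only:

```shell
# Hypothetical sample path containing the pattern to replace.
path='/a/b/c/file.txt'

# Escaped slashes with the default delimiter.
escaped=$(printf '%s\n' "$path" | sed 's/\/a\/b\/c\//\/d\/e\/f\//')

# The same substitution with | as the delimiter; no escaping needed.
piped=$(printf '%s\n' "$path" | sed 's|/a/b/c/|/d/e/f/|')

printf '%s\n%s\n' "$escaped" "$piped"
```

Whichever character is chosen as the delimiter has to be backslash-escaped if it also appears literally in the pattern or replacement, so it is worth picking one that does not occur in the data.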

An even lesser-known fact is that you can use a different delimiter for patterns used in addresses as well, by prefixing the opening delimiter with a backslash:

# do this (ugly)...
sed '/\/a\/b\/c\//{do something;}'
# ...or these (better)
sed '\#/a/b/c/#{do something;}'
sed '\_/a/b/c/_{do something;}'
sed '\%/a/b/c/%{do something;}'
# etc.
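
The "do something" placeholder above can be any sed command. A runnable sketch, assuming three made-up input lines and using d (delete) and p (print) as the commands:

```shell
# Hypothetical input: three lines, two of which contain /a/b/c/.
input=$(printf '%s\n' '/a/b/c/one' '/x/y/z/two' '/a/b/c/three')

# Delete every line matching /a/b/c/, using # as the address delimiter.
deleted=$(printf '%s\n' "$input" | sed '\#/a/b/c/#d')

# Print only the matching lines, using % as the address delimiter.
printed=$(printf '%s\n' "$input" | sed -n '\%/a/b/c/%p')

printf 'after delete:\n%s\n' "$deleted"
printf 'matching lines:\n%s\n' "$printed"
```

Only the opening delimiter takes the backslash; the closing one is written bare.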
