The Joy of BASH Commands (3): grep

Usage: grep [OPTION]... PATTERN [FILE]...

Search for PATTERN in each FILE or in standard input.

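If you need to find a word in a file of several thousand lines, or to find the lines that match some condition (PATTERN) across thousands of files in different directories, grep is the first tool to reach for. It is the classic UNIX utility for searching text.

grep patterns are regular expressions, and shell wildcards can be used to choose which files are searched. The rest of this article shows how to use it.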

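1. Options

Regexp selection and interpretation:
  -E, --extended-regexp     PATTERN is an extended regular expression (ERE)
  -F, --fixed-strings       PATTERN is a set of newline-separated fixed strings
  -G, --basic-regexp        PATTERN is a basic regular expression (BRE)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -f, --file=FILE           obtain PATTERN from FILE
  -i, --ignore-case         ignore case distinctions
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines
  -z, --null-data           a data line ends in a 0 byte, not a newline

Other options:

  -v, --invert-match        print the lines that do not match PATTERN
  -n, --line-number         print the line number of each matching line
  -o, --only-matching       print only the part of a line that matches PATTERN
  -c, --count               print the number of matching lines (not the number of matches)
  -l, --files-with-matches  print only the names of FILEs that contain matches (useful when searching several files)
  -b, --byte-offset         print the byte offset with each output line
  -r, --recursive           recurse, like --directories=recurse
  -R, --dereference-recursive  likewise, but follow all symlinks
      --include=FILE_PATTERN  search only files that match FILE_PATTERN
      --exclude=FILE_PATTERN  skip files and directories matching FILE_PATTERN
      --exclude-from=FILE   skip files matching any file pattern listed in FILE
      --exclude-dir=PATTERN  skip directories that match PATTERN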
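
The matching options -w, -x and -F are easiest to tell apart side by side. The following is a minimal sketch, assuming a small made-up word list (the file contents and variable names are illustrative, not from the article):

```shell
# Hypothetical sample data, written to a temporary file for the demonstration.
sample=$(mktemp)
printf '%s\n' 'cat' 'concatenate' 'a.c' 'abc' > "$sample"

# -w: "cat" must match as a whole word, so "concatenate" is skipped.
word_hits=$(grep -w 'cat' "$sample")

# -x: the whole line must match. As a regex, "a.c" matches both "a.c" and "abc".
line_hits=$(grep -x 'a.c' "$sample")

# -F: the pattern is a literal string, so the dot only matches a dot.
fixed_hits=$(grep -F 'a.c' "$sample")

printf 'word:\n%s\n' "$word_hits"
printf 'line:\n%s\n' "$line_hits"
printf 'fixed:\n%s\n' "$fixed_hits"
rm -f "$sample"
```

-F is worth remembering whenever the search text contains characters such as . or * that should not be treated as regex operators.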


As in the previous article (on cursor movement in vim), the same sample passage serves as the test text:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ cat greptest.txt
1、I am eagerly awaiting my next disappointment. —Ashleigh Brilliant
2、Every man’s memory is his private literature. —Aldous Huxley
3、Life is what happens to you while you’re busy making other plans. —John Lennon
4、Life is really simple, but we insist on making it complicated. —Confucius
5、Do not dwell in the past, do not dream of the future, concentrate the mind on the
6、present moment. —Buddha
7、The more decisions that you are forced to make alone, the more you are aware of
8、your freedom to choose. —Thornton Wilder

2. Basic operations

1) Search for a word in a file:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep past greptest.txt
5、Do not dwell in the past, do not dream of the future, concentrate the mind on the
grep can also read from stdin:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ echo -e "this is a grep test\nthe next line" | grep line
the next line

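A single grep command can search several files:

$ grep "match_text" file1 file2 file3 ...

The pattern is a regular expression. Bracket expressions such as [sde] already work in the default basic syntax; operators such as + need the extended syntax, which is enabled with the -E option listed above:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -E "Aldou[sde]" greptest.txt
2、Every man’s memory is his private literature. —Aldous Huxley
Or:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ echo "This is a grep test." | grep -E "[a-z]+\."
This is a grep test.

grep -E can also be written as egrep, although recent GNU grep releases mark egrep as obsolescent and recommend grep -E.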
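
To see what -E actually changes, the same pattern can be run in both syntaxes. This is a small sketch using the sentence from the example above; the variable names are only for illustration:

```shell
text='This is a grep test.'

# ERE (-E): "+" means "one or more of the preceding item".
ere=$(printf '%s\n' "$text" | grep -o -E '[a-z]+\.')

# BRE (the default): an unescaped "+" is an ordinary character,
# so the pattern looks for a literal "+" and finds nothing.
bre=$(printf '%s\n' "$text" | grep -o '[a-z]+\.' || true)

# In POSIX BRE, "one or more" is written as the interval \{1,\}.
bre_interval=$(printf '%s\n' "$text" | grep -o '[a-z]\{1,\}\.')

echo "ERE:          $ere"
echo "BRE:          $bre"
echo "BRE interval: $bre_interval"
```

The first and third commands print test. while the second prints nothing, which is why patterns using + are normally paired with -E.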

To print only the part of the text that matches, use -o, as in the previous example:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ echo "This is a grep test." | grep -o -E "[a-z]+\."
test.

To print every line except the matching ones, use -v:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -E -v "Aldou[sde]" greptest.txt
1、I am eagerly awaiting my next disappointment. —Ashleigh Brilliant
3、Life is what happens to you while you’re busy making other plans. —John Lennon
4、Life is really simple, but we insist on making it complicated. —Confucius
5、Do not dwell in the past, do not dream of the future, concentrate the mind on the
6、present moment. —Buddha
7、The more decisions that you are forced to make alone, the more you are aware of
8、your freedom to choose. —Thornton Wilder
Line 2, the only line that matches, is not printed.


To count the lines of a file or text that contain a match, use -c:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -c is greptest.txt
5
To print the line number of each match, use -n:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -n -o is greptest.txt
1:is
2:is
2:is
3:is
4:is
4:is
7:is
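
Because -c counts matching lines rather than individual matches, counting every occurrence takes a different approach. One common sketch, assuming a three-line made-up input, pipes the output of -o into wc -l:

```shell
# Hypothetical input: "is" occurs twice on line 1, never on line 2, once on line 3.
text=$(printf '%s\n' 'his is' 'no' 'this')

# -c reports how many lines contain at least one match.
lines=$(printf '%s\n' "$text" | grep -c 'is')

# -o prints each match on its own line, so wc -l counts occurrences.
# tr strips the padding that some wc implementations add.
occurrences=$(printf '%s\n' "$text" | grep -o 'is' | wc -l | tr -d ' ')

echo "matching lines: $lines"
echo "occurrences:    $occurrences"
```

Here the two numbers differ (2 lines, 3 occurrences), just as greptest.txt gives 5 matching lines but 7 matches.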

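To print the byte offset at which each match is found, use -b:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -b -o is greptest.txt
36:is
101:is
105:is
157:is
247:is
275:is
459:is

-b is usually combined with -o: with -o the offset is that of the match itself, while without it the offset is that of the start of the matching line.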
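
The difference between the two offsets is easy to check on a tiny input. This is a sketch assuming GNU grep and a made-up two-line input:

```shell
# "abc\n" occupies bytes 0-3, so the second line starts at byte 4
# and the "is" inside it starts at byte 6.

# Without -o: the offset of the start of the matching line.
line_offset=$(printf 'abc\nxxisxx\n' | grep -b 'is')

# With -o: the offset of the matched text itself.
match_offset=$(printf 'abc\nxxisxx\n' | grep -b -o 'is')

echo "line:  $line_offset"
echo "match: $match_offset"
```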

To search several files and report which of them contain the text, use -l:

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -l is greptest.txt out.txt
greptest.txt
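
The opposite question, which files do not contain the text, is answered by -L (--files-without-match). A minimal sketch with two throwaway files whose names and contents are made up for the demonstration:

```shell
dir=$(mktemp -d)
printf 'this is here\n' > "$dir/a.txt"
printf 'nothing\n' > "$dir/b.txt"

# -l lists the files that contain a match.
with=$(cd "$dir" && grep -l 'is' a.txt b.txt)

# -L lists the files that contain no match at all.
without=$(cd "$dir" && grep -L 'is' a.txt b.txt)

echo "with match:    $with"
echo "without match: $without"
rm -rf "$dir"
```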

3. More options

1) Search files recursively with -R or -r

liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10$ grep "epoll" . -R -n
./fs/eventpoll.c:47: * There are three level of locking required by epoll :
./fs/eventpoll.c:62: * during epoll_ctl(EPOLL_CTL_DEL) and during eventpoll_release_file().
./fs/eventpoll.c:65: * This mutex is acquired by ep_free() during the epoll file
./fs/eventpoll.c:67: * if a file has been pushed inside an epoll set and it is then
./fs/eventpoll.c:68: * close()d without a previous call to epoll_ctl(EPOLL_CTL_DEL).
./fs/eventpoll.c:69: * It is also acquired when inserting an epoll fd onto another epoll
./fs/eventpoll.c:70: * fd. We do this so that we walk the epoll tree and ensure that this
./fs/eventpoll.c:71: * insertion does not create a cycle of epoll file descriptors, which
... ...

This is one of the commands developers use most: it shows which source files contain a given piece of text.
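
For readers without a kernel tree at hand, the same kind of search can be tried on a scratch directory. The tree below is entirely made up for the sketch; combining -r with -n gives file, line number and text, and combining it with -l gives just the file names:

```shell
# Hypothetical miniature source tree.
tree=$(mktemp -d)
mkdir -p "$tree/fs/sub"
printf 'int x;\nepoll_wait();\n' > "$tree/fs/a.c"
printf 'epoll_ctl();\n' > "$tree/fs/sub/b.c"
printf 'nothing here\n' > "$tree/fs/c.c"

# -r descends into subdirectories; -n adds the line number of each match.
hits=$(cd "$tree" && grep -rn 'epoll' . | sort)

# -l reduces the output to the names of the files that match.
files=$(cd "$tree" && grep -rl 'epoll' . | sort)

printf 'hits:\n%s\n' "$hits"
printf 'files:\n%s\n' "$files"
rm -rf "$tree"
```

The output is piped through sort only because the order in which grep walks a directory is not guaranteed.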

2) Ignore case in the pattern with -i

liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10$ echo This is a grep line. | grep -i "LINE"
This is a grep line.

3) Match several patterns with -e

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -e "Life" -e "more" greptest.txt
3、Life is what happens to you while you’re busy making other plans. —John Lennon
4、Life is really simple, but we insist on making it complicated. —Confucius
7、The more decisions that you are forced to make alone, the more you are aware of
Adding -o (placed at the end here, although GNU grep accepts options in any position):

liujl@liujl-ThinkPad-Edge-E431:~/mybash$ grep -e "Life" -e "more" greptest.txt -o
Life
Life
more
more
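
Repeating -e is one of several ways to search for more than one pattern. The sketch below, on a made-up three-line input, shows two alternatives that give the same result: ERE alternation with | and a pattern file read with -f (one pattern per line):

```shell
text=$(printf '%s\n' 'Life is simple' 'no match here' 'the more you know')

# Several -e options: a line matches if any one of the patterns matches.
multi_e=$(printf '%s\n' "$text" | grep -e 'Life' -e 'more')

# The same search written as a single extended regular expression.
alternation=$(printf '%s\n' "$text" | grep -E 'Life|more')

# The same search with the patterns stored in a (temporary) file.
patterns=$(mktemp)
printf '%s\n' 'Life' 'more' > "$patterns"
from_file=$(printf '%s\n' "$text" | grep -f "$patterns")
rm -f "$patterns"

printf '%s\n' "$multi_e"
```

A pattern file tends to be the most convenient form once the list of patterns grows beyond a handful.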

4) Including or excluding files in a search

--include=FILE_PATTERN  search only files that match FILE_PATTERN
--exclude=FILE_PATTERN  skip files and directories matching FILE_PATTERN
The goal is to search every file ending in .h or .c in the current directory and its subdirectories for lines containing the string "poll".
Following the form given in the example I was working from:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ grep "poll" . -r --include *.{h,c}
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$
There is no output at all. Searching for a single extension:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ grep "poll" . -r --include *.h
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$

Again nothing. Why? The most likely cause is that the pattern is unquoted, so the shell expands the braces and the * before grep runs, and grep never receives the pattern as written.
Quoting the pattern works:

liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ grep "poll" . -r --include "*.h"
./mount.h:#include <linux/poll.h>
./mount.h:      wait_queue_head_t poll;
./cachefiles/internal.h:        wait_queue_head_t               daemon_pollwq;  /* poll waitqueue for daemon */
./cachefiles/internal.h:#define CACHEFILES_STATE_CHANGED        3       /* T if state changed (poll trigger) */
./cachefiles/internal.h:        wake_up_all(&cache->daemon_pollwq);
./gfs2/glock.h:extern int gfs2_glock_poll(struct gfs2_holder *gh);
./fuse/fuse_i.h:#include <linux/poll.h>
... ...



Or:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ grep "poll" . -r --include "*.h" --include "*.c" | more
./lockd/clntlock.c:      * to lose callbacks, however, so we're going to poll from
./mount.h:#include <linux/poll.h>
./mount.h:      wait_queue_head_t poll;
./pipe.c:#include <linux/poll.h>
./pipe.c:                       wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
./pipe.c:                       wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM);
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM);
./pipe.c:pipe_poll(struct file *filp, poll_table *wait)
./pipe.c:       poll_wait(filp, &pipe->wait, wait);
./pipe.c:                * behave exactly like pipes for poll().
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM | POLLERR | POLLHUP);
./pipe.c:       .poll           = pipe_poll,
./ncpfs/sock.c:#include <linux/poll.h>
... ...

Both forms return the expected results.
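
Why quoting matters can be seen by asking the shell what it would hand to a command. The sketch below uses printf as a stand-in for grep so that each argument is visible; the scratch directory and its file names are made up for the demonstration:

```shell
# Hypothetical scratch directory holding two header files and one text file.
demo=$(mktemp -d)
printf 'poll\n' > "$demo/mount.h"
: > "$demo/pipe.h"
printf 'poll\n' > "$demo/notes.txt"

# Unquoted: the shell replaces *.h with the matching file names before the
# command runs, so the command never sees the pattern itself.
unquoted=$(cd "$demo" && printf '[%s]' --include *.h)

# Quoted: the pattern is passed through unchanged.
quoted=$(cd "$demo" && printf '[%s]' --include '*.h')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# With the quoted pattern, grep restricts the recursive search to headers.
found=$(cd "$demo" && grep -rl --include='*.h' 'poll' .)
echo "found:    $found"
rm -rf "$demo"
```

In the unquoted case the option receives only the first file name as its pattern and the remaining names become extra file operands, which is not what was intended.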
Now try the --include=FILE_PATTERN form:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ grep "poll" . -r --include=*.{h,c} | more
./lockd/clntlock.c:      * to lose callbacks, however, so we're going to poll from
./mount.h:#include <linux/poll.h>
./mount.h:      wait_queue_head_t poll;
./pipe.c:#include <linux/poll.h>
./pipe.c:                       wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
./pipe.c:                       wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM);
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM);
./pipe.c:pipe_poll(struct file *filp, poll_table *wait)
./pipe.c:       poll_wait(filp, &pipe->wait, wait);
./pipe.c:                * behave exactly like pipes for poll().
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM | POLLERR | POLLHUP);
./pipe.c:       .poll           = pipe_poll,
./ncpfs/sock.c:#include <linux/poll.h>
... ...



That form also returns results. Alternatively, combine find and xargs:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ find . -name "*.h" -o -name "*.c" | xargs grep "poll" | more
./lockd/clntlock.c:      * to lose callbacks, however, so we're going to poll from
./mount.h:#include <linux/poll.h>
./mount.h:      wait_queue_head_t poll;
./pipe.c:#include <linux/poll.h>
./pipe.c:                       wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
./pipe.c:                       wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM);
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM);
./pipe.c:pipe_poll(struct file *filp, poll_table *wait)
./pipe.c:       poll_wait(filp, &pipe->wait, wait);
./pipe.c:                * behave exactly like pipes for poll().
./pipe.c:               wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM | POLLERR | POLLHUP);
./pipe.c:       .poll           = pipe_poll,
./ncpfs/sock.c:#include <linux/poll.h>
... ...
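
The find and xargs pipeline splits file names on whitespace, so it can misbehave when a path contains a space. A more defensive sketch, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do), groups the -name tests in parentheses and passes NUL-terminated names. The directory and file names below are made up:

```shell
# Hypothetical tree including a file name that contains a space.
src=$(mktemp -d)
printf '#include <linux/poll.h>\n' > "$src/my file.h"
printf 'poll_wait();\n' > "$src/pipe.c"
printf 'poll in a text file\n' > "$src/notes.txt"

# The parentheses make -print0 apply to both -name tests; without them it
# would bind only to the last one. xargs -0 reads the NUL-separated names.
matches=$(cd "$src" && find . \( -name '*.h' -o -name '*.c' \) -print0 |
          xargs -0 grep -l 'poll' | sort)

printf '%s\n' "$matches"
rm -rf "$src"
```

notes.txt contains the text but is not listed, because find never hands it to grep.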



That is enough on --include. --exclude works the same way, except that it removes files from the search:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ grep "poll" . -r --exclude "README" | more
./lockd/clntlock.c:      * to lose callbacks, however, so we're going to poll from
./mount.h:#include <linux/poll.h>
./mount.h:      wait_queue_head_t poll;
./pipe.c:#include <linux/poll.h>
... ...

This skips every file named README.
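
Whole directories can be skipped in the same way with --exclude-dir, listed in the options above. A minimal sketch, assuming GNU grep and a made-up project layout with a build directory that should be ignored:

```shell
# Hypothetical project tree.
proj=$(mktemp -d)
mkdir -p "$proj/src" "$proj/build"
printf 'poll\n' > "$proj/src/main.c"
printf 'poll\n' > "$proj/build/generated.c"
printf 'poll\n' > "$proj/README"

# Skip files named README and everything under any directory named build.
kept=$(cd "$proj" && grep -rl --exclude='README' --exclude-dir='build' 'poll' .)

echo "$kept"
rm -rf "$proj"
```

All three files contain the text, but only ./src/main.c is reported.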
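5) Print the lines before or after a match
Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ seq 10
1
2
3
4
5
6
7
8
9
10

With -A:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ seq 10 | grep 5 -A 3
5
6
7
8

This prints the matching line (5) and the 3 lines after it.
With -B:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ seq 10 | grep 5 -B 3
2
3
4
5

This prints the 3 lines before the match, followed by the matching line.
With -C:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ seq 10 | grep 5 -C 3
2
3
4
5
6
7
8

This prints the 3 lines before the match, the matching line, and the 3 lines after it.
When there are several matches, a line containing "--" separates the groups:
liujl@liujl-ThinkPad-Edge-E431:~/下载/linux-3.10/fs$ echo -e "a\nb\nc\na\nb\nc\n" | grep a -A 1
a
b
--
a
b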
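
The "--" separator only appears between groups that are not contiguous. When the context of one match runs straight into the next match, grep prints them as a single block. A short sketch with seq shows both cases (the variable names are illustrative):

```shell
# Matches on lines 2 and 6: the groups (2,3) and (6,7) are apart,
# so a "--" line is printed between them.
separate=$(seq 10 | grep -A 1 -e 2 -e 6)

# Matches on lines 4 and 6: the context of 4 (line 5) touches the
# next match, so the output is one uninterrupted block.
adjacent=$(seq 10 | grep -A 1 -e 4 -e 6)

printf 'separate:\n%s\n' "$separate"
printf 'adjacent:\n%s\n' "$adjacent"
```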

