Cut Command Examples
Source: Internet  Editor: 程序博客网  Time: 2024/06/06 10:56
About cut
cut extracts selected fields from each line of a file. It can be used to display only specific columns from a text file or from the output of another command.
Syntax
[skypeGNU@localhost ~]$ cut --help
Usage: cut OPTION... [FILE]...
Print selected parts of lines from each FILE to standard output.

Mandatory arguments to long options are mandatory for short options too.
  -b, --bytes=LIST        select only these bytes
  -c, --characters=LIST   select only these characters
  -d, --delimiter=DELIM   use DELIM instead of TAB for field delimiter
  -f, --fields=LIST       select only these fields; also print any line
                            that contains no delimiter character, unless
                            the -s option is specified
  -n                      with -b: don't split multibyte characters
      --complement        complement the set of selected bytes, characters
                            or fields
  -s, --only-delimited    do not print lines not containing delimiters
      --output-delimiter=STRING  use STRING as the output delimiter
                            the default is to use the input delimiter

Use one, and only one of -b, -c or -f. Each LIST is made up of one
range, or many ranges separated by commas. Selected input is written
in the same order that it is read, and is written exactly once.
Each range is one of:

  N     N'th byte, character or field, counted from 1
  N-    from N'th byte, character or field, to end of line
  N-M   from N'th to M'th (included) byte, character or field
  -M    from first to M'th (included) byte, character or field

With no FILE, or when FILE is -, read standard input.
-b list  The list following -b specifies byte positions (for instance, -b1-72 would pass the first 72 bytes of each line). When -b and -n are used together, list is adjusted so that no multi-byte character is split. If -b is used, the input line should contain 1023 bytes or less (this limit comes from some UNIX man pages; GNU cut documents no such limit).
-c list  The list following -c specifies character positions (for instance, -c1-72 would pass the first 72 characters of each line).
-f list  The list following -f is a list of fields assumed to be separated in the file by a delimiter character (see -d); for instance, -f1,7 copies the first and seventh fields only. Lines with no field delimiters are passed through intact (useful for table subheadings), unless -s is specified. If -f is used, the input line should contain 1023 characters or less (the same caveat applies as for -b).
list  A comma-separated or blank-separated list of integer field numbers (in increasing order), with an optional - to indicate ranges (for instance, 1,4,7; 1-3,8; -5,10 (short for 1-5,10); or 3- (short for third through last field)).
-n  Do not split characters. When -b list and -n are used together, list is adjusted so that no multi-byte character is split.
-d delim  The character following -d is the field delimiter (-f option only). The default is tab. Space or other characters with special meaning to the shell must be quoted. delim can be a multi-byte character.
-s  Suppresses lines with no delimiter characters when -f is used. Unless -s is specified, lines with no delimiters are passed through untouched.
file  A path name of an input file. If no file operands are specified, or if a file operand is -, standard input is used.
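The range forms described above can be tried on a single line of input. This is only a minimal sketch: the sample string and the variable names are made up for illustration, and the text is plain ASCII so that bytes and characters coincide.

```shell
# Sample input (hypothetical): ten distinct characters on one line.
line='abcdefghij'

third=$(printf '%s\n' "$line" | cut -c3)        # N   : the 3rd character
tail_part=$(printf '%s\n' "$line" | cut -c8-)   # N-  : 8th character to end of line
middle=$(printf '%s\n' "$line" | cut -c2-4)     # N-M : 2nd through 4th character
head_part=$(printf '%s\n' "$line" | cut -c-3)   # -M  : first 3 characters
mixed=$(printf '%s\n' "$line" | cut -c-2,5,9-)  # several ranges joined by commas

printf '%s\n' "$third" "$tail_part" "$middle" "$head_part" "$mixed"
```

The last command combines three ranges in one list; the selected pieces are printed back to back with nothing between them.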
Examples
Some examples follow.
Most of them use the following test file.
[skypeGNU@localhost ~]$ cat test.txt
cat command for file oriented operations.
cp command for copy files or directories.
ls command to list out files and directories with its attributes.
1. Select Column of Characters
To extract a single column of characters from a file, use the -c option.
The following example displays the 2nd character of each line of the file test.txt.
[skypeGNU@localhost ~]$ cut -c2 test.txt
a
p
s
As seen above, the characters a, p, s are the second character from each line of the test.txt file.
2. Select Column of Characters using Range
A range of characters can also be extracted from a file by specifying the start and end positions separated by -. The following example extracts the first 3 characters and the 5th character of each line of test.txt. (The command uses -b, which selects bytes; for this single-byte text the result is the same as with -c.)
[skypeGNU@localhost ~]$ cut -b1-3,5 test.txt
catc
cp o
ls o
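Note: when cut is run with -b, -c or -f, it first sorts all the ranges in the list into ascending order and only then extracts them. So you cannot rearrange characters [bytes/fields] by listing the ranges in a different order.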
[skypeGNU@localhost ~]$ cut -b5,1-3 test.txt
catc
cp o
ls o
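The same ordering rule applies to fields. The sketch below uses a made-up record to show that reversing the field list changes nothing. When a different output order is needed, a separate tool such as awk can do it; awk is not part of the original examples and is shown only as one possible workaround.

```shell
# Hypothetical record with three colon-separated fields.
record='alice:1001:/home/alice'

# cut ignores the order of the list: both commands print field 1, then field 3.
in_order=$(printf '%s\n' "$record" | cut -d: -f1,3)
reversed_list=$(printf '%s\n' "$record" | cut -d: -f3,1)

# awk prints fields in whatever order they are named.
swapped=$(printf '%s\n' "$record" | awk -F: '{ print $3 ":" $1 }')

printf '%s\n' "$in_order" "$reversed_list" "$swapped"
```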
3. Select Column of Characters using either Start or End Position
Either the start position or the end position alone can be passed to cut with the -c option. The following specifies only the start position, before the '-'. This example extracts from the 4th character to the end of each line of test.txt.
[skypeGNU@localhost ~]$ cut -c4- test.txt
 command for file oriented operations.
command for copy files or directories.
command to list out files and directories with its attributes.
The following specifies only the end position, after the '-'. This example extracts the first 5 characters of each line of test.txt.
[skypeGNU@localhost ~]$ cut -c-5 test.txt
cat c
cp co
ls co
4. Select a Specific Field from a File
Instead of selecting a number of characters, you can extract a whole field by combining the -f and -d options. The -f option specifies which field to extract, and the -d option specifies the field delimiter used in the input file. The following example displays only the first field of each line of /etc/passwd, using : (colon) as the field delimiter. In this case, the 1st field is the username.
[skypeGNU@localhost ~]$ head -5 /etc/passwd | cut -d':' -f1
root
bin
daemon
adm
lp
5. Select Multiple Fields from a File
You can also extract more than one field from a file or from stdout. The example below displays the username and home directory of users whose login shell is "/bin/bash".
[skypeGNU@localhost ~]$ grep '/bin/bash' /etc/passwd | cut -d':' -f1,6
root:/root
skypeGNU:/home/skypeGNU
To display a range of fields, specify the start field and end field as shown below. In this example, we select fields 1 through 4, 6 and 7.
[skypeGNU@localhost ~]$ grep '/bin/bash' /etc/passwd | cut -d':' -f1-4,6,7
root:x:0:0:/root:/bin/bash
skypeGNU:x:500:500:/home/skypeGNU:/bin/bash
6. Select Fields Only When a Line Contains the Delimiter
In our /etc/passwd example, if you pass a delimiter other than : (colon), cut just displays the whole line. In the following example, we specify | (pipe) as the delimiter, and cut prints every line in full because none of them contains a | (pipe).
[skypeGNU@localhost ~]$ head -3 /etc/passwd | cut -d'|' -f1
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
However, the -s option filters the input so that only lines containing the specified delimiter are displayed. The following example produces no output, because cut finds no line with | (pipe) as a delimiter in the first three lines of /etc/passwd.
[skypeGNU@localhost ~]$ head -3 /etc/passwd | cut -d'|' -s -f1
(no output)
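The effect of -s is easier to see when only some lines contain the delimiter. The following is a small sketch with made-up input: two colon-delimited lines and one line with no colon.

```shell
# Hypothetical input: the middle line contains no ':' delimiter.
input='root:x:0
# a comment line
bin:x:1'

# Without -s, the undelimited line is passed through whole.
without_s=$(printf '%s\n' "$input" | cut -d: -f1)

# With -s, the undelimited line is dropped.
with_s=$(printf '%s\n' "$input" | cut -d: -s -f1)

printf '%s\n' "--- without -s ---" "$without_s" "--- with -s ---" "$with_s"
```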
7. Select All Fields Except the Specified Fields
To complement the selected field list, use the --complement option. The following example displays all the fields from the /etc/passwd file except field 7.
[skypeGNU@localhost ~]$ head -3 /etc/passwd | cut -d':' -f7
/bin/bash
/sbin/nologin
/sbin/nologin
[skypeGNU@localhost ~]$ head -3 /etc/passwd | cut -d':' --complement -f7
root:x:0:0:root:/root
bin:x:1:1:bin:/bin
daemon:x:2:2:daemon:/sbin
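For a line with exactly seven fields, complementing field 7 should give the same result as listing fields 1 through 6 explicitly. The sketch below checks that on one of the passwd lines shown earlier and also drops a field from the middle. It assumes GNU cut, since --complement is a GNU long option and may be missing from other implementations.

```shell
# One passwd-style entry with seven colon-separated fields.
entry='daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# Everything except field 7 (assumes GNU cut for --complement).
all_but_last=$(printf '%s\n' "$entry" | cut -d: --complement -f7)

# The same selection written as an explicit range.
explicit=$(printf '%s\n' "$entry" | cut -d: -f1-6)

# Dropping a field from the middle: everything except field 2.
no_password=$(printf '%s\n' "$entry" | cut -d: --complement -f2)

printf '%s\n' "$all_but_last" "$explicit" "$no_password"
```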
8. Change Output Delimiter for Display
By default the output delimiter is the same as the input delimiter given with the -d option. To change the output delimiter, use the --output-delimiter option as shown below. In this example, the input delimiter is : (colon), but the output delimiter is # (hash).
[skypeGNU@localhost ~]$ head -3 /etc/passwd | cut -d':' -f1,6,7 --output-delimiter='#'
root#/root#/bin/bash
bin#/bin#/sbin/nologin
daemon#/sbin#/sbin/nologin
9. Change Output Delimiter to Newline
In this example, each field of the cut output is displayed on a separate line. We still use --output-delimiter, but the value is $'\n' (an ANSI-C quoted string in bash), which tells cut to use a newline as the output delimiter.
[skypeGNU@localhost ~]$ grep '^root:' /etc/passwd | cut -d':' -f1,6
root:/root
[skypeGNU@localhost ~]$ grep '^root:' /etc/passwd | cut -d':' -f1,6 --output-delimiter=$'\n'
root
/root
10. Combine Cut with Other Unix Command Output
The power of cut shows when you combine it with the stdout of other Unix commands. Once you have mastered the basic usage explained above, you can use cut to solve many of your text manipulation problems.
(1) Display the Unix login names of the users on the system (only the first three are shown here).
[skypeGNU@localhost ~]$ cut -d':' -f1 /etc/passwd | head -3
root
bin
daemon
(2) Displays the total memory available on the system.
[skypeGNU@localhost ~]$ free | tr -s ' ' | sed '/^Mem/!d' | cut -d' ' -f2
1021060
A note on characters (-c) versus bytes (-b):
[skypeGNU@localhost ~]$ cat test_cn.txt
复旦大学
上海交通大学
南京大学
中国人民大学
香港科技大学
[skypeGNU@localhost ~]$ cut -c2 test_cn.txt
旦
海
京
国
港
[skypeGNU@localhost ~]$ cut -b2 test_cn.txt
�
�
�
�
�
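See what happened above? With -c, cut works in units of characters and the output is correct, while -b blindly counts bytes (8 bits each), so the output is garbage. (This depends on the build: some versions of GNU cut treat -c the same as -b.) Since this topic has come up, here is a little more background.
In a computer, all data is represented in binary when it is stored and processed (because the hardware uses high and low voltage levels to represent 1 and 0). For example, the 52 letters such as a, b, c, d (counting uppercase), digits such as 0 and 1, and common symbols (such as *, #, @) must all be stored as binary numbers. Which binary number stands for which symbol is something anyone could define for themselves (this is called an encoding), but if people want to communicate without confusion, they must all use the same encoding rules. So the relevant American standards body published the ASCII encoding, which fixes the binary numbers used for these common symbols.
ASCII uses a specified 7-bit or 8-bit binary combination to represent 128 or 256 possible characters. Standard ASCII, also called basic ASCII, uses 7 bits to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English. So for English, one character per byte (8 bits) works without any problem. Chinese, however, has a huge number of characters with complex glyphs, so a single byte cannot represent them; several bytes are needed for one character. Common Chinese character sets include: GB2312-80 (the national standard character set), Big-5 (大五码), and GBK (the extended national standard character set).
ISO/IEC 10646 / Unicode character set: in its original 16-bit form (UCS-2), one character is represented with 16 bits. Encodings such as UTF-8 use a variable number of bytes per character.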
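How strings are stored in memory:
In the ASCII stage, a single-byte string uses one byte to store one character (SBCS). For example, "Bob123" in memory is:
42 6F 62 31 32 33 00
B  o  b  1  2  3  \0
In the stage where ANSI encodings were used to support multiple languages, each character is represented by one or more bytes (MBCS), so characters stored this way are also called multi-byte characters. For example, "中文123" occupies 7 bytes in memory on Chinese Windows 95: each Chinese character takes 2 bytes, and each letter or digit takes 1 byte:
D6 D0  CE C4  31  32  33  00
中     文     1   2   3   \0
After UNICODE was adopted, computers store the index of each character in the UNICODE character set instead. Computers currently generally use 2 bytes (16 bits) to store one index (DBCS), so characters stored this way are also called wide characters. For example, the string "中文123" under Windows 2000 is actually stored in memory as 5 indexes (shown here low byte first, as on x86):
2D 4E  87 65  31 00  32 00  33 00  00 00
中     文     1      2      3      \0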
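On a typical Linux system today, text files are more likely to be UTF-8 than GBK or 16-bit Unicode, which is an assumption this sketch makes. It shows why cutting a single byte out of a multi-byte character yields garbage: the character 中 is three bytes long in UTF-8, and cut -b1 returns only the first of them. The bytes are written as octal escapes so the example does not depend on the encoding of the script file; the variable names are made up.

```shell
# UTF-8 encoding of the character 中 (U+4E2D): bytes E4 B8 AD, as octal escapes.
zhong='\344\270\255'

# Number of bytes in the character (tr strips padding some wc versions add).
byte_count=$(printf "$zhong" | wc -c | tr -d ' ')

# Hex dump of the whole character.
hex_bytes=$(printf "$zhong" | od -An -tx1 | tr -d ' \n')

# cut -b1 keeps only the first byte, then the line's newline (0a) follows.
first_byte=$(printf "$zhong\n" | cut -b1 | od -An -tx1 | tr -d ' \n')

printf '%s\n' "$byte_count" "$hex_bytes" "$first_byte"
```

A lone E4 byte is not a valid UTF-8 sequence, which is why a terminal displays it as a replacement character.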
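When dealing with multi-byte characters, you can use the -n option, which tells cut not to split multi-byte characters.
For example:
[skypeGNU@localhost ~]$ cut -b2 test_cn.txt
�
�
�
�
�
[skypeGNU@localhost ~]$ cut -b2 -n test_cn.txt
(nothing is printed here)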
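What are the shortcomings of cut?
You may have guessed already: handling runs of multiple spaces.
If some fields in a file are separated by several spaces, cut becomes awkward to use, because it is only good at handling text whose fields are separated by a single character.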
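A small sketch of the problem and two common workarounds follows. The sample row imitates the Mem line of free output; only the total (1021060) comes from the example above, and the other numbers are made up. The tr -s approach is the one already used in example 10, while awk is an alternative tool that is not part of the original text.

```shell
# Hypothetical row with runs of spaces between fields.
row='Mem:   1021060   512000   509060'

# Each single space counts as a delimiter, so field 2 is the empty string
# between the first and second space.
naive=$(printf '%s\n' "$row" | cut -d' ' -f2)

# Squeeze repeated spaces into one with tr -s, then cut works as expected.
squeezed=$(printf '%s\n' "$row" | tr -s ' ' | cut -d' ' -f2)

# awk splits on runs of whitespace by default.
with_awk=$(printf '%s\n' "$row" | awk '{ print $2 }')

printf 'naive=[%s] squeezed=[%s] awk=[%s]\n' "$naive" "$squeezed" "$with_awk"
```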