Python字符串操作汇总

来源：互联网发布：慕尼黑协定知乎编辑：程序博客网时间：2024/05/16 14:44

Python字符串操作汇总

字符串操作

复制字符串

sStr1 =  'Alice'sStr2 = sStr1print sStr2     #打印结果 ：Alice

连接字符串

sStr1 = 'Alice'sStr2 = ' Bob'sStr1 += sStr2print sStr1     #打印结果 ：Alice Bob

查找字符串

sStr1 = 'Alice'sStr2 = 'c'print sStr1.index(sStr2)    #打印结果 ：3   #如果找不到字符串的话，会报值错误

比较字符串

sStr1 = 'Alice'sStr2 = 'Bob'print cmp(sStr1,sStr2)     #打印结果 ：-1

字符串长度

sStr1 = 'Alice'print len(sStr1)      #打印结果 ：5

字符串按次数打印

sStr = 'Alice'print sStr * 3      #打印结果 ：AliceAliceAlice

去除字符串两端空格或其他字符

sStr1 = '  Alice Bob  'sStr2 = '---Alice Bob---'print sStr1.strip(' ')      #打印结果 ：Alice Bobprint sStr2.strip('-')      #打印结果 ：Alice Bob

字符串大小写互转

S = "abCD"print S.lower() #小写     #打印结果 ：abcdprint S.upper() #大写     #打印结果 ：ABCDprint S.swapcase() #大小写互换      #打印结果 ：ABcdprint S.capitalize() #首字母大写     #打印结果 ：AbcdS = "abCD efGH"print S.capitalize()     #打印结果 ：Abcd efgh

追加指定长度的字符串

sStr1 = 'Alice'sStr2 = 'Cindy'n = 3sStr1 += sStr2[0:n]print sStr1#打印结果 ：AliceCin

字符串指定长度比较

sStr1 = 'Alice'sStr2 = 'Cindy'n = 3print cmp(sStr1[0:n],sStr2[0:n])    #打印结果 ：-1

复制指定长度的字符

sStr2 = 'Alice'n = 3sStr1 = sStr2[0:n]print sStr1     #打印结果 ：Ali

翻转字符串

sStr1 = 'Alice'sStr1 = sStr1[::-1]print sStr1     #打印结果 ：ecilA

将字符串前n个字符替换为指定的字符

sStr1 = 'Alice'ch = 'B'n = 3sStr1 = n * ch + sStr1[3:]    print sStr1 #打印结果 ：BBBce

查找字符串

sStr1 = 'abcdefg'sStr2 = 'cde'print sStr1.find(sStr2)    #打印结果 : 2

切割字符串

s = 'ab,cde,fgh,ijk'<pre name="code" class="python">delimiter = '--'mylist = ['Brazil', 'Russia', 'India', 'China']print delimiter.join(mylist)       #打印结果 ：Brazil--Russia--India--China

print(s.split(',')) #打印结果：['ab', 'cde', 'fgh', 'ijk']

指定步长切割字符串

a = [0,1,2,3,4,5,6,7,8,9,10]print a[::2]#打印结果：[0, 2, 4, 6, 8, 10]print a[::3]#打印结果：[0, 3, 6, 9]

连接字符串

delimiter = '--'mylist = ['Brazil', 'Russia', 'India', 'China']print delimiter.join(mylist)       #打印结果 ：Brazil--Russia--India--China

字符串中expandtabs()方法

#把字符串中的 tab 符号('\t')转为空格，tab 符号('\t')默认的空格数是 8s = '\tAlice\tBob\tCindy'print s.expandtabs()        #打印结果 ：        Alice   Bob     Cindy

字符串分割partition

#partition() 方法用来根据指定的分隔符将字符串进行分割。#如果字符串包含指定的分隔符，则返回一个3元的元组，第一个为分隔符左边的子串，第二个为分隔符本身，第三个为分隔符右边的子串。print 'dog wow wow jiao'.partition('dog')print 'dog wow wow jiao'.partition('ww')    # ('dog wow wow jiao', '', '')print 'dog wow wow jiao'.partition('wow')print 'dog wow wow wow wowjiao'.rpartition('wow')

打印结果：

('', 'dog', ' wow wow jiao')('dog wow wow jiao', '', '')('dog ', 'wow', ' wow jiao')('dog wow wow wow ', 'wow', 'jiao')

字符串包含

sStr1 = 'Alice'sStr2 = 'ic'print (sStr2 in sStr1)      #Trueprint (sStr2 not in sStr1)      #False

字符串截取

str = '0123456789'print str[0:3] #截取第一位到第三位的字符print str[:] #截取字符串的全部字符print str[6:] #截取第七个字符到结尾print str[:-3] #截取从头开始到倒数第三个字符之前print str[2] #截取第三个字符print str[-1] #截取倒数第一个字符print str[::-1] #创造一个与原字符串顺序相反的字符串print str[-3:-1] #截取倒数第三位与倒数第一位之前的字符print str[-3:] #截取倒数第三位到结尾print str[:-5:-3] #逆序截取

打印结果：

0120123456789678901234562998765432107878996

字符串在输出时的对齐

S.ljust(width,[fillchar])#输出width个字符，S左对齐，不足部分用fillchar填充，默认的为空格。S.rjust(width,[fillchar]) #右对齐S.center(width, [fillchar]) #中间对齐S.zfill(width) #把S变成width长，并在右对齐，不足部分用0补足sStr1 = '  Alice Bob  'print sStr1.rjust(18,'-')    #打印结果 ：-----  Alice Bob

字符串中的搜索和替换

S.find(substr, [start, [end]])#返回S中出现substr的第一个字母的标号，如果S中没有substr则返回-1。start和end作用就相当于在S[start:end]中搜索

S.index(substr, [start, [end]])#与find()相同，只是在S中没有substr时，会返回一个运行时错误

S.rfind(substr, [start, [end]])#返回S中最后出现的substr的第一个字母的标号，如果S中没有substr则返回-1，也就是说从右边算起的第一次出现的substr的首字母标号

S.rindex(substr, [start, [end]])#从右往左查找子串的索引

S.count(substr, [start, [end]]) #计算substr在S中出现的次数

S.replace(oldstr, newstr, [count])#把S中的oldstar替换为newstr，count为替换次数。这是替换的通用形式，还有一些函数进行特殊字符的替换

S.strip([chars])#把S中前后chars中有的字符全部去掉，可以理解为把S前后chars替换为NoneS.lstrip([chars])S.rstrip([chars])

字符串的分割和组合

S.split([sep, [maxsplit]])#以sep为分隔符，把S分成一个list。maxsplit表示分割的次数。默认的分割符为空白字符S.rsplit([sep, [maxsplit]])

S.splitlines([keepends])#把S按照行分割符分为一个list，keepends是一个bool值，如果为真每行后而会保留行分割符。S.join(seq) #把seq代表的序列──字符串序列，用S连接起来

字符串的mapping，这一功能包含两个函数

String.maketrans(from, to)#返回一个256个字符组成的翻译表，其中from中的字符被一一对应地转换成to，所以from和to必须是等长的。

S.translate(table[,deletechars])# 使用上面的函数产后的翻译表，把S进行翻译，并把deletechars中有的字符删掉。需要注意的是，如果S为unicode字符串，那么就不支持 deletechars参数，可以使用把某个字符翻译为None的方式实现相同的功能。此外还可以使用codecs模块的功能来创建更加功能强大的

字符串编码和解码

S.encode([encoding,[errors]])# 其中encoding可以有多种值，比如gb2312 gbk gb18030 bz2 zlib big5 bzse64等都支持。errors默认值为"strict"，意思是UnicodeError。可能的值还有'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' 和所有的通过codecs.register_error注册的值。

S.decode([encoding,[errors]])#解码

字符串的测试、判断函数，这一类函数在string模块中没有，这些函数返回的都是bool值

S.startswith(prefix[,start[,end]])#是否以prefix开头S.endswith(suffix[,start[,end]])#以suffix结尾S.isalnum()#是否全是字母和数字，并至少有一个字符S.isalpha() #是否全是字母，并至少有一个字符S.isdigit() #是否全是数字，并至少有一个字符S.isspace() #是否全是空白字符，并至少有一个字符S.islower() #S中的字母是否全是小写S.isupper() #S中的字母是否便是大写S.istitle() #S是否是首字母大写的

字符串类型转换函数，这几个函数只在string模块中有

string.atoi(s[,base])#base默认为10，如果为0,那么s就可以是012或0x23这种形式的字符串，如果是16那么s就只能是0x23或0X12这种形式的字符串string.atol(s[,base]) #转成longstring.atof(s[,base]) #转成float

注意：字符串对象是不可改变的，也就是说在python创建一个字符串后，你不能把这个字符中的某一部分改变。任何上面的函数改变了字符串后，都会返回一个新的字符串，原字串并没有变。其实这也是有变通的办法的，可以用S=list(S)这个函数把S变为由单个字符为成员的list，这样的话就可以使用S[3]='a'的方式改变值，然后再使用S=" ".join(S)还原成字符串。

字典转字符串

dict = {'name': 'Zara', 'age': 7, 'class': 'First'}print type(str(dict)), str(dict)#打印结果 ：<type 'str'> {'age': 7, 'name': 'Zara', 'class': 'First'}

列表转为字符串，

nums=[1, 3, 5, 7, 8, 13, 20];print str(nums)     #打印结果 ：[1, 3, 5, 7, 8, 13, 20]

元组转为字符串

tup=(1, 2, 3, 4, 5)     print tup.__str__()     #打印结果 ：(1, 2, 3, 4, 5)

字符串转为元组

print tuple(eval("(1,2,3)"))        #打印结果 ：(1, 2, 3)

字符串转为列表

print list(eval("(1,2,3)"))     #打印结果：[1, 2, 3]

字符串转为字典

print type(eval("{'name':'ljq', 'age':24}"))        #打印结果 ：<type 'dict'>

字符串原样输出（r/R）

#即使包含转义字符，可以完全打印出来而不会被自动转义sStr = r'\tljsdl\'\"\\'print sStr        #打印结果 ：\tljsdl\'\"\\

字符串三引号

#三引号可以将复杂的字符串进行复制:三引号允许一个字符串跨多行，字符串中可以包含换行符、制表符以及其他特殊字符sStr ='''<HTML><HEAD><TITLE>Friends CGI Demo</TITLE></HEAD><BODY><H3>ERROR</H3><B>%s</B><P><FORM><INPUT TYPE=button VALUE=BackONCLICK="window.history.back()"></FORM></BODY></HTML>'''print sStr      #打印结果 ：与三引号中内容一致

Python字符串和日期互转

import time,datetime# strftime() 函数根据区域设置格式化本地时间/日期，函数的功能将时间格式化print time.strftime("%Y-%m-%d %X", time.localtime())#strptime() 函数按照特定时间格式将字符串转换为时间类型t = time.strptime("2009 - 08 - 08", "%Y - %m - %d")#t为一个结构体，第1,2,3位分别为年，月，日y,m,d = t[0:3]print datetime.datetime(y,m,d)

打印结果：

2016-05-07 10:30:072012-06-08 00:00:00

Python时间格式转化符号含义：

%a 英文星期简写%A 英文星期的完全%b 英文月份的简写%B 英文月份的完全%c 显示本地日期时间%d 日期，取1-31%H 小时， 0-23%I 小时， 0-12%m 月， 01 -12%M 分钟，1-59%j 年中当天的天数%w 显示今天是星期几%W 第几周%x 当天日期%X 本地的当天时间%y 年份 00-99间%Y 年份的完整拼写

字符串格式化输出

%符号

print 'name is %s age is %d' % ('Alice',22)#打印结果：name is Alice age is 22

通过位置：

print '{0},{1}'.format('kzc',18)#打印结果 ：kzc,18print '{},{}'.format('kzc',18)#打印结果 ：kzc,18print '{1},{0},{1}'.format('kzc',18)#打印结果 ：18,kzc,18

通过关键字参数：

print '{name},{age}'.format(age=18, name='Alice')#打印结果 ：Alice,18

通过对象属性：

class Person:    def __init__(self, name, age):        self.name, self.age = name, age     def __str__(self):        return 'This guy is {self.name},is {self.age} old'.format(self=self) print str(Person('Alice',22))#打印结果：This guy is Alice,is 22 old

通过下标：

p=['kzc',18]print '{0[0]},{0[1]}'.format(p)#打印结果：kzc,18

模板字符串：（关键字参数(substitute)）

单词替换

from string import Template  s = Template('$x, gloriout $x!')  print s.substitute(x = 'slurm')     #打印结果：slurm, gloriout slurm!

单词字母替换：

from string import Template  s = Template("It's ${x}tastic!")  print s.substitute(x = 'slurm')     #打印结果：It's slurmtastic!

插入$符号（使用$$）:

from string import Template  s = Template("Make $$ selling $x!")  print s.substitute(x = 'slurm')     #打印结果 ：Make $ selling slurm!

字典变量提供值:

from string import Template  s = Template('A $thing must never $action')  d = {}  d['thing'] = 'gentleman'  d['action'] = 'show his socks'  print s.substitute(d)      #打印结果 ： A gentleman must never show his socks

用*作为字段宽度或精度：

print '%.*s' % (5, 'Guido van Rossum')      #打印结果：Guido

填充与对齐：

填充常跟对齐一起使用

^、<、>分别是居中、左对齐、右对齐，后面带宽度

:号后面带填充的字符，只能是一个字符，不指定的话默认是用空格填充

print '{:>8}'.format('189')print '{:0>8}'.format('189')print '{:a>8}'.format('189')

打印结果：

     18900000189aaaaa189

精度与类型：

print '{:.2f}'.format(321.33345)321.33

字符串进制输出：

#b、d、o、x分别是二进制、十进制、八进制、十六进制print '{:b}'.format(17)     #打印结果 ：10001print '{:d}'.format(17)     #打印结果 ：17print '{:o}'.format(17)     #打印结果 ：21print '{:x}'.format(17)     #打印结果 ：11

金额千位分隔符：

print '{:,}'.format(1234567890)     #打印结果 ：1,234,567,890

转换标志：

pi = 3.1415     #打印结果 ：0000003.14print '%010.2f' % pi        #打印结果 ：3.14      print '%-10.2f' % pi        #打印结果 ：      3.14print '%10.2f' % pi         #打印结果 ：      3.14print '%+10.2f' % pi        #打印结果 ：     +3.14

字符串操作实例

在指定的预定义字符前添加反斜杠

def addslashes(s):    d = {'"':'\\"', "'":"\\'", "\0":"\\\0", "\\":"\\\\"}    return ''.join(d.get(c, c) for c in s)s = "John 'Johny' Doe (a.k.a. \"Super Joe\")\\\0"print sprint addslashes(s)

打印结果：

John 'Johny' Doe (a.k.a. "Super Joe")\John \'Johny\' Doe (a.k.a. \"Super Joe\")\\\

只显示字母与数字

def OnlyCharNum(s, oth=''):    s2 = s.lower();    fomart = 'abcdefghijklmnopqrstuvwxyz0123456789'    for c in s2:        if not c in fomart:            s = s.replace(c, '');    return s; print(OnlyCharNum("a000 aa-b"))     #打印结果 ：a000aab

Python实现wordcount

# word frequency in a text# tested with Python24  vegaseat  25aug2005# Chinese wisdom ...str1 = """Man who run in front of car, get tired.Man who run behind car, get exhausted."""print "Original string:"print str1print# create a list of words separated at whitespaceswordList1 = str1.split(None)# strip any punctuation marks and build modified word list# start with an empty listwordList2 = []for word1 in wordList1:    # last character of each word    lastchar = word1[-1:]    # use a list of punctuation marks    if lastchar in [",", ".", "!", "?", ";"]:        word2 = word1.rstrip(lastchar)    else:        word2 = word1    # build a wordList of lower case modified words    wordList2.append(word2.lower())print "Word list created from modified string:"print wordList2print# create a wordfrequency dictionary# start with an empty dictionaryfreqD2 = {}for word2 in wordList2:    freqD2[word2] = freqD2.get(word2, 0) + 1# create a list of keys and sort the list# all words are lower case alreadykeyList = freqD2.keys()keyList.sort()print "Frequency of each word in the word list (sorted):"for key2 in keyList:    print "%-10s %d" % (key2, freqD2[key2])

打印结果：

Original string:Man who run in front of car, get tired.Man who run behind car, get exhausted.Word list created from modified string:['man', 'who', 'run', 'in', 'front', 'of', 'car', 'get', 'tired', 'man', 'who', 'run', 'behind', 'car', 'get', 'exhausted']Frequency of each word in the word list (sorted):behind     1car        2exhausted  1front      1get        2in         1man        2of         1run        2tired      1who        2

0 0