Getting Started with Python: Cracking CAPTCHAs with PIL
Source: Internet  Editor: 程序博客网  Time: 2024/05/23 15:06
Environment
1. The script is located at: /Users/frankslg/PycharmProjects/cjb/ver/ver_code1.py
2. The images are stored at: /Users/frankslg/PycharmProjects/cjb/img/*.jpeg
3. os.getcwd()
Out[3]: '/Users/frankslg/PycharmProjects/cjb'
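The layout above can be turned into a path calculation that does not depend on the current working directory. This is a minimal sketch, assuming the img folder always sits beside the folder that contains the script. The helper name `img_dir_for` is hypothetical and is not part of the original script.

```python
from pathlib import PurePosixPath


def img_dir_for(script_path):
    """Return the 'img' directory that sits beside the script's parent folder.

    Hypothetical helper: assumes a <project>/ver/<script>.py layout with the
    images in <project>/img, as described in the environment notes.
    """
    project_root = PurePosixPath(script_path).parent.parent
    return str(project_root / "img")


print(img_dir_for("/Users/frankslg/PycharmProjects/cjb/ver/ver_code1.py"))
```

Inside a real script, the argument would typically be the script's own `__file__`.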
Implementation
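# ver_code1.py
import os

from PIL import Image
import pytesseract


def convert(pic_path, pic):
    # Convert the image to grayscale (mode 'L') first, so that every pixel
    # is a single brightness value that can be compared with a threshold.
    imgrey = pic.convert('L')

    # Remove noise by thresholding. 170 is the value that removed the noise
    # best after repeated tuning on this CAPTCHA.
    # Build a 256-entry lookup table: brightness values below the threshold
    # (the dark pixels assumed to form the characters) map to 0 and all
    # other values map to 1. imgrey.point() then applies the table to
    # produce a new two-tone CAPTCHA image.
    threshold = 170
    table = []
    for i in range(256):
        if i < threshold:
            table.append(0)
        else:
            table.append(1)

    # Generate the binarized image from the table built above.
    out = imgrey.point(table, '1')

    # Save the processed image and read it back from the same path.
    a = pic_path + '/' + 'cjb' + str(threshold) + '.jpeg'
    out.save(a, 'jpeg')
    img3 = Image.open(a, 'r')

    # Recognize the pixels in the image as a string. If the image was not
    # cleaned up well, recognition may be off: extra characters, missing
    # characters, or wrong characters.
    vcode = pytesseract.image_to_string(img3)
    print(vcode)  # only used to check the result while testing
    return vcode  # returns the recognized CAPTCHA string to the caller


if __name__ == '__main__':
    # Locate the image directory relative to this file. The original used
    # os.getcwd()[:-4], which only works when the working directory is the
    # ver folder, not the project root shown in the environment notes.
    project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    pic_path = project_root + '/img'

    # Pick the image to process. Index 0 means the first file in the image
    # directory; change it to suit your needs.
    pic = pic_path + '/' + os.listdir(pic_path)[0]
    pic_open = Image.open(pic, 'r')
    convert(pic_path, pic_open)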
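Results
Original image:
Grayscale image:
Image after noise removal:
Note: whether the denoised image comes out as black text on a white background or white text on a black background depends on whether the thresholding code assigns 1 or 0 to pixel values at or above the threshold.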
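The note above can be illustrated by building the lookup table both ways. This is a minimal sketch in plain Python. The helper `build_table` and its parameter names are hypothetical, and it assumes the characters are darker than the background.

```python
def build_table(threshold=170, dark_value=0, light_value=1):
    """Build a 256-entry lookup table suitable for Image.point(table, '1').

    Brightness values below the threshold get dark_value, the rest get
    light_value. Hypothetical helper, not part of the original script.
    """
    return [dark_value if i < threshold else light_value for i in range(256)]


# Same mapping as the article: dark pixels -> 0 (black), the rest -> 1 (white),
# which gives black text on a white background.
normal = build_table()

# Swapping the two values inverts the polarity: white text on black.
inverted = build_table(dark_value=1, light_value=0)

print(normal[169], normal[170])
print(inverted[169], inverted[170])
```

Either table can be passed to `imgrey.point(table, '1')` in place of the one built in the loop.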
Output after running the code:
WDHA
References
In[18]: help(Image.open(pic,'r').convert)
Help on method convert in module PIL.Image:
convert(mode=None, matrix=None, dither=None, palette=0, colors=256) method of PIL.JpegImagePlugin.JpegImageFile instance
Returns a converted copy of this image. For the "P" mode, this
method translates pixels through the palette. If mode is
omitted, a mode is chosen so that all information in the image
and the palette can be represented without a palette.
The current version supports all possible conversions between
"L", "RGB" and "CMYK." The **matrix** argument only supports "L"
and "RGB".

When translating a color image to black and white (mode "L"),
the library uses the ITU-R 601-2 luma transform::

    L = R * 299/1000 + G * 587/1000 + B * 114/1000

The default method of converting a greyscale ("L") or "RGB"
image into a bilevel (mode "1") image uses Floyd-Steinberg
dither to approximate the original image luminosity levels. If
dither is NONE, all non-zero values are set to 255 (white). To
use other thresholds, use the :py:meth:`~PIL.Image.Image.point`
method.

:param mode: The requested mode. See: :ref:`concept-modes`.
:param matrix: An optional conversion matrix. If given, this
   should be 4- or 12-tuple containing floating point values.
:param dither: Dithering method, used when converting from
   mode "RGB" to "P" or from "RGB" or "L" to "1".
   Available methods are NONE or FLOYDSTEINBERG (default).
:param palette: Palette to use when converting from mode "RGB"
   to "P". Available palettes are WEB or ADAPTIVE.
:param colors: Number of colors to use for the ADAPTIVE palette.
   Defaults to 256.
:rtype: :py:class:`~PIL.Image.Image`
:returns: An :py:class:`~PIL.Image.Image` object.
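The ITU-R 601-2 luma formula quoted in the help text can be tried by hand on a few colors. This is a rough sketch using integer division. The function name `luma` is hypothetical, and the library's own implementation may round differently, so a result can differ from `convert('L')` by one.

```python
def luma(r, g, b):
    """Approximate L = R*299/1000 + G*587/1000 + B*114/1000 with integers.

    Hypothetical helper for illustration; truncates instead of rounding.
    """
    return (r * 299 + g * 587 + b * 114) // 1000


for name, rgb in [("black", (0, 0, 0)),
                  ("red", (255, 0, 0)),
                  ("green", (0, 255, 0)),
                  ("blue", (0, 0, 255)),
                  ("white", (255, 255, 255))]:
    print(name, luma(*rgb))
```

Green contributes the most to the result and blue the least, which is why a pure blue pixel ends up much darker in grayscale than a pure green one.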
In[10]: help(im.point)
Help on method point in module PIL.Image:
point(lut, mode=None) method of PIL.JpegImagePlugin.JpegImageFile instance
Maps this image through a lookup table or function.
:param lut: A lookup table, containing 256 (or 65336 if
   self.mode=="I" and mode == "L") values per band in the
   image. A function can be used instead, it should take a
   single argument. The function is called once for each
   possible pixel value, and the resulting table is applied to
   all bands of the image.
:param mode: Output mode (default is same as input). In the
   current version, this can only be used if the source image
   has mode "L" or "P", and the output has mode "1" or the
   source image mode is "I" and the output mode is "L".
:returns: An :py:class:`~PIL.Image.Image` object.
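As the help text says, point() also accepts a function in place of a 256-entry table. This is a small sketch, assuming Pillow is installed. It uses a synthetic 4x1 grayscale image rather than a real CAPTCHA, and it keeps the output in mode "L" with values 0 and 255 instead of converting to mode "1" as the article's code does.

```python
from PIL import Image

THRESHOLD = 170  # the same threshold value the article uses

# Build a tiny 4x1 grayscale test image instead of loading a real CAPTCHA.
img = Image.new("L", (4, 1))
for x, value in enumerate([0, 169, 170, 255]):
    img.putpixel((x, 0), value)

# The function is evaluated once for each possible pixel value (0..255)
# and the resulting table is applied to the image.
binarized = img.point(lambda p: 255 if p >= THRESHOLD else 0)

result = [binarized.getpixel((x, 0)) for x in range(4)]
print(result)
```

Pixels below the threshold become 0 and the rest become 255, which is the same split the explicit table in the article produces.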
In[16]: help(pytesseract.image_to_string)
Help on function image_to_string in module pytesseract.pytesseract:
image_to_string(image, lang=None, boxes=False, config=None)
Runs tesseract on the specified image. First, the image is written to disk,
and then the tesseract command is run on the image. Resseract's result is
read, and the temporary files are erased.

also supports boxes and config.

if boxes=True
    "batch.nochop makebox" gets added to the tesseract call
if config is set, the config gets appended to the command.
    ex: config="-psm 6"