Python: reading a text file line by line, buffered and unbuffered implementations
Source: Internet  Editor: 程序博客网  Time: 2024/05/07 13:42
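Requirement
A recent project needed to read a file with a fairly large amount of data, on the order of 100,000 lines.
When Java reads a file through a buffered stream, it creates an internal buffer array in JVM memory, so each read operates on that whole block of memory.
Put simply: without buffering, every read goes from disk to JVM memory to your code. With buffering, a chunk is read ahead into JVM memory, which avoids fetching data from disk over and over.
Operating on memory is much faster than operating on disk.
Java also has memory-mapped files, which can help with reading and writing large files.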
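Python has a counterpart to Java's memory-mapped files in the standard `mmap` module. The following is only a minimal sketch of that idea, assuming a non-empty file; the function name `count_nonblank_mmap` is made up for illustration and is not part of the measurements in this post.

```python
import mmap


def count_nonblank_mmap(path):
    """Count non-blank lines by walking a read-only memory map of the file."""
    count = 0
    with open(path, "rb") as f:
        # Length 0 maps the whole file. An empty file cannot be mapped,
        # so this sketch assumes the file has at least one byte.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # mm.readline() returns bytes and b"" at end of file.
            for line in iter(mm.readline, b""):
                if line.strip():
                    count += 1
    return count
```

The operating system pages the file in on demand, so the whole file does not have to fit in a Python object at once.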
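Approach
A large file should not be read into memory all at once, because that can exhaust memory. (But when memory allows, is reading everything in one go faster?)
A large file can be read one line at a time, because once a line has been processed it can be thrown away.
A large file can also be read one chunk at a time, which is a form of buffering: read a chunk into a buffer, process that chunk, then read the next. The expectation is that this is somewhat faster than going line by line.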
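The chunk-by-chunk idea can be sketched as a generator that reads fixed-size blocks and reassembles lines across block boundaries. This is an illustration rather than the code measured in this post; the name `iter_lines_chunked`, the 64 KB default and the UTF-8 encoding are assumptions, and the sketch targets Python 3.

```python
def iter_lines_chunked(path, chunk_size=64 * 1024):
    """Yield lines from path, reading chunk_size characters at a time."""
    # newline="" disables newline translation, so lines are yielded as stored.
    with open(path, "r", encoding="utf-8", newline="") as f:
        tail = ""  # partial line carried over from the previous chunk
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            pieces = (tail + chunk).split("\n")
            tail = pieces.pop()  # the last piece may be an incomplete line
            for piece in pieces:
                yield piece + "\n"
        if tail:
            yield tail  # final line that has no trailing newline
```

Only one chunk plus one partial line is held in memory at any time, so memory use stays bounded regardless of file size.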
Method 1: read line by line
We can open the file and read each line with a for loop, for example:
```python
from timeit import default_timer  # time.clock() was removed in Python 3.8


def method1(newName):
    s1 = default_timer()
    oldLine = '0'
    count = 0
    for line in open(newName):
        newLine = line
        if newLine != oldLine:
            # skip blank lines
            if newLine.strip():
                nu = newLine.split()[0]  # first whitespace-separated field
                oldLine = newLine
                count += 1
    print("deal %s lines" % count)
    e1 = default_timer()
    print("cost time " + str(e1 - s1))
```
Let's test it:
```python
fileName = 'E:\\pythonProject\\ruisi\\correct_re.txt'
method1(fileName)
```
Output
```
deal 218376 lines
cost time 0.288900734402
```
Method 1.1: a variation of reading line by line
```python
from timeit import default_timer  # time.clock() was removed in Python 3.8


def method11(newName):
    s1 = default_timer()
    oldLine = '0'
    count = 0
    file = open(newName)
    while True:
        line = file.readline()
        if not line:  # readline() returns an empty string at end of file
            break
        else:
            if line.strip():
                newLine = line
                if newLine != oldLine:
                    nu = newLine.split()[0]
                    oldLine = newLine
                    count += 1
    file.close()
    print("deal %s lines" % count)
    e1 = default_timer()
    print("cost time " + str(e1 - s1))
```
```
deal 218376 lines
cost time 0.371977884619
```
The time is close to Method 1, though slightly higher (about 0.37 s versus 0.29 s).
Method 2: line by line, using the fileinput module
```python
import fileinput
from timeit import default_timer  # time.clock() was removed in Python 3.8


def method2(newName):
    s1 = default_timer()
    oldLine = '0'
    count = 0
    for line in fileinput.input(newName):
        newLine = line
        if newLine.strip():
            if newLine != oldLine:
                nu = newLine.split()[0]
                oldLine = newLine
                count += 1
    print("deal %s lines" % count)
    e1 = default_timer()
    print("cost time " + str(e1 - s1))
```
```
deal 218376 lines
cost time 0.514534051673
```
This takes nearly twice as long as Method 1 (about 0.51 s versus 0.29 s).
Method 3: use a buffer and read a batch of lines at a time
```python
from timeit import default_timer  # time.clock() was removed in Python 3.8


def method3(newName):
    s1 = default_timer()
    file = open(newName)
    oldLine = '0'
    count = 0
    while True:
        lines = file.readlines(10 * 1024)  # size hint in bytes, not lines
        # print(len(lines))
        if not lines:
            break
        for line in lines:
            if line.strip():
                newLine = line
                if newLine != oldLine:
                    nu = newLine.split()[0]
                    oldLine = newLine
                    count += 1
    file.close()
    print("deal %s lines" % count)
    e1 = default_timer()
    print("cost time " + str(e1 - s1))
```
Note
The argument of `readlines(sizehint)` limits the size in bytes, not the number of lines.
There is a default internal buffer of 8 KB. If the hint is smaller than 8*1024, data is still read 8 KB at a time, and printing `len(lines)` gives roughly 290 for every batch.
The printed value only changes once the hint is larger than 8 KB. This rounding is what the Python 2 file object does; in Python 3, reading stops once the total size of the lines read so far exceeds the hint.
```
deal 218376 lines
cost time 0.296652349397
```
This is still no better than Method 1 (about 0.297 s versus 0.289 s). Adjusting the size hint, and with it the number of lines per batch (500, 1000 and so on), gives different timings.
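To see how the size hint affects batch sizes on your own interpreter, a small generator such as the one below can help. It is a sketch for Python 3; the name `read_in_batches` and the default hint are assumptions chosen for illustration.

```python
def read_in_batches(path, hint=10 * 1024):
    """Yield lists of lines; each list holds roughly `hint` bytes of text."""
    with open(path, "r") as f:
        while True:
            lines = f.readlines(hint)  # stops once the lines read reach about `hint`
            if not lines:              # an empty list means end of file
                break
            yield lines
```

Printing `len(batch)` for each batch from `read_in_batches(path, hint)` with a few different hints shows directly how many lines each call returns.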
Method 4: read everything into memory at once
```python
from timeit import default_timer  # time.clock() was removed in Python 3.8


def method4(newName):
    s1 = default_timer()
    file = open(newName)
    oldLine = '0'
    count = 0
    for line in file.readlines():  # the whole file is loaded as a list
        if line.strip():
            newLine = line
            if newLine != oldLine:
                nu = newLine.split()[0]
                oldLine = newLine
                count += 1
    file.close()
    print("deal %s lines" % count)
    e1 = default_timer()
    print("cost time " + str(e1 - s1))
```
Output
```
deal 218376 lines
cost time 0.30108883108
```
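Speed is only half of the trade-off; the other half is memory. The sketch below uses the standard `tracemalloc` module (Python 3) to compare the peak memory of iterating over the file with that of `readlines()`. The helper names are assumptions for illustration, and the numbers will depend on the file and the interpreter.

```python
import tracemalloc


def count_streaming(path):
    """Count non-blank lines while holding one line at a time."""
    count = 0
    with open(path) as f:
        for line in f:
            if line.strip():
                count += 1
    return count


def count_readlines(path):
    """Count non-blank lines after loading every line into a list."""
    with open(path) as f:
        lines = f.readlines()
    return sum(1 for line in lines if line.strip())


def peak_memory(func, path):
    """Return the peak number of bytes allocated by Python while func(path) runs."""
    tracemalloc.start()
    try:
        func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak
```

Comparing `peak_memory(count_streaming, path)` with `peak_memory(count_readlines, path)` on a large file shows how much extra memory the read-everything approach needs.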
Conclusion
The recommended form is:
```python
with open('foo.txt', 'r') as f:
    for line in f:
        # do_something(line)
        pass
```
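The timings in this post come from single runs, and several of them differ by only a few hundredths of a second. When comparing variants on your own data, it may be worth repeating each run and keeping the fastest time. The following is a sketch for Python 3; `best_time` and `count_nonblank` are hypothetical helpers, not the code that produced the numbers above.

```python
import time


def count_nonblank(path):
    """Sample workload: count non-blank lines with the recommended pattern."""
    count = 0
    with open(path, "r") as f:
        for line in f:
            if line.strip():
                count += 1
    return count


def best_time(func, path, repeat=5):
    """Run func(path) `repeat` times; return (result, fastest time in seconds)."""
    best = None
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = func(path)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return result, best
```

Taking the minimum rather than the mean reduces the influence of disk cache warm-up and other processes on the machine.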
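For large files you can also build an index that records the position at which each line starts; after that, `file.seek()` can jump straight to any line. If the file content changes, the index has to be rebuilt. There are many ways to build such an index, but all of them require one full pass over the file.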
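One possible way to build such an index is sketched below. The file is opened in binary mode so that the recorded offsets are exact byte positions; the function names `build_line_index` and `read_line_at` are assumptions made for this example.

```python
def build_line_index(path):
    """Return a list with the byte offset at which each line starts."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)  # len() of bytes is the exact size on disk
    return offsets


def read_line_at(path, offsets, n):
    """Return line number n (0-based) as bytes by seeking to its offset."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline()
```

Building the index costs one pass over the file, after which any single line can be fetched without reading the lines before it.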