How file upload over HTTP works
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 00:11
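Articles worth consulting:
http://www.cnblogs.com/kaixuan/archive/2008/01/31/1060284.html
Uploading files over the HTTP protocol (overview of rfc1867, a JSP application example, and how to construct what the client sends)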
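1. Overview
The original HTTP protocol had no facility for uploading files. rfc1867 (http://www.ietf.org/rfc/rfc1867.txt) added this capability to HTTP. Client browsers such as Microsoft IE, Mozilla, and Opera follow this specification to send user-selected files to the server. Server-side web programs, such as php, asp, and jsp, can follow the same specification to extract the files the user sent.
Microsoft IE, Mozilla, and Opera already support this protocol; a web page only needs a special form to send files.
The vast majority of HTTP servers, including tomcat, already support this protocol and can accept uploaded files.
The various web programming environments, such as php, asp, and jsp, already wrap file upload handling well.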
2. A file upload example, implemented with a servlet (the HTTP server is tomcat 4.1.24)
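(1) In an HTML page, write a form like the following:
    <form enctype="multipart/form-data" action="http://192.168.29.65/UploadFile" method="post">
    load multi files :<br>
    <input name="userfile1" type="file"><br>
    <input name="userfile2" type="file"><br>
    <input name="userfile3" type="file"><br>
    <input name="userfile4" type="file"><br>
    text field :<input type="text" name="text" value="text"><br>
    <input type="submit" value="Submit"><input type="reset">
    </form>
Note enctype="multipart/form-data", method=post, and type="file". According to rfc1867, all three attributes are required. multipart/form-data is a newly added encoding type, introduced to make the transfer of binary files more efficient. See rfc1867 for the full explanation.
(2) Writing the server-side servlet
There are now many third-party HTTP file upload libraries. The Jakarta project itself provides the fileupload package, http://jakarta.apache.org/commons/fileupload/ . It covers file upload, form field handling, and efficiency concerns reasonably well. struts uses this package too, wrapped once more in the struts style. Here we use the fileupload package directly. For usage within struts, see the struts documentation.
The main code of the servlet that handles the upload is as follows:
    import java.io.File;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.Iterator;
    import java.util.List;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.commons.fileupload.DiskFileUpload;
    import org.apache.commons.fileupload.FileItem;

    public void doPost( HttpServletRequest request, HttpServletResponse response )
            throws ServletException, IOException
    {
        PrintWriter out = response.getWriter();
        DiskFileUpload diskFileUpload = new DiskFileUpload();
        // maximum allowed upload size
        diskFileUpload.setSizeMax( 100*1024*1024 );
        // size of the in-memory buffer
        diskFileUpload.setSizeThreshold( 4096 );
        // temporary directory
        diskFileUpload.setRepositoryPath( "c:/tmp" );
        try {
            List fileItems = diskFileUpload.parseRequest( request );
            Iterator iter = fileItems.iterator();
            for( ; iter.hasNext(); ) {
                FileItem fileItem = (FileItem) iter.next();
                if( fileItem.isFormField() ) {
                    // the current item is an ordinary form field
                    out.println( "form field : " + fileItem.getFieldName() + ", " + fileItem.getString() );
                } else {
                    // the current item is an uploaded file; some browsers send the
                    // full client-side path, so keep only the last path segment
                    String fileName = fileItem.getName();
                    int slash = Math.max( fileName.lastIndexOf('/'), fileName.lastIndexOf('\\') );
                    fileName = fileName.substring( slash + 1 );
                    fileItem.write( new File("c:/uploads/"+fileName) );
                }
            }
        } catch( Exception e ) {
            // minimal handling only: report any parse or write failure to the container
            throw new ServletException( e );
        }
    }
For brevity, details such as finer-grained exception handling and file renaming are not shown.
3. Constructing what the client sends
Assume the web program that receives the file is located at http://192.168.29.65/upload_file/UploadFile.
Assume we want to send one binary file, one text-box form field, and one password-box form field. The file is named E:/s and its content is as follows (XXX stands for binary data, such as 01 02 03):
    a bb XXX ccc
The client should send the following to 192.168.29.65:
    POST /upload_file/UploadFile HTTP/1.1
    Accept: text/plain, */*
    Accept-Language: zh-cn
    Host: 192.168.29.65:80
    Content-Type: multipart/form-data; boundary=---------------------------7d33a816d302b6
    User-Agent: Mozilla/4.0 (compatible; OpenOffice.org)
    Content-Length: 424
    Connection: Keep-Alive

    -----------------------------7d33a816d302b6
    Content-Disposition: form-data; name="userfile1"; filename="E:/s"
    Content-Type: application/octet-stream

    a bb XXX ccc
    -----------------------------7d33a816d302b6
    Content-Disposition: form-data; name="text1"

    foo
    -----------------------------7d33a816d302b6
    Content-Disposition: form-data; name="password1"

    bar
    -----------------------------7d33a816d302b6--
This content must be reproduced exactly, character for character, including the final line break.
Note: in Content-Length: 424, the value 424 is the total length of the request body (the part shown in red in the original article, that is, everything after the blank line that ends the headers), including the final line break.
Note this line:
    Content-Type: multipart/form-data; boundary=---------------------------7d33a816d302b6
According to rfc1867, multipart/form-data is required.
---------------------------7d33a816d302b6 is the separator that delimits the individual files and form fields. In the body, each separator line is this value preceded by two extra hyphens, and the last separator line also ends with two hyphens. The 33a816d302b6 part is a number generated on the fly, so that the whole separator does not occur inside the content of any file or form field. The leading ---------------------------7d is an IE-specific marker; for Mozilla it is ---------------------------71
This example was sent by hand and verified against the servlet above.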
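The request body from section 3 can be assembled programmatically. The following is a minimal Java sketch, not the original author's code: the class and method names (MultipartBodySketch, addPart, buildBody) and the sample file bytes are hypothetical, and the field names and boundary are taken from the example above. It shows where the "--" prefix, the CRLF pairs, and the closing "--" go, and that Content-Length is simply the byte length of the finished body.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class MultipartBodySketch {
    private static final String CRLF = "\r\n";

    /** Appends one part: delimiter line, part headers, blank line, content, CRLF. */
    static void addPart(ByteArrayOutputStream out, String boundary,
                        String headers, byte[] content) {
        byte[] head = ("--" + boundary + CRLF + headers + CRLF + CRLF)
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] end = CRLF.getBytes(StandardCharsets.ISO_8859_1);
        out.write(head, 0, head.length);
        out.write(content, 0, content.length);
        out.write(end, 0, end.length);
    }

    /** Builds the three-part body of the example: one file and two form fields. */
    static byte[] buildBody(String boundary, byte[] fileContent) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        addPart(out, boundary,
                "Content-Disposition: form-data; name=\"userfile1\"; filename=\"E:/s\"" + CRLF
                        + "Content-Type: application/octet-stream",
                fileContent);
        addPart(out, boundary, "Content-Disposition: form-data; name=\"text1\"",
                "foo".getBytes(StandardCharsets.ISO_8859_1));
        addPart(out, boundary, "Content-Disposition: form-data; name=\"password1\"",
                "bar".getBytes(StandardCharsets.ISO_8859_1));
        // The closing delimiter carries two extra hyphens at the end.
        byte[] tail = ("--" + boundary + "--" + CRLF).getBytes(StandardCharsets.ISO_8859_1);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String boundary = "---------------------------7d33a816d302b6";
        byte[] file = {'a', 1, 2, 3, 'c'}; // hypothetical stand-in for the file E:/s
        byte[] body = buildBody(boundary, file);
        System.out.println("Content-Type: multipart/form-data; boundary=" + boundary);
        System.out.println("Content-Length: " + body.length);
    }
}
```

Building the body as bytes rather than as a String keeps binary file content intact and makes the Content-Length value unambiguous.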
The second article uses a Perl implementation:
http://www.vivtek.com/rfc1867.html
RFC1867 is the standard definition of that "Browse..." button that you use to upload files to a Web server. It introduced the INPUT field type="file", which is that button, and also specified a multipart form encoding which is capable of encapsulating files for upload along with all the other fields on an upload form.
It's not easy to find documentation on how to work with this stuff, though. Partly this is because if you're writing a Perl CGI it's really rather easy to work with, and partly it's due to the fact that Microsoft IIS ASP doesn't (exactly) support RFC1867 file upload. So on the one hand the Unixheads think it's too trivial to document, while the ASP script kiddies think that file upload is the exclusive preserve of genius and guru alike. I.e. Bill doesn't think you need to use it.
If that last sounds overly bitter, it's because I just finished up a really horrible job that involved uploading files to an IIS server. It would have been nice had somebody at Microsoft found file upload a sufficiently significant function to design competently. As it is, IIS 5.0 now provides a "Request.BinaryRead" method that gives you the whole request as raw data, and graciously allows you to design your own object to read it. Note that VBS has no (easy) ability to read this binary data.
So let's assume for the time being that you're working with some reasonable non-IIS server. How do you really deal with file upload? It turns out to be easy. First, you design your form so that it will actually do an upload. In short, do this:
<form action="/mycode.cgi" method="post" enctype="multipart/form-data">
<input type="file" name="myfile">
</form>
In case you were wondering, the standard encoding type for a form is application/x-www-form-urlencoded, and if you leave the multipart enctype out of your form, then Netscape, for one, will not upload the file, it'll just include the filename. If that's what you actually want, this is pretty useful. (However, the RFC leaves behavior in this situation undefined, so you shouldn't rely on any particular behavior. I haven't looked to see what IE does in this situation. Undoubtedly something different.)
So this much information I already knew going into my horrible project, or at least knew of it. That's why I assumed that the server end was just as simple. And as I mentioned, in Perl it isn't much more difficult than retrieving normal posted data is already. It's just that IIS doesn't support multipart/form-data posts, that's all. Oh, Microsoft has a solution of sorts, called the something-or-other manager, and IIS 5.0 is so powerful that this manager thingy is now included right in the service pack with, gee, at least a kilobyte of documentation.
Yeesh. I'm off-track again, aren't I?
OK, so when this post gets to the server, what does it look like? Well, first of all the Content-type header of the request is set to
multipart/form-data; boundary=[some stuff]
This is how you can ascertain that you're really dealing with a properly encoded upload post. The boundary value is probably of the form --------------------------------1878979834, where the digits are randomly generated. This boundary is a MIME boundary; it's guaranteed not to appear anywhere in the data except between the multiple parts of the data.
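Checking the Content-type header and pulling out the boundary can be sketched as follows. This is a minimal Java illustration rather than the article's Perl; the class and method names (BoundarySketch, boundaryOf) are hypothetical, and the simple split on ";" assumes the boundary value itself contains no semicolon.

```java
class BoundarySketch {
    /**
     * Returns the boundary parameter of a multipart/form-data Content-Type
     * header value, or null if the header is something else or has no boundary.
     */
    static String boundaryOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] pieces = contentType.split(";");
        if (!pieces[0].trim().equalsIgnoreCase("multipart/form-data")) {
            return null; // not an upload post encoded as the RFC describes
        }
        for (int i = 1; i < pieces.length; i++) {
            String p = pieces[i].trim();
            // Parameter names are matched case-insensitively.
            if (p.regionMatches(true, 0, "boundary=", 0, 9)) {
                String value = p.substring(9).trim();
                // The value may be wrapped in double quotes; strip them if present.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "multipart/form-data; boundary=--------------------------------1878979834";
        System.out.println(boundaryOf(header));
    }
}
```

A null result is the signal to fall back to ordinary form handling, since the post is then not a multipart upload.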
The data itself appears in blocks that are made up of lines separated by CR/LF pairs. It looks like this, more or less:
-------------------------------18788734234
Content-Disposition: form-data; name="nonfile_field"

value here
-------------------------------18788734234
Content-Disposition: form-data; name="myfile"; filename="ad.gif"
Content-Type: image/gif

[ooh -- file contents!]
-------------------------------18788734234--
As you can see, this post isn't from the form I listed above, because I threw in a non-upload field just to show what it looks like. Anyway, you can see where everything is. Note that you get the originating local filename of the document for free in this format, meaning that you can use this to develop a document management system. Actual implementation is left as an exercise for the reader. I'll write more later on this topic, especially if you ask me any questions. Hint, hint.
So a Perl reader for this guy is simple: you iterate on the lines of the input and break on your boundary. Do things with the parts as you find them. I have an extensive example that you can read and use, which you can see here. It works (I'm using it daily) and it's well-documented.
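The "break on your boundary" idea can be sketched in a few lines. The example below is in Java to match the servlet code earlier on this page; it is not the Perl implementation the article refers to. The names MultipartReaderSketch, Part, and parse, and the boundary "XYZ" used in main, are hypothetical. It assumes the whole body fits in memory and that lines end in CR/LF, and it decodes the bytes as ISO-8859-1 so that every byte maps to exactly one character.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class MultipartReaderSketch {
    /** One part of the post: its raw header block and its raw content. */
    static final class Part {
        final String headers;
        final String content;

        Part(String headers, String content) {
            this.headers = headers;
            this.content = content;
        }
    }

    /** Splits a multipart body on "--" + boundary and returns the parts found. */
    static List<Part> parse(byte[] body, String boundary) {
        String text = new String(body, StandardCharsets.ISO_8859_1);
        String delimiter = "--" + boundary;
        List<Part> parts = new ArrayList<>();
        int pos = text.indexOf(delimiter);
        while (pos >= 0) {
            int start = pos + delimiter.length();
            if (text.startsWith("--", start)) {
                break; // closing delimiter: no more parts
            }
            // A part ends at the CRLF that precedes the next delimiter line.
            int next = text.indexOf("\r\n" + delimiter, start);
            if (next < 0) {
                break; // truncated input
            }
            if (next >= start + 2) {
                // Skip the CRLF that ends the delimiter line itself.
                String block = text.substring(start + 2, next);
                // A blank line separates the part headers from the content.
                int blank = block.indexOf("\r\n\r\n");
                if (blank >= 0) {
                    parts.add(new Part(block.substring(0, blank), block.substring(blank + 4)));
                }
            }
            pos = next + 2;
        }
        return parts;
    }

    public static void main(String[] args) {
        String body = "--XYZ\r\n"
                + "Content-Disposition: form-data; name=\"nonfile_field\"\r\n\r\n"
                + "value here\r\n"
                + "--XYZ\r\n"
                + "Content-Disposition: form-data; name=\"myfile\"; filename=\"ad.gif\"\r\n"
                + "Content-Type: image/gif\r\n\r\n"
                + "[file contents]\r\n"
                + "--XYZ--\r\n";
        for (Part p : parse(body.getBytes(StandardCharsets.ISO_8859_1), "XYZ")) {
            System.out.println(p.headers);
            System.out.println("  content length: " + p.content.length());
        }
    }
}
```

For large uploads a streaming reader, such as the one inside the fileupload package mentioned earlier, is the more practical choice; this sketch only makes the wire format concrete.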
And thus concludes the lesson for today. Go forth and upload files.
- RFC1867 at Ohio State
An interesting RFC, actually, as it goes into some of the alternatives that the working group rejected in the interest of a clean design.
- Perl/CGI implementation of RFC1867
My implementation in Perl. Literately programmed.
For the specifications themselves, see:
http://tools.ietf.org/html/rfc1867
http://tools.ietf.org/html/rfc2854
http://tools.ietf.org/html/rfc2388