URL Encoding
Source: Internet. Published: 网络qos. Editor: 程序博客网. Time: 2024/06/05 14:49
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
RFC 1738: Uniform Resource Locators (URL) specification
The specification for URLs (RFC 1738, Dec. '94) poses a problem: it restricts the characters allowed in URLs to a small subset of the US-ASCII character set:
"...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."
HTML, on the other hand, allows the entire range of the ISO-8859-1 (ISO-Latin) character set to be used in documents, and HTML4 expands the allowable range to include all of the Unicode character set as well. Non-ISO-8859-1 characters (characters above FF hex/255 decimal in the Unicode set) simply cannot be used in URLs, because there is no safe way to specify character set information in the URL content yet [RFC 2396].
URLs should be encoded wherever an HTML document references a URL to import an object (the A, APPLET, AREA, BASE, BGSOUND, BODY, EMBED, FORM, FRAME, IFRAME, ILAYER, IMG, ISINDEX, INPUT, LAYER, LINK, OBJECT, SCRIPT, SOUND, TABLE, TD, TH, and TR elements).
What characters need to be encoded and why?
ASCII control characters
Why: These characters are not printable.
Characters: The ISO-8859-1 (ISO-Latin) character ranges 00-1F hex (0-31 decimal) and 7F hex (127 decimal).

Non-ASCII characters
Why: These are by definition not legal in URLs since they are not in the ASCII set.
Characters: The entire "top half" of the ISO-Latin set, 80-FF hex (128-255 decimal).

"Reserved characters"
Why: URLs use some characters for special purposes in defining their syntax. When these characters are not used in their special role inside a URL, they need to be encoded.
Characters:
Character                        Code point (Hex)   Code point (Dec)
Dollar ("$")                     24                 36
Ampersand ("&")                  26                 38
Plus ("+")                       2B                 43
Comma (",")                      2C                 44
Forward slash/Virgule ("/")      2F                 47
Colon (":")                      3A                 58
Semi-colon (";")                 3B                 59
Equals ("=")                     3D                 61
Question mark ("?")              3F                 63
'At' symbol ("@")                40                 64

"Unsafe characters"
Why: Some characters present the possibility of being misunderstood within URLs for various reasons. These characters should also always be encoded.
Characters:
Character                        Code point (Hex)   Code point (Dec)
Quotation mark ('"')             22                 34
'Less Than' symbol ("<")         3C                 60
'Greater Than' symbol (">")      3E                 62
  Why: These characters are often used to delimit URLs in plain text.
'Pound' character ("#")          23                 35
  Why: This is used in URLs to indicate where a fragment identifier (bookmarks/anchors in HTML) begins.
Percent character ("%")          25                 37
  Why: This is used to URL encode/escape other characters, so it should itself also be encoded.
Misc. characters:
Left Curly Brace ("{")           7B                 123
Right Curly Brace ("}")          7D                 125
Vertical Bar/Pipe ("|")          7C                 124
Backslash ("\")                  5C                 92
Caret ("^")                      5E                 94
Tilde ("~")                      7E                 126
Left Square Bracket ("[")        5B                 91
Right Square Bracket ("]")       5D                 93
Grave Accent ("`")               60                 96
  Why: Some systems can possibly modify these characters.
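The categories above can be expressed as a small lookup. The following is only a sketch: the function name classify and the set names are made up for illustration, and the space character is placed in the unsafe set on the assumption (taken from the browser notes later in this page) that spaces must always be encoded.

```python
# Character classes copied from the lists above (the RFC 1738 view of URLs).
RESERVED = set("$&+,/:;=?@")
# Space is included here as an assumption, because the text states
# elsewhere that spaces must be encoded.
UNSAFE = set(' "<>#%{}|\\^~[]`')


def classify(ch: str) -> str:
    """Return the category the text assigns to a single character."""
    code = ord(ch)
    if code <= 0x1F or code == 0x7F:
        return "control"      # 00-1F hex and 7F hex: not printable
    if code >= 0x80:
        return "non-ascii"    # outside the US-ASCII set
    if ch in RESERVED:
        return "reserved"     # encode when not used in its special role
    if ch in UNSAFE:
        return "unsafe"       # should always be encoded
    return "plain"            # may appear unencoded


for ch in ["a", "&", "<", "\t", "\u00e9"]:
    print(repr(ch), classify(ch))
```

Only characters classified as "plain" can be used without further thought; "reserved" characters depend on whether they are serving their syntactic role at that position in the URL.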
How are characters URL encoded?
URL encoding of a character consists of a "%" symbol, followed by the two-digit hexadecimal representation (case-insensitive) of the ISO-Latin code point for the character.
- Example
- Space = decimal code point 32 in the ISO-Latin set.
- 32 decimal = 20 in hexadecimal
- The URL encoded representation will be "%20"
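The rule above is small enough to sketch directly. The helper name percent_encode_char below is hypothetical; the decoding step uses unquote from Python's standard urllib.parse module, called with the ISO-8859-1 encoding to match the code points used on this page.

```python
from urllib.parse import unquote


def percent_encode_char(ch: str) -> str:
    """Encode one character as '%' plus its two-digit hex ISO-8859-1 code point."""
    # Characters above FF hex have no ISO-8859-1 code point, so encode()
    # raises UnicodeEncodeError for them.
    code = ch.encode("iso-8859-1")[0]
    return "%{:02X}".format(code)


print(percent_encode_char(" "))   # %20
print(percent_encode_char("<"))   # %3C

# The hex digits are case-insensitive: both forms decode to the same character.
print(unquote("%3c", encoding="iso-8859-1") == unquote("%3C", encoding="iso-8859-1"))  # True
```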
Browser Peculiarities
- Internet Explorer is notoriously relaxed about encoding spaces in URLs, which tends to encourage sloppy URL authoring. Netscape and Opera are much stricter on this point: spaces MUST be encoded for the URL to be considered correct.
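One way to avoid relying on a lenient browser is to encode a URL before writing it into a document. The sketch below uses quote from Python's standard urllib.parse module. The function name encode_url and the example.com address are placeholders, and leaving the URL syntax characters ";", "/", "?", ":", "@", "=", "#" and "&" unencoded is an assumption that suits whole URLs rather than individual values.

```python
from urllib.parse import quote

# URL syntax characters left untouched so that a complete URL keeps its
# structure (an assumption of this sketch).
URL_SYNTAX_CHARS = ";/?:@=#&"


def encode_url(url: str) -> str:
    """Percent-encode spaces and other unsafe characters in a complete URL."""
    return quote(url, safe=URL_SYNTAX_CHARS, encoding="iso-8859-1")


print(encode_url("http://example.com/my docs/a b.html?x=1&y=2"))
# http://example.com/my%20docs/a%20b.html?x=1&y=2
```

If a reserved character appears as data, for example an "&" inside a query value, that value needs to be encoded on its own with a narrower safe set before the URL is assembled.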