A Brief Look at Base64 and My Own Understanding of It

Source: Internet  Editor: 程序博客网  Time: 2024/05/18 00:16
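Base64 is one of the most common encodings on the Internet for transmitting 8-bit byte data; see RFC 2045 through RFC 2049 for the detailed MIME specification. Base64 converts every three 8-bit bytes into four 6-bit units (3*8 = 4*6 = 24), then pads each 6-bit unit with two high-order zero bits to form four 8-bit bytes. In other words, the encoded string is in theory 1/3 longer than the original.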
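The 1/3 growth can be checked with a few lines of Java. This is only a sketch: the class name Base64LengthSketch and the helper encodedLength are made up for illustration, and the check relies on the JDK's own java.util.Base64 encoder (Java 8 or later) rather than on the implementation quoted later in this post.

```java
import java.util.Base64;

class Base64LengthSketch {
    // Hypothetical helper: padded Base64 length for n input bytes.
    // Every group of 3 bytes (a partial last group counts as a full one) yields 4 characters.
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        byte[] data = new byte[300]; // 300 zero bytes, the content does not matter here
        String encoded = Base64.getEncoder().encodeToString(data);
        // 300 input bytes become 400 output characters, i.e. 1/3 longer.
        System.out.println(encoded.length() + " vs " + encodedLength(data.length));
    }
}
```

When the input length is not a multiple of 3 the output is slightly more than 4/3 of the input, because the last group is rounded up to 4 characters.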

PHP functions: base64_encode() and base64_decode()

How Base64 encoding and decoding work

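Base64 encoding turns three 8-bit bytes into four 6-bit units (3*8 = 4*6 = 24). Each of these four units is still stored in 8 bits, only with the top two bits set to 0. When only 6 bits of a byte are significant, its value ranges from 0 to 2^6 - 1, that is 63, so every Base64 code unit takes a value in the range 0~63.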

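In fact, many of the ASCII codes between 0 and 63 are non-printable characters, so the 6-bit values cannot be sent as they are and one more mapping is needed. The mapping table is:

'A' ~ 'Z' -> values 0 ~ 25

'a' ~ 'z' -> values 26 ~ 51

'0' ~ '9' -> values 52 ~ 61

'+' -> value 62

'/' -> value 63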
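The table amounts to a 64-character lookup string in which the position of a character is the 6-bit value it represents. A minimal sketch, with the class and method names (Base64AlphabetSketch, toChar, toValue) invented for illustration:

```java
class Base64AlphabetSketch {
    // Standard Base64 alphabet: index 0..25 = 'A'..'Z', 26..51 = 'a'..'z',
    // 52..61 = '0'..'9', 62 = '+', 63 = '/'.
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // 6-bit value -> printable character
    static char toChar(int value) {
        if (value < 0 || value > 63) {
            throw new IllegalArgumentException("not a 6-bit value: " + value);
        }
        return ALPHABET.charAt(value);
    }

    // printable character -> 6-bit value, or -1 if the character is not in the alphabet
    static int toValue(char c) {
        return ALPHABET.indexOf(c);
    }

    public static void main(String[] args) {
        // prints: A a 0 + /
        System.out.println(toChar(0) + " " + toChar(26) + " " + toChar(52)
                + " " + toChar(62) + " " + toChar(63));
    }
}
```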

This way, three 8-bit bytes can be converted into four printable characters.

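The bytes are split as follows (the diagram is crude, but you get the idea :-)):

aaaaaabb ccccdddd eeffffff    // each of a..f stands for a bit (1 or 0); letters are used only to make the grouping easy to see

~~~~~~~~ ~~~~~~~~ ~~~~~~~~

 byte 1   byte 2   byte 3

    ||
    \/

00aaaaaa 00bbcccc 00ddddee 00ffffff

Note: the three bytes on top are the original data and the four bytes below are the Base64 code units, each with its top two bits set to 0.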
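The split in the diagram can be written with shifts and masks. The sketch below is illustrative only (Base64SplitSketch and split are made-up names) and uses the three ASCII bytes of "Man" as sample input.

```java
import java.util.Arrays;

class Base64SplitSketch {
    // Regroup three 8-bit bytes into four 6-bit values, following the diagram:
    // aaaaaabb ccccdddd eeffffff -> 00aaaaaa 00bbcccc 00ddddee 00ffffff
    static int[] split(int b1, int b2, int b3) {
        return new int[] {
            (b1 & 0xFF) >>> 2,                         // 00aaaaaa
            ((b1 & 0x03) << 4) | ((b2 & 0xFF) >>> 4),  // 00bbcccc
            ((b2 & 0x0F) << 2) | ((b3 & 0xFF) >>> 6),  // 00ddddee
            b3 & 0x3F                                  // 00ffffff
        };
    }

    public static void main(String[] args) {
        // 'M' = 01001101, 'a' = 01100001, 'n' = 01101110
        int[] values = split('M', 'a', 'n');
        System.out.println(Arrays.toString(values)); // [19, 22, 5, 46]
    }
}
```

Looking 19, 22, 5 and 46 up in the mapping table gives T, W, F and u, so "Man" encodes to "TWFu".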

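Splitting this way assumes that the number of original bytes is a multiple of 3. When it is not, the input is padded with all-zero bytes, and the Base64 characters that come entirely from that padding are written as '='. This is why some Base64 strings end with one or two equals signs. There can never be more than two, because if F(origin) is the number of original bytes and F(remain) is the remainder, then

F(remain) = F(origin) MOD 3

holds, so the possible values of F(remain) are 0, 1 and 2.

Let n = [F(origin) - F(remain)] / 3.

When F(remain) = 0, the input converts to exactly 4*n Base64 characters.

When F(remain) = 1, the one leftover byte spreads across two Base64 characters, so two equals signs are appended to keep the length a multiple of 4, for 4*(n+1) characters in total.

When F(remain) = 2, the two leftover bytes spread across three Base64 characters, so by the same reasoning one equals sign is appended, again for 4*(n+1) characters in total.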
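The three cases can be tried out directly. In this sketch the names Base64PaddingSketch, padCount and encode are hypothetical, and the encoding itself is delegated to the JDK's java.util.Base64 so that the padding rule is checked against an independent encoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64PaddingSketch {
    // Number of '=' characters for an input of originLength bytes:
    // remainder 0 -> 0, remainder 1 -> 2, remainder 2 -> 1.
    static int padCount(int originLength) {
        int remain = originLength % 3;
        return remain == 0 ? 0 : 3 - remain;
    }

    // Encode an ASCII string with the JDK encoder (padding included).
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        // Expected: Man -> TWFu (pad 0), Ma -> TWE= (pad 1), M -> TQ== (pad 2)
        for (String s : new String[] {"Man", "Ma", "M"}) {
            System.out.println(s + " -> " + encode(s) + " (pad " + padCount(s.length()) + ")");
        }
    }
}
```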

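A Base64-encoded string ends with zero to two equals signs. They are not needed to recover the data, because the amount of padding follows from the length of the string, so they can be removed, provided the decoder tolerates or restores the missing padding.
In GET and POST parameter lists, '+' is not transmitted correctly (it is typically decoded as a space), so it can be replaced with '|'.
After that, the only non-alphanumeric characters left in the Base64 string are '|' and '/', so a string processed this way can be passed as a parameter value in a parameter list.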
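A minimal sketch of that substitution, assuming the receiving side reverses it before decoding. The names Base64ParamSketch, toParam and fromParam are made up for illustration, and the Base64 work is done by the JDK's java.util.Base64.

```java
import java.util.Base64;

class Base64ParamSketch {
    // Encode, then apply the substitutions described above:
    // '+' becomes '|' and the trailing '=' padding is dropped.
    static String toParam(byte[] data) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return encoded.replace('+', '|').replace("=", "");
    }

    // Reverse the substitutions: restore '+', then re-append '='
    // until the length is a multiple of 4 again, and decode.
    static byte[] fromParam(String param) {
        StringBuilder sb = new StringBuilder(param.replace('|', '+'));
        while (sb.length() % 4 != 0) {
            sb.append('=');
        }
        return Base64.getDecoder().decode(sb.toString());
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFB, (byte) 0xEF}; // plain Base64 of these bytes is "++8="
        String param = toParam(data);
        System.out.println(param);                   // ||8
        System.out.println(fromParam(param).length); // 2
    }
}
```

The '|' substitution is this post's own convention. For comparison, RFC 4648 defines a URL-safe alphabet that uses '-' and '_' in place of '+' and '/', available in Java as Base64.getUrlEncoder(), optionally with withoutPadding().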

======================================================================== 
Below is an implementation written by a developer abroad (the // comments are my own notes):
package com.meterware.httpunit;
/********************************************************************************************************************
* $Id: Base64.java,v 1.4 2002/12/24 15:17:17 russgold Exp $
*
* Copyright (c) 2000-2002 by Russell Gold
*
* Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
* documentation files (the "Software"), to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
* to permit persons to whom the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
* THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
* CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*
*******************************************************************************************************************/

/**
 * A utility class to convert to and from base 64 encoding.
 *
 * @author <a href="mailto:russgold@httpunit.org">Russell Gold</a>
 **/
public class Base64 {

    final static String encodingChar = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /**
     * Returns the base 64 encoded equivalent of a supplied string.
     * @param source the string to encode
     */
    public static String encode( String source ) {
        char[] sourceBytes = getPaddedBytes( source );
        // Each char is treated here as one 8-bit byte (so only characters up to 0xFF are handled).
        // numGroups is the number of 3-byte groups; each group becomes four 6-bit Base64 characters.
        int numGroups = (sourceBytes.length + 2) / 3;
        // Holds the output of one group: four 6-bit values, i.e. one standard Base64 unit.
        char[] targetBytes = new char[4];
        // Allocate the full output length.
        char[] target = new char[ 4 * numGroups ];

        for (int group = 0; group < numGroups; group++) {
            // Take 3 source bytes (3*8 bits) and regroup them as 4*6 bits in targetBytes.
            convert3To4( sourceBytes, group*3, targetBytes );
            for (int i = 0; i < targetBytes.length; i++) {
                target[ i + 4*group ] = encodingChar.charAt( targetBytes[i] );
            }
        }

        // However many padding bytes were added, that many '=' are written at the end.
        int numPadBytes = sourceBytes.length - source.length();

        for (int i = target.length-numPadBytes; i < target.length; i++) target[i] = '=';
        return new String( target );
    }

    // Pads the source with zero chars up to a multiple of 3:
    // lengths 1, 2, 3 all become 3; lengths 4, 5, 6 all become 6.
    private static char[] getPaddedBytes( String source ) {
        char[] converted = source.toCharArray();
        int requiredLength = 3 * ((converted.length+2) /3);
        char[] result = new char[ requiredLength ];
        System.arraycopy( converted, 0, result, 0, converted.length );
        return result;
    }

    private static void convert3To4( char[] source, int sourceIndex, char[] target ) {
        target[0] = (char) ( source[ sourceIndex ] >>> 2);
        target[1] = (char) (((source[ sourceIndex   ] & 0x03) << 4) | (source[ sourceIndex+1 ] >>> 4));
        target[2] = (char) (((source[ sourceIndex+1 ] & 0x0f) << 2) | (source[ sourceIndex+2 ] >>> 6));
        target[3] = (char) (  source[ sourceIndex+2 ] & 0x3f);
    }

    /**
     * Returns the plaintext equivalent of a base 64-encoded string.
     * @param source a base 64 string (which must have a multiple of 4 characters)
     */
    public static String decode( String source ) {
        if (source.length()%4 != 0) throw new RuntimeException( "valid Base64 codes have a multiple of 4 characters" );
        int numGroups = source.length() / 4;
        int numExtraBytes = source.endsWith( "==" ) ? 2 : (source.endsWith( "=" ) ? 1 : 0);
        byte[] targetBytes = new byte[ 3*numGroups ];
        byte[] sourceBytes = new byte[4];
        for (int group = 0; group < numGroups; group++) {
            for (int i = 0; i < sourceBytes.length; i++) {
                sourceBytes[i] = (byte) Math.max( 0, encodingChar.indexOf( source.charAt( 4*group+i ) ) );
            }
            convert4To3( sourceBytes, targetBytes, group*3 );
        }
        return new String( targetBytes, 0, targetBytes.length - numExtraBytes );
    }

    private static void convert4To3( byte[] source, byte[] target, int targetIndex ) {
        target[ targetIndex   ] = (byte) (( source[0] << 2) | (source[1] >>> 4));
        target[ targetIndex+1 ] = (byte) (((source[1] & 0x0f) << 4) | (source[2] >>> 2));
        target[ targetIndex+2 ] = (byte) (((source[2] & 0x03) << 6) | (source[3]));
    }
}


