The C# Char Type
Source: Internet | Editor: 程序博客网 | Date: 2024/04/29 12:34
Can a single Char in C# store one Chinese character?
https://msdn.microsoft.com/zh-cn/library/system.char.aspx
The .NET Framework uses the Char structure to represent a Unicode character. The Unicode Standard identifies each Unicode character with a unique 21-bit scalar number called a code point, and defines the UTF-16 encoding form that specifies how a code point is encoded into a sequence of one or more 16-bit values. Each 16-bit value ranges from hexadecimal 0x0000 through 0xFFFF and is stored in a Char structure. The value of a Char object is its 16-bit numeric (ordinal) value.
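As a rough illustration of the question in the title, the sketch below checks how many Char values a Chinese character occupies. The class name `CharWidthSketch` and the helper `CodeUnitCount` are made up for this example. The two sample code points (U+6C49 and U+20000) were chosen here as one ideograph inside the BMP and one from CJK Extension B; neither appears in the original text.

```csharp
using System;

public static class CharWidthSketch
{
    // Hypothetical helper: number of UTF-16 code units (Char values)
    // needed to encode one Unicode code point.
    public static int CodeUnitCount(int codePoint)
    {
        return char.ConvertFromUtf32(codePoint).Length;
    }

    public static void Main()
    {
        char han = '\u6C49'; // 汉, an ideograph inside the BMP
        Console.WriteLine("U+{0:X4} needs {1} Char value(s)", (int)han, CodeUnitCount(han));

        int extB = 0x20000;  // an ideograph from CJK Extension B, outside the BMP
        Console.WriteLine("U+{0:X} needs {1} Char value(s)", extB, CodeUnitCount(extB));
    }
}
// Output:
// U+6C49 needs 1 Char value(s)
// U+20000 needs 2 Char value(s)
```

So a Char holds any Chinese character whose code point lies in the BMP, which covers the commonly used ones, while rarer ideographs in the supplementary planes take two Char values.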
Char Objects, Unicode Characters, and Strings
A String object is a sequential collection of Char structures that represents a string of text. Most Unicode characters can be represented by a single Char object, but a character that is encoded as a surrogate pair, or as a base character followed by one or more combining characters, is represented by multiple Char objects. For this reason, a Char structure in a String object is not necessarily equivalent to a single Unicode character.
Multiple 16-bit code units are used to represent single Unicode characters in the following cases:
Glyphs, which may consist of a single character or of a base character followed by one or more combining characters. For example, the character ä is represented by a Char object whose code unit is U+0061 followed by a Char object whose code unit is U+0308. (The character ä can also be defined by a single Char object that has a code unit of U+00E4.) The following example illustrates that the character ä consists of two Char objects.
    using System;
    using System.IO;

    public class Example
    {
        public static void Main()
        {
            StreamWriter sw = new StreamWriter("chars1.txt");
            // Base character 'a' followed by COMBINING DIAERESIS.
            char[] chars = { '\u0061', '\u0308' };
            string strng = new String(chars);
            sw.WriteLine(strng);
            sw.Close();
        }
    }
    // The example produces the following output:
    //       ä
Characters outside the Unicode Basic Multilingual Plane (BMP). Unicode supports sixteen planes in addition to the BMP, which represents plane 0. A Unicode code point is represented in UTF-32 by a 21-bit value that includes the plane. For example, U+1D160 represents the MUSICAL SYMBOL EIGHTH NOTE character. Because a UTF-16 code unit has only 16 bits, characters outside the BMP are represented by surrogate pairs in UTF-16. The following example illustrates that the UTF-16 equivalent of U+1D160, the MUSICAL SYMBOL EIGHTH NOTE character, is U+D834 U+DD60. U+D834 is the high surrogate; high surrogates range from U+D800 through U+DBFF. U+DD60 is the low surrogate; low surrogates range from U+DC00 through U+DFFF.
    using System;
    using System.IO;

    public class Example
    {
        public static void Main()
        {
            StreamWriter sw = new StreamWriter(@".\chars2.txt");
            int utf32 = 0x1D160;
            string surrogate = Char.ConvertFromUtf32(utf32);
            sw.WriteLine("U+{0:X6} UTF-32 = {1} ({2}) UTF-16",
                         utf32, surrogate, ShowCodePoints(surrogate));
            sw.Close();
        }

        // Formats each UTF-16 code unit of the string as U+XXXX.
        private static string ShowCodePoints(string value)
        {
            string retval = null;
            foreach (var ch in value)
                retval += String.Format("U+{0:X4} ", Convert.ToUInt16(ch));

            return retval.Trim();
        }
    }
    // The example produces the following output:
    //       U+01D160 UTF-32 = 𝅘𝅥𝅮 (U+D834 U+DD60) UTF-16
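The surrogate ranges given above follow from simple bit arithmetic. As a minimal sketch, assuming the standard UTF-16 rule (subtract 0x10000, then split the remaining 20 bits into two 10-bit halves), the pair for U+1D160 can be computed by hand and checked against `Char.ConvertToUtf32`. The class `SurrogateSketch` and the method `ToSurrogates` are illustrative names, not part of the framework.

```csharp
using System;

public static class SurrogateSketch
{
    // Hypothetical helper: splits a supplementary code point
    // (U+10000 through U+10FFFF) into its UTF-16 surrogate pair.
    public static char[] ToSurrogates(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException("codePoint");

        int offset = codePoint - 0x10000;              // 20-bit value
        char high = (char)(0xD800 + (offset >> 10));   // top 10 bits
        char low = (char)(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new[] { high, low };
    }

    public static void Main()
    {
        char[] pair = ToSurrogates(0x1D160);
        Console.WriteLine("U+{0:X4} U+{1:X4}", (int)pair[0], (int)pair[1]);

        // Cross-check with the framework's own conversion.
        Console.WriteLine(char.ConvertToUtf32(pair[0], pair[1]) == 0x1D160);
    }
}
// Output:
// U+D834 U+DD60
// True
```

In real code `Char.ConvertFromUtf32` already does this; the manual version is only meant to show where the U+D800 and U+DC00 bases come from.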
Characters and Text Elements
Because a single character can be represented by multiple Char objects, it is not always meaningful to work with individual Char objects. For instance, the following example converts the Unicode code points U+10107 through U+10110, which represent the ten Aegean numbers (AEGEAN NUMBER ONE through AEGEAN NUMBER TEN), to UTF-16 encoded code units. Because it erroneously equates Char objects with characters, it inaccurately reports that the resulting string has 20 characters.
    using System;

    public class Example
    {
        public static void Main()
        {
            string result = String.Empty;
            for (int ctr = 0x10107; ctr <= 0x10110; ctr++)  // Range of Aegean numbers.
                result += Char.ConvertFromUtf32(ctr);

            // Length counts Char values (UTF-16 code units), not characters.
            Console.WriteLine("The string contains {0} characters.", result.Length);
        }
    }
    // The example displays the following output:
    //     The string contains 20 characters.
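To see where the count of 20 comes from, one option is to walk the string and treat each surrogate pair as a single unit. The sketch below does that with `Char.IsSurrogatePair`; the class `CodePointCountSketch` and the method `CountCodePoints` are names invented for this illustration, and the approach counts code points only, so it does not merge combining character sequences.

```csharp
using System;

public static class CodePointCountSketch
{
    // Hypothetical helper: counts Unicode code points by treating
    // each valid surrogate pair as one unit.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
                i++; // skip the low surrogate of the pair
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string aegean = string.Empty;
        for (int cp = 0x10107; cp <= 0x10110; cp++)
            aegean += char.ConvertFromUtf32(cp);

        Console.WriteLine("Char values: {0}", aegean.Length);
        Console.WriteLine("Code points: {0}", CountCodePoints(aegean));
    }
}
// Output:
// Char values: 20
// Code points: 10
```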
You can do the following to avoid the assumption that a Char object represents a single character.
You can work with a String object in its entirety instead of working with its individual characters to represent and analyze linguistic content.
You can use the StringInfo class to work with text elements instead of individual Char objects. The following example uses the StringInfo object to count the number of text elements in a string that consists of the ten Aegean numbers U+10107 through U+10110. Because it considers a surrogate pair a single character, it correctly reports that the string contains ten characters.
    using System;
    using System.Globalization;

    public class Example
    {
        public static void Main()
        {
            string result = String.Empty;
            for (int ctr = 0x10107; ctr <= 0x10110; ctr++)  // Range of Aegean numbers.
                result += Char.ConvertFromUtf32(ctr);

            // StringInfo counts text elements, so each surrogate pair counts once.
            StringInfo si = new StringInfo(result);
            Console.WriteLine("The string contains {0} characters.",
                              si.LengthInTextElements);
        }
    }
    // The example displays the following output:
    //       The string contains 10 characters.
If a string contains a base character that has one or more combining characters, you can call the String.Normalize method to compose the sequence into a single UTF-16 encoded code unit, provided that Unicode defines a precomposed character for it. The following example calls the String.Normalize method to convert the base character U+0061 (LATIN SMALL LETTER A) and combining character U+0308 (COMBINING DIAERESIS) to U+00E4 (LATIN SMALL LETTER A WITH DIAERESIS).
    using System;

    public class Example
    {
        public static void Main()
        {
            string combining = "\u0061\u0308";
            ShowString(combining);

            // Normalize() with no arguments uses Normalization Form C.
            string normalized = combining.Normalize();
            ShowString(normalized);
        }

        // Prints the string length and each code unit as U+XXXX.
        private static void ShowString(string s)
        {
            Console.Write("Length of string: {0} (", s.Length);
            for (int ctr = 0; ctr < s.Length; ctr++)
            {
                Console.Write("U+{0:X4}", Convert.ToUInt16(s[ctr]));
                if (ctr != s.Length - 1) Console.Write(" ");
            }
            Console.WriteLine(")\n");
        }
    }
    // The example displays the following output:
    //       Length of string: 2 (U+0061 U+0308)
    //
    //       Length of string: 1 (U+00E4)
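One practical consequence is that the two encodings of ä are different strings under ordinal comparison until both are normalized. The sketch below is one possible way to compare them; `NormalizationSketch` and `EqualsNormalized` are hypothetical names, and the `q` plus U+0308 case was added here as an assumed example of a sequence with no precomposed form, so normalization leaves it at two Char values.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class NormalizationSketch
{
    // Hypothetical helper: ordinal equality after bringing both
    // strings to Normalization Form C.
    public static bool EqualsNormalized(string a, string b)
    {
        return string.Equals(a.Normalize(NormalizationForm.FormC),
                             b.Normalize(NormalizationForm.FormC),
                             StringComparison.Ordinal);
    }

    public static void Main()
    {
        string decomposed = "\u0061\u0308"; // a + COMBINING DIAERESIS
        string precomposed = "\u00E4";      // LATIN SMALL LETTER A WITH DIAERESIS

        // Different code units, so ordinal comparison fails.
        Console.WriteLine(string.Equals(decomposed, precomposed, StringComparison.Ordinal));
        // Equal once both sides are in the same normalization form.
        Console.WriteLine(EqualsNormalized(decomposed, precomposed));
        // Form D decomposes the precomposed character again.
        Console.WriteLine(precomposed.Normalize(NormalizationForm.FormD).Length);
        // No precomposed "q with diaeresis" exists, so the length stays 2.
        Console.WriteLine("q\u0308".Normalize().Length);
        // Either way, StringInfo sees the decomposed form as one text element.
        Console.WriteLine(new StringInfo(decomposed).LengthInTextElements);
    }
}
// Output:
// False
// True
// 2
// 2
// 1
```

Normalizing once at the boundary, for instance when text is read or stored, is usually simpler than normalizing at every comparison.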