CLR via C# Summary, Chapter 5: Primitive, Reference, and Value Types
Source: Internet · Editor: 程序博客网 · Time: 2024/05/01 09:35
Primitive Types
Any data types the compiler directly supports are called primitive types. Primitive types map directly to types existing in the Framework Class Library (FCL).
Figure 1
The primitive type names are short and therefore convenient. However, the author discourages their use, for these reasons:
- Many programmers assume that `int` represents a 32-bit integer on a 32-bit OS and a 64-bit integer on a 64-bit OS. This is NOT true for C#. In C#, `int` always maps to `System.Int32`. Using `Int32` removes the potential confusion.
- In C#, `long` maps to `System.Int64`, while in a different language it may map to an `Int16` or `Int32`. Someone reading source code in one language could easily misinterpret the code's intention if they are used to programming in a different language. This holds for other languages too, not just C#.
- The FCL has many methods that have type names as part of their method names. So it may feel unnatural to write `float val = br.ReadSingle();` although it is correct. (In my view, if it reads fine to you, it doesn't matter much.)

So the choice does affect how the code reads, and we can try to use the FCL type names.

    Int32 i = 5;
    Int64 j = i; // Implicit cast from Int32 to Int64
Based on the casting rules, this code should NOT compile because neither of these two types derives from the other. However, it does compile because the C# compiler has intimate knowledge of primitive types and applies special rules when compiling the code: it produces the necessary IL to make things happen as expected when it recognizes common programming patterns like this.
C# allows implicit casts if the conversion is "safe", which means no loss of data is possible. If the conversion is potentially unsafe, an explicit cast is required. Keep in mind that different compilers can apply different casting rules, so you need to be careful.
Primitive types can be written as literals. A literal is considered to be an instance of the type itself, so you can call instance methods of the corresponding type on it directly.
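As a rough sketch of the casting rules and literal behavior described above (the class and method names here are made up for illustration and are not from the book):

```csharp
using System;

public static class PrimitiveCastDemo
{
    // Widening Int32 -> Int64 is implicit because no data can be lost.
    public static Int64 Widen(Int32 value)
    {
        return value;
    }

    // Narrowing Int64 -> Int32 needs an explicit cast; in an unchecked
    // context the high 32 bits are simply discarded.
    public static Int32 Narrow(Int64 value)
    {
        return unchecked((Int32) value);
    }

    // A literal is an instance of its type, so instance methods work on it.
    public static String LiteralToString()
    {
        return 123.ToString() + 456.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Widen(5));             // 5
        Console.WriteLine(Narrow(4294967301L));  // 5 (2^32 + 5 with the high bits dropped)
        Console.WriteLine(LiteralToString());    // 123456
    }
}
```

The explicit `unchecked` in `Narrow` is only there so the result does not depend on how the project's overflow checking is configured.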
Checked and unchecked primitive type operations
    Byte b = 100;
    b = (Byte) (b + 200); // b is widened to Int32 for the addition (no data loss); b now contains 44

The first step requires `b` to be expanded to a 32-bit value (or a 64-bit value if any operand requires more than 32 bits). The sum, 300, is then cast back to a `Byte`, which silently wraps around to 44. In most programming scenarios, silent overflow is undesirable. However, in some rare programming scenarios, such as calculating a hash value or a checksum, this overflow is not only acceptable but also desired.
By default, overflow is NOT checked in C#. If you turn checking on, IL instructions like `add` are replaced by their checking versions, like `add.ovf`. You make the choice. The ways to control overflow checking:
- Use the `/checked+` compiler switch to turn checking on globally (`/checked-` turns it off).
- Use the `checked` and `unchecked` operators and statements in code: `checked(...)`, `unchecked(...)`, `checked { ... }`.

For example:

    Byte b = 100;
    b = checked((Byte) (b + 200)); // OverflowException is thrown
    b = (Byte) checked(b + 200);   // No OverflowException: b + 200 is an Int32 and does not overflow
    checked
    {
        Byte c = 100;
        c += 200;                  // Can use += if we use a checked block
    }
Attention: calling a method within a `checked` operator or statement has no impact on that method.

    checked
    {
        // Assume SomeMethod tries to load 300 into a Byte.
        // It will NOT throw an OverflowException if it is not
        // compiled with checked instructions.
        SomeMethod(300);
    }
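To make the difference between the two modes concrete, here is a minimal sketch. `OverflowDemo` and its methods are hypothetical helpers written for this note, not FCL APIs:

```csharp
using System;

public static class OverflowDemo
{
    // Unchecked context: the Int32 sum is truncated to a Byte and wraps silently.
    public static Byte AddUnchecked(Byte a, Int32 delta)
    {
        return unchecked((Byte) (a + delta));
    }

    // Checked context: the narrowing cast throws, and we report failure instead.
    public static Boolean TryAddChecked(Byte a, Int32 delta, out Byte result)
    {
        try
        {
            result = checked((Byte) (a + delta));
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    public static void Main()
    {
        Byte r;
        Console.WriteLine(AddUnchecked(100, 200));         // 44
        Console.WriteLine(TryAddChecked(100, 200, out r)); // False
        Console.WriteLine(TryAddChecked(100, 100, out r)); // True (r is 200)
    }
}
```

Catching `OverflowException` like this is one way to recover gracefully when the overflow comes from invalid input data.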
Programming Recommendation
- Use signed data types wherever possible. This allows the compiler to detect more overflow/underflow errors. Also, some parts of the class library (such as the `Length` properties of `Array` and `String`) are hard-coded to return signed values.
- Explicitly use `checked` around blocks where an unwanted overflow might occur due to invalid input data. The resulting `OverflowException` can be caught and handled.
- Explicitly use `unchecked` around blocks where overflow is OK.
Reference Types and Value Types
Value type instances are usually allocated on a thread's stack rather than in the managed heap, so they are not under the control of the GC (garbage collector). Their use reduces pressure on the managed heap and reduces the number of collections an application requires over its lifetime.

In the .NET Framework SDK documentation, any type called a class is a reference type, while each value type is referred to as a structure or an enumeration. All structures are immediately derived from the `System.ValueType` abstract type, which is itself immediately derived from the `System.Object` type.

    struct SomeVal
    {
        public Int32 x;
    }

    static void Demo()
    {
        // Allocated on the thread's stack! Unlike in C++, new does not imply the heap.
        SomeVal v1 = new SomeVal();
    }
In C#, types declared using `struct` are value types, and types declared using `class` are reference types.

    SomeVal v1 = new SomeVal(); // line 1
    Int32 a = v1.x;
    SomeVal v2;                 // line 3
    Int32 b = v2.x;             // error CS0170: Use of possibly unassigned field 'x'

In fact, both line 1 and line 3 produce IL that allocates the instance on the thread's stack and zeroes the fields. The only difference is that C# "thinks" that the instance is initialized if you use the `new` operator.

In some situations, value types can give better performance. Declare a type as a value type if ALL the following are true:
- The type acts as a primitive type. When a type offers no members that alter its fields, we say that the type is immutable. In fact, it is recommended that many value types mark all their fields as `readonly` (discussed later in Chap7).
- The type doesn't need to inherit from any other type.
- The type won't have any other types derived from it.

What's more, when a value type is used as an argument or a return value, all the fields in the value type instance are copied, hurting performance. So, in addition, ONE of the following statements must also be true:
- Instances of the type are small (about 16 bytes or less).
- Instances of the type are large (more than 16 bytes) and are NOT passed as method parameters or returned from methods.
The main advantage of value types is that they're NOT allocated as objects in the managed heap. Here are some of the ways in which value types and reference types differ:
- Value type objects have two representations: an unboxed form and a boxed form, while reference types are always in a boxed form.
- `System.ValueType` overrides the `Equals` and `GetHashCode` methods of `System.Object`. Because of performance issues with this default implementation, you should override these two methods in your own value type (see Object Equality and Identity below for how).
- You should NOT introduce new virtual methods into a value type. No methods can be abstract, and all methods are implicitly sealed (they can't be overridden). You CAN'T define a new value type or a new reference type by using a value type as a base type.
- Reference type variables are initialized to `null`, while value type variables always contain a value. A special feature, called nullable types, adds the notion of nullability to a value type.
- On assignment, a field-by-field copy is made for a value type, while only the memory address is copied for a reference type.
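The assignment difference in the last point is easy to see in a small sketch. `PointValue` and `PointRef` are hypothetical types with the same shape, one a struct and one a class:

```csharp
using System;

// Hypothetical types for illustration: same field, different kind of type.
internal struct PointValue { public Int32 X; }
internal sealed class PointRef { public Int32 X; }

public static class CopySemanticsDemo
{
    // Copies a struct, mutates the copy, and returns the original's field.
    public static Int32 OriginalAfterStructCopy()
    {
        PointValue v1 = new PointValue();
        v1.X = 5;
        PointValue v2 = v1; // field-by-field copy
        v2.X = 8;           // changes only the copy
        return v1.X;
    }

    // Copies a reference, mutates through the copy, and returns the original's field.
    public static Int32 OriginalAfterReferenceCopy()
    {
        PointRef r1 = new PointRef();
        r1.X = 5;
        PointRef r2 = r1;   // copies only the reference
        r2.X = 8;           // both variables refer to the same heap object
        return r1.X;
    }

    public static void Main()
    {
        Console.WriteLine(OriginalAfterStructCopy());    // 5
        Console.WriteLine(OriginalAfterReferenceCopy()); // 8
    }
}
```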
How the CLR Controls the Layout of a Type’s Fields
- To improve performance, the CLR is capable of arranging the fields of a type any way it chooses. For example, the CLR might reorder fields in memory so that object references are grouped together and data fields are properly aligned and packed.
- You can tell the CLR what to do by applying the `System.Runtime.InteropServices.StructLayoutAttribute` attribute to the class or structure you're defining. Pass `LayoutKind.Auto` to have the CLR arrange the fields, `LayoutKind.Sequential` to have the CLR preserve your field layout, or `LayoutKind.Explicit` to explicitly arrange the fields in memory by using offsets.
- Microsoft's C# compiler selects `LayoutKind.Auto` for reference types (classes) and `LayoutKind.Sequential` for value types.
- For explicit layout, apply an instance of the `System.Runtime.InteropServices.FieldOffsetAttribute` attribute to each field, passing to this attribute's constructor an `Int32` indicating the offset (in bytes) of the field's first byte from the beginning of the instance. Explicit layout is typically used to simulate what would be a union in unmanaged C/C++ because you can have multiple fields starting at the same offset in memory.

For example:

    // Let the CLR arrange the fields to improve
    // performance for this value type.
    [StructLayout(LayoutKind.Auto)]
    internal struct SomeValType
    {
        private readonly Byte m_b;
    }

    // The developer explicitly arranges the fields of this value type.
    [StructLayout(LayoutKind.Explicit)]
    internal struct SomeValType
    {
        [FieldOffset(0)]
        private readonly Byte m_b;  // The m_b and m_x fields overlap each
        [FieldOffset(0)]
        private readonly Int16 m_x; // other in instances of this type
    }

Note that it is illegal to define a type in which a reference type and a value type overlap.
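To see the union-like overlap in action, here is a small sketch. `ByteOrInt16` is a hypothetical struct invented for this note; its fields are public only so the demo can read and write them:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical union-like struct: both fields start at offset 0.
[StructLayout(LayoutKind.Explicit)]
internal struct ByteOrInt16
{
    [FieldOffset(0)] public Byte B;
    [FieldOffset(0)] public Int16 X;
}

public static class ExplicitLayoutDemo
{
    // Writes the Int16 field and reads back the overlapping Byte field.
    public static Byte FirstByteOf(Int16 value)
    {
        ByteOrInt16 u = new ByteOrInt16();
        u.X = value;
        return u.B;
    }

    public static void Main()
    {
        Int16 value = 0x0102;
        // BitConverter.GetBytes returns the bytes in this machine's memory
        // order, so its first byte should match the overlapping field.
        Console.WriteLine(FirstByteOf(value) == BitConverter.GetBytes(value)[0]);
    }
}
```

Which byte of `X` ends up in `B` depends on the machine's endianness, which is why the sketch compares against `BitConverter` rather than a hard-coded value.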
Boxing and Unboxing Value Types
- In many cases, you must get a reference to an instance of a value type. For example, `ArrayList` declares `public virtual Int32 Add(Object value);`. This indicates that `Add` requires a reference (or pointer) to an object on the managed heap as a parameter. So a value type must be converted into a true heap-managed object, and a reference to this object must be obtained. (Note: although `ValueType` also derives from `Object`, the fact that the parameter is not declared as `ValueType` already shows that the method is not asking for a value type, and internally it does not use anything specific to `ValueType`.)
- What happens when an instance of a value type is boxed:
  - Memory is allocated from the managed heap. The amount of memory allocated is the size required by the value type's fields plus the two additional overhead members (the type object pointer and the sync block index) required by all objects on the managed heap.
  - The value type's fields are copied to the newly allocated heap memory.
  - The address of the object is returned.
```
// Declare a value type.
struct Point {
    public Int32 x, y;
}

public sealed class Program {
    public static void Main() {
        ArrayList a = new ArrayList();
        Point p; // Allocate a Point (not in the heap).
        for (Int32 i = 0; i < 10; i++) {
            p.x = p.y = i; // Initialize the members in the value type.
            a.Add(p);      // Box the value type and add the
                           // reference to the ArrayList.
        }
        ...
    }
}
```
The `Point` value type variable (`p`) can be reused because the `ArrayList` **never knows anything about it**. **Note that** the lifetime of the boxed value type **extends beyond** the lifetime of the unboxed value type.
Note that the FCL now includes a new set of generic collection classes that make the non-generic collection classes obsolete. For example, use the `System.Collections.Generic.List<T>` class instead of the `System.Collections.ArrayList` class.
The generic collection classes offer many **improvements** over the non-generic equivalents. For example, the API has been **cleaned up and improved**, and the performance of the collection classes has been greatly improved as well. But one of the biggest improvements is that the generic collection classes allow you to **work with collections of value types without requiring that items in the collection be boxed/unboxed**. This in itself greatly improves performance because far fewer objects will be created on the managed heap, thereby reducing the number of garbage collections required by your application. Furthermore, you will get compile-time type safety, and your source code will be cleaner due to fewer casts.
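A side-by-side sketch of the two collection styles (the `GenericCollectionDemo` class is a made-up helper, and it shows only where boxing and casts appear in the source, not a benchmark):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class GenericCollectionDemo
{
    // Non-generic: each Int32 is boxed by Add and must be unboxed with a cast.
    public static Int32 SumWithArrayList(Int32 count)
    {
        ArrayList list = new ArrayList();
        for (Int32 i = 0; i < count; i++) list.Add(i); // boxes i
        Int32 sum = 0;
        foreach (Object o in list) sum += (Int32) o;   // unboxes each element
        return sum;
    }

    // Generic: elements are stored as Int32, so there is no boxing and no cast.
    public static Int32 SumWithList(Int32 count)
    {
        List<Int32> list = new List<Int32>();
        for (Int32 i = 0; i < count; i++) list.Add(i);
        Int32 sum = 0;
        foreach (Int32 value in list) sum += value;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithArrayList(10)); // 45
        Console.WriteLine(SumWithList(10));      // 45
    }
}
```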
    Point p = (Point) a[0]; // Unboxing and copy

Here you're taking the reference (or pointer) contained in element `0` of the `ArrayList` and trying to put it into a `Point` value type instance, `p`. For this to work, all of the fields contained in the boxed `Point` object must be copied into the value type variable, `p`, which is on the thread's stack. The CLR accomplishes this copying in two steps:
1. Unboxing: the address of the `Point` fields in the boxed `Point` object is obtained.
2. The values of these fields are copied from the heap to the stack-based value type instance.

**Mind** that unboxing itself doesn't involve the copying of any bytes in memory; an unboxing operation is typically followed by copying the fields.

While unboxing, a `NullReferenceException` is thrown if the variable containing the reference to the boxed value type instance is `null`, and an `InvalidCastException` is thrown if the reference doesn't refer to an object that is a boxed instance of the desired value type.

    public static void Main() {
        Int32 x = 5;
        Object o = x;               // Box x; o refers to the boxed object
        Int16 y = (Int16) o;        // Throws an InvalidCastException
        Int16 z = (Int16)(Int32) o; // Unbox to the correct type and cast
    }
```
public static void Main() {
    Point p;
    p.x = p.y = 1;
    Object o = p; // Boxes p; o refers to the boxed instance

    // Change Point's x field to 2
    p = (Point) o; // Unboxes o AND copies fields from boxed
                   // instance to stack variable
    p.x = 2;       // Changes the state of the stack variable
    o = p;         // Boxes p; o refers to a new boxed instance
}
```
The code at the bottom of this fragment is intended **only** to change `Point`'s `x` field from `1` to `2`. This involves an unbox/copy followed by boxing again, which creates a whole new boxed instance in the managed heap.
Some languages, such as C++/CLI, allow you to unbox a boxed value type **without copying the fields**. In this case, the unboxed instance's fields happen to be in a boxed object on the heap, and we can manipulate the data directly. This improves performance by avoiding both the extra allocation and the two copies.

Be aware of the boxing-related operations mentioned above if you are the least bit concerned about your application's performance. Use ILDasm.exe or .NET Reflector to view the IL code and see where the `box` IL instructions are.
Looking at the IL code helps. Consider this code:

    public static void Main() {
        Int32 v = 5;   // Create an unboxed value type variable.
        Object o = v;  // o refers to a boxed Int32 containing 5.
        v = 123;       // Changes the unboxed value to 123
        Console.WriteLine(v + ", " + (Int32) o); // Displays "123, 5"
    }

Boxing occurs THREE times in this code. When we are very concerned about the performance of some code, we should pay attention to the boxing operations. (In this demo, the two extra box operations happen because `v` and the unboxed `(Int32) o` must be converted to `Object` to fit `String.Concat(Object, Object, Object)`.)

Basically, if you want a reference to an instance of a value type, the instance must be boxed.
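One way to sidestep the extra boxing in the demo above is to convert the value to a string yourself and to pass the already-boxed object as it is. The sketch below uses a hypothetical `ConcatDemo` class; both methods produce the same text, and the difference is only in the IL the compiler emits:

```csharp
using System;

public static class ConcatDemo
{
    // v and the unboxed o are handed to String.Concat as Object, so per the
    // discussion above each one may be boxed again.
    public static String WithBoxing(Int32 v, Object o)
    {
        return v + ", " + (Int32) o;
    }

    // ToString turns v into a String up front, and o is used without
    // unboxing it, so no additional box is needed here.
    public static String WithoutExtraBoxing(Int32 v, Object o)
    {
        return v.ToString() + ", " + o;
    }

    public static void Main()
    {
        Int32 v = 5;
        Object o = v; // the one box we actually asked for
        v = 123;
        Console.WriteLine(WithBoxing(v, o));         // 123, 5
        Console.WriteLine(WithoutExtraBoxing(v, o)); // 123, 5
    }
}
```

Checking the output of ILDasm.exe for `box` instructions is the reliable way to confirm what a particular compiler version actually does.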
You can still call virtual methods (such as `Equals`, `GetHashCode`, or `ToString`) inherited or overridden by the type. If your value type overrides one of these virtual methods, then the CLR can invoke the method nonvirtually because value types are implicitly sealed and cannot have any types derived from them. In addition, the value type instance being used to invoke the virtual method is NOT boxed. However, if your override of the virtual method calls into the base type's implementation of the method, then the value type instance does get boxed when calling the base type's implementation so that a reference to a heap object gets passed to the `this` pointer into the base method.

In addition, casting an unboxed instance of a value type to one of the type's interfaces requires the instance to be boxed, because interface variables must always contain a reference to an object on the heap. (Covered later in Chap 13)
    using System;

    internal struct Point : IComparable {
        private readonly Int32 m_x, m_y;

        // Constructor to easily initialize the fields
        public Point(Int32 x, Int32 y) {
            m_x = x;
            m_y = y;
        }

        // Override ToString method inherited from System.ValueType
        public override String ToString() {
            // Return the point as a string. Note: calling ToString prevents boxing
            return String.Format("({0}, {1})", m_x.ToString(), m_y.ToString());
        }

        // Implementation of type-safe CompareTo method
        public Int32 CompareTo(Point other) {
            return Math.Sign(Math.Sqrt(m_x * m_x + m_y * m_y)
                - Math.Sqrt(other.m_x * other.m_x + other.m_y * other.m_y));
        }

        // Implementation of IComparable's CompareTo method
        public Int32 CompareTo(Object o) {
            if (GetType() != o.GetType()) {
                throw new ArgumentException("o is not a Point");
            }
            // Call type-safe CompareTo method
            return CompareTo((Point) o);
        }
    }

    public static class Program {
        public static void Main() {
            // Create two Point instances on the stack.
            Point p1 = new Point(10, 10);
            Point p2 = new Point(20, 20);

            // p1 does NOT get boxed to call ToString (a virtual method).
            Console.WriteLine(p1.ToString()); // "(10, 10)"
            // Open question: is this name hiding or an override? Do the two
            // differ only when polymorphism is involved?

            // p1 DOES get boxed to call GetType (a non-virtual method).
            Console.WriteLine(p1.GetType()); // "Point"
            // Open question: what if Point declared its own GetType method,
            // hiding the non-virtual method in the base class?

            // p1 does NOT get boxed to call CompareTo.
            // p2 does NOT get boxed because CompareTo(Point) is called.
            Console.WriteLine(p1.CompareTo(p2)); // "-1"

            // p1 DOES get boxed, and the reference is placed in c.
            IComparable c = p1;
            Console.WriteLine(c.GetType()); // "Point"

            // p1 does NOT get boxed to call CompareTo.
            // Because CompareTo is not being passed a Point variable,
            // CompareTo(Object) is called, which requires a reference to
            // a boxed Point.
            // c does NOT get boxed because it already refers to a boxed Point.
            Console.WriteLine(p1.CompareTo(c)); // "0"

            // c does NOT get boxed because it already refers to a boxed Point.
            // p2 does get boxed because CompareTo(Object) is called.
            Console.WriteLine(c.CompareTo(p2)); // "-1"

            // c is unboxed, and fields are copied into p2.
            p2 = (Point) c;

            // Proves that the fields got copied into p2.
            Console.WriteLine(p2.ToString()); // "(10, 10)"
        }
    }
Normally, to call a virtual method, the CLR needs to determine the object's type in order to locate the type's method table. However, for a value type, the JIT compiler sees the overridden `ToString` method and calls it directly. The compiler knows that polymorphism can't come into play. Note that if `Point`'s `ToString` method internally calls `base.ToString()`, then the value type instance would be boxed when calling `System.ValueType`'s `ToString` method.

In the call to the nonvirtual `GetType` method, `p1` does have to be boxed. The reason is that the `Point` type inherits `GetType` from `System.Object`. So to call `GetType`, the CLR must use a pointer to a type object, which can be obtained only by boxing `p1`.

Casting to `IComparable`: when casting `p1` to a variable (`c`) that is of an interface type, `p1` must be boxed because interfaces are reference types by definition.
I realize that all of this information about reference types, value types, and boxing might be overwhelming at first. However, a solid understanding of these concepts is critical to any .NET Framework developer's long-term success. Trust me: having a solid grasp of these concepts will allow you to build efficient applications faster and easier.
Changing Fields in a Boxed Value Type by Using Interfaces (and Why You Shouldn’t Do This)
Some languages, such as C++/CLI, let you change the fields in a boxed value type, but C# does not. However, you can fool C# into allowing this by using an interface. The following code is a modified version of the previous code.

    using System;

    // Interface defining a Change method
    internal interface IChangeBoxedPoint {
        void Change(Int32 x, Int32 y);
    }

    // Point is a value type.
    internal struct Point : IChangeBoxedPoint {
        private Int32 m_x, m_y;

        public Point(Int32 x, Int32 y) {
            m_x = x;
            m_y = y;
        }

        public void Change(Int32 x, Int32 y) {
            m_x = x;
            m_y = y;
        }

        public override String ToString() {
            return String.Format("({0}, {1})", m_x.ToString(), m_y.ToString());
        }
    }

    public sealed class Program {
        public static void Main() {
            Point p = new Point(1, 1);
            Console.WriteLine(p); // "(1, 1)"

            p.Change(2, 2);
            Console.WriteLine(p); // "(2, 2)"

            Object o = p;
            Console.WriteLine(o); // "(2, 2)"

            // Unboxes o into a temporary copy, changes the copy, and discards it
            ((Point) o).Change(3, 3);
            Console.WriteLine(o); // "(2, 2)"

            // Boxes p, changes the boxed object, and discards it
            ((IChangeBoxedPoint) p).Change(4, 4);
            Console.WriteLine(p); // "(2, 2)"

            // Changes the boxed object and shows it; here "this" refers to
            // the boxed instance in the managed heap
            ((IChangeBoxedPoint) o).Change(5, 5);
            Console.WriteLine(o); // "(5, 5)"
        }
    }
In the last example, the boxed `Point` referred to by `o` is cast to an `IChangeBoxedPoint`. No boxing is necessary here because `o` is already a boxed `Point`. Then `Change` is called, which does change the boxed `Point`'s `m_x` and `m_y` fields.
Object Equality and Identity
This section is about how to define a type that properly implements object equality. Keep in mind that equality is about value/content, while identity is about whether two references refer to the same object.

The implementation of `Object`'s `Equals` method looks like this.

    public class Object {
        public virtual Boolean Equals(Object obj) {
            // If both references point to the same object,
            // they must have the same value.
            if (this == obj) return true;

            // Assume that the objects do not have the same value.
            return false;
        }
    }
However, the default implementation of `Object`'s `Equals` method really **implements identity, not value equality**.

Here is how to properly implement an `Equals` method internally:
- If the `obj` argument is `null`, return `false`.
- If the `this` and `obj` arguments refer to the same object, return `true`. Identity implies equality, and this shortcut can improve performance when comparing objects with many fields.
- If the `this` and `obj` arguments refer to objects of different types, return `false`.
- For each instance field defined by the type, compare the value in the `this` object with the value in the `obj` object. If any fields are not equal, return `false`.
- Call the base class's `Equals` method so it can compare any fields defined by it. However, if the base class is `Object`, then don't call its `Equals` because it tests identity, so the comparison would fail for two distinct objects even when their contents are the same.
Because a type can override `Object`'s `Equals` method, this `Equals` method can no longer be called to test for identity. To fix this, `Object` offers a static `ReferenceEquals` method, which is implemented like this.

    public class Object {
        public static Boolean ReferenceEquals(Object objA, Object objB) {
            return (objA == objB);
        }
    }
You should always call `ReferenceEquals` if you want to check for identity. You shouldn't use the C# `==` operator (unless you cast both operands to `Object` first) because one of the operands' types could overload the `==` operator, giving it semantics other than identity.

BTW, `System.ValueType` (the base class of all value types) does override `Object`'s `Equals` method and is correctly implemented to perform a value equality check (not an identity check). Internally, `ValueType`'s `Equals` method uses reflection (covered in Chapter 23, "Assembly Loading and Reflection") to compare each field by calling that field's `Equals` method. Because the CLR's reflection mechanism is slow, when defining your own value type, you should override `Equals` and provide your own implementation to improve the performance of value equality comparisons that use instances of your type. Of course, in your own implementation, do not call `base.Equals`.

Properties of equality: an `Equals` implementation must be reflexive, symmetric, transitive, and consistent.
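`String` is a convenient type for seeing the difference between identity and equality, because it overloads `==` to compare contents. A minimal sketch, where `IdentityDemo.CopyOf` is a made-up helper that builds a second object with the same characters:

```csharp
using System;

public static class IdentityDemo
{
    // Builds a new String object containing the same characters as s.
    public static String CopyOf(String s)
    {
        return new String(s.ToCharArray());
    }

    public static void Main()
    {
        String a = "hello";
        String b = CopyOf(a);

        Console.WriteLine(a == b);                       // True: String overloads ==
        Console.WriteLine(a.Equals(b));                  // True: value equality
        Console.WriteLine(Object.ReferenceEquals(a, b)); // False: two distinct objects
        Console.WriteLine((Object) a == (Object) b);     // False: identity after casting
    }
}
```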
A couple more things to do:
- Have the type implement the `System.IEquatable<T>` interface's `Equals` method. This generic interface allows you to define a type-safe `Equals` method. Usually, you'll implement the `Equals` method that takes an `Object` parameter to internally call the type-safe `Equals` method.
- Overload the `==` and `!=` operator methods. Usually, you'll implement these operator methods to internally call the type-safe `Equals` method.
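Putting the equality advice together for a value type might look like the sketch below. `Money` is a hypothetical type invented for this note, and the hash combination (multiply by 397, then XOR) is just one common choice, not something the book prescribes:

```csharp
using System;

// Hypothetical immutable value type used only for illustration.
public struct Money : IEquatable<Money>
{
    private readonly Int64 m_cents;
    private readonly String m_currency;

    public Money(Int64 cents, String currency)
    {
        m_cents = cents;
        m_currency = currency;
    }

    // Type-safe Equals: compares each field directly, no boxing or reflection.
    public Boolean Equals(Money other)
    {
        return m_cents == other.m_cents
            && String.Equals(m_currency, other.m_currency);
    }

    // Object's Equals forwards to the type-safe overload (base.Equals is not called).
    public override Boolean Equals(Object obj)
    {
        return obj is Money && Equals((Money) obj);
    }

    // Equal values must produce equal hash codes.
    public override Int32 GetHashCode()
    {
        Int32 currencyHash = m_currency == null ? 0 : m_currency.GetHashCode();
        return unchecked(m_cents.GetHashCode() * 397 ^ currencyHash);
    }

    public static Boolean operator ==(Money left, Money right) { return left.Equals(right); }
    public static Boolean operator !=(Money left, Money right) { return !left.Equals(right); }
}

public static class EqualityDemo
{
    public static void Main()
    {
        Money a = new Money(150, "USD");
        Money b = new Money(150, "USD");
        Money c = new Money(150, "EUR");

        Console.WriteLine(a == b);                             // True
        Console.WriteLine(a != c);                             // True
        Console.WriteLine(a.Equals((Object) b));               // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```

The `unchecked` wrapper is deliberate: hash calculation is one of the rare cases where arithmetic overflow is acceptable.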
Object Hash Codes
`System.Object` provides a virtual `GetHashCode` method so that an `Int32` hash code can be obtained for any and all objects, and types can override it to customize the calculation.
- If you define a type and override the `Equals` method, you should also override the `GetHashCode` method, because some collection classes require that any two objects that are equal have the same hash code value.
- When we add a key/value pair to a hash table, a hash code for the key object is obtained first and used to decide which "bucket" the pair is stored in. So we must not change a key object while it is in the collection; otherwise the collection can no longer find the pair, and the pair can no longer be removed, which leads to a memory leak. To change a key, fetch the pair, remove the original pair, modify the key, and add the new pair back into the hash table.
- When selecting an algorithm for calculating hash codes for instances of your type, try to follow these guidelines:
  - Use an algorithm that gives a good random distribution, for the best performance of the hash table.
  - You can call the base type's `GetHashCode` method in your algorithm, but generally not that of `Object` or `ValueType`, because neither implementation lends itself to high-performance hashing.
  - Use at least one instance field. Ideally, the fields used in your algorithm should be immutable.
  - Execute as quickly as possible.
  - Objects with the same value should return the same code.
- Important: never, ever persist hash code values, because hash code values are subject to change.
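The warning about changing a key while it sits in a hash table can be reproduced with `Dictionary<TKey, TValue>`. In this sketch, `MutableKey` is a deliberately badly designed hypothetical type whose hash code depends on a mutable field:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: its hash code depends on a field that can change.
public sealed class MutableKey
{
    public Int32 Id;

    public MutableKey(Int32 id) { Id = id; }

    public override Boolean Equals(Object obj)
    {
        MutableKey other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    public override Int32 GetHashCode() { return Id; }
}

public static class HashKeyDemo
{
    // Mutating the key in place: the entry stays filed under the old hash code.
    public static Boolean FoundAfterInPlaceChange()
    {
        Dictionary<MutableKey, String> map = new Dictionary<MutableKey, String>();
        MutableKey key = new MutableKey(1);
        map.Add(key, "value");
        key.Id = 2;
        return map.ContainsKey(key);
    }

    // Remove, modify, re-add: the entry is filed under the new hash code.
    public static Boolean FoundAfterRemoveAndReAdd()
    {
        Dictionary<MutableKey, String> map = new Dictionary<MutableKey, String>();
        MutableKey key = new MutableKey(1);
        map.Add(key, "value");
        String value = map[key];
        map.Remove(key);
        key.Id = 2;
        map.Add(key, value);
        return map.ContainsKey(key);
    }

    public static void Main()
    {
        Console.WriteLine(FoundAfterInPlaceChange());  // False
        Console.WriteLine(FoundAfterRemoveAndReAdd()); // True
    }
}
```

Making the fields that feed `GetHashCode` immutable, as the guidelines suggest, rules out the first situation entirely.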
The dynamic Primitive Type
Summarizing this part got tedious, so this is a placeholder for now; I'll fill it in when I'm in the mood.