Opaque Pointers——不透明指针

来源:互联网 发布:电脑为什么没有端口 编辑:程序博客网 时间:2024/05/22 03:38
 

Programming Tools - Opaque Pointers

by Chris McKillop

One of the most powerful concepts when writing software is abstraction - hiding the details of a system behind a simpler interface. This article shows you how to use a language feature of C (and C++) to provide a very powerful form of abstraction for use in libraries: opaque pointers.

C/C++ has an interesting language feature. If you declare a typedef of a structure pointer, you don't need to provide that structure's definition. For example:

typedef struct _hidden_struct *handle;

This has declared a new type, handle, which is a pointer to a struct _hidden_struct. What's interesting about this? Mainly the fact that you can now use this handle type without ever having a definition of the structure it's a pointer to. Why is this powerful? That's what this article will show you (some people might already see why). Please note that the typedef above (and those used throught this article) aren't strictly needed to hide the structure's internal definition - you can just keep using the struct keyword as part of the name. However, the typedefs are used to make an abstraction away from the fact that there's a structure at all, instead of turning it into a "handle."

Sample problem

To really see the power of opaque pointers, we'll need a problem to solve. Let's suppose we want to make an image library that loads and saves bitmaps. We know that as time goes on this library will need to be able to do more things (new image types, basic transforms) and we want to preserve both compile time and runtime compatibility for this library. This means that if we have an older application, it should work with a new shared library and that if that older application is rebuilt against the new library and headers, it should still build without errors.

What about C++?

Before this goes much further, I need to address the waving hands of all the C++ fans out there, who might be thinking: "C++ already lets me do all of this behind a nice class interface with inheritance and other nice C++ language features". And this is pretty much true in the case of compile-time compatibility. But, because C++ doesn't let you separate the public and private definitions of a class, the way the class is declared changes its runtime behavior (class size, vtable offsets, etc.) - you can't provide runtime compatibility without a lot of tender care (and sometimes special compilers). So, this is still useful to C++ people, even if the slant is more towards those of us using C.

Okay, back to the library. When people design libraries like this, they'll often declare a structure that will get filled in by the library and an API for manipulating this structure. For example:

typedef struct

{

void *ptr;

int size;

int bytes_per_pixel;

} bitmap_t;

int bitmap_load_file( char *filename, bitmap_t *bitmap );

int bitmap_save_file( char *filename, bitmap_t *bitmap );

So, to use this library, you would declare a bitmap_t variable and invoke bitmap_load_file() with a filename and a pointer to the bitmap_t that was declared. For example:

bitmap_t bitmap;

int ret;

ret = bitmap_load_file( "sample.bmp", &bitmap );

Then the user of the library could simply access the pointer inside the structure and, along with the size and bytes_per_pixel members, go to work on the image. However, because the contents of bitmap_t are known publicly, if the library is updated with new entries in that structure, then the size of the structure will have been changed. And applications that use an updated shared library will probably crash or Other Bad Things when they call into the new library with a structure of a smaller (different) size.

Alternatively, the API could be defined to take a structure size:

int bitmap_load_file( char *filename, bitmap_t *bitmap, int size );

int bitmap_save_file( char *filename, bitmap_t *bitmap, int size );

and would be used like this:

ret = bitmap_load_file( "sample.bmp", &bitmap, sizeof( bitmap_t ) );

Which effectively tags the version of the library by using the size of the structure. However, this makes for a terrible support nightmare inside the library where the structure size has to be checked and different code paths are taken based on this structure size. This leads to a lot of code bloat.

We could also try to avoid the structure size issue by padding the structure with some "reserved" space:

typedef struct

{

void *ptr;

int size;

int bytes_per_pixel;

char reserved[64];

} bitmap_t;

But this is only a temporary fix. What if we didn't choose a value that's large enough? Then we're back to the case where a new library causes problems. If we're liberal with our reserved space, then we waste memory. (Since you're reading this on a QNX web page, I'm guessing that wasting memory doesn't sit well with you either.)

Another problem common to all of these approaches is what can occur if the layout of the structure changes. Say a new version of the library is built with a structure definition that looks like this:

typedef struct

{

int version;

void *ptr;

int size;

int bytes_per_pixel;

} bitmap_t;

Then the compiled apps will also get "confused", since what was previously the structure member ptr is now version, and so on. The position of a structure member within a structure is important. Seems pretty much impossible to meet our goals, huh? Fear not, loyal readers, the situation isn't that dire!

Hiding the structure's internals

The common thread to all the situations above was that the compiled application was aware of the size of the structure and the location in the structure of the structure members. So we need to hide the internals of the structure and provide access functions to get the important data out of the structure.

Let's try out a new public interface:

typedef struct _internal_bitmap * bitmap_t;

int bitmap_alloc( bitmap_t *bitmap );

int bitmap_free( bitmap_t *bitmap );

int bitmap_load_file( bitmap_t bitmap, char *filename );

int bitmap_save_file( bitmap_t bitmap, char *filename );

int bitmap_get_ptr( bitmap_t bitmap, void **ptr );

int bitmap_get_size( bitmap_t bitmap, int *size );

int bitmap_get_bpp( bitmap_t bitmap, int *bpp );

And now we can maintain a private interface that applications never get to see or use, only the library:

struct _internal_bitmap

{

void *ptr;

int size;

int bytes_per_pixel;

}

Did you notice the opaque pointer? Also, notice we've added "access functions" to get the interesting data from the new bitmap "handle"? It's pretty obvious now that we're passing in a bitmap_t (a structure pointer) as a handle to the library, but the alloc and free functions are a little confusing for people.

When we declare a bitmap_t, we're really just declaring a pointer to a structure, so we need to provide some memory for that pointer to point at. Here are the guts of the bitmap_alloc() function:

int bitmap_alloc( bitmap_t *bitmap )

{

struct _internal_bitmap *handle;

handle = ( struct _internal_bitmap * )malloc( sizeof( *handle ) );

if( handle == NULL )

{

return -1;

}

memset( handle, 0, sizeof( *handle ) );

*bitmap = handle;

return 0;

}

Since a bitmap_t is just a struct pointer, we allocate the proper sized struct (which we can do - this code is part of the library and it knows how big the structure is). Once we've verified that the malloc() didn't fail, we assign the newly allocated structure to the bitmap_t pointer. So when the application calls this function, it will get back the proper sized structure to pass into the rest of the library functions.

Here's an example of an "access function" that uses the allocated bitmap handle:

int bitmap_get_ptr( bitmap_t bitmap, void **ptr )

{

if( ptr == NULL )

{

return -1;

}

*ptr = bitmap->ptr;

return 0;

}

Since the library knows the definition of the _internal_bitmap structure, it can directly access its members. If you tried to access the internals of the bitmap_t handle in application code, the compiler would return an error, because it has no idea how the structure is organized or what any of its members are named.

For the last bit of code, I'll write a function that loads a bitmap, sets all the pixels to 255, and writes the bitmap back:

int turn_bitmap_white( char *filename )

{

int ret, i;

bitmap_t bitmap;

unsigned char *ptr;

int size;

ret = bitmap_alloc( &bitmap );

if( ret )

return ret;

ret = bitmap_load_file( bitmap, filename );

if( ret )

return ret;

ret = bitmap_get_ptr( bitmap, (void **)&ptr );

ret |= bitmap_get_size( bitmap, &size );

if( ret )

return ret;

for( i=0; i<size; i++ )

{

ptr[i] = 255;

}

ret = bitmap_save_file( bitmap, filename );

if( ret )

return ret;

bitmap_free( &bitmap );

return 0;

}

Problem solved

So, we've solved our problem. If we change the structure layout, we'll be okay, since the application code can't access the internals of the structure directly and must use "access functions" to get at the internal data.

If we change the size of the structure, we'll be okay, since the library itself allocates the memory for the structure and knows the proper size of the structure to allocate for the given version of the library. You can now replace the library (in shared library form) without having to rebuild any applications. And you can rebuild applications against the library without change. Now, this does assume that you haven't changed the API to your library - opaque pointers are a powerful tool, but they can't perform magic.

Lastly, the use of opaque pointers enforces good programming practice by providing a defined interface (abstraction) between the application and the library. This is usually a good method even when doing your own projects, since it lets you easily separate functionality for future projects!

 

 

http://blog.chinaunix.net/space.php?uid=13701930&do=blog&id=336480

蓝色部分为自己增加内容

不透明类型

不透明数据类型隐藏了它们内部格式或结构。在C语言中,它们就像黑盒一样。支持它们的语言不是很多。作为替代,开发者们利用typedef声明一个类型,把它叫做不透明类型,希望其他人别去把它重新转化回对应的那个标准C类型。通常开发者们在定义一套特别的接口时才会用到它们。比如说用来保存进程标识符的pid_t类型。该类型的实际长度被隐藏起来了——尽管任何人都可以偷偷撩开它的面纱,发现它就是一个int。如果所有代码都不显式的利用它的长度(显式利用长度这里指直接使用int类型的长度,比如说在编程时使用sizeof(int)而不是sizeof(pid_t)),那么改变时就不会引起什么争议,这种改变确实可能会出现:在老版本的Unix系统中,pid_t的定义是short类型。

       另外一个不透明数据类型的例子是atomic_t。在第八章“内核同步方法”中介绍过,它放置的是一个可以进行原子操作的整型值。尽管这种类型就是一个int,但利用不透明类型可以帮助确保这些数据只在特殊的有关原子操作的函数中才会被使用。不透明类型还帮助我们隐藏了类型的长度,该类型也并不总是完整的32位。

内核还用到了其它一些不透明类型,包括dev_t,gid_t和uid_t等等。处理不透明类型时的原则是:

¨         不要假设该类型的长度。

¨         不要将该类型转化回其对应的C标准类型使用。

¨         编程时要保证在该类型实际存储空间和格式发生变化时代码不受影响。

===================================================================
 
问:glib里_GMutex类型是在哪里定义的呀? 找了一圈没看到。

只看到个typedef sturct _GMutex GMutex;// glib/gthread.h P76  glib-2.29.10
 
答:The GMutex struct is an opaque data structure.
 
Opaque data types do not reveal their internal format or structure. They are about as "black box" as you can get in C. There is not a lot of language support for them. Instead, developers declare a typedef, call it an opaque type, and hope no one typecasts it back to a standard C type. All use is generally through a special set of interfaces that the developer creates.
 
See 《Linux_Kernel_Development_Second_Edition》or
Google "Opaque data type"
 
* Sometimes OPAQUE means "this function doesn't return anything at all",
similarly to "returns void" in C and some other languages.

* Sometimes OPAQUE means "this value can be any datatype at all" (eg,
input of the count() aggregate).

* Sometimes OPAQUE means "I know exactly what this should be, but it's
an internal datatype with no SQL-level equivalent". This is usually
what's at stake with a trigger function.

* Sometimes OPAQUE means "I know exactly what this should be, but I
haven't declared it yet" (ie, a forward reference). This is needed
to declare I/O functions of user-defined types, since the system
requires the functions to already exist when the type is declared.
 
如果声明了_GMutex,那么GMutex就变成了普通的结构,如果不声明他就是一个不透明类型。

再说,mutex类型是内核对象,必须定义在内核中,不依赖于系统是很难实现的。

不透明类型是一种灵活的类型,他的大小是未知的。所以一般只能定义GMutex *变量,不能定义GMutex对象。然后通过一套接口来操作GMutex,就是
代码:

struct _GThreadFunctions
{
  GMutex*  (*mutex_new)           (void);

  GMutex*  (*mutex_lock)           (void);
  GMutex*  (*mutex_trylock)           (void);
  GMutex*  (*mutex_unlock)           (void); 

  GMutex*  (*mutex_free)           (void); 
 /* .... */
};

 下面是win32中对mutex_new 的实现
  static GMutex * g_mutex_new_win32_cs_impl(void)
  {
        CRITICAL_SECTION * cs = g_new(CRITICAL_SECTION ,1);
        gpointer * retval = g_new(gpointer, 1);
         InitializeCriticalSection(cs);
         *retval = cs;
         return (GMutex *)retval;
  }
  // GMutex隐藏了实际具体平台相关的结构,相当于对用户统一表示,GMutex只是作为一种标示来使用,用户是不可见的。
这有点像C++的interface, 那一堆函数指针就是vtable。这张表要进行初始化,间接让他们指向操作系统提供的API。
所以就没有定义_GMutex必要,因为这份数据已经定义在系统核心中,我们只需要一个叫GMutex的名字来引用这份数据。