The difference between char, wchar_t, and TCHAR

Source: Internet. Editor: 程序博客网. Date: 2024/05/17 00:56

The difference here is the character type. When calling WinAPI functions, the type of string you pass must match the type of string it expects.

It's a little confusing, but simple once you understand it:

On Windows:
- char is 8 bits
- wchar_t is 16 bits
- TCHAR is an alias for either char or wchar_t, depending on your Unicode settings (that is, on whether UNICODE / _UNICODE is defined).
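To make the third bullet concrete, here is a minimal sketch of a type alias that switches on a preprocessor symbol. MyTChar is a hypothetical stand-in, not the real TCHAR from the Windows headers, and the 16-bit check is only compiled on Windows because other platforms often use a 32-bit wchar_t.

```cpp
#include <type_traits>

// Hypothetical stand-in for TCHAR; the real alias comes from the Windows headers.
#ifdef UNICODE
typedef wchar_t MyTChar;   // Unicode build: wide characters
#else
typedef char MyTChar;      // multi-byte build: narrow characters
#endif

// sizeof(char) is 1 by definition.
static_assert(sizeof(char) == 1, "char is one byte");

#ifdef _WIN32
// On Windows wchar_t is 16 bits; elsewhere it is often 32 bits.
static_assert(sizeof(wchar_t) == 2, "wchar_t is 16 bits on Windows");
#endif

// Whatever the setting, the alias resolves to one of the two types, never a third.
constexpr bool kAliasIsCharOrWchar =
    std::is_same<MyTChar, char>::value || std::is_same<MyTChar, wchar_t>::value;
static_assert(kAliasIsCharOrWchar, "alias must be char or wchar_t");
```

Compiling the same file with and without -DUNICODE (or /DUNICODE) flips the alias without touching the source.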

That said:
- MessageBox takes TCHAR strings (LPCTSTR)
- MessageBoxA takes char strings (LPCSTR)
- MessageBoxW takes wchar_t strings (LPCWSTR)
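The three names above can be pictured as two real functions plus one macro that picks between them. The sketch below uses hypothetical stand-ins (ShowTextA, ShowTextW, ShowText) rather than the real WinAPI so that it compiles anywhere; each stand-in simply reports the size of the character type it accepts.

```cpp
#include <cstddef>

// Hypothetical stand-ins for MessageBoxA / MessageBoxW. Each one accepts exactly
// one character type and reports the size of that type in bytes.
constexpr std::size_t ShowTextA(const char*)    { return sizeof(char); }
constexpr std::size_t ShowTextW(const wchar_t*) { return sizeof(wchar_t); }

// The generic name is only a macro that expands to one of the two functions.
#ifdef UNICODE
#define ShowText ShowTextW
#else
#define ShowText ShowTextA
#endif

static_assert(ShowTextA("GOOD") == 1, "narrow version takes char strings");
static_assert(ShowTextW(L"GOOD") == sizeof(wchar_t), "wide version takes wchar_t strings");
// ShowTextA(L"GOOD") or ShowTextW("GOOD") would not compile: the pointer types differ.
```

With this layout, a call through the generic name only compiles if the literal's type matches whichever function the macro currently expands to.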

Therefore if you're using the MessageBox function, you must give it TCHARs. If you're using MessageBoxA, you must give it chars, etc.

When using string literals:

"GOOD"      // <- this is a char string
L"GOOD"     // <- this is a wchar_t string
_T("GOOD")  // <- this is a TCHAR string
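A rough sketch of how the literal types can be checked at compile time. MY_T is a hypothetical stand-in for _T (the real macro lives in <tchar.h>); it either pastes an L prefix onto the literal or leaves it alone.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for _T(); assumed to behave like the real macro.
#ifdef UNICODE
#define MY_T(x) L##x   // pastes an L prefix onto the literal
#else
#define MY_T(x) x      // leaves the literal unchanged
#endif

// A plain literal is an array of const char (4 letters + terminator).
static_assert(std::is_same<decltype("GOOD"), const char (&)[5]>::value,
              "narrow literal is a char array");
// An L-prefixed literal is an array of const wchar_t.
static_assert(std::is_same<decltype(L"GOOD"), const wchar_t (&)[5]>::value,
              "wide literal is a wchar_t array");

// The generic literal has whichever element size the build selected.
constexpr std::size_t kGenericCharSize = sizeof(MY_T("GOOD")[0]);
static_assert(kGenericCharSize == sizeof(char) || kGenericCharSize == sizeof(wchar_t),
              "generic literal is narrow or wide");
```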

Therefore:

MessageBox(NULL,"GOOD","NOTE",MB_OK);

This fails to compile in a Unicode build because you're passing char strings to a function that takes TCHARs (which are wchar_t in that build).

All of the below would work:

MessageBox(NULL,_T("GOOD"),_T("NOTE"),MB_OK);   // TCHAR string to TCHAR function - OK
MessageBoxA(NULL,"GOOD","NOTE",MB_OK);          // char string to char function - OK
MessageBoxW(NULL,L"GOOD",L"NOTE",MB_OK);        // wchar_t string to wchar_t function - OK

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I believe the real confusion lies in another workaround for this problem:
i.e. setting the "Character Set" field (Project -> Properties -> Configuration Properties) in Visual Studio.

That setting shouldn't matter. Properly written code will compile regardless of what that setting is set to.

The only reason that works is that TCHAR isn't really typesafe: it's just an alias that the preprocessor switches between char and wchar_t. Ideally, you would still get the error even after trying that "workaround".

If you set the project to use the multi-byte character set, it seems that char strings are automatically treated as TCHAR strings when necessary.

Sort of. Changing that setting just makes TCHAR be defined as char, so char and TCHAR become interchangeable. However, you should not rely on that, as it makes your program dependent on that setting. It's best to just use the functions correctly as I outlined in my previous post.

If MessageBox() is expecting a pointer to Unicode characters
AND
"GOOD" is actually represented in Unicode,
the compiler should NOT give me an error.

This is confusing you because you're thinking about it the wrong way.

Unicode is just a character encoding. char and wchar_t strings can both represent Unicode (in UTF-8 and UTF-16 respectively). So the term "Unicode" is meaningless in this context.

Yes, "GOOD" is a valid Unicode string (UTF-8), but it's a char string and therefore does not work with MessageBox, which is a TCHAR function. Whether or not it's Unicode doesn't really matter... what matters is the character type.
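To illustrate that the encoding and the character type are separate things, here is a small sketch that stores the same code point (U+00E9) as UTF-8 and as UTF-16. char16_t is used as a portable stand-in for the 16-bit wchar_t on Windows; that substitution is an assumption made so the example compiles on any platform.

```cpp
#include <cstddef>

// U+00E9 (e with an acute accent), once per encoding.
// Code units are counted without the terminating null.
constexpr std::size_t kUtf8Units  = sizeof(u8"\u00e9") - 1;                   // 8-bit units
constexpr std::size_t kUtf16Units = sizeof(u"\u00e9") / sizeof(char16_t) - 1; // 16-bit units

static_assert(kUtf8Units == 2,  "UTF-8 needs two 8-bit units for U+00E9");
static_assert(kUtf16Units == 1, "UTF-16 needs one 16-bit unit for U+00E9");
```

Both literals are Unicode, yet they have different element types, and the element type is the only thing the compiler checks when you pass a string to a function.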

Do you imply that the string "GOOD" takes one byte per character (i.e. char) regardless of my project setting?

Yes.

"GOOD" is a char string and therefore is always 1 byte per character.
L"GOOD" is a wchar_t string and therefore is always 2 bytes per character (on Windows).
_T("GOOD") is a TCHAR string and can be either 1 or 2 bytes per character depending on the settings.
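The first two lines can be checked directly with sizeof, which does not look at the project setting at all. This is only a sketch; the 2-byte check is guarded with _WIN32 because wchar_t is wider on many other platforms.

```cpp
#include <cstddef>

// Each literal holds 4 letters plus a terminating null, so 5 elements.
constexpr std::size_t kNarrowBytesPerChar = sizeof("GOOD") / 5;
constexpr std::size_t kWideBytesPerChar   = sizeof(L"GOOD") / 5;

static_assert(kNarrowBytesPerChar == 1, "char literal: 1 byte per character");
static_assert(kWideBytesPerChar == sizeof(wchar_t), "wide literal: one wchar_t per character");

#ifdef _WIN32
static_assert(kWideBytesPerChar == 2, "on Windows a wchar_t is 2 bytes");
#endif
```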

If that's the case, then what is the "unicode/multi-byte" project setting for?

Honestly I don't know what good it's for. I pretty much just always set it to Unicode because I have little reason not to.

0 0