OS/2 codes: How to make 'char' type of gcc + kLIBC compatible with IBM Visual Age C/C++ and Open Watcom C/C++
When you compile sources written for IBM Visual Age C/C++ (VAC below) or Open Watcom C/C++ (OW below) with gcc/g++ + kLIBC (gcc below), there is one thing to watch out for: the char type.
char is a primitive type that holds a character in C/C++, and it is usually 1 byte in size. When it holds an integer, whether plain char behaves as signed char or unsigned char depends on the implementation.
VAC and OW treat char as unsigned char by default, so its range is 0 to 255.
On the other hand, gcc treats char as signed char by default, so its range is -128 to 127.
This difference usually causes no problems. However, problems may occur when bit-wise operations are used or DBCS characters are manipulated.
For example, if you shift a char variable holding 128 (10000000 in binary) right by 1, you would expect 64 (01000000b). But you may get 64 (01000000b, VAC and OW) or -64 (11000000b, gcc). This is because gcc's right shift preserves the MSB (sign bit) of a signed type.
Try building and running the following code with gcc, VAC and OW.
```c
/** @file chshift.c */

#include <stdio.h>

int main( void )
{
    char ch = 1 << 7;

    printf("ch = %d, ch >> 1 = %d\n", ch, ch >> 1 );

    return 0;
}
```
The EMX developers were already aware of this issue, so they introduced the OS2EMX_PLAIN_CHAR macro to avoid it. Unless it is defined, some OS/2 types such as BYTE, PCH, PSZ, PCCH and PCSZ are typedef'ed in terms of unsigned char. Otherwise, they are typedef'ed in terms of char.
On OS/2, BYTE is generally used to hold an unsigned char value instead of writing unsigned char directly. Therefore this was a good strategy.
However, this approach has a problem. In C++, g++ generates many 'invalid conversion' errors when string literals are used. The same kind of error occurs when PBYTE, PCH, PSZ, PCCH or PCSZ variables are passed to APIs that take PCHAR arguments, and vice versa.
This is because string literals and PCHAR are ( const ) char *, whereas the others are ( const ) unsigned char * unless OS2EMX_PLAIN_CHAR is defined.
Build and run the following code.
```cpp
/** @file strliterals.cpp */

#include <os2.h>

#include <iostream>

int main( void )
{
    PCSZ psz = "Hello, world!!!";

    std::cout << psz << "\n";

    return 0;
}
```
To compile the above source with g++, define OS2EMX_PLAIN_CHAR like this:
g++ strliterals.cpp -DOS2EMX_PLAIN_CHAR
However, with OS2EMX_PLAIN_CHAR we are back where we started: the difference in the char type between gcc and VAC/OW. Bit-wise operations on BYTE, especially right shifts, may produce different values with gcc and with VAC/OW.
/// ----- 2016/11/24
In addition, sign extension is another major cause of problems. Here is an example.
```c
/** @file signext.c */

#define OS2EMX_PLAIN_CHAR

#define INCL_DOS
#include <os2.h>

#include <stdio.h>

int main( void )
{
    BYTE b = 0x80;

    printf("b = %x\n", b );

    return 0;
}
```
What do you expect? 0x80? No, it is 0xFFFFFF80 (printed as ffffff80) because of sign extension.
OS2EMX_PLAIN_CHAR makes BYTE be defined as char rather than unsigned char, and gcc treats char as signed char by default. As a result, the BYTE value 0x80 becomes the ULONG value 0xFFFFFF80 because of its sign.
The workaround is to apply a bit-and operation such as b & 0xFF to keep only the low 8 bits. However, this is rather ugly because it has to be done for every BYTE variable.
/// -----
After all, OS2EMX_PLAIN_CHAR alone is not enough. What is needed? You should use -funsigned-char. This option makes gcc treat char as unsigned char by default.
Therefore, if you want to build sources written for VAC or OW with gcc, you should define OS2EMX_PLAIN_CHAR and pass the -funsigned-char option to gcc.
This is true for general OS/2 programs, too. So if you want to develop OS/2 native programs, define OS2EMX_PLAIN_CHAR and pass the -funsigned-char option to gcc.