This chapter covers the following topics:
The formal Watcom C/C++ compiler command-line syntax is shown below:
The square brackets [ ] denote items that are optional.
If a path isn't specified, the current working directory is assumed. If the file isn't in the current directory, an adjacent C directory (that is, ../c) is searched.
If no file extension is included in the specified name, the default file extension is .occ. A search of the current directory is made. If not successful, an adjacent OCC directory (that is, ../occ) is searched, if it exists.
You can use an environment variable to specify commonly used compiler options, as follows:
| Command | Environment Variable |
|---|---|
| wcc | WCC |
| wpp | WPP |
| wcc386 | WCC386 |
| wpp386 | WPP386 |
These options are processed before options specified on the command line.
For example:
    export "WCC=-d1 -ot"
    export "WPP=-d1 -ot"
    export "WCC386=-d1 -ot"
    export "WPP386=-d1 -ot"
The above examples define the default options to be d1 (include line number debugging information in the object file), and ot (favor time optimizations over size optimizations).
Once a particular environment variable has been defined, those options listed become the default each time the associated compiler is used. The compiler command line can be used to override any options specified in the environment string.
This section gives some examples of using Watcom C/C++ to compile C/C++ source programs.
    wcc report -d1 -s
    wpp report -d1 -s

    wcc -mm -fpc calc
    wpp -mm -fpc calc

    wcc kwikdraw -2 -fpi87 -oaxt
    wpp kwikdraw -2 -fpi87 -oaxt

    wcc386 -mf -3s calc
    wpp386 -mf -3s calc

    wcc386 kwikdraw -3r -fpi87 -oaimxt
    wpp386 kwikdraw -3r -fpi87 -oaimxt

    wcc ../source/modabs -d2
    wpp ../source/modabs -d2
    wcc386 ../source/modabs -d2
    wpp386 ../source/modabs -d2

    export "WCC=-i=/includes -mc"
    export "WPP=-i=/includes -mc"
    export "WCC386=-i=/includes -mf"
    export "WPP386=-i=/includes -mf"
    wcc /cprogs/grep.tst -fi=iomods.c
    wpp /cprogs/grep.tst -fi=iomods.c
    wcc386 /cprogs/grep.tst -fi=iomods.c
    wpp386 /cprogs/grep.tst -fi=iomods.c

    wcc grep -fo=../obj/ -mm
    wpp grep -fo=../obj/ -mm
    wcc386 grep -fo=../obj/ -mf
    wpp386 grep -fo=../obj/ -mf

    wcc -dDBG=1 grep -fo=../obj/.dbo -mc
    wpp -dDBG=1 grep -fo=../obj/.dbo -mc
    wcc386 -dDBG=1 grep -fo=../obj/.dbo -mf
    wpp386 -dDBG=1 grep -fo=../obj/.dbo -mf

    wcc -g=GKS -s /gks/gopks
    wpp -g=GKS -s /gks/gopks
This example assumes that this file contains the definition of the routine gopengks(), as follows:
void far gopengks( int workstation, long int h ) { . . . }
For a small code model, the routine gopengks() must be defined in this file as far, since it's placed in another group. The s option is also specified to prevent a runtime call to the stack overflow check routine that's placed in a different code segment at link time. Since the gopengks() routine appears in a different code segment, it must be prototyped by C routines in other groups as:
void far gopengks( int workstation, long int h );
The Watcom C/C++ compiler contains many options for controlling the code to be produced. It's impossible to have a certain set of compiler options that produce the absolute fastest execution times for all possible applications. With that said, we'll list the compiler options that we think give the best execution times for most applications. You might have to experiment with different options to see which combination of options generates the fastest code for your particular application.
The recommended options for generating the fastest 16-bit code are:
The recommended options for generating the fastest 32-bit code are:
Option on causes the compiler to replace floating-point divisions with multiplications by the reciprocal. This generates faster code (multiplication is faster than division), but the result might not be the same because the reciprocal might not be exactly representable.
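The effect can be observed without the compiler option by writing both forms by hand. The following is a minimal sketch, assuming IEEE 754 double arithmetic; the function name count_mismatches is hypothetical and not part of Watcom C/C++. It counts how often dividing differs from multiplying by a precomputed reciprocal.

```c
#include <assert.h>

/* Count the integers i in 1..n for which i / divisor and
   i * (1.0 / divisor) give different double results. */
int count_mismatches( double divisor, int n )
{
    volatile double reciprocal;   /* volatile forces rounding to double */
    volatile double quotient;
    volatile double product;
    int i;
    int mismatches = 0;

    assert( divisor != 0.0 );
    reciprocal = 1.0 / divisor;
    for( i = 1; i <= n; ++i ) {
        quotient = i / divisor;       /* what the source code asks for */
        product = i * reciprocal;     /* the substituted form */
        if( quotient != product ) {
            ++mismatches;
        }
    }
    return( mismatches );
}
```

With a divisor of 2.0 the reciprocal (0.5) is exactly representable, so no mismatches are expected. With a divisor of 3.0 the reciprocal is rounded, and some results differ in the last bit.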
Option oe causes small user-written functions to be expanded in-line rather than generating a call to the function. Expanding functions in-line can further expose other optimizations that couldn't otherwise be detected if a call were generated to the function.
Option oa causes the compiler to relax alias checking.
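The text above doesn't spell out which assumptions the compiler makes under oa, so the following is only an illustration of the kind of code whose result depends on aliasing being honored. The function name store_and_reload is hypothetical.

```c
#include <assert.h>

/* Store through two pointers, then reload through the first one. */
int store_and_reload( int *a, int *b )
{
    assert( a != 0 && b != 0 );
    *a = 1;
    *b = 2;          /* if b points at the same object as a, *a is now 2 */
    return( *a );    /* must be reloaded unless a and b are known to differ */
}
```

When both arguments point at the same object, a compiler that honors aliasing returns 2. Code written like this is a candidate for review before alias checking is relaxed.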
Option ot must be specified to cause the code generator to select code sequences that are faster without any regard to the size of the code. The default is to select code sequences that strike a balance between size and speed.
Option ox is equivalent to oilmr and s, which cause the compiler to expand intrinsic functions in-line (oi), perform loop optimizations (ol), generate 387 instructions in-line for math functions such as sin(), cos() and sqrt() (om), reorder instructions to avoid pipeline stalls (or), and not to generate any stack overflow checking (s). Option or is very important for generating fast code for the Pentium processor.
Option zp4 causes all data to be aligned on 4-byte boundaries. The default is zp1, which packs all data. This reduces the amount of data memory required but requires extra clock cycles to access data that isn't on an appropriate boundary.
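The trade-off between packing and alignment can be seen in structure layouts. The sketch below assumes a 4-byte int and assumes that #pragma pack applies the same packing rule to a single structure that the zp option applies globally; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#pragma pack( push, 1 )         /* pack members, comparable to zp1 */
struct packed_rec {
    char tag;
    int  value;
};
#pragma pack( pop )

#pragma pack( push, 4 )         /* align members on up to 4-byte boundaries */
struct aligned_rec {
    char tag;
    int  value;
};
#pragma pack( pop )

/* Number of padding bytes that 4-byte alignment inserts before 'value'. */
size_t padding_before_value( void )
{
    size_t packed = offsetof( struct packed_rec, value );
    size_t aligned = offsetof( struct aligned_rec, value );

    assert( aligned >= packed );
    return( aligned - packed );
}
```

Under these assumptions the packed structure occupies 5 bytes and the aligned one 8 bytes: the aligned layout costs 3 bytes per record but keeps value on a 4-byte boundary.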
Options 0, 1, 2, 3, 4 and 5 emit code sequences optimized for processor-specific instruction set features and timings. For 16-bit applications, the use of these options might limit the range of systems on which the application will run, but there are improvements in execution performance.
Options fp2, fp3, and fp5 emit floating-point operations targeted at specific features of the math coprocessor in the Intel series. For 16-bit applications, the use of these options might limit the range of systems on which the application will run, but there are improvements in execution performance.
Option fpi87 causes in-line 80x87 numeric data processor instructions to be generated into the object code for floating-point operations. Floating-point instruction emulation isn't included, so as to obtain the best floating-point performance in 16-bit applications.
For 32-bit applications, the use of the fp5 option gives good performance on the Pentium but less than optimal performance on the 386 and 486. The use of the 5 option gives good performance on the Pentium and minimal, if any, impact on the 386 and 486. Thus, the options -oneatx -zp4 -5 -fp3 give good overall performance for the 386, 486 and Pentium processors.
The Watcom C/C++ compiler issues two types of diagnostic messages: warnings and errors. A warning message doesn't prevent the production of an object file. However, error messages indicate that a problem is severe enough that it must be corrected before the compiler will produce an object file.
If the compiler prints diagnostic messages on the screen, it also places a copy of these messages in a file in your current directory. The file has the same file name as the source file, and an extension of .err. This error file is a handy reference when you wish to correct the errors in the source file.
To illustrate the diagnostic features of Watcom C/C++, we'll modify the hello program in such a way as to introduce some errors:
    #include <stdio.h>

    int main()
      {
        int x;
        printf( "Hello world\n" );
        return( y );
      }
The equivalent C++ program follows:
    #include <iostream.h>
    #include <iomanip.h>

    int main()
    {
        int x;
        cout << "Hello world" << endl;
        return( y );
    }
In this example, we have added the lines:
int x;
and
return( y );
and changed the keyword void to int.
We compile the program with the warning option.
    wcc hello -w3
    wpp hello -w3
    wcc386 hello -w3
    wpp386 hello -w3
For the C program, the following output appears on the screen:
    hello.c(7): Error! E1011: Symbol 'y' has not been declared
    hello.c(5): Warning! W202: Symbol 'x' has been defined, but not referenced
    hello.c: 8 lines, included 174, 1 warnings, 1 errors
For the C++ program, the following output appears on the screen:
    File: hello.cpp
    (8,13): Error! E029: symbol 'y' has not been declared
    (9,1): Warning! W014: no reference to symbol 'x'
      'x' declared at: (6,9)
    hello.cpp: 9 lines, included 1267, 1 warning, 1 error
Here we see an example of both types of messages; an error and a warning message have been issued. As indicated by the error message, we require a declarative statement for the identifier y. The warning message indicates that, while it isn't a violation of the rules of C/C++ to define a variable without ever using it, we probably didn't intend to do so. On examining the program, we find that:
The complete list of Watcom C/C++ diagnostic messages is presented in an appendix of this guide:
When using the #include preprocessor directive, a header is identified by a sequence of characters placed between the "<" and ">" delimiters (for example, <file>), and a source file is identified by a sequence of characters enclosed by quotation marks (for example, "file"). Watcom C/C++ makes a distinction between the use of "<>" or quotation marks to surround the name of the file to be included. The search techniques for header files and source files are slightly different. Consider the following example:
    #include <stdio.h>    /* a system header file */
    #include "stdio.h"    /* your own header or source file */
You should use "<" and ">" when referring to standard or system header files, and quotation marks when referring to your own header and source files.
The character sequence placed between the delimiters in an #include directive represents the name of the file to be included. The file name may include node, path, and extension.
It isn't necessary to include the node and path specifiers in the file specification when the file resides on a different node or in a different directory. Watcom C/C++ provides a mechanism for looking up include files that might be located in various directories and disks of the computer system. Watcom C/C++ searches directories for header and source files in the following order (the search stops once the file has been located):
The default build targets are:
For example, the environment variable OS2_INCLUDE is searched if the build target is "OS2". The build target would be OS/2 if:
In the above example, <stdio.h> and "stdio.h" could refer to two different files if:
The compiler searches the directories listed in i paths (see the description of the compiler i option) and the INCLUDE environment variable in a manner analogous to that which the operating system shell uses when searching for programs by using the PATH environment variable.
The export command can be used to define an INCLUDE environment variable that contains a list of directories. Issue a command of the form
export INCLUDE=path:path...
before running the Watcom C/C++ compiler for the first time.
The following example illustrates the use of the #include directive:
    #include <stdio.h>
    #include <time.h>
    #include <dos.h>

    #include "common.c"

    int main()
      {
        initialize();
        update_files();
        create_report();
        finalize();
      }

    #include "part1.c"
    #include "part2.c"
If the above text is stored in the source file report.c in the current directory, then we might issue the following commands to compile the application.
    export INCLUDE=/usr/include://1/headers
    wcc report -fo=../obj/ -i=../source
    wpp report -fo=../obj/ -i=../source
    wcc386 report -fo=../obj/ -i=../source
    wpp386 report -fo=../obj/ -i=../source
In the above example, the export command is used to define the INCLUDE environment variable. It specifies that the /usr/include directory (of the current node) and the /headers directory (a directory on node 1) are to be searched.
The i option for the Watcom C/C++ compiler defines a third place to search for include files. The advantage of the INCLUDE environment variable is that it doesn't need to be specified each time you use the compiler.
The Watcom C/C++ preprocessor forms an integral part of Watcom C/C++. When any form of the p option is specified, only the preprocessor is invoked. No code is generated, and no object file is produced. The output of the preprocessor is written to the standard output file, although it can also be redirected to a file using the fo option.
Suppose the following C/C++ program is contained in the file msgid.c:
    #define _IBMPC 0
    #define _IBMPS2 1

    #if _TARGET == _IBMPS2
    char *SysId = { "IBM PS/2" };
    #else
    char *SysId = { "IBM PC" };
    #endif

    /* Return pointer to System Identification */
    char *GetSysId()
      {
        return( SysId );
      }
We can use the Watcom C/C++ preprocessor to generate the C/C++ code that would actually be compiled by the Watcom C/C++ compiler by issuing the following command:
    wcc msgid -plc -fo -d_TARGET=_IBMPS2
    wpp msgid -plc -fo -d_TARGET=_IBMPS2
    wcc386 msgid -plc -fo -d_TARGET=_IBMPS2
    wpp386 msgid -plc -fo -d_TARGET=_IBMPS2
The file msgid.i is created, and contains the following C/C++ code:
    #line 1 "msgid.c"
    char *SysId = { "IBM PS/2" };
    #line 9 "msgid.c"

    /* Return pointer to System Identification */
    char *GetSysId()
      {
        return( SysId );
      }
Note that the file msgid.i can be used as input to the Watcom C/C++ compiler, as follows:
    wcc msgid.i
    wpp msgid.i
    wcc386 msgid.i
    wpp386 msgid.i
Since #line directives are present in the file, the Watcom C/C++ compiler can issue error messages in terms of the original source file line numbers.
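The renumbering performed by a #line directive can be checked from within a program by using the standard __LINE__ and __FILE__ macros. The following is a small sketch; the file name original.c and the function names are hypothetical.

```c
#include <assert.h>

/* Return the line number the compiler associates with the statement
   that follows the #line directive. */
int reported_line( void )
{
    int line;
#line 100 "original.c"
    line = __LINE__;            /* this source line is now numbered 100 */
    assert( line == 100 );
    return( line );
}

/* Return the file name the compiler now associates with this code. */
const char *reported_file( void )
{
    return( __FILE__ );         /* "original.c", set by the directive above */
}
```

Diagnostics for code following the directive are reported against the same file name and line numbers that these macros yield.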
In addition to the standard ANSI-defined macros supported by the Watcom C/C++ compilers, several additional system-dependent macros are also defined. These are described in this section. See the WATCOM C Language Reference manual for a description of the standard macros.
The Watcom C/C++ compilers run on various host operating systems including DOS, OS/2, Windows NT, and QNX. Any of the supported host operating systems can be used to develop applications for a number of target systems. By default, the target operating system for the application is the same as the host operating system unless some option or combination of options is specified. For example, DOS applications are built on DOS by default, OS/2 applications are built on OS/2 by default, and so on. But the flexibility is there to build applications for other operating systems/environments.
The macros described below can be used to identify the target system for which the application is being compiled.
The Watcom C/C++ compilers support both 16-bit and 32-bit application development. The following macros are defined for 16-bit and 32-bit target systems:
The Watcom C/C++ compilers support application development for a variety of operating systems. The following macros are defined for particular target operating systems.
(16-bit only) This macro is defined when the zw, zW, zWs, or bt=windows option is specified.
(32-bit only) This macro is defined when the zw or bt=windows option is specified.
The following macros indicate which compiler is compiling the C/C++ source code:
The value of the macro depends on the version number of the compiler. The value is 100 times the version number (version 8.5 yields 850, version 9.0 yields 900, etc.).
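As a sketch of how such a value can be used, the following assumes that the macro in question is __WATCOMC__ (the name isn't shown in the text above); the helper function names are hypothetical. The helpers split a value into its major and minor parts.

```c
#include <assert.h>

/* Major part of a version value: 850 yields 8, 900 yields 9. */
int version_major( int value )
{
    assert( value >= 0 );
    return( value / 100 );
}

/* Minor part of a version value: 850 yields 50, 900 yields 0. */
int version_minor( int value )
{
    assert( value >= 0 );
    return( value % 100 );
}

/* Version value of the compiler, or 0 if the macro isn't defined
   (that is, when some other compiler is in use). */
int compiler_version_value( void )
{
#if defined( __WATCOMC__ )
    return( __WATCOMC__ );
#else
    return( 0 );
#endif
}
```

The same test can be made at compile time, for example with #if __WATCOMC__ >= 900, to select code that depends on a particular compiler version.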
The following macros are defined under the conditions described:
Watcom C/C++ supports the use of some special keywords to describe system-dependent attributes of functions and other object names. These attributes are inspired by the Intel processor architecture and the plethora of function-calling conventions in use by compilers for this architecture. In keeping with the ANSI C and C++ language standards, Watcom C/C++ uses a double underscore (__), or a single underscore followed by an uppercase letter (for example, _S) as a prefix for these keywords. To support compatibility with other C/C++ compilers, alternate forms of these keywords are also supported through predefined macros.
Watcom C/C++ predefines the macros near and _near to be equivalent to the __near keyword.
Watcom C/C++ predefines the macros far, _far and SOMDLINK (16-bit only) to be equivalent to the __far keyword.
Watcom C/C++ predefines the macros huge and _huge to be equivalent to the __huge keyword.
Watcom C/C++ predefines the macro _based to be equivalent to the __based keyword.
Watcom C/C++ predefines the macro _segment to be equivalent to the __segment keyword.
Watcom C/C++ predefines the macro _segname to be equivalent to the __segname keyword.
Watcom C/C++ predefines the macro _self to be equivalent to the __self keyword.
Watcom C/C++ predefines the macros cdecl, _cdecl, _Cdecl and SOMLINK (16-bit only) to be equivalent to the __cdecl keyword.
Watcom C/C++ predefines the macros pascal, _pascal and _Pascal to be equivalent to the __pascal keyword.
Watcom C/C++ predefines the macros fortran and _fortran to be equivalent to the __fortran keyword.
For example:
    #include <i86.h>

    void __interrupt int10( union INTPACK r )
    {
        . . .
    }
The code generator emits instructions to save all registers. The registers are saved on the stack in a specific order so that they may be referenced using the INTPACK union as shown in the DOS example above.
The code generator emits instructions to establish addressability to the program's data segment since the DS segment register contents are unpredictable. The function returns using an IRET (16-bit) or IRETD (32-bit) (interrupt return) instruction.
Watcom C/C++ predefines the macros interrupt and _interrupt to be equivalent to the __interrupt keyword.
For example:
void __export _Setcolor( int color ) { . . . }
Watcom C/C++ predefines the macro _export to be equivalent to the __export keyword.
For example:
void __export __loadds _Setcolor( int color ) { . . . }
If the function in an OS/2 1.x Dynamic Link Library requires access to private data, the data segment register must be loaded with an appropriate value since it will contain the DS value of the calling application on entry to the function.
Watcom C/C++ predefines the macro _loadds to be equivalent to the __loadds keyword.
Watcom C/C++ predefines the macro _saveregs to be equivalent to the __saveregs keyword.
Watcom C/C++ predefines the macros _syscall, _System and SOMLINK (32-bit only) to be equivalent to the __syscall keyword.
This keyword can be used under 32-bit OS/2 to call 16-bit functions from your 32-bit flat model program. Integer arguments are automatically converted to 16-bit integers, and 32-bit pointers are converted to far16 pointers before calling a special thunking layer to transfer control to the 16-bit function.
Watcom C/C++ predefines the macros _far16 and _Far16 to be equivalent to the __far16 keyword. This keyword is compatible with Microsoft C.
In the OS/2 operating system (version 2.0 or higher), the first 512 megabytes of the 4 gigabyte segment referenced by the DS register is divided into 8192 areas of 64K bytes each. A far16 pointer consists of a 16-bit selector referring to one of the 64K byte areas, and a 16-bit offset into that area.
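The arithmetic behind this layout (8192 areas of 64K bytes each cover 512 megabytes) can be sketched in portable C. The sketch computes only the area index and the offset for a flat address; how the operating system maps an area index to an actual selector value isn't described here and isn't modelled. All names are hypothetical.

```c
#include <assert.h>

#define AREA_SIZE   0x10000UL   /* 64K bytes per area */
#define AREA_COUNT  8192UL      /* number of areas */

/* Index (0 to 8191) of the 64K area that contains a flat address. */
unsigned long area_index( unsigned long flat )
{
    assert( flat < AREA_SIZE * AREA_COUNT );    /* first 512 megabytes only */
    return( flat / AREA_SIZE );
}

/* 16-bit offset of a flat address within its 64K area. */
unsigned long area_offset( unsigned long flat )
{
    assert( flat < AREA_SIZE * AREA_COUNT );
    return( flat % AREA_SIZE );
}
```

For example, the flat address 0x0012ABCD lies in area 0x12 at offset 0xABCD.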
A pointer declared as:
[type] __far16 *name;
defines an object that is a far16 pointer. If such a pointer is accessed in the 32-bit environment, the compiler generates the necessary code to convert between the far16 pointer and a flat 32-bit pointer.
For example, the declaration:
char __far16 *bufptr;
declares the object bufptr to be a far16 pointer to char.
A function declared as:
[type] __far16 func( [arg_list] );
declares a 16-bit function. Any call to such a function from the 32-bit environment causes the compiler to convert any 32-bit pointer arguments to far16 pointers, and any int arguments from 32 bits to 16 bits. (In the 16-bit environment, an object of type int is only 16 bits.) Any value returned from the function is converted in an appropriate manner.
For example, the declaration
char * __far16 Scan( char *buffer, int len, short err );
declares the 16-bit function Scan(). When this function is called from the 32-bit environment, the buffer argument is converted from a flat 32-bit pointer to a far16 pointer (which, in the 16-bit environment, would be declared as char __far *). The len argument is converted from a 32-bit integer to a 16-bit integer. The err argument is passed unchanged. On returning, the far16 pointer (far pointer in the 16-bit environment) is converted to a 32-bit pointer that describes the equivalent location in the 32-bit address space.
In the OS/2 operating system (version 2.0 or higher), the first 512 megabytes of the 4 gigabyte segment referenced by the DS register is divided into 8192 areas of 64K bytes each. A far16 pointer consists of a 16-bit selector referring to one of the 64K byte areas, and a 16-bit offset into that area.
A pointer declared as:
[type] * _Seg16 name;
defines an object that's a far16 pointer. Note that the _Seg16 appears on the right side of the *, which is opposite to the __far16 keyword described above.
For example,
char * _Seg16 bufptr;
declares the object bufptr to be a far16 pointer to char (the same as above).
The _Seg16 keyword may not be used to describe a 16-bit function. A #pragma directive must be used instead. A function declared as:
[type] * _Seg16 func( [parm_list] );
declares a 32-bit function that returns a far16 pointer.
For example, the declaration:
char * _Seg16 Scan( char * buffer, int len, short err );
declares the 32-bit function Scan(). No conversion of the argument list takes place. The return value is a far16 pointer.
    #pragma aux fast_mul = \
        "imul eax,edx" \
        parm caller [eax] [edx] \
        value struct;

    struct fixed {
        unsigned v;
    };

    fixed __pragma( "fast_mul" ) operator *( fixed, fixed );

    fixed two = { 2 };
    fixed three = { 3 };

    fixed foo()
    {
        return two * three;
    }
See the following chapters for more information on pragmas:
Near pointers are generally the most efficient type of pointer because they are small, and the compiler can assume knowledge about what segment of the computer's memory the pointer (offset) refers to. Far pointers are the most flexible because they allow the programmer to access any part of the computer's memory, without limitation to a particular segment. However, far pointers are bigger and slower because of the additional flexibility.
Based pointers are a compromise between the efficiency of near pointers and the flexibility of far pointers. With based pointers, the programmer takes responsibility to tell the compiler which segment a near pointer (offset) belongs to, but may still access segments of the computer's memory outside of the normal data segment (DGROUP). The result is a pointer type that's as small as, and almost as efficient as, a near pointer, but with most of the flexibility of a far pointer.
An object declared as a based pointer falls into one of the following categories:
To support based pointers, the following keywords are provided:
as well as the :> operator. These keywords and this operator are described in the following sections:
Two macros, defined in malloc.h, are also provided:
They are used in a manner similar to NULL, but are used with objects declared as __segment and __based, respectively.
A segment constant based pointer or object has its segment value based on a specific, named segment. A segment constant based object is specified as:
[type] __based( __segname( "segment" ) ) object_name;
and a segment constant based pointer is specified as:
[type] __based( __segname( "segment" ) ) *object_name;
where segment is the name of the segment in which the pointer or object is based. As shown above, the segment name is always specified as a string. There are three special segment names recognized by the compiler:
The _CODE segment is the default code segment. The _CONST segment is the segment containing constant values. The _DATA segment is the default data segment. If the segment name isn't one of the three recognized names, then a segment is created with that name. If a segment constant based object is being defined, then it is placed in the named segment. If a segment constant based pointer is being defined, then it can point at objects in the named segment.
The following examples illustrate segment constant based pointers and objects.
In the first example,
    int __based( __segname( "_CODE" ) ) ival = 3;
    int __based( __segname( "_CODE" ) ) *iptr;
ival is an object that resides in the default code segment. iptr is an object that resides in the data segment (the usual place for data objects), but points at an integer that resides in the default code segment. iptr is suitable for pointing at ival.
In the second example,
char __based( __segname( "GOODTHINGS" ) ) thing;
thing is an object that resides in the segment GOODTHINGS, which is created if it doesn't already exist. (The creation of segments is done by the linker, and is a method of grouping objects and functions. Nothing is implicitly created during the execution of the program.)
A segment object based pointer derives its segment value from another named object. A segment object based pointer is specified as follows:
[type] __based( segment ) *name;
where segment is an object defined as type __segment.
An object of type __segment may contain a segment value. Such an object is particularly designed for use with segment object based pointers.
The following example illustrates a segment object based pointer:
    __segment seg;
    char __based( seg ) *cptr;
The object seg contains only a segment value. Whenever the object cptr is used to point to a character, the actual pointer value is made up of the segment value found in seg and the offset value found in cptr. The object seg might be assigned values such as the following:
A void based pointer must be explicitly combined with a segment value to produce a reference to a memory location. A void based pointer doesn't infer its segment value from another object. The :> (base) operator is used to combine a segment value and a void based pointer.
For example, on a personal computer running DOS with a color monitor, the screen memory begins at segment 0xB800, offset 0. In a video text mode, to examine the first character currently displayed on the screen, the following code could be used:
    #include <stdio.h>

    extern void main()
    {
        __segment screen;
        char __based( void ) *scrptr;

        screen = 0xB800;
        scrptr = 0;
        printf( "Top left character is '%c'.\n", *(screen:>scrptr) );
    }
The general form of the :> operator is:
segment :> offset
where segment is an expression of type __segment, and offset is an expression of type __based( void ) *.
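To show what combining a segment value with an offset means for the screen-memory example, the following sketch computes the resulting physical address in portable C. It assumes real-mode addressing as used under DOS, where the segment value is multiplied by 16 and added to the offset; in protected mode a segment value is a selector and this arithmetic doesn't apply. The function name real_mode_linear is hypothetical.

```c
#include <assert.h>

/* Physical address addressed by segment:offset in real mode. */
unsigned long real_mode_linear( unsigned segment, unsigned offset )
{
    assert( segment <= 0xFFFFu && offset <= 0xFFFFu );
    return( ( (unsigned long)segment << 4 ) + offset );
}
```

Under this assumption, segment 0xB800 with offset 0 corresponds to physical address 0xB8000, the start of the color text screen memory.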
A self-based pointer infers its segment value from itself. It is particularly useful for structures such as linked lists, where all of the list elements are in the same segment. A self-based pointer pointing to one element may be used to access the next element, and the compiler uses the same segment as the original pointer.
The following example illustrates a function that prints the values stored in the last two members of a linked list:
    #include <stdio.h>
    #include <malloc.h>     /* _NULLOFF */
    #include <i86.h>        /* FP_SEG, FP_OFF */

    struct a {
        struct a __based( __self ) *next;
        int number;
    };

    extern void PrintLastTwo( struct a far *list )
    {
        __segment seg;
        struct a __based( seg ) *aptr;

        seg = FP_SEG( list );
        aptr = FP_OFF( list );
        for( ; aptr != _NULLOFF; aptr = aptr->next ) {
            if( aptr->next == _NULLOFF ) {
                printf( "Last item is %d\n", aptr->number );
            } else if( aptr->next->next == _NULLOFF ) {
                printf( "Second last item is %d\n", aptr->number );
            }
        }
    }
The argument to the function PrintLastTwo is a far pointer, pointing to a linked list structure anywhere in memory. It is assumed that all members of a particular linked list of this type reside in the same segment of the computer's memory. (Another instance of the linked list might reside entirely in a different segment.) The object seg is given the segment portion of the far pointer. The object aptr is given the offset portion, and is described as being based in the segment stored in seg.
The expression aptr->next refers to the next member of the structure stored in memory at the offset stored in aptr and the segment implied by aptr, which is the value stored in seg. So far, the behavior is no different than if next had been declared as:
struct a *next;
The expression aptr->next->next illustrates the difference of using a self-based pointer. The first part of the expression (aptr->next) is as described above. However, using the result to point to the next member is done by using the offset value found in the next member, and combining it with the segment value of the pointer used to get to that member, which is still the segment implied by aptr, which is the value stored in seg. If next hadn't been declared using __based( __self ), then the second pointing operation would refer to the offset value found in the next member, but with the default data segment (DGROUP), which might or might not be the same segment stored in seg.
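The idea can be imitated in portable C, which may help when reasoning about it away from a segmented target. In the sketch below, a block of memory stands in for the segment, each node stores only a 16-bit offset to the next node, and every hop is resolved against the same block base, which is the role the implied segment plays for a self-based pointer. All names are hypothetical, and the value 0xFFFF used for NULL_OFF is an assumption standing in for _NULLOFF.

```c
#include <assert.h>

#define NULL_OFF 0xFFFFu    /* stand-in for _NULLOFF (assumed value) */

struct node {
    unsigned short next;    /* byte offset of the next node in the same block */
    int            number;
};

/* Combine the block base (the "segment") with an offset. The offset must
   be that of a struct node stored in an array of struct node. */
static struct node *node_at( unsigned char *base, unsigned short off )
{
    assert( off != NULL_OFF );
    return( (struct node *)( base + off ) );
}

/* Walk the list starting at offset 'first'. Store the numbers of the last
   two nodes and return the number of nodes visited. */
int last_two( unsigned char *base, unsigned short first,
              int *second_last, int *last )
{
    unsigned short off;
    int prev = 0;
    int cur = 0;
    int count = 0;

    assert( base != 0 && second_last != 0 && last != 0 );
    for( off = first; off != NULL_OFF; off = node_at( base, off )->next ) {
        prev = cur;
        cur = node_at( base, off )->number;   /* same base for every hop */
        ++count;
    }
    *second_last = prev;
    *last = cur;
    return( count );
}
```

A caller would place the nodes in an array of struct node, pass the array's address (cast to unsigned char *) as base, and store multiples of sizeof( struct node ) in the next members. If a different base were used for the second hop, the walk would read unrelated memory, which is the failure that __based( __self ) avoids.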
The Watcom Code Generator performs such optimizations as common subexpression elimination, global flow analysis, and so on.
In some cases, the code generator could do a better job of optimizing code if it could use more memory. This is indicated when a message such as:
Not enough memory to optimize procedure 'xxxx'
appears on the screen as the source program is compiled. In such an event, you may wish to make more memory available to the code generator.
A special environment variable can be used to obtain memory usage information or set memory usage limits on the code generator.
The WCGMEMORY environment variable can be used to request a report of the amount of memory used by the compiler's code generator for its work area. For example,
export "WCGMEMORY=?"
When the memory amount is "?", then the code generator reports how much memory was used to generate the code.
It can also be used to instruct the compiler's code generator to allocate a fixed amount of memory for a work area. For example,
export "WCGMEMORY=128"
When the memory amount is nnn then exactly nnnK bytes are used. In the above example, 128K bytes is requested. If less than nnnK is available then the compiler quits with a fatal error message. If more than nnnK is available then only nnnK bytes are used.
If you have a software quality assurance requirement that the same results (that is, code) be produced on two different machines then you should use this feature. To generate identical code on two personal computers with different memory configurations, you must ensure that the WCGMEMORY environment variable is set identically on both machines.