Last update Jul 23, 2002
"Programming Languages - C", ANSI/ISO/IEC 9899-1999
Extension | File types |
---|---|
.c | C source files |
.h | C header files |
.cpp | C++ source files |
.h | C++ header files |
.obj | Object files |
Note: For compatibility with other compilers, .cxx, .hpp, and .hxx are also valid extensions for C++ source and header files for the compiler. These extensions have fallen out of favor and should not be used in the future.
With an include statement in the form,
#include <filename>
the preprocessor searches for filename along the list of paths specified by the -I option. If not found, the search continues along the list of paths specified by the INCLUDE environment variable. This type of include statement is used for system include files. Alternatively, with an include statement of the form,
#include "filename"
the preprocessor searches for filename in the default directory. If the file is not found, the compiler searches for filename as if it had been enclosed in angle brackets (<>). This type of include statement is for user include files.
With either type of include statement, if the file extension is .hpp, and if the preprocessor can't find the file, it repeats the search using the file extension .h.
The compiler treats character strings between double quotes ("") or <> pairs as a file name. This allows you to use almost any character, including spaces, the single quote ('), the backslash (\), or the sequences slash-asterisk (/*) or double slash (//) in a file name. The compiler does not allow the new-line character to appear between the file name delimiters.
Notice that the backslash character is not used as an escape character in include path strings. Instead, backslash has the same meaning as it does on the command line, denoting a subdirectory of the current directory. For example,
#include <sys\stat.h>
tells the preprocessor to look for stat.h in the sys subdirectory of the directory currently being searched.
The compiler ignores case in file names and accepts pathnames of any length up to the maximum allowed by the operating system.
To create an include path list, use either the command line option, -I, or the environment variable, INCLUDE. In both cases, the list consists of one or more pathnames separated by semicolons. The compiler first checks paths that are specified on the command line, and then the paths in INCLUDE. Pathnames are typically absolute paths, but you can use relative pathnames as well. If you specify a relative pathname, it is considered relative to the same directory as the enclosing file.
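The search order described above can be sketched as a small helper. This is only an illustration of the ordering, not the compiler's implementation; the function names splitPathList and includeSearchOrder are hypothetical, and the sketch assumes empty entries in a list are simply skipped.

```cpp
#include <string>
#include <vector>

// Split a semicolon-separated path list into its entries,
// skipping empty entries (an assumption made for this sketch).
std::vector<std::string> splitPathList(const std::string& list)
{
    std::vector<std::string> paths;
    std::string::size_type start = 0;
    while (start <= list.size()) {
        std::string::size_type end = list.find(';', start);
        if (end == std::string::npos)
            end = list.size();
        if (end > start)
            paths.push_back(list.substr(start, end - start));
        start = end + 1;
    }
    return paths;
}

// Paths given with -I come first, followed by the paths in INCLUDE.
std::vector<std::string> includeSearchOrder(const std::string& iOption,
                                            const std::string& includeEnv)
{
    std::vector<std::string> order = splitPathList(iOption);
    const std::vector<std::string> env = splitPathList(includeEnv);
    order.insert(order.end(), env.begin(), env.end());
    return order;
}
```

Calling includeSearchOrder with the -I argument and the value of INCLUDE yields the directories in the order they would be tried.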
There is no compiler limit on the number of nested #include directives or the number of #include paths. The practical limit is the amount of memory available to the compiler or to the operating system.
"Information Technology - Programming Languages - C++", ISO/IEC 14882-1998
A more readable definition and tutorial on C++ is:
The C++ Programming Language, Special Edition, Bjarne Stroustrup, Addison-Wesley
A sequence of bit-fields with the same word size can be packed into a structure. No bit-field may be wider than its word size. If a bit-field straddles a word boundary, it is placed in the next word. For example, the bit-field declaration
struct bits { int b1: 24; int b2: 16; int b3: 16; int b4: 24; };
is represented in memory as:
31.......0
unused..b1
b3......b2
unused..b4
The compiler allocates bit-fields beginning with the low-order bit of a word. When packing bit-fields within a structure, the compiler uses an unnamed field with a width of 0 to close out the current word. A bit-field with a different word size from the preceding bit-field causes this closing out to happen automatically, just as a non-bit-field member of the structure does.
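As a rough sketch of the declarations discussed above, the following shows the bits structure and an unnamed zero-width field. The struct name split and the helper makeBits are hypothetical, and the exact layout and size remain implementation-defined, so the sketch only relies on values being stored and read back.

```cpp
// The structure from the text: b1 fills 24 bits of the first word,
// b2 does not fit in the remaining 8 bits and moves to the next word.
struct bits {
    int b1 : 24;
    int b2 : 16;
    int b3 : 16;
    int b4 : 24;
};

// An unnamed field of width 0 closes out the current word,
// so hi starts in a new allocation unit.
struct split {
    unsigned lo : 4;
    unsigned    : 0;
    unsigned hi : 4;
};

// Fill every field with a value that fits in its width.
bits makeBits()
{
    bits b;
    b.b1 = 0x123456;
    b.b2 = 0x1234;
    b.b3 = 0x0567;
    b.b4 = 0x0ABCDE;
    return b;
}
```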
ARM p. 22 (cf. p. 7; 3.2.1c), Gray pp. 486-7
Within structures, you can set alignment on byte, word (two bytes, the default), or long word (four bytes) boundaries. To suppress the default alignment within structures, pass the -a[1|2|4|8] option to the compiler. This ensures that structure members are aligned on the specified boundary. Such structure realignment is useful for defining structures that map onto particular hardware devices or predefined data elements. Alignment control operates only within structures; everything else is aligned on word boundaries.
The compiler does not generate structures of size 0 if there are no nonstatic data members; the minimum size of a structure is 1 byte. This minimum size prevents new() from returning zero when it allocates an instance of a structure.
Warning: Compile each source file referencing a given structure with the same type of alignment. If two files that reference the same structure are compiled with different alignments, the compiler does not detect it, but you will get unpredictable error messages from the linker or at run time.
The variable _new_handler is declared to be a pointer to a function. It is declared in the C++ Standard Library, and is NULL by default. Its declaration is:
void (*_new_handler)(void);
When new fails, it tests whether _new_handler points to a function or is NULL. If _new_handler contains a value, the function it points to is called. If properly written, the function will reclaim memory until there is enough to satisfy the original request to new. If _new_handler is NULL, new returns a NULL pointer.
There are two ways to set _new_handler: directly, as in:
void newfailed_handler(void);        // prototype of handler
_new_handler = newfailed_handler;    // set _new_handler
or through the set_new_handler() library function, as in:
set_new_handler(newfailed_handler);
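A handler that reclaims memory might look like the following minimal sketch. It uses the standard std::set_new_handler rather than the _new_handler variable, and the reserve buffer and the helper names installNewFailedHandler and reserveAvailable are hypothetical. Note that in standard C++ a plain new throws std::bad_alloc when no handler is installed, which differs from the NULL return described above.

```cpp
#include <cstddef>
#include <new>

namespace {
    char* reserve = nullptr;   // hypothetical emergency buffer
}

// Called when new cannot satisfy a request. Releasing the reserve lets
// the request be retried; with nothing left to release, the handler
// uninstalls itself so that new gives up instead of looping forever.
void newfailed_handler()
{
    if (reserve != nullptr) {
        delete[] reserve;
        reserve = nullptr;
    } else {
        std::set_new_handler(nullptr);
    }
}

// Allocate the reserve and install the handler; returns the old handler.
std::new_handler installNewFailedHandler(std::size_t reserveBytes)
{
    reserve = new char[reserveBytes];
    return std::set_new_handler(newfailed_handler);
}

bool reserveAvailable()
{
    return reserve != nullptr;
}
```

The reserve is deliberately freed only once; a real handler would need some application-specific way to find memory it can give back.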
Static constructors are called in the reverse of the order in which they were linked. Consequently, constructors in the standard library are called first. Static destructors are called in the order they were linked. Destructors in the standard library are called last.
When the run-time library prints a run-time error message, the program is aborted without calling any static destructors. Because it is possible that a serious error has occurred, to limit the damage it is preferable to immediately stop the program. For example, when the heap has been corrupted, halt execution.
RTTI adds a new member to the virtual function table, a pointer to the type information stored in the table. Thus, classes cannot share vtbl[]s (an optimization the compiler performs by default) when compiling with RTTI. The pointer to the type information is located at a negative offset relative to the start of the vtbl[]; this preserves compatibility with the Microsoft Object Model, which does not support RTTI.
Limit | Value |
---|---|
Max length of an identifier | 254 |
Max length of an external identifier | 254 |
Number of arguments to a macro | 127 |
Depth of nested #include directives | number of file handles |

The compiler sets no limits on the following code elements, but operating system or hardware requirements may impose practical limits:
Complexity of a declaration
Length of macro replacement text
Number of arguments to a function
Number of cases in a switch
Number of characters in an argument to a macro
Number of characters in a line
Number of characters in a string
Number of command line arguments
Number of #if directives that can be nested
Number of #include paths
Number of subscripts in an array
The header limits.h specifies the largest and smallest values of the integral types.
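As a small illustration of using limits.h, the following sketch checks whether a value can be stored in a short without loss. The function name fitsInShort is hypothetical; SHRT_MIN and SHRT_MAX are the standard macros.

```cpp
#include <limits.h>

// True when value can be represented exactly in a short.
bool fitsInShort(long value)
{
    return value >= SHRT_MIN && value <= SHRT_MAX;
}
```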
There are 3 basic floating point types.
Type | Default Size | Format |
---|---|---|
float | 4 bytes | IEEE single precision |
double | 8 bytes | IEEE double precision |
long double | 8(10) bytes | IEEE double (extended) precision |

The header float.h defines the implementation-defined characteristics of the floating types.
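The characteristics of the floating types can be queried through the standard float.h macros. The sketch below assumes IEEE formats, as listed in the table; the function name longDoubleIsExtended is hypothetical.

```cpp
#include <float.h>

// True when long double carries more mantissa bits than double,
// which is the case for the 10 byte extended format.
bool longDoubleIsExtended()
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}

// Mantissa widths (including the implicit leading bit) of the
// IEEE single and double precision formats.
const int floatMantissaBits  = FLT_MANT_DIG;
const int doubleMantissaBits = DBL_MANT_DIG;
```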
ARM p. 10, Gray pp. 480-1
Multi-character constants have type int and can contain between one and four characters from the execution character set. If the constant has more than four characters, then the compiler generates an error. If a character string of three or four characters is assigned to a short, then the last two characters are used in the assignment. For example:
short foo = 'ABCD';
will assign CD (0x4344) to foo.
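The arithmetic behind this truncation can be sketched with explicit integers. The helpers packChars and truncateToShort are hypothetical, and the sketch assumes the first character of the constant occupies the most significant byte, which is consistent with the 0x4344 result above.

```cpp
#include <stdint.h>

// Pack four characters into a 32-bit value, first character in the
// most significant byte (an assumption made for this sketch).
uint32_t packChars(unsigned char c1, unsigned char c2,
                   unsigned char c3, unsigned char c4)
{
    return (uint32_t(c1) << 24) | (uint32_t(c2) << 16) |
           (uint32_t(c3) << 8)  |  uint32_t(c4);
}

// Keep only the low 16 bits, as happens when assigning to a short.
uint16_t truncateToShort(uint32_t value)
{
    return uint16_t(value & 0xFFFFu);
}
```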
If the character following the backslash character is not one of the defined escape sequences, then the compiler generates an "undefined escape sequence" error.
String literals are distinct. They do not overlap in memory, but it is good practice to not modify them at runtime.
int x, y, z;
int u = 3000;
signed int v = -56;        // signed int == int
unsigned int r = 0xf000;
For 16-bit memory models, an int is two bytes; for 32-bit memory models, an int is four bytes.
short a, b; short int c = -45; signed short d = 2145; unsigned short int e = 0x123f;
long a, b, c = 109; long f = -1L; signed long g = 67; unsigned long int h = 0x0045123f;
float fl1, fl2; float fnum = 1.56; float gnum = 1.23E3;
double dp1, ext; double dnum = 11.435; double fnum = 121.23E4;
In Digital Mars C++, a double is an eight byte unit.
long double ld = 1.678E33;
In Digital Mars C++, a long double is the same size as a double, eight bytes, for all 16 bit memory models and for all 32 bit DOS memory models. For 32 bit Windows, a long double is 10 bytes.
There are several ways to denote char values. One way uses the character flanked by single quotes, such as 'A'. The second way uses the character's ASCII code, written in octal, preceded by a backslash and flanked by quotes, such as '\101'. Alternatively, within an assignment statement, you can simply use the integer value of the ASCII code, such as 65. Character variables can also be expressed in the same formats as character constants, as discussed earlier. Examples of each of these ways of denoting char values are included here:
char a_character = 'A';
char a_character = '\101';
char a_character = 65;
signed char a_character = 0x41;
unsigned char a_character;
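The different denotations can be compared directly. The sketch below assumes an ASCII execution character set; the variable and function names are hypothetical. A backslash followed by digits is read as an octal code, so 'A' (decimal 65) is '\101', while '\x41' gives the same code in hexadecimal.

```cpp
char fromLiteral = 'A';      // character flanked by single quotes
char fromOctal   = '\101';   // octal escape: 1*64 + 0*8 + 1 = 65
char fromHex     = '\x41';   // hexadecimal escape: 4*16 + 1 = 65
char fromInteger = 65;       // plain integer value of the ASCII code

// True when all four denotations produce the same character.
bool allDenoteA()
{
    return fromLiteral == fromOctal &&
           fromOctal   == fromHex &&
           fromHex     == fromInteger;
}
```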
char s[6] = "hello";
Digital Mars C++ supports this syntax with automatic arrays as well as with global and static arrays. In this example, the 6 is optional. When the size of the array is not given, the compiler allocates enough space to hold the string and its terminating null character.
Digital Mars C++ also supports the wide-character string type, wchar_t, for example:
wchar_t s[] = L"hello";
Wide char types are used to refer to wide-character strings; they are equivalent to unsigned shorts (two bytes). They are used to hold character sets where individual characters do not fit into a single byte. Wide char types are needed, for example, when attribute information, such as color or font, is encoded along with the character's ASCII value. The type of wchar_t is defined in stddef.h.
ARM pp. 31-2, 322, Gray p. 489
In C++, a pointer or reference to an object of a const type can be cast into a pointer or a reference to a non-const type. The resulting reference still applies to the original object. In Digital Mars C++, it is possible to modify the value of the constant object through the resulting pointer or reference. This works only if the original pointer or reference contained a valid address. In this way, you can cast away the "constness" of an object.
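Casting away constness might look like the following minimal sketch. The function names are hypothetical. To stay within what standard C++ defines, the sketch only modifies objects that were not themselves declared const; writing to an object that was defined const is undefined behavior in the standard, even though the text above describes it as working with this compiler.

```cpp
// Write through a pointer-to-const by casting the const away.
// The caller must pass the address of an object that is really writable.
void setThroughConst(const int* p, int value)
{
    *const_cast<int*>(p) = value;
}

// The same idea with a reference: view refers to counter, and the
// cast yields a non-const reference to that same object.
int castAwayDemo()
{
    int counter = 1;
    const int& view = counter;
    const_cast<int&>(view) = 42;
    return counter;
}
```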
ARM p. 71, 37, Gray pp. 500-2
int a[10];
void f() { int* p = &a[10]; *p = 0xdeadbeef; }
You can subtract two pointers that point to objects in the same array in order to find the number of elements separating the operands. The result is of type ptrdiff_t, which is defined as long in <stddef.h>. It is an error to subtract pointers which point to objects of differing types. However, explicit casting allows the operation and circumvents the error.
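Pointer subtraction can be sketched as follows. The helper names are hypothetical. The second function shows an explicit cast to a common pointer type, here char, which yields a distance in bytes rather than in elements.

```cpp
#include <stddef.h>

// Number of int elements between two pointers into the same array.
ptrdiff_t elementsBetween(const int* first, const int* last)
{
    return last - first;
}

// Distance in bytes, obtained by casting both operands to char pointers.
ptrdiff_t bytesBetween(const int* first, const int* last)
{
    return reinterpret_cast<const char*>(last) -
           reinterpret_cast<const char*>(first);
}
```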
ARM p. 73, Gray p. 503
union u_tag { int ival; float fval; } u_obj;
int i;
u_obj.fval = 4.0;
i = u_obj.ival;
assigns 0x40800000 to i, because the bit pattern stored in u_obj.fval is assigned to variable i with no conversion.
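The same bit pattern can be inspected without a union by copying the bytes, which is the approach standard C++ defines. This is a sketch assuming a 4 byte IEEE single precision float; the function name floatBits is hypothetical.

```cpp
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 4 byte float");

// Return the raw bit pattern of f with no value conversion.
uint32_t floatBits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For 4.0 the sign bit is 0, the biased exponent is 129 (0x81) and the fraction is 0, which gives 0x40800000.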
ARM p. 53
ARM p. 56, Gray p. 497
C++ Extensions
__typeinfo (expression)
The syntax for __typeinfo is identical to the syntax for sizeof expressions. __typeinfo returns an int, whose bit settings specify the type of expression:
Bit | Meaning |
---|---|
1 | expression is a class/struct/union |
2 | expression has a destructor |
4 | expression has a virtual destructor |

Other bits are reserved, and should not be assumed to contain a meaningful value.
__typeinfo, like sizeof, yields information about the static type, not the dynamic type, so:
class A {. . .};
class B : public A {. . .};
class A *p = new B;
int i = __typeinfo (*p);    // returns information on A, not B
int s = sizeof (*p);        // returns information on A, not B
Since the compiler generates different code for operations such as array new/delete depending on the presence of a destructor, __typeinfo can be useful in code that must manipulate storage allocation in a robust manner.
ARM p. 61, Gray p. 499
ARM pp. 114-5, Gray p. 523
ARM p. 173,241, Gray p. 545
The declaration of a pointer to a function must specify both the return type and the parameter list. For example:
int memcmp( void *, void *, unsigned);
requires a compatible function pointer to be declared as:
int (*fp)(void *, void *, unsigned);
The portable way to create a pointer to a C function from within C++ is to use:
extern "C" { int (*fp)(int); }
The alternative, nonportable, syntax is:
int (__cdecl *fp)(int);
Similarly, the way to declare a C++ function with a parameter that is a C function pointer is:
extern "C" { typedef int (*FP)(int); }
int foo(FP fp);
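A complete use of such a typedef might look like the following sketch. The function twice and the body of foo are hypothetical and only serve to show a C-linkage function being passed to a C++ function.

```cpp
extern "C" {
    // Pointer to a function with C linkage taking and returning int.
    typedef int (*FP)(int);

    // A hypothetical function with C linkage.
    int twice(int n) { return 2 * n; }
}

// A C++ function whose parameter is a C function pointer.
int foo(FP fp)
{
    return fp(21);
}
```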
Digital Mars C++ generates code with C linkage for any function with a variable number of arguments, even if the function is explicitly declared to have Pascal linkage.
For example, if you write a prototype for a function that takes a float argument, and define the same function using an "old style" function definition, an ANSI compiler will promote the float to a double when it compiles the definition. This will generate an error because the definition no longer matches the prototype. In other words, the function is now defined to take an argument of type double, but prototyped to take a float.
When you compile with the require prototypes -r (strict prototyping) option, Digital Mars C++ does not generate an error in cases where an old style function definition and a prototype exist for the same function. In other words, no error is generated as long as the types for the parameters in the prototype are the same as the types of the arguments to the function before promotion.
If you do not compile with -r, Digital Mars C++ considers the differing
function definitions to be invalid and generates an error. The same
results occur when you compile with -A (enforce the ANSI
standard).
Templates
One of the more interesting features of the C++ language is the
template. Templates let you define container classes and generic
functions without giving up type checking. Digital Mars C++ provides
for the definition, declaration, and use of both class templates and
function templates, as described in ARM, Chapter 14, and Gray,
Chapter 8.
template <class T> T square(T v);
declares the existence of a square function template whose argument, v, is of type T, and whose return value is also of type T.
A function template definition provides the information needed by the compiler to generate instantiations. For example:
template <class T> T square(T v) { return v * v; }
defines the square function to produce a return value by multiplying the argument, v, by itself. Observe that this is still not enough information for the compiler to generate code for a specific square operation. The needed information is provided by the context in which the square function is called. For example:
void byUse()
{
    int i = 5;
    float f = 3.14;
    i = square(i);
    f = square(f);
}
instructs the compiler to produce two specializations of the square function: one for ints and one for floats. In Digital Mars C++, specialization is the use of a template that generates code.
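Put together, the function template and two uses of it can be sketched as below. The wrapper names squareInt and squareDouble are hypothetical; each call causes the compiler to generate square for a different type.

```cpp
// Function template definition: the return type is the argument type.
template <class T>
T square(T v)
{
    return v * v;
}

// Each use below makes the compiler generate a separate square:
// one taking int and one taking double.
int squareInt(int i)
{
    return square(i);
}

double squareDouble(double d)
{
    return square(d);
}
```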
A class template declaration states the existence of a class template definition. For example:
template <class T> class List;
declares the existence of a List class template.
A class template definition specifies a list of members for the class template. For example:
template <class T> class List
{
public:
    T *GetFirst();
    List *GetRest();
private:
    T *pFirst;
    List *pRest;
};
where template member function definitions, which provide the information needed by the compiler to generate member definitions, are needed for GetFirst and GetRest. For example:
template <class T> T *List<T>::GetFirst() { return pFirst; }
simply returns the pFirst member pointer.
As with template functions, a call in a specific context is needed for the compiler to produce a specialization of a template member function.
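A version of the List template with both member functions defined can be sketched as follows. The constructor is an assumption added so that the example can build a list; it is not part of the class shown above.

```cpp
template <class T>
class List
{
public:
    // Hypothetical constructor, added only for this sketch.
    List(T* first, List* rest) : pFirst(first), pRest(rest) {}
    T* GetFirst();
    List* GetRest();
private:
    T* pFirst;
    List* pRest;
};

// Template member function definitions. Code is generated only when
// a member is used for a particular T, such as List<int>.
template <class T>
T* List<T>::GetFirst()
{
    return pFirst;
}

template <class T>
List<T>* List<T>::GetRest()
{
    return pRest;
}
```

Declaring List<int> objects and calling GetFirst or GetRest on them is what prompts the compiler to produce the int versions of these members.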
The linker, rather than the compiler, eliminates the redundant code from multiple occurrences of the same specialization. By default, the compiler marks the code generated by each specialization as a COMDAT. The linker processes COMDATs specially, removing duplicate definitions. For more information on COMDATs, see the section on the -NC option in Chapter 2, "Compiling Code."
For the compiler to be able to generate code for a template member function, it must have access to the class template and the template member function definition, as well as the relevant specialization. Consequently, the template member function definition must be included in the compilation unit (for example, the source file and its included files) that contains the specializations of the class template.
A simple and effective way to use templates is to include template declarations with the template definitions. When this is done, the compiler will be able to generate code when it needs to and the linker will remove duplicate instance definitions using the COMDAT mechanism.
ARM p. 378, Gray p. 613
The __with Statement
For ease in porting Modula 2 code to C++, Digital Mars C++ includes a
__with construct. The form __with is used instead of with to
avoid name-space conflicts. The syntax of the __with construct is:
__with (expression) statement
where expression must evaluate to an instance of an object of a class. It is evaluated only once. Within the statement, a scope is introduced for this class. This scope is searched before all other scopes. When a member of the class is parsed, it is semantically equivalent to expression.member.
__with statements do not affect access rules. __with clauses may be nested; the rules are the same as for braces {}. Refer to class members of a previously nested __with clause without specifying the class; if an identifier is not a member of the innermost __with clause, outer __with clauses are searched automatically.
The following example shows how to resolve member references using nested __with statements.
class A
{
    int x;
public:
    int y;
} a;

int func(int i, A *p)
{
    A *q = p;
    p[0].y = 2;
    p[1].y = 8;
    __with (*p)
    {
        y = 4;                  // p[0].y = 4
        p++;                    // does not affect __with's expression
        __with (*q)
        {
            if (i == 3)
                return y;       // returns q->y (4)
        }
        if (i == 5)
            return x;           // illegal; x is private
        else if (i == 6)
            return p->y;        // returns 8
        else
            return y;           // returns p[0].y (4)
    }
}
__based, __cdecl, __cs, __declspec, __export, __far, __fortran, __handle, __huge, __interrupt, __loadds, __near, __pascal, __ss
These keywords provide support for mixed language programming and for mixed memory models. Although they are not necessary for all applications, their judicious use can significantly enhance program performance.
The keywords __cs, __far, __handle, __huge, __interrupt, __loadds, __near, __ss are applicable for 16 bit programs and are described in 16 Bit Pointer Types and Type Modifiers.
Note: The keywords __far, __interrupt, __handle, __loadds, and __huge are ignored by default in compilations using the Win32 (the default) memory model. Ignore these keywords in any compilation by specifying the -NF compiler option.
For backwards compatibility with legacy code, the keywords _cs, cdecl, _cdecl, far, _far, near, _near, huge, _huge, pascal, and _pascal are also supported. These are incompatible with the language standards, and so should be replaced wherever found with the versions that have two leading underscores.
__export may also be used to designate all members of a class for export by adding the keyword between the class and the tag name. For example:
class __export A {...};
Note: Like other extended keywords, __declspec is disabled with the -A (ANSI compatibility) compiler option.
The dllimport attribute takes the place of the IMPORTS statement in a module definition file; it indicates that a function resides in a DLL.
The dllexport attribute takes the place of the EXPORTS statement in a module definition file (and the __export keyword). Use it in DLLs to indicate which functions and data objects are available to other applications and DLLs.
The syntax for declaring a function or data object with the dllimport or dllexport attributes is:
__declspec(dllimport) type name(args); __declspec(dllexport) type name(args);
The syntax for declaring a function with the naked attribute is:
__declspec(naked) type function_name(args);
The syntax for declaring a variable with the thread attribute is:
__declspec(thread) type variable_name = expr;
__emit__(0xB47F, 0x89CB, 0xCD21);
inserts these instructions into the code:
mov AH,7F
mov BX,CX
int 21
It's probably better to use the inline assembler rather than __emit__.