The file format accepted by the linker is rigidly defined; each file contains all information required to resolve all address references contained within it, although this may be simply a reference to some "external" symbol. As the linker processes successive object modules, such external references may be resolved. If any remain unresolved after the last module has been processed, the link operation has failed.
Organizing programs in this manner maximizes the opportunity to re-use code. For example, once you (or someone else) have created a function to input a single keystroke from the keyboard, you never need to re-create it. All you have to do is to refer to the function whenever you need to get keyboard input.
For this strategy to work, it's necessary that organization and formats follow certain rules. These rules were set forth, for "object modules", by Intel in their specification for Object Module Format (OMF), and have been extended by other firms (notably Microsoft and Borland).
In addition to making it easy to re-use common functions and procedures, modular design has another significant advantage:when you change a program, only those functions that were actually changed need to be re-compiled. The entire collection of object modules that make up your program are re-linked.
Originally, if you placed all your functions and procedures into the same source file, the compiler would put them all into a single object module. Then, when you need only one of those object modules later, you'll find that all the other object modules are loaded along with the one you wanted. To overcome this, it was necessary to break the source file into a number of smaller source files, each containing only one (or a few, related) functions. Organizing source files in this manner tended to complicate maintenance of the program, but simplified re-use of the code.
The preceding paragraph was true until the introduction of "smart linking". It is now possible to generate encapsulated functions within a single object module, yet have only those functions which actually are used by a specific program linked into the executable file. OPTLINK recognizes the special records that make this possible and treats them correctly. However, not all language processors yet generate the special COMDAT records that make smart linking work.
A compromise that addresses both problems is to combine the smaller source files into a single larger file for storage and while editing, but then split it into smaller files before compiling or assembling it. Details of doing this are outside the scope of this manual; the thing to keep in mind is the trade-off between ease of maintenance and ease of re-use, so that you can plan your projects for maximum effectiveness in both areas.
When you run source files through a compiler or assembler, the normal output is one or more object modules per file. These individual modules must still be linked; that is, they must all be combined into a single executable program, with memory locations assigned to each symbol in each module, and all references within each module to symbols which are defined in other modules must be resolved. This task is a function of OPTLINK.
Most high-level languages include one or more "run time library" files as a part of their package, and make extensive use of the modules it contains. In addition, general software products such as screen display utilities may be sold as add-on libraries. And you can combine your own object modules into libraries as well. Thus using libraries is the way to simplify tracking the numerous modules involved in a complex program.
Typically, an object module's original source contains references to symbols which may be in other object module's source compiled at approximately the same time, and also contain references to symbols defined in modules within the libraries being used. These are known as "external" symbol references, and each time an actual memory location is assigned to one of these symbols, it is said to "resolve" the symbol reference.
If any external symbol references remain unresolved after all object modules have been processed, the linker searches for their definitions within any libraries that have been specified. Each object module specified to the linker for inclusion (whether specified explicitly, or by being located during a library search) may resolve previous external symbol references, but it may also introduce new ones that will require resolution.
OPTLINK will search any number of libraries in order to resolve external references that remain after all supplied object modules have been processed. The library files to be searched may be specified either by means of commands embedded in the object modules, or by explicit commands to OPTLINK. If any references remain unresolved after all libraries have been searched, OPTLINK reports an error, but can still create the executable file (see /ONERR and /ERRORFLAG options).
A significant difference between OPTLINK and other linkers is that OPTLINK always resolves external references from the first library (in the supplied or default list of library files) that contains a definition, even if the reference itself occurs in a subsequent library. Microsoft LINK and many other linkers resolve such a reference by using the first definition found after the reference.
OPTLINK searches for each file first in the current working (default) directory, then in the directories named by the OBJ environment variable, and finally in the directories named by the LIB environment variable.
If a requested object module cannot be found, operation terminates with a fatal error.
The modules first searched are those named in the command line or supplied interactively, followed by those named in the input data (searched in the order in which they were named). The first PUBLIC symbol encountered that matches an undefined EXTERN is used and any subsequent occurrences of that symbol as a PUBLIC are ignored, so the sequence in which libraries are searched may have significant impact on a program's operation.
After all named libraries have been searched, or if no libraries are named in the input, the libraries called for by internal records of the object modules (i. e. requested by the translator which generated the object modules) are searched unless this capability has been turned off by the use of the /NODEFAULTLIBRARYSEARCH command.
In library searches, when no path is specified with the library name, the current default directory is searched first when looking for any specific .lib file. If none is found there, the path( s) listed in the LIB environment variable are used. If a requested library module cannot be found, a warning message is issued but operation continues.
The segment name is assigned by you, if you write in assembly language, when you use the SEGMENT/ENDS declarations. If the object module was generated by a high level language, the segment name is assigned by the translator. The class name is also supplied by you, by means of a modifier you may add to the SEGMENT declaration. If given, the class name is enclosed in single quotes, as in:
CODESEG SEGMENT PUBLIC BYTE 'CODE'In this example, the segment name is CODESEG and the class name is CODE.
Unless a /PACKCODE or /PACKDATA option switches are used, OPTLINK combines segments having matching segment and class names based on their combine type, which may be PUBLIC, COMMON, PRIVATE, or STACK. PUBLIC segments combine into a single bigger segment; COMMON segments are assigned the same address (that is, all use the same memory, at the same time), and PRIVATE segments are not combined at all. The final attribute of a segment is its alignment, which may be BYTE, WORD, DWORD, PARA, or PAGE (corresponding to boundaries at multiples of 1, 2, 4, 16, or 256, respectively).
Within each program, all segments of the same class are loaded in memory adjacent to each other. If no alignment is specified, PARA is used.
Fix-ups are processed after all segment addresses and symbol references are known, to perform final reconciliation of cross-module references.