Wednesday, October 16, 2019

Dynamic-Link Libraries Implicit Linking

Implicit Linking Overview

A DLL module must be built before any executable module that imports from it can be built.

  1. Header file of the DLL contains the function prototypes, structures, and symbols that the DLL wants to export
  2. Source code of the DLL is not required to build an executable module that imports from it; the header file and .lib file are enough
  3. To build the DLL, compiler processes each source code module and produces one .obj module per source module
  4. Linker combines the .obj modules and produces a single DLL image file
  5. If the DLL exports at least one function or variable, linker also produces a .lib file, which lists the exported function and variable symbol names
  6. To build the executable, the source is compiled the same way and linker combines its .obj modules into a single executable image file
  7. Executable image file contains an import section, which lists every required DLL module along with the functions/symbols/variables referenced from it
  8. At run time, loader creates the virtual address space for the new process, maps the executable module into it, and parses the executable module's import section, mapping each required DLL into the address space (and recursively doing the same for each DLL's own import section)

Building DLL Module

C++ classes can be exported only if the modules importing the class are compiled with a compiler from the same vendor, because each vendor mangles names differently. Do not export C++ classes unless you know the executable module is built with the same developer tools.

Header file of DLL contains all the variables and functions DLL wants to export along with any symbols/data structures used with the exported functions/variables. 

Use single header file to include in both executable and DLL source code. Avoid exporting variables because this breaks the abstraction barrier. 

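In Mylib.h:
#ifdef MYLIBAPI
// MYLIBAPI was already defined by the DLL's source code, so the functions/variables declared below are being exported.
#else
// Included from an executable's source code: tells the compiler that the variables/functions are imported from some DLL module
#define MYLIBAPI extern "C" __declspec(dllimport)
#endif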


In MyLib.cpp:
// Tells the compiler that the variables, functions, and C++ classes declared with MYLIBAPI will be exported from the resulting DLL module
#define MYLIBAPI extern "C" __declspec(dllexport)
// The define must come before the include. Technically the MYLIBAPI modifier is not necessary on the definitions in this file, because the compiler remembers the symbols to export when it parses the header file
#include "Mylib.h"


Use the extern "C" modifier only when writing C++ code. C++ compilers mangle function and variable names, which causes linker problems when the importing module is written in C or built with another vendor's tools.

extern "C" tells compiler not to mangle the variable or function names, which makes the variable/functions accessible to modules written in C, C++, or other programming languages. 

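Don't define MYLIBAPI before including the header file in the executable's source code. When MYLIBAPI is undefined, the header defines it as __declspec(dllimport), so the compiler knows the symbols are imported; defining it first would mark them with the wrong modifier.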
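As a minimal sketch of how the two files fit together, the block below folds the header and the DLL source into one translation unit. The Add function is a hypothetical export chosen for illustration, and the _WIN32 guard with the MYLIB_EXPORT/MYLIB_IMPORT helper macros is an assumption added so the sketch also compiles with toolchains that lack __declspec.

```cpp
// Portability shim (an assumption, not part of the original header):
// __declspec is Microsoft-specific, so expand to nothing elsewhere.
#if defined(_WIN32)
  #define MYLIB_EXPORT __declspec(dllexport)
  #define MYLIB_IMPORT __declspec(dllimport)
#else
  #define MYLIB_EXPORT
  #define MYLIB_IMPORT
#endif

// ---- MyLib.cpp: define MYLIBAPI first, then pull in the header ----
#define MYLIBAPI extern "C" MYLIB_EXPORT

// ---- Mylib.h, as the compiler sees it after the define above ----
#ifdef MYLIBAPI
// Building the DLL: the declarations below are exports.
#else
// Building an executable: the declarations below are imports.
#define MYLIBAPI extern "C" MYLIB_IMPORT
#endif

MYLIBAPI int Add(int nLeft, int nRight);  // hypothetical exported function

// ---- back in MyLib.cpp: the definition needs no MYLIBAPI modifier ----
int Add(int nLeft, int nRight) {
    return nLeft + nRight;
}
```

An executable's source file would simply include Mylib.h without defining MYLIBAPI and then call Add(2, 3) like any other function.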

The __declspec(dllexport) causes compiler to embed additional info in .obj file. When DLL is linked, linker detects information about exported symbol and automatically makes a .lib file. 

The .lib file is required to link any executable module that references the DLL's exported symbols. 
Linker also embeds a table of exported symbols and their relative virtual addresses (RVA) in the DLL file.

RVA identifies offsets in the DLL file image to where the exported symbol can be found. 

MSFT wants you to link using the symbol's name even though you can technically link by ordinal. For system DLLs, MSFT only guarantees that linking by name will work; linking by ordinal is not guaranteed.

DLLs for Use with Non-Visual C++

MSFT C compiler mangles C functions when your function uses __stdcall (WINAPI) calling convention. 

When you export a __stdcall function, MSFT compiler mangles the function name by prepending a _ and appending a @ followed by the number of bytes that are passed to the function as parameters.

Example:

__declspec(dllexport) LONG __stdcall MyFunc(int a, int b); 

This function will be exported as _MyFunc@8
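The decoration rule can be sketched as a small helper. DecorateStdcall is a hypothetical name, not a compiler or Windows API, and the sketch assumes the 32-bit x86 convention in which every parameter occupies a multiple of 4 bytes on the stack.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>

// Builds the decorated name for an extern "C" __stdcall function:
// '_' + name + '@' + total parameter bytes.
// Assumption: each parameter is rounded up to a 4-byte stack slot (32-bit x86).
std::string DecorateStdcall(const std::string& name,
                            std::initializer_list<std::size_t> paramSizes) {
    std::size_t totalBytes = 0;
    for (std::size_t size : paramSizes) {
        totalBytes += (size + 3) / 4 * 4;  // round up to the next multiple of 4
    }
    return "_" + name + "@" + std::to_string(totalBytes);
}
```

For the MyFunc prototype above, two 4-byte int parameters give DecorateStdcall("MyFunc", {4, 4}), which produces _MyFunc@8.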

Building an executable with another vendor's tools and attempting to link to MyFunc will fail, because that linker looks for a function named MyFunc, which the DLL does not export.

To tell MSFT compiler to not export mangled names:

Create a .def file with an EXPORTS section that lists MyFunc, which tells linker to export the function under the name given in the .def file

Or, in the DLL source code modules, add a linker directive to tell linker to export a MyFunc with the same entry point as _MyFunc@8. 
#pragma comment(linker, "/export:MyFunc=_MyFunc@8")

Building Executable Module

When including the DLL header, the __declspec(dllimport) tells compiler that some symbols are imported from some DLL module. 

Linker combines .obj modules and determines which DLL contains imported symbols. User must pass DLL's .lib file (the list of exports) to the linker, so that the linker can figure out where the referenced symbol is and which DLL module contains the symbol. 

When linker resolves import symbols, it embeds import section in the executable module's image, which lists the DLLs required and the symbols referenced from each DLL. 

If a symbol is bound, a memory address appears next to that import in the executable module's import section.

Running Executable Module

Import section does not contain the pathname of the DLL, so loader searches for the DLL. 

Search Order:
  1. Directory of the executable image file
  2. Windows system directory (returned by GetSystemDirectory)
  3. 16-bit system directory (the System subfolder of the Windows directory)
  4. Windows directory (returned by GetWindowsDirectory)
  5. Process' current directory
  6. Directories listed in the PATH environment variable
The search order can be changed by a registry value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager
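The first-match behavior of such a search can be sketched as below. FindDll and the fileExists callback are hypothetical helpers for illustration only; the real loader does this internally, and the caller here is assumed to supply the directories already in search order.

```cpp
#include <functional>
#include <string>
#include <vector>

// Returns the first "<directory>\<dllName>" for which fileExists reports true,
// or an empty string when no directory in the list contains the DLL.
std::string FindDll(const std::string& dllName,
                    const std::vector<std::string>& searchDirs,
                    const std::function<bool(const std::string&)>& fileExists) {
    for (const std::string& dir : searchDirs) {
        const std::string candidate = dir + "\\" + dllName;
        if (fileExists(candidate)) {
            return candidate;  // directories earlier in the list win
        }
    }
    return std::string();
}
```

Because the search stops at the first hit, a DLL with the same name in an earlier directory shadows any copy that sits later in the list.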


Loader only maps a module once regardless of how many other modules need that module. 
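A rough model of this recursive, map-once behavior is sketched below. ImportGraph and MapModule are hypothetical names; the graph stands in for the import sections the loader would read from each image file.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the import sections on disk:
// module name -> names of the DLLs it imports from.
using ImportGraph = std::map<std::string, std::vector<std::string>>;

// Maps `module` and, recursively, everything it imports. The `mapped` set
// ensures each module is mapped once, even when several modules need it.
void MapModule(const std::string& module, const ImportGraph& imports,
               std::set<std::string>& mapped,
               std::vector<std::string>& loadOrder) {
    if (!mapped.insert(module).second) {
        return;  // already mapped into the address space
    }
    loadOrder.push_back(module);
    const auto it = imports.find(module);
    if (it == imports.end()) {
        return;  // this module imports nothing
    }
    for (const std::string& dependency : it->second) {
        MapModule(dependency, imports, mapped, loadOrder);
    }
}
```

If App.exe imports A.dll and B.dll, and both of those import Common.dll, the sketch records Common.dll only once in loadOrder.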

When loader finishes mapping DLL modules into the process's address space, it fixes up references to imported symbols:
For each symbol listed, loader examines DLL's export section to make sure symbol exists.
Loader then retrieves RVA of symbol and adds it to the base address at which the DLL module is loaded (in the process) and saves the virtual address in the executable's import section.
When code references imported symbol, it will look in the calling module's import section, and retrieve the address of the imported symbol. 
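The fix-up arithmetic can be illustrated with a small model. LoadedDll and ResolveImport are hypothetical names, and the export table is simplified to a name-to-RVA map rather than the real on-disk layout.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical model of one loaded DLL: its base address in the process
// and a simplified export table (symbol name -> RVA).
struct LoadedDll {
    std::uint64_t baseAddress;
    std::map<std::string, std::uint32_t> exports;
};

// Resolves one imported symbol: confirm it exists in the export table,
// then add its RVA to the base address at which the DLL is loaded.
std::uint64_t ResolveImport(const LoadedDll& dll, const std::string& symbol) {
    const auto it = dll.exports.find(symbol);
    if (it == dll.exports.end()) {
        throw std::runtime_error("symbol not exported: " + symbol);
    }
    return dll.baseAddress + it->second;
}
```

With a DLL loaded at 0x10000000 that exports MyFunc at RVA 0x1020, the resolved virtual address is 0x10001020; that is the value the loader would store in the executable's import section.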

To improve application load time, rebase and bind executable and DLL modules. 

Sources

Windows via C/C++ - J.Richter 2008
