Wednesday, October 16, 2019

Memory-Mapped Files

Physical storage for a region reserved through a memory-mapped file comes from the file itself rather than the system's paging file. Once a file has been mapped, it can be accessed as if the whole file were loaded in memory.

Uses of memory-mapped files:
  1. Load and execute PE files
  2. Access a data file on disk without performing file I/O or buffering the file's contents
  3. Interprocess communication between processes on the same machine (other single-machine IPC mechanisms are implemented using memory-mapped files anyway)

Memory-Mapped Executables and DLLs

System reserves a region of address space. That location is specified in the linker's /BASE option (defaults to 0x00400000 for 32-bit applications)

Physical storage backing reserved region is the .exe file on disk (as opposed to the system's paging file)

System recursively calls LoadLibrary on each of the DLLs that the process says it needs (as well as each DLL's dependencies). 

The DLL specifies its preferred image base (usually 0x10000000 or 0x00400000). Override this option using linker's /BASE option. Standard system DLLs specify different base addresses to avoid conflicting when loaded into a single address space. 

If system cannot load DLL at preferred base, then it uses the relocation information to load it somewhere else. If relocation information was removed (using the linker's /FIXED switch), then system will not be able to load the DLL. 

If system needs to relocate DLL, then system notes that some of the physical storage for DLL is mapped to the paging file.

Static Data is Not Shared by Multiple Instances of an Executable or DLL

Memory-mapped files allow multiple running instances of the same application to share the same code and data in RAM. When a second instance of an app is run, system maps the pages of VA containing the file's code and data into the second app's address space. 

Use GetSystemInfo to determine the page size (dwPageSize member of SYSTEM_INFO).

Copy-on-write allows an instance to modify global/static data without changing the global data for all other instances. 

System commits storage in the paging file for all pages that are protected with copy-on-write when the process is first loaded. 

Sharing Data Across Multiple Instances

Compiler places all code in .text, uninitialized data in .bss, and initialized data in .data. Each section marked with some combination of READ, WRITE, EXECUTE, SHARED. Sections marked SHARED are shared across multiple instances (no copy-on-write). 

Use #pragma data_seg("mysection") to create user-defined section.

Initialized data can be placed in the Shared section. If the data is not initialized, the compiler will not place it in the Shared section unless it is declared with the allocate declaration specifier. 

#pragma data_seg("Shared")
LONG g_lInstanceCount = 0;
#pragma data_seg()

To place uninitialized data in the Shared section:
__declspec(allocate("Shared")) int mydata; 

Make sure the section is first created before __declspec-ing it. 

On the linker's command line, pass /SECTION:Shared,RWS (no space after the comma), which gives the section named "Shared" the Read, Write, and Shared attributes. 

Can embed the linker switch in the source:
#pragma comment(linker, "/SECTION:Shared,RWS")

This embeds the string in the .drectve section of the generated .obj file, which makes it seem as if the string were passed as a command line argument. 

MSFT recommends not using Shared sections since anyone can load any DLL and have access to the shared memory. 

Maybe use this method instead of using a MUTEX if okay with dropping files to disk.

Memory-Mapped Data Files

To deal with large streams of data, can memory map data files to process' address space.

Open the file and tell system to reserve a region of VA space. Map the first byte of the file to the first byte of the reserved region. Then access the region of virtual memory as if it contained the file. If the file ends with a single 0 byte, the data can be treated as if it were an in-memory text string. (An interruption partway through an in-place operation can leave the file's data corrupted.)

Using Memory-Mapped Files

A memory mapping is also known as a "section". Use ProcessExplorer to see sections. 

Three steps:
1. Create/open file kernel object identifying the file on disk to be used as a memory-mapped file
2. Create file-mapping kernel object 
3. Tell system to map all or part of the file-mapping into process' address space

Clean up:
1. Unmap the file-mapping kernel object from process' address space
2. Close file-mapping kernel object
3. Close file kernel object
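
The three steps and the matching cleanup can be sketched in C roughly as follows. This is a minimal illustration, not code from the book: the function name count_zero_bytes is made up, the file is mapped read-only in a single view (which can fail for very large files in a 32-bit process), and an empty file is treated as a failure because a zero-length mapping cannot be created. The #ifdef _WIN32 guard only keeps the file compilable on other platforms.

```c
#ifdef _WIN32
#include <windows.h>

/* Counts the zero bytes in a file by mapping it read-only.
   Returns -1 on failure (including an empty file). Name is illustrative. */
long long count_zero_bytes(const char *path)
{
    long long count = -1;

    /* Step 1: file kernel object identifying the file on disk. */
    HANDLE hFile = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return -1;

    LARGE_INTEGER size;
    if (GetFileSizeEx(hFile, &size) && size.QuadPart > 0) {
        /* Step 2: file-mapping kernel object; 0,0 means "current file size".
           Note that failure is reported as NULL, not INVALID_HANDLE_VALUE. */
        HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
        if (hMap != NULL) {
            /* Step 3: map a view of the whole file into the address space. */
            const unsigned char *p = (const unsigned char *)
                MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
            if (p != NULL) {
                count = 0;
                for (LONGLONG i = 0; i < size.QuadPart; i++)
                    if (p[i] == 0)
                        count++;
                UnmapViewOfFile(p);   /* Cleanup 1: unmap the view. */
            }
            CloseHandle(hMap);        /* Cleanup 2: close the file mapping. */
        }
    }
    CloseHandle(hFile);               /* Cleanup 3: close the file. */
    return count;
}
#endif
```

The cleanup calls run in the reverse order of the setup calls, which mirrors the two lists above.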

CreateFile tells the OS the location of the file mapping's physical storage. The pathname indicates the exact location of physical storage. 

File-Sharing Modes

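Flags that specify how other attempts to open the file are treated while this handle is open.
0 - all other attempts to open the file fail
FILE_SHARE_READ - others may open the file for reading; attempts to open using GENERIC_WRITE fail
FILE_SHARE_WRITE - others may open the file for writing; attempts to open using GENERIC_READ fail
FILE_SHARE_READ | FILE_SHARE_WRITE - other attempts to open the file for reading or writing succeed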

CreateFileMapping

Similar to reserving address space and then committing physical storage to the region. The only difference is that the physical storage comes from the file itself rather than the paging file.
User must request permissions to the file-mapping object. 
For example, specifying PAGE_READONLY means that at least GENERIC_READ must have been passed to CreateFile. 

Section Attributes

SEC_NOCACHE - update the file's data on disk as you write data to the file
SEC_IMAGE - tells system to map the file with the page protections of a PE file (so PAGE_EXECUTE_READ for .text and PAGE_READWRITE for .data)
SEC_RESERVE and SEC_COMMIT apply only to mappings backed by the paging file; CreateFileMapping ignores them when creating a memory-mapped data file. 

File Mapping Size

Specify the maximum size of the file mapping: 

dwMaximumSizeHigh - 32-bit value that expresses the high 32 bits of the size (should always be 0 if the file is less than 4GB)
dwMaximumSizeLow - 32 bit value that expresses the low 32 bits of the size

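Passing 0 to both params will create a file-mapping object that reflects the current size of the file. This is okay if you are only going to read from the file. Do not do this if the current file on disk is 0 bytes: that requests a mapping with 0 bytes of storage, which is an error, and CreateFileMapping returns NULL. 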
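
To make the high/low split concrete, here is a small helper that breaks a 64-bit size into the two 32-bit values. The name split_size is hypothetical; the results are what would be passed as dwMaximumSizeHigh and dwMaximumSizeLow.

```c
#include <stdint.h>

/* Splits a 64-bit size into the two 32-bit halves that CreateFileMapping
   takes as dwMaximumSizeHigh and dwMaximumSizeLow. */
void split_size(uint64_t size, uint32_t *high, uint32_t *low)
{
    *high = (uint32_t)(size >> 32);          /* upper 32 bits */
    *low  = (uint32_t)(size & 0xFFFFFFFFu);  /* lower 32 bits */
}
```

For any file smaller than 4GB the high half comes out as 0, which matches the note on dwMaximumSizeHigh above.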

pszName

Zero-terminated string assigned to the file-mapping object. The string allows sharing of the object with other processes. 

MapViewOfFile

Tells the system to reserve the region of address space for the file's data and commit the file's data as physical storage mapped to the region. 

Don't have to map the entire file into the process' address space at once. 
Map a small portion of the file - a view. 

Specify which byte in the data file should be mapped as the first byte in the view: 
dwFileOffsetHigh - high 32 bits
dwFileOffsetLow -  low 32 bits
Offset must be a multiple of system's allocation granularity. 

Specify how much of the file to map (same as specifying how large a region to reserve)
dwNumberOfBytesToMap (0 will map a view starting at specified offset to the end of the file) 
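
Because the offset must be a multiple of the allocation granularity, reading from an arbitrary byte position means rounding the offset down and skipping the difference inside the view. A sketch of that arithmetic, assuming the granularity is a power of two (in real code it would come from the dwAllocationGranularity member filled in by GetSystemInfo); ViewParams and view_params are names invented for this example:

```c
#include <stdint.h>

typedef struct {
    uint64_t aligned_offset; /* split into dwFileOffsetHigh / dwFileOffsetLow */
    uint64_t delta;          /* add to the view's base address to reach the data */
    uint64_t bytes_to_map;   /* pass as dwNumberOfBytesToMap */
} ViewParams;

/* Computes MapViewOfFile parameters for reading `length` bytes starting at
   file position `offset`. `granularity` must be a power of two. */
ViewParams view_params(uint64_t offset, uint64_t length, uint64_t granularity)
{
    ViewParams v;
    v.aligned_offset = offset & ~(granularity - 1); /* round down */
    v.delta          = offset - v.aligned_offset;   /* bytes to skip in the view */
    v.bytes_to_map   = v.delta + length;            /* view must cover the skip too */
    return v;
}
```

With a 64 KB granularity, asking for 100 bytes at file offset 70000 maps 4564 bytes starting at offset 65536, and the wanted data begins 4464 bytes into the view.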

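Specifying FILE_MAP_COPY makes the view copy-on-write: the system commits physical storage from the system's paging file for the view, and writes go to private copies of the pages rather than to the file.

When system makes a copy of the original page, the copy has PAGE_READWRITE protection (as opposed to PAGE_WRITECOPY).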

UnmapViewOfFile

To unmap the file's data, pass the base address that was returned from MapViewOfFile.

To ensure updates written to disk, call FlushViewOfFile to force system to write a portion or all modified data to the disk image. This is useful if the memory-mapped file is over a network. Use FILE_FLAG_WRITE_THROUGH in CreateFile to ensure that server writes the file's data (and doesn't just cache it) when calling FlushViewOfFile. 

MoveMemory - changes made through a FILE_MAP_COPY view are never written to the file. To save a changed page, copy it with MoveMemory from the first view to a second view of the file mapped with PAGE_READWRITE. 

Make sure to close all open handles. 

Coherence

As long as we are mapping the same file-mapping object, the view data is guaranteed to be coherent meaning that if one application alters the contents of the file in one view, all other views are updated. 

Windows does not guarantee that views of different file-mapping objects backed by a single data file are coherent - only that multiple views of a single file-mapping object are coherent. 

When calling CreateFile for files that will be memory mapped, specify 0 in dwShareMode to prevent other processes from opening it. 

Do not use memory-mapped files to share writable files over a network because system can't guarantee coherent view of data. 

Specifying the Base Address of Memory-Mapped File

MapViewOfFileEx to suggest that file be mapped into a particular address in the process' VA space. 
Specify the address in pvBaseAddress parameter. Useful for using memory-mapped files to share data with other processes. 

Using Memory-Mapped Files to Share Data Among Processes

IPC mechanisms such as RPC, COM, OLE, DDE, window messages (WM_COPYDATA), Clipboard, mailslots, pipes, sockets, etc. all end up using memory-mapped files when the communicating processes are on the same machine :) 

Have the two processes that want to share data map views of the same file-mapping object. This makes them share the same pages of physical storage. (Remember to use the exact same name for the file-mapping object). 

When system starts an application:
1. Calls CreateFile to open the .exe file on disk
2. Calls CreateFileMapping with the SEC_IMAGE flag to create the file-mapping object
3. Calls MapViewOfFileEx on behalf of the new process to map the image at its preferred base address
4. Creates the primary thread and points it at the first byte of executable code in the mapped view

A second instance of the same application doesn't create a new file object or file-mapping object. The system simply maps a view of the file a second time in the context of the new process' address space. 

Memory-Mapped Files Backed by Paging File

For creating memory-mapped files backed by system's paging file rather than hard disk file. 

Call CreateFileMapping and pass INVALID_HANDLE_VALUE into hFile. System will create file-mapping object and commit physical storage from paging file. 

To share with other process, specify name in pszName. Other process can use OpenFileMapping to access the storage. 

Call CloseHandle when done. 

When file-mapping object is destroyed, data written to the file-mapping storage is destroyed by the system. 
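
A rough sketch of the paging-file-backed case. To keep it in one function, the same process plays both roles: it creates the named mapping, then opens it again by name the way a second process would. The object name "Local\\DemoSharedInt" and the function name are made up for the example, and -1 doubles as the failure code, so -1 itself is not a useful test value. The #ifdef _WIN32 guard only keeps the file compilable on other platforms.

```c
#ifdef _WIN32
#include <windows.h>

/* Writes `value` through one view and reads it back through a second view
   obtained by name. Returns the value read, or -1 on failure. */
int roundtrip_through_named_mapping(int value)
{
    const char *name = "Local\\DemoSharedInt"; /* hypothetical object name */
    int result = -1;

    /* INVALID_HANDLE_VALUE as hFile: storage comes from the paging file. */
    HANDLE hMap = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                     0, sizeof(int), name);
    if (hMap == NULL)
        return -1;

    int *writer = (int *)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0,
                                       sizeof(int));

    /* A second process would do this part, using the exact same name. */
    HANDLE hOpen = OpenFileMappingA(FILE_MAP_READ, FALSE, name);
    const int *reader = NULL;
    if (hOpen != NULL)
        reader = (const int *)MapViewOfFile(hOpen, FILE_MAP_READ, 0, 0,
                                            sizeof(int));

    if (writer != NULL && reader != NULL) {
        *writer = value;   /* write through the first view */
        result = *reader;  /* both views share the same physical pages */
    }

    if (reader != NULL) UnmapViewOfFile(reader);
    if (hOpen != NULL)  CloseHandle(hOpen);
    if (writer != NULL) UnmapViewOfFile(writer);
    CloseHandle(hMap);     /* last handle closed: the storage is destroyed */
    return result;
}
#endif
```

Once every view is unmapped and every handle is closed, nothing written to the mapping survives, which is the behaviour described above.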

SEC_RESERVE and SEC_COMMIT

These flags only applicable to file-mapping objects backed by paging file. They allow the creation of a file-mapping object without having to commit all the physical storage upfront. 

SEC_COMMIT commits storage from system paging file. 

SEC_RESERVE does not commit physical storage from the paging file. Calling MapViewOfFile(Ex) then reserves a region but does not commit physical storage to back the region. An attempt to access a memory address in the reserved region raises an access violation. 

Other threads can use the same file-mapping object to map a view of the same region. 

Use VirtualAlloc to commit physical storage to the region. All other processes that have mapped a view of the same file-mapping object can now successfully access the committed pages. 
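
The SEC_RESERVE flow can be sketched as below: reserve a 1 MB view, commit only its first page with VirtualAlloc, then touch that page. The function name and the 1 MB size are arbitrary choices for illustration, and the #ifdef _WIN32 guard only keeps the file compilable on other platforms.

```c
#ifdef _WIN32
#include <windows.h>

/* Reserves a 1 MB paging-file-backed view, commits only its first page,
   writes to it, and returns the value read back (or -1 on failure). */
int sparse_commit_demo(void)
{
    const DWORD kSize = 1024 * 1024;
    int result = -1;

    /* SEC_RESERVE: create the mapping without committing any storage. */
    HANDLE hMap = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL,
                                     PAGE_READWRITE | SEC_RESERVE,
                                     0, kSize, NULL);
    if (hMap == NULL)
        return -1;

    /* The view is reserved address space only; touching it now would
       raise an access violation. */
    char *base = (char *)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, kSize);
    if (base != NULL) {
        SYSTEM_INFO si;
        GetSystemInfo(&si);

        /* Commit physical storage for the first page of the view. */
        if (VirtualAlloc(base, si.dwPageSize, MEM_COMMIT,
                         PAGE_READWRITE) != NULL) {
            base[0] = 7;       /* safe: this page is now committed */
            result = base[0];
        }
        UnmapViewOfFile(base);
    }
    CloseHandle(hMap);
    return result;
}
#endif
```

Only the committed page consumes paging-file storage; the rest of the 1 MB view stays reserved until some thread commits more of it.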

Sources:

Jeffrey Richter, Christophe Nasarre - Windows via C/C++ 2008 - Chapter 17
