| 1 | $Id: Fork.os2 1281 2004-03-06 21:48:26Z bird $ | 
|---|
| 2 |  | 
|---|
| 3 | Fork Design Draft | 
|---|
| 4 | -------------------- | 
|---|
| 5 |  | 
|---|
| 6 | 1.0 Intro | 
|---|
| 7 | ---------- | 
|---|
| 8 |  | 
|---|
| 9 | blah. | 
|---|
| 10 |  | 
|---|
| 11 |  | 
|---|
| 12 | 1.1 The SuS fork() Description | 
|---|
| 13 | ------------------------------ | 
|---|
| 14 |  | 
|---|
| 15 | NAME | 
|---|
| 16 |  | 
|---|
| 17 | fork - create a new process | 
|---|
| 18 |  | 
|---|
| 19 | SYNOPSIS | 
|---|
| 20 |  | 
|---|
| 21 | #include <unistd.h> | 
|---|
| 22 |  | 
|---|
| 23 | pid_t fork(void); | 
|---|
| 24 |  | 
|---|
| 25 | DESCRIPTION | 
|---|
| 26 |  | 
|---|
| 27 | The fork() function shall create a new process. The new process (child process) shall be an exact copy of the calling process (parent process) except as detailed below: | 
|---|
| 28 |  | 
|---|
| 29 | * The child process shall have a unique process ID. | 
|---|
| 30 | * The child process ID also shall not match any active process | 
|---|
| 31 | group ID. | 
|---|
| 32 | * The child process shall have a different parent process ID, | 
|---|
| 33 | which shall be the process ID of the calling process. | 
|---|
| 34 | * The child process shall have its own copy of the parent's file | 
|---|
| 35 | descriptors. Each of the child's file descriptors shall refer | 
|---|
| 36 | to the same open file description with the corresponding file | 
|---|
| 37 | descriptor of the parent. | 
|---|
| 38 | * The child process shall have its own copy of the parent's open | 
|---|
| 39 | directory streams. Each open directory stream in the child process | 
|---|
| 40 | may share directory stream positioning with the corresponding | 
|---|
| 41 | directory stream of the parent. | 
|---|
| 42 | * [XSI] The child process shall have its own copy of the parent's | 
|---|
| 43 | message catalog descriptors. | 
|---|
| 44 | * The child process' values of tms_utime, tms_stime, tms_cutime, and | 
|---|
| 45 | tms_cstime shall be set to 0. | 
|---|
| 46 | * The time left until an alarm clock signal shall be reset to zero, | 
|---|
| 47 | and the alarm, if any, shall be canceled; see alarm() . | 
|---|
| 48 | * [XSI] All semadj values shall be cleared. | 
|---|
| 49 | * File locks set by the parent process shall not be inherited by | 
|---|
| 50 | the child process. | 
|---|
| 51 | * The set of signals pending for the child process shall be | 
|---|
| 52 | initialized to the empty set. | 
|---|
| 53 | * [XSI] Interval timers shall be reset in the child process. | 
|---|
| 54 | * [SEM] Any semaphores that are open in the parent process shall | 
|---|
| 55 | also be open in the child process. | 
|---|
| 56 | * [ML] The child process shall not inherit any address space memory | 
|---|
| 57 | locks established by the parent process via calls to mlockall() | 
|---|
| 58 | or mlock(). | 
|---|
| 59 | * [MF|SHM] Memory mappings created in the parent shall be retained | 
|---|
| 60 | in the child process. MAP_PRIVATE mappings inherited from the | 
|---|
| 61 | parent shall also be MAP_PRIVATE mappings in the child, and any | 
|---|
| 62 | modifications to the data in these mappings made by the parent | 
|---|
| 63 | prior to calling fork() shall be visible to the child. Any | 
|---|
| 64 | modifications to the data in MAP_PRIVATE mappings made by the | 
|---|
| 65 | parent after fork() returns shall be visible only to the parent. | 
|---|
| 66 | Modifications to the data in MAP_PRIVATE mappings made by the | 
|---|
| 67 | child shall be visible only to the child. | 
|---|
| 68 | * [PS] For the SCHED_FIFO and SCHED_RR scheduling policies, the | 
|---|
| 69 | child process shall inherit the policy and priority settings | 
|---|
| 70 | of the parent process during a fork() function. For other | 
|---|
| 71 | scheduling policies, the policy and priority settings on fork() | 
|---|
| 72 | are implementation-defined. | 
|---|
| 73 | * [TMR] Per-process timers created by the parent shall not be | 
|---|
| 74 | inherited by the child process. | 
|---|
| 75 | * [MSG] The child process shall have its own copy of the message | 
|---|
| 76 | queue descriptors of the parent. Each of the message descriptors | 
|---|
| 77 | of the child shall refer to the same open message queue | 
|---|
| 78 | description as the corresponding message descriptor of the parent. | 
|---|
| 79 | * [AIO] No asynchronous input or asynchronous output operations | 
|---|
| 80 | shall be inherited by the child process. | 
|---|
| 81 | * A process shall be created with a single thread. If a | 
|---|
| 82 | multi-threaded process calls fork(), the new process shall contain | 
|---|
| 83 | a replica of the calling thread and its entire address space, | 
|---|
| 84 | possibly including the states of mutexes and other resources. | 
|---|
| 85 | Consequently, to avoid errors, the child process may only execute | 
|---|
| 86 | async-signal-safe operations until such time as one of the exec | 
|---|
| 87 | functions is called. [THR]  Fork handlers may be established by | 
|---|
| 88 | means of the pthread_atfork() function in order to maintain | 
|---|
| 89 | application invariants across fork() calls. | 
|---|
| 90 |  | 
|---|
| 91 | When the application calls fork() from a signal handler and any of | 
|---|
| 92 | the fork handlers registered by pthread_atfork() calls a function | 
|---|
| 93 | that is not asynch-signal-safe, the behavior is undefined. | 
|---|
| 94 | * [TRC TRI] If the Trace option and the Trace Inherit option are | 
|---|
| 95 | both supported: | 
|---|
| 96 | If the calling process was being traced in a trace stream that | 
|---|
| 97 | had its inheritance policy set to POSIX_TRACE_INHERITED, the | 
|---|
| 98 | child process shall be traced into that trace stream, and the | 
|---|
| 99 | child process shall inherit the parent's mapping of trace event | 
|---|
| 100 | names to trace event type identifiers. If the trace stream in | 
|---|
| 101 | which the calling process was being traced had its inheritance | 
|---|
| 102 | policy set to POSIX_TRACE_CLOSE_FOR_CHILD, the child process | 
|---|
| 103 | shall not be traced into that trace stream. The inheritance | 
|---|
| 104 | policy is set by a call to the posix_trace_attr_setinherited() | 
|---|
| 105 | function. | 
|---|
| 106 | * [TRC] If the Trace option is supported, but the Trace Inherit | 
|---|
| 107 | option is not supported: | 
|---|
| 108 | The child process shall not be traced into any of the trace | 
|---|
| 109 | streams of its parent process. | 
|---|
| 110 | * [TRC] If the Trace option is supported, the child process of | 
|---|
| 111 | a trace controller process shall not control the trace streams | 
|---|
| 112 | controlled by its parent process. | 
|---|
| 113 | * [CPT] The initial value of the CPU-time clock of the child | 
|---|
| 114 | process shall be set to zero. | 
|---|
| 115 | * [TCT] The initial value of the CPU-time clock of the single | 
|---|
| 116 | thread of the child process shall be set to zero. | 
|---|
| 117 |  | 
|---|
| 118 | All other process characteristics defined by IEEE Std 1003.1-2001 shall | 
|---|
| 119 | be the same in the parent and child processes. The inheritance of | 
|---|
| 120 | process characteristics not defined by IEEE Std 1003.1-2001 is | 
|---|
| 121 | unspecified by IEEE Std 1003.1-2001. | 
|---|
| 122 |  | 
|---|
| 123 | After fork(), both the parent and the child processes shall be capable | 
|---|
| 124 | of executing independently before either one terminates. | 
|---|
| 125 |  | 
|---|
| 126 | RETURN VALUE | 
|---|
| 127 |  | 
|---|
| 128 | Upon successful completion, fork() shall return 0 to the child process | 
|---|
| 129 | and shall return the process ID of the child process to the parent | 
|---|
| 130 | process. Both processes shall continue to execute from the fork() | 
|---|
| 131 | function. Otherwise, -1 shall be returned to the parent process, no | 
|---|
| 132 | child process shall be created, and errno shall be set to indicate | 
|---|
| 133 | the error. | 
|---|
| 134 |  | 
|---|
| 135 | ERRORS | 
|---|
| 136 |  | 
|---|
| 137 | The fork() function shall fail if: | 
|---|
| 138 |  | 
|---|
| 139 | [EAGAIN] | 
|---|
| 140 | The system lacked the necessary resources to create another | 
|---|
| 141 | process, or the system-imposed limit on the total number of | 
|---|
| 142 | processes under execution system-wide or by a single user | 
|---|
| 143 | {CHILD_MAX} would be exceeded. | 
|---|
| 144 |  | 
|---|
| 145 | The fork() function may fail if: | 
|---|
| 146 |  | 
|---|
| 147 | [ENOMEM] | 
|---|
| 148 | Insufficient storage space is available. | 
|---|
| 149 |  | 
|---|
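The RETURN VALUE and ERRORS clauses above boil down to a three-way branch on the value fork() returns. The following is a minimal sketch of that branch on a POSIX system; the helper name run_in_child() is made up for illustration and is not part of LIBC.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, let the child exit with exit_code, and
 * return the exit status the parent collects (or -1 on any error). */
int run_in_child(int exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;              /* no child was created; errno says why */
    if (pid == 0)
        _exit(exit_code);       /* child: fork() returned 0 */

    /* parent: fork() returned the process ID of the child */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child uses _exit() rather than exit() so that it does not run the parent's atexit handlers or flush stdio buffers it inherited.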
| 150 |  | 
|---|
| 151 |  | 
|---|
| 152 |  | 
|---|
| 153 | 2.0 Requirements and Assumptions Of The Implementation | 
|---|
| 154 | ------------------------------------------------------ | 
|---|
| 155 |  | 
|---|
| 156 | The Innotek LIBC fork() implementation will require the following features | 
|---|
| 157 | in LIBC to work: | 
|---|
| 158 | 1. A shared process management internal to LIBC for communication to the | 
|---|
| 159 | child that a fork() is in progress. | 
|---|
| 160 | 2. A very generalized and varied set of fork helper functions to achieve | 
|---|
| 161 | maximum flexibility of the implementation. | 
|---|
| 162 | 3. Extended versions of some memory related OS/2 APIs must be implemented. | 
|---|
| 163 |  | 
|---|
| 164 | The implementation will further make the following assumption about the | 
|---|
| 165 | operation of OS/2: | 
|---|
| 166 | 1. DosExecPgm will not return until all DLLs are initialized successfully. | 
|---|
| 167 | 2. DosQueryMemState() is broken if more than one page is specified. | 
|---|
| 168 | (no idea why/how/where it's broken, but testcase shows it is :/ ) | 
|---|
| 169 |  | 
|---|
| 170 |  | 
|---|
| 171 | 3.0 The Shared Process Management | 
|---|
| 172 | --------------------------------- | 
|---|
| 173 |  | 
|---|
| 174 | The fork() implementation requires a method for telling the child process | 
|---|
| 175 | that it's being forked and must take a very different startup route. For | 
|---|
| 176 | some other LIBC APIs there is a need for parent -> child and child -> parent | 
|---|
| 177 | information exchange. More specifically, the inheritance of sockets, | 
|---|
| 178 | signals, the different scheduler actions of a posix_spawn[p]() call, and | 
|---|
| 179 | possibly some process group stuff related to posix_spawn too if we get it | 
|---|
| 180 | figured out eventually. All of this is parent -> child during spawn/fork. A | 
|---|
| 181 | need also exists for child -> parent notification, and possibly exchange, at | 
|---|
| 182 | process termination. It might be necessary to reimplement the different | 
|---|
| 183 | wait APIs and implement SIGCHLD; it's likely that those tasks will make | 
|---|
| 184 | such demands. | 
|---|
| 185 |  | 
|---|
| 186 | The choice is now whether or not to make this shared process management | 
|---|
| 187 | specific to each LIBC version as a shared segment or try to make it | 
|---|
| 188 | survive normal LIBC updates. Making it specific has advantages in code | 
|---|
| 189 | size and memory footprint (no reserved fields), however it has certain | 
|---|
| 190 | disadvantages when LIBC is updated. The other option is to use a named | 
|---|
| 191 | shared memory object, defining the content with reserved space for later | 
|---|
| 192 | extensions so several versions of LIBC with more or fewer features | 
|---|
| 193 | implemented can share the memory space. | 
|---|
| 194 |  | 
|---|
| 195 | The latter option is preferred since it allows more applications to | 
|---|
| 196 | interoperate, it causes less shared memory waste, the shared memory | 
|---|
| 197 | can be located in high memory and it would be possible to fork | 
|---|
| 198 | processes using multiple versions of LIBC. | 
|---|
| 199 |  | 
|---|
| 200 | The shared memory shall be named \SHAREMEM\INNOTEKLIBC.V01. The version | 
|---|
| 201 | number is that of the shared memory layout and contents; it will | 
|---|
| 202 | only be increased when incompatible changes are made. | 
|---|
| 203 |  | 
|---|
| 204 | The shared memory shall be protected by a standard OS/2 mutex semaphore. | 
|---|
| 205 | It shall not use any fast R3 semaphore since the usage frequency is | 
|---|
| 206 | low and the result of a mess-up may be disastrous. Care must be taken to | 
|---|
| 207 | avoid creation races and owner-died scenarios. | 
|---|
| 208 |  | 
|---|
| 209 | The memory shall have a fixed size, since adding segments is very hard. | 
|---|
| 210 | Thus the size must be large enough to cope with a great deal of | 
|---|
| 211 | processes, while bearing in mind that OS/2 normally doesn't support more | 
|---|
| 212 | than 1000 processes, with a theoretical max of some 4000 (being the | 
|---|
| 213 | max thread count). A very simplistic allocation scheme will be | 
|---|
| 214 | implemented. Practically speaking a fixed block size pool would do fine | 
|---|
| 215 | for the process structure, while for the misc structures like socket | 
|---|
| 216 | lists a linked list based heap would do fine. | 
|---|
| 217 |  | 
|---|
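To make the "fixed block size pool" idea concrete, here is a minimal sketch of such a pool. Everything in it is an assumption made for illustration: the names (Pool, poolAlloc, poolFree), the tiny sizes, and the per-block used flag. A real pool would live in the named shared memory object, be sized for a thousand or more processes, and be accessed under the mutex described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_BLOCKS     8       /* hypothetical block count */
#define POOL_BLOCK_SIZE 64      /* hypothetical block size, reserved space included */

typedef struct Pool
{
    unsigned char   abBlocks[POOL_BLOCKS][POOL_BLOCK_SIZE];
    unsigned char   afUsed[POOL_BLOCKS];    /* 1 = block is allocated */
} Pool;

void poolInit(Pool *pPool)
{
    memset(pPool, 0, sizeof(*pPool));
}

/* Returns a zeroed block, or NULL when the pool is exhausted. */
void *poolAlloc(Pool *pPool)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++)
        if (!pPool->afUsed[i])
        {
            pPool->afUsed[i] = 1;
            memset(pPool->abBlocks[i], 0, POOL_BLOCK_SIZE);
            return pPool->abBlocks[i];
        }
    return NULL;
}

/* Returns 0 on success, -1 if pv is not an allocated block of this pool. */
int poolFree(Pool *pPool, void *pv)
{
    uintptr_t uBase = (uintptr_t)&pPool->abBlocks[0][0];
    uintptr_t u     = (uintptr_t)pv;
    if (u < uBase || u - uBase >= sizeof(pPool->abBlocks))
        return -1;                              /* outside the pool */
    if ((u - uBase) % POOL_BLOCK_SIZE)
        return -1;                              /* not a block start */
    size_t i = (u - uBase) / POOL_BLOCK_SIZE;
    if (!pPool->afUsed[i])
        return -1;                              /* double free */
    pPool->afUsed[i] = 0;
    return 0;
}
```

Zeroing the block on allocation matches the requirement that reserved space is all zero, so an older LIBC never hands out a block with stale data in fields a newer LIBC might interpret.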
| 218 | The process blocks shall be rounded up in size, adding a reasonable | 
|---|
| 219 | amount of space reserved for future extensions. Reserved space must be | 
|---|
| 220 | all zeroed. | 
|---|
| 221 |  | 
|---|
| 222 | The fork() specific members of the process block shall be a pointer to | 
|---|
| 223 | the shared memory object for the fork operation (the fork handle) and | 
|---|
| 224 | a list of forkable modules. The fork handle will itself contain | 
|---|
| 225 | information indicating whether or not another LIBC version has already | 
|---|
| 226 | started fork() handling in the child. The presence of the fork handle | 
|---|
| 227 | means that the child is being forked and normal dll init and startup | 
|---|
| 228 | will not be executed, but a registered callback will be called to do | 
|---|
| 229 | the forking of each module. (more details in section 4.0) | 
|---|
| 230 |  | 
|---|
| 231 | The parent shall before spawn, fork and exec (essentially before DosExecPgm | 
|---|
| 232 | or DosStartSession) create a process block for the child to be born and | 
|---|
| 233 | link it into an embryo list in the shared memory block. The child shall | 
|---|
| 234 | find its process block by searching the embryo list using the parent pid | 
|---|
| 235 | as key. All DosExecPgm and DosStartSession calls shall be serialized within | 
|---|
| 236 | one LIBC version. (If some empty headed programmer manages to link together | 
|---|
| 237 | a program which may end up using two or more LIBC versions and having two | 
|---|
| 238 | or more threads doing DosExecPgm at the very same time, well then he really | 
|---|
| 239 | deserves whatever trouble he gets! At least don't blame me!) | 
|---|
| 240 |  | 
|---|
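The embryo list handoff described above can be sketched as a singly linked list keyed on the parent pid. This is only an illustration: the ProcBlock layout and the function names are invented here, and the real list would sit in the shared memory block and be manipulated while holding its mutex.

```c
#include <stddef.h>

typedef struct ProcBlock
{
    int                 pid;        /* 0 while the block is still an embryo */
    int                 pidParent;  /* search key used by the child */
    struct ProcBlock   *pNext;
} ProcBlock;

/* Parent side: link a block for the child-to-be into the embryo list. */
void embryoAdd(ProcBlock **ppHead, ProcBlock *pBlock, int pidParent)
{
    pBlock->pid       = 0;
    pBlock->pidParent = pidParent;
    pBlock->pNext     = *ppHead;
    *ppHead           = pBlock;
}

/* Child side: find the block created by our parent, unlink it from the
 * embryo list and stamp it with our own pid. Returns NULL if not found. */
ProcBlock *embryoClaim(ProcBlock **ppHead, int pidParent, int pidSelf)
{
    for (ProcBlock **pp = ppHead; *pp; pp = &(*pp)->pNext)
        if ((*pp)->pidParent == pidParent)
        {
            ProcBlock *pBlock = *pp;
            *pp           = pBlock->pNext;
            pBlock->pNext = NULL;
            pBlock->pid   = pidSelf;
            return pBlock;
        }
    return NULL;
}
```

Using the parent pid as the only key works because the exec calls are serialized: at any moment a given parent has at most one embryo waiting to be claimed.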
| 241 | Process blocks shall have to stay around after the process terminated | 
|---|
| 242 | (for child -> parent term exchange), a cleanup mechanism will be invoked | 
|---|
| 243 | whenever a free memory threshold is reached. All processes will register | 
|---|
| 244 | exit list handlers to mark the process block as zombie (and later | 
|---|
| 245 | perhaps setting error codes and notifying waiters/child-listeners). | 
|---|
| 246 |  | 
|---|
| 247 |  | 
|---|
| 248 |  | 
|---|
| 249 | 4.0 The fork() Implementation | 
|---|
| 250 | ----------------------------- | 
|---|
| 251 |  | 
|---|
| 252 |  | 
|---|
| 253 | The implementation is based on a fork handle and a set of primitives. | 
|---|
| 254 | The fork handle is a pointer to a shared memory object allocated for the | 
|---|
| 255 | occasion and which will be freed before fork() returns. The primitives | 
|---|
| 256 | all operate on this handle and will be provided using a callback table | 
|---|
| 257 | in order to fully support multiple LIBC versions. | 
|---|
| 258 |  | 
|---|
| 259 |  | 
|---|
| 260 | 4.1 Forkable Executable and DLLs | 
|---|
| 261 | -------------------------------- | 
|---|
| 262 |  | 
|---|
| 263 | The support for fork() is an optional feature of LIBC. The default | 
|---|
| 264 | executable produced with LIBC and GCC is not forkable. The fork | 
|---|
| 265 | support will be based on registration of the DLLs and EXEs in their | 
|---|
| 266 | LIBC supplied startup code (crt0/dll0). A set of fork versions of these | 
|---|
| 267 | modules exist with the suffix 'fork.o'. | 
|---|
| 268 |  | 
|---|
| 269 | The big difference between the ordinary crt0/dll0 and the forkable | 
|---|
| 270 | crt0/dll0 is a per module structure, a call to register this, and the | 
|---|
| 271 | handling of the return code of that call. | 
|---|
| 272 |  | 
|---|
| 273 | The fork module structure: | 
|---|
| 274 | typedef struct __libc_ForkModule | 
|---|
| 275 | { | 
|---|
| 276 | /** Structure version. (Initially 'FMO1' as viewed in hex editor.) */ | 
|---|
| 277 | unsigned int    iMagic; | 
|---|
| 278 | /** Fork callback function */ | 
|---|
| 279 | int           (*pfnAtFork)(struct __libc_ForkModule *pModule, | 
|---|
| 280 | __LIBC_FORKHANDLE *pForkHandle, enum __LIBC_CALLBACKOPERATION enmOperation); | 
|---|
| 281 | /** Pointer to the _CRT_FORK_PARENT1 set vector. | 
|---|
| 282 | * It's formatted as {priority,callback}. */ | 
|---|
| 283 | void           *pvParentVector1; | 
|---|
| 284 | /** Pointer to the _CRT_FORK_CHILD1 set vector. | 
|---|
| 285 | * It's formatted as {priority,callback}. */ | 
|---|
| 286 | void           *pvChildVector1; | 
|---|
| 287 | /** Data segment base address. */ | 
|---|
| 288 | void           *pvDataSegBase; | 
|---|
| 289 | /** Data segment end address (exclusive). */ | 
|---|
| 290 | void           *pvDataSegEnd; | 
|---|
| 291 | /** Reserved - must be zero. */ | 
|---|
| 292 | int             iReserved1; | 
|---|
| 293 | } __LIBC_FORKMODULE, *__LIBC_PFORKMODULE; /* urg! conventions */ | 
|---|
| 294 |  | 
|---|
| 295 |  | 
|---|
| 296 | The fork callback function which crt0/dll0 references when initializing | 
|---|
| 297 | the fork module structure is called _atfork_callback. It takes the module | 
|---|
| 298 | structure, the fork handle, and an operation enum as arguments. LIBC will | 
|---|
| 299 | contain a default implementation of _atfork_callback() which simply | 
|---|
| 300 | duplicates the data segment, and processes the two set vectors | 
|---|
| 301 | (_CRT_FORK_*1). | 
|---|
| 302 |  | 
|---|
| 303 | crt0/dll0 will register the fork module structure and detect a forked | 
|---|
| 304 | child by calling __libc_ForkRegisterModule(). | 
|---|
| 305 |  | 
|---|
| 306 | Prototypes: | 
|---|
| 307 | /** | 
|---|
| 308 | * Register a forkable module. Called by crt0 and dll0. | 
|---|
| 309 | * | 
|---|
| 310 | * The call links pModule into the list of forkable modules | 
|---|
| 311 | * which is maintained in the process block. | 
|---|
| 312 | * | 
|---|
| 313 | * @returns 0 on normal process startup. | 
|---|
| 314 | * @returns 1 on forked child process startup. | 
|---|
| 315 | *          The caller should respond by not calling any _DLL_InitTerm | 
|---|
| 316 | *          or similar constructs. _atfork_callback() will take | 
|---|
| 317 | *          care of necessary module initialization. | 
|---|
| 318 | * @returns negative on failure. | 
|---|
| 319 | *          The caller should return from the dll init returning FALSE | 
|---|
| 320 | *          or DosExit in case of crt0. | 
|---|
| 321 | * @param   pModule     Pointer to the fork module structure for the | 
|---|
| 322 | *                      module which is to be registered. | 
|---|
| 323 | */ | 
|---|
| 324 | int __libc_ForkRegisterModule(__LIBC_FORKMODULE *pModule); | 
|---|
| 325 |  | 
|---|
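The three documented return values of __libc_ForkRegisterModule() map onto three startup paths in crt0/dll0. The sketch below only illustrates that mapping; the enum and the function startupActionFromRc() are invented for this example, and treating positive values other than 1 as failure is an assumption, since the draft does not define them.

```c
/* Hypothetical startup decisions for crt0/dll0. */
typedef enum StartupAction
{
    STARTUP_NORMAL_INIT,    /* rc == 0: ordinary process, run the usual init */
    STARTUP_SKIP_INIT,      /* rc == 1: forked child, skip _DLL_InitTerm etc. */
    STARTUP_FAIL            /* rc <  0: return FALSE (dll0) or DosExit (crt0) */
} StartupAction;

/* Maps the return code of __libc_ForkRegisterModule() to a startup path. */
StartupAction startupActionFromRc(int rc)
{
    if (rc == 0)
        return STARTUP_NORMAL_INIT;
    if (rc == 1)
        return STARTUP_SKIP_INIT;
    return STARTUP_FAIL;    /* negative is documented; other values are assumed failures */
}
```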
| 326 |  | 
|---|
| 327 |  | 
|---|
| 328 |  | 
|---|
| 329 |  | 
|---|
| 330 | 4.2 Fork Primitives | 
|---|
| 331 | ------------------- | 
|---|
| 332 |  | 
|---|
| 333 | These primitives are provided by the fork implementation in the fork | 
|---|
| 334 | handle structure. We define a set of these primitives now; if | 
|---|
| 335 | new ones are added later, their users must check that they are | 
|---|
| 336 | actually present. | 
|---|
| 337 |  | 
|---|
| 338 | Example: | 
|---|
| 339 | rc = pForkHandle->pOps->pfnDuplicatePages(pForkHandle, pModule->pvDataSegBase, pModule->pvDataSegEnd, __LIBC_FORK_ONLY_DIRTY); | 
|---|
| 340 | if (rc) | 
|---|
| 341 | return rc; /* failure */ | 
|---|
| 342 |  | 
|---|
| 343 | Prototypes: | 
|---|
| 344 | /** | 
|---|
| 345 | * Duplicate a number of pages from pvStart to pvEnd. | 
|---|
| 346 | * @returns 0 on success. | 
|---|
| 347 | * @returns appropriate non-zero error code on failure. | 
|---|
| 348 | * @param   pForkHandle Handle of the current fork operation. | 
|---|
| 349 | * @param   pvStart     Pointer to start of the pages. Rounded down. | 
|---|
| 350 | * @param   pvEnd       Pointer to end of the pages. Rounded up. | 
|---|
| 351 | * @param   fFlags      __LIBC_FORK_ONLY_DIRTY means checking whether the | 
|---|
| 352 | *                      pages are actually dirty before bothering touching | 
|---|
| 353 | *                      and copying them. (Using the partially broken | 
|---|
| 354 | *                      DosQueryMemState() API.) | 
|---|
| 355 | *                      __LIBC_FORK_ALL means not to bother checking, but | 
|---|
| 356 | *                      just go ahead copying all the pages. | 
|---|
| 357 | */ | 
|---|
| 358 | int pfnDuplicatePages(__LIBC_FORKHANDLE *pForkHandle, void *pvStart, void *pvEnd, unsigned fFlags); | 
|---|
| 359 |  | 
|---|
| 360 | /** | 
|---|
| 361 | * Invoke a function in the child process giving it a chunk of input. | 
|---|
| 362 | * The function is invoked the next time the fork buffer is flushed, | 
|---|
| 363 | * call pfnFlush() if the return code is desired. | 
|---|
| 364 | * | 
|---|
| 365 | * @returns 0 on success. | 
|---|
| 366 | * @returns appropriate non-zero error code on failure. | 
|---|
| 367 | * @param   pForkHandle Handle of the current fork operation. | 
|---|
| 368 | * @param   pfn         Pointer to the function to invoke in the child. | 
|---|
| 369 | *                      The function gets the fork handle, pointer to | 
|---|
| 370 | *                      the argument memory chunk and the size of that. | 
|---|
| 371 | *                      The function must return 0 on success, and non-zero | 
|---|
| 372 | *                      on failure. | 
|---|
| 373 | * @param   pvArg       Pointer to a block of memory of size cbArg containing | 
|---|
| 374 | *                      input to be copied to the child and given to pfn upon | 
|---|
| 375 | *                      invocation. | 
|---|
| 376 | */ | 
|---|
| 377 | int pfnInvoke(__LIBC_FORKHANDLE *pForkHandle, int (*pfn)(__LIBC_FORKHANDLE *pForkHandle, void *pvArg, size_t cbArg), void *pvArg, size_t cbArg); | 
|---|
| 378 |  | 
|---|
| 379 | /** | 
|---|
| 380 | * Flush the fork() buffer, meaning take whatever is in the fork buffer | 
|---|
| 381 | * and let the child process it. | 
|---|
| 382 | * This might be desired to get the result of a pfnInvoke() in a near | 
|---|
| 383 | * synchronous way. | 
|---|
| 384 | * @returns 0 on success. | 
|---|
| 385 | * @returns appropriate non-zero error code on failure. | 
|---|
| 386 | * @param   pForkHandle Handle of the current fork operation. | 
|---|
| 387 | */ | 
|---|
| 388 | int pfnFlush(__LIBC_FORKHANDLE *pForkHandle); | 
|---|
| 389 |  | 
|---|
| 390 | /** | 
|---|
| 391 | * Register a fork() completion callback. | 
|---|
| 392 | * | 
|---|
| 393 | * Use this primitive to do post fork() cleanup. | 
|---|
| 394 | * The callbacks are executed first in the child, then in the parent. | 
|---|
| 395 | * | 
|---|
| 396 | * @returns 0 on success. | 
|---|
| 397 | * @returns appropriate non-zero error code on failure. (Usually ENOMEM.) | 
|---|
| 398 | * @param   pForkHandle Handle of the current fork operation. | 
|---|
| 399 | * @param   pfnCallback Pointer to the function to call back. | 
|---|
| 400 | *                      This will be called when fork() is about to | 
|---|
| 401 | *                      complete (the fork() result is established so to | 
|---|
| 402 | *                      speak). A zero rc argument indicates success, | 
|---|
| 403 | *                      a non zero rc argument indicates failure. | 
|---|
| 404 | * @param   pvArg       Argument to pass to pfnCallback as 3rd argument. | 
|---|
| 405 | * @param   enmContext  __LIBC_FORKCTX_CHILD, __LIBC_FORKCTX_PARENT, or | 
|---|
| 406 | *                      __LIBC_FORKCTX_BOTH. | 
|---|
| 407 | *                      (mental note: check up the naming convention for enums!) | 
|---|
| 408 | * @remark  Use with care, the memory used to remember these is taken from the | 
|---|
| 409 | *          fork buffer. | 
|---|
| 410 | */ | 
|---|
| 411 | int pfnCompletionCallback(__LIBC_FORKHANDLE *pForkHandle, | 
|---|
| 412 | void (*pfnCallback)(__LIBC_FORKHANDLE *, int rc, void *pvArg), void *pvArg, | 
|---|
| 413 | __LIBC_PARENTCHILDCTX enmContext); | 
|---|
| 414 | ... | 
|---|
| 415 |  | 
|---|
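The draft says users must check that later-added primitives are actually present, but does not say how. One possible scheme, shown here purely as an assumption, is a size member at the start of the ops table combined with a NULL check. ForkOps, its cb member, pfnLater and the FORKOPS_HAS() macro are all hypothetical names.

```c
#include <stddef.h>

typedef struct ForkHandle ForkHandle;   /* opaque stand-in for __LIBC_FORKHANDLE */

/* Hypothetical ops table. The cb member is an assumption: the provider sets
 * it to the size of the table version it actually implements. */
typedef struct ForkOps
{
    size_t  cb;
    int   (*pfnFlush)(ForkHandle *pForkHandle);
    int   (*pfnLater)(ForkHandle *pForkHandle);    /* imagined later addition */
} ForkOps;

/* True if the provider's table is large enough to hold Member and it is set.
 * The size check comes first so Member is never read beyond the table. */
#define FORKOPS_HAS(pOps, Member) \
    (   (pOps)->cb >= offsetof(ForkOps, Member) + sizeof((pOps)->Member) \
     && (pOps)->Member != NULL)

/* Stub standing in for a real primitive. */
static int stubFlush(ForkHandle *pForkHandle)
{
    (void)pForkHandle;
    return 0;
}

/* Calls the later primitive if present, otherwise falls back to pfnFlush.
 * Returns -1 when neither is available. */
int forkFlushBest(const ForkOps *pOps, ForkHandle *pForkHandle)
{
    if (FORKOPS_HAS(pOps, pfnLater))
        return pOps->pfnLater(pForkHandle);
    if (FORKOPS_HAS(pOps, pfnFlush))
        return pOps->pfnFlush(pForkHandle);
    return -1;
}
```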
| 416 |  | 
|---|
| 417 |  | 
|---|
| 418 | 4.3 The Flow Of A fork() Operation | 
|---|
| 419 | ---------------------------------- | 
|---|
| 420 |  | 
|---|
| 421 | When a simple process foo.exe calls fork() the following events occur. | 
|---|
| 422 | ('p:' indicates the parent process while 'c:' indicates the child process.) | 
|---|
| 423 |  | 
|---|
| 424 | - p: fork() is called. It starts by pushing all registers (including fpu) | 
|---|
| 425 | onto the stack and recording the address of that stuff (in the | 
|---|
| 426 | fork handle when it's initialized). | 
|---|
| 427 | - p: fork() allocates the shared fork memory and initializes it, thus | 
|---|
| 428 | creating the fork handle. | 
|---|
| 429 | - p: fork() calls a helper for copying the memory allocation records | 
|---|
| 430 | to the fork buffer. (Those are FIFO and if there are too many for | 
|---|
| 431 | the fork buffer they should be completed after DosExecPgm returns.) | 
|---|
| 432 | - p: fork() walks the list of registered modules and calls the callback | 
|---|
| 433 | function asking if it's ok to fork now. | 
|---|
| 434 | Note: This will work for processes with multiple libc dlls because | 
|---|
| 435 | the list head is in the process block. | 
|---|
| 436 | - p: fork() allocates and initializes the process block of the child process | 
|---|
| 437 | entering the fork handle and linking it into the embryo list. | 
|---|
| 438 | - p: fork() takes the exec semaphore. | 
|---|
| 439 | - p: fork() spawns a child process taking the executable name from the PIB | 
|---|
| 440 | and giving it "!fork!" as argument. | 
|---|
| 441 | - c: During DosExecPgm all dll0(hi)fork's will be called, and for forkable | 
|---|
| 442 | modules __libc_ForkRegisterModule() is called and returns 1. The | 
|---|
| 443 | init code will then return successfully and not call the _DLL_InitTerm() | 
|---|
| 444 | or any other DLL init code. | 
|---|
| 445 | - c: __libc_ForkRegisterModule() will first check if the process block | 
|---|
| 446 | has been found (global LIBC pointer), and if not try to locate it | 
|---|
| 447 | or allocate a new one. It will then check the fork handle. | 
|---|
| 448 | Once found it will add the module to the list of forkable modules. | 
|---|
| 449 | If the fork handle member is not NULL a fork operation is in | 
|---|
| 450 | progress and the module callback is called to check if the | 
|---|
| 451 | module still thinks fork is ok and to give it a chance to do dllinit | 
|---|
| 452 | time preparations. | 
|---|
| 453 | The operation returns 1 if we're forking, 0 if we're not forking | 
|---|
| 454 | and -1 if we failed in some way and the dllinit should fail. | 
|---|
| 455 | - c: The first time the process block is found and it's forking, | 
|---|
| 456 | a helper for processing the memory allocations in the | 
|---|
| 457 | fork buffer is called. | 
|---|
| 458 | This *must* be done as early as possible!!! | 
|---|
| 459 | - p: The child has successfully initialized, DosExecPgm/DosStartSession | 
|---|
| 460 | returns NO_ERROR. | 
|---|
| 461 | - c: Child blocks while trying to get access to the fork handle (crt0). | 
|---|
| 462 | (Blocks on fork handle child event semaphore.) | 
|---|
| 463 | - p: Parent sets the operation enum member of the fork handle to signal | 
|---|
| 464 | an init operation. It resets the parent event sem member, releases | 
|---|
| 465 | the mutex, and goes to sleep for a defined max fork() timeout on the | 
|---|
| 466 | event sem. (The max could be, let's say, 30 seconds.) | 
|---|
| 467 | - c: The child gets the ownership of the fork handle mutex. | 
|---|
| 468 | - c: If the process is statically linked the memory allocations should | 
|---|
| 469 | be done now (meaning do it here if not done already). | 
|---|
| 470 | - c: The child resets the child event sem, posts the parent event sem | 
|---|
| 471 | and releases the mutex. | 
|---|
| 472 | - p: fork() acquires the fork handle ownership and checks the return code | 
|---|
| 473 | from the child. | 
|---|
| 474 | - p: fork() walks the list of modules calling the callbacks with the | 
|---|
| 475 | do parent fork operation. This means that each registered module will | 
|---|
| 476 | do what's necessary for replicating itself into the child process | 
|---|
| 477 | so that it will work as expected there after fork() returns. | 
|---|
| 478 | This is done using the primitives. This is the only time the | 
|---|
| 479 | parent can use the currently defined primitives. | 
|---|
| 480 | - p: Buffer flush is performed. This may happen multiple times, but | 
|---|
| 481 | fork() will always finish off by doing one after all callbacks | 
|---|
| 482 | have been called. | 
|---|
| 483 | The buffer flush means passing the current fork buffer content | 
|---|
| 484 | to the child for processing. pfnFlush() does this. | 
|---|
| 485 | - p: pfnFlush() sets the fork handle operation enum to | 
|---|
| 486 | process buffer, resets the parent event sem, signals the | 
|---|
| 487 | child event semaphore and releases the mutex. | 
|---|
| 488 | - c: Child takes ownership of the fork handle and performs | 
|---|
| 489 | the actions recorded in the fork buffer. | 
|---|
| 490 | - c: The total result is put in the result member, other stuff | 
|---|
| 491 | may be put in the buffer but that's currently not | 
|---|
| 492 | defined. The fork buffer is then transferred to the parent. | 
|---|
| 493 | - p: pfnFlush() wakes up and reclaims the fork handle ownership | 
|---|
| 494 | and returns the value of the result member. | 
|---|
| 495 | What it does if the buffer contains data is currently not | 
|---|
| 496 | defined. | 
|---|
| 497 | - p: Once all callbacks have been successfully executed, the stack is | 
|---|
| 498 | copied to the child. We copy all committed stack pages. The fork() | 
|---|
| 499 | return stack address is also passed to the child. | 
|---|
| 500 | NOTE: this step may be relocated to an earlier phase! | 
|---|
| 501 | - c: Child gets the stack and copies it in two turns, first the upper | 
|---|
| 502 | part (above current esp/fork return stack address), then it relocates | 
|---|
| 503 | itself on the stack so that it's ready for returning, before it | 
|---|
| 504 | copies the low part of the stack. | 
|---|
| 505 | (low/high here is not address value but logical stack view.) | 
|---|
| 506 | - c: Iterate the modules calling the callbacks signaling fork child | 
|---|
| 507 | operation. The callbacks then have the option to iterate the | 
|---|
| 508 | _CRT_FORK_CHILD1 vector. | 
|---|
| 509 | - c: Calls any completion callbacks registered. | 
|---|
| 510 | - c: Puts the result code and passes the fork handle to parent after first | 
|---|
| 511 | freeing its mapping of the shared memory and semaphores (not all sems | 
|---|
| 512 | first of course). | 
|---|
| 513 | - c: Returns from fork() restoring all registers but setting eax (return | 
|---|
| 514 | value) to zero indicating child. | 
|---|
| 515 | - p: fork() calls any completion callbacks registered. | 
|---|
| 516 | - p: fork() frees the fork handle and related resources and returns | 
|---|
| 517 | the pid of the child process. (not restoring registers) | 
|---|
| 518 |  | 
|---|
| 519 |  | 
|---|
| 520 |  | 
|---|
| 521 | 4.4 Forking LIBC | 
|---|
| 522 | ---------------- | 
|---|
| 523 |  | 
|---|
| 524 | To make LIBC forkable a bunch of things have to be fixed up in the child | 
|---|
| 525 | process. Some of these must be done in a certain order, others don't have to. | 
|---|
| 526 | To solve this we are using set vectors (as we do for init and term, see | 
|---|
| 527 | emx/startup.h). The fork set vectors are defined in InnoTekLIBC/fork.h | 
|---|
| 528 | and are called _CRT_FORK_PARENT1() and _CRT_FORK_CHILD1(). They take two | 
|---|
| 529 | arguments, the callback and the priority. Priority is 0 to 4G-1 where 4G-1 | 
|---|
| 530 | is the highest priority. The set vectors are started in crt0/dll0 and are called | 
|---|
| 531 | ___crtfork_parent1__ and ___crtfork_child1__. | 
|---|
| 532 |  | 
|---|
| 533 | The _atfork_callback() iterates the set vectors when called for doing | 
|---|
| 534 | parent and child fork stuff. | 
|---|
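A minimal sketch of how iterating such a set vector could look, in plain C
and without the real set vector machinery. FORKVECENTRY, forkVecRun() and
the demo callbacks are hypothetical names, not the InnoTekLIBC/fork.h ones.
It assumes that higher priorities run first and that the first non-zero
return code aborts the iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one set vector entry: callback + priority. */
typedef int (*PFNFORKCB)(void *pvUser);

typedef struct FORKVECENTRY
{
    PFNFORKCB   pfnCallback;
    uint32_t    u32Priority;    /* 0 .. 4G-1, where 4G-1 is the highest. */
} FORKVECENTRY;

/* qsort comparator: highest priority first (assumed ordering). */
static int forkVecCompare(const void *pv1, const void *pv2)
{
    const FORKVECENTRY *p1 = pv1;
    const FORKVECENTRY *p2 = pv2;
    if (p1->u32Priority > p2->u32Priority)
        return -1;
    if (p1->u32Priority < p2->u32Priority)
        return 1;
    return 0;
}

/* Sorts the vector and calls each callback, stopping at the first failure. */
int forkVecRun(FORKVECENTRY *paEntries, size_t cEntries, void *pvUser)
{
    size_t i;
    assert(paEntries != NULL || cEntries == 0);
    if (cEntries == 0)
        return 0;
    qsort(paEntries, cEntries, sizeof(paEntries[0]), forkVecCompare);
    for (i = 0; i < cEntries; i++)
    {
        int rc = paEntries[i].pfnCallback(pvUser);
        if (rc != 0)
            return rc;          /* assumed: a failure aborts the fork. */
    }
    return 0;
}

/* Demo only: each callback records one letter so the order is visible. */
typedef struct CALLLOG
{
    char    ach[8];
    size_t  cch;
} CALLLOG;

static int logCall(void *pvUser, char ch)
{
    CALLLOG *pLog = pvUser;
    if (pLog->cch + 1 < sizeof(pLog->ach))
        pLog->ach[pLog->cch++] = ch;
    return 0;
}

static int cbHeap(void *pvUser)   { return logCall(pvUser, 'H'); }
static int cbSems(void *pvUser)   { return logCall(pvUser, 'S'); }
static int cbLocale(void *pvUser) { return logCall(pvUser, 'L'); }

/* Registers three callbacks with made-up priorities and runs them. */
const char *forkVecDemo(void)
{
    static CALLLOG s_Log;
    const CALLLOG  Empty = { {0}, 0 };
    FORKVECENTRY   aEntries[] =
    {
        { cbLocale, 10u },
        { cbHeap,   0xffffffffu },
        { cbSems,   1000u },
    };
    s_Log = Empty;
    forkVecRun(aEntries, sizeof(aEntries) / sizeof(aEntries[0]), &s_Log);
    return s_Log.ach;
}
```

The priorities in forkVecDemo() are made up for the example. The only point
is that the heap callback, having the highest priority, runs before the rest.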
| 535 |  | 
|---|
| 536 |  | 
|---|
| 537 | The _atfork_callback() has this prototype: | 
|---|
| 538 | /** | 
|---|
| 539 | * Called multiple times during fork() both in the parent and the child. | 
|---|
| 540 | * | 
|---|
| 541 | * The default LIBC implementation will: | 
|---|
| 542 | *      1) schedule the data segment for duplication. | 
|---|
| 543 | *      2) do ordered LIBC fork() stuff. | 
|---|
| 544 | *      3) do unordered LIBC fork() stuff, _CRT_FORK1 vector. | 
|---|
| 545 | * | 
|---|
| 546 | * @returns 0 on success. | 
|---|
| 547 | * @returns appropriate non-zero error code on failure. | 
|---|
| 548 | * @param   pModule         Pointer to the module record which is being | 
|---|
| 549 | *                          processed. | 
|---|
| 550 | * @param   pForkHandle     Handle of the current fork operation. | 
|---|
| 551 | * @param   enmOperation    Which callback operation this is. | 
|---|
| 552 | *                          Any value can be used, the implementation | 
|---|
| 553 | *                          of this function must just respond to the | 
|---|
| 554 | *                          one it knows and return successfully on the | 
|---|
| 555 | *                          others. | 
|---|
| 556 | *                          Operations: | 
|---|
| 557 | *                              __LIBC_FORK_CHECK_PARENT | 
|---|
| 558 | *                              __LIBC_FORK_CHECK_CHILD | 
|---|
| 559 | *                              __LIBC_FORK_FORK_PARENT | 
|---|
| 560 | *                              __LIBC_FORK_FORK_CHILD | 
|---|
| 561 | */ | 
|---|
| 562 | int _atfork_callback(__LIBC_FORKMODULE *pModule, __LIBC_FORKHANDLE *pForkHandle, | 
|---|
| 563 | enum __LIBC_CALLBACKOPERATION enmOperation); | 
|---|
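A rough sketch of how a module could implement such a callback. It responds
to the operations it knows and returns success on everything else, as the
enmOperation description asks for. The FORKMODULE, FORKHANDLE and FORKOP
types below are hypothetical stand-ins so the example compiles on its own.
They are not the real __LIBC_* definitions, whose contents are not specified
in this draft.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the __LIBC_* types (contents are made up). */
typedef struct FORKMODULE
{
    unsigned cParentCalls;      /* times the parent side work was done. */
    unsigned cChildCalls;       /* times the child side work was done. */
} FORKMODULE;

typedef struct FORKHANDLE FORKHANDLE;   /* opaque in this sketch. */

enum FORKOP
{
    FORKOP_CHECK_PARENT = 1,
    FORKOP_CHECK_CHILD,
    FORKOP_FORK_PARENT,
    FORKOP_FORK_CHILD
};

/* Handles the operations it knows about, succeeds quietly on the others. */
int exampleAtForkCallback(FORKMODULE *pModule, FORKHANDLE *pForkHandle,
                          enum FORKOP enmOperation)
{
    assert(pModule != NULL);
    (void)pForkHandle;          /* unused, no fork primitives in the sketch. */

    switch (enmOperation)
    {
        case FORKOP_FORK_PARENT:
            /* Real code would schedule data for duplication here. */
            pModule->cParentCalls++;
            return 0;

        case FORKOP_FORK_CHILD:
            /* Real code would fix up the child's state here. */
            pModule->cChildCalls++;
            return 0;

        default:
            /* Unknown or uninteresting operation: must still succeed. */
            return 0;
    }
}
```

The default case is the important bit: a module built against an older LIBC
keeps working when a later LIBC adds new operations.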
| 564 |  | 
|---|
| 565 | The _CRT_FORK_*1() callbacks have this declaration: | 
|---|
| 566 | /** | 
|---|
| 567 | * Called once in the parent during fork(). | 
|---|
| 568 | * This function will use the fork primitives to move data and invoke | 
|---|
| 569 | * functions in the child process. | 
|---|
| 570 | * | 
|---|
| 571 | * @returns 0 on success. | 
|---|
| 572 | * @returns appropriate non-zero error code on failure. | 
|---|
| 573 | * @param   pModule         Pointer to the module record which is being | 
|---|
| 574 | *                          processed. | 
|---|
| 575 | * @param   pForkHandle     Handle of the current fork operation. | 
|---|
| 576 | * @param   enmOperation    Which callback operation this is. | 
|---|
| 577 | *                          !Important! Later versions of LIBC may call | 
|---|
| 578 | *                          this callback more than once. This parameter | 
|---|
| 579 | *                          will indicate what's going on. | 
|---|
| 580 | */ | 
|---|
| 581 | int _atfork_callback(__LIBC_FORKMODULE *pModule, __LIBC_FORKHANDLE *pForkHandle, | 
|---|
| 582 | enum __LIBC_CALLBACKOPERATION enmOperation); | 
|---|
| 583 |  | 
|---|
| 584 |  | 
|---|
| 585 |  | 
|---|
| 586 |  | 
|---|
| 587 | 4.4.1 Heap Memory | 
|---|
| 588 | ----------------- | 
|---|
| 589 |  | 
|---|
| 590 | The LIBC heaps will use the extended DosAllocMemEx. This means that the | 
|---|
| 591 | memory used by the heaps will be reserved at LIBC init time and duplicated | 
|---|
| 592 | as the very first thing in the LIBC _atfork_callback(). | 
|---|
| 593 |  | 
|---|
| 594 | 4.4.1.1 DosAllocMemEx | 
|---|
| 595 | --------------------- | 
|---|
| 596 |  | 
|---|
| 597 | TODO. Fixed address allocation and allocation recording. | 
|---|
| 598 |  | 
|---|
| 599 | 4.4.1.2 DosFreeMemEx | 
|---|
| 600 | -------------------- | 
|---|
| 601 |  | 
|---|
| 602 | TODO | 
|---|
| 603 |  | 
|---|
| 604 |  | 
|---|
| 605 | 4.4.2 Semaphores | 
|---|
| 606 | ---------------- | 
|---|
| 607 |  | 
|---|
| 608 | Use _CRT_FORK_*1() for each of the LIBC semaphores, or we'll make an extended | 
|---|
| 609 | API, DosCreateEventSemEx/DosCreateMutexSemEx, for semaphores as we do with | 
|---|
| 610 | memory APIs. | 
|---|
| 611 |  | 
|---|
| 612 | There is a forkable class of semaphores, we might wanna kick that out if | 
|---|
| 613 | we decide to use extended APIs (which I guess we'll do). | 
|---|
| 614 |  | 
|---|
| 615 |  | 
|---|
| 616 |  | 
|---|
| 617 | 4.4.3 Filehandles | 
|---|
| 618 | ----------------- | 
|---|
| 619 |  | 
|---|
| 620 | Two important things, 1) _all_ handles are inherited (even the close-on-exec ones), | 
|---|
| 621 | and 2) non-OS/2 handles must be inherited too. | 
|---|
| 622 |  | 
|---|
| 623 | For 1) we must temporarily change the inherit flag on all the close-on-exec | 
|---|
| 624 | flagged handles during the fork. | 
|---|
| 625 | The major question here is when we're gonna restore the OS/2 no inherit flag | 
|---|
| 626 | for close-on-exec handles. The simplest option is a | 
|---|
| 627 | __LIBC_FORK_DONE_PARENT _atfork_callback() operation. The best option is to | 
|---|
| 628 | have a primitive for registering fork-done routines (which are called both | 
|---|
| 629 | on success and failure). | 
|---|
| 630 |  | 
|---|
| 631 | For 2) we need to extend the libc filehandle operation interface to include | 
|---|
| 632 | an atfork operation. This will be called for every non-standard handle and it | 
|---|
| 633 | must itself use the fork primitives to cause something to happen in the | 
|---|
| 634 | child. This is not the fastest way, but it's the most flexible one and it's | 
|---|
| 635 | one which probably will work too. | 
|---|
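For 1), the flag juggling could look roughly like the sketch below. It only
models the bookkeeping on a table of handle records. FHENTRY, fhForkPrepare()
and fhForkDone() are hypothetical names, and real code would presumably
change the OS/2 flag through DosQueryFHState/DosSetFHState instead of writing
to a struct member. fhForkDone() is meant to be what a fork-done routine (or
a __LIBC_FORK_DONE_PARENT operation) would run, on success and failure alike.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-handle record, only models the flags of interest. */
typedef struct FHENTRY
{
    int fCloseOnExec;   /* LIBC level close-on-exec state. */
    int fNoInherit;     /* models the OS/2 no inherit flag. */
    int fRestore;       /* set while fork() has the flag cleared. */
} FHENTRY;

/* Before forking: make the close-on-exec handles inheritable. */
void fhForkPrepare(FHENTRY *paHandles, size_t cHandles)
{
    size_t i;
    assert(paHandles != NULL || cHandles == 0);
    for (i = 0; i < cHandles; i++)
        if (paHandles[i].fCloseOnExec && paHandles[i].fNoInherit)
        {
            paHandles[i].fNoInherit = 0;
            paHandles[i].fRestore   = 1;    /* remember to undo this. */
        }
}

/* Fork-done routine: put the no inherit flag back where we cleared it. */
void fhForkDone(FHENTRY *paHandles, size_t cHandles)
{
    size_t i;
    assert(paHandles != NULL || cHandles == 0);
    for (i = 0; i < cHandles; i++)
        if (paHandles[i].fRestore)
        {
            paHandles[i].fNoInherit = 1;
            paHandles[i].fRestore   = 0;
        }
}
```

Recording fRestore per handle, rather than just setting the flag on every
close-on-exec handle afterwards, keeps the restore step from touching handles
whose flags fork() never changed.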
| 636 |  | 
|---|
| 637 |  | 
|---|
| 638 | 4.4.3.1 Sockets | 
|---|
| 639 | --------------- | 
|---|
| 640 |  | 
|---|
| 641 | Implement the atfork handle operation. Use the fork primitives to invoke | 
|---|
| 642 | a function in the child duplicating that handle. The function invoked in | 
|---|
| 643 | the child basically adds the socket to the socket list of the new | 
|---|
| 644 | process (tcpip api for this). | 
|---|
| 645 |  | 
|---|
| 646 | Note that WS4eB tcpip level has a bug in the api adding sockets to a | 
|---|
| 647 | process. The problem is that a socket cannot be added twice, it'll cause | 
|---|
| 648 | a breakpoint instruction. | 
|---|
| 649 |  | 
|---|
| 650 |  | 
|---|
| 651 |  | 
|---|
| 652 | 4.4.4 Locale/Iconv/Stuff | 
|---|
| 653 | ------------------------ | 
|---|
| 654 |  | 
|---|
| 655 | Record and create once again in the child. Use _CRT_FORK_*1() to register | 
|---|
| 656 | callbacks. Requires a little bit of recording of create parameters in the | 
|---|
| 657 | iconv() case but that's nothing spectacular. | 
|---|
| 658 |  | 
|---|
| 659 |  | 
|---|
| 660 |  | 
|---|
| 661 |  | 
|---|
| 662 |  | 
|---|
| 663 | 8.0 Coding Conventions | 
|---|
| 664 | ---------------------- | 
|---|
| 665 |  | 
|---|
| 666 | New LIBC stuff uses these conventions: | 
|---|
| 667 | - Full usage of hungarian prefixes. Details on this are found in the | 
|---|
| 668 | old odin web pages and gradd docs (IIRC). | 
|---|
| 669 | - As much as possible shall be static. I.e. static int internalhelper(void). | 
|---|
| 670 | - Internal global functions and variables shall be prefixed __libc_[a-z] or | 
|---|
| 671 | _sys_ depending on what it implements. Will not be exported by LIBCxy.DLL. | 
|---|
| 672 | - Non-standard LIBC functionality is prefixed __libc_[A-Z]. | 
|---|
| 673 | - Stuff defined in SuS is wrapped by _STD() so that we get __std_ prefix | 
|---|
| 674 | and the usual two aliases (plain and underscored). | 
|---|
| 675 | - No warnings. | 
|---|
| 676 |  | 
|---|
| 677 |  | 
|---|
| 678 | 9.0 Abbreviations and such | 
|---|
| 679 | -------------------------- | 
|---|
| 680 |  | 
|---|
| 681 | SuS | 
|---|
| 682 | The Single UNIX Specification version 3 (Issue 6) as published on www.opengroup.org. | 
|---|
| 683 |  | 
|---|
| 684 |  | 
|---|