Changeset 1279


Timestamp:
Feb 25, 2004, 4:26:45 AM
Author:
bird
Message:

First draft not yet done, but it's beyond bedtime.

File:
1 edited

  • trunk/doc/Fork.os2

    • Property cvs2svn:cvs-rev changed from 1.1 to 1.2
    r1278 r1279  
    1 Fork design.
     1$Id$
     2
     3Fork Design Draft
     4--------------------
     5
     61.0 Intro
     7----------
     8
     9blah.
     10
     11
     121.1 The SuS fork() Description
     13------------------------------
     14
     15NAME
     16
     17    fork - create a new process
     18
     19SYNOPSIS
     20
     21    #include <unistd.h>
     22
     23    pid_t fork(void);
     24
     25DESCRIPTION
     26
     27    The fork() function shall create a new process. The new process (child process) shall be an exact copy of the calling process (parent process) except as detailed below:
     28
     29        * The child process shall have a unique process ID.
     30        * The child process ID also shall not match any active process
     31                  group ID.
     32        * The child process shall have a different parent process ID,
     33                   which shall be the process ID of the calling process.
     34        * The child process shall have its own copy of the parent's file
     35                  descriptors. Each of the child's file descriptors shall refer
     36                  to the same open file description with the corresponding file
     37                  descriptor of the parent.
     38        * The child process shall have its own copy of the parent's open
     39                  directory streams. Each open directory stream in the child process
     40                  may share directory stream positioning with the corresponding
     41                  directory stream of the parent.
     42        * [XSI] The child process shall have its own copy of the parent's
     43                  message catalog descriptors.
     44        * The child process' values of tms_utime, tms_stime, tms_cutime, and
     45                  tms_cstime shall be set to 0.
     46        * The time left until an alarm clock signal shall be reset to zero,
     47                  and the alarm, if any, shall be canceled; see alarm() .
     48        * [XSI] All semadj values shall be cleared.
     49        * File locks set by the parent process shall not be inherited by
     50                  the child process.
     51        * The set of signals pending for the child process shall be
     52                  initialized to the empty set.
     53        * [XSI] Interval timers shall be reset in the child process.
     54        * [SEM] Any semaphores that are open in the parent process shall
     55                  also be open in the child process.
     56        * [ML] The child process shall not inherit any address space memory
     57                  locks established by the parent process via calls to mlockall()
     58                  or mlock().
     59        * [MF|SHM] Memory mappings created in the parent shall be retained
     60                  in the child process. MAP_PRIVATE mappings inherited from the
     61                  parent shall also be MAP_PRIVATE mappings in the child, and any
     62                  modifications to the data in these mappings made by the parent
     63                  prior to calling fork() shall be visible to the child. Any
     64                  modifications to the data in MAP_PRIVATE mappings made by the
     65                  parent after fork() returns shall be visible only to the parent.
     66                  Modifications to the data in MAP_PRIVATE mappings made by the
     67                  child shall be visible only to the child.
     68        * [PS] For the SCHED_FIFO and SCHED_RR scheduling policies, the
     69                  child process shall inherit the policy and priority settings
     70                  of the parent process during a fork() function. For other
     71                  scheduling policies, the policy and priority settings on fork()
     72                  are implementation-defined.
     73        * [TMR] Per-process timers created by the parent shall not be
     74                  inherited by the child process.
     75        * [MSG] The child process shall have its own copy of the message
     76                  queue descriptors of the parent. Each of the message descriptors
     77                  of the child shall refer to the same open message queue
     78                  description as the corresponding message descriptor of the parent.
     79        * [AIO] No asynchronous input or asynchronous output operations
     80                  shall be inherited by the child process.
     81        * A process shall be created with a single thread. If a
     82                  multi-threaded process calls fork(), the new process shall contain
     83                  a replica of the calling thread and its entire address space,
     84                  possibly including the states of mutexes and other resources.
     85                  Consequently, to avoid errors, the child process may only execute
     86                  async-signal-safe operations until such time as one of the exec
     87                  functions is called. [THR]  Fork handlers may be established by
     88                  means of the pthread_atfork() function in order to maintain
     89                  application invariants across fork() calls.
     90
     91                  When the application calls fork() from a signal handler and any of
     92                  the fork handlers registered by pthread_atfork() calls a function
     93                  that is not asynch-signal-safe, the behavior is undefined.
     94        * [TRC TRI] If the Trace option and the Trace Inherit option are
     95                  both supported:
     96                  If the calling process was being traced in a trace stream that
     97                  had its inheritance policy set to POSIX_TRACE_INHERITED, the
     98                  child process shall be traced into that trace stream, and the
     99                  child process shall inherit the parent's mapping of trace event
     100                  names to trace event type identifiers. If the trace stream in
     101                  which the calling process was being traced had its inheritance
     102                  policy set to POSIX_TRACE_CLOSE_FOR_CHILD, the child process
     103                  shall not be traced into that trace stream. The inheritance
     104                  policy is set by a call to the posix_trace_attr_setinherited()
     105                  function.
     106        * [TRC] If the Trace option is supported, but the Trace Inherit
     107                  option is not supported:
     108                  The child process shall not be traced into any of the trace
     109                  streams of its parent process.
     110        * [TRC] If the Trace option is supported, the child process of
     111                  a trace controller process shall not control the trace streams
     112                  controlled by its parent process.
     113        * [CPT] The initial value of the CPU-time clock of the child
     114                  process shall be set to zero.
     115        * [TCT] The initial value of the CPU-time clock of the single
     116                  thread of the child process shall be set to zero.
     117
     118    All other process characteristics defined by IEEE Std 1003.1-2001 shall
     119        be the same in the parent and child processes. The inheritance of
     120        process characteristics not defined by IEEE Std 1003.1-2001 is
     121        unspecified by IEEE Std 1003.1-2001.
     122
     123    After fork(), both the parent and the child processes shall be capable
     124        of executing independently before either one terminates.
     125
     126RETURN VALUE
     127
     128    Upon successful completion, fork() shall return 0 to the child process
     129        and shall return the process ID of the child process to the parent
     130        process. Both processes shall continue to execute from the fork()
     131        function. Otherwise, -1 shall be returned to the parent process, no
     132        child process shall be created, and errno shall be set to indicate
     133        the error.
     134
     135ERRORS
     136
     137    The fork() function shall fail if:
     138
     139    [EAGAIN]
     140        The system lacked the necessary resources to create another
     141                process, or the system-imposed limit on the total number of
     142                processes under execution system-wide or by a single user
     143                {CHILD_MAX} would be exceeded.
     144
     145    The fork() function may fail if:
     146
     147    [ENOMEM]
     148        Insufficient storage space is available.
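
As an illustration of the RETURN VALUE and ERRORS text quoted above, the following is a minimal sketch for a POSIX system. The helper name fork_and_wait is made up for this example and is not part of LIBC.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits at once with
 * child_exit_code, then wait for it in the parent.
 * Returns the child's exit status, or -1 if fork() or waitpid() failed
 * or the child did not exit normally.
 */
int fork_and_wait(int child_exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;              /* no child was created; errno is e.g. EAGAIN or ENOMEM */

    if (pid == 0)
        _exit(child_exit_code); /* child: fork() returned 0 */

    /* parent: fork() returned the process ID of the child */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes continue from the fork() call, so the only way to tell them apart is the return value.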
     149
     150
     151                                                                                                       
     152                                                                                                       
     1532.0 Requirements and Assumptions Of The Implementation
     154------------------------------------------------------
     155
     156The Innotek LIBC fork() implementation will require the following features
     157in LIBC to work:
     158        1. A shared process management facility internal to LIBC for telling
     159           the child that a fork() is in progress.
     160        2. A very general and varied set of fork helper functions to achieve
     161           maximum flexibility of the implementation.
     162        3. Extended versions of some memory related OS/2 APIs must be implemented.
     163       
     164The implementation will further make the following assumption about the
     165operation of OS/2:
     166        1. DosExecPgm will not return until all DLLs have been initialized successfully.
     167       
     168
     169       
     1703.0 The Shared Process Management
     171---------------------------------
     172
     173The fork() implementation requires a method of telling the child process
     174that it's being forked and must take a very different startup route. Some
     175other LIBC APIs also need parent -> child and child -> parent information
     176exchange. More specifically, this covers the inheritance of sockets and
     177signals, the different scheduler actions of a posix_spawn[p]() call, and
     178possibly some process group stuff related to posix_spawn too if we get it
     179figured out eventually. All of this is parent -> child during spawn/fork. A
     180need also exists for child -> parent notification and possibly exchange at
     181process termination. It might be necessary to reimplement the different
     182wait APIs and implement SIGCHLD, and it's likely that those tasks will make
     183such demands.
     184
     185The choice is now whether to make this shared process management
     186specific to each LIBC version or to try to make it survive normal LIBC updates.
     187Making it specific has advantages in code size and memory footprint (no
     188reserved fields), however it has certain disadvantages when LIBC is updated.
     189The other option is to use a named shared memory object, defining the
     190content with reserved space for later extensions so several versions of
     191LIBC with more or fewer features implemented can share the memory space.
     192
     193The latter option is preferred since it allows more applications to
     194interoperate, it wastes less shared memory, the shared memory
     195can be located in high memory, and it would be possible to fork
     196processes using multiple versions of LIBC.
     197
     198The shared memory must be named \SHAREMEM\INNOTEKLIBC.V01. The version
     199number is that of the shared memory layout and contents, and it will
     200only be increased when incompatible changes are made.
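
One way to picture the versioned layout with reserved space is the sketch below. It is only an illustration: the structure, its field names and the reserved size are assumptions made for this example, and the draft does not define the actual layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPM_LAYOUT_VERSION 1u   /* hypothetical; mirrors the "V01" name suffix */

/* Hypothetical header placed at the start of the shared memory block. */
typedef struct SPMHEADER
{
    uint32_t uVersion;          /* layout version, increased only on incompatible changes */
    uint32_t cbTotal;           /* total size of the shared block in bytes */
    uint8_t  abReserved[120];   /* reserved for later extensions, always zeroed */
} SPMHEADER;

/* Initialize a freshly created block. All reserved space is zeroed. */
void spmInitHeader(SPMHEADER *pHdr, uint32_t cbTotal)
{
    memset(pHdr, 0, sizeof(*pHdr));
    pHdr->uVersion = SPM_LAYOUT_VERSION;
    pHdr->cbTotal  = cbTotal;
}

/* Returns 1 if this LIBC can share the block, 0 if the layout is incompatible. */
int spmIsCompatible(const SPMHEADER *pHdr)
{
    return pHdr->uVersion == SPM_LAYOUT_VERSION;
}
```

A newer LIBC can start using bytes from abReserved without bumping the version, because older versions leave them zeroed and never interpret them.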
     201
     202The shared memory will be protected by a standard OS/2 mutex semaphore.
     203It will not use any fast R3 semaphore since the usage frequency is low
     204and the result of a mess-up may be disastrous. Care must be taken to
     205avoid creation races and owner-died scenarios.
     206
     207The memory will have a fixed size, since adding segments is very hard.
     208Thus the size must be large enough to cope with a great many
     209processes, bearing in mind that OS/2 normally doesn't support more
     210than 1000 processes, with a theoretical max of some 4000 (being the
     211max thread count). A very simplistic allocation scheme will be
     212implemented. Practically speaking, a fixed block size pool would do fine
     213for the process structure, while for the misc structures like socket
     214lists a linked list based heap would do fine.
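
A fixed block size pool of the kind mentioned above could look roughly like this. All names and sizes are invented for the sketch. It also assumes that the caller already holds the protecting mutex and that the shared block is mapped at the same address in every process, since the free list stores plain pointers.

```c
#include <stddef.h>
#include <string.h>

#define POOL_BLOCK_SIZE  256u   /* hypothetical size of one process block */
#define POOL_BLOCK_COUNT 8u     /* kept tiny for illustration */

typedef union POOLBLOCK
{
    union POOLBLOCK *pNextFree;            /* valid only while the block is free */
    unsigned char    ab[POOL_BLOCK_SIZE];  /* payload while the block is allocated */
} POOLBLOCK;

typedef struct POOL
{
    POOLBLOCK *pFreeHead;                  /* head of the free list */
    POOLBLOCK  aBlocks[POOL_BLOCK_COUNT];
} POOL;

/* Put every block on the free list, lowest address first. */
void poolInit(POOL *pPool)
{
    pPool->pFreeHead = NULL;
    for (size_t i = POOL_BLOCK_COUNT; i-- > 0; )
    {
        pPool->aBlocks[i].pNextFree = pPool->pFreeHead;
        pPool->pFreeHead = &pPool->aBlocks[i];
    }
}

/* Returns a zeroed block, or NULL when the pool is exhausted. */
void *poolAlloc(POOL *pPool)
{
    POOLBLOCK *pBlock = pPool->pFreeHead;
    if (!pBlock)
        return NULL;
    pPool->pFreeHead = pBlock->pNextFree;
    memset(pBlock, 0, sizeof(*pBlock));
    return pBlock;
}

/* Return a block obtained from poolAlloc() to the free list. */
void poolFree(POOL *pPool, void *pv)
{
    POOLBLOCK *pBlock = (POOLBLOCK *)pv;
    pBlock->pNextFree = pPool->pFreeHead;
    pPool->pFreeHead  = pBlock;
}
```

Allocation and release are both constant time and there is no fragmentation, which suits a fixed-size shared block that cannot grow.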
     215
     216The process blocks will be rounded up in size, adding a reasonable
     217amount of space reserved for future extensions. Reserved space must be
     218all zeroed.
     219
     220The fork() specific members of the process block will be a pointer to
     221the shared memory object for the fork operation (the fork handle) and a
     222list of forkable modules. The fork handle will itself contain
     223information indicating whether or not another LIBC version has already
     224started fork() handling in the child. The presence of the fork handle
     225means that the child is being forked and normal DLL init and startup
     226will not be executed, but a registered callback will be called to do
     227the forking of each module. (More details in section 4.0.)
     228
     229Before spawn, fork and exec (essentially before DosExecPgm or
     230DosStartSession) the parent will create a process block for the child to be
     231born and link it into an embryo list in the shared memory block. The child will
     232find the process block by searching the embryo list using the
     233parent pid as key. All DosExecPgm and DosStartSession calls are
     234serialized within one LIBC version. (If some empty headed programmer
     235manages to link together a program which may end up using two or more
     236LIBC versions and having two or more threads doing DosExecPgm at the
     237very same time, well then he really deserves whatever trouble he gets!
     238At least don't blame me!)
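
The embryo list handshake described above can be sketched as follows. The structure and function names are hypothetical, the pids are plain ints, and the caller is assumed to hold the shared memory mutex. The real process block would carry many more members.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced process block. */
typedef struct PROCBLOCK
{
    struct PROCBLOCK *pNext;        /* next block on the embryo list */
    int               pidParent;    /* search key used by the child */
    int               pid;          /* 0 while the block is still an embryo */
    void             *pvForkHandle; /* non-NULL if the child is being forked */
} PROCBLOCK;

typedef struct EMBRYOLIST
{
    PROCBLOCK *pHead;
} EMBRYOLIST;

/* Parent side: link a block for the child about to be born. */
void embryoLink(EMBRYOLIST *pList, PROCBLOCK *pBlock, int pidParent)
{
    pBlock->pidParent = pidParent;
    pBlock->pid       = 0;
    pBlock->pNext     = pList->pHead;
    pList->pHead      = pBlock;
}

/* Child side: find the block by parent pid, unlink it and take ownership. */
PROCBLOCK *embryoClaim(EMBRYOLIST *pList, int pidParent, int pidSelf)
{
    for (PROCBLOCK **pp = &pList->pHead; *pp; pp = &(*pp)->pNext)
    {
        if ((*pp)->pidParent == pidParent)
        {
            PROCBLOCK *pBlock = *pp;
            *pp = pBlock->pNext;    /* unlink from the embryo list */
            pBlock->pNext = NULL;
            pBlock->pid   = pidSelf;
            return pBlock;
        }
    }
    return NULL;                    /* not started by a LIBC parent */
}
```

Searching by parent pid alone is only unambiguous because the DosExecPgm and DosStartSession calls are serialized, so a parent has at most one embryo pending at a time.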
     239
     240Process blocks will have to stay around after the process has terminated
     241(for child -> parent termination exchange). A cleanup mechanism will be invoked
     242whenever a free memory threshold is reached. All processes will register
     243exit list handlers to mark the process block as zombie (and later
     244perhaps set error codes and notify waiters/child-listeners).
     245
     246                                       
     247
     2484.0 The fork() Implementation
     249-----------------------------
     250
     251
     252The implementation will be based on a fork handle and a set of primitives.
     253The fork handle is a pointer to a shared memory object allocated for the
     254occasion and which will be freed before fork() returns. The primitives
     255all operate on this handle and will be provided using a callback table
     256in order to fully support multiple LIBC versions.
     257
     258
     2594.1 Forkable Executable and DLLs
     260--------------------------------
     261
     262The support for fork() is an optional feature of LIBC. The default
     263executable produced with LIBC and GCC will not be forkable. The fork
     264support will be based on registration of the DLLs and EXEs in their
     265LIBC supplied startup code (crt0/dll0). A set of fork versions of these
     266modules will be made.
     267
     268The big difference between the ordinary crt0/dll0 and the forkable
     269crt0/dll0 is a per module structure, a call to register this, and the
     270handling of the return code of that call.
     271
     272The structure will contain these fields:
     273        - chain pointer.
     274        - data segment base address.
     275        - data segment end address.
     276        - fork callback function.
     277       
     278The fork callback function is called _atfork_callback, it takes the fork
     279handle, module structure, and an operation enum as arguments. LIBC will
     280contain a default implementation of _atfork_callback() which simply
     281duplicates the data segment.
     282
     283The register call, __libc_ForkRegisterModule(), will return:
     284        - 0 on normal process startup, no forking.
     285        - 1 if fork() is in progress. The crt0/dll0 code will then
     286          not call any standard initialization code, but let the
     287          _atfork_callback() do all the necessary work.
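
The per module structure and the registration return codes could be modelled along these lines. This is a sketch under assumptions: every type and function name here is made up, the operation enum values are placeholders, and the process state is passed explicitly instead of being looked up in the shared process block.

```c
#include <stddef.h>

/* Placeholder operation codes; the draft does not list the real ones. */
typedef enum FORKOP { FORKOP_PARENT, FORKOP_CHILD } FORKOP;

/* Hypothetical per module structure with the four fields listed above. */
typedef struct FORKMODULE
{
    struct FORKMODULE *pNext;       /* chain pointer */
    void              *pvDataBase;  /* data segment base address */
    void              *pvDataEnd;   /* data segment end address */
    int              (*pfnAtFork)(void *pvForkHandle,
                                  struct FORKMODULE *pModule,
                                  FORKOP enmOp);  /* fork callback function */
} FORKMODULE;

/* Stand-in for the fork() specific members of the process block. */
typedef struct PROCSTATE
{
    void       *pvForkHandle;       /* non-NULL while the process is being forked */
    FORKMODULE *pModules;           /* list of forkable modules */
} PROCSTATE;

/*
 * Register a module and tell the startup code which route to take:
 * 0 means normal startup, 1 means fork() is in progress.
 */
int forkRegisterModule(PROCSTATE *pProc, FORKMODULE *pModule)
{
    pModule->pNext  = pProc->pModules;
    pProc->pModules = pModule;
    return pProc->pvForkHandle != NULL;
}
```

The startup code would branch on the result and skip its usual initialization when 1 is returned, leaving the module's state to be filled in by the callback.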
     288
     289
     2904.2 Fork Primitives
     291-------------------
     292
     293These primitives are provided by the fork implementation in the fork
     294handle structure. We will define a set of these primitives now. If
     295new ones are added later, the users of these must check that they are
     296actually present.
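
The draft does not say how a user checks that a later primitive is present. One common approach, shown here purely as an assumption, is to store the size of the callback table in its first member and test both that size and the pointer. The table layout, the cbSelf member and pfnLaterAddition are all invented for the sketch.

```c
#include <stddef.h>

typedef struct FORKHANDLE FORKHANDLE;   /* opaque in this sketch */

/* Hypothetical callback table that starts with its own size. */
typedef struct FORKOPS
{
    size_t cbSelf;                              /* bytes filled in by the provider */
    int  (*pfnFlush)(FORKHANDLE *pForkHandle);
    int  (*pfnLaterAddition)(FORKHANDLE *pForkHandle); /* imagined newer primitive */
} FORKOPS;

/*
 * True if the provider's table is large enough to contain Member and the
 * pointer is set. The size test comes first, so a member beyond the end
 * of an older table is never read.
 */
#define FORKOPS_HAS(pOps, Member) \
    (   (pOps)->cbSelf >= offsetof(FORKOPS, Member) + sizeof((pOps)->Member) \
     && (pOps)->Member != NULL)

/* Example provider from an "older" LIBC whose table ends before pfnLaterAddition. */
static int oldFlush(FORKHANDLE *pForkHandle)
{
    (void)pForkHandle;
    return 0;
}

static const FORKOPS g_OldOps = { offsetof(FORKOPS, pfnLaterAddition), oldFlush, NULL };
```

With this table, FORKOPS_HAS(&g_OldOps, pfnFlush) is true while FORKOPS_HAS(&g_OldOps, pfnLaterAddition) is false, so the caller can fall back to another strategy.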
     297
     298Example:
     299        rc = pForkHandle->pOps->pfnDuplicatePages(pForkHandle, pModule->pvDataBase, pModule->pvDataEnd, __LIBC_FORK_ONLY_DIRTY);
     300        if (rc)
     301                return rc; /* failure */
     302       
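
The pfnDuplicatePages documentation in this section says that the start pointer is rounded down and the end pointer is rounded up. Assuming the 4 KB x86 page size, that rounding can be sketched as below. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KB x86 pages */

/* Round an address down to the start of its page. */
uintptr_t pageRoundDown(uintptr_t u)
{
    return u & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
}

/* Round an address up to the next page boundary (no change if already aligned). */
uintptr_t pageRoundUp(uintptr_t u)
{
    return (u + SKETCH_PAGE_SIZE - 1) & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
}

/* Number of whole pages covered by the range [uStart, uEnd). */
size_t pageCount(uintptr_t uStart, uintptr_t uEnd)
{
    return (size_t)((pageRoundUp(uEnd) - pageRoundDown(uStart)) / SKETCH_PAGE_SIZE);
}
```

A range from 0x1234 to 0x3001 therefore touches the three pages starting at 0x1000, 0x2000 and 0x3000.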
     303Prototypes:
     304        /**
     305         * Duplicate a number of pages from pvStart to pvEnd.
     306         * @returns 0 on success.
     307         * @returns appropriate non-zero error code on failure.
     308         * @param   pForkHandle Handle of the current fork operation.
     309         * @param   pvStart     Pointer to start of the pages. Rounded down.
     310         * @param   pvEnd       Pointer to end of the pages. Rounded up.
     311         * @param   fFlags      __LIBC_FORK_ONLY_DIRTY means checking whether the
     312         *                      pages are actually dirty before bothering touching
     313         *                      and copying them. (Using the partially broken
     314         *                      DosQueryMemState() API.)
     315         *                      __LIBC_FORK_ALL means not to bother checking, but
     316         *                      just go ahead copying all the pages.
     317         */
     318        int pfnDuplicatePages(__LIBC_FORKHANDLE *pForkHandle, void *pvStart, void *pvEnd, unsigned fFlags);
     319       
     320        /**
     321         * Invoke a function in the child process, giving it a chunk of input.
     322         * The function is invoked the next time the fork buffer is flushed;
     323         * call pfnFlush() if the return code is desired.
     324         *
     325         * @returns 0 on success.
     326         * @returns appropriate non-zero error code on failure.
     327         * @param   pForkHandle Handle of the current fork operation.
     328         * @param   pfn         Pointer to the function to invoke in the child.
     329         *                      The function gets the fork handle, a pointer to
     330         *                      the argument memory chunk and the size of that.
     331         *                      The function must return 0 on success, and non-zero
     332         *                      on failure.
     333         * @param   pvArg       Pointer to a block of memory of size cbArg containing
     334         *                      input to be copied to the child and given to pfn.
     335         * @param   cbArg       Size of the block that pvArg points to, in bytes.
     336         */
     337        int pfnInvoke(__LIBC_FORKHANDLE *pForkHandle, int (*pfn)(__LIBC_FORKHANDLE *pForkHandle, void *pvArg, size_t cbArg), void *pvArg, size_t cbArg);
     338       
     339        /**
     340         * Flush the fork() buffer, meaning take whatever is in the fork buffer
     341         * and let the child process it.
     342         * This might be desired to get the result of a pfnInvoke() in a near
     343         * synchronous way.
     344         * @returns 0 on success.
     345         * @returns appropriate non-zero error code on failure.
     346         * @param   pForkHandle Handle of the current fork operation.
     347         */
     348        int pfnFlush(__LIBC_FORKHANDLE *pForkHandle);
     349        ...
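
To make the deferred semantics of pfnInvoke() and pfnFlush() concrete, here is a single-process mock. It is not the planned implementation: the "child" is simulated by running the queued calls in the same process, and all names, limits and the queue layout are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

#define FORKBUF_MAX_CALLS 4u    /* hypothetical queue depth */
#define FORKBUF_MAX_ARG   64u   /* hypothetical max argument size */

typedef struct FORKBUF FORKBUF;
typedef int FNFORKINVOKE(FORKBUF *pBuf, void *pvArg, size_t cbArg);

typedef struct FORKCALL
{
    FNFORKINVOKE *pfn;                      /* function to run in the "child" */
    size_t        cbArg;                    /* number of valid bytes in abArg */
    unsigned char abArg[FORKBUF_MAX_ARG];   /* private copy of the argument */
} FORKCALL;

struct FORKBUF
{
    size_t   cCalls;                        /* number of queued calls */
    FORKCALL aCalls[FORKBUF_MAX_CALLS];
};

/* Run all queued calls in order, stop at the first failure, empty the queue. */
int forkbufFlush(FORKBUF *pBuf)
{
    int rc = 0;
    for (size_t i = 0; i < pBuf->cCalls && rc == 0; i++)
        rc = pBuf->aCalls[i].pfn(pBuf, pBuf->aCalls[i].abArg, pBuf->aCalls[i].cbArg);
    pBuf->cCalls = 0;
    return rc;
}

/* Queue a call. Nothing runs until the buffer is flushed. */
int forkbufInvoke(FORKBUF *pBuf, FNFORKINVOKE *pfn, const void *pvArg, size_t cbArg)
{
    if (cbArg > FORKBUF_MAX_ARG)
        return -1;
    if (pBuf->cCalls == FORKBUF_MAX_CALLS)
    {
        int rc = forkbufFlush(pBuf);        /* buffer full: process it first */
        if (rc)
            return rc;
    }
    FORKCALL *pCall = &pBuf->aCalls[pBuf->cCalls++];
    pCall->pfn   = pfn;
    pCall->cbArg = cbArg;
    if (cbArg)
        memcpy(pCall->abArg, pvArg, cbArg);
    return 0;
}

/* Example callback: adds the int argument to a running total. */
static int g_iTotal;
static int addToTotal(FORKBUF *pBuf, void *pvArg, size_t cbArg)
{
    int i;
    (void)pBuf;
    if (cbArg != sizeof(i))
        return -1;
    memcpy(&i, pvArg, sizeof(i));
    g_iTotal += i;
    return 0;
}
```

Queuing addToTotal leaves g_iTotal unchanged until forkbufFlush() is called, which is why a caller that needs the callback's return code has to flush right after invoking.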