Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datetime NoneType after calling Py_Finalize and Py_Initialize #71587

Open
DennyWeinberg mannequin opened this issue Jun 27, 2016 · 26 comments · May be fixed by #101783
Open

Datetime NoneType after calling Py_Finalize and Py_Initialize #71587

DennyWeinberg mannequin opened this issue Jun 27, 2016 · 26 comments · May be fixed by #101783
Assignees
Labels
3.12 bugs and security fixes extension-modules C modules in the Modules dir topic-subinterpreters type-bug An unexpected behavior, bug, or error

Comments

@DennyWeinberg
Copy link
Mannequin

DennyWeinberg mannequin commented Jun 27, 2016

BPO 27400
Nosy @ncoghlan, @abalkin, @tiran, @Fak3, @JimJJewett, @zooba, @MojoVampire, @ndjensen, @yan12125, @ammaraskar, @cschramm, @pganssle
Files
  • issue27400.patch
  • 27400.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/abalkin'
    closed_at = None
    created_at = <Date 2016-06-27.13:49:18.523>
    labels = ['interpreter-core', 'type-bug', '3.7']
    title = 'Datetime NoneType after calling Py_Finalize and Py_Initialize'
    updated_at = <Date 2018-07-05.17:14:18.266>
    user = 'https://bugs.python.org/DennyWeinberg'

    bugs.python.org fields:

    activity = <Date 2018-07-05.17:14:18.266>
    actor = 'pablogsal'
    assignee = 'belopolsky'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2016-06-27.13:49:18.523>
    creator = 'Denny Weinberg'
    dependencies = []
    files = ['44678', '46478']
    hgrepos = []
    issue_num = 27400
    keywords = ['patch']
    message_count = 15.0
    messages = ['269379', '269381', '269751', '269914', '269916', '276598', '276604', '276622', '276626', '276630', '276632', '277184', '285560', '286618', '286636']
    nosy_count = 15.0
    nosy_names = ['ncoghlan', 'belopolsky', 'christian.heimes', 'palm.kevin', 'Roman.Evstifeev', 'Jim.Jewett', 'steve.dower', 'josh.r', 'ndjensen', 'yan12125', 'Denny Weinberg', 'ammar2', 'cschramm', 'p-ganssle', 'FrankBlabu']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue27400'
    versions = ['Python 3.6', 'Python 3.7']

    Linked PRs

    @DennyWeinberg
    Copy link
    Mannequin Author

    DennyWeinberg mannequin commented Jun 27, 2016

    After calling Py_Finalize and Py_Initialize I get the message "attribute of type 'NoneType' is not callable" on the datetime.strptime method.

    Example:

    from datetime import datetime
    s = '20160505 160000'
    refdatim = datetime.strptime(s, '%Y%m%d %H%M%S')

    The first call works fine but it crashes after the re initialization.

    Workaround:

    from datetime import datetime
    s = '20160505 160000'
    try:
        refdatim = datetime.strptime(s, '%Y%m%d %H%M%S')
    except TypeError:
        import time
        refdatim = datetime.fromtimestamp(time.mktime(time.strptime(s, '%Y%m%d %H%M%S')))

    Related Issue: bpo-17408 ("second python execution fails when embedding")

    @DennyWeinberg DennyWeinberg mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Jun 27, 2016
    @DennyWeinberg
    Copy link
    Mannequin Author

    DennyWeinberg mannequin commented Jun 27, 2016

    Just to be clear:

    The error happens after these steps:

    1. Call strptime
    2. Call cpython function "Py_Finalize" and "Py_Initialize"
    3. Call strptime again

    Now we get the error "attribute of type 'NoneType' is not callable"

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jul 3, 2016

    Thanks for the report Denny. Looking at https://hg.python.org/cpython/file/30099abdb3a4/Modules/datetimemodule.c#l3929, there's a problematic caching of the "_strptime" module that is almost certainly the cause of the problem - it will attempt to call _strptime._strptime from the already finalized interpreter rather than the new one.

    It should be possible to adjust that logic to permit a check for _strptime._strptime being set to None, and reimporting _strptime in that case.

    @ammaraskar
    Copy link
    Member

    Is there any particular reason that datetime.strptime caches the imported module like that?

    From a quick search, these two other examples don't bother with any caching:

    PyObject *strptime_module = PyImport_ImportModuleNoBlock("_strptime");

    io = PyImport_ImportModuleNoBlock("io");

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jul 7, 2016

    Aye, skipping the caching entirely would be an even simpler solution - the only thing it is saving in the typical case is a dictionary lookup in the modules cache.

    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Sep 15, 2016

    Nick: Looks like it's quite a bit more work than just a dict lookup. That PyImport_ImportModuleNoBlock call (which seems odd; the implementation of NoBlock is just to wrap the blocking function; guess we don't allow non-blocking imports anymore and this is just to avoid changing all the names elsewhere?) involves a *lot* more work than just a dict lookup (it devolves to a PyImport_Import call https://hg.python.org/cpython/file/3.5/Python/import.c#l1743 , which basically does everything involved in the import process aside from actually reading/parsing the file unconditionally, because of how weird __import__ overrides can be, I guess).

    While it's not a perfect comparison, compare:

    >>> import _strptime  # It's now cached
    
    # Cache globals dict for fair comparison without globals() call overhead
    >>> g = globals()     
    
    # Reimport (this might be *more* expensive at C layer, see notes below)
    >>> %timeit -r5 import _strptime
    1000000 loops, best of 5: 351 ns per loop
    
    # Dict lookup (should be at least a bit cheaper at C layer if done equivalently, using GetAttrId to avoid temporary str)
    >>> %timeit -r5 g['_strptime']
    10000000 loops, best of 5: 33.1 ns per loop
    
    # Cached reference (should be *much* cheaper at C layer)
    >>> %timeit -r5 _strptime
    100000000 loops, best of 5: 19.1 ns per loop

    Note: I'm a little unclear on whether a Python function implemented in C has its own globals, or whether it's simulated as part of the C module initialization); if it lacks globals, then the work done for PyImport_Import looks like it roughly doubles (it has to do all sorts of work to simulate globals and the like), so that 351 ns per re-import might actually be costlier in C.

    Either way, it's a >10x increase in cost to reimport compared to a dict lookup, and ~18x speedup over using a cached reference (and like I said, I think the real cost of the cheaper options would be much less in C, so the multiplier is higher). Admittedly, in tests, empty string calls to _strptime._strptime take around 7.4 microseconds (with realistic calls taking 8.5-13.5 microseconds), so caching is saving maybe a third of a microsecond overhead, maybe 2.5%-4.5% of the work involved in the strptime call.

    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Sep 15, 2016

    Hmm... On checking down some of the code paths and realizing there were some issues in 3.5 (redundant code, and what looked like two memory leaks), I checked tip (to avoid opening bugs on stale code), and discovered that bpo-22557 rewrote the import code, reducing the cost of top level reimport by ~60%, so my microbenchmarks (run on Python 3.5.0) are already out of date for 3.6's faster re-import. Even so, caching wasn't a wholly unreasonable optimization before now, and undoing it now still has a cost, if a smaller one.

    @abalkin
    Copy link
    Member

    abalkin commented Sep 15, 2016

    I am not sure this is possible to fix without refactoring the datetime module according to PEP-3121. See bpo-15390.

    @tiran
    Copy link
    Member

    tiran commented Sep 15, 2016

    PEP-3121 is a big change. Can we use PyModuleDef->m_clear() for a clever hack?

    @abalkin
    Copy link
    Member

    abalkin commented Sep 15, 2016

    Yes, I think something like the attached patch may do the trick.

    @abalkin abalkin added the 3.7 (EOL) end of life label Sep 15, 2016
    @abalkin abalkin self-assigned this Sep 15, 2016
    @tiran
    Copy link
    Member

    tiran commented Sep 15, 2016

    Wouldn't it clear strptime_module when a subinterpreter shuts down, too? It's not a big deal because it can't cause a crash.

    @jimjjewett
    Copy link
    Mannequin

    jimjjewett mannequin commented Sep 21, 2016

    Having to (re-)fill the cache once per interpreter seems like a reasonable price to pay.

    Why is 3.5 not included? Did this not cause problems before the import change, or is it just that this bug is small enough that maybe it isn't worth backporting?

    @DennyWeinberg
    Copy link
    Mannequin Author

    DennyWeinberg mannequin commented Jan 16, 2017

    Any news here? 3.6.0 is also affected by this bug.

    @cschramm
    Copy link
    Mannequin

    cschramm mannequin commented Feb 1, 2017

    This issue should have a much higher priority as it basically breaks Python embedding unless the user either does not re-initialize the interpreter or avoid the use of _datetime.strptime.

    We're currently testing with a patch based on Christian and Alexander's idea but using m_free as using m_clear and Py_CLEAR neither makes sense to me nor did it work in conjunction with Py_Finalize when testing it. A version matching the current tip is attached. We did not run into any issues so far (and the only thing I can think of is clearing the static variable too often and thus causing some extra imports).

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Feb 1, 2017

    I've added Steve Dower to the nosy list, as he's done some work recently on making the Windows builds more embedding-friendly, and I believe at least some of that time may have been funded work.

    Unfortunately, we don't currently have anyone I'm aware of that's being specifically paid to improve or maintain CPython's embedding support in general, so getting attention for these kinds of interpreter reinitialization bugs can be a bit hit-or-miss.

    @MarkusEh
    Copy link

    Hi,

    This issue results in bugs in KODI, see xbmc/xbmc#17311 .
    Please advice:

    • Should we ask KODI developers to implement workarounds, as we cannot expect this bug to be fixed in the near future?
    • Or is it worth while to wait for a bug fix in Python?

    garymm added a commit to garymm/plex-for-kodi that referenced this issue Sep 26, 2022
    garymm added a commit to garymm/plex-for-kodi that referenced this issue Sep 26, 2022
    ruuk pushed a commit to plexinc/plex-for-kodi that referenced this issue Dec 3, 2022
    @brettcannon
    Copy link
    Member

    FYI if anyone wants to fix this, you need to make the following line not static:

    static PyObject *module = NULL;

    Now performance might not suck in this instance if you just change that line, but who knows without testing (last time this was looked at was the 3.5 days and things have obviously changed since then). Otherwise datetime should get updated for multi-phase init or do something such that caching _strptime happens as part of calling PyInit__datetime() and not datetime_strptime().

    @abalkin
    Copy link
    Member

    abalkin commented Feb 10, 2023

    @brettcannon - there are two patches attached to this issue from 6 years ago. One is mine and another by Christopher Schramm (@cschramm). Let me convert Christopher's patch to a PR so that we can test and review it.

    abalkin pushed a commit to abalkin/cpython-1 that referenced this issue Feb 10, 2023
    @bedevere-bot bedevere-bot linked a pull request Feb 10, 2023 that will close this issue
    @abalkin abalkin linked a pull request Feb 10, 2023 that will close this issue
    @abalkin abalkin added extension-modules C modules in the Modules dir 3.12 bugs and security fixes and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.7 (EOL) end of life labels Feb 10, 2023
    abalkin added a commit to abalkin/cpython-1 that referenced this issue Feb 11, 2023
    @abalkin
    Copy link
    Member

    abalkin commented Feb 11, 2023

    It looks like embedding the datetime module is completely broken:

    % ./Programs/_testembed test_repeated_init_exec 'print("import datetime");import datetime'
    --- Loop #1 ---
    import datetime
    --- Loop #2 ---
    import datetime
    _testembed(90330,0x2093e0100) malloc: *** error for object 0x100db3a50: pointer being freed was not allocated
    _testembed(90330,0x2093e0100) malloc: *** set a breakpoint in malloc_error_break to debug

    Here is the backtrace:

    (lldb) 
    frame #16: 0x000000010098365c _datetime.cpython-312d-darwin.so`_datetime_exec(module=0x000000010296ef90) at _datetimemodule.c:6728:5
       6725	
       6726	    /* timedelta values */
       6727	    PyObject *d = PyDateTime_DeltaType.tp_dict;
    -> 6728	    DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
       6729	    DATETIME_ADD_MACRO(d, "min", new_delta(-MAX_DELTA_DAYS, 0, 0, 0));
       6730	    DATETIME_ADD_MACRO(d, "max",
       6731	                       new_delta(MAX_DELTA_DAYS, 24*3600-1, 1000000-1, 0));
    (lldb) bt
    * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xdddddddedea5690d)
        frame #0: 0x00000001001790c8 _testembed`_PyObject_DebugDumpAddress(p=0x0000000100c78b30) at obmalloc.c:2150:13
        frame #1: 0x0000000100176314 _testembed`_PyMem_DebugCheckAddress(func="_PyMem_DebugRawFree", api='r', p=0x0000000100c78b30) at obmalloc.c:2068:9
        frame #2: 0x0000000100175080 _testembed`_PyMem_DebugRawFree(ctx=0x00000001005fc400, p=0x0000000100c78b30) at obmalloc.c:1901:5
        frame #3: 0x00000001001756c4 _testembed`PyMem_RawFree(ptr=0x0000000100c78b30) at obmalloc.c:574:5
        frame #4: 0x000000010017489c _testembed`_PyObject_Free(ctx=0x0000000000000000, p=0x0000000100c78b30) at obmalloc.c:1615:9
        frame #5: 0x00000001001750c4 _testembed`_PyMem_DebugRawFree(ctx=0x00000001005fc460, p=0x0000000100c78b40) at obmalloc.c:1905:5
        frame #6: 0x00000001001751b8 _testembed`_PyMem_DebugFree(ctx=0x00000001005fc460, ptr=0x0000000100c78b40) at obmalloc.c:2038:5
        frame #7: 0x0000000100175bec _testembed`PyObject_Free(ptr=0x0000000100c78b40) at obmalloc.c:706:5
        frame #8: 0x0000000100192dac _testembed`object_dealloc(self=0x0000000100c78b40) at typeobject.c:5007:5
        frame #9: 0x0000000100173568 _testembed`_Py_Dealloc(op=0x0000000100c78b40) at object.c:2425:5
        frame #10: 0x000000010014bd50 _testembed`Py_DECREF(filename="./Include/object.h", lineno=654, op=0x0000000100c78b40) at object.h:549:9
        frame #11: 0x000000010014fee0 _testembed`Py_XDECREF(op=0x0000000100c78b40) at object.h:654:9
        frame #12: 0x000000010014ce88 _testembed`insertdict(mp=0x0000000100c6a690, key=0x00000001029bc860, hash=2255823617163242250, value=0x00000001029a8e00) at dictobject.c:1297:5
        frame #13: 0x000000010014c6ec _testembed`_PyDict_SetItem_Take2(mp=0x0000000100c6a690, key=0x00000001029bc860, value=0x00000001029a8e00) at dictobject.c:1841:12
        frame #14: 0x000000010014bcc0 _testembed`PyDict_SetItem(op=0x0000000100c6a690, key=0x00000001029bc860, value=0x00000001029a8e00) at dictobject.c:1859:12
        frame #15: 0x0000000100152e7c _testembed`PyDict_SetItemString(v=0x0000000100c6a690, key="resolution", item=0x00000001029a8e00) at dictobject.c:3857:11
      * frame #16: 0x000000010098365c _datetime.cpython-312d-darwin.so`_datetime_exec(module=0x000000010296ef90) at _datetimemodule.c:6728:5
    ...
    

    @erlend-aasland
    Copy link
    Contributor

    erlend-aasland commented Feb 15, 2023

    It looks like embedding the datetime module is completely broken:

    Yes, it is. The underlying C extension module implements multi-phase initialisation, but uses static global data (static types, etc.). There is no guard for repeated initialisation. In order to fix this, we must move all global data into module state (including porting static types to heap types). For now, a quick-fix could be to simply add an initialisation guard. Proof-of-concept:

    diff --git a/Modules/_datetimemodule.c b/Modules/_datetimemodule.c
    index eda8c5610b..f2e549f525 100644
    --- a/Modules/_datetimemodule.c
    +++ b/Modules/_datetimemodule.c
    @@ -6681,9 +6681,15 @@ datetime_destructor(PyObject *op)
         PyMem_Free(ptr);
     }
     
    +static int initialized;
    +
     static int
     _datetime_exec(PyObject *module)
     {
    +    if (initialized) {
    +        return 0;
    +    }
    +
         // `&...` is not a constant expression according to a strict reading
         // of C standards. Fill tp_base at run-time rather than statically.
         // See https://bugs.python.org/issue40777
    @@ -6860,6 +6866,7 @@ _datetime_exec(PyObject *module)
         if (us_per_hour == NULL || us_per_day == NULL || us_per_week == NULL) {
             return -1;
         }
    +    initialized = 1;
         return 0;
     }
     

    cc. @kumaraditya303

    @erlend-aasland
    Copy link
    Contributor

    erlend-aasland commented Feb 25, 2023

    @kumaraditya303: I have a proof-of-concept branch for datetime ready at https://github.com/erlend-aasland/cpython/tree/isolate-datetime/poc. There's still some work to do with it, but it is an ok starting point for now. The diff is rather large, so we should consider breaking it up in multiple PRs.

    Issues:

    • datetime.datetime and datetime.time use different object layouts depending on if they are timezone aware or not1
    • theres still some optimisations to be done2
    • ./python.exe -m test test_datetime works fine, but ./python.exe -m test -R : test_dateime does not; might be a problem with the test code

    FWIW, ./Programs/_testembed test_repeated_init_exec 'print("import datetime");import datetime' works fine.

    Footnotes

    1. it's a memory optimisation; for now, I've just disabled that optimisation, I just use the "large" structs all the time

    2. store state in object structs? use defining class / clinic and _PyType_GetModuleState iso. the mro walk. Come to think of it, porting _datetime to Argument Clinic could be a nice prerequisite for the isolation.

    @zooba
    Copy link
    Member

    zooba commented Feb 27, 2023

    • ./python.exe -m test test_datetime works fine, but ./python.exe -m test -R : test_dateime does not; might be a problem with the test code

    You mean there are refleaks? Or the test doesn't like running multiple times? (I assume dateime is a typo in the message, not in your command) Either of those would be real issues that I'd expect with this kind of change.

    A quick look at your changes, they seem like what I'd expect. I'm not sure that breaking them up would help anything - apart from the failures, it's pretty mechanical.

    @erlend-aasland
    Copy link
    Contributor

    Yeah, that's a typo. Multiple test runs result in test failures, not refleaks. Sorry for not specifying that.

    @erlend-aasland
    Copy link
    Contributor

    As noted in the footnotes of my previous post, I think there is a benefit from converting datetime to Argument Clinic before doing this change.

    @kumaraditya303
    Copy link
    Contributor

    As noted in the footnotes of my previous post, I think there is a benefit from converting datetime to Argument Clinic before doing this change.

    SGTM

    @kumaraditya303
    Copy link
    Contributor

    I have a proof-of-concept branch for datetime ready at

    Okay but before proceeding to that let's finish #101948 first.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.12 bugs and security fixes extension-modules C modules in the Modules dir topic-subinterpreters type-bug An unexpected behavior, bug, or error
    Projects
    Status: Todo
    Development

    Successfully merging a pull request may close this issue.

    10 participants