Let’s say you have a complex class with a number of attributes. The class is used in a few different ways, so sometimes the attributes are available, but sometimes they haven’t been initialized yet. Because of global knowledge about how the class is used, we know which paths are certain to have the attributes, and which might not have them.
[UPDATE: I’ve changed my mind: Late initialization, reconsidered]
(If you are interested, the real code I’m thinking about is from coverage.py, but this post has toy examples for clarity.)
Before static type checking, I’d initialize these attributes to None. In the certain-to-exist code paths, I’d just use the attributes. In the uncertain code paths, I’d check if an attribute was None before using it:
# Original untyped code.
class Complicated:
    def __init__(self):
        self.other = None

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if self.other is not None:
            self.other.do_something()
How should I add type annotations to a situation like this? The most obvious approach is to declare the attribute as Optional. But that means adding asserts to the certain paths. Without them, the type checker will warn us that the attribute might be None. Type checkers don’t have the global understanding that makes us certain about them being available on those paths. Now we need extra code for both certain and uncertain paths: asserts for one and run-time checks for the other:
# Simple Optional typing.
from typing import Optional

class Complicated:
    def __init__(self):
        self.other: Optional[OtherThing] = None

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        # Tell the type checker what we already know globally.
        assert self.other is not None
        self.other.do_something()

    def uncertain_path(self):
        if self.other is not None:
            self.other.do_something()
This is a pain if there are many certain paths, or many of these attributes to deal with. It just adds clutter.
A second option is to have the attribute exist or not exist rather than be None or not None. We can type these ghostly attributes as definitely not None, but then we have to check whether they exist in the uncertain paths:
# Ghost: attribute exists or doesn't exist.
class Complicated:
    def __init__(self):
        # declared but not defined:
        self.other: OtherThing

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if hasattr(self, "other"):
            self.other.do_something()
This is strange: you don’t often see a class that doesn’t know in its own code whether attributes exist or not. This is how I first adjusted the coverage.py code with type annotations: six attributes declared but not defined. But it didn’t sit right with me, so I kept experimenting.
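To make "declared but not defined" concrete: a bare annotation such as self.other: OtherThing inside __init__ tells the type checker about the attribute but assigns nothing, so the attribute does not exist on the instance until make_other() runs. The sketch below uses a stand-in OtherThing (an assumption, since the real class isn't shown in this post):

```python
class OtherThing:
    """Stand-in for the real collaborator (hypothetical)."""
    def do_something(self) -> str:
        return "did something"


class Complicated:
    def __init__(self) -> None:
        # Annotation only: nothing is assigned, so no attribute is created.
        self.other: OtherThing

    def make_other(self) -> None:
        self.other = OtherThing()


c = Complicated()
before = hasattr(c, "other")      # False: the attribute is still a ghost

try:
    c.other.do_something()
    raised = False
except AttributeError:
    raised = True                 # a "certain" path taken too early fails here

c.make_other()
after = hasattr(c, "other")       # True once make_other() has run
```

So a mistake about which paths are certain shows up as an AttributeError at run time, not as a type-checker warning.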
A third option is to use two attributes for the same value: one is typed Optional and one is not. This lets us avoid asserts on the certain paths, but is really weird and confusing:
# Two attributes for the same value.
from typing import Optional

class Complicated:
    def __init__(self):
        self.other: OtherThing
        self.other_maybe: Optional[OtherThing] = None

    def make_other(self):
        self.other = self.other_maybe = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if self.other_maybe is not None:
            self.other_maybe.do_something()
But if we’re going to use two attributes in the place of one, why not make it the value and a boolean?
# Value and boolean.
class Complicated:
    def __init__(self):
        self.other: OtherThing
        self.other_exists: bool = False

    def make_other(self):
        self.other = OtherThing()
        self.other_exists = True

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if self.other_exists:
            self.other.do_something()
This is about the same as “exists or doesn’t exist”, but with a second nearly-useless attribute, so what’s the point?
Another option: the attribute always exists, and is never None, but is sometimes a placebo implementation that does nothing for those times when we don’t want it:
# Placebo
class OtherPlacebo(OtherThing):
    def do_something(self):
        pass

class Complicated:
    def __init__(self):
        self.other: OtherThing = OtherPlacebo()

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        self.other.do_something()
A philosophical quandary about placebos: should they implement all the base class methods, or only those that we know will be invoked in the uncertain code paths? Type checkers are fine with either, and run-time is of course fine with only the subset.
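One thing worth keeping in mind with the bare-bones choice, at least in the toy code above where the placebo subclasses OtherThing: any method the placebo doesn't override is inherited, so it runs the real implementation rather than doing nothing. A minimal sketch, where do_more() and the calls list are hypothetical additions for illustration:

```python
class OtherThing:
    """Stand-in for the real class; do_more() is a hypothetical second method."""
    def __init__(self):
        self.calls = []

    def do_something(self):
        self.calls.append("do_something")

    def do_more(self):
        self.calls.append("do_more")


class OtherPlacebo(OtherThing):
    """Bare-bones placebo: overrides only the method the uncertain paths use."""
    def do_something(self):
        pass


placebo = OtherPlacebo()
placebo.do_something()   # overridden: does nothing
placebo.do_more()        # not overridden: the real OtherThing.do_more() runs

print(placebo.calls)     # ['do_more']
```

If that fall-through could ever be harmful, it is an argument for overriding every method in the placebo.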
In the end, I liked the placebo strategy best: it removes the need for any checking or asserts. I implemented the placebos as bare-bones with only the needed methods. It can make the logic a bit harder to understand at a glance, but I think I mostly don’t need to know whether it’s a placebo or not in any given spot. Maybe six months from now I’ll be confused by the switcheroos happening, but it looks good right now.
Comments
It seems to me that it would be simplest to just use Optional and check for is not None even if it is slightly tedious.
Adding placebo objects seems to overcomplicate things in order to evade typing as well as being nonstandard.
If there’s a lot of these checks, you could add a maybe_do_with_other(self, action) method that does the checking in a single place, although I’m not sure off the top of my head how that would interact with typing.
Another thought that comes to mind is that using a placebo like this is similarish to the strategy pattern, so maybe you could pass in other as a strategy instance up front rather than having a make_other method? But this also seems overly complex just to avoid some None checks.
Are six lines of assert ... is not None at the top of various functions really so bad?
The Placebo solution is totally bespoke and doesn’t even work as advertised – you had to create CoverageData._real. This feels like adding magic and cognitive overhead for marginal aesthetic reasons.
I’m not a fan of the placebo pattern: it neutralizes the benefits of strict optional type checking, which throws the baby out with the bathwater.
In the “Simple optional typing” approach, every time you access the attribute, the type checker forces you to consider whether you should treat the path as certain versus uncertain. This is something you need to be thinking about as a programmer because the type checker can’t do it for you. Once the code is written, you can also see at the callsite which mode you are in and, if you get an AssertionError, you know there is a bug in your program.
One possibility would be to wrap the optional testing in a helper method, forcing you to declare at the callsite whether you are on the certain or uncertain path.
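A sketch of what such a helper might look like. The name do_other_something() comes from this comment; the certain flag, the boolean return value, and the stand-in OtherThing are assumptions for illustration, not the commenter's actual code:

```python
from typing import Optional


class OtherThing:
    """Stand-in for the real collaborator (hypothetical)."""
    def do_something(self) -> None:
        pass


class Complicated:
    def __init__(self) -> None:
        self.other: Optional[OtherThing] = None

    def make_other(self) -> None:
        self.other = OtherThing()

    def do_other_something(self, *, certain: bool) -> bool:
        """Call other.do_something() if possible; return whether it ran."""
        if self.other is None:
            # On a certain path a missing attribute is a bug, so fail loudly.
            assert not certain, "other should have been initialized"
            return False
        self.other.do_something()
        return True

    def certain_path(self) -> None:
        self.do_other_something(certain=True)

    def uncertain_path(self) -> None:
        self.do_other_something(certain=False)
```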
However, this can be unwieldy if there are a lot of possible methods on other that you want to call and don’t want to wrap them all. Also, you may have calls to do_other_something() that don’t do something, which can make it more difficult to reason about the code.
If you don’t need to support versions before 3.8, you can combine overload and Literal from typing with assignment expressions to make what I think is a very nice pattern. The callsites are pretty clean and you never have a do_something() call that doesn’t do something.
Oops, I’m missing the self arguments on the get_other() definitions above and there is a trailing ..., but you get the idea.
Or maybe dispense with all the cleverness and just have a generic unwrap() method for the certain paths.
I hear what you all are saying, and am thinking about it. My annotation style will likely continue to evolve :)
BTW: I changed my mind: https://nedbatchelder.com/blog/202302/late_initialization_reconsidered.html