Exploding comments

Wednesday 21 December 2005

A debate about asserts flared up in the fall, and I only just now caught up with it. Len Holgate started it with Asserts are Evil. His complaints center mostly around how asserts are traditionally implemented (only in debug builds, and they exit the program if failed).

B (?) mostly agreed: Assertions are still evil. It’s in the comments on that thread that the defining phrase was coined: Len Holgate described asserts as “exploding comments”.

Noel Llopis came to assert’s defense with Asserting Oneself.

I’ve written about asserts before, and I’m all for them. Programmers need every tool they can get to help nail down the often slippery semantics of the complex beasts they are building. Even if they are only exploding comments, isn’t that better than just plain-old comments?

The biggest misconception about asserts (and I think the heart of Len’s complaints about them) is the confusion of the caller’s semantics with the implementation details. The main thing I learned while writing my Asserts essay was the detailed semantics of an ASSERT statement:

ASSERT(expr)

Asserts that an expression is true. The expression may or may not be evaluated.

  • If the expression is true, execution continues normally.
  • If the expression is false, what happens is undefined.

That last word is the most important: if the expression is false, what happens is undefined. Note that nothing is said about debug vs. release builds. Nothing is said about exiting the process. So when Len complains,

assertions are often hard to test for; try and write a unit test that proves that a “traditional” assert works... It’s complicated because assert ends the program...

how does he know the failed assert will end the program? Who said? He’s assuming a detail about the implementation of the assert. It sounds like he’s living with too-simple an implementation.
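For concreteness, that too-simple implementation looks roughly like this. This is only a sketch of the classic debug-only assert (assert_failed is a made-up helper name), but it shows the behavior Len is assuming:

// A sketch of the classic debug-only assert, roughly what <assert.h> does:
// in a release build (NDEBUG defined) the expression is never even evaluated.
#include <cstdio>
#include <cstdlib>

inline void assert_failed(const char* expr, const char* file, int line)
{
    std::fprintf(stderr, "Assertion failed: %s (%s:%d)\n", expr, file, line);
    std::abort();   // the "exploding" part: the process just dies
}

#ifdef NDEBUG
    #define ASSERT(expr) ((void)0)
#else
    #define ASSERT(expr) ((expr) ? (void)0 : assert_failed(#expr, __FILE__, __LINE__))
#endif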

In the assert framework I built at work, a failed assert could do one of a number of things:

  • Log a big red error message with a stack trace,
  • Throw an exception,
  • Display a dialog box,
  • Run an external program,
  • Debug the process,
  • Abort the process,
  • Nothing (ignore the failed assert),
  • or a complex combination of the above.

The decision about what to do is mostly controlled by defaults based on product (clients display a dialog, servers throw an exception), but could be affected by registry settings, or by the user’s choice if a dialog comes up.
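A minimal sketch of the shape of such a framework, with invented names rather than the actual code I built, might look like this:

// Sketch of a configurable ASSERT: the failure policy is chosen at runtime,
// so the ASSERT site never hard-codes "exit the program".  Names are invented.
#include <cstdio>
#include <functional>
#include <stdexcept>

using AssertHandler = std::function<void(const char*, const char*, int)>;

inline AssertHandler& assert_handler()
{
    // Default policy: log and throw.  A GUI client could install a handler
    // that shows a dialog; a server could install one that logs a stack trace.
    static AssertHandler handler = [](const char* expr, const char* file, int line) {
        std::fprintf(stderr, "ASSERT FAILED: %s (%s:%d)\n", expr, file, line);
        throw std::logic_error(expr);
    };
    return handler;
}

#define ASSERT(expr) \
    ((expr) ? (void)0 : assert_handler()(#expr, __FILE__, __LINE__))

The point is that the ASSERT site states a fact; what happens on failure is policy, installed somewhere else.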

The anti-assert people assert (!) that unit testing is a much better way to guarantee the correctness of an implementation. I’m all for unit tests, but have found it very difficult to build them for complex systems being built on tight deadlines. But unit tests depend on asserts, so I’m not even sure it’s an either/or situation. In many unit test frameworks, it is specifically failed asserts that signal the test failure to the test runner, which continues on to the next test.
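As a rough illustration of that last point, here is a simplified, hypothetical test runner (not any particular framework's code) where a failed assert throws, is caught, and the run moves on:

// Simplified sketch: a failed test assert throws, the runner catches it,
// records the failure, and continues with the next test.  Invented names.
#include <cstdio>
#include <stdexcept>
#include <vector>

struct TestFailure : std::runtime_error { using std::runtime_error::runtime_error; };

#define TEST_ASSERT(expr) \
    do { if (!(expr)) throw TestFailure("assert failed: " #expr); } while (0)

void run_tests(const std::vector<void (*)()>& tests)
{
    for (auto test : tests) {
        try {
            test();
            std::puts("PASS");
        } catch (const TestFailure& f) {
            std::printf("FAIL: %s\n", f.what());   // note it and keep going
        }
    }
}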

I say asserts are good. I’ve found them very useful for building self-diagnosing code that can raise the alarm when it hasn’t adapted to changing requirements fast enough.

Comments

[gravatar]
I'm with you on this, Ned. Unit tests are great, but when you have many units interacting in extremely complex ways they're nowhere near sufficient because practically all of the most interesting (by which I mean "hard to debug") problems lie in between. Also, as you point out, unit tests and asserts often work together. They're complementary, not mutually exclusive.

The place where I think Len has a point is that it's wrong to rely only on asserts and that splatting an assert to the viewer of a web page isn't really very useful. In the example that triggered his article, if that same information had been splatted to a server-side log it would have been much more useful.

In data storage it's common to say that if something unexpected occurs you have to stop because the alternative is to risk data corruption. It's good to stop in a way that facilitates later diagnosis without requiring that the user leave their system idle and their firewall open so a developer can poke around - many times that just ain't gonna happen - but someone viewing a web page has no use for that kind of information and it might even be a security risk to disclose it so publicly. Even with the best unit testing (and static code checking) in the world, having an assert to check for unexpected conditions and dump information before stopping the program is an essential part of making such a system maintainable in the field.
[gravatar]
Ned,

The thing that people seem to miss about my original piece on asserts was that I was complaining about the "traditional" C style assert and the fact that the standard behaviour for that assert isn't especially sensible (IMHO). I was complaining that often people don't treat assert as a first class programming construct and simply throw them into code rather than fixing the real problems.

I'm all for people providing their own assert if that's what works for them. I'm also all for people who apply discipline to how they add asserts (or equivalent) to their code. But, personally, I don't see much of a value add over using exceptions... I expect that's because of my server software background, I can see how it's more useful to have a pop up message if you're working on a GUI app with a framework that doesn't understand that every message handler could throw an exception...
[gravatar]
Platypus: At Lotus, working on Notes and Domino, our assert was, "we crash to protect your data".

Len: I understand your complaints about the traditional C style assert. That's what I referred to as a "too-simple implementation" of asserts. But I disagree that exceptions are just as good. Even if failed asserts simply throw an exception, there's an expressive difference between "if (p == NULL) throw" and "ASSERT(p != NULL)".

The first says p can be NULL, and we need to code for it. The second says p cannot be NULL. The developer is saying two different things, and follow-on developers read two different things. The intention of the developer is just as important as the run-time execution semantics.
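A tiny illustration of the contrast, with a hypothetical Widget type (ASSERT here stands for whatever assert facility is in play):

#include <cassert>
#include <cstddef>
#include <stdexcept>

#define ASSERT assert              // stand-in for your assert facility of choice

struct Widget { void draw() {} };  // hypothetical type, for illustration only

// Says: "p may be NULL, and callers must be prepared for that case."
void render_checked(Widget* p)
{
    if (p == NULL)
        throw std::invalid_argument("null widget");
    p->draw();
}

// Says: "p can never be NULL here; if it ever is, this program has a bug."
void render_trusted(Widget* p)
{
    ASSERT(p != NULL);
    p->draw();
}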
[gravatar]
Ned: I agree that there's little practical difference between an assert and an uncaught exception (or one caught only by a generic handler) except for the one you note. When (p==NULL) is a legitimately possible error condition it should be handled, generally with neither an assert nor an exception. Exceptions are supposed to be exceptional, not a signaling mechanism for expected conditions. On the other hand, when one asserts that (p!=NULL) it's a statement about something the code itself can control. If an assertion fails it means the programmer ended up not doing what they thought they were doing; expressing it as an assert is both more representative of intent and more concise than as an explicit check plus a throw.
[gravatar]
Ned,

I get a "here we go again" feeling about this but...

Assuming we're talking about C++ as we're comparing an assert with an exception... I consider an assert(p != null) to be indicative of a design flaw. Why isn't "p" a reference if it can never be null? I wouldn't be looking to put an exception there, I'd be looking to remove the chance that it can ever happen by passing a reference if the null pointer is not a valid option. If it IS an option then neither an assert nor an exception is appropriate... This is why I take the view that encouraging asserts encourages bad design. In the example you show the assert is a sticking plaster over a broken design. I prefer to disallow the plaster and force the design to be fixed.

The problem with a 'not too simple implementation' is that unless it's just using exceptions then you're starting to couple your code to some common assert concept and that may (or may not) become a problem. I personally prefer to stick with a mechanism that's within the language rather than building some new mechanism that my code becomes coupled to and that the developers all have to understand. Looking at code that includes "if() throw etc" is easy for all to read and obvious to all exactly what happens. If I see ASSERT(blah) I need to be familiar with your personal implementation of assert and all the baggage that comes with it... That may, or may not, be a problem. What may or may not be more expressive to you may just be less precise, or require more thinking, for me...

But I've said all this before ;)

I've no problem with people having an appropriate error handling strategy, I think it's unfortunate that many call this ASSERT given the broad nature of what assert means to different people. For novice programmers, being told to use assert without also being told that they shouldn't rely on a 'simplistic' assert, like the one that's "standard", seems often to lead to bad code...
[gravatar]
Len, getting bogged down in what the ASSERT actually does is bad coupling. When you code an ASSERT, you need to know three things: 1) the expression may or may not be evaluated, 2) if it is true, execution will continue, and 3) if it is false, you can't count on anything at all, because it is undefined what will happen. That is the contract of ASSERT, that is what ASSERT means. If you have need for more promises than that, then don't use ASSERT, find something else that better meets your needs.

ASSERTs should only be written when the coder truly believes that the expression can never be false. That's why they're OK with that "undefined" semantic in the definition of ASSERT.

One way to look at it: if you think of ASSERT as an "if", then you are using it wrong. It is a statement of fact, an assertion. ASSERT(p != NULL) means p can never be NULL.

As to making p a reference, that may be an option in some designs, or it may not. There are plenty of places where we have a pointer in hand, and want to assert that it cannot be null. And "p != NULL" is just a sample assert. There are plenty of other assert expressions which cannot be "syntaxed" away.
[gravatar]
I've never used a language with strong support for Design By Contract, but I think Bertrand Meyer had the right idea with Eiffel. Pre/Post/Invariant conditions better encapsulate what most people agree is the best-case usage of asserts.
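A rough approximation of that style in C++, with made-up REQUIRE and ENSURE macros just to show the shape of the idea:

// Eiffel-ish contracts approximated with asserts; the macro names are invented.
#include <cassert>
#include <cmath>

#define REQUIRE(cond) assert(cond)   // precondition: the caller's obligation
#define ENSURE(cond)  assert(cond)   // postcondition: the implementation's promise

double checked_sqrt(double x)
{
    REQUIRE(x >= 0.0);               // caller must not pass a negative number
    double r = std::sqrt(x);
    ENSURE(r >= 0.0);                // we promise a non-negative result
    return r;
}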
[gravatar]
Ned

Your definition of Assert is fine but it isn't how I see most asserts being used. They're often used for catching 'error' conditions, possibly because the coder in question doesn't think about situations where the expression can be false... Personally I don't find 1) particularly useful when they're used in that way.

Perhaps it's just the code that has been written by others that I've had to deal with in the past but, I find that more often than not a sprinkling of asserts means that the coder doesn't actually know what's going on... Much like overused random debug traces...

"There are plenty of other assert expressions which cannot be "syntaxed" away"

Show me a non-trivial example then. Everyone uses the assert(!p) argument and then disagrees that you can't just design out of it... If you can't design it away, then it's actually likely to be an error condition that needs to be handled by first-class code that can't be dissolved away to nothing by a preprocessor setting. And it's fine for you to say that your non-standard assert doesn't do that but it's one more thing for a developer to be familiar with before he knows that your asserts work differently. This is even more of an issue if you're mixing in 3rd party source that has a different meaning for their version of assert...

The problem with saying that they're good for things that can't be designed away is that the design suffers, it's like lack of const correctness, people say that it's hard because there's so much code that isn't const correct and all that leads to is more code that isn't const correct... Once you start designing your way out of the problem the problem starts to get smaller because your designs don't lead to situations where you'd need to assert, etc.

Anyway, I've been here before ;) If you get value out of it then that's good.
[gravatar]
I'm a big fan of ASSERTs. They provide a little self-verifying comment (or if you prefer, exploding comment) about the expected state of the running code at that point. And unlike regular comments, they don't get out of sync with the code.

I find Len's strawman arguments against them unconvincing. Len seems to think you should never use pointers if they can't be null. But it's a widely accepted idiom that objects whose ownership will be taken over by the called code should be passed by pointer. And oh yeah, it's entirely possible to have null references:

void Bar(Foo& foo)
{
    foo.dosomething(); // kaboom!
}

int main()
{
    Foo* pFoo = NULL;
    Bar(*pFoo);        // dereferencing a null pointer to bind a "null" reference
}

Also, the argument that we shouldn't use ASSERT() because some people won't know what it means in your project is silly when we take it to its logical end: custom functions in your project are a bad idea because people aren't familiar with their semantics.

I also find ASSERTs that get removed in release builds encourage much more explicit sanity checking in the code, especially in performance sensitive code. Because if we know that all the checks won't slow down the final code, then we tend to be more free with adding more checks. (At least I do anyway)
[gravatar]
Yes, we all know that you can fabricate null references. The point is that even though you probably have to convert a pointer to a reference at some point it's just ONE place that you need to test it for null, from then on you can assume that it's good and never ever need to add an assert to the code to pass that nugget of information on to those who come later. If you're writing code where I can't assume that ALL references are good ALL of the time then that's a whole different discussion...

My point is not to pass everything by reference just things that cannot be null; though I do find that if you follow the advice above then the number of places where you have pointers that could be null will decrease... If passing a null pointer to a function that takes ownership is legal then fine, if it's not legal to do so and you'd prefer not to pass by reference then by all means pass it by pointer and deal with the fact that it's an error to have null passed in. Personally I feel that you should have permanent error handling code to deal with errors. If you want to add an assertion before the permanent error handling code then that's fine but I'd question the utility of the duplication. If you want to replace the permanent error handling code with an assert that may compile away then, IMHO, that's not fine.

If you're working on a large system and it contains lots of code from lots of different sources then it's quite possible that ASSERT means different things in different parts of the system. It's not nice, it's not something to strive for, but it does happen.

Testing a debug build is not the same as testing a release build, especially if the release build has lots of code compiled out. Even more especially if the code is multi-threaded. If you think that it's sensible to remove error handling for a release build because it's never hit when you run your tests on a debug build then that's fine. Just don't ask me to carry a pager...

As I've said time and time again, if it works for you then that's good. Keep doing it. My way works for me.
[gravatar]
"If you think that it's sensible to remove error handling for a release build because it's never hit when you run your tests on a debug build then that's fine."

Right there is the disconnect: ASSERTs aren't error handling. They don't ask a question and then respond with an action. Ned's point that "If the expression is false, what happens is undefined" makes it explicit: nothing is actually being "handled". ASSERTs are an explicit statement of the state of the system at that moment, and if the system isn't in that state, then "Oh Shit", something we thought was impossible has happened and we shouldn't keep going.

For example:
// concatenate an array of strings into a new string
char* Concat(char** strArray, size_t arrayLen)
{
    size_t newSize = 1; // one for null char
    for (size_t i = 0; i < arrayLen; i++) {
        newSize += strlen(strArray[i]);
    }
    char* pNewBuffer = new char[newSize + 1];
    char* p = pNewBuffer;

    for (size_t i = 0; i < arrayLen; i++) {
        memcpy(p, strArray[i], strlen(strArray[i]));
        p += strlen(strArray[i]);
    }
    *p++ = 0; // null terminate

    ASSERT(p == pNewBuffer + newSize);
    return pNewBuffer;
}

This assert gives us assurance that the concatenation loop completed successfully and all strings were copied correctly. An inspection of the source code should give us the same thing, except we aren't computers and most code isn't this trivial, and bugs can be hidden and subtle. But ASSERTs can't be tricked like humans: if the system isn't in the correct state, then something's clearly gone wrong (and sometimes it's the ASSERT that's wrong), regardless of how correct the code looks. If it fails, then clearly we have a bug that needs to be tracked down.

Now this code can be (and should be) tested also by unit test, but unit tests are best at checking the integrity of the output, and by then the code that caused the problem may have long since executed and destroyed the evidence of its logic errors, making debugging harder.
[gravatar]
I'm glad to know in my code when an assert fails, at least it is defined that the bit of code I'm in does not continue. I guess that's the "we crash to protect your data" idea. For unit testing, asserts are great -- they mean that unit tests that merely exercise code will be implicitly testing everything that is in your assert. If garbage in means garbage out, you have to constantly test for garbage. Asserts mean that garbage in means some sort of error coming out.
[gravatar]
I don't think there's anything wrong with that assert. It's not something that should be replaced with an exception. I'd aim to replace it with a test that rendered it redundant. If you were working in a TDD style then you might find that it made it difficult to write your test but that depends on how your particular flavour of assert worked.

I find that's one of the problems of these kinds of discussions ;) I'll raise an objection to one thing and someone will counter with something completely different...

Show me some code for a function that takes ownership of a pointer that cannot be null.

Oh, and you're allocating 1 byte too many...
[gravatar]
The B(?) is for Barry... :-)

My view on assertions is that, 80/20 or 90/10 (pick your ratio), they are used incorrectly. I'm all for getting as much post-mortem information as possible when things go horribly wrong, but in C and C++ they are abused beyond the point where it is funny. Or even intelligible. The classic example shown here is ASSERT(p != NULL). In most cases I've seen it would have been better to pass the pointer in as a reference, guaranteeing non-NULL-ness, and that's just that particular example. The throw/assert thing boils down to the end user might know what to do/I don't know what to do. In most of my cases I can code without assertions or throw and allow a higher piece of code to intelligently deal with it.

I personally use assertions in test harnesses, and very, very rarely in 'production' code. YMMV as the saying goes. As always, my opinion, and others are allowed to disagree :-) I just happen to be of the opinion that in the C/C++ community there are a majority group of programmers that use assertions as a programming crutch.
[gravatar]
Ian,

Thanks for pointing out, so clearly, the misconceptions that everyone seems to have about my point of view. I'm not suggesting that everywhere you'd decide to put an assert you put nothing and just hope for the best. If you read the original postings on the subject then you'll see that I'm suggesting that sometimes you design away the problem, sometimes you ALWAYS handle the failure in a defined way and sometimes you have external unit tests that confirm that the code is operating as you expect. There isn't a single example you could give me where I'd just say "take out the assert and hope for the best".

Asserts mean that, often, if you're using the standard assert that came with your compiler, and you don't know better, and you think that all asserts are equal and all asserts are good, then you have bugs in your release builds because your careful checks will vanish.

You don't need asserts inside the code for unit tests to work effectively.

You do raise another interesting issue though, if you're asserting on input data, which you imply from your "garbage in" comment, where do you draw the boundaries? If A calls B and B calls C and all pass a pointer that mustn't be null but only A is currently accessible for 'external' use as the published API do you assert in all 3?

If anyone's interested in all the stuff that went before Ned's posting on this then you might want to check out these:

http://www.gamesfromwithin.com/articles/0510/000101.html
http://www.lenholgate.com/archives/000500.html
http://www.lenholgate.com/archives/000525.html
http://www.lenholgate.com/archives/000527.html

and the comments.
[gravatar]
Platypus, I somehow missed your first comment, sorry...

I agree that in Chris Baus's example it would be useful to put the details of the problem into a server-side log but I disagree that it necessarily has to terminate the process - which it did. It terminates the transaction in progress, sure. It reports the error, rolls back what it can and then allows the server to continue processing other requests. We may have a failure to handle one particular type of transaction request. That may not be fatal for other types of transaction. I agree that failures like this should prevent further processing and protect data.
[gravatar]
Barry:
in C and C++ they are abused beyond the point where it is funny. Or even intelligible. The classic example shown here is ASSERT(p != NULL). In most cases I've seen it would have been better to pass the pointer in as a reference?

Are we talking about "C and C++" or just C++? No, seriously, I think it matters. A lot of the suggested alternatives having to do with exceptions or references are entirely specific to C++, which is fine, but somewhere in all this we should also consider the needs of those who work in kernels and embedded systems where C is the norm. What should assert do in those environments? How might it be different than what it might do in a user-level C++ environment, and why? Is it better to have uniform treatment, even if that's least-common-denominator, or to have different treatment for each? No offense intended to you or anyone else, but I think looking at this issue only through C++ colored glasses misses half of the picture.

Len:
I disagree that it necessarily has to terminate the process - which it did. It terminates the transaction in progress, sure. It reports the error, rolls back what it can and then allows the server to continue processing other requests.

Good point. I guess the issue, again, is whether you want a uniform behavior. One approach would be to consider each case individually, so that this assertion failure aborts just the transaction but that assertion failure aborts the entire session. That would be two kinds of asserts, and strikes me as harder to maintain. The other approach would be to decide on a single granularity and say that all assertion failures abort at that granularity. That has the disadvantage of being overkill - literally killing too much - in some cases. I think we probably agree, though, that killing the entire process is almost never the right choice, since the same process might actually be used to handle many totally independent activities.
[gravatar]
Platypus

Personally I'm talking about C++. My view is that in C++ there are usually better alternatives to assert. In C, or in embedded situations where you cannot use exceptions then obviously my suggestions to use exceptions do not apply ;)


One approach would be to consider each case individually, so that this assertion failure aborts just the transaction but that assertion failure aborts the entire session. That would be two kinds of asserts, and strikes me as harder to maintain

The way to make it less difficult to maintain is to take the decision away from the failure point. The problem with deciding at the failure point what you're going to do (even if it's by some clever and complex, configurable, multi-'debug level' assert system) is that as soon as you want to use the code in question in more than one system you may have conflicting requirements. All you know at the failure point is that you have a failure, you don't know how that failure should be handled in all situations. In one situation the failure may be fatal to the application and you need to shut down. In another it's fatal to a single transaction. In a third you'd prefer to show the user and ask them if they'd like to report it to you and then shut down, etc. You can probably work around this with a clever assert system or you can simply throw an exception and deal with it where/if you catch it. For me, as you may have gathered, I'd prefer the simplicity and consistency of an exception over something else to learn and document and maintain. Especially when mixing code from different places; sure they might all use their own exception types but I can deal with them all in the same way.

I wonder if the confusion over my suggestion to use exceptions comes from the fact that many people might think that I'm suggesting that these exceptions actually get caught and handled somewhere close to the failure point. They don't. I'd expect this kind of failure to get handled at a thread or process boundary or at a point where the error reporting strategy has to change (I'm thinking COM boundaries here). The failure occurs, the exception causes rollback where appropriate due to appropriate use of RAII and we can then decide if we allow the thread or process to continue doing useful work or not. To be honest in C++ I find it rare to need more than one set of exception handling per "boundary", if I find I need more then I'm suspicious of the design.
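Roughly the shape I mean, sketched with an invented Transaction type and request loop:

// Sketch of handling failures at the boundary rather than at the failure
// point; Transaction and the request loop are invented for illustration.
#include <cstdio>
#include <exception>

struct Transaction
{
    ~Transaction() { if (!committed) rollback(); }   // RAII: unwind on any exit path
    void commit()   { committed = true; }
    void rollback() { std::puts("rolled back"); }
    bool committed = false;
};

void process_request()
{
    Transaction txn;
    // ... work that may throw, including a "can't happen" failure ...
    txn.commit();
}

void request_loop()
{
    for (;;) {   // serve requests until shut down
        try {
            process_request();
        } catch (const std::exception& e) {
            // The boundary decides the policy: log, abandon this transaction,
            // keep serving other requests.
            std::fprintf(stderr, "request failed: %s\n", e.what());
        }
    }
}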

I realise someone will now come along and suggest that doing anything other than simply stopping the process when you discover such a failure is wrong and risks data loss or corruption and to those I say "show me a fully worked example that proves your point".
[gravatar]
Platypus: thanks for picking me up on that - I'm referring to C++. I should stop referring to C, as I rarely program it. I am talking about language specific constructs, and specifically C++. As Len says regarding C and embedded systems it does indeed make sense to use assertions. However the corollary is to ensure that they are used correctly, which is still my 80/20 rule (the ratio may vary on a programming group basis), but I'm still of the opinion that after seeing (only a meagre) 10 years of production C++ code, the bulk of assertions are unnecessary, and just plain wrong.

Often sitting down and thinking for 10 minutes will get you a better solution than just writing ASSERT. In a system I currently work on, I'd say that the way assertions are used is down to a very poor decision made by someone 5/6 years ago, who had never used exception handling, and didn't bother to think through the problem. Then the problem grew and escalated through staff churn, cut'n'paste, newb programmers and the preprocessor obfuscating the code base. The reason was to pass errors from a decoupled back-end system into Excel. But that's a whole other story (and discussion I'm sure).
