Skip to content

Conversation

@redsun82
Copy link
Contributor

@redsun82 redsun82 commented Nov 30, 2021

This is a first step towards having different buffer overwrite queries with different precisions. The idea is to add a reason for why an estimate was given, allowing to detect when exact range analysis kicks in, in order to split out the current OverrunWrite.ql into two separate high precision and medium precision queries.

Next steps will be to apply the same idea to the float special case (possibly applying range analysis there as well) and actually splitting the query in two.

@github-actions github-actions bot added the C++ label Nov 30, 2021
@redsun82
Copy link
Contributor Author

redsun82 commented Dec 1, 2021

Juist a few words of context: I wanted to have a kind of enum splitting the buffer size estimation results. Something like

  • best effort: normal size calculations on non-sprintf like functions, no meaningful range analysis possible on sprintf, %f capped to 8
  • extreme floats: only giving a size on sprintf without exact range analysis, taking into account the true maximum width of double
  • range analysis: only giving a size on sprintf with exact range analysis

In particular I'd like to avoid the kind of code that in the cpp/overrunning-write-with-float avoids the query overlap, as with three different queries (maybe more in the future) it'd start to be a bit ugly I think. So I want one method accepting this strategy enum and giving disjoint values (so, no value if the strategy does not apply to a specific case), so that the three corresponding queries would be written by just comparing this value.

For the moment I just want to apply this reasoning to the existing two cases, but this particular PR is not getting me there yet though.

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if something isn't clear in what I wrote. This is the kind of change that would make total sense in most other languages, but in QL it will be difficult to implement.

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is great cleanup on its own, and should make it easier to add the third strategy. 👍

@redsun82 redsun82 force-pushed the redsun82/cpp-overrunning-write-precision-split branch from 64a709a to bb3ef7a Compare December 8, 2021 09:53
@redsun82
Copy link
Contributor Author

redsun82 commented Dec 8, 2021

ok, here is a further take. Putting the special float case in was proving challenging, so I'm leaving it aside for the moment. From the tests it seems like the split is kind of working ok, but I'm eager to get some early comments on the logic behind it.

@redsun82 redsun82 changed the title C++: refactor buffer overwrite queries with strategies C++: refactor buffer overwrite queries with estimate reasons Dec 9, 2021
@redsun82 redsun82 force-pushed the redsun82/cpp-overrunning-write-precision-split branch from bb3ef7a to b08263d Compare December 9, 2021 09:44
@redsun82 redsun82 added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Dec 9, 2021
@redsun82 redsun82 marked this pull request as ready for review December 9, 2021 10:30
@redsun82 redsun82 requested a review from a team as a code owner December 9, 2021 10:30
@redsun82 redsun82 added no-change-note-required This PR does not need a change note and removed documentation labels Dec 9, 2021
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach much better than the first one :)

@redsun82 redsun82 force-pushed the redsun82/cpp-overrunning-write-precision-split branch 2 times, most recently from 8ac4f1a to 3005d46 Compare December 10, 2021 10:21
@redsun82 redsun82 removed the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Dec 10, 2021
@redsun82 redsun82 force-pushed the redsun82/cpp-overrunning-write-precision-split branch from 1ad7b6d to 904bf85 Compare December 10, 2021 11:08
@redsun82 redsun82 force-pushed the redsun82/cpp-overrunning-write-precision-split branch from febca72 to 5ed7056 Compare December 13, 2021 11:28
@geoffw0
Copy link
Contributor

geoffw0 commented Dec 13, 2021

The DCA run:

  • no significant performance changes.
  • no query result changes 🎉 (I checked that cpp/overrunning-write-with-float was run, amongst others).
  • microsoft__calculator failed to build; but I would ignore this, its been flaky lately.

@MathiasVP
Copy link
Contributor

  • microsoft__calculator failed to build; but I would ignore this, its been flaky lately.

FWIW, it's failing because it's using C++/CX. It's not really flaky as much as it's using a dialect of C++ that we don't fully support.

@redsun82
Copy link
Contributor Author

The DCA run:

  • no significant performance changes.
  • no query result changes tada (I checked that cpp/overrunning-write-with-float was run, amongst others).
  • microsoft__calculator failed to build; but I would ignore this, its been flaky lately.

thanks for that @geoffw0 . So, shall we merge this?

@geoffw0
Copy link
Contributor

geoffw0 commented Dec 13, 2021

shall we merge this?

I haven't kept up with the changes since my initial round of comments (I believe all of those have been addressed apart from adding explanation around the newtype, which is optional). But this is a good sign.

@MathiasVP
Copy link
Contributor

MathiasVP commented Dec 13, 2021

thanks for that @geoffw0 . So, shall we merge this?

I've added a review comment that hasn't been fixed yet :) It's an easy one!

As far as I can tell, there's also been a non-trivial amount of changes since the last DCA: https://github.com/github/codeql/compare/5ed705670752bda84aa6d53c562a60b04807d1c5..d88e4d60b57cc1d49f8702fee4162ade0ecee795#diff-303d2f7d1a901ce74c795fa2b729115169b2cf4c21678522f7cd9be9c3c0d8f5

I hope it's okay that I took the liberty to start another DCA run. Once that finishes (and my final lonely comment has been addressed) I'll be happy to merge it. Sorry I didn't raise this doubt on Slack earlier today. I didn't check which SHA the DCA experiment had actually run on.

@redsun82
Copy link
Contributor Author

thanks for that @geoffw0 . So, shall we merge this?

I've added a review comment that hasn't been fixed yet :) It's an easy one!

Uh, I had totally missed that, fixed now 😄 I wish there was a view showing only unresolved comments... or is there?

As far as I can tell, there's also been a non-trivial amount of changes since the last DCA: https://github.com/github/codeql/compare/5ed705670752bda84aa6d53c562a60b04807d1c5..d88e4d60b57cc1d49f8702fee4162ade0ecee795#diff-303d2f7d1a901ce74c795fa2b729115169b2cf4c21678522f7cd9be9c3c0d8f5

I hope it's okay that I took the liberty to start another DCA run. Once that finishes (and my final lonely comment has been addressed) I'll be happy to merge it. Sorry I didn't raise this doubt on Slack earlier today. I didn't check which SHA the DCA experiment had actually run on.

Thanks for that!

@redsun82
Copy link
Contributor Author

It seems the DCA experiment is ok, right?

@MathiasVP
Copy link
Contributor

It seems the DCA experiment is ok, right?

Indeed, all looks good :)

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@MathiasVP MathiasVP merged commit 6fda5e8 into main Dec 14, 2021
@MathiasVP MathiasVP deleted the redsun82/cpp-overrunning-write-precision-split branch December 14, 2021 08:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants