Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Add implicit destructors for named variables to the IR #15506

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

rdmarsh2
Copy link
Contributor

@rdmarsh2 rdmarsh2 commented Feb 1, 2024

No description provided.

@github-actions github-actions bot added the C++ label Feb 1, 2024
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments. So far I've only looked at some of the .expected files. I think I'll delay looking at too many of the .qll changes until the conflicts with main has been resolved.

I think some of the consistency issues I've noted here would also be revealed by running the cpp/ql/test/library-tests/ir/ tests

Comment on lines +5756 to +5813
#-----| Goto -> Block 1
#-----| Goto -> Block 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks wrong. It looks like the call to delete and the destructor call is both specified as the successor here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - there's an underlying extractor issue here that's causing issues with delete whenever the object being deleted has a destructor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Do you mind creating an internal issue for this? And is this something we can work around in IR generation for now?

Comment on lines +5786 to +5843
#-----| Goto -> Block 1
#-----| Goto -> Block 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (but for delete[])

Comment on lines +7192 to +7265
# 1244| r1244_2(glval<String>) = VariableAddress[c] :
# 1244| r1244_3(glval<unknown>) = FunctionAddress[~String] :
# 1244| v1244_4(void) = Call[~String] : func:r1244_3, this:r1244_2
# 1244| mu1244_5(unknown) = ^CallSideEffect : ~m?
# 1244| v1244_6(void) = ^IndirectReadSideEffect[-1] : &:r1244_2, ~m?
# 1244| mu1244_7(String) = ^IndirectMayWriteSideEffect[-1] : &:r1244_2
# 1244| r1244_8(glval<String>) = VariableAddress[b] :
# 1244| r1244_9(glval<unknown>) = FunctionAddress[~String] :
# 1244| v1244_10(void) = Call[~String] : func:r1244_9, this:r1244_8
# 1244| mu1244_11(unknown) = ^CallSideEffect : ~m?
# 1244| v1244_12(void) = ^IndirectReadSideEffect[-1] : &:r1244_8, ~m?
# 1244| mu1244_13(String) = ^IndirectMayWriteSideEffect[-1] : &:r1244_8
# 1244| r1244_14(glval<String>) = VariableAddress[a] :
# 1244| r1244_15(glval<unknown>) = FunctionAddress[~String] :
# 1244| v1244_16(void) = Call[~String] : func:r1244_15, this:r1244_14
# 1244| mu1244_17(unknown) = ^CallSideEffect : ~m?
# 1244| v1244_18(void) = ^IndirectReadSideEffect[-1] : &:r1244_14, ~m?
# 1244| mu1244_19(String) = ^IndirectMayWriteSideEffect[-1] : &:r1244_14
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. a, b and c are all static locals. So their destructors should not be called at the end of the scope. Ideally, we should have some kind of new IRFunction that calls these destructors at the end of main, but let's leave that aside for now.

As a start, I think we should simply not emit these destructors for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point. I missed that those were static when I was looking over the results.

Comment on lines +11498 to +11555
#-----| Goto -> Block 1
#-----| Goto -> Block 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another case with strange control-flow. Not sure what's going on here.

Comment on lines +5759 to +5827
# 1017| Block 1
# 1017| r1017_3(glval<unknown>) = FunctionAddress[~String] :
# 1017| v1017_4(void) = Call[~String] : func:r1017_3
# 1017| mu1017_5(unknown) = ^CallSideEffect : ~m?
# 1017| v1017_6(void) = ^IndirectReadSideEffect[-1] : &:r1017_2, ~m?
# 1017| mu1017_7(String) = ^IndirectMayWriteSideEffect[-1] : &:r1017_2

# 1020| Block 1
# 1020| r1020_1(glval<unknown>) = FunctionAddress[~PolymorphicBase] :
# 1020| v1020_2(void) = Call[~PolymorphicBase] : func:r1020_1
# 1020| mu1020_3(unknown) = ^CallSideEffect : ~m?
# 1020| v1020_4(void) = ^IndirectReadSideEffect[-1] : &:r1020_7, ~m?
# 1020| mu1020_5(PolymorphicBase) = ^IndirectMayWriteSideEffect[-1] : &:r1020_7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also looks wrong: We have two Block 1s 🤔

Comment on lines +11501 to +11576
# 2059| Block 1
# 2059| r2059_4(glval<unknown>) = FunctionAddress[~Base2] :
# 2059| v2059_5(void) = Call[~Base2] : func:r2059_4
# 2059| mu2059_6(unknown) = ^CallSideEffect : ~m?
# 2059| v2059_7(void) = ^IndirectReadSideEffect[-1] : &:r2059_3, ~m?
# 2059| mu2059_8(Base2) = ^IndirectMayWriteSideEffect[-1] : &:r2059_3

# 2062| Block 1
# 2062| r2062_1(glval<unknown>) = FunctionAddress[~Base2] :
# 2062| v2062_2(void) = Call[~Base2] : func:r2062_1
# 2062| mu2062_3(unknown) = ^CallSideEffect : ~m?
# 2062| v2062_4(void) = ^IndirectReadSideEffect[-1] : &:r2062_8, ~m?
# 2062| mu2062_5(Base2) = ^IndirectMayWriteSideEffect[-1] : &:r2062_8

# 2065| Block 1
# 2065| r2065_1(glval<unknown>) = FunctionAddress[~Derived2] :
# 2065| v2065_2(void) = Call[~Derived2] : func:r2065_1
# 2065| mu2065_3(unknown) = ^CallSideEffect : ~m?
# 2065| v2065_4(void) = ^IndirectReadSideEffect[-1] : &:r2065_8, ~m?
# 2065| mu2065_5(Derived2) = ^IndirectMayWriteSideEffect[-1] : &:r2065_8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also looks strange.

@rdmarsh2 rdmarsh2 force-pushed the rdmarsh2/cpp/ir-synthetic-destructors branch from a341acf to 2d010f6 Compare February 2, 2024 17:40
result = TranslatedVariableInitialization.super.getChildInternal(id)
}

final override Instruction getChildSuccessorInternal(TranslatedElement elem, EdgeKind kind) {

Check warning

Code scanning / CodeQL

Redundant override Warning

Redundant override of
this predicate
.
result = TranslatedVariableInitialization.super.getChildInternal(id)
}

final override Instruction getChildSuccessorInternal(TranslatedElement elem, EdgeKind kind) {

Check warning

Code scanning / CodeQL

Redundant override Warning

Redundant override of
this predicate
.
@@ -11,6 +11,7 @@
private import TranslatedCondition
private import TranslatedElement
private import TranslatedExpr
private import TranslatedCall

Check warning

Code scanning / CodeQL

Redundant import Warning

Redundant import, the module is already imported inside
TranslatedExpr
.
@rdmarsh2
Copy link
Contributor Author

rdmarsh2 commented Feb 2, 2024

Force-pushed to fix a merge conflict with #15461 and remove some commits from #15318

Comment on lines +12247 to +12261
# 2170| mu2170_8(Bool) = ^IndirectMayWriteSideEffect[-1] : &:r2170_1

# 2170| (no string representation)
# 2170| CopyValue: (condition decl)
# 2170| ConditionalBranch: (condition decl)
#-----| False -> Block 3
#-----| True -> Block 2

# 2170| Block 1
# 2170| r2170_9(glval<Bool>) = VariableAddress[B] :
# 2170| r2170_10(glval<unknown>) = FunctionAddress[operator bool] :
# 2170| r2170_11(bool) = Call[operator bool] : func:r2170_10, this:r2170_9
# 2170| mu2170_12(unknown) = ^CallSideEffect : ~m?
# 2170| v2170_13(void) = ^IndirectReadSideEffect[-1] : &:r2170_9, ~m?
# 2170| mu2170_14(Bool) = ^IndirectMayWriteSideEffect[-1] : &:r2170_9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Do you know what's going on here? It seems like:

  1. Block 0 ends without any outgoing edges
  2. There's a block named (no string representation) that looks super strange.
  3. Block 1 also has no outgoing edges

Is this related to this PR? Or to variable declarations inside if expressions?

Copy link
Contributor Author

@rdmarsh2 rdmarsh2 Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's related to variable declarations inside if expressions, not to destructors specifically. It looks like it only fails if there's a constructor call - see the test at 1780 in ir.cpp

@@ -11647,6 +11844,464 @@ ir.cpp:
# 2109| v2109_11(void) = AliasedUse : ~m?
# 2109| v2109_12(void) = ExitFunction :

# 2115| void TryCatchDestructors(bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great test! It looks like we're currently only emitting destructor calls on the non-exceptional blocks. This makes a lot of sense since it's would be a fair bit of work to ensure that we generate a destructor call on the exceptional path so I think simply not emitting them is a good choice.

Am I correct in this observation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the extractor doesn't provide any destructor calls on the throw or catch blocks. I'm not sure if that's because the frontend doesn't provide that info or just because the extractor doesn't dump them.


class InstructionTag extends TInstructionTag {
final string toString() { result = "Tag" }
final string toString() { result = getInstructionTagId(this) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! One of the first things I do when I debug IR construction is also to replace "Tag" with this exact line. So I'm very happy with this change 👍

@@ -87,7 +93,8 @@ abstract class TranslatedCall extends TranslatedExpr {
)
}

override Instruction getInstructionSuccessor(InstructionTag tag, EdgeKind kind) {
override Instruction getInstructionSuccessorInternal(InstructionTag tag, EdgeKind kind) {
kind instanceof GotoEdge and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this line needed given that we're restricting the EdgeKind on the line below?

@@ -152,7 +156,13 @@ class TranslatedStaticLocalVariableDeclarationEntry extends TranslatedDeclaratio
kind instanceof GotoEdge
}

final override Instruction getInstructionSuccessor(InstructionTag tag, EdgeKind kind) {
final override Instruction getLastInstructionInternal() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this predicate can return multiple results: Should it be called getALastInstructionInternal? I think we should also add QLDoc to both this predicate (in TranslatedElement.qll) and getLastChild. Specifically, I'd like to know what "last" actually means. Is it "the instruction/child to be executed by this TranslatedElement"? Or something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's "an instruction/child for which the successor instruction is outside this TranslatedElement". I'll add a comment.

abstract Instruction getInstructionSuccessor(InstructionTag tag, EdgeKind kind);
abstract Instruction getInstructionSuccessorInternal(InstructionTag tag, EdgeKind kind);

Instruction getInstructionSuccessor(InstructionTag tag, EdgeKind kind) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this one be final? Also: QLDoc, please 😂

abstract Instruction getChildSuccessor(TranslatedElement child, EdgeKind kind);
Instruction getChildSuccessorInternal(TranslatedElement child, EdgeKind kind) { none() }

Instruction getChildSuccessor(TranslatedElement child, EdgeKind kind) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be final?

@@ -75,6 +76,28 @@ abstract class TranslatedExpr extends TranslatedElement {
expr.isGLValueCategory()
}

abstract TranslatedElement getChildInternal(int id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QLDoc, please 🙏

Comment on lines 96 to 98
result = max(int childId | exists(this.getChildInternal(childId))) + 1
or
not exists(this.getChildInternal(_)) and result = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this predicate only have a result if it doesn't handle destructor calls explicitly itself?

)
}

final override predicate hasImplicitDestructorCalls() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nitpick: Should this be called hasAnImplicitDestructorCall? Predicate names are very rarely pluralized

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants