Skip to content

Conversation

@ihsinme
Copy link
Contributor

@ihsinme ihsinme commented Jun 15, 2021

good day.
in this request, I am considering finding a error use of the switch block.
I have identified 4 main areas:
1.when the value range of the condition is less than the selection items.
2. when the selection items are larger than the range of values.
3.when there are labels in the body with names similar to default or case
4.when there is code before the first case element.

I would like to draw your attention to the predicate isRealRange unfortunately the restriction on int did not allow using bitShiftRight and make it more elegant.

regarding fixes in real software, I am currently awaiting a response from some projects and continue to look for a solution in others.
htacg/tidy-html5#953
Hamlib/Hamlib#723

@ihsinme ihsinme requested a review from a team as a code owner June 15, 2021 13:45
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for another contribution, @ihsinme!

from SwitchStmt sw, string msg
where
isRealRange(sw.getExpr()) and
isRealRange(sw.getExpr().getAChild*()) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we need isRealRange(sw.getExpr()) to check that we have a meaningful range for the expression that's being branched on. But what why do we need to need isRealRange(sw.getExpr().getAChild*()) as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to exclude the situation when a significant range is formed from operations with elements with the maximum range. for example (a + b) >> c in the presence of a significant range only for c we can get a fairly beautiful overall range, but this will not be what we are looking for.

predicate isNotAllSelected(SwitchStmt swtmp) {
not swtmp.getExpr().isConstant() and
exists(int i |
i != 0 and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why exclude 0 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 is also excluded in isConditionBig predicate through significant range. this is done in order to exclude quite frequent stylistic situations before processing, when the developer processes 0, and in the switch block either excludes the selection, or truncates the values of the conditions. here is an example from trimming selection.
aappleby/smhasher#91

Comment on lines +91 to +102
(
lb.getName().charAt(0) = "d" or
lb.getName().charAt(0) = "c"
) and
(
lb.getName().charAt(1) = "e" or
lb.getName().charAt(1) = "a"
) and
(
lb.getName().charAt(2) = "f" or
lb.getName().charAt(2) = "s"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will match labels with the names "def", "des", "daf", "das", "cef", "ces", "caf", "cas". Is that the intended effect?

Would it be better to flag all LabelStmt inside switch statements that have no GotoStmt jumping to them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will match labels with the names "def", "des", "daf", "das", "cef", "ces", "caf", "cas". Is that the intended effect?

I was hoping to find all the labels that start with these characters.

Would it be better to flag all LabelStmt inside switch statements that have no GotoStmt jumping to them?

your offer is much better from a security point of view. it will also cover situations when you forgot to write case. I'll look at it live and change the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good day.

my analysis shows that unused labels are the result of constructs using complex transitions such as

https://lgtm.com/projects/g/gperftools/gperftools/snapshot/3a9817e8738c18d5ef40f444ec79c18cacc365b8/files/src/tests/stacktrace_unittest.cc?sort=name&dir=ASC&mode=heatmap#xa2c70e59f3daf555:1

or a macro that generates various labels

https://lgtm.com/projects/g/fluent/fluent-bit/snapshot/9c2dbdae7d7779fa364e2298c101e4510498e18d/files/lib/onigmo/regexec.c?sort=name&dir=ASC&mode=heatmap#xc7d8130fce95515:1

or the result of #ifdef constraints

https://lgtm.com/projects/g/SoftEtherVPN/SoftEtherVPN/snapshot/39fd835476c762d03fba693e9ee7f7bb3a275787/files/src/Cedar/Client.c?sort=name&dir=ASC&mode=heatmap#x24cfa9cdeb8f67e:1

unfortunately in all these situations, the developer assumes the presence of unmanaged labels.

I want to find a label similar to default for example default dafalt defalt defaul and so on.

which is possible and is used in the transition operator (or maybe not),
but it creates the possibility that one of the developers might mistake this label as part of a switch construct.
therefore, such a label is bad, at least from a stylistic point of view.

if you do not like the current code, I suggest changing it to detect specific erroneous labels,
with similar ones on default it is not difficult to do this.
with labels similar to case, I'll think about it, because I wanted to take into account, among other things, caseTARARATARARA

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my analysis shows that unused labels are the result of constructs using complex transitions such as

https://lgtm.com/projects/g/gperftools/gperftools/snapshot/3a9817e8738c18d5ef40f444ec79c18cacc365b8/files/src/tests/stacktrace_unittest.cc?sort=name&dir=ASC&mode=heatmap#xa2c70e59f3daf555:1

To filter out such labels you could require that there doesn't exist a LabelLiteral whose getLabel gives you back the label.


I admit those labels are all quite... intricate... I don't particularly mind this solution. Have you found any bugs caused by a misspelled default:?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at the initial stage of the formation of the request, I found two ambiguous situations of using labels. but this did not look like an example of an error.
I keep looking.

@ihsinme
Copy link
Contributor Author

ihsinme commented Jun 17, 2021

Thanks a lot for your corrections and comments.

I want to separately apologize for the comments that you have already made to me in other PRs.(

@ihsinme
Copy link
Contributor Author

ihsinme commented Jul 1, 2021

good afternoon @MathiasVP.

until I found detection when using labels(

but I improved the query a bit.

and found several situations of incorrect operation of the ranges.

  1. division with remainder. in this example, the code assumes that the range is mod [1..3] . although in my opinion it cannot be equal to 3
    https://lgtm.com/projects/g/openwall/john/snapshot/949b67f7e6a308171d47e65845194af9ad4fbb9f/files/src/base64_convert.c?sort=name&dir=ASC&mode=heatmap#x764b03453831028c:1
  2. there is also some ambiguity with the handling of enum values. for example, the format range is calculated correctly, but the enum range is not.
    https://lgtm.com/projects/g/wxWidgets/wxWidgets/snapshot/5e210e4143333d512f079961c22b432b04850fc1/files/src/common/imagpcx.cpp?sort=name&dir=ASC&mode=heatmap#x9405549a45703a78:1

it would be great if you could help me understand the cause of these errors.

Thank you.

@MathiasVP
Copy link
Contributor

MathiasVP commented Jul 16, 2021

  1. division with remainder. in this example, the code assumes that the range is mod [1..3] . although in my opinion it cannot be equal to 3
    https://lgtm.com/projects/g/openwall/john/snapshot/949b67f7e6a308171d47e65845194af9ad4fbb9f/files/src/base64_convert.c?sort=name&dir=ASC&mode=heatmap#x764b03453831028c:1

This was a bug in our range analysis. I've fixed it in #6315 and verified that this PR removes the alert. Thanks for catching it!

there is also some ambiguity with the handling of enum values. for example, the format range is calculated correctly, but the enum range is not.
https://lgtm.com/projects/g/wxWidgets/wxWidgets/snapshot/5e210e4143333d512f079961c22b432b04850fc1/files/src/common/imagpcx.cpp?sort=name&dir=ASC&mode=heatmap#x9405549a45703a78:1

I haven't quite figured out yet why this alert is raised. I'll take a look at it on Monday.

…edSwitch.ql

Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
@MathiasVP
Copy link
Contributor

2. there is also some ambiguity with the handling of enum values. for example, the format range is calculated correctly, but the enum range is not.
https://lgtm.com/projects/g/wxWidgets/wxWidgets/snapshot/5e210e4143333d512f079961c22b432b04850fc1/files/src/common/imagpcx.cpp?sort=name&dir=ASC&mode=heatmap#x9405549a45703a78:1

The problem here is your isConditionBig predicate:

predicate isConditionBig(SwitchStmt swtmp) {
  // ...
  not exists(int iu, int il |
    // ...
    upperBound(swtmp.getExpr()) = iu and
    // ...
    lowerBound(swtmp.getExpr()) = il
  )
}

You're saying that the predicate holds whenever there doesn't exist integers iu and il such that something holds. The problem is that, when we cannot produce an upper bound (or lower bound) for the switch expression, this predicate will instantly be satisfied. For instance, if the expression being switched on is an enum expression (since we don't compute ranges for such expressions).

Instead, you should state something like:

  1. There exists an upper bound iu such that this upper bound is greater than any possible SwitchCase expression in the SwitchStmt, or
  2. There exists a lower bound il such that this lower bound is smaller than any possible SwitchCase expression in the SwitchStmt.

Does that make sense?

@ihsinme
Copy link
Contributor Author

ihsinme commented Jul 20, 2021

You're saying that the predicate holds whenever there doesn't exist integers iu and il such that something holds. The problem is that, when we cannot produce an upper bound (or lower bound) for the switch expression, this predicate will instantly be satisfied. For instance, if the expression being switched on is an enum expression (since we don't compute ranges for such expressions).

Good day.
I'm sorry to be late with the reply.

I very much missed that the enum throws an error. this increases the potential for false detection.

I am still thinking about a fix:

  1. suggested by you.
  2. use the isRealRange predicate to compute getASwitchCase(). GetExpr() and getASwitchCase().GetEndExpr()

@ihsinme
Copy link
Contributor Author

ihsinme commented Jul 20, 2021

could you take a look at

https://lgtm.com/projects/g/ArtifexSoftware/ghostpdl/snapshot/1c5db5089f9dba983785331cfbc12362f136c15f/files/base/gdevm64.c?sort=name&dir=ASC&mode=heatmap#x7d02ad802747467:1

the range of w1 is defined as [-2.3].
is this error included in those that you are fixing?

@ihsinme
Copy link
Contributor Author

ihsinme commented Aug 1, 2021

Good afternoon @MathiasVP.

I fixed the predicate isConditionBig and noticed an interesting FP.

if there is a loop before the case, then it is executed albeit with some restrictions. (for example, there is no initialization)

https://godbolt.org/z/zPcTo84bW

@MathiasVP
Copy link
Contributor

MathiasVP commented Aug 2, 2021

I fixed the predicate isConditionBig and noticed an interesting FP.

if there is a loop before the case, then it is executed albeit with some restrictions. (for example, there is no initialization)

Thanks for continuing with this PR! I don't see any changes to the isConditionBig compared to last time I reviewed it, though? I.e., this comment still applies.

What you're observing is actually more general than something that happens with loops. See for instance this example: https://ideone.com/S5x6pn.

Very rarely are there good reasons for code like this, though.

@MathiasVP
Copy link
Contributor

MathiasVP commented Aug 2, 2021

could you take a look at

https://lgtm.com/projects/g/ArtifexSoftware/ghostpdl/snapshot/1c5db5089f9dba983785331cfbc12362f136c15f/files/base/gdevm64.c?sort=name&dir=ASC&mode=heatmap#x7d02ad802747467:1

the range of w1 is defined as [-2.3].
is this error included in those that you are fixing?

This is working as intended (unfortunately). It's due to the use of widening which we use in the range analysis library to ensure that the analysis terminates in a reasonable amount of time. You can add more values to the predicates here and here to see if that improves your analysis. I don't think we can merge changes to those predicates, however. They are very important for performance on some databases.

@ihsinme
Copy link
Contributor Author

ihsinme commented Aug 3, 2021

I fixed the predicate isConditionBig and noticed an interesting FP.
if there is a loop before the case, then it is executed albeit with some restrictions. (for example, there is no initialization)

Thanks for continuing with this PR! I don't see any changes to the isConditionBig compared to last time I reviewed it, though? I.e., this comment still applies.

What you're observing is actually more general than something that happens with loops. See for instance this example: https://ideone.com/S5x6pn.

Very rarely are there good reasons for code like this, though.

I may be wrong, but the predicate changed work. https://lgtm.com/query/7750306597002152773/

the example you gave (if before) is the example of dead code I'm looking for. if there was a loop, it would be used by the compiler as live code.

@MathiasVP
Copy link
Contributor

I may be wrong, but the predicate changed work. https://lgtm.com/query/7750306597002152773/

the example you gave (if before) is the example of dead code I'm looking for. if there was a loop, it would be used by the compiler as live code.

Ah, I see what you mean. The code you added in the where clause ensures that lowerBound and upperBound exists, and so the problem of isConditionBig being satisfied when no lowerBound or upperBound exist is no longer an issue. I guess that also fixes the problem.

@MathiasVP
Copy link
Contributor

I'm happy with this PR now @ihsinme. Do you want to make any more changes before I merge it?

@ihsinme
Copy link
Contributor Author

ihsinme commented Aug 3, 2021

I do not plan to make changes yet

@MathiasVP MathiasVP merged commit 43044cd into github:main Aug 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants