Skip to content

Ruby: Add rb/tainted-format-string query#8272

Merged
hmac merged 10 commits intogithub:mainfrom
hmac:hmac/tainted-format-string
Mar 21, 2022
Merged

Ruby: Add rb/tainted-format-string query#8272
hmac merged 10 commits intogithub:mainfrom
hmac:hmac/tainted-format-string

Conversation

@hmac
Copy link
Contributor

@hmac hmac commented Feb 28, 2022

This shares as much as possible with the JS version.

@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf throw an exception that there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, a method such as Kernel.puts should be used, which does not apply string formatting to its arguments.

References

2 similar comments
@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf throw an exception that there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, a method such as Kernel.puts should be used, which does not apply string formatting to its arguments.

References

@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf throw an exception that there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, a method such as Kernel.puts should be used, which does not apply string formatting to its arguments.

References

@hmac hmac force-pushed the hmac/tainted-format-string branch from 0aabff3 to 8f11802 Compare March 2, 2022 04:31
@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf throw an exception that there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, a method such as Kernel.puts should be used, which does not apply string formatting to its arguments.

References

@hmac hmac force-pushed the hmac/tainted-format-string branch from 8f11802 to aa5cd34 Compare March 2, 2022 04:45
@github-actions github-actions bot added the Java label Mar 2, 2022
@hmac hmac force-pushed the hmac/tainted-format-string branch from aa5cd34 to aae109e Compare March 2, 2022 04:46
@hmac hmac removed the Java label Mar 2, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output or throw an exception.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf throw an exception that there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, a method such as Kernel.puts should be used, which does not apply string formatting to its arguments.

References

@hmac hmac force-pushed the hmac/tainted-format-string branch from aae109e to d34838b Compare March 8, 2022 00:25
@github-actions github-actions bot added the JS label Mar 8, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output or throw an exception.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf throw an exception that there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, a method such as Kernel.puts should be used, which does not apply string formatting to its arguments.

References

@hmac hmac force-pushed the hmac/tainted-format-string branch from d34838b to 492c75a Compare March 9, 2022 04:04
@github-actions
Copy link
Contributor

github-actions bot commented Mar 9, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output or throw an exception.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf throw an exception that there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, a method such as Kernel.puts should be used, which does not apply string formatting to its arguments.

References

@github-actions
Copy link
Contributor

github-actions bot commented Mar 9, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output or throw an exception.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf will throw an exception as there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, string interpolation should be used exclusively:

class UsersController < ActionController::Base
  def index
    puts "Unauthorised access attempt by #{params[:user]}: #{request.ip}"
  end
end

References

@github-actions
Copy link
Contributor

github-actions bot commented Mar 9, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output or throw an exception.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf will throw an exception as there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, string interpolation should be used exclusively:

class UsersController < ActionController::Base
  def index
    puts "Unauthorised access attempt by #{params[:user]}: #{request.ip}"
  end
end

References

@hmac hmac marked this pull request as ready for review March 9, 2022 20:09
@hmac hmac requested review from a team as code owners March 9, 2022 20:09
erik-krogh
erik-krogh previously approved these changes Mar 9, 2022
Copy link
Contributor

@erik-krogh erik-krogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JS 👍

Copy link
Contributor

@alexrford alexrford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, only minor comments.

DataFlow::Node getFormatString() { result = this.getArgument(0) }

/**
* Gets then `n`th formatted argument of this call.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably either change this to be 0-indexed rather than 1-indexed, or note in the QLDoc that it's 1-indexed.

* A call to `IO#printf`.
*/
class IOPrintfCall extends PrintfStyleCall {
IOPrintfCall() { this = API::getTopLevelMember("IO").getInstance().getAMethodCall("printf") }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IOPrintfCall() { this = API::getTopLevelMember("IO").getInstance().getAMethodCall("printf") }
IOPrintfCall() { this.getReceiver() instanceof IOInstance and this.getMethodName() = "printf" }

with an import of codeql.ruby.frameworks.Files::IO.

This will catch cases like:

    file = File.open "foo.txt", "a"
    file.printf(params[:format], arg) # BAD

where the IO instance is created without new - this case is already caught by the UnknownMethodCall catch-all as it stands, but that might not be the case if/when UnknownMethodCall is removed.

// Kernel#printf supports two signatures:
// printf(io, string, ...)
// printf(string, ...)
override DataFlow::Node getFormatString() { result = this.getArgument([0, 1]) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it stands we can end up with FPs of the form:

    printf("hello %s %s", params[:format], "arg2")

where params[:format] is currently interpreted as a format string due to appearing as arg1, but is only a format argument.

I think we could avoid this by checking the type of arg0, e.g. as a rough heuristic:

Suggested change
override DataFlow::Node getFormatString() { result = this.getArgument([0, 1]) }
override DataFlow::Node getFormatString() {
if this.getArgument(0).asExpr() instanceof StringLiteralCfgNode
then result = this.getArgument(0)
else result = this.getArgument([0, 1])
}

In this way, we avoid FPs where arg0 is trivially a format string, but we err on the side of over-reporting when we're unsure of the type of arg0.

Copy link
Contributor Author

@hmac hmac Mar 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I've made this change, though instead of instanceof StringLiteralCfgNode I've used .getConstantValue().isString(_), to take advantage of local flow, e.g. in

fmt = "hello %s %s"
printf(fmt, params[:format], "arg2")

This relies on IO objects not having a string constant value, but I think that's a safe bet.

hmac added 10 commits March 21, 2022 12:51
It will now identify cases like this:

    file = File.open "foo.txt", "a"
    file.printf(params[:format], arg)
Kernel#printf supports two call signatures:

    printf(String, *args)
    printf(IO, String, *args)

We want to identify the String argument, which is the format string.
Previously we would return the 0th and 1st arguments, which gives some
FPs when the 1st arg is not a format string.

We now try to rule out the trivial case by checking if arg 0 has a
string value, and then assuming it is the format string. Otherwise we
fall back to returning both arguments.

This still has some false positive potential, but less than previously.
@hmac hmac force-pushed the hmac/tainted-format-string branch from 9f4e6b7 to 5a6da82 Compare March 20, 2022 23:52
@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-134/TaintedFormatString.qhelp

Use of externally-controlled format string

Methods like Kernel.printf accept a format string that is used to format the remaining arguments by providing inline format specifiers. If the format string contains unsanitized input from an untrusted source, then that string may contain unexpected format specifiers that cause garbled output or throw an exception.

Recommendation

Either sanitize the input before including it in the format string, or use a %s specifier in the format string, and pass the untrusted data as corresponding argument.

Example

The following program snippet logs information about an unauthorized access attempt. The log message includes the user name, and the user's IP address is passed as an additional argument to Kernel.printf to be appended to the message:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by #{params[:user]}: %s", request.ip)
  end
end

However, if a malicious user provides a format specified such as %s as their user name, Kernel.printf will throw an exception as there are too few arguments to satisfy the format. This can result in denial of service or leaking of internal information to the attacker via a stack trace.

Instead, the user name should be included using the %s specifier:

class UsersController < ActionController::Base
  def index
    printf("Unauthorised access attempt by %s: %s", params[:user], request.ip)
  end
end

Alternatively, string interpolation should be used exclusively:

class UsersController < ActionController::Base
  def index
    puts "Unauthorised access attempt by #{params[:user]}: #{request.ip}"
  end
end

References

@hmac hmac requested a review from alexrford March 21, 2022 00:28
Copy link
Contributor

@alexrford alexrford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

*/

import javascript
import semmle.javascript.security.dataflow.DOM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this import semmle.javascript.security.dataflow.DOM is redundant.
But that's fine.
(ql/redundant-import anyone?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A query for redundant imports would be great! Even better would be a compiler warning delivered via LSP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it belongs as a compiler warning.
In the above case both of the imports eventually import each other, so deleting any one of them will solve it.

And do you want a compiler warning on both for that? I think that feels wrong, and so does the arbitrary choice of warning on the second.
And sometimes (redundant) imports and their order can be important (I'm not sure why, but I think the CPP team can tell stories).

In QL-for-QL: arbitary choices and lingering warnings are fine, so I think it should be a query.
(But I want to figure out how we'll get support for parameterized modules before I start more QL-for-QL query writing).

@hmac hmac merged commit 6c18e1d into github:main Mar 21, 2022
@hmac hmac deleted the hmac/tainted-format-string branch March 21, 2022 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants