Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C#: Include ASP.NET assemblies in the standalone extraction. #13876

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

michaelnebel
Copy link
Contributor

@michaelnebel michaelnebel commented Aug 3, 2023

In this PR we improve the dependency resolution for standalone extraction.
During a standalone extraction everything in the source directly is just crammed into a single compilation in the standalone extractor. If the project is an ASP.NET project, the ASP.NET assemblies needs to be included as well (this would happen during a traced build), but this doesn't happen automatically as the ASP.NET runtime is not a part of the source directory.

The following heuristic is used to detect whether we need to include the ASP.NET runtime assemblies in the compilation:
If any file in the source dir contains either of the following

  • <Project Sdk="Microsoft.NET.Sdk.Web">
  • <FrameworkReference Include="Microsoft.AspNetCore.App" />

Besides the dependency resolution improvement, this PR also contains two bugfixes:

  1. We can't use the ASP.NET runtime a .NET runtime (this is not a replacement for the .NET runtime). It needs to be used in conjunction with the .NET runtime.
  2. Whitespaces in regular expressions needs to be generalised to properly detect includes (for nuget package dependency resolution).

@github-actions github-actions bot added the C# label Aug 3, 2023
Comment on lines 345 to 352
foreach (var valueMatch in regex.EnumerateMatches(line))
{
// We can't get the group from the ValueMatch, so doing it manually:
if (GetGroup(line, valueMatch, groupPrefix) == value)
{
return true;
}
}

Check notice

Code scanning / CodeQL

Missed opportunity to use Where Note

This foreach loop
implicitly filters its target sequence
- consider filtering the sequence explicitly using '.Where(...)'.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually an error with the quality query.

  1. In this case the type of regex.EnumerateMatches(line) is not an IEnumerable which means that we can't use the Linq extension methods.
  2. The statement in the loop uses early return.

Comment on lines +381 to +384
catch (Exception ex)
{
progressMonitor.FailedToReadFile(file, ex);
}

Check notice

Code scanning / CodeQL

Generic catch clause Note

Generic catch clause.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many different possible exceptions and we want to just catch all in case the file read fails.

@michaelnebel
Copy link
Contributor Author

DCA looks good (there are some unrelated OOM errors).

@michaelnebel michaelnebel added the no-change-note-required This PR does not need a change note label Aug 7, 2023
@michaelnebel michaelnebel marked this pull request as ready for review August 7, 2023 06:01
@michaelnebel michaelnebel requested a review from a team as a code owner August 7, 2023 06:01
Copy link
Contributor

@tamasvajk tamasvajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments for consideration.

[GeneratedRegex("<FrameworkReference\\s+Include=\"(.*?)\".*/?>", RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Singleline)]
private static partial Regex FrameworkReference();

[GeneratedRegex("<Project\\s+Sdk=\"(.*?)\".*/?>", RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Singleline)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this regex needs some slight tweaking. There are some optional XML attributes that could come between the opening tag and Sdk=. For example the below is also valid:

<Project ToolsVersion="15.0" Sdk="Microsoft.NET.Sdk.Web">

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a rare case, but it's possible to define the SDK in at least one other way. We should probably cover the below case too:

<Import Project="Sdk.props" Sdk="Microsoft.NET.Sdk.Web" />

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I will change the regexp back to what it approximately except that we want to match the whitespace using \s. I at least found one case where this was needed for the matching to be succesfull.


[GeneratedRegex("<PackageReference .*Include=\"(.*?)\".*/>", RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Singleline)]
[GeneratedRegex("<PackageReference\\s+Include=\"(.*?)\".*/?>", RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Singleline)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change might break cases when the package version is specified before the Include=.

Comment on lines +377 to +381
foreach (var file in AllFiles)
{
try
{
using var sr = new StreamReader(file);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to improve the overall performance slightly. We're already reading all the files in the repo to get package references. We could combine all the regex lookups into one pass, and then we wouldn't need to read the files multiple times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is a good idea, but it will require some extra effort - I will look into it :-)

@@ -29,6 +29,9 @@ internal sealed partial class BuildAnalysis : IDisposable
private readonly Options options;
private readonly DirectoryInfo sourceDir;
private readonly DotNet dotnet;
private readonly Lazy<IEnumerable<string>> allFiles;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the Lazy<>?

Copy link
Contributor

@tamasvajk tamasvajk Aug 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if neither UseNuGet and ScanNetFrameworkDlls options are set, then we don't need the file list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct.
In any case - I will re-write the code - then this might go away.

@michaelnebel
Copy link
Contributor Author

Added some comments for consideration.

Thank you! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C# no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants