misc(proto): ensure all strings are well-formed #15909

Merged
merged 1 commit into from
Apr 2, 2024
misc(proto): ensure all strings are well-formed utf
connorjclark committed Apr 2, 2024
commit 36f00a8ac377efd71689bde2d169ddacb3786dc6
32 changes: 24 additions & 8 deletions core/lib/proto-preprocessor.js
@@ -83,28 +83,44 @@
   }

   /**
-   * Remove any found empty strings, as they are dropped after round-tripping anyway
+   * Execute `cb(obj, key)` on every object property where obj[key] is a string, recursively.
    * @param {any} obj
+   * @param {(obj: Record<string, string>, key: string) => void} cb
    */
-  function removeStrings(obj) {
+  function iterateStrings(obj, cb) {
     if (obj && typeof obj === 'object' && !Array.isArray(obj)) {
       Object.keys(obj).forEach(key => {
-        if (typeof obj[key] === 'string' && obj[key] === '') {
-          delete obj[key];
-        } else if (typeof obj[key] === 'object' || Array.isArray(obj[key])) {
-          removeStrings(obj[key]);
+        if (typeof obj[key] === 'string') {
+          cb(obj, key);
+        } else {
+          iterateStrings(obj[key], cb);
         }
       });
     } else if (Array.isArray(obj)) {
       obj.forEach(item => {
         if (typeof item === 'object' || Array.isArray(item)) {
-          removeStrings(item);
+          iterateStrings(item, cb);
Member

Technically this allows arrays with non-well-formed strings in them to pass through unchanged.

Collaborator Author

ah... same is true for empty strings sticking around. maybe it's fine ... 🪵 🤛

         }
       });
     }
   }
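
The review note about arrays can be checked with a small standalone sketch. It copies the `iterateStrings` traversal from this diff into a runnable snippet; the `report` object and the `visited` array are made up for illustration and are not part of the patch. Strings that sit directly in an array are never passed to the callback, while strings held by an object inside an array are.

```javascript
// Standalone copy of the traversal shown in the diff, for illustration only.
function iterateStrings(obj, cb) {
  if (obj && typeof obj === 'object' && !Array.isArray(obj)) {
    Object.keys(obj).forEach(key => {
      if (typeof obj[key] === 'string') {
        cb(obj, key);
      } else {
        iterateStrings(obj[key], cb);
      }
    });
  } else if (Array.isArray(obj)) {
    obj.forEach(item => {
      // String items fail this check, so the callback never sees them.
      if (typeof item === 'object' || Array.isArray(item)) {
        iterateStrings(item, cb);
      }
    });
  }
}

// Hypothetical input: one string on an object, one directly in an array,
// and one on an object nested inside that array.
const report = {
  title: 'in an object',
  list: ['directly in an array', {nested: 'object inside an array'}],
};

const visited = [];
iterateStrings(report, (obj, key) => visited.push(obj[key]));

console.log(visited);
// visited: ['in an object', 'object inside an array']
```

The array item `'directly in an array'` is skipped, which matches the behavior both reviewers describe: empty or ill-formed strings stored directly in arrays pass through unchanged.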

-  removeStrings(reportJson);
+  iterateStrings(reportJson, (obj, key) => {
Member

@paulirish paulirish Apr 2, 2024

rest of patch lgtm, but maybe extract this cb into a fn with a name? removeEmptyAndDropLoneSurrogates ?

Collaborator Author

Aren't the comments inside sufficient and the code readable? Detaching this anonymous fn from the utility function is more jumping around and doesn't help readability imo.

Member

well i suppose this entire file is also in the vein of "read it if you want to know what it does" so.. whatevs

+    const value = obj[key];
+
+    // Remove empty strings, as they are dropped after round-tripping anyway.
+    if (value === '') {
+      delete obj[key];
+      return;
+    }
+
+    // Sanitize lone surrogates.
+    // @ts-expect-error node 20
+    if (String.prototype.isWellFormed && !value.isWellFormed()) {
Member

Could we throw an error if String.prototype.isWellFormed isn't available? Seems like we would want to fail loudly in that case.

Collaborator Author

We absolutely do not want to throw an exception here; that would take down PSI for a very minor reason. Anyhow, WRS ships a Chrome that supports this and won't ever regress on that front.

Member

Gotcha, no error sounds good

+      // @ts-expect-error node 20
+      obj[key] = value.toWellFormed();
+    }

Check warning on line 122 in core/lib/proto-preprocessor.js (Codecov / codecov/patch): added lines #L120 - L122 were not covered by tests.
+  });

   return reportJson;
 }
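
For readers unfamiliar with the two string methods used in the callback, here is a minimal sketch of the feature-detected sanitizing step in isolation. The helper name `sanitizeLoneSurrogates` is hypothetical and does not appear in the patch; the sketch assumes only the standard `String.prototype.isWellFormed` and `String.prototype.toWellFormed` methods, and, like the patch, it returns the value untouched rather than throwing when they are unavailable.

```javascript
/**
 * Hypothetical helper: returns `value` with lone surrogates replaced by
 * U+FFFD when the runtime supports it, otherwise returns `value` unchanged.
 * @param {string} value
 * @return {string}
 */
function sanitizeLoneSurrogates(value) {
  // Older runtimes (for example Node 18) lack these methods; do nothing there.
  if (typeof String.prototype.isWellFormed !== 'function') {
    return value;
  }
  return value.isWellFormed() ? value : value.toWellFormed();
}

// '\uD83E' is a high surrogate with no matching low surrogate.
const samples = ['ok', '\uD83E\uDD14', 'hello \uD83E'];
for (const sample of samples) {
  // JSON.stringify makes any remaining lone surrogate visible as an escape.
  console.log(JSON.stringify(sanitizeLoneSurrogates(sample)));
}
```

A complete surrogate pair such as `'\uD83E\uDD14'` is already well-formed and is left alone; only the unpaired `'\uD83E'` is replaced on runtimes that provide `toWellFormed`.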
22 changes: 22 additions & 0 deletions core/test/lib/proto-preprocessor-test.js
@@ -164,6 +164,28 @@ Object {

     expect(output).toMatchObject(expectation);
   });
+
+  it('sanitizes lone surrogates', () => {
+    // Don't care about Node 18 here. We just need this to work in Chrome, and it does.
+    if (!String.prototype.toWellFormed) {
+      return;
+    }
+
+    const input = {
+      'audits': {
+        'critical-request-chains': {
+          'details': {
+            'chains': {
+              '1': 'hello \uD83E',
+            },
+          },
+        },
+      },
+    };
+    const output = processForProto(input);
+
+    expect(output.audits['critical-request-chains'].details.chains[1]).toEqual('hello �');
+  });
 });

 describeIfProtoExists('round trip JSON comparison subsets', () => {