Lone surrogate character breaks PSI proto conversion #15908

Closed
2 tasks done
dimpalambient opened this issue Apr 2, 2024 · 4 comments · Fixed by #15909
Assignees
Labels
bug P1.5 PSI/LR PageSpeed Insights and Lightrider

Comments

@dimpalambient

FAQ

URL

https://tangrammontessori.site/

What happened?

PageSpeed Insights returns an error when analyzing the page.

What did you expect?

A Report

What have you tried?

https://pagespeed.web.dev/analysis/http-tangrammontessori-site/kvkrh86eqs?form_factor=desktop

How were you running Lighthouse?

PageSpeed Insights

Lighthouse Version

11.5.0

Chrome Version

No response

Node Version

No response

OS

No response

Relevant log output

Oops! Something went wrong.
generic::internal: Error unmarshalling JSON into proto: {"lighthouseVersion":"11.5.0","requestedUrl":"https://tangrammontessori.site/","mainDocumentUrl":"https://tangrammontessori.site/","finalDisplayedUrl":"https://tangrammontessori.site/","finalUrl":"https://tangrammontessori.site/","fetchTime":"2024-04-02T13:49:16.763Z","gatherMode":"navigation","runWarnings":[],"userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/122.0.6261.94 Safari/537.36","environment":{"networkUserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36","hostUserAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/122.0.6261.94 Safari/537.36","benchmarkIndex":359,"credits":{"axe-core":"4.8.1"}},"audits":{"is-on-https":{"id":"is-on-https","title":"Uses HTTPS","description":"All sites should be protected with HTTPS, even ones that don't handle sensitive data. This includes avoiding [mixed content](https://developers.google.com/web/fundamentals/security/prevent-mixed-content/what-is-mixed-content), where some resources are loaded over HTTP despite the initial request being served over HTTPS. HTTPS prevents intruders from tampering with or passively listening in on the communications between your app and your users, and is a prerequisite for HTTP/2 and many new web platform APIs. [Learn more about HTTPS](https://developer.chrome.com/docs/lighthouse/pwa/is-on-https/)."
@adamraine
Member

I can reproduce this. For debugging, this is the report JSON that fails to render in PSI:
https://googlechrome.github.io/lighthouse/viewer/?gist=119fd68aab627c9481910fed9f48bf24

@adamraine added the P1.5 and PSI/LR (PageSpeed Insights and Lightrider) labels and removed the needs-priority label on Apr 2, 2024
@adamraine
Member

Locally I get this error when trying to convert the JSON into proto:

UnicodeEncodeError: 'utf-8' codec can't encode character '\ud83e' in position 76: surrogates not allowed
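
For reference, the same class of error can be reproduced in isolation. The snippet below is a minimal sketch, not the actual PSI or Lighthouse conversion code: the JSON document and the `try_encode` helper are made up for illustration, and only the `\ud83e` fragment comes from this issue.

```python
import json

# Hypothetical JSON document whose string value contains an unpaired high
# surrogate escape, mirroring the "\ud83e..." fragment quoted in this issue.
raw = '{"description": "exclusive ! \\ud83e..."}'

# json.loads accepts the escape and yields a str holding a lone surrogate.
report = json.loads(raw)
text = report["description"]


def try_encode(value: str):
    """Return the UnicodeEncodeError raised by UTF-8 encoding, or None."""
    try:
        value.encode("utf-8")
    except UnicodeEncodeError as exc:
        return exc
    return None


error = try_encode(text)
print(error)
```

Parsing succeeds, and the failure only appears once the string has to be encoded as UTF-8, which is consistent with the error surfacing at the proto conversion step rather than when the report JSON is read.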

@adamraine
Member

This is an interesting situation because this string happens to be cut off in the middle of a surrogate pair:

Explorez le monde fascinant des dinosaures avec notre sélection exclusive ! \ud83e...

If the string were cut off one character later, this problem wouldn't happen. Nevertheless, PSI should not break in this type of situation; we should find a way to handle it.
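
The "one character later" observation can be checked by counting UTF-16 code units. This is a rough sketch under assumptions: the emoji U+1F995 is a stand-in (the issue only shows the high surrogate `\ud83e`, so the real character is unknown), and both helper functions are hypothetical names.

```python
HIGH_SURROGATES = range(0xD800, 0xDC00)


def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of a well-formed string."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def cut_leaves_lone_surrogate(text: str, unit_count: int) -> bool:
    """True if keeping only the first unit_count code units ends on a high surrogate."""
    units = utf16_units(text)[:unit_count]
    return bool(units) and units[-1] in HIGH_SURROGATES


# Assumption: U+1F995 stands in for the unknown emoji; its high surrogate is 0xD83E.
sample = "exclusive ! \U0001F995"
prefix_len = len("exclusive ! ")

print(cut_leaves_lone_surrogate(sample, prefix_len + 1))  # cut inside the pair
print(cut_leaves_lone_surrogate(sample, prefix_len + 2))  # cut one unit later
```

The first call reports a dangling high surrogate and the second does not, because the extra code unit completes the pair.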

@adamraine changed the title from "generic::internal: Error unmarshalling JSON into proto: "lighthouseVersion":"11.5.0"" to "Lone surrogate character breaks PSI proto conversion" on Apr 2, 2024
@adamraine assigned connorjclark and unassigned paulirish on Apr 2, 2024
@connorjclark
Collaborator

connorjclark commented Apr 2, 2024

This is not from our truncation. We use an ellipsis character for truncation, not "...". The actual HTML element contains an invalid UTF-16 string, which we aren't handling.
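
One possible way to handle such input is to replace unpaired surrogates with U+FFFD before the report is serialized. The sketch below is only an illustration of that idea and is not necessarily what the fix in #15909 does; `sanitize_text` and `sanitize_report` are hypothetical names, and the choice of U+FFFD as the replacement is an assumption.

```python
import json
import re

# In a Python str, a valid astral character is a single code point, so any
# code point in the surrogate range is an unpaired surrogate.
LONE_SURROGATE = re.compile("[\ud800-\udfff]")
REPLACEMENT = "\ufffd"  # assumption: substitute the Unicode replacement character


def sanitize_text(value: str) -> str:
    """Replace every unpaired surrogate in value with U+FFFD."""
    return LONE_SURROGATE.sub(REPLACEMENT, value)


def sanitize_report(node):
    """Recursively sanitize all strings (keys and values) in a JSON-like value."""
    if isinstance(node, str):
        return sanitize_text(node)
    if isinstance(node, list):
        return [sanitize_report(item) for item in node]
    if isinstance(node, dict):
        return {sanitize_text(key): sanitize_report(val) for key, val in node.items()}
    return node  # numbers, booleans, and None pass through unchanged


# Hypothetical report fragment containing the lone surrogate from this issue.
report = json.loads('{"snippet": "exclusive ! \\ud83e...", "score": 1}')
clean = sanitize_report(report)

# Encoding now succeeds because no surrogate code points remain.
encoded = json.dumps(clean, ensure_ascii=False).encode("utf-8")
print(clean["snippet"])
```

Replacing rather than dropping the code unit keeps a visible marker that the source string was malformed, at the cost of altering the original text.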
