Sites affected by the new date-time format change
Categories
(Core :: Internationalization, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr102 | --- | unaffected |
firefox108 | --- | unaffected |
firefox109 | + | fixed |
firefox110 | + | fixed |
People
(Reporter: ksenia, Assigned: jfkthame)
References
(Regression, )
Details
(Keywords: regression)
Attachments
(3 files)
We've received a report in https://github.com/webcompat/web-bugs/issues/114746 where store detail pages are broken on homedepot.com.
To reproduce, visit https://www.homedepot.com/l/Hampton/VA/Hampton/23666/4612 in Nightly 110.0a1 (2022-12-15) and observe the page.
The site is using the hour:min:sec part from the result of :
const e = (new Date).toLocaleString('en-US', {
timeZone: 'America/Anchorage'
});
and tries to extract only hour from this string with the following regex: match(/(\d+):(\d+):(\d+) (\w)/);
and expecting a match, but since bug1792775 this expression returns null
. I've attached a test case.
Reporter | ||
Comment 1•1 year ago
|
||
I'm not sure whether this is expected behavior since the page/testcase also breaks in Chrome Canary, so just filing this for visibility.
André, wonder if you could take a look at this, please?
Comment 2•1 year ago
|
||
Set release status flags based on info from the regressing bug 1792775
Comment 3•1 year ago
|
||
This is a bug in their code. I18n results are best-effort and may change over time as i18n data improves. This is such case, and all web browsers will migrate to ICU 72 and won't work with such code.
Comment 4•1 year ago
|
||
We were aware that such bugs might arise, see bug 1792775, comment #16. That's why we timed this change, so it happens at the same time as in Chromium. Can we first try to notify the website owner, so they update their code to handle the new date-time format for en-US
? If that fails we should consider adding an intervention.
Reporter | ||
Comment 5•1 year ago
•
|
||
(In reply to André Bargull [:anba] from comment #4)
We were aware that such bugs might arise, see bug 1792775, comment #16. That's why we timed this change, so it happens at the same time as in Chromium. Can we first try to notify the website owner, so they update their code to handle the new date-time format for
en-US
? If that fails we should consider adding an intervention.
Thanks, I'll try to contact them then and file a bug for an intervention.
There is another report about group pages on meetup.com being broken as well in https://github.com/webcompat/web-bugs/issues/115508.
They use toLocaleTimeString
and expecting third element to be present in an array after splitting a string with a space, but it's undefined (since the change in spaces):
const t = (new Date).toLocaleTimeString('en-us', {
timeZone: "US/Mountain",
timeZoneName: 'short',
hour12: !0
}).split(' ') [2]
Comment 6•1 year ago
|
||
Can this bug be closed now it had been resolved?
Reporter | ||
Comment 7•1 year ago
•
|
||
From https://bugs.chromium.org/p/chromium/issues/detail?id=1392814#c26:
Update here is that homedepot is aware of the issue and will work on a fix. But if we discover more broken sites, I suggest we consider rolling back the ICU update.
Looks like homedepot was contacted by the Chrome team and is working on a fix. So there is a remaining issue with meetup.com (so far), I'll reach out to them.
Wonder if we should have a plan in case Chrome rolls back the ICU update?
Reporter | ||
Comment 8•1 year ago
|
||
(In reply to hgreen.9 from comment #6)
Can this bug be closed now it had been resolved?
We should keep this bug open for now to track affected sites
Updated•1 year ago
|
Updated•1 year ago
|
Comment 9•1 year ago
|
||
Try push of the ICU72 update reverted from Beta:
https://treeherder.mozilla.org/jobs?repo=try&revision=1dcb26f3e14a58ccbdb2ca535121fb9ad7768b73
André, can you please sanity check the commit above just to make sure everything looks as expected? It did backout cleanly, modulo the CLOBBER bump which I did manually.
Comment 10•1 year ago
|
||
Ugh, as I feared, there do appear to be other bugs which landed after the ICU72 update which also depended on it. We're going to need to backout some other things if we're to go this route.
Comment 11•1 year ago
|
||
Alright, this one is looking better:
https://treeherder.mozilla.org/jobs?repo=try&revision=5e209935659b73df8b7303a846daba04cd42ca6a
It maintains the tzdata 2022g update that landed after the ICU72 update landed. It also includes a revert of bug 1790163 and bug 1473911 as they also depended on the newer ICU version. With that, CI is all green from what I can see. I'd still love a sanity check of the resulting patches, André.
Comment 12•1 year ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #11)
I'd still love a sanity check of the resulting patches, André.
The updated patch looks good to me! I've spot checked some files and validated the ICU data file has the tzdata 2022g update correctly applied.
Comment 13•1 year ago
|
||
The bug is marked as tracked for firefox109 (beta) and tracked for firefox110 (nightly). However, the bug still isn't assigned.
:Amir, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit auto_nag documentation.
Comment 14•1 year ago
|
||
After syncing with the other WebCompat folks, I unfortunately have to request to back out the ICU update for now. :(
While I understand that some changes in i18n APIs are expected, the current situation has us surprised about real-world web breakage, and we're currently not able to mitigate or investigate this in any meaningful way.
The update is riding Firefox 109, which would hit Release on January 17th. As noted in bug 1792775, this update is also shipping in Chrome, but they have slotted it for Chrome 110, which would be released on February 7th. So if we keep this update in, we'd be breaking the web three weeks before Chrome does. This already isn't a great spot to be in, since we can't rely on Web Developers fixing their sites because they're broken everywhere. Reading this comment on the homedepot crbug, and noting that Chrome also has issues on meetup.com, I wouldn't be surprised if Chrome decides to backout as well.
The second factor is timing: between now and January 17th, there isn't much time left, especially considering that a lot of people will be on vacation. We got a bit surprised by the fallout here, and we did not have time to properly investigate this situation, or to reach out to sites and/or find a browser-side mitigation for compat issues. And now it's a bit too late for that, especially since I'd assume that our outreach attempts are severely hindered by the fact that most webdevs probably are on vacation, too.
So a backout, for now, seems like the safest solution here.
After that, we should collaborate with Google and Unicode folks to see if we can find a web-compatible solution. If, for some reason, there is no way to keep ICU compatible, then we can at least spend some time on investigating the potential fallout, reach out to sites, and prepare WebCompat Interventions as a mitigation for big websites that don't get a fix out in time. In the best case, if Chrome sticks with their release plan, we don't even have to do any additional work, and just wait for webdevs to notice their sites being broken in Chrome.
Assignee | ||
Comment 15•1 year ago
|
||
As far as I'm aware, the only web-visible change in ICU72 that looked high-risk was the use of narrow no-break spaces as field separators in place of regular ASCII spaces, which confuses poorly-implemented code that wants to parse the result.
So an alternative mitigation for the ICU72-related webcompat breakage would be to post-process the date/time strings we get from ICU to replace U+202F separators with regular U+0020 spaces. That would allow us to keep the ICU72 update (including the IDN bugs that have landed on top, for example) and just patch up the data that's confusing websites.
@anba, do you know of any other changes ICU72 made that might be similarly risky, or are these spaces the only such issue?
Comment 16•1 year ago
•
|
||
bugherder uplift |
The ICU72 update has been backed out from Beta for 109.0b5 along with two other bugs which depended on it.
https://hg.mozilla.org/releases/mozilla-beta/rev/94c0c172ef0b
https://hg.mozilla.org/releases/mozilla-beta/rev/7deaef0e20e3
Updated•1 year ago
|
Assignee | ||
Comment 17•1 year ago
|
||
FWIW, with this patch:
diff --git a/js/src/builtin/intl/DateTimeFormat.cpp b/js/src/builtin/intl/DateTimeFormat.cpp
--- a/js/src/builtin/intl/DateTimeFormat.cpp
+++ b/js/src/builtin/intl/DateTimeFormat.cpp
@@ -1128,6 +1128,14 @@ static bool intl_FormatDateTime(JSContex
return false;
}
+ // Replace any occurrences of U+202F in buffer with U+0020, to mitigate
+ // webcompat problems triggered by ICU 72.
+ for (auto& c : mozilla::Span(buffer.data(), buffer.length())) {
+ if (c == 0x202F) {
+ c = 0x0020;
+ }
+ }
+
JSString* str = buffer.toString(cx);
if (!str) {
return false;
to the JS engine, the testcase attached here no longer throws an error (the original page on homedepot.com doesn't allow me access, maybe geo-blocked?), and the meetup page no longer goes blank after loading.
But I really have no idea whether this is an appropriate place to apply such a hack. Andre would be a better person to say....
Comment 18•1 year ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #15)
@anba, do you know of any other changes ICU72 made that might be similarly risky, or are these spaces the only such issue?
I don't think any other change in ICU 72 is risky. For example we didn't (yet) see any reports about the other space change, where date range formatting is now using U+2009 instead of U+0020.
Comment 19•1 year ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #17)
But I really have no idea whether this is an appropriate place to apply such a hack. Andre would be a better person to say....
Yes, either directly before passing the formatted JSString
to the user or earlier in mozilla::intl::DateTimeFormat
, if we also need/want to cover other mozilla::intl::DateTimeFormat
callers. If we want to restrict this fix-up phase to the Intl.DateTimeFormat
JavaScript object, we should also modify the formatted string for Intl.DateTimeFormat.prototype.formatToParts
(implemented in the intl_FormatToPartsDateTime
C++ function) to ensure consistent results w.r.t. Intl.DateTimeFormat.prototype.format
.
Assignee | ||
Comment 20•1 year ago
|
||
(In reply to André Bargull [:anba] from comment #18)
For example we didn't (yet) see any reports about the other space change, where date range formatting is now using U+2009 instead of U+0020.
Hmm; I do wonder if we should patch that similarly, if we're going to apply a hack to minimize compat risk here. Although it hasn't showed up yet, it's not too hard to imagine that a site might try formatting a date range and then use a (fragile) regex to take it apart. If we're temporarily reverting the change to one of these kinds of space separators, we should probably do both.
(In reply to André Bargull [:anba] from comment #19)
(In reply to Jonathan Kew [:jfkthame] from comment #17)
But I really have no idea whether this is an appropriate place to apply such a hack. Andre would be a better person to say....
Yes, either directly before passing the formatted
JSString
to the user or earlier inmozilla::intl::DateTimeFormat
, if we also need/want to cover othermozilla::intl::DateTimeFormat
callers.
Right, I wondered whether to do this in mozilla::intl::DateTimeFormat
.
My thinking was that the problem with the U+202F spaces only affects JS code that gets a formatted date/time and tries to interpret it in some way (e.g. with a simple regex) that assumed "normal" space. A DateTimeFormat caller that simply wants to render the resulting string won't care what kind of spaces are used, they'll just be passed to the rendering code as-is.
But I'd be fine with doing it at the lower level of mozilla::intl
, too. Maybe that would be the safer option, in case some other caller does care.
I've just pushed a try run with a patch to mozilla::intl::DateTimeFormat
to "fix" both U+202F and U+2009, for consideration. This will presumably result in some test failures, where tests were updated in bug 1792775 to expect the new types of spaces, so if we proceed with this then corresponding adjustments will be needed there.
Comment 21•1 year ago
|
||
Patching mozilla::intl::DateTimeFormat
seems like a simpler solution to me compared to doing it in SpiderMonkey, as long as we document it. My only small concern would be other locales were already using the characters we are swapping out, which I guess you could examine the old CLDR data.
Assignee | ||
Comment 22•1 year ago
|
||
OK, I propose that we go ahead and apply a patch to mozilla::intl::DateTimeFormat
and mozilla::intl::DateIntervalFormat
to restore "normal" spaces in Nightly, so we can see how that goes for the next couple of weeks.
Assignee | ||
Comment 23•1 year ago
|
||
The data for a bunch of locales was updated in ICU 72 to use U+202F and U+2009 in places where previously it had regular Space characters.
Unfortunately, this breaks some sites that attempt to parse the formatted output using naive regular expressions (or similar)
that just expect space, rather than "any whitespace", and fail to match against the new formatted output.
To mitigate this, until more browsers update to the newer ICU/CLDR data and pressure builds on sites to fix such fragile scripts,
we can post-process the formatted output from ICU to replace these "special" spaces with standard ASCII space characters.
This workaround is designed to be easily disabled at build time by just changing the DATE_TIME_FORMAT_REPLACE_SPECIAL_SPACES #define,
when we're ready to try re-enabling the updated formats.
Updated•1 year ago
|
Assignee | ||
Comment 24•1 year ago
|
||
We'll need to revert this when we're ready to disable the workaround.
Depends on D165408
Comment 25•1 year ago
|
||
you can add foxnews.com to the list affected in latest Nightly and also the latest Chrome. Edge is not affected it seems.
Images on Foxnews will fail to load when site does its own page refresh. A manual refresh of the main page restores the images, but fail again on the next site refresh.
Updated•1 year ago
|
Assignee | ||
Comment 26•1 year ago
|
||
(In reply to Jim Jeffery not reading bug-mail 1/2/11 from comment #25)
you can add foxnews.com to the list affected in latest Nightly and also the latest Chrome. Edge is not affected it seems.
Images on Foxnews will fail to load when site does its own page refresh. A manual refresh of the main page restores the images, but fail again on the next site refresh.
Interestingly, I wasn't able to reproduce a problem on foxnews.com. (Maybe they do IP geolocation and serve different content?) Anyhow, I have just triggered landing for the workaround patch here. So once that's in Nightly, if you could confirm it whether fixes the problem, that'd be good to know.
Comment 27•1 year ago
|
||
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/40e2c54d5618 Replace Narrow No-Break Space (U+202F) and Thin Space (U+2009) in DateTimeFormat/DateTimeIntervalFormat output with regular Space to mitigate breakage on fragile websites. r=anba https://hg.mozilla.org/integration/autoland/rev/4b27c1d6a589 Update tests to expect normal spaces rather than U+202F/2009 in formatted date/time strings. r=gregtatum
Comment 28•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/40e2c54d5618
https://hg.mozilla.org/mozilla-central/rev/4b27c1d6a589
Comment 29•1 year ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #26)
(In reply to Jim Jeffery not reading bug-mail 1/2/11 from comment #25)
you can add foxnews.com to the list affected in latest Nightly and also the latest Chrome. Edge is not affected it seems.
Images on Foxnews will fail to load when site does its own page refresh. A manual refresh of the main page restores the images, but fail again on the next site refresh.Interestingly, I wasn't able to reproduce a problem on foxnews.com. (Maybe they do IP geolocation and serve different content?) Anyhow, I have just triggered landing for the workaround patch here. So once that's in Nightly, if you could confirm it whether fixes the problem, that'd be good to know.
Well it seems likely its something else causing the page to not load the thumbnails. So the fix you landed did not affect the issue, nor did
we think that it would. Just a hope that it was related. Biggest problem is I can't find a way to repo the problem reliably.
Thanks for your work... will keep digging, hoping it goes away.
Updated•1 year ago
|
Description
•