Closed Bug 1421938 Opened 7 years ago Closed 5 years ago

Quotes marks in <q> tag are not localized when using lang attribute

Categories

(Core :: CSS Parsing and Computation, defect, P3)

57 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla70
Tracking Status
firefox70 --- fixed

People

(Reporter: github, Assigned: jfkthame)

References

Details

(Keywords: dev-doc-complete)

Attachments

(6 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36

Steps to reproduce:

<html lang="fr">
  <body>
    <q>This is a quote</q>
  </body>
<html>


Actual results:

“This is a quote”


Expected results:

«This is a quote»
This works in Chrome. See https://codepen.io/bbuhler/pen/YERvaJ
Component: Untriaged → Layout: Text
Product: Firefox → Core
This seems more like an issue with the default HTML stylesheet, rather than a core Text bug. AFAICS, we don't have any lang-dependent rules that would configure the quote marks appropriately for different languages.

I guess adding something like

  *:lang(fr) {
    quotes: "«" "»";
  }
  *:lang(ru) {
    quotes: "«" "»";
  }
  *:lang(de) {
    quotes: "„" "“";
  }

to html.css should make things better. But obviously we'd need a much more comprehensive set of rules.... I guess the <delimiters> from CLDR would be the canonical source of such data.
Status: UNCONFIRMED → NEW
Component: Layout: Text → CSS Parsing and Computation
Ever confirmed: true
Priority: -- → P3
In principle, I think something like this (derived from CLDR data, release 32.0.1) would provide the desired behavior. I'm a bit concerned about the possible performance impact of adding so many rules to the HTML stylesheet, though.
It feels very expensive to add such large number of attribute-based rules to UA sheet...

We should probably consider using mapped attributes for this, I guess.
Changing the lang attribute already requires a subtree restyle, so we could just avoid doing invalidation for these selectors entirely, if that's your concern.
Do selector matching with attribute selector should be considered expensive enough that we should avoid leaving them alone (i.e. without tag, etc.) especially in UA sheets which is loaded for every page.

We need to map lang attribute for other properties anyway, so it should be easy to just reuse that, I suppose.
(In reply to Xidorn Quan [:xidorn] UTC+10 (PTO Jan 19 ~ 29) from comment #6)
> Do selector matching with attribute selector should be considered expensive
> enough that we should avoid leaving them alone (i.e. without tag, etc.)
> especially in UA sheets which is loaded for every page.
> 
> We need to map lang attribute for other properties anyway, so it should be
> easy to just reuse that, I suppose.

Yeah, that's fair, I think we could reuse that...
I think this is a duplicate of the WONTFIX bug 16206.  And I really don't think we should do this unless somebody is writing a *spec* for the entirety of the values and multiple browsers are agreeing to implement that spec, and there's substantial review agreeing that the spec is correct.

That said, I think the <q> element is a misfeature and the XHTML2 <quote> design (where the author provided the quotation marks) was superior.

(In reply to David Baron :dbaron: 🏴󠁵󠁳󠁣󠁡󠁿 ⌚UTC-7 from comment #9)

I think this is a duplicate of the WONTFIX bug 16206. And I really don't
think we should do this unless somebody is writing a spec for the entirety
of the values and multiple browsers are agreeing to implement that spec, and
there's substantial review agreeing that the spec is correct.

Even if we think the way <q> works isn't a great design, given that both webkit and blink already implement localized open/close quotation marks, depending on the lang attribute (e.g. see example in bug 1560311 comment 3, which displays with various localized marks in both Safari and Chrome), I think we should do the same for better interoperability.

Regarding a spec, there's a clear source of data available in CLDR; we should simply use that.

One question regarding mapping the lang attribute: AFAICT, this would mean that an element with lang=xx gets a new initial value for quotes, overriding any inherited value for the property. So a document like

<div style="quotes: '<' '>';">
    <div lang="ja">Japanese <q>quotes</q></div>
</div>

would render Japanese quote marks. Is that correct? Is it desirable, or should the explicit quotes property on the parent <div> inherit "through" the presence of the lang attribute and continue to apply to the <q> within the lang=ja content?

Or is there a way we can map the lang attribute to a quotes value only if it has not been explicitly set by an ancestor?

Hmm, looking at https://html.spec.whatwg.org/multipage/rendering.html#quotes I see that it lists rules such as

:root:lang(ja), :not(:lang(ja)) > :lang(ja) { quotes: '\300c' '\300d' '\300e' '\300f' } /* 「 」 『 』 */

which implies that the example in comment 11 should indeed render Japanese "bracket" quotes around the <q> element.

There are still some open questions regarding how <q> should deal with language (use quote marks based on the lang of the <q> element vs its parent?); see https://github.com/whatwg/html/issues/3636. However, regardless of how that gets resolved, I think we should go ahead with implementing lang-based quote marks based on CLDR data, so that at least straightforward cases behave as expected, and more consistently with other browsers.

To try and move this forward, I've created a set of WPT reftests based on the i18n group's "language responsiveness" tests at https://w3c.github.io/i18n-tests/results/the-q-element.html; currently we fail a bunch of these (we only pass those where the language-specific quotes happen to match our default quote marks). I also have a patch series that implements CLDR-based language-specific quote marks, which makes these tests pass.

Jonathan, if my review comment in https://phabricator.services.mozilla.com/D36429 makes sense to you, I'd be happy to write a more formal proposal to the CSS working group, and I think the patch would become much simpler.

We can actually implement the blink behavior pretty straight-forwardly without waiting to discuss what the name of the new value would be (and you can still use it via quotes: initial from an author's perspective). We'd just need to replace the initial quotes computed value for a default-constructed ArcSlice, and make sure that serializes to the empty string. That should be web compatible (since it's what WebKit and Blink do), and forwards-compatible with whatever resolution the CSS WG comes up with.

Also, that means that the right way to implement the quotes reset that I discussed with Jen Simmons and Fantasai last CSSWG F2F would just be [lang] { quotes: initial } (or [lang] { quotes: auto / whatever } if we expose that magic initial value to content).

Flags: needinfo?(jfkthame)

Yes, this sounds like a good way forward. If you can put a proposal to the WG, that'd be great - thanks.

Flags: needinfo?(jfkthame)

The issue was resolved to add auto.

OK, sounds good. I took a stab at revising the patch here to add auto, and it seems to behave as expected; most likely could be a lot more idiomatic, especially on the rust/style-system side of things, as I really don't know what I'm doing there, so feedback welcome!

Also added a few tests for nested quotes etc.

Attachment #9074981 - Attachment description: Bug 1421938 - Reset the initial value of the CSS 'quotes' property by mapping the HTML 'lang' attribute when present. r=emilio → Bug 1421938 - Add an 'auto' value for the CSS 'quotes' property, and make it use language-dependent quote marks. r=emilio
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1d38eb5eff6e
Add a set of WPT reftests (based on the manual i18n-wg testcases) for localized quote marks. r=emilio
https://hg.mozilla.org/integration/autoland/rev/b8ebd4d241e0
Make the nsAtomCString constructor accept a const nsAtom pointer. r=emilio
https://hg.mozilla.org/integration/autoland/rev/f150702af11f
Implement a mozilla::intl::QuotesForLang utility to return localized quotation marks for a given locale, based on CLDR data. r=emilio
https://hg.mozilla.org/integration/autoland/rev/89a0866d1aa0
Add an 'auto' value for the CSS 'quotes' property, and make it use language-dependent quote marks. r=emilio
https://hg.mozilla.org/integration/autoland/rev/3958f2af0e34
Add WPT reftests for nested quotes and 'auto' behavior. r=emilio
Flags: needinfo?(svoisen) → needinfo?(jfkthame)
Backout by shindli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/388742e894ae
Backed out 5 changesets for causing a bustage in /builds/worker/workspace/build/src/intl/locale/cldr-quotes.inc:21:448 CLOSED TREE

Re-landing with fix for the warnings-as-errors bustage.

Separately, :gandalf wondered on irc whether we could use ICU to retrieve the localized quotation marks, rather than the static CLDR-derived data here. That may be a good option (simpler maintenance), though might be a bit more expensive. I'll look into it and potentially file a followup if it seems worthwhile.

Flags: needinfo?(jfkthame)
Blocks: 1564255
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/33e38a62b400
Add a set of WPT reftests (based on the manual i18n-wg testcases) for localized quote marks. r=emilio
https://hg.mozilla.org/integration/autoland/rev/adb2e2714c14
Make the nsAtomCString constructor accept a const nsAtom pointer. r=emilio
https://hg.mozilla.org/integration/autoland/rev/f27980997dc5
Implement a mozilla::intl::QuotesForLang utility to return localized quotation marks for a given locale, based on CLDR data. r=emilio
https://hg.mozilla.org/integration/autoland/rev/11a8f9bc0418
Add an 'auto' value for the CSS 'quotes' property, and make it use language-dependent quote marks. r=emilio
https://hg.mozilla.org/integration/autoland/rev/4e25a6db1f5b
Add WPT reftests for nested quotes and 'auto' behavior. r=emilio
Backout by shindli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8142cf458076
Backed out 5 changesets for bustages in /builds/worker/workspace/build/src/layout/base/nsQuoteList.cpp CLOSED TREE
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0d506ceb820f
Add a set of WPT reftests (based on the manual i18n-wg testcases) for localized quote marks. r=emilio
https://hg.mozilla.org/integration/autoland/rev/44d239e5d236
Make the nsAtomCString constructor accept a const nsAtom pointer. r=emilio
https://hg.mozilla.org/integration/autoland/rev/66e4974b9a8c
Implement a mozilla::intl::QuotesForLang utility to return localized quotation marks for a given locale, based on CLDR data. r=emilio
https://hg.mozilla.org/integration/autoland/rev/b7e1b4dd94c6
Add an 'auto' value for the CSS 'quotes' property, and make it use language-dependent quote marks. r=emilio
https://hg.mozilla.org/integration/autoland/rev/4e0eb2a76b0f
Add WPT reftests for nested quotes and 'auto' behavior. r=emilio
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/17697 for changes under testing/web-platform/tests
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
Assignee: nobody → jfkthame
Flags: needinfo?(jfkthame)

I've had a go at documenting this; see https://github.com/mdn/sprints/issues/2101#issuecomment-530782716 for the full details.

I am a bit unsure as to whether I got this right, as the results seem strange; a review would be much appreciated. Thanks!

Flags: needinfo?(jfkthame)

Looks reasonable to me; what do you feel is strange? (Just the fact that the other browsers have auto behavior yet don't recognize the value? That's known, and I expect in due course they'll add it.)

Basically, the auto value is a recent addition to the spec, and I don't think any other browsers have actually updated to support it yet, but they "magically" behave that way by default anyway. See https://github.com/w3c/csswg-drafts/issues/4074 for background.

Flags: needinfo?(jfkthame)

(In reply to Jonathan Kew (:jfkthame) from comment #35)

Looks reasonable to me; what do you feel is strange? (Just the fact that the other browsers have auto behavior yet don't recognize the value? That's known, and I expect in due course they'll add it.)

Basically, the auto value is a recent addition to the spec, and I don't think any other browsers have actually updated to support it yet, but they "magically" behave that way by default anyway. See https://github.com/w3c/csswg-drafts/issues/4074 for background.

Yeah, basically it just felt a bit weird to me, as I've never explicitly thought about this kind of situation before, but I guess it's not the first time its happened. I'm glad I got it right ;-)

Thanks for the review and explanation - this is really helpful.

QA Whiteboard: [qa-70b-p2]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: