diff --git a/meetings/task-force/#103-2020-09-28.md b/meetings/task-force/#103-2020-09-28.md new file mode 100644 index 0000000000..f0bbddf8ed --- /dev/null +++ b/meetings/task-force/#103-2020-09-28.md @@ -0,0 +1,310 @@ +## Executive summary ([Original Doc](https://docs.google.com/document/d/1lAyBZR2VQR8ILqvcg5Gad_wf7QWUbsoJ13wGZFSmtbE/edit#)) + +*Participants:* +- DAF: David Filip +- ECH: Elango Cheran +- MIH: Mihai Nita +- RCA: Romulo Cintra +- STA: Stanislaw Malolepszy +- EAO: Eemeli Aro +- ZIB: Zbigniew Braniecki +- LHS: Luke Swartz +- NIC: Nicolas Bouvrette + +The main subject of discussion is internal selectors and message-level selectors. + +The choice between top-level-only and internal selectors cannot be decided on technical merits alone. We need to go back to the axes (design principles) discussion. We need to see what our values are, in terms of stakeholders and use cases. + +If we go with message-level selection, we want to have a migration path from internal selectors (ex: ICU MessageFormat) to message-level selection. + +Also, some of the concerns raised recently about message-level selectors would be addressed by allowing references from one message to another. + +Action items include all of the following: + +Collecting stakeholders +- Listing all categories of stakeholders +- Inviting more representatives from stakeholder categories +- Goal is to collect information that will help us decide on priorities among the categories of stakeholders + +Collecting use cases +- [Issue #119](https://github.com/unicode-org/message-format-wg/issues/119) +- Including corner cases for current approaches (ICU MessageFormat, Fluent, etc.) +- Examples that seem practical IRL but potentially unwieldy, from all perspectives +- Collect in GH issue + +Describe / depict the scenario of a UI (of a CAT tool) for professional translators dealing with internal selectors (is this feasible, or more difficult when compared with only full message selectors?). 
+ + +> Approval Stamps for Executive Summary + +*DAF,ECH,STA,NIC* + +## Minutes + +This meeting is a continuation of the last task force meeting ([minutes from the 1st task force meeting](https://docs.google.com/document/d/1-6t6Yl5RHZI9QZwBDrFrl1fqSKSA4IMs1ef60IxD3lU/edit#heading=h.tulel52cgapk)). See [issue #103](https://github.com/unicode-org/message-format-wg/issues/103). + +STA: EAO has use cases of internal selectors, we can start from there. + +MIH: ZIB raises concerns that I don't totally agree with. I think we are past technical arguments or pros and cons, and we have fundamental philosophical differences in how we evaluate the pros and cons. By arguing about our positions, we've ended up at a point where we need to compromise somehow. On how to compromise, we need some clear guiding principles. I think the first 4 bullets (of doc _____) still apply. We have to be opinionated and be willing to fix what we learned was wrong. + +LHS: What is the point of having a standard in the first place? Why should each of us not just create something and keep it within our respective organizations? For example, we've created a format internally to better support more of the necessary aspects of localization for our localization tools ecosystem. + +EAO: So where are we supposed to do that transformation? + +MIH: Yes, that is the core argument. My preference among the 4 options [in this issue comment](https://github.com/unicode-org/message-format-wg/issues/103#issuecomment-699432663) is option 2, and the other argument being discussed is option 3. + +RCA: My reason for starting this way was to have EAO explain his concerns. We want to hear more about what EAO and ZIB have to say. + +STA: Can I try to summarize? Some of us prefer message-level selectors only, with internal selectors "exploded" (the Cartesian product of the combinatorial options) into full message patterns with top-level selectors. + +And others would prefer this conversion to happen in the round-trip. 
+ +EAO: Not quite. The conversion from internal selectors to top-level selectors is an easy operation, and can happen when we need top-level only selectors. Allowing for internal-only selectors doesn't impose an appreciable cost when we require the top-level selectors. + +If we make explicit what the operation is for converting from internal selectors to top-level selectors, then the reverse-conversion will be clear, even though we don't need to specify how it should be done in the specification. + +Concerns are: ability to round trip the messages, and potentially size for a large volume of messages. + +LHS: Is it possible to do both: 2 different versions of the standard (internal selectors, message/top-level selectors)...and have conversion part of the standard? + +MIH: yes but implementation is harder, things are messier...once we allow internal selectors, we sneak in through this door recursivity + +LHS: When you say recursivity, do you mean nesting? + +MIH: Yes. But I also mean what happens when you have message references. + +DAF: Isn’t recursivity an issue also in case of top level selectors? + +EAO: I think what we're talking about is nesting, not recursivity. Recursivity needs to be addressed if we allow references between messages, ex: A includes B, B includes C, ... + +LHS: Can you explain why nesting would be worse for internal selectors than full-message selectors? + +MIH: + +``` +Foo {count, plural, =1 {...} other {...}} bar. +xyz {hostGender, select, female {...} male {...} other {...}} vwz. + +Foo {count, plural, =1 {xyz {hostGender, select, female {...} male {...} other {...}} vwz} other {xyz {hostGender, select, female {...} male {...} other {...}} vwz}} bar. + +[=1 female] {.....} +[=1 male] {.....} +[=1 other] {.....} +[other female] {.....} +[other male] {.....} +[other other] {.....} +``` + +RCA: But the 2nd option is possible, right? + +MIH: Yes, but once it's in the standard, then we're stuck with it. 
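The expansion in MIH's example, from nested internal selectors to six top-level variants, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something presented at the meeting: the `Select` class, the `expand` function, and the placeholder texts (`one-F`, `many-M`, ...) are hypothetical, and the sketch ignores cases such as the same argument being selected on more than once.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Select:
    """Hypothetical internal selector: an argument name plus one sub-pattern per key."""
    arg: str
    cases: dict  # key (str) -> pattern (list of str | Select)


def expand(pattern):
    """Flatten a pattern into a list of (keys, text) variants.

    `keys` is a tuple of (arg, key) pairs, outermost selector first.
    """
    variants = [((), "")]
    for part in pattern:
        if isinstance(part, str):
            options = [((), part)]
        else:
            # One option per key, multiplied by the variants of its sub-pattern.
            options = [
                (((part.arg, key),) + sub_keys, sub_text)
                for key, sub_pattern in part.cases.items()
                for sub_keys, sub_text in expand(sub_pattern)
            ]
        # Cartesian product of everything so far with this part's options.
        variants = [
            (keys + opt_keys, text + opt_text)
            for (keys, text), (opt_keys, opt_text) in product(variants, options)
        ]
    return variants


def host(prefix):
    # Placeholder texts are invented for illustration only.
    return Select("hostGender", {
        "female": [f"{prefix}-F"],
        "male": [f"{prefix}-M"],
        "other": [f"{prefix}-O"],
    })


# Same shape as the nested example: a plural selector wrapping a gender selector.
message = [
    "Foo ",
    Select("count", {
        "=1": ["xyz ", host("one"), " vwz"],
        "other": ["xyz ", host("many"), " vwz"],
    }),
    " bar.",
]

table = expand(message)
for keys, text in table:
    print("[" + " ".join(key for _, key in keys) + "]", "{" + text + "}")
```

Running it prints one line per key combination, in the same order as the bracketed list in MIH's example. Going the other way, from the flat table back to the nested form, is not attempted here.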
+ +EAO: It's not all that difficult, having worked with that for a while. + +MIH: But it's not adopted by others, especially in the localization world. This is not very well supported. + +LHS: Some of us at Google are sensitive to the concerns of the localization industry due to the internal localization we do. I don't think you've explained why it’s messier with internal selectors than external/full-message selectors. + +EAO: How simple would the algorithm need to be that converts from internal to top-level selectors so that the localization industry would be able to support messages with internal selectors? + +MIH: The algorithm is not that complicated. What becomes messier is the data structures - basically, we make it more complicated. We then need to ensure that we are dealing with the normalized form before working on it. + +EAO: Or we add a flag that indicates whether we are using internal selectors. + +ZIB: I have 3 thoughts on what I just heard. 1) __________. I am not sympathetic to the reason that "localization tools haven't used it so far" as a reason for not supporting it. Saying that CAT tools are not powerful enough to handle it, so that they never will be, limits what we are willing to try. + +LHS: I'm sure you're as frustrated with the current ICU MessageFormat as the rest of us are. At Google, we have a bespoke localization toolchain, we've spent resources in the tooling and l10n infrastructure. Even for us, with all that we do, it's still very difficult to support these features (in existing MessageFormat). + +MIH: That's not really the argument. + +RCA: Point of order, let's stick to the original topic. + +MIH: No, that's not what I was saying. We've been using ICU MF for 15 years. The main argument is that the places where it is not adopted / banned, it was banned for the reason "There are internal selectors -- don't do this". So to argue for putting that back in is ignoring those lessons, and we know what works and what doesn't. 
+ +ZIB: Can I respond to that, because I think it links to what LHS is saying. Rust allows an “unsafe” option; we should have the default Lint option be external message selectors, but still provide the other option. The experience of l10n teams is a good justification for defaults, and for this limitation as a default, but saying “let’s not allow it ever by anyone” seems cocky. MIH, please distinguish complicated & complex. + +MIH: if it has a real use case, a benefit, other than “flexible for something that we might or might not need”, languages themselves are unlikely to change in 15 years + +ZIB: Agree that we need reasons to add complexity + +MIH: let’s not say “let’s make it as flexible as possible” because then we’ll end up with a Turing-complete language + +STA: We agreed that these approaches are equivalent, so it’s just a question of convenience, convenience for developers vs. translators. + + +STA: I think we agreed previously that these different approaches are equivalent, so things are inherently complex, but the tooling can be there. + +I’d like to take a step back from the heated conversation and think about our goals and principles. I want to go back to what LHS was saying about the l10n industry. With current MF, we're so fragmented. So I think we shouldn't try to add too many features. But maybe we're ready to challenge previous assumptions. + +It's another axis to think about. Unification or innovation. + +DAF: First off, I do support message only selectors. I tried to wear the hat of internal selectors in case of MIH’s recursivity argument. I think both approaches support recursivity. In full message selectors, the recursivity is only allowed upwards and no hidden complexity is allowed. But internal selectors allow for infinitesimal hidden complexity via recursivity. I am for saying that the standard canonical solution is message level selectors only, and having internal selector capable roundtrips is okay as far as we want to go there. 
+We finished our goals and non goals, but we haven't finished our design principles. This discussion is one of the axes discussions, continuing the design principles development. We agreed that interoperability with L10n is one of our goals and I don’t see internal selectors contributing to that. + +ECH: When it comes to discussion of complexity & simplicity, there’s a talk I go back to, [“Simple Made Easy”](https://www.infoq.com/presentations/Simple-Made-Easy/), talks about how this relates to programming. The talk does a better job than I could to explain this. Simplicity & complexity are objective terms, but difficult/easy are subjective terms. Complicated sounds a little ambiguous because it's not clear whether it's meant in the objective meaning of complex or the relative meaning of difficult. There’s inherent complexity (in the problem we’re trying to solve) and incidental complexity (that we’re adding)...I go to this perspective repeatedly and it always proves to help me make successful decisions every time. + +MIH: I want to go back to me being emotional, and I am not getting emotional about making my point, but I'm getting emotional about repeating my points, and we keep repeating the arguments like in a flame war without discussing the underlying principles and finding a way to compromise, in order to break this impasse. + +EAO: A lot of my exp with MF1 has in fact been at a different scale of Google’s size. Small teams, in-house work. Most of the stuff we do needs to have support for Finnish and English. The other end is: no need for external toolchain, no need for LSP companies, just working with developers who know what they’re doing. + +EAO: The other point is that I think there is a compromise solution here to satisfy both our desires. That is, require top-level selectors, but require messages to build themselves out of parts of other messages. That would give us the benefits of message level selectors without having them. Would that work, MIH? 
+ +MIH: Yes, with caveats. + +EAO: That way, we're not specifying the format explicitly, and we let the data model for a message or a bundle of messages, and to not allow recursivity. + +MIH: Yes. Although, there was a request in the GH issue discussions for allowing recursivity, so we should revisit that. Back to the English / Finnish examples, is there value in gathering the stakeholders together and seeing how we can serve them all? + +ZIB: I'm happy to _________. Last month, we agreed that those 2 approaches are equivalent, but then EAO pointed out that it's not true, for example version control roundtrip. If we allow nested selectors (?), then we can support recursivity in either approach, but one approach doesn't allow round trip and the other doesn't. + +STA: But if we say that we only require top level selectors, then round trip works. + +ZIB: ________, and from the top level, you cannot get back to what was provided. + +STA: One solution is to not allow the transfer between the 2. + +ZIB: But then the data model doesn't allow for the storage of one approach. + +MIH: + +**You can’t roundtrip this (with info from data model only):** +``` +You {count, plural, =1 {deleted # file from} other {deleted # files from}} income. +``` + +DAF: You can do the conversion if you have knowledge of the expansion mechanism that had been used. Without that, it's impossible just algorithmically. + +EAO: And that gets tricky when you deal with transformations in the other form (in the message level form if originally supplying internal selector form). Which is why I say that we allow it but don't try to specify it. + +DAF : Last meeting we were tending to a consensus that all public exchanges should happen with full message selectors and internal selectors would be allowed in private roundtrips. 
+ +MIH : Ok we only can roundtrip this if you have internal information of data model + +ZIB: I want to point out that I agree that the question MIH posed is the right question, but I want to counter the point that they are completely equivalent. + + +DAF: The only argument for internal selectors that makes business sense to me is that internal is cheaper over the wire.. + +EAO: And it's more future-proof, if you're asserting that tools up until now haven't used it, it doesn't mean that tools won't use it. + +STA: Back to axis (unification vs innovation). Why do you think we want to have this standard out? What chances do you see for MF 3.0? Will that [have to] happen, or is MF 2.0 enough. + +EAO: If we do it right, we can maintain backwards compatibility and never have to break. + +STA: We have to look at the issue and implications. What do we do with a message with 7 different selectors. There are 3 different options: 1) favor compatibility and accept that we won’t be able to express messages with many selectors in a terse manner; 2) favor innovation and challenge the existing toolchains, 3) perhaps add it in MF3.0? And considering the complexity involved, I'm in favor of adding less features and keeping better backwards compatibility. + +RCA: Can we take a look at 2 options, and vote. So, message level selectors only, or allows both message level and internal selectors. We're going back and forth now. + +MIH: It's not so easy. + +DAF: I agree that it's not so easy. If we allow message level selectors only, that’s the only simple option. But if we allow internal selectors, we get combinatorial possibilities. Which is canonical, how do u get from one to another? If you allow both, we’d need to define these equivalences, u cannot simply allow two options and let people use both, that’s not how u standardize interoperability.. + +EAO: We're also not just dealing with this question in a vacuum. 
Like I said, I'm perfectly okay with supporting message level selectors on the condition that we can allow for message bundles (groupings of messages) to be able to be passed. + +MIH: I can clarify my position. If you don’t mean “Bundle”, I’m fine with references. I want to allow for references that are loaded from an arbitrary place, not necessarily from the same “Bundle”. I don’t want to care if it’s the same file or not. + +EAO: In the data model it does matter because ____ + +EAO: It’s not only about loading references to messages, but also scope and arguments. + + +ZIB: I think we agree on the round trip, but differences in _____ + +Is the goal to allow a standard that allows a particular localization to be built on top? Or are we trying to create a data model + + +STA: I have thoughts. + +ECH: I want to pick up on what EAO was saying about the idea of Bundles and references if we can pass a bundle of messages. +Maybe the caveat that MIH was bringing up is relevant, but one of the systems we have at Google, supports localization details that are not representable in MF, and it also creates bundles of messages. It was necessary. So maybe this is the right solution. + +MIH: I agree about references, but I don’t think we need to support hierarchy and scope here. An example I can give in Android is a library that has a picker, and it needs to have localization and if you use my library you inherit my strings from the library. They come from a library, they’re not part of your app strings. + +ECH: So, you want to have references, but you don’t need them in the same bundle, you just need to be able to refer to one another. + +NIC: Re-using string is normally considered a bad practice. In many languages the context will change the translation of a string (e.g. “yes/no” in Vietnam). From what I understand from ZIB's bundles, the main use case is context, and that's a much bigger problem to solve if we tackle this part of this group. 
A lot of UIs also require images, for example, to provide the best translation. + +ZIB: I want to point out that EAO and MIH are using "bundle" to refer to a single package of resources/files grouped together, whereas Fluent uses it to refer to the context that gets evaluated/interpolated at runtime. + +DAF: If I understand correctly, EAO would be happy with top-level selectors only if we allow messages, variables, or scope to be passed through. As a part of l10n interchange format standards, I am potentially worried that this could violate the boundaries of text units. + +MIH: There are use cases for that, such as alt text in HTML. + +DAF: I see, subflows. I am fine with that, as the subflows mechanism is well established in L10n exchange. + +STA: I would like to caution about message references. I still don't know what to think about them myself. They do cause problems for tooling, from experience. And they do cause problems with runtime resolution if you don't have that reference ready yet. So there are reasons to be cautious, and it's good to not conflate these 2 discussions. + +Back to what ZIB was saying earlier, whether we want to have a data model that can express different forms of translation to support different types of l10n systems. I think that is based on the idea that it is necessary to support the internal selectors. But that's not necessarily true. It can be useful for developers, but there are cases where they are an abomination and cause problems. So I would be okay just not allowing them at all. + +MIH: That is a problem. Messages in English might be more conducive to expression with internal selectors, but when you get to Slavic languages, then you are forced to expand them or find workarounds. + +ZIB: I know you're aware of the problem of the explosion of permutations (expansion of combinations, which is a large number). Do you not think that is a problem? 
+ +STA: I think there is a tradeoff that we're not really naming: if we accept that there are a few messages that we won't be able to support, then we can enable very wide adoption and become a good standard, and the large benefit that provides would, in my mind, outweigh the cost of not supporting a few types of messages. + +ZIB: Can we collect all the use cases that can only be represented with internal selectors? Give ourselves 2 weeks to do that, and see what we get. + +EAO: I think it would be hard to get current users of MessageFormat ["1.0"] on board without a safe and trusted way to convert their messages and to recover them in the old format. And that is why, if we only use message level selectors, we have to use message references, etc. + +MIH: I think it is possible to do. + +RCA: How can we collect use cases? Create a new issue? + +MIH: Also propose collecting stakeholders (“beneficiaries of this standard”), so when we make our compromises we know who is affected + +RCA: When the group was formed, I wanted translation companies to be part of it, but I wasn’t able to bring enough people from that side to our meeting/work group. We should include the L10n industry *and* simple developers that want to localize their own stuff. + +MIH: we keep saying “translators” but there are a number of types of translators (from open source contributors to paid L10n vendors) + +EAO: Like bilingual developers! + +ZIB: I really like where it’s going, the balance we’re going to be striking is between STA / MIH, we want to take those outliers that we believe would be infeasible to solve with top-level selectors, see if they’re really infeasible, and independently of the answer, we may decide that this is out of scope. 
I like this way of thinking, since there are multiple ways to decide “out of scope” but the ultimate one is what STA said, we don’t want 12K messages, “you need a new L10n system”, my concern is that the problem is common enough that we may not want to make it out of scope. I hope that this example collection exercise will help us decide. + +ECH: Can handle both the pre-filling and non-prefilling case by having CAT tools that allow the typing 1 string in a translation target textbox, and everywhere that the source is the same, copy into the target, and then edit based on that. So there’s tooling that could help—potentially on the L10n side. + +MIH: You’d have a fuzzy match so you’d know to edit it. + +ECH: Or take a starter string and edit based on that starter string, which would reduce the translation effort. + +EAO: if the source were using internal selectors, the initial duplication might be easier for the translation tool + +MIH: I think it would be messier + +EAO: if you know that something is the same, it’s easier to match it + +MIH: but in some languages that’s not the case, some things that are the same in English wouldn’t be in Romanian + +RCA: most of this will happen offline? + +ZIB: proposed exercise for MIH: complexity of data model interferes with what CAT tools can do, I know you have examples in your mind. It would be useful for us if you could compile a list what innovations you’d need to see in a CAT tool to be comfortable with doing nested selectors from a CAT Tool perspective, what would need to happen for you to say “that’s fine”...then we could send it to people who work on CAT Tools, and see if it’s possible to solve this...or it isn’t possible + +MIH: I can try, I have an idea of how that can be used...I think it’s possible, I’m not sure how useful. I’m fine to take this “homework”. + +RCA: maybe close this part & do rest offline? For those preparing slides for the Unicode Conference, do you want time to discuss? 
+ +RCA: We have a few minutes left, so maybe we can continue the discussion offline. + +EAO: Back to our main discussion, is there anyone else besides STA who has a problem with message references? + +MIH: I do. I know you can do bad L10n with them, but I think that the benefits outnumber the drawbacks. And I think it can be done with message level selectors. I don’t think it would be a blocker. + +STA: EAO, is it an invitation for me to prepare a short presentation on my concerns? + +EAO: I’m curious what other people's opinions might be there. + +STA: I see benefits of them, I also see benefits of not having them. I can write a short doc on it and put it on github. + + +ZIB: I would also like to say that we have an increasing number of requests for dynamic elements, where the best thing to pass is a declaration to an argument, and this is causing people to write dirty hacks in JS. It's another angle to think from. Ex: you decide only at runtime which name from a set of five names you use for a message. + +STA: I think this is similar to how Siri works. I think that is something interesting. + +MIH: Basically, you get all the same problems that you get for placeholders. Ex: the problem + +ZIB: But the extra problem is that declaration is synchronous, but the resolution of the declaration is asynchronous. That results in _______. That creates churn and is a paper cut. + +EAO: This conversation is presupposing that we have message references. + +MIH: I will file an issue, if we don't, to discuss message references. Also, I created and applied the "requirements" tags to our issues. Fix as needed. + +RCA: Thanks, this type of repo maintenance is necessary, so thanks for that. Let's define a little bit more the organizing work to be done by the chair group in the chair group meeting next week, and create the project planning board. 
\ No newline at end of file diff --git a/meetings/task-force/#103-2020-10-26.md b/meetings/task-force/#103-2020-10-26.md new file mode 100644 index 0000000000..3fb932ae4e --- /dev/null +++ b/meetings/task-force/#103-2020-10-26.md @@ -0,0 +1,318 @@ +## Executive summary ([Original Doc](https://docs.google.com/document/d/1QvzmpbVsPfW0MFajXGIqPhXxp54oGV6xaVShOjaV-qk/edit#)) + +*Participants:* +- DAF: David Filip +- ECH: Elango Cheran +- MIH: Mihai Nita +- RCA: Romulo Cintra +- STA: Stanislaw Malolepszy +- EAO: Eemeli Aro +- ZIB: Zbigniew Braniecki +- LHS: Luke Swartz +- NIC: Nicolas Bouvrette +- GRH: George Rhoten +- CLS: Colin Sprague + +Consensus 1: Include message references in the data model. + +Discussion: The implementers would find a way to include references anyways, but including it in the data model (standard) can make it subject to best practices. It’s still possible for users to do “the wrong thing” (ex: concatenation of strings/messages), but then you would find it more difficult to achieve. + +One of the drawbacks of message references is that referenced messages effectively have a public API (names of parameters, variables, variants, etc.) which must be consistent across all callsites. This leads us to consensus 2. + +Consensus 2: Allow parameters passed with message references to the message being referenced and validate it. + +Discussion: The variables/fields passed should not be completely untyped and unchecked. We want a validation mechanism that can allow providing early error feedback to the translators & developers. We need to decide on when the validation can & should happen, including the meaning of “build time” and “run time” in regards to validation. + +> Approval Stamps for Executive Summary + +*ECH,MIH,RCA,NIC,DAF,STA,EAO,CLS* + +## Minutes + +LHS: Summary of last meeting (2020-10-19 monthly meeting). Continuation of previous discussions, but closer to a compromise. One side would prefer to keep all selectors external to the message. 
The other side would prefer to allow selectors inside / internal to the message. Argument for external selectors is simplicity of data model, and compatibility with existing l10n tooling. Argument for internal selectors is that it allows for more flexibility in the future. It makes a more compact notation (for cases where compactness matters). And it makes it easier for translators who are programmers. It allows for the possibility of lossless round tripping. EAO was suggesting a compromise that we either allow message references or internal selectors, and it seems like MIH and others were OK with this…? + +MIH: Message references are about using a message id that points to an external selector, but not including the message itself. So it’s “message-by-reference”, not “message-by-value”. + +ZIB: For message references, there would need to be tooling made to support it, but it can be done. But one advantage is that they can be used dynamically. That is a powerful model that we don’t support in Fluent but would like to explore support of. If they solve what EAO is looking for, then that is great. + +LHS: One thing to point out is that MIH said last time that allowing message references means that users can concatenate messages, but we recognized that no matter what we do, it’s always possible for users to “do the wrong thing”. + +ZIB: And we can use linters to help users detect potential problems. + +MIH: And if we don’t allow programmers to do this, then it will be “behind our backs”, but if we let people do references in the standard, then we can catch it, advise, etc. + +RCA: It’s nice to have guidelines to help the user, but it’s a cool-to-have or nice-to-have feature, but it’s good to + +EAO: I’m not sure that linters are outside the scope [of the WG]. It is something that could be included in the spec. 
+ +RCA: I think we can specify best practices, but it is good to not interfere in the + +EAO: I suspect that if we delve deeper into this topic, it will be more complicated than we realize. + +DAF: It’s true that we don’t have specific guidelines on the external user deliverables, but we do have a specific goal on interoperability with XLIFF, and that might surface + +STA: I want to acknowledge the shortcomings from Fluent regarding message references. The first drawback is that message references mean that you reference messages. People tend to abuse it. They put nouns into a message reference and then use it as a reference instead of just using the noun directly in the pattern. The second problem is that there is a loss of context for the translator. The third problem is that a single message is no longer independent. That means that you need a CAT tool to find out what these other messages are / where to find them, and then you need to construct a graph of message dependencies. + +DAF: STA, you’re right, but MIH is also right -- if you don’t give this option, then they will do this anyways, so this gives the user some guidance on what to do. XLIFF can represent this with subflows. You’re right there might be subflows of logic, and there can be business logic around that. And this is a continuation of the discussion on the thread about “localization units”. + +EAO: If our current discussion is about where to put the message references in the data model, then later discussions can determine how to layer in how to use it, but we haven’t finished the first discussion. + +LHS: One thing that I wanted to make sure of is that, for companies/groups that want to use internal selectors, there is a mechanism to convert to external selectors losslessly, and that companies/groups that want to keep things simpler (e.g. 
due to STA’s concerns) could just not allow references at the company/group level (even though it’s allowed in the data model/standard) + +ZIB: What I was talking about last time was exploration about extensions, using metadata, that gives us the ability to turn on features using flags and maintain that information. + +EAO: It’s nice to have consensus on this topic of , which I am seeing here. + +RCA: Can we ask for consensus, here? + +STA: I have my doubts about them, and I’m less enthusiastic about them, but it’s really easy to work around them, and if you do, it’s even worse. It’s not perfect, but all the alternatives are worse. + +MIH: I agree with all the concerns that STA has, too. + +STA: I was enthusiastic about the features that message references enable, as GRH showed with Siri, so there is a lot synergy there, and I’m not going to block it. + +RCA: Does this decision about message references imply anything about internal selectors? + +MIH: They’re different. + +LHS: But can we talk about not having internal selectors if we use message references? + +MIH: ZIB, is it okay to share info from our 1-on-1 meeting earlier today? We are cautiously optimistic that there is a way to work with this without needing to use internal selectors. + +ZIB: I will write up a summary of our meeting and post it as a GH issue. + +EAO: The (external) message that is being referenced by another message should be able to have access to some variable representing the context of the message that uses the reference. + +MIH: Is this example correct? “I am visiting {city_name}” and the translator wants to know, when translating the message that {city_name} points to, that it is a dative or locative case. + +STA: One thing about genders and casing was that developers define the messages with cases, etc., and once they do, they’ve created a sort of public API for the messages, whether they realize it or not. + +GRH: Are you talking about the nomenclature for the grammatical keys, etc. 
+ +STA: For example, someone might call it genitive, but in Polish, I may not call it genitive. + +GRH: Linguists come in and define it, but over time we’ve improved after we realize that we want to standardize the names of terms. + +MIH: A proposal: we don’t put it in the standard, but instead we create a registry like what Unicode does for locale identifiers (BCP 47) that uses IANA for a registry. With a clear expectation that it is subject to change. + +STA: That is something that we were thinking about, too. Maybe on a per-application basis. + +NIC: We had discussions on GitHub about special file formats for references. Is that necessary for references? + +MIH: I think that it can be designed to be file-format agnostic. + +EAO: My feeling is that we should be able to provide _a_ file format that can support all of the features that we want to support, but not declaring it to be _the_ file format. + +LHS: Like a reference implementation? + +EAO: Yes. I think YAML is the only one that does ______. But I don’t think we should talk about file formats right now. + +DAF: Going back to the topic of registry. We have quite a lot of things already that would benefit from a registry. Maybe a repo for all of them? General linguistics defines many of the grammatical concepts that map to variants of messages. But we also have to think about where the repo goes, who maintains it, etc. Would Unicode be maintaining it? For XLIFF, we have a place where we allow people to register their own custom values so that they don’t just go off and do things opaquely. + +MIH: XLIFF 2 has a model that defines standard ways to add extensions +http://docs.oasis-open.org/xliff/xliff-core/v2.1/os/xliff-core-v2.1-os.html#extensions + +DAF: Also, it is a good idea to reserve some authority for ourselves and make clear where people can register their own sets of values, and for what, under their own authority. + +EAO: This overlaps with the discussion of references (?) 
and we need a registry for that. + +DAF: We are talking about one registry, but maybe we need more than one. For example, a registry of variants per language is a different kind of information than an inter-language registry of general linguistic categories. Unicode CLDR acts as a registry for several external specs such as BCP 47 extensions U and T. We also have to start thinking about where to place the registry technically and politically. + +EAO: What we do not know yet is whether these registries will have 100 entries or 1000 entries, and these sorts of matters will shape the discussions of who owns that registry and how it operates. + +RCA: Before going too far into topics of registries, etc., should we first resolve the discussion of message references and get a consensus? + +STA: Anyone -- is there any sustained opposition to MessageFormat 2.0 having references, with details TBD? + +RCA: We seem to have consensus. + +DAF: We should bring this up in the monthly meeting. Only there can we make decisions. The consensus here is still helpful for the taskforce. + +ZIB: I would like to see the [Apache voting system](https://www.apache.org/foundation/voting.html) to make it quick and clear, from -1.0 to 1.0. It helps to see the temperature of the discussion, instead of a difficult binary system. + +RCA: We should still have the official vote in our monthly meeting, based on our rules. + +DAF: But maybe we can quickly vote. + + +Apache-style voting on including message references in the data model: + +``` +EAO: +1 +ZIB: +1 +CLS: +0.9 +NIC: +0.9 +LHS: +0.5 +DAF: +0.9 +RCA: +1 +MIH: +0.9 +ECH: +0.9 +STA: +0.5 +``` + +ZIB: Thanks, this was much more useful to me than just silence. + +DAF: Zoom does have a polling mechanism built in that I use in other groups that I chair, and it has worked well. + +RCA: Should we count? + +ZIB: It looks good, everything is fairly strongly positive with a low standard deviation. 
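ZIB's reading of the tally can be checked with a little arithmetic. The following is a minimal sketch, assuming the ten values recorded in the vote above and assuming the population standard deviation (the minutes do not say which definition was meant):

```python
from statistics import mean, pstdev

# The ten vote values from the tally above, in the order recorded.
votes = [1.0, 1.0, 0.9, 0.9, 0.5, 0.9, 1.0, 0.9, 0.9, 0.5]

average = mean(votes)   # arithmetic mean of the votes
spread = pstdev(votes)  # population standard deviation (an assumption)

print(f"mean={average:.2f} stdev={spread:.2f}")  # prints "mean=0.85 stdev=0.18"
```

On a scale from -1.0 to 1.0, an average of about 0.85 with a spread of about 0.18 is consistent with the "fairly strongly positive" reading.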
+ +RCA: We should have someone bring this to the plenary meeting to describe the discussion and the consensus and temperature reading mechanism using Apache voting. Can someone do that? + +MIH: I can. + +EAO: I think there is a possibility to go a step further to have these message references allow the passing of context from the message with a reference to the message being referred to. + +RCA: I see this as an extension of the message reference itself. + +MIH: It kind of is. + +EAO: I would like us to consider the proposal and get consensus for the ability to pass context in the message reference. If we do, then we can call for consensus on not allowing internal selectors. + +MIH: We could take a look at the metadata that we tie to the references, but we should not connect it to the topic of internal selectors. + +RCA: Can we have an example that illustrates this topic? + +MIH: A need for metadata for message references +Example: `You visited {$company} headquarters` +It is useful to allow a translator to add some extra info to the ref, for example the fact that it should use a locative grammatical case. + + +EAO: STA, can you formulate a statement that you would be willing to support? + +STA: That might be too much to do right now. + +ZIB: MIH and I would like to ask you all for a couple of days so that we can share our discussed proposal. + +STA: I think I would like to make my support of message references more qualified by saying that I would prefer them to be strongly typed. + +MIH: What does that mean? + +STA: In the Fluent example below, `$case`, nominative, genitive are the de facto public API of brand-name. I’d like message references to protect (at build time) against mismatches between definitions (brand-name) and callsites (about). 
+ +``` +brand-name = { $case -> + [nominative] Brand + [genitive] Brand’s +} + +about = About {brand-name($case: "genitive")} +``` + +EAO: What you’re saying is that you want a build-time error if the name `$case` is not an option. + +MIH: Isn’t this something that can be handled by the registry? + +EAO: That is _a_ possible implementation, but what I am looking for is a consensus for the idea of a strongly typed reference. + +MIH: Okay, I wouldn’t call that “strongly typed”, but it should be defined in the registry. + +ZIB: Another way of thinking about it is having some form of meta information that defines which selectors/keys, and/or values provided to them, are passed to the message that is being referenced. But we still allow companies/groups to define their own selector types. + +DAF: We need 3 levels: things in the standard, things in the registry, and things in the control of the code owners. That’s basically defining extension points that say what we additionally accept and where. + +We need to define extension points. + +MIH: Yes, XLIFF 1.2 defines so many extension points that tool makers extend it every which way to the point where they are incompatible. + +DAF: We need to be clear that the private extensions must not compete for functionality with the standard or the registry when adding their own private values. + +CLS: Another simple example: + +``` +color-ball = "You picked the {$color} ball." +color-toy = "You picked the {$color} toy." +``` + +Spanish: + +``` +red = {type: Color::Adjective, masculine-singular: rojo, feminine-singular: roja, ... } +color-ball = "Escogiste la pelota {$color : feminine-singular}." +color-toy = "Escogiste el juguete {$color : masculine-singular}." +``` + +MIH: Maybe: + +``` +color-toy = "Escogiste el juguete {$color, { grammatical-gender: masculine, grammatical-plural: singular} }." 
+``` + +`$` in front of color means reference (Fluent style), but it is not a proposed syntax, just to exemplify. + +STA: Message references create an implicit API, and what I’m trying to solve is preventing someone from breaking this API. + + +RCA: Should we bring this to the plenary and discuss there? + +EAO: Let’s do the Apache voting on this topic. + +Voting on allowing meta-information passed with message references to the message being referenced and having build-time validation (see discussion above): + +``` +EAO: +1 +ZIB: +0.8 +CLS: +0.8 (as long as it has standardized category names) +NIC: +0.9 (as long as it’s generic and extensible) +LHS: +0.5 +DAF: +0.9 (as long as the validation is integral and enforced) +RCA: +0.9 +MIH: +0.9 +ECH: +0.9 +STA: +0.9 (with validation) +``` + +EAO: Also, the type of variable being passed (via a message reference to the referenced message) could be numeric. + +LHS: What’s the strawman argument against this? + +EAO: Maybe this means that the messages are authored more verbosely. + +MIH: Or that we pass this information freeform. + +EAO: And we support the inherent complexity of the problem via build-time checking. (Might just add verbosity...actually will reduce complexity.) + +STA: One thing that is not a con, but a cost, is that we are imposing English vocabulary on grammatical terms. + +MIH: Our standard can specify what fields should be included in the meta information passed with message references, but it doesn’t need to specify what happens if the user does not specify all required fields. + +DAF: We can force compliance with the standard and registry, and put a SHOULD for private values. + +ZIB: Is there a general consensus that the tools give the users an option on whether to fail or proceed when validation fails? + +STA: This is represented in one of the design-principle GitHub issues, and there’s a spectrum of positions on the topic. 
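The build-time check that was just voted on, together with ZIB's question about failing versus proceeding, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `DECLARED` and `CALL_SITES` structures, the `validate` and `check` functions, the `strict` flag, and the `footer` call site with its `locative` value are all hypothetical and are not part of any proposal recorded in these minutes.

```python
# Hypothetical data model: each referenceable message declares the
# parameters it accepts and the values allowed for each parameter.
DECLARED = {
    "brand-name": {"case": {"nominative", "genitive"}},
}

# Hypothetical call sites gathered from other messages:
# (calling message, referenced message, arguments passed with the reference).
CALL_SITES = [
    ("about", "brand-name", {"case": "genitive"}),
    ("footer", "brand-name", {"case": "locative"}),  # invented mismatch
]


def validate(declared, call_sites):
    """Return a list of mismatch descriptions; an empty list means all call sites match."""
    problems = []
    for caller, target, args in call_sites:
        params = declared.get(target)
        if params is None:
            problems.append(f"{caller}: unknown message '{target}'")
            continue
        for name, value in args.items():
            if name not in params:
                problems.append(f"{caller}: '{target}' has no parameter '{name}'")
            elif value not in params[name]:
                problems.append(
                    f"{caller}: '{target}' has no variant '{value}' for '{name}'"
                )
    return problems


def check(declared, call_sites, strict=True):
    """Fail the build in strict mode; otherwise hand the problems back as warnings."""
    problems = validate(declared, call_sites)
    if problems and strict:
        raise ValueError("; ".join(problems))
    return problems
```

With `strict=True` the mismatched `footer` reference stops the build, while `strict=False` returns the same list so that a tool could report warnings and proceed. Which of the two should be the default is the open question raised above.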
+ +ECH: I think that could be an implementation detail, but I agree that it’s nice to have, and implementers giving users that option between warning and failing would be nice. + +NIC: How will linguists handle variables that are managed in registries, especially for big datasets? Are we expecting them to have access to variable values during translations and if so, I presume we would expect TMSes to implement this new standard for this to work? + +MIH: Yes… The metadata stays the same no matter how big the dataset is. Things like grammatical case / gender / number don’t depend on how many items we apply them to. But + +STA: If we make this lenient and the parameters don’t match at runtime, then it would be useful for the parent message to know that the parameters don’t match. + +MIH: An example I had was `hello {$username}`, and in Polish the case of the `$username` changes to vocative, but what happens when there is no vocative form of the name available? Can the translator have a default option that doesn’t use the placeholder at all? + +DAF: I agree, we need an exception payload. + +EAO: Do we need the referenced messages to return not only the value (string, array, etc.) but also the case/variant that was chosen, so that the referencing message has a chance to react? + +STA: There needs to be 2-way communication. I agree, EAO. + +EAO: Given the mini-consensus that the task force has come to on these topics, I would be okay with not allowing internal selectors in MF 2.0. + +RCA: We can bring these issues to the next plenary meeting. I would like to ask for another volunteer to bring this issue about characterizing meta-information passed with a message reference. + +MIH: Let’s summarize our discussion. + +LHS: Also, ZIB and MIH had a discussion that they want to share the notes for. + +ZIB: Yes, I have the notes written up. + +MIH: So we’ll move fast, but let’s summarize what we have right here. + +RCA: MIH, do you want to have another task force meeting on this? 
+ +MIH: For message references + metadata, we can take it to the plenary meeting without further discussion. Not yet for internal selectors. + +RCA: We should define the terminology of build time & runtime; the place for that is [#126](https://github.com/unicode-org/message-format-wg/issues/126) \ No newline at end of file