File Format #47
Comments
I think there are 4 related concepts:
When you say file format agnostic, do you mean (3) or (4)? |
I am talking about the format of the file where you store the key/values (strings) which may or may not contain MF (or other) syntax - if I understand correctly this would be (4)? But I'm starting to realize there might be a part missing in the conversation here (related to this thread as well). From my perspective, there are two key concepts that are part of the localization chain - everything else is more on the implementation side:
In my view of the simplest solution, you can use any file format to store your strings and use the syntax or not, as you see fit. The syntax is simple and can be translated seamlessly using modern continuous localization platforms, without the need of import/export scripts. This is the part I think I will try to clarify because I'm not sure how everyone is doing localization today, but to me, having conversion scripts, or build scripts could be avoided by keeping everything simple enough - if this is even possible with the extended scope. All I can say is that this is already possible today :-) |
From all I see in the discussions of various features, we seem to want to eat the cake and have it too:
I like both of these bullets. |
As far as I can see, the mix of requirements and features, and the pairing of form and function, must be preserved whatever file type or format we end up with. I see a strict dependency on a well-defined Data Model whose complexity will match the number of features we put in the bag. Tackling this part of the definition first is very important, and it will consequently drive the format or formats. |
Fair enough - the more I learn about languages the more I realize how much variety there is (funny example here: https://en.wikipedia.org/wiki/Boustrophedon). But ultimately we are trying to build a useful solution. If the file type is not widely supported, or if the format is too complex, this could reduce the usefulness of the solution because of adoption barriers. I think a good exercise would be to stack rank all the linguistic features we would need, each with a clear real-life use case; this way we will be well aware of why we would make certain trade-offs. |
As an observation, based on the likely outcome of #103 to support top-level selectors with multiple input variables, this is going to create a desire for a human-accessible file format that can support lists of strings as item keys. As far as I know, the only widely-used format that does allow for that is YAML. Reformatting some of the examples given there by @echeran, @mihnita and myself, here's how they could be expressed (using slightly variable selector function specifications):

plain-message: Do we allow multiple multi-select messages to nest inside one another?
profile-likes:
  select: [ PLURAL(friendsNum), PLURAL(countriesNum), GENDER(user) ]
  cases:
    [ one, one, masculine ]: ${friendsNum} friend from ${countriesNum} country liked his profile.
    [ one, one, feminine ]: ${friendsNum} friend from ${countriesNum} country liked her profile.
    [ one, one, other ]: ${friendsNum} friend from ${countriesNum} country liked their profile.
    [ one, other, masculine ]: ${friendsNum} friend from ${countriesNum} countries liked his profile.
    [ one, other, feminine ]: ${friendsNum} friend from ${countriesNum} countries liked her profile.
    [ one, other, other ]: ${friendsNum} friend from ${countriesNum} countries liked their profile.
    [ other, one, masculine ]: ${friendsNum} friends from ${countriesNum} country liked his profile.
    [ other, one, feminine ]: ${friendsNum} friends from ${countriesNum} country liked her profile.
    [ other, one, other ]: ${friendsNum} friends from ${countriesNum} country liked their profile.
    [ other, other, masculine ]: ${friendsNum} friends from ${countriesNum} countries liked his profile.
    [ other, other, feminine ]: ${friendsNum} friends from ${countriesNum} countries liked her profile.
    [ other, other, other ]: ${friendsNum} friends from ${countriesNum} countries liked their profile.
deleted-files:
  select: [ file_count:plural, dir_count:plural ]
  cases:
    [ =1, =1]: You deleted one file in one folder!
    [ =1, other]: You deleted one file in ${dir_count} folders!
    [other, =1]: You deleted ${file_count} files in one folder!
    [other, other]: You deleted ${file_count} files in ${dir_count} folders!
listed-items:
  select: count
  cases:
    one: Listing one item
    other: Listing ${count} items

My suspicion is that going with any other choice than YAML would require us to spend more time defining and building tooling for selector syntax, only to arrive at some custom solution that's going to look really similar. One limitation that YAML does impose is that plain unquoted scalars can't start with the As a further observation, using an externally defined format like YAML can also be thought of as an argument for not allowing in-message selectors, given that we then don't need to define a syntax for them. |
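To make the "lists of strings as item keys" idea concrete, here is a minimal Python sketch of how a consumer might resolve the deleted-files message once it has been loaded into memory. Everything in it is an assumption for illustration: `plural_en` and `format_message` are hypothetical helpers, the plural rule is a two-category English simplification, list keys are modelled as tuples, and the `=1` keys from the example are written as `one` to keep the lookup trivial.

```python
from string import Template


def plural_en(n):
    """Simplified English plural category (assumption: only 'one' and 'other')."""
    return "one" if n == 1 else "other"


# In-memory form of the deleted-files message; each list key becomes a tuple.
DELETED_FILES = {
    "select": ["file_count", "dir_count"],
    "cases": {
        ("one", "one"): "You deleted one file in one folder!",
        ("one", "other"): "You deleted one file in ${dir_count} folders!",
        ("other", "one"): "You deleted ${file_count} files in one folder!",
        ("other", "other"): "You deleted ${file_count} files in ${dir_count} folders!",
    },
}


def format_message(message, **args):
    """Pick the case matching every selector, then fill in the placeholders."""
    key = tuple(plural_en(args[name]) for name in message["select"])
    pattern = message["cases"][key]
    return Template(pattern).substitute(args)


print(format_message(DELETED_FILES, file_count=1, dir_count=3))
print(format_message(DELETED_FILES, file_count=5, dir_count=1))
```

The point of the sketch is only that a multi-selector message reduces to a single lookup on a composite key, which is why a format that allows sequences as mapping keys fits the problem so directly.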
@eemeli I really dislike the syntax where one has to list all permutations of argument variations manually, it's so easy to miss some this way. |
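The risk of missing a permutation could in principle be caught by tooling rather than by eye. A rough Python sketch, assuming the cases are available as tuple keys and the categories for each selector are known up front (`missing_cases` is a hypothetical helper, not part of any existing library):

```python
from itertools import product


def missing_cases(categories_per_selector, cases):
    """Return every selector combination that has no entry in `cases`."""
    expected = set(product(*categories_per_selector))
    return sorted(expected - set(cases))


# Two plural selectors and one gender selector, as in profile-likes.
categories = [["one", "other"], ["one", "other"], ["masculine", "feminine", "other"]]

# A deliberately incomplete set of case keys: (other, other, feminine) is absent.
cases = {
    key: "..."
    for key in product(*categories)
    if key != ("other", "other", "feminine")
}

print(missing_cases(categories, cases))
```

A check like this does not remove the verbosity of listing every combination, but it would turn a silent omission into a reportable error.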
Since we target a data model, not a file format, this is probably out of scope. And if we need to design something at this level, I would rather go with a syntax, not a file format. One of the main benefits of the "not-a-file-format" approach is that you can store the strings in whatever format is native for the tech stack used. One might say: who cares, a file format is just a file format, you can mix and match. But that is not so. You then take on the burden of either migrating all strings to the new file format (and updating the code that loads the strings), or living with a mixture (most strings stay in strings.xml, but the plural, gender and inflection strings go in strings.mf2). This is why I don't even care about the file format issue. |
I am closing this issue because we have resolved that resource (file) formats are out of scope. The ABNF includes some features (related to whitespace handling) to help implementations of various resource formats, but we're otherwise agnostic. If you think this should be re-opened, please consider opening new issues with specific requests/requirements against the syntax or specification indicating what in-scope features are needed (e.g. to support various resource syntaxes). |
Is your feature request related to a problem? Please describe.
To store strings, we need to support at least one file format. Message Format is currently agnostic of file formats, which means that some topics like context (see issues 39 and 40) should either be considered upfront or as a separate topic.
Describe the solution you'd like
I would keep the new standard file format agnostic (but would be curious what others think).
Describe why your solution should shape the standard
By not requiring a new file format I see the following benefits:
Additional context or examples
Most TMSes already support 20+ file formats, and while I agree that Fluent's current file format seems like the most appropriate for localization, wide adoption can take time and other formats are already widely used. There are already a lot of formats that can support context and that are well supported across languages. The most popular formats I have seen: