-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Icu 63.1 min static data #666
Conversation
… number of locals and a few datasets such as locale units and currency
nice pull request number! 👿 |
a0eb02a525822ea5ab409e04cf280bacccbb5d4d \ | ||
data | ||
|
||
mason_extract_zip -d ${MASON_NAME}/source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prior to the SLUG_SUFFIX
this function would have zapped the cache for the previous download.
fi | ||
} | ||
|
||
function trim_data { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is what "patches" the ICU data files so that we only build some of the datasets and only for some of the locales. i could have used a patch file but ICU docs say to copy files and modify those so I regex'd my way there instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
./configure ${MASON_HOST_ARG} --prefix=${MASON_PREFIX} \ | ||
$(icu_debug_configure) \ | ||
$(cross_build_configure) \ | ||
--with-data-packaging=static \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the flag that makes it so that the data is included in the static library itself as opposed to a separate file
|
||
MASON_NAME=icu | ||
MASON_VERSION=63.1-min-static-data | ||
MASON_LIB_FILE=lib/libicuuc.a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@springmeyer can one put more than one library on this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used to check that the install worked. It only supports a single file and can be anything that exists after the build completes. So, the way you have it here is 👍. No need (or ability) to point to multiple files and not pointing to the other libraries is okay - they will still get installed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes looks good, but I'll defer the approval to @springmeyer.
|
||
# Using uint_least16_t instead of char16_t because Android Clang doesn't recognize char16_t | ||
# I'm being shady and telling users of the library to use char16_t, so there's an implicit raw cast | ||
ICU_CORE_CPP_FLAGS="-DU_CHARSET_IS_UTF8=1 -DU_CHAR_TYPE=uint_least16_t" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinkreiser this U_CHAR_TYPE
customization (and the below -DUCHAR_TYPE=char16_t
in the mason_cflags
) was done by @ChrisLoer for MBGL and may not benefit your usecase, so I would consider removing it unless you specific think it is helpful for your usecase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, mbgl-core is no longer using mason for ICU, so totally fine for the mason build to stop using the U_CHAR_TYPE
customization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed, we looked at vendoring as well (which i think we want to do long term) but decided against biting the bullet of figuring out how to vendor the data into the lib (so we can ship without having to bundle the data separately)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinkreiser this is fantastic work. Awesome job getting the data library re-packing working. However I'm requesting changes here since I think the MASON_SLUG_SUFFIX
should:
- Land in a different PR if we want to support this
- For this PR I think it would be simpler to just download the
http://download.icu-project.org/files/icu4c/63.1/icu4c-63_1-data.zip
manually with curl (rather than usemason_download
and running into the cache issue)
So, how about that? You revert the MASON_SLUG_SUFFIX
work and just download the data.zip
directly? That would avoid us needing to worry about potential breakages with other packages and would isolate this particular nuance (of needing to download a second package) to just the ICU package.
@springmeyer your comments should now be addressed. can i get another look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. I think this is ready for ./mason trigger
which will start giving you feedback about which platforms are working and which are not.
This is a build of ICU with
-Os
and with the data packaging set to put the data in a static lib. This also removes a significant amount of the data. It narrows down the datasets to just:locales
,unit
,currency
and a couple basic things inmisc
. Additionaly for those datasets the only locales supported are:With all of these changes the total data size of the data library is down from 26MB to 3.1MB. Which is quite convenient for cross platform libraries that have to ship the same code to a variety of platforms.
There was a slight issue with the standard download/unpacking workflow. ICU does not ship the "code" used to build the data in the same targz as the code. Because of this the package must download the code and the data-code and splice them together (to get the static library with data built-in). What I've done is updated the downloading function with the ability to add a suffix to the file you downloaded. This allows the unpacking functions to then use the same suffix and thefore avoids destroying the cache of the other files that were downloaded for the package in question. Because of this change I am wondering if I should update the mason version from
0.18.0
to0.19.0
or if even just new packages in mason also obviate the version update? @springmeyer or maybe @brunoabinader can you advise?