wikiteam3 v4 release #176
* strip and unquote url
* Convert domain to IDNA
* Remove port `:80` and `:443` if `http://` and `https://` respectively
* remove last slash if exists
* truncate to last slash
* remove "/any.php" suffix (`r"(/[^/]+\.php)"`)
* remove ~ tilde
* slugify the url path if `ascii_slugify` is True
* replace port(`:`) with underscore(`_`)
* lower case
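The steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the names `url_to_prefix` and `_ascii_slug` are hypothetical, the slugifier is a simple stand-in, and reading "remove last slash / truncate to last slash" as two alternatives is an assumption.

```python
import re
import unicodedata
from urllib.parse import unquote, urlparse


def _ascii_slug(segment: str) -> str:
    """Stand-in slugifier (assumption): ASCII-fold, then turn other runs into '_'."""
    folded = unicodedata.normalize("NFKD", segment).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z0-9]+", "_", folded).strip("_")


def url_to_prefix(url: str, ascii_slugify: bool = True) -> str:
    """Hypothetical helper: derive a filename-friendly prefix from a wiki URL."""
    # strip and unquote the url
    parts = urlparse(unquote(url.strip()))
    scheme = parts.scheme.lower()

    # convert the domain to IDNA
    host = (parts.hostname or "").encode("idna").decode("ascii")

    # drop :80 for http:// and :443 for https://
    port = parts.port
    if (scheme, port) in {("http", 80), ("https", 443)}:
        port = None
    netloc = host if port is None else f"{host}:{port}"

    path = parts.path
    if path.endswith("/"):
        # remove the last slash if it exists
        path = path[:-1]
    else:
        # otherwise truncate to the last slash
        path = path[: path.rfind("/")] if "/" in path else ""

    # remove an "/any.php" suffix, then tildes
    path = re.sub(r"(/[^/]+\.php)", "", path)
    path = path.replace("~", "")

    # slugify each path segment if requested
    segments = path.split("/")
    if ascii_slugify:
        segments = [_ascii_slug(s) for s in segments]
    path = "_".join(segments)

    # replace the port separator with an underscore and lower-case everything
    return (netloc.replace(":", "_") + path).lower()
```

Under these assumptions, `url_to_prefix("https://Example.org:443/w/api.php")` gives `example.org_w`, and a non-default port such as `:8080` survives as `_8080` in the prefix.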
pre-bump 4.0.0
WIP: deprecate launcher
WIP: refactor uploader
WIP: rename function names from camelCase to snake_case
WIP: default args
WIP: feat: `--noverify-image-size`
WIP: enable `--xmlrevisions --curonly` support
TODO: feat: save `Special:Log` wiki page
break change: image: change the magic NULL string from `False` to `null`
chore: remove deprecated alias for xml.etree.ElementTree
TODO: feat: cli: add an option to disable random UserAgent
TODO: feat: Incremental image dump powered by archive.org (experimental)
WIP: feat: add option to dump recent images only (with a custom cut-off date)
Reasons:
* if the limit is a float between 0 and 1, the MW backend will force-int it to 0
* to loop over all the revisions, we need to retrieve at least 2 at a time
* if the historical revisions cannot be retrieved even with limit=2, it should fail at this point so that the user can try `--curonly`
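A minimal sketch of the rule these reasons imply: keep the revision limit an integer, never let it drop below 2, and fail once even 2 does not work. The function name `next_revision_limit` and the halving back-off are assumptions for illustration only; the commit message does not say how the limit is reduced.

```python
def next_revision_limit(current: int, floor: int = 2) -> int:
    """Hypothetical back-off step for a revision limit after a failed request.

    Integer division keeps the value an int, so it can never become a float
    between 0 and 1 that the MediaWiki backend would coerce to 0.
    """
    if current <= floor:
        # Already at the minimum that still lets us page through the history.
        raise RuntimeError(
            "revisions cannot be retrieved even with limit=2; try --curonly"
        )
    # Halving is an assumed strategy; the floor of 2 comes from the reasons above.
    return max(floor, current // 2)
```

With this sketch, a limit of 50 backs off to 25, a limit of 3 backs off to 2, and a further failure at 2 raises instead of retrying.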
<element name="parentid" type="positiveInteger" minOccurs="0" maxOccurs="1"/>
The last titlelist may not be yielded
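For readers unfamiliar with this class of bug, here is a generic sketch of how a batching generator can lose its final, partially filled batch and how flushing after the loop avoids it. It illustrates the general pattern only; `batched_titles` is a hypothetical name and this is not the project's code.

```python
from typing import Iterable, Iterator, List


def batched_titles(titles: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield titles in lists of at most `size` items, including the last partial one."""
    batch: List[str] = []
    for title in titles:
        batch.append(title)
        if len(batch) == size:
            yield batch
            batch = []
    # Without this final flush, a trailing batch shorter than `size`
    # would silently never be yielded.
    if batch:
        yield batch
```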
deprecate: `.desc` xmlfile support
feat: cli: `--disable-image-verify`
…ed in the given time interval close: mediawiki-client-tools#151
pre-bump: version 4.0.0
feat: cli: add `--user-agent` to custom UA
feat: `--ia-wbm-booster`
feat: write .mark file after dumped
refactor: SessionMonkeyPatch
feat: image: dynamic load config.delay
feat: cli: `--upload`: Upload wikidump to Internet Archive after successfully dumped
bump 4.0.14
(new MediaWiki versions)
Point of Conflict.
Several things:
This is the sort of thing where I will defer to everyone else. In general I would say that anything breaking backwards compatibility should be dependent on implementation of build-versioning that would allow users to reliably target older versions. Right now this repository does not even have version-tagged GitHub builds, let alone versioned PyPI builds, so at this point breaking backwards compatibility is a no-go. As for drastically refactoring the code… that's fine, and probably for the better, as long as there is build-versioning in place to protect existing users.
Regarding format compatibility: I think introducing a new default format is fine as long as the existing upstream format continues to be supported alongside it for a substantial "bridge" period, with (a) an ability to convert existing dumps to the new format, and (b) strong "deprecation" nudges encouraging users to migrate. Refactoring (and abstracting much of the backend) could of course facilitate this, which is why I'm supportive of refactoring more generally. Basically what I'm saying is that introducing a new data format should be dependent on first establishing a stable public API for the backend, which currently does not exist.
This is too complex for my level of comprehension - except that this PR for wikiteam3 is not in that repository.
Here's my thought… @yzqzss why don't you open a new Pull Request from an earlier commit on this branch? (I think you have to create a branch from that commit in order to do so.) This would be much easier to approach if it weren't a gigantic total rewrite all at once, and breaking it into chunks this way would help. If you're not interested in doing this, to be fair, we could try to do so ourselves, but you're more familiar with your own code than we are.
Fixes
feats
`--disable-image-verify` (Some images get dropped from the dump #170)
`--user-agent` to custom UA
`--ia-wbm-booster`
Incremental image dump powered by web.archive.org (experimental)
refactor
countless
...
drop legacy code
Special:Export
Breaking changes
`False` to `null`
`.desc` xmlfile support
Drop launcher
Shifts compression responsibilities from the launcher to the uploader.
dependencies
refactor uploader
`--parallel` to disable it)
https://pypi.org/project/wikiteam3/