All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning as of 2.0.0.
### Added
- Add support for `--debug` flag to output debug logs
- Insert as few rsync URLs as possible in DB when a book selection is made (#220)
- Replace usage of os.path and path.py with pathlib.Path (#195)
- Simplify the logger name (use `gutenberg2zim` instead of `gutenberg2zim.constants`) (#206)
- `Publisher` ZIM metadata can now be customized at CLI (#210)
- `Publisher` ZIM metadata default value is changed to `openZIM` instead of `Kiwix` (#210)
- Do not fail if temporary directory already exists (#207)
- Typo in `Scraper` ZIM metadata (#212)
- Adapt to hatchling v1.19.0 which mandates packages setting (#211)
- Fixed regression with broken filters on multiple-languages ZIM (#175)
- Fixed `Name` metadata that was incorrectly including period (#177)
- Fixed `Language` metadata (and filename) for multilang ZIMs (#174)
- Using zimscraperlib 2.1.0
- Using localized Title and Description metadata (#148)
- Fixed regression with epub files stored as `application/zip` (#181)
- Adopt Python bootstrap conventions, especially migration to hatch instead of setuptools and GitHub CI Workflows adaptations (#190)
- Removed inline JavaScript in HTML files (#145)
- Support single quotes in author names (#162)
- Migrated to another Gutenberg server (#187)
- Removed useless file languages_06_2018 (#180)
- Removed Datatables JS code from repository, fetch online now (#116)
- Dropped Python 2 support (#191)
- Progress report using `--stats-filename`
- Updated dependencies, including zimscraperlib (2.0)
- Now creating no-namespace ZIM with Illustration
- Fixed/reduced sqlite timeouts
- Better handling of rsync'd list of URLs
- RDF files are not extracted to disk anymore (faster on selections)
- Remove all URLs from DB before processing rsync'd ones
- Fixed --concurrency short flag (now `-c`)
- Docker image now uses python3.11
- DB doesn't use a separate Format table anymore
- Dependency to zimwriterfs binary.
- `-r`/`--rdf-folder` flag: rdf not extracted to disk anymore
- `--export`: HTML files not written to disk first anymore
- `--dev`: idem
- Binaries from docker images: jpegoptim, pngquant, gifsicle, zip, curl, p7zip
- Added Portuguese translation
- Changed mirror used as aleph doesn't contain all files anymore
- Changed links to accommodate zimwriterfs 2.1.0-2 (#144)
- Using --zstd option with zimwriterfs
- removed duplicate dependencies
- Added tag _category:gutenberg which was missing
- docker-only release with updated zimwriterfs (2.1.0-1)
- simplified home page results on smaller screen sizes
- added bookshelves mode option
- added title search option (doesn't scale!)
- fixed setup_urls on macOS
- add s3 based optimization cache
- do not allow checking other URLs if downloaded from s3
- remove book from DB if not downloaded in any format
- better handling of downloaded/optimized files
- More informative and better displayed logs
- switched to python3.6+
- docker image now based on bionic
- docker image only writes onto /output
- using zimwriterfs 1.3.10-4
- using zimscraperlib
- fixed some articles missing titles
- zimwriterfs error on all-langs now stops whole process
- safer extraction of rdf-files.tar
- Fixed broken setup.py (moved LICENSE file)
- Added changelog
- fixed running on macOS
- fixed running with PY3
- defaulting to PY3 on Docker
- Added ability to set an output folder for --one-language-one-zim
- removed unused -m parameter
- Added --tags
- Added --scraper
- Harmonized ZIM name and filename with other projects
- Removed format list in filename, title and description if all formats selected
- Changed dockerfile to use source instead of pypi
- Fixed --one-language-one-zim not completing
- Fixed IntegrityError on thread collision
- fixed python3 compatibility
- cleaned-up code (tab/space mix)
- initial version