In the current implementation of the WAT extractor, the WARC-Filename in the WAT warcinfo record corresponds to the filename of the original (W)ARC record.
According to the WARC ISO standard, it should be the WAT filename itself.
Current:
WARC/1.0
WARC-Type: warcinfo
WARC-Date: 2015-02-18T10:24:54Z
WARC-Filename: BnF-6224-50-20150218094547-00001-ciblee_2015_menelas2.bnf.fr.warc.gz
WARC-Record-ID: urn:uuid:97a37ea9-1af4-4c47-8ae0-5515428347aa
Content-Type: application/warc-fields
Content-Length: 73
Target:
WARC/1.0
WARC-Type: warcinfo
WARC-Date: 2015-02-18T10:24:54Z
WARC-Filename: BnF-6224-50-20150218094547-00001-ciblee_2015_menelas2.bnf.fr.warc.wat.gz
WARC-Record-ID: urn:uuid:97a37ea9-1af4-4c47-8ae0-5515428347aa
Content-Type: application/warc-fields
Content-Length: 73
Implementation:
java -jar extractor.jar -wat fichierA.warc.gz --> output goes to standard output
WARC-Filename:
fichierA.warc.gz => fichierA.warc.wat.gz
fichierA.arc.gz => fichierA.arc.wat.gz
fichierA.warc => fichierA.warc.wat
fichierA.arc => fichierA.arc.wat
java -jar extractor.jar -wat fichierA.warc.gz fichierB.wat.warc.gz --> output goes to the file fichierB.wat.warc.gz
WARC-Filename: fichierB.wat.warc.gz
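The naming rules above could be sketched roughly as follows. This is only an illustration of the proposed mapping, not the extractor's actual code: the class name WatFilename and the methods fromInput and resolve are hypothetical, and passing null to mean "no output file given" is an assumption made for the example.

```java
// Hypothetical helper illustrating the proposed WARC-Filename rules.
// None of these names come from the extractor itself.
final class WatFilename {
    private WatFilename() {}

    // Standard-output case: derive the WAT name from the input (W)ARC name.
    // A trailing ".gz" is kept last, so ".wat" is inserted in front of it;
    // otherwise ".wat" is simply appended.
    static String fromInput(String inputName) {
        String gz = ".gz";
        if (inputName.endsWith(gz)) {
            String stem = inputName.substring(0, inputName.length() - gz.length());
            return stem + ".wat" + gz;
        }
        return inputName + ".wat";
    }

    // If an output file was given on the command line, its name is used as-is.
    // Assumption: outputName == null means the output goes to standard output.
    static String resolve(String inputName, String outputName) {
        return outputName != null ? outputName : fromInput(inputName);
    }

    public static void main(String[] args) {
        System.out.println(resolve("fichierA.warc.gz", null));                   // fichierA.warc.wat.gz
        System.out.println(resolve("fichierA.arc", null));                       // fichierA.arc.wat
        System.out.println(resolve("fichierA.warc.gz", "fichierB.wat.warc.gz")); // fichierB.wat.warc.gz
    }
}
```

Keeping the explicit output name untouched means the WARC-Filename always matches the file actually written, even when that name does not follow the .warc.wat.gz pattern.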