fread empty files #22

Closed
brry opened this issue Jul 20, 2020 · 3 comments

Owner

brry commented Jul 20, 2020

readDWD has the argument fread to read datasets through data.table::fread, which is significantly faster than base unzip+read.table.
Early on (2017), the default was fread=NA (i.e. conditional on availability of data.table).
Some users emailed me about errors on Windows, so I changed the default to FALSE (272b947).
For speed, it would be nice to have NA as the default again.

The default will be set back to NA experimentally to see whether new complaints arise.
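
To illustrate what "conditional on availability of data.table" could mean for a tri-state argument, here is a minimal sketch. The helper name `resolve_fread` is hypothetical and is not taken from rdwd; the actual logic in readDWD may differ.

```r
# Hypothetical helper (not rdwd code): turn fread = TRUE / FALSE / NA
# into a plain TRUE or FALSE.
resolve_fread <- function(fread = NA) {
  if (is.na(fread)) {
    # NA: use data.table::fread only if data.table is installed
    return(requireNamespace("data.table", quietly = TRUE))
  }
  isTRUE(fread)
}

resolve_fread(FALSE)  # always FALSE
resolve_fread(NA)     # TRUE only on machines with data.table installed
```

Note that this only checks for the R package. It does not check whether a system `unzip` command exists.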

brry added a commit that referenced this issue Jul 20, 2020
Owner Author

brry commented Jul 20, 2020

Here's the error structure reported by at least 2 people independently:

Error in data.table::fread(paste("unzip -p", f, fp), na.strings = na9(), : File is empty: [Rtempfile]
In addition: Warning messages:
1: running command 'C:\Windows\system32\cmd.exe /c (unzip -p [path].zip [pfile].txt) > [Rtempfile]' had status 1
2: In shell(paste("(", input, ") > ", tt, sep = "")) : '(unzip -p [path].zip [pfile].txt) > [Rtempfile]' execution failed with error code 1

Here, [pfile] is the produkt file inside the zip archive, e.g. produkt_klima_tag_19370101_19860630_00001.txt
for the archive DWDdata/daily_kl_historical_tageswerte_KL_00001_19370101_19860630_hist.zip,
or produkt_klima_tag_20160225_20170827_03987.txt for DWDdata/daily_kl_recent_tageswerte_KL_03987_akt.zip

brry added the question label Jul 20, 2020
Owner Author

brry commented Jul 21, 2020

For 10-minute data, there are 64-bit formatted columns that need the package bit64 if fread=TRUE/NA.
See 5a3f38e
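
A possible early check for this dependency is sketched below. The helper `check_bit64` is hypothetical and is not the code from the commit above. It assumes the relevant columns are read by fread as `integer64`, which is the class provided by bit64.

```r
# Hypothetical helper (not rdwd code): warn early if bit64 is missing
# while fread is going to be used.
check_bit64 <- function(use_fread) {
  ok <- !isTRUE(use_fread) || requireNamespace("bit64", quietly = TRUE)
  if (!ok) {
    warning("Package 'bit64' is not installed; 64-bit columns read by ",
            "fread may not display correctly.", call. = FALSE)
  }
  invisible(ok)
}

check_bit64(use_fread = FALSE)  # nothing to check, returns TRUE invisibly
```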

Owner Author

brry commented Aug 24, 2020

A new error message was reported in #23, but I think it's the same problem.

'unzip' is not recognized as an internal or external command, operable program or batch file.
'(unzip -p [path].zip [pfile].txt) > [Rtempfile]' execution failed with error code 1
File '[Rtempfile]' has size 0. Returning a NULL data.frame. File contains no rows: [path].zip

It seems the system command unzip is not available by default on Windows (source).
On my computer, it is found in Rtools:
Sys.which("unzip") gives me c:\\Rtools\\bin\\unzip.exe
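
Given that, one way to guard against the error is to test for `unzip` before taking the fread route and otherwise read through base R's `unz()` connection, which does not need a system command. This is only a sketch: `has_unzip` and `read_zip_member` are hypothetical names, not rdwd functions, and the semicolon separator and header row are assumptions about the file layout.

```r
# Hypothetical helpers (not rdwd code).

# TRUE if a system 'unzip' executable is found on the PATH
has_unzip <- function() unname(nzchar(Sys.which("unzip")))

# Read one file from a zip archive, using fread only when it can work
read_zip_member <- function(zipfile, member, use_fread = has_unzip()) {
  if (use_fread && requireNamespace("data.table", quietly = TRUE)) {
    cmd <- paste("unzip -p", shQuote(zipfile), shQuote(member))
    # data.table = FALSE returns a plain data.frame, like the fallback
    return(data.table::fread(cmd = cmd, data.table = FALSE))
  }
  # Base R fallback: unz() opens the archive member directly
  # (sep = ";" and header = TRUE are assumptions about the data)
  read.table(unz(zipfile, member), header = TRUE, sep = ";")
}
```

With this shape, a missing `unzip` leads to the slower base R path instead of the "File is empty" error.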

brry added a commit that referenced this issue Aug 24, 2020
brry closed this as completed Aug 25, 2020
brry added a commit that referenced this issue Sep 21, 2020
…w section on homepage. Related again to issue #22.