pandoc error with pdf when .Rmd is on network file share (NFS) #701

Open
GeorgeTomlinson opened this issue May 19, 2016 · 25 comments

@GeorgeTomlinson

I am on OS X El Capitan and have a network file share mounted using sshfs.

When I run rmarkdown::render() (or use the menu item Knit -> PDF in RStudio) on a .Rmd file on my local drive, it works fine. When the .Rmd is located on the file share, it generates a PDF but terminates with the message below. This in itself is usually not a problem, but there are two side effects:

(1) The PDF does not open in the internal viewer in RStudio. I have to open the PDF directly with another application.

(2) When I use the "keep_tex: true" option in the preamble, I don't get a .tex file

One possible clue: path names to files and directories are not case-sensitive on my local drive (OSX) but are case-sensitive on the file share drive.

Error message:

pandoc: InterimReportMay18.pdf: hClose: invalid argument (Bad file descriptor)
Error: pandoc document conversion failed with error 1
Execution halted

@bweigel

bweigel commented Jul 8, 2016

I have a similar problem; however, my mount point is CIFS (running a Linux server and client):

/usr/bin/pandoc +RTS -K512m -RTS test.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures --output test.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template /home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/default.html --variable 'theme:bootstrap' --include-in-header /tmp/RtmpqFVGzL/rmarkdown-str12bf7b3a9f6c.html --mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --no-highlight --variable highlightjs=/home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/highlight
aborts with

pandoc: test.html: hClose: hardware fault (Input/output error)
Error: pandoc document conversion failed with error 1
Execution halted

However, I can get it to run from the terminal when I remove some of the optional parameters:

/usr/bin/pandoc +RTS -K512m -RTS test.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures --output test.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template /home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/default.html

No error, and the resulting HTML looks OK.
It would be neat if anyone had an idea how to fix this.

@stroobandt

stroobandt commented Aug 2, 2016

I ran into the same error using pandoc without RStudio on a remote Samba/CIFS drive.
.pdf: hClose: does not exist (Host is down)
Generating the PDF locally works.
The error reminds me of a clock skew error.
Using touch * did not resolve the problem though.

@cderv
Collaborator

cderv commented Aug 2, 2016

By the way, this issue was also reported to pandoc at jgm/pandoc#1326. They do not seem to consider it a pandoc issue.

I have the same error with html conversion from markdown on a network file share.

@stroobandt

Yup, I am convinced this is not an RStudio problem. It only got reported first by people using RStudio. We need more testing with producing pandoc output on remote drives.
Here, it seems to happen only when a large PDF output needs to be generated. Most of the time it does not work, but sometimes it does for the same file. Hence my intuition that timing might be involved.

@kevinushey
Contributor

kevinushey commented Aug 2, 2016

One way that rmarkdown could potentially handle this would be to have pandoc generate the resulting document in the same directory as the input document, and then copy that document back to the desired location. (We might already do that in some cases?)

That may work more reliably than having pandoc attempt to write files to remote drives.

@stroobandt

In my case, this would not be a solution; both the input and output file are in the same remote folder.
It would be better for pandoc to generate the output in some fail-safe local temporary directory (e.g. /tmp) and then copy the output file to the final remote destination directory on the network drive.

@kevinushey
Contributor

kevinushey commented Aug 2, 2016

Maybe an R_PANDOC_OUTPUT_DIR environment variable to specify where pandoc should attempt to generate its outputs, and then we could copy those rendered outputs to the desired final destination?

@jjallaire
Member

I realize it seems like this is an expedient solution to the problem at
hand, but anything to do with output directories in rmarkdown gets dicey
pretty quickly. That is because we've already got output_dir and
intermediates_dir arguments (I'd suggest looking at those to see if they
solve the problem) and any new options/features around directories have to
deal with the intersection of states created by those options.

On Tue, Aug 2, 2016 at 4:39 PM, Kevin Ushey wrote:

Maybe an R_PANDOC_OUTPUT_DIR to specify where pandoc should attempt to
generate its outputs, and then we could copy those rendered outputs to the
desired final destination?



@cderv
Collaborator

cderv commented Aug 3, 2016

Following your suggestion, I used intermediates_dir and output_dir to try to solve the issue with a standalone HTML document rendered from an Rmd. (I notice that if I do not want a standalone HTML file, there is no error.)


When I run rmarkdown::render with default intermediates_dir and output_dir

#> output file: Test_file.knit.md

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS Test_file.utf8.md --to html 
--from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash 
--output Test_file.html --smart --email-obfuscation none --self-contained --standalone 
--section-divs --table-of-contents --toc-depth 3 --variable toc_float=1 
--variable toc_selectors=h1,h2,h3 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 
--variable toc_print=1 --template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' 
--include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b55553a92.html --mathjax 
--variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --highlight-style tango 
--variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1 

#> pandoc: Test_file.html: hClose: does not exist (Host is down)
#> Erreur : pandoc document conversion failed with error 1

Test_file.html is created in the working directory, and so is the Test_file_files directory. However, pandoc does not seem to find the file and convert it to a self-contained HTML document.


I first set intermediates_dir to a local directory but left output_dir at its default, that is, the network drive where my Rmd file is located.
All intermediate files (md and png images) are placed on the local drive, but the HTML output file and its associated folder (before conversion to a standalone HTML file) are located on the network drive.
There is still an error with the pandoc conversion, as the files folder of the HTML document is not found.

Example for Test_file.Rmd and intermediates_dir = "~/TempRMD"

#> output file: /home/ruser01/TempRMD/Test_file.knit.md

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS /home/ruser01/TempRMD/Test_file.utf8.md --to html 
--from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash 
--output Test_file.html --smart --email-obfuscation none 
--self-contained --standalone --section-divs --table-of-contents 
--toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 
--variable toc_collapsed=1 --variable toc_smooth_scroll=1 
--variable toc_print=1 
--template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' 
--include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b7118b9a.html 
--mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' 
--highlight-style tango --variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1 

#> pandoc: Could not fetch Test_file_files/figure-html/Fig_1-1.png
#> Test_file_files/figure-html/Fig_1-1.png: openBinaryFile: does not exist (No such file or directory)
#> Erreur : pandoc document conversion failed with error 67

The error is now different. Test_file.html and Test_file_files directory are in my working directory (default output_dir) but setting intermediates_dir seems to make pandoc search elsewhere.


Example for Test_file.Rmd and intermediates_dir = "~/TempRMD" and output_dir = "~/TempRMD/doc"

#> output file: /home/ruser01/TempRMD/Test_file.knit.md

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS /home/ruser01/TempRMD/Test_file.utf8.md --to html 
--from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output /home/ruser01/TempRMD/doc/Test_file.html --smart 
--email-obfuscation none --self-contained --standalone --section-divs 
--table-of-contents --toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 
--variable toc_collapsed=1 --variable toc_smooth_scroll=1 
--variable toc_print=1 --template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' 
--include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b1896b739.html --mathjax 
--variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --highlight-style tango --variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1 

Output created: /home/ruser01/TempRMD/doc/Test_file.html

It works! However, everything is on my local drive and I have to move the output HTML file to my working directory on the network drive. Not perfect, but better than the annoying error.


These tests make me think that there is something odd with file paths between rmarkdown::render and pandoc, because I think that example 2, with just the intermediates_dir argument, should have worked. However, I do not know whether it is an rmarkdown::render issue or a pandoc issue.
I may try some pandoc command lines to check that.

I did not test with PDF output. If someone is willing to check whether intermediates_dir and output_dir work there too, that could be helpful.
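
The third test above (both intermediates_dir and output_dir on a local disk, then moving the result to the share) could be scripted roughly as follows. This is only a sketch: the function name render_rmd_locally is hypothetical, it assumes Rscript and the rmarkdown package are installed, and it copies only *.html files, which assumes a self-contained HTML output as in the tests above.

```shell
# Hypothetical wrapper: render with local intermediates_dir and output_dir,
# then copy the rendered HTML next to the .Rmd (e.g. on the network drive).
render_rmd_locally() {
  rmd=$1
  workdir=$(mktemp -d) || return 1
  outdir="$workdir/doc"
  mkdir -p "$outdir"

  # The R code reads the three paths from the trailing command-line arguments.
  rcode='a <- commandArgs(trailingOnly = TRUE); rmarkdown::render(a[1], intermediates_dir = a[2], output_dir = a[3])'

  if Rscript -e "$rcode" "$rmd" "$workdir" "$outdir" &&
     cp -f "$outdir"/*.html "$(dirname "$rmd")"/; then
    status=0
  else
    status=1
  fi
  rm -rf "$workdir"
  return "$status"
}
```

A call such as render_rmd_locally /path/on/share/Test_file.Rmd (the path is a placeholder) would then leave Test_file.html beside the Rmd while pandoc itself only writes to the local temporary directory.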

@stroobandt

stroobandt commented Aug 3, 2016

I am not (yet) an RStudio user.
I only commented on this bug because I wanted to make clear that this is a more general pandoc issue.
So I am not going to get into local RStudio specifics.

I resolved my issue by letting pandoc generate its PDF output in /tmp, before moving it back to the remote folder which contains the input file and the makefile.

More details on this solution can be found on the corresponding pandoc issue page.
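
For readers who want to try the same approach, a minimal shell sketch of "let pandoc write locally, then copy to the share" might look like this. The function name render_via_tmp and its two arguments are hypothetical and are not taken from the makefile mentioned above; the sketch assumes pandoc is on the PATH and that mktemp creates its directory on a local disk.

```shell
# Hypothetical helper: let pandoc write to a local temp dir, then copy the
# finished file to its final (possibly network-mounted) destination.
render_via_tmp() {
  input=$1   # source file, may live on the network share
  dest=$2    # final output path, e.g. on the share
  tmpdir=$(mktemp -d) || return 1
  out="$tmpdir/$(basename "$dest")"

  # pandoc only ever writes and closes a file on the local disk here.
  if pandoc "$input" --output "$out" && cp -f "$out" "$dest"; then
    status=0
  else
    status=1
  fi
  rm -rf "$tmpdir"
  return "$status"
}
```

It would be called as, for example, render_via_tmp report.md report.pdf from the remote folder that holds the input file (both file names are placeholders), so that relative paths in the document should still resolve from that folder.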

@stroobandt

jgm/pandoc#1326

@raubreywhite

raubreywhite commented Oct 27, 2016

This is my solution for Ubuntu (Windows will require different system calls):


RmdToDOCX <- function(inFile = "", outFile = "", tocDepth = 2, copyFrom = NULL) {
  # Optionally copy the input out of `copyFrom` into the working directory first.
  if (!is.null(copyFrom)) {
    prefix <- paste0(copyFrom, "/")
    # Literal prefix match, so special characters in copyFrom are not treated as regex.
    if (!startsWith(inFile, prefix)) {
      stop("inFile does not start with ", prefix,
           " and you are using copyFrom=", copyFrom)
    }
    localFile <- substring(inFile, nchar(prefix) + 1)
    file.copy(inFile, localFile, overwrite = TRUE)
    inFile <- localFile
  }
  try({
    # Render into the session's local temp directory, not the network share.
    outDir <- tempdir()
    originalOutFile <- outFile
    outFile <- basename(outFile)
    rmarkdown::render(
      input = inFile, output_file = outFile, output_dir = outDir,
      output_format = rmarkdown::word_document(toc = TRUE, toc_depth = tocDepth)
    )
    # Replace any previous report at the destination with the fresh one.
    # Paths are quoted so that spaces do not break the system calls.
    target <- file.path(getwd(), originalOutFile)
    cmd <- paste("rm -f", shQuote(target))
    system(cmd)
    print(cmd)
    cmd <- paste("cp -f", shQuote(file.path(outDir, outFile)), shQuote(target))
    system(cmd)
    print(cmd)
  }, TRUE)
  # Clean up the temporary copy of the input, if one was made.
  if (!is.null(copyFrom)) {
    file.remove(inFile)
  }
}


RmdToDOCX(
  inFile = "RunWP2.Rmd",outFile = paste0("reports_formatted/WP2_",format(Sys.time(), "%Y_%m_%d"),".docx"))

@harrismcgehee
Contributor

@kevinushey Is there any chance you all are still looking at this? Would there be a way to use a temp directory / temp file and then copy to destination?

I believe this issue or similar also affects Notebook files on CIFS drives. They show output in the editor, but not in the Viewer and an error message displays at the top: "Error creating notebook: pandoc document conversion failed with error 1"

@yihui yihui added this to the v1.8 milestone Oct 17, 2017
@yihui yihui modified the milestones: v1.8, v1.9 Nov 15, 2017
@mwip

mwip commented Jan 21, 2018

I happened to encounter the same problem. I use Linux Mint (18.3) on both my Laptop as well as my Desktop. I store my .Rmd on a NAS which is mounted via CIFS on the Desktop and synchronized on the Laptop (some Cloud Service).
When knitting the .Rmd on the Laptop (i.e. on its hard drive) no problems occur.
However, when knitting on the Desktop (via CIFS), it seems that the size of the resulting PDF influences whether the pandoc conversion succeeds. Strangely, when I add a few images to my Beamer presentation, the file will not compile anymore. The error is:

pandoc: 01_courseintro.pdf: hPutBuf: invalid argument (Bad file descriptor)
Error: pandoc document conversion failed with error 1
Execution halted

Yet, when I randomly comment out the ![](somepic.png) lines, the compilation succeeds again.
Furthermore, a test on the local hard drive of my Desktop worked fine as well...

I am looking forward to seeing this one fixed in v1.9. Thanks in advance. Let me know if you need any more info.

UPDATE:

devtools::install_github("rstudio/rmarkdown") from #590 fixed it for me so far.

@fvanrenterghem

With the latest rmarkdown, I'm still experiencing this issue on Windows 10 with the Rmd on a network drive. Saving it locally and knitting works.

@bac3917

bac3917 commented Apr 10, 2019

I'm struggling with this issue now, using Windows 10, and the following sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.1 rsconnect_0.8.13 htmltools_0.3.6 tools_3.5.1 yaml_2.2.0 Rcpp_1.0.1
[7] rmarkdown_1.12 knitr_1.22 xfun_0.6 digest_0.6.18 evaluate_0.13

@adam-sampson

It looks like Pandoc actually found a fix for this issue using \\?\UNC\. But this comment jgm/pandoc#5127 (comment) shows that this user had to use forward slashes / instead of backslashes \ in the folder path. By default, the rmarkdown package seems to convert all folder paths to the R friendly \ before passing them to Pandoc.
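
The separator change described in that comment amounts to a plain character substitution. A minimal shell sketch is shown below; the helper name to_forward_slashes and the example path are hypothetical, and whether a given pandoc version accepts the resulting path is not verified here.

```shell
# Hypothetical helper: replace every backslash in a path with a forward slash.
to_forward_slashes() {
  printf '%s\n' "$1" | tr '\\' '/'
}

# Example with a made-up UNC path; single quotes keep the backslashes literal.
to_forward_slashes '\\server\share\reports\test.Rmd'
```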

@adam-sampson

I've been trying to make a fix but am a bit stuck.

If the Rmd is very basic and doesn't have any figures or special options, then the following changes fix the NFS issue:

  • line 61 of pandoc.R (the pandoc_convert function): from args <- c(input) to args <- c(normalizePath(input))
  • line 71: from args <- c(args, "--output", output) to args <- c(args, "--output", normalizePath(output))

However, this creates several more issues. Figures no longer can be found. Additionally, this hard-coded change causes the intermediates_dir option (render("testRMDknit.RMD",intermediates_dir = "C:\\temp")) to fail. So it is clear that I need to change the path in other locations than I'm doing it. I'm not really familiar enough with this package. I've been trying to figure out where in the render() function these changes would need to be made.

@adam-sampson

Trying to change this in the render() function, but it's complicated. I can get the function to run until I build it in the package. There is a lot going on here with these directories.

@rogerjbos

Does knitting files on a network still work if using an older version of Pandoc? Is downgrading pandoc the (short-term) answer?

@JDOsborne1

I am struggling with a similar problem. A dirty fix seemed to be ensuring the input Rmd is all in lowercase.

@robsonyeg

In my case, the installation is on Windows 10 and my home folder is on a file server with a UNC path. The default installation creates the R library in the home folder on the file server, which caused the issue. My workaround is to

  1. add R_LIBS_USER as a system environment variable and set it to a local disk, e.g. C:\R-library
  2. change the default working directory to a local disk, e.g. C:\tmp

After that, the error is gone when previewing R Markdown.

@GabriellaS-K

This is also a problem for me (on a Mac), and I'm supposed to be sharing work on an online drive, so copying it to my desktop each time to knit to Word is not practical. Has anyone found workarounds? What if it is done with Git?

@cderv
Collaborator

cderv commented Jan 20, 2021

> This is also a problem for me (on a Mac)

This is interesting that you have that on Mac too!
Can you share the pandoc command line that should be shown when rendering the doc ?

@GabriellaS-K

> > This is also a problem for me (on a Mac)
>
> This is interesting that you have that on Mac too!
> Can you share the pandoc command line that should be shown when rendering the doc ?

Sorry, I hope I've posted in the right place; I'm not sure I have! I am trying to knit my bookdown project to Word, and it fails each time.

The error message:

pandoc: template.docx: openBinaryFile: does not exist (No such file or directory)
Error: pandoc document conversion failed with error 1

R version:

$platform
[1] "x86_64-apple-darwin17.0"

$arch
[1] "x86_64"

$os
[1] "darwin17.0"

$system
[1] "x86_64, darwin17.0"

$status
[1] ""

$major
[1] "4"

$minor
[1] "0.3"

$year
[1] "2020"

$month
[1] "10"

$day
[1] "10"

$`svn rev`
[1] "79318"

$language
[1] "R"

$version.string
[1] "R version 4.0.3 (2020-10-10)"

$nickname
[1] "Bunny-Wunnies Freak Out"

Mac is OS Catalina, 10.15.7

Thank you for your patience
