Support svg images when going from markdown to docx #4058

Closed
agusmba opened this issue Nov 11, 2017 · 27 comments

Comments

@agusmba
Contributor

agusmba commented Nov 11, 2017

In a similar way to #1793 for pdf generation, it would be nice to automatically convert svg images to png (or any word supported format) when going from markdown to docx.

@iandol
Contributor

iandol commented Apr 18, 2018

@agusmba
Contributor Author

agusmba commented Apr 19, 2018

Thanks for the pointer!

It took them a while for a 1999 standard, and it seems it's only for those with active subscriptions to 365.

Anyway, better late than never. Nice!

@mb21
Collaborator

mb21 commented Sep 26, 2018

Posting the relevant part of the attached docx here for reference. Apparently, the docx contains both the svg and a fallback png.

    <w:drawing>
      <wp:inline distT="0" distB="0" distL="0" distR="0" wp14:anchorId="589F1FF2" wp14:editId="553CEC71">
        <wp:extent cx="5943600" cy="3608070"/>
        <wp:effectExtent l="0" t="0" r="0" b="0"/>
        <wp:docPr id="1" name="Graphic 1"/>
        <wp:cNvGraphicFramePr>
          <a:graphicFrameLocks xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" noChangeAspect="1"/>
        </wp:cNvGraphicFramePr>
        <a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
          <a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
            <pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
              <pic:nvPicPr>
                <pic:cNvPr id="1" name="cars.svg"/>
                <pic:cNvPicPr/>
              </pic:nvPicPr>
              <pic:blipFill>
                <a:blip r:embed="rId8"> <!-- pretty sure this is the file Id of the PNG -->
                  <a:extLst>
                    <a:ext uri="{28A0092B-C50C-407E-A947-70E740481C1C}">
                      <a14:useLocalDpi xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main" val="0"/>
                    </a:ext>
                    <a:ext uri="{96DAC541-7B7A-43D3-8B79-37D633B846F1}">  <!-- this is a constant, identifying the svg extension -->
                      <asvg:svgBlip xmlns:asvg="http://schemas.microsoft.com/office/drawing/2016/SVG/main"
                        r:embed="rId9"/>  <!-- pretty sure this is the file Id of the SVG -->
                    </a:ext>
                  </a:extLst>
                </a:blip>
                <a:stretch>
                  <a:fillRect/>
                </a:stretch>
              </pic:blipFill>
              <pic:spPr>
                <a:xfrm>
                  <a:off x="0" y="0"/>
                  <a:ext cx="5943600" cy="3608070"/>
                </a:xfrm>
                <a:prstGeom prst="rect">
                  <a:avLst/>
                </a:prstGeom>
              </pic:spPr>
            </pic:pic>
          </a:graphicData>
        </a:graphic>
      </wp:inline>
    </w:drawing>
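
For reference, the `a:blip` part of that structure can be sketched in a few lines of Haskell. This is only an illustration: `Elt`, `render` and `blipWithSvg` are hypothetical names (not pandoc's API), attribute values are not escaped, and which relationship id belongs to the PNG and which to the SVG follows the guess in the comments above.

```haskell
-- Minimal XML element type, hypothetical and for illustration only.
data Elt = Elt String [(String, String)] [Elt]

-- Render an element as a string (no escaping of attribute values).
render :: Elt -> String
render (Elt name attrs kids) =
  "<" ++ name ++ concatMap attr attrs ++
  (if null kids
     then "/>"
     else ">" ++ concatMap render kids ++ "</" ++ name ++ ">")
  where
    attr (k, v) = " " ++ k ++ "=\"" ++ v ++ "\""

-- The constant identifying the SVG extension, copied from the XML above.
svgExtUri :: String
svgExtUri = "{96DAC541-7B7A-43D3-8B79-37D633B846F1}"

-- Build an a:blip that embeds the PNG fallback directly and points to
-- the SVG through the extension list.
blipWithSvg :: String -> String -> Elt
blipWithSvg pngRel svgRel =
  Elt "a:blip" [("r:embed", pngRel)]
    [ Elt "a:extLst" []
        [ Elt "a:ext" [("uri", svgExtUri)]
            [ Elt "asvg:svgBlip"
                [ ("xmlns:asvg", "http://schemas.microsoft.com/office/drawing/2016/SVG/main")
                , ("r:embed", svgRel)
                ] []
            ]
        ]
    ]
```

Calling `render (blipWithSvg "rId8" "rId9")` produces the same nesting as the excerpt, minus the `useLocalDpi` extension.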

Word_Markdown_with_SVG.docx

@mb21
Collaborator

mb21 commented Sep 26, 2018

Since we need to generate a fallback PNG for this anyway, this seems as good a time as any to factor out the svg-to-png rendering feature that was added for LaTeX/PDF output, and generalize it to docx and epub output (#2766).

btw. see #2211 for some code that may be usable.

@mb21
Collaborator

mb21 commented Sep 27, 2018

@jgm What do you think of factoring out the rsvg-convert-related code from PDF.hs? I'm not sure where it should go: in Text.Pandoc.Shared (where PDF.hs lives) or Text.Pandoc.Writers.Shared (where EPUB.hs etc. are). Or even Text.Pandoc.ImageSize which already has morphed into a kind of general-purpose image module, or a new module?

@agusmba
Contributor Author

agusmba commented Mar 8, 2019

The nice thing is that even older Word (2013) can open documents with SVGs (I guess they fall back to showing the png).

@jgm
Owner

jgm commented Mar 8, 2019 via email

@mb21
Collaborator

mb21 commented Mar 8, 2019

I don't remember, it's been a while ;-) But yes, maybe LaTeX and Word understand about the same subset of image formats, then we could just move the whole convertImage function...

@ociule

ociule commented Jul 5, 2019

As a heavy user of the docx output format, I'm really interested in this issue. It would allow outputting better quality documents.

@leonidlezner

The issue is still there: when converting Markdown to PDF the SVGs are embedded, but in DOCX files the images are not present, only placeholders (Broken image...). Is there any way to deal with Word documents?

@jgm
Owner

jgm commented Jan 20, 2020

It would be good to do something here. Note that convertImage from the PDF module currently has type

    convertImage :: WriterOptions -> FilePath -> FilePath
                 -> IO (Either Text FilePath)

It creates a file. I think we should create a new unexported module Text.Pandoc.Image with

    convertImage :: WriterOptions
                 -> MimeType    -- ^ Input mime type
                 -> ByteString  -- ^ Input image as bytestring
                 -> MimeType    -- ^ Desired output mime type
                 -> IO (Either Text ByteString)

This would be more suitable for use in Word, since we don't have a tmp dir. We'd need a bit of code around this in Text.Pandoc.PDF, but it would be simple.

Note: rsvg-convert can be used as a pipe.
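
A minimal sketch of the pipe approach, using only `process` and `bytestring`. It assumes `rsvg-convert` is on the PATH and that `-f png` with no file argument reads SVG from stdin and writes PNG to stdout. `pipeThrough` and `svgToPngPipe` are hypothetical names, not the functions in pandoc.

```haskell
import Control.Concurrent (forkIO)
import qualified Data.ByteString as B
import System.Exit (ExitCode (..))
import System.IO (hClose, hSetBinaryMode)
import System.Process

-- Feed a bytestring to an external command on stdin and collect its
-- stdout. Throws an IOException if the command cannot be started.
pipeThrough :: FilePath -> [String] -> B.ByteString
            -> IO (Either String B.ByteString)
pipeThrough cmd args input = do
  (Just hin, Just hout, _, ph) <- createProcess
    (proc cmd args) { std_in = CreatePipe, std_out = CreatePipe }
  hSetBinaryMode hin True
  hSetBinaryMode hout True
  -- Write in a separate thread so a full pipe cannot block the reader.
  _ <- forkIO (B.hPut hin input >> hClose hin)
  out <- B.hGetContents hout   -- strict: reads until EOF
  code <- waitForProcess ph
  return $ case code of
    ExitSuccess   -> Right out
    ExitFailure n -> Left (cmd ++ " exited with code " ++ show n)

-- Assumption: rsvg-convert is installed and accepts these arguments.
svgToPngPipe :: B.ByteString -> IO (Either String B.ByteString)
svgToPngPipe = pipeThrough "rsvg-convert" ["-f", "png"]
```

No temporary file is involved, which is the property that matters for the docx writer.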

@jgm
Owner

jgm commented Jan 20, 2020

Or maybe Text.Pandoc.ImageSize should be folded into the new Text.Pandoc.Image (which would then need to be exported). ImageSize contains an ImageType type, which could be used instead of MimeType in the new convertImage.

@jgm jgm added this to the next release milestone Jan 20, 2020
@jgm jgm removed this from the next release milestone Feb 13, 2020
@jgm
Owner

jgm commented Feb 13, 2020

I added an unexported module Text.Pandoc.Image with svgToPng.

@jgm jgm modified the milestone: next release Feb 13, 2020
@gijswijs

I'm confused. Is this a milestone for the next release, or is it not?

Either way, I'm really happy to see that you folks are working on this.

@mb21
Collaborator

mb21 commented Apr 26, 2020

hm... I was taking a stab at this... and svgToPng in Text.Pandoc.Image is great... but we cannot run it in the Pandoc Monad.. :S

for the record, this is how far I got:

diff --git a/src/Text/Pandoc/Writers/Docx.hs b/src/Text/Pandoc/Writers/Docx.hs
index 2caba59cc..d5403e65b 100644
--- a/src/Text/Pandoc/Writers/Docx.hs
+++ b/src/Text/Pandoc/Writers/Docx.hs
@@ -44,6 +44,7 @@ import Text.Pandoc.Definition
 import Text.Pandoc.Generic
 import Text.Pandoc.Highlighting (highlight)
 import Text.Pandoc.Error
+import Text.Pandoc.Image (svgToPng)
 import Text.Pandoc.ImageSize
 import Text.Pandoc.Logging
 import Text.Pandoc.MIME (MimeType, extensionFromMimeType, getMimeType,
@@ -1328,7 +1329,12 @@ inlineToOpenXML' opts (Image attr@(imgident, _, _) alt (src, title)) = do
   imgs <- gets stImages
   let
     stImage = M.lookup (T.unpack src) imgs
-    generateImgElt (ident, _, _, img) =
+    svgBlip ident = mknode "a:extLst" [] $
+      mknode "a:ext" [("uri", "{96DAC541-7B7A-43D3-8B79-37D633B846F1}")] $
+        mknode "asvg:svgBlip" [
+          ("xmlns:asvg", "http://schemas.microsoft.com/office/drawing/2016/SVG/main")
+        , ("r:embed", ident) ] ()
+    generateImgElt (ident, _, mbMimeType, img) =
       let
         (xpt,ypt) = desiredSizeInPoints opts attr
                (either (const def) id (imageSize opts img))
@@ -1343,7 +1349,10 @@ inlineToOpenXML' opts (Image attr@(imgident, _, _) alt (src, title)) = do
                             [("descr",T.unpack src),("id","0"),("name","Picture")] ()
                         , cNvPicPr ]
         blipFill = mknode "pic:blipFill" []
-          [ mknode "a:blip" [("r:embed",ident)] ()
+          [ mknode "a:blip" [("r:embed",ident)] $
+              case mbMimeType of
+                Just "image/svg+xml" -> [svgBlip ident]
+                _ -> []
           , mknode "a:stretch" [] $
               mknode "a:fillRect" [] ()
           ]
@@ -1414,6 +1423,8 @@ inlineToOpenXML' opts (Image attr@(imgident, _, _) alt (src, title)) = do
          else do
            -- insert mime type to use in constructing [Content_Types].xml
            modify $ \st -> st { stImages = M.insert (T.unpack src) imgData $ stImages st }
+
+           svgToPng opts $ toLazy img
            return [generateImgElt imgData]
       )
       `catchError` ( \e -> do

@jgm
Owner

jgm commented Apr 26, 2020

> but we cannot run it in the Pandoc Monad.

svgToPng may be too specialized to make a method of PandocMonad.
One possibility would be to add a method to PandocMonad class

    ioWithFallback :: PandocMonad m => a -> IO a -> m a

This would be implemented in PandocIO by simply running the IO action.
In PandocPure it would simply return the fallback.
With this we could easily integrate svgToPng.
Thoughts? @tarleb @jkr
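
To make the proposal concrete, here is a minimal standalone sketch. It does not use pandoc's real `PandocMonad`, `PandocIO` or `PandocPure`; the class `HasFallbackIO` is a hypothetical stand-in, with plain `IO` playing the role of `PandocIO` and `Identity` playing the role of `PandocPure`.

```haskell
import Data.Functor.Identity (Identity (..))

-- Hypothetical stand-in for the proposed PandocMonad method.
class Monad m => HasFallbackIO m where
  -- First argument: value to use where IO is unavailable.
  -- Second argument: the IO action to run where it is available.
  ioWithFallback :: a -> IO a -> m a

-- In an IO-capable monad, simply run the action.
instance HasFallbackIO IO where
  ioWithFallback _ action = action

-- In a pure monad, ignore the action and return the fallback.
instance HasFallbackIO Identity where
  ioWithFallback fallback _ = Identity fallback
```

With something like this, a writer could call `ioWithFallback Nothing (Just <$> convert svg)` (with `convert` being whatever conversion is used) and get no PNG in the pure case.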

@tarleb
Collaborator

tarleb commented Apr 26, 2020

I'm not sure. Seems like a reasonably clean and pragmatic solution, but feels a bit weird, too.

Questions that came to mind, in no particular order:

  • Would an IOException in the IO action trigger the fallback, or would it bubble up in the form of a PandocError?
  • Could a Text.Pandoc.App.Transform be a viable option to perform the conversion?
  • A separate class MonadFallbackIO or the like could allow for finer-grained effects handling; but might be too complicated.

@jgm
Owner

jgm commented Apr 26, 2020

> Would an IOException in the IO action trigger the fallback, or would it bubble up in the form of a PandocError?

I suppose we'd want to trap exceptions in the IO action and raise a PandocError. The fallback would just be for cases that can't perform IO. If you wanted to return the fallback if there were IO exceptions, you could just handle the exception yourself.
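
A rough sketch of the trapping step, assuming only `base`. `ConvError` is a hypothetical stand-in for `PandocError`, and `trapIO` is an illustrative name.

```haskell
import Control.Exception (IOException, try)

-- Hypothetical error type standing in for PandocError.
newtype ConvError = ConvError String
  deriving (Eq, Show)

-- Run an IO action, turning any IOException into a ConvError
-- instead of letting it propagate.
trapIO :: IO a -> IO (Either ConvError a)
trapIO action = do
  r <- try action
  return $ case r of
    Left e  -> Left (ConvError (show (e :: IOException)))
    Right x -> Right x
```

A caller who prefers the fallback on failure can then use `either (const fallback) id` on the result.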

> Could a Text.Pandoc.App.Transform be a viable option to perform the conversion?

Ah, I see; you mean do a pass through the AST first, converting SVGs to PNGs, before rendering the AST? This might be a bit less performant, but we could do it without changes to the Class API.

@frabera

frabera commented Jun 1, 2021

Is there any update for this?

@acircleda

I am also wondering if there is a fix. Right now, I can produce graphics in SVG format using ggsave and then drag them in manually to Word and they work well. Including the svg automatically using dev="svg" or ![](image.svg) produces a broken image placeholder.

@jgm
Owner

jgm commented Aug 27, 2021

> Could a Text.Pandoc.App.Transform be a viable option to perform the conversion?

It can't be a Transform, those are pure.

We could add an ad hoc step to the pipeline in T.P.App (before extractMedia) but, presumably, after filters?
This would be triggered only if docx is the output format. It would convert any svgs in the mediabag to pngs, and change the image links accordingly.
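
The shape of such a step could look roughly like the sketch below. It uses simplified, hypothetical types (`MediaBag` as a plain map, a two-constructor `Inline`) rather than pandoc's real ones, and takes the converter as a pure function so the example stays self-contained. It illustrates the idea in this comment, not what was eventually committed.

```haskell
import qualified Data.Map as M
import System.FilePath (replaceExtension)

type Mime = String

-- Simplified media bag: path -> (mime type, contents).
type MediaBag = M.Map FilePath (Mime, String)

-- Simplified inline type with just enough to show relinking.
data Inline = Str String | Image FilePath
  deriving (Eq, Show)

-- Replace every SVG entry by a PNG entry stored under a ".png" path.
-- (An existing entry with the same ".png" path would be overwritten.)
convertBag :: (String -> String) -> MediaBag -> MediaBag
convertBag toPng = M.fromList . map step . M.toList
  where
    step (path, ("image/svg+xml", body)) =
      (replaceExtension path "png", ("image/png", toPng body))
    step entry = entry

-- Point an image link at the converted entry, judged by the mime type
-- recorded in the original (unconverted) bag.
relink :: MediaBag -> Inline -> Inline
relink old (Image path)
  | Just ("image/svg+xml", _) <- M.lookup path old =
      Image (replaceExtension path "png")
relink _ inline = inline
```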

@jgm jgm closed this as completed in 51caa8b Aug 28, 2021
@jgm
Owner

jgm commented Aug 28, 2021

Support for SVGs has been added!
Note: this includes PNG fallbacks, but I can't really test to see whether these work on my recent version of Word. If anyone can test on an old version, it would be good to know if it works there too.

@agusmba
Contributor Author

agusmba commented Aug 31, 2021

Looking great! 👏

I did a quick test on my laptop, and even though I got a warning because I don't have rsvg-convert, I still got a proper docx (without the fallback png I guess):

image

I'm using Word 365 (v2102).

If I open it with Word Online, it behaves as expected (Word Online doesn't support embedded SVG and shows the fallback PNG, which is missing since I didn't have rsvg-convert):

image

I'll upload my test files if you want to generate a proper docx (with rsvg).
pandoc-issue-4058.zip

Thanks @jgm!
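
The behaviour described here (warn and embed the SVG alone when the converter is missing) can be sketched as a small check. `Embedding` and `chooseEmbedding` are hypothetical names, and the warning text is made up; this is not pandoc's code.

```haskell
import System.Directory (findExecutable)
import System.IO (hPutStrLn, stderr)

-- What ends up in the docx for an SVG image.
data Embedding = SvgWithPngFallback | SvgOnly
  deriving (Eq, Show)

-- Decide based on whether the converter can be found on the PATH.
chooseEmbedding :: String -> IO Embedding
chooseEmbedding converter = do
  found <- findExecutable converter
  case found of
    Just _  -> return SvgWithPngFallback
    Nothing -> do
      hPutStrLn stderr
        ("warning: " ++ converter ++ " not found; embedding SVG without PNG fallback")
      return SvgOnly
```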

@jgm
Owner

jgm commented Aug 31, 2021

Good to know it's falling back to the PNG (or attempting to) on the online version. That's the piece I wasn't able to test.

@agusmba
Contributor Author

agusmba commented Sep 1, 2021

> Good to know it's falling back to the PNG (or attempting to) on the online version. That's the piece I wasn't able to test.

After obtaining a rsvg-convert build for windows, I could test it again, and it seems to work perfectly.

The online view (browser) uses the PNG fallback (see the raster in the red curve when zoomed in 400x):
image

which does not happen in the desktop version (real vector graphic included):
image

At 100% it is hard to spot the differences. 👍

@adamryczkowski

What exactly do you do, to get the svg into the Word document? I've tried many things on my Ubuntu 22.04 (Pandoc 3.1.4) and all I get is either a placeholder without the rsvg-convert, or a PNG file with it, using the attached pandoc-issue-4058.zip (I skipped the --reference-doc)

@jgm
Owner

jgm commented Jun 26, 2023

I'm seeing the svg in there (it's compressed as svgz).

Screen Shot 2023-06-26 at 9 22 23 AM

It's also linked in document.xml. Everything looks right?

PS. The -i flag doesn't do what you think it does. See the manual.
