LaTeX Reader and multirow, part 2 #6603

Closed
kysko opened this issue Aug 9, 2020 · 4 comments

kysko commented Aug 9, 2020

First, thanks to LRDC for his work on the LaTeX reader, and for recently fixing a small problem.

But there are a few more issues that escaped me last time, with multirow being particularly problematic. I prefer to combine them here rather than open separate issues, even if it makes for a long text.

(Below, I was using pandoc-nightly-windows-2020-08-08 which includes LRDC's latest fix for issue #6596.)

multirow's second argument

According to the documentation, multirow's second curly-braced argument is width, the width in which to set the text.
However, it seems the LaTeX reader interprets the second argument as an alignment.

\begin{tabular}{r}
\multirow{2}{c}{hello}
\end{tabular}

will give the (simplified!) native output:

Table
 AlignRight,ColWidthDefault
 TableBody
  Row
   Cell AlignCenter (RowSpan 1) (ColSpan 1) [Plain [Str "hello"]]

Notice how the Cell is centered, even though I requested right alignment.
The proper notation

\begin{tabular}{r}
\multirow{2}{5em}{hello}
\end{tabular}

will give a Para in the native output, not a Table.

Maybe the confusion comes from the fact that in \multicolumn{2}{r}, the second argument is an alignment, and it was perhaps assumed that multirow behaves the same way.

Note that when compiling with pdflatex, the first example gives an error ("missing number"), while the second does not (although there is more to say about it below).
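
For anyone wanting to check this with pdflatex, here is a minimal sketch of a complete document. The \documentclass wrapper, the \hline rules and the second table row are illustrative additions, not part of the examples above. The `*` form of the width argument (use the natural width of the text) is my reading of the multirow package documentation, not something tested in this thread.

```latex
\documentclass{article}
\usepackage{multirow} % provides \multirow
\begin{document}
\begin{tabular}{|r|}
\hline
% Second argument is a width given as a length, not an alignment letter.
\multirow{2}{5em}{hello} \\
\\
\hline
% Assumption from the multirow documentation: * means "natural width of the text".
\multirow{2}{*}{hello} \\
\\
\hline
\end{tabular}
\end{document}
```

Replacing either width with `c` should reproduce the "missing number" error mentioned above.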

multirow and proper cell inputs in rows

The documentation says:

You should leave the other rows empty at this column, otherwise the stuff created by \multirow will over-write it.

While not explicit, one can verify that literally empty cells must be written, as in the following:

\begin{tabular}{|r|r|}
\multirow{2}{5em}{hello} & A \\
& B \\
\end{tabular}

Notice how the & is present to separate from a preceding empty cell.
However, in the tests for the LaTeX reader, one can see that the following format is used:

\begin{tabular}{|r|r|}
\multirow{2}{c}{hello} & A \\
B \\
\end{tabular}

that is, by omitting the &.

The Native output is almost the same for both inputs, except that the "B" above is missing when written in the correct format.
As for compilation with pdflatex (once {c} is replaced by {5em}, for example), both compile, but it is evident that the "B" is overwritten in the latter format.
(/edited part reverted back, the above stands)
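
A minimal sketch for comparing the two forms side by side with pdflatex. The document wrapper is an illustrative addition, and both tables use {5em} so that they compile.

```latex
\documentclass{article}
\usepackage{multirow}
\begin{document}
% Form recommended by the multirow documentation:
% the second row starts with an empty cell, so B lands in column 2.
\begin{tabular}{|r|r|}
\multirow{2}{5em}{hello} & A \\
 & B \\
\end{tabular}

% Form used in the reader tests: without the leading &,
% B is placed in column 1, where the \multirow text can overprint it.
\begin{tabular}{|r|r|}
\multirow{2}{5em}{hello} & A \\
B \\
\end{tabular}
\end{document}
```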

(/Another edit: on that note, the last example from the previous issue should then be written

\begin{tabular}{r}
\multirow{2}{5em}{hello} \\
\\
\end{tabular}

indicating an empty row, which upon compilation with pdflatex gives a better output (visible when you put vertical lines).
)

Mixing multirow and multicolumn

The documentation says:

If you want to use both \multirow and \multicolumn on the same entry, you must put the \multirow inside the \multicolumn. The other way around will not work.

However, the nested \multirow and \multicolumn in the LaTeX test has the second within the first:

\begin{tabular}{ccc}
\multirow{2}{c}{\multicolumn{2}{c}{One}} & Two \\
Three\\
Four & Five & Six\\
\end{tabular}

The proper notation should be something like (accounting for the other points above):

\begin{tabular}{ccc}
\multicolumn{2}{c}{\multirow{2}{5em}{One}} & Two \\
&&Three\\
Four & Five & Six \\
\end{tabular}

With the LaTeX reader, both Native outputs are the same (replacing the "5em" by "c" in the 2nd one).
However, pdflatex only compiles the second one (even after replacing the "c" by "5em" in the first's multirow).

(/Edited: well, this third one might not be a real problem, since the reader reads both formats, and the LaTeX tables that compile will work fine. But it is still worth noting.)

kysko changed the title from "LaTeX Reader and multirow" to "LaTeX Reader and multirow, part 2" on Aug 9, 2020

jgm (Owner) commented Aug 9, 2020

@LaurentRDC

LaurentRDC (Contributor) commented

Damn, my initial patch (#6470) really missed the mark.

Let me summarize the problems:

  1. \multirow is not like \multicolumn; the second argument in \multirow is the text width;
  2. Empty cells must be parsed, separated by the usual &;
  3. Technically, the nesting of \multirow and \multicolumn is not interchangeable. \multicolumn must be at the top level.
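
As a rough illustration of the three points together, here is a sketch based on the corrected table from the issue text. The document wrapper and the comments are additions made for illustration. It is not a proposed test case.

```latex
\documentclass{article}
\usepackage{multirow}
\begin{document}
\begin{tabular}{ccc}
% (1) The second argument of \multirow is a text width (here 5em).
% (3) \multirow sits inside \multicolumn, not the other way around.
\multicolumn{2}{c}{\multirow{2}{5em}{One}} & Two \\
% (2) The two cells covered by the span are written as empty cells.
 & & Three \\
Four & Five & Six \\
\end{tabular}
\end{document}
```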

@kysko you refer to documentation. Can you specify if that is documentation that is provided with pdflatex? Once I can refer to the documentation, I'll fix those problems right away.

I'm sorry for the problems that the initial patch created! I'll make sure to add tests to explore various scenarios beyond small simple examples.

kysko (Author) commented Aug 9, 2020

Sorry for not providing a TL;DR!

And it is the multirow documentation.

Another detail about the second issue:

the native output of

\begin{tabular}{|r|r|}
A & \multirow{2}{c}{hello} \\
B & \\
\end{tabular}

seems ok through your reader, the "B" is not lost.

I was still able to process a biggish complex table with your reader (after a few adjustments), so your work has a huge benefit!
(Notice that in that SE example the empty cells are on the right, so it did not have the problem of the second issue, because of the detail I just mentioned above.)

LaurentRDC (Contributor) commented

Ok thanks @kysko, I'll put together a patch.
