TransWikia.com

Word breaking from any point of the word for whole document or specifically for tables

TeX - LaTeX Asked by Mertcan Seğmen on September 6, 2020

I have a bunch of markdown tables such as this one below, and they’re being converted into PDF using pandoc and a LaTeX PDF template.

| Column1                                                                                                                           | Column2        | Column3 | Column4 | Column5             | Column6                                                                                                     | Column7          | Column8                                                                                | Column9                                         | Column10                                                                                                                |
|-----------------------------------------------------------------------------------------------------------------------------------|----------------|---------|---------|---------------------|-------------------------------------------------------------------------------------------------------------|------------------|----------------------------------------------------------------------------------------|-------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------|
| Lorem Ipsum verylongwordwithnospacehere simply dummy text of the printing and typesetting indust                                  | Lor            | Lor     | L       | Lor                 | Lorem Ipsum is simply dumm                                                                                  | Lorem Ipsum i    | Lorem Ipsum is simply 9834JKEMKWJ4334DWEE44 the printing and typesetting industry. Lo  | Lorem Ipsum is simply dummy text of the printin | Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard |
| Lorem Ipsum is simply dummy text of the printing anotherverylongwordwithoutspace                                                  | Lor            | Lor     | L       | Lor                 | Lorem Ipsum is simply dummy                                                                                 | Lorem Ipsum i    | Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsu  | Lorem Ipsum is simply dummy text of the printin | Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard |
| Lorem Ipsum is simply dummy Q034DJSKJ32492139DK                                                                                   | Lor            | Lor     | L       | Lor                 | Lorem Ipsum is simply dummy t                                                                               | Lorem Ipsum i    | Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsu  | Lorem Ipsum is simply dummy text of the printin | Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard |

When table cells contain long words or long codes, the output I’m getting looks like the pictures below: the text is either cut off or overflows into the next column.

[Two screenshots: table cells in which long words are cut off or overflow into the next column]

What I need is a way to let words break across lines at any letter. There should be no hyphenation either, so I’m using \usepackage[none]{hyphenat} for that.
In the end, what I want is something like this:

[Screenshot: the desired output, with long words broken across lines inside their cells]

As I said, the markdown content is converted into LaTeX code automatically, so I don’t think I can use something like \seqsplit{longword}. I’m not sure whether it’s possible, but I need something that enables this word-breaking for the whole document, or that targets only the tables.

One Answer

Probably not a final answer, at this stage, but too long for a comment. I recall, and have, a file allhyph.tex with hyphenation patterns for hyphenation points after all 256 characters in the fonts for TeX of the day. I can't find it on CTAN or by web search, so I may have even written it. (The opposite zerohyph.tex should be loaded as language "nohyphenation".)

But there is another trick I found that uses the ordinary (default) English hyphenation patterns. The patterns always allow hyphenation after the letter l (ell). TeX looks up patterns by each character's lowercase code (\lccode), so, at the expense of never being able to use \lowercase or \MakeLowercase, set the \lccode of every character to the code for l (108). Every word then looks like a run of l's to the hyphenation routine. The following example is meant for the T1 font encoding; larger font encodings would need a longer list of character code points.

The next ingredient is to set the hyphen character of the font (of all fonts) to a small or zero-width blank character. That is \textcompwordmark.

Two more things: you have to tell LaTeX to hyphenate words even at their very ends (the code sets \lefthyphenmin and \righthyphenmin to 1), and you need to allow hyphenation in the first word of a paragraph, which is usually prevented.

\documentclass{article}
\usepackage[T1]{fontenc} % required for \textcompwordmark
\usepackage[english]{babel}

\makeatletter
\newcount\lccodepoint
% Give every character from 33 to 255 the lccode of "l", allow breaks
% next to the first and last letter, and use the invisible
% \textcompwordmark of the current encoding as the hyphen character.
\def\setAllBreak{\lccodepoint=33 \@whilenum{\lccodepoint<256}\do
       {\lccode\lccodepoint=`\l\advance\lccodepoint\@ne}%
    \lefthyphenmin\@ne \righthyphenmin\@ne
    \hyphenchar\font=\csname\f@encoding\string\textcompwordmark\endcsname
}
% Reapply the settings at every font change and at the start of the document.
\g@addto@macro\selectfont{\setAllBreak}
\AtBeginDocument{\setAllBreak}
\makeatother

% That finishes the setup, except for \everypar below.

\setlength\textwidth{2pt}% ultra-narrow for testing
\setlength\parskip{8pt}

\begin{document}

% This allows hyphenation of the first word in the paragraph
% but can't be in preamble
\everypar{\nolinebreak\hspace{0pt}}

abracadabra

\noindent abracadabra \emph{wowzers}

\end{document}

This will not introduce line breaks where none are allowed, of course! Think of \mbox{ }. More important for the question, most column types in tabular behave like \mbox and prevent all line breaks. I suggest switching the tabular environments to tabularx and using X column types throughout, or types derived from X (for centering, say), like

\newcolumntype{C}{>{\centering\arraybackslash}X}

To make some columns proportionately narrower or wider than other X columns, see the question "Centering in tabularx columns".
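As a rough illustration of that suggestion, here is a minimal sketch that combines the break-anywhere setup from the example above with tabularx. It assumes pdfLaTeX and the T1 encoding, as before. The column type name L, the \hspace{0pt} at the start of each cell (added on the assumption that paragraph-type cells reset \everypar, so the first word of a cell would otherwise stay unbreakable), and the width factors 1.4 and 0.6 are illustrative choices, not part of the original answer. The \hsize factors follow the tabularx documentation, where they should add up to the number of X columns.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}    % required for \textcompwordmark
\usepackage[english]{babel}
\usepackage{tabularx}       % X columns typeset their cells as paragraphs

\makeatletter
% Same setup as in the example above.
\newcount\lccodepoint
\def\setAllBreak{\lccodepoint=33 \@whilenum{\lccodepoint<256}\do
       {\lccode\lccodepoint=`\l\advance\lccodepoint\@ne}%
    \lefthyphenmin\@ne \righthyphenmin\@ne
    \hyphenchar\font=\csname\f@encoding\string\textcompwordmark\endcsname
}
\g@addto@macro\selectfont{\setAllBreak}
\AtBeginDocument{\setAllBreak}
\makeatother

% Hypothetical column types derived from X.
% \hspace{0pt} lets the first word of each cell be broken as well.
\newcolumntype{L}{>{\raggedright\arraybackslash\hspace{0pt}}X}
\newcolumntype{C}{>{\centering\arraybackslash\hspace{0pt}}X}

\begin{document}

% Three X-type columns; the width factors 1.4 + 0.6 + 1 add up to 3.
\noindent
\begin{tabularx}{\textwidth}{%
    |>{\hsize=1.4\hsize\linewidth=\hsize}L%
    |>{\hsize=0.6\hsize\linewidth=\hsize}C%
    |L|}
\hline
Column1 & Column2 & Column3 \\
\hline
Lorem Ipsum verylongwordwithnospacehere simply dummy text
  & Q034DJSKJ32492139DK
  & Lorem Ipsum is simply 9834JKEMKWJ4334DWEE44 the printing \\
\hline
\end{tabularx}

\end{document}
```

Putting \hspace{0pt} into the column type rather than into the cells leaves the cell contents untouched, which matters when the table body is generated automatically.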

Answered by Donald Arseneau on September 6, 2020
