Write the content of an hbox to an auxiliary file

TeX - LaTeX Asked by Maïeul on December 31, 2020

I know that TeX can’t write the content of an hbox to an auxiliary file (see “Re-parse the content of a box register”).

That means that

\newwrite\foo
\immediate\openout\foo=\jobname.txt
\setbox0=\hbox{bar}
\immediate\write\foo{\box0}

can’t write bar to the file.

However, can LuaTeX do it? I have found:

\directlua{
 n = tex.getbox(0)
}

But I don’t understand what n represents, or whether I could use it to write the box content to a file.

One Answer

Edit: here is new code that also handles ligatures:

\documentclass{article}
\usepackage{fontspec}
\begin{document}
\setbox0=\hbox{Příliš žluťoučký \textit{kůň} úpěl \hbox{ďábelské} ódy, diffierence, difference}
\directlua{
    % local fontstyles = require "l4fontstyles"
  local char = unicode.utf8.char
  local glyph_id = node.id("glyph")
  local glue_id  = node.id("glue")
  local hlist_id = node.id("hlist")
  local vlist_id = node.id("vlist")
  local disc_id  = node.id("disc")
  local minglue  = tex.sp("0.2em")
  local usedcharacters = {}
  local identifiers = fonts.hashes.identifiers
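  local function get_unicode(xchar,font_id)
    local current = {}
    local chardata = identifiers[font_id].characters[xchar]
    local uchar = chardata and chardata.tounicode
    % no tounicode mapping: fall back to the glyph's own character code
    if not uchar then return {char(xchar)} end
    % tounicode is a hex string, four digits per code unit; a ligature has several
    for i= 1, string.len(uchar), 4 do
      local cchar = string.sub(uchar, i, i + 3)
      table.insert(current,char(tonumber(cchar,16)))
    end
    return current
  end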
  local function nodeText(n)
    local t =  {}
    for x in node.traverse(n) do
      % glyph node
      if x.id == glyph_id then
        % local currentchar = fonts.hashes.identifiers[x.font].characters[x.char].tounicode
        local chars = get_unicode(x.char,x.font)
        for _, current_char in ipairs(chars) do
          table.insert(t,current_char)
        end
      % glue node
      elseif x.id == glue_id and  node.getglue(x) > minglue then
        table.insert(t," ")
      % discretionaries
      elseif x.id == disc_id then
        table.insert(t, nodeText(x.replace))
      % recursively process hlist and vlist nodes
      elseif x.id == hlist_id or x.id == vlist_id then
        table.insert(t,nodeText(x.head))
      end
    end
    return table.concat(t)
  end
  local n = tex.getbox(0)
  print(nodeText(n.head))
  local f = io.open("hello.txt","w")
  f:write(nodeText(n.head))
  f:close()
}

\box0
\end{document}

Result in hello.txt:

Příliš žluťoučký kůň úpěl ďábelské ódy, diffierence, difference
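
The edited code reads each glyph's tounicode field in groups of four hex digits, which is why a ligature glyph such as ffi comes out as several characters. Here is a minimal sketch of that decoding step on its own, written as a standalone Lua chunk (for example a file loaded with require) rather than inside \directlua, where -- comments are not safe. The name decode_tounicode is hypothetical, the surrogate-pair branch is an addition that the answer's code does not have, and utf8.char assumes Lua 5.3 or later.

```lua
-- Hypothetical helper: turn a tounicode-style hex string (four hex digits
-- per UTF-16 code unit) into a UTF-8 string.
local function decode_tounicode(hex)
  local out = {}
  local len = string.len(hex)
  local i = 1
  while i + 3 <= len do
    local unit = tonumber(string.sub(hex, i, i + 3), 16)
    if not unit then break end           -- stop at anything that is not hex
    i = i + 4
    -- Assumption beyond the original answer: join a high/low surrogate
    -- pair into one code point outside the Basic Multilingual Plane.
    if unit >= 0xD800 and unit <= 0xDBFF and i + 3 <= len then
      local low = tonumber(string.sub(hex, i, i + 3), 16)
      if low and low >= 0xDC00 and low <= 0xDFFF then
        unit = 0x10000 + (unit - 0xD800) * 0x400 + (low - 0xDC00)
        i = i + 4
      end
    end
    table.insert(out, utf8.char(unit))
  end
  return table.concat(out)
end

print(decode_tounicode("006600660069"))  -- prints "ffi"
```

Returning one concatenated string instead of a table of characters is only a convenience for this sketch; the code in the answer keeps a table because it inserts the characters one by one.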

Original answer:

The variable n in your example is a node list. Various types of nodes exist, such as glyph for characters, glue for spacing, or hlist, which is the type you get for your hbox. An hlist contains child nodes, which are accessible through the n.head attribute. You can then loop over this child list looking for glyphs and glue.

Each node type can be distinguished by the value of its n.id attribute. The node types and their attributes are described in chapter "8 Nodes" of the LuaTeX reference manual. In this particular example we only need to process glyph and glue nodes, but keep in mind that node lists are recursive: nodes such as hlist and vlist contain child lists of their own. You can support them with a recursive call of nodeText on the head attribute of the current node.
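
To see this structure for a particular box, it can help to list the node types before extracting any text. The following is a rough sketch, not part of the answer's code: the helper name describe_nodes is hypothetical, and it relies on the node library (node.id, node.type, node.traverse), so it is only meaningful inside LuaTeX. It is written as a standalone Lua chunk, not for direct use inside \directlua.

```lua
-- Hypothetical helper: collect one line per node, indented by nesting depth.
-- Relies on LuaTeX's global `node` library when it is called.
local function describe_nodes(head, depth, lines)
  depth = depth or 0
  lines = lines or {}
  local hlist_id = node.id("hlist")
  local vlist_id = node.id("vlist")
  for x in node.traverse(head) do
    -- node.type maps the numeric id back to its name, e.g. "glyph"
    table.insert(lines, string.rep("  ", depth) .. node.type(x.id))
    -- boxes carry their own child list in the head field
    if x.id == hlist_id or x.id == vlist_id then
      describe_nodes(x.head, depth + 1, lines)
    end
  end
  return lines
end
```

Inside LuaTeX, something like print(table.concat(describe_nodes(tex.getbox(0).head), " | ")) should then show the glyph, glue and nested hlist entries of box 0 on the terminal.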

Regarding glyph nodes, the char attribute contains a Unicode value only if you use OpenType or TrueType fonts. With old 8-bit fonts it contains just an 8-bit value whose meaning depends on the font encoding, and converting it to Unicode isn't easy.

\documentclass{article}
\usepackage{fontspec}
\begin{document}
\setbox0=\hbox{Příliš žluťoučký \textit{kůň} úpěl \hbox{ďábelské} ódy}
\directlua{
  local fontstyles = require "l4fontstyles"
  local char = unicode.utf8.char
  local glyph_id = node.id("glyph")
  local glue_id  = node.id("glue")
  local hlist_id = node.id("hlist")
  local vlist_id = node.id("vlist")
  local minglue = tex.sp("0.2em")
  local usedcharacters = {}
  local identifiers = fonts.hashes.identifiers
  local function get_unicode(xchar,font_id)
     return char(tonumber(identifiers[font_id].characters[xchar].tounicode,16))
  end
  local function nodeText(n)
    local t =  {}
    for x in node.traverse(n) do
      % glyph node
      if x.id == glyph_id then
        % local currentchar = fonts.hashes.identifiers[x.font].characters[x.char].tounicode
        table.insert(t,get_unicode(x.char,x.font))
        local y = fontstyles.get_fontinfo(x.font)
        print(x.char,y.name,y.weight,y.style)
      % glue node
      elseif x.id == glue_id and  node.getglue(x) > minglue then
        table.insert(t," ")
      elseif x.id == hlist_id or x.id == vlist_id then
        table.insert(t,nodeText(x.head))
      end
    end
    return table.concat(t)
  end
  local n = tex.getbox(0)
  print(nodeText(n.head))
  local f = io.open("hello.txt","w")
  f:write(nodeText(n.head))
  f:close()
}

\box0
\end{document}

The nodeText function returns the text contained in a node list. In this example it is used to print the hbox contents to the terminal and to write them to the file hello.txt.

For basic information about the font style, you can try the l4fontstyles module, like this:

local fontstyles = require "l4fontstyles"
...
if x.id == glyph_id then
  table.insert(t,char(x.char))
  local y = fontstyles.get_fontinfo(x.font)
  print(y.name,y.weight,y.style)

Correct answer by michal.h21 on December 31, 2020
