Is there a way to control the font with --convert-to txt to docx

I’m running LO 6 on Manjaro Linux.

I want to programmatically create .docx files from .txt. This works fine already except that I need a mono-spaced font so that the content will be formatted correctly. LO seems to want to use Liberation Mono, but that font does not exist in some contexts (e.g. Android, Windows) and a proportional font gets substituted. I’d like to be able to specify that something like Courier New be used. Is this possible?

Thanks,

~ray

Please don’t post as wiki. I never saw it coming out useful.

sry… first time… there was no explanation for the checkbox that I could see but I had seen this referred to as a wiki, so…

FWIW, I came up with my own solution for this (at least until a better one is found). I wrote a script that unzips the docx file and uses xmlstarlet to change the fonts in styles.xml (then re-zips). This is working fine even if not the “cleanest” solution. Here’s what it looks like (in case it’s helpful to anyone):

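#!/bin/bash
# Set the ASCII/hAnsi fonts in a .docx to Courier New by patching word/styles.xml.

FN="myfile.docx"
TMP="xx"

function fail() { echo "error: $*" >&2; exit 1; }

[ -e "$FN" ] || fail "$FN not found"

WD=$PWD
function cleanup()
{
    # Return to the start directory and remove the scratch directory
    cd "$WD"
    rm -rf "$TMP" >/dev/null
}
trap cleanup EXIT
set -e

mkdir "$TMP"
cd "$TMP"
unzip -q "../$FN"
cd word
# Overwrite every w:ascii and w:hAnsi font attribute
xmlstarlet ed -u '//@w:ascii|//@w:hAnsi' -v 'Courier New' styles.xml > newstyles.xml
mv newstyles.xml styles.xml
cd ..
# Rebuild the archive from the patched tree
rm "../$FN"
zip -rq "../$FN" *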

Yes, manipulating the file directly is an appropriate way if you know the needed details. I mostly try to avoid it (except for repairs), and I never did it with docx.
In this specific case only a font name is involved. If many style attributes or complete styles need to be changed or updated, using a LibO file as a platform may be preferable.

It is very easy to define font name when converting from plain text files.

LibreOffice help includes an example for conversion from plain text file:

--infilter="Text (encoded):UTF8,LF,,"

The help didn’t specify what the missing parameters after LF were (I have now added that, and it will go into the next help version), but here they are:

  1. UTF8 is encoding used to decode the file.
  2. LF is line ending format (CR and CRLF are the other allowed options; if missing, CRLF is used on Windows, and LF on all other platforms).
  3. Font name.
  4. BCP 47 Language tag.

So, the command line could be like this:

soffice --infilter="Text (encoded):UTF8,,Courier New,en-US" --convert-to docx path/to/file.txt

to convert a UTF8-encoded plain-text file with default line endings, using Courier New font, and English (USA) language for the imported text.
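As a rough sketch, the four tokens can be assembled in a small script and applied to several files at once. The helper name build_infilter is made up for illustration and is not part of LibreOffice; the loop also assumes that soffice is on the PATH and that each .docx should land next to its source file.

```shell
#!/bin/bash
# Sketch: build the "Text (encoded)" filter string from its four tokens.
# build_infilter is a hypothetical helper, not a LibreOffice command.
build_infilter() {
    local encoding="$1" line_ending="$2" font="$3" lang="$4"
    printf 'Text (encoded):%s,%s,%s,%s' "$encoding" "$line_ending" "$font" "$lang"
}

# Convert every .txt file named on the command line; does nothing without arguments.
for txt in "$@"; do
    soffice --infilter="$(build_infilter UTF8 "" "Courier New" en-US)" \
        --convert-to docx --outdir "$(dirname "$txt")" "$txt"
done
```

An empty token (as for the line ending here) leaves that setting at its default.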

A side note

LibreOffice is quite smart about the default fonts it uses when they may not be available on other systems. For instance, for Liberation Mono, it defines a substitute font in the generated docx (see word/fontTable.xml), which is Courier New, as well as the font properties (fixed-pitch “modern” font), which allows proper substitutions to be found on any system.
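To check what a given docx actually declares, something like the following could be used. It is only a sketch: list_alt_names is a made-up helper, and it assumes the substitute is written as a w:altName element with a double-quoted w:val attribute.

```shell
#!/bin/bash
# Sketch: print the alternate (substitute) font names found in fontTable.xml read from stdin.
# list_alt_names is a hypothetical helper; assumes <w:altName w:val="..."/> elements.
list_alt_names() {
    grep -oE '<w:altName w:val="[^"]*"' | sed -E 's/.*w:val="([^"]*)"/\1/' | sort -u
}

# With a .docx path as first argument, inspect its font table; otherwise do nothing.
if [ -n "${1:-}" ]; then
    unzip -p "$1" word/fontTable.xml | list_alt_names
fi
```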

@mikekaganski: Thanks. Concerning the sole definition of a font to use, I was very discontent with my suggestion (which was sketched for also loading a page style originally).
However the poor documentation of filter parameters and of .uno-commands remains an annoying issue.

Thanks! This is just what I was hoping for. Unfortunately, my results are mixed and ultimately still better with xmlstarlet.

The difference seems to be that --infilter only changes the PreformattedText style; it leaves Default set to Liberation. This seems to be interpreted differently in different contexts.
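A quick way to compare the two variants is to list the font names that each file's styles.xml refers to. This is a sketch only: list_fonts is a name I made up, and it assumes the attributes are written as w:ascii="..." and w:hAnsi="..." with double quotes.

```shell
#!/bin/bash
# Sketch: list the distinct w:ascii / w:hAnsi font names in WordprocessingML read from stdin.
# list_fonts is a hypothetical helper name.
list_fonts() {
    grep -oE 'w:(ascii|hAnsi)="[^"]*"' | sed -E 's/^[^"]*"([^"]*)"$/\1/' | sort -u
}

# With a .docx path as first argument, inspect its styles; otherwise do nothing.
if [ -n "${1:-}" ]; then
    unzip -p "$1" word/styles.xml | list_fonts
fi
```

If Liberation still shows up in the output for the --infilter version, that would match the behaviour described above.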

For example, Word on Win10 comes up in “reader mode” by default and it displays the --infilter version in a proportional font (the xmlstarlet version is displayed in mono). Both versions do display correctly (i.e. in mono) when you switch to “print mode”.

On Android, I’ve tested with both WPS Office and Polaris Office. Polaris behaves like Word/Win10 (proportional with --infilter, mono with xmlstarlet). WPS, unfortunately, appears to do the wrong thing always.

Adding to the confusion in my testing, I discovered that LO by default on my laptop running Arch uses Courier New but my desktop running Manjaro uses Liberation (both are fully updated). I’m not sure why; haven’t found any config or package difference so far.

Anyway, I guess I’ll be sticking with xmlstarlet for now, but thanks anyway for this information!

(Don’t think it’s OS dependent - except for the syntax of the template’s pathname, of course.)
You need to load the character style you want from a template during the process.
See the recent related thread “Opening text docs in OpenOffice” on the Apache OpenOffice Community Forum, that other very valuable forum. My answer there may help you.

===Edit 2018-11-14 22:20 regarding the comments===
I made a demo to better explain what I meant. It is attached here as an archive (the .ods extension is fake; remove it) containing a template, two plain-text files as examples, and a spreadsheet file in which a parameter range is in B2:C4 and some scripts are in the module scripts of the Standard library.
Extract the files to a common empty folder and open the spreadsheet file. Adapt the pathnames to your actual situation and save the .ods again. A command line for Windows is now prepared in cell A20. Adapt its elements to the needs of a Mac.
If you now run the command from a terminal, you will need to respond to a prompt. This is unavoidable for security reasons when code from a file is to be run. An alternative is to move the scripts module to the local Standard library and to change location=document to location=application in the query part of the command URI.

thanks, but I don’t see how to do this programmatically (I’m doing it from a Makefile)?

I did try to make an .ott and load it with -n, but it didn’t work: -n causes a Writer window to open and doesn’t seem to affect --convert-to. Adding --headless (which, I know, shouldn’t do anything) causes it to hang (I need to ^C it).

I know I can load the new .docx into LO and edit the font manually (or with a macro) but my question is whether it can be done programmatically.

What I suggested didn’t require loading a .docx and changing anything there manually. It requires running some code based on the LibO API and/or .uno: commands (these again executed with the help of an API service). If it were my job, I would probably use a spreadsheet as a kind of batch. Each plain-text file would be opened, reformatted by loading styles from a template, and then stored in the target format. That’s a conversion, isn’t it? I don’t think it’s feasible with a CL option.

I guess I didn’t read it carefully enough. I’ll take another look. In the meantime though I came up with another way to fix the problem; I’ll add a new answer.