How to convert CSV from command line and ignore first lines?

Hello,

I’m trying to convert XLXS to CSV with the following command :

libreoffice --headless --convert-to 'csv:Text - txt - csv (StarCalc):59,34,12,10' --outdir output xlsx

I’m using LibreOffice 7.2.6.2 20(Build:2) and according to the following documentation CSV Filter parameters, i can ignore first lines with the token “Number of First Row”.

In the previous command this token is equal to 10 and i expected the first line of the generated CSV to be the 10th line of the XLXS.
Unfortunately this token seems to be ignored and the first line of the CSV is the same than in the XLXS.

Note that is not work either with ODS or even CSV as input file.

I confirmed that in spite of what it looks like the docs are saying, the 10 has no effect using the following:

Dim Propval(1) as New com.sun.star.beans.PropertyValue
Propval(0).Name = "FilterName"
Propval(0).Value = "Text - txt - csv (StarCalc)"
Propval(1).Name = "FilterOptions"
Propval(1).Value ="44,34,0,10,1"   'ASCII  59 = ;  34 = "  44 = ,

This is not trivial. If working, it would allow a person to have header info for a calculated sheet…all manner of details in some “chrome” at the top…that would export as the CSV of a derived data table lower in the sheet. Of course, a person could use BASIC and do a file output from a specific range of cells, composing the result cell-by-cell.

Trying to export an arbitrary sheet to csv using your code, I get error

BASIC runtime error.
An exception occurred 
Type: com.sun.star.io.IOException
Message: SfxBaseModel::impl_store <file:///tmp/export.csv> failed: 0x81a(Error Area:Io Class:Parameter Code:26) /home/buildslave/source/libo-core/sfx2/source/doc/sfxbasemodel.cxx:3202 /home/buildslave/source/libo-core/sfx2/source/doc/sfxbasemodel.cxx:1775.
ThisComponent.storeToURL("file:///tmp/export.csv", Propval())

Here’s the code as it sits in production:

 Sub ExportToCsv
	Dim Document as Object
	Dim FileDirectory as String
	Dim Sheets as Object
	Dim NumSheets as Integer
	Dim index as Integer
	Dim Filename as String
	Dim FileURL as String
	
    Document = ThisComponent
	
    Rem  Use the global string tools library to generate a path to save each CSV
    GlobalScope.BasicLibraries.loadLibrary("Tools")
    FileDirectory = Tools.Strings.DirectoryNameoutofPath(document.getURL(), "/")

    Rem  Work out number of sheets for looping over them later.
    Sheets = document.Sheets
    NumSheets = Sheets.Count - 1

    Rem  Set up a propval object to store the filter properties
    Dim Propval(1) as New com.sun.star.beans.PropertyValue
    Propval(0).Name = "FilterName"
    Propval(0).Value = "Text - txt - csv (StarCalc)"
    Propval(1).Name = "FilterOptions"
    Propval(1).Value ="44,34,0,1,1"   Rem ASCII  59 = ;  34 = "  44 = ,

    For index = 0 to NumSheets
        Rem  For each sheet, assemble a filename and save using the filter
        If Instr(1,document.Sheets(index).getName,"_")<>0 Or document.Sheets(index).getName="Daily" Then
	        Document.getCurrentController.setActiveSheet(Sheets(index))
	        Filename = FileDirectory + "/" + Sheets(index).Name + ".csv"
	        FileURL = convertToURL(Filename)
	        Document.StoreToURL(FileURL, Propval())
        End If
    Next index

End Sub

It’s composed of work by others…mostly https://www.briankoponen.com/libreoffice-export-sheets-csv/.

As the help text says “Row number to start reading” this is for importing CSV (line number instead of row maybe would clarify, and marking it explicitly for Import). There is no parameter to export a spreadsheet to CSV starting from a given row number.

2 Likes

Indeed it says so; however, I still filed tdf#148943, because at all other positions where that matters, there’s an explicit uniform distinctions “CSV Import”/“CSV Export”, and also it’s possible to read “Row number to start reading” as “… to start reading from spreadsheet on export”.

1 Like

Thanks for your answers and for the issue opened in the bug tracker.
I ended up with a tail command to ignore the first lines:

libreoffice --headless --convert-to 'csv:Text - txt - csv (StarCalc):59,34,12' --outdir output test.xlsx
#Ignore the first three lines
 tail -n "3" output/test.csv > final.csv

Note that while it may be enough in a specific use case, it is not strictly correct, since CSV technically may contain newlines as part of the data. (A note mainly for completeness, and to warn users reading this to solve their similar needs.)

Exact, this is just a workaround for my specific use case

While that is certainly the easiest solution, tail -n 3 outputs the last three lines. To ignore the first two lines (and thus start at line 3) use

tail -n +3