Ask Your Question
0

Converting DBF to CSV is changing the output format of date columns

asked 2019-11-14 21:33:56 +0100

protista gravatar image

updated 2019-11-14 23:03:30 +0100

I have been digging around on here to develop a way to convert a DBF file to a CSV file and am fairly close to the desired output but am having some strange output mismatches once the conversion is completed. In particular with date columns. We currently have a windows utility "WhiteTown wizards dbf2csv" we are moving away from and are hoping to get this accomplished on a new Linux host with Libre Office.

There is a header row that we do not need to keep in the output of the csv from the dbf file and there are dates in the columns that are coming out wrong: 14,24,28,30,31,33,36,42,43,44,45,80,129

The native format in the dbf file for today's date would look like 20191114 (yyyymmdd) however, after converting my file with the following command the date in the csv would look like ,11/14/19, (mm/dd/yy). We need to maintain the yyyymmdd format. I have tried to adjust for not converting the header and targeting one date column to start with by running a command that looks like this: libreoffice6.2 --headless --convert-to csv:"Text - txt - csv (StarCalc)":"44,34,76,2,14/2" file.dbf

I believe the arguments I am passing are as follows based on this document link:https://wiki.openoffice.org/wiki/Documentation/DevGuide/Spreadsheets/Filter_Options

  1. Field Seperator : 44 | ,
  2. Text Delimiter: 34 | "
  3. Character Set: 76 | Unicode (UTF-8) (have tried 0: system and 12: ISO-8859-1 (Western))
  4. Number of First Line: 2
  5. Cell Format: 14/2 | Column 14 / Format Text

Is there something I am doing wrong that seems to be preventing my from getting the desired output format (yyyymmdd)? Any advice would be appreciated, please let me know if I can provide any additional details as well.

edit retag flag offensive close merge delete

1 Answer

Sort by » oldest newest most voted
1

answered 2019-11-15 00:07:19 +0100

erAck gravatar image

The "native" yyyymmdd format is just how it's stored in the DBF file. Once read into Calc it is a date serial number formatted in your locale's date representation. If you want that to be in another format you need to apply a number format in Calc, i.e. YYYYMMDD, which then is exported to CSV as well (with the 9th token defaulting to true).

The Number of First Line and the Column / Cell Format sequence are just for importing CSV files.

edit flag offensive delete link more

Comments

you need to apply a number format in Calc

... which implies, that you cannot use simple --convert-to CLI to perform that; you either need to use GUI, or create some programmatic way (macro/PyUNO/...) to do that.

Mike Kaganski gravatar imageMike Kaganski ( 2019-11-15 06:26:30 +0100 )edit

Right, or convert in a locale that at least uses ISO 8601 date representation, which still does not give yyyymmdd but unambiguous yyyy-mm-dd dates, for example en-CA English Canada

LC_CTYPE=en_CA.UTF-8 libreoffice6.2 --headless --convert-to csv:...

Yes, using LC_CTYPE is odd here, but that's how things work.

erAck gravatar imageerAck ( 2019-11-15 16:06:06 +0100 )edit

I was afraid that would be the case in regards to field 4 and 5, I had seen a handful of other posts here regarding the different conversions that recommended using them which had my hopeful it would work.
I am curious why there does not appear to be any newer documentation regarding this produced by Libre Office rather than the OpenOffice document that I have seen linked about 10 different times.

This is being run on a server with no X11 with intended purpose to have this conversion run out of a script with no human interaction. It appears some additional work or a different route to getting it down will be required. I appreciate the input!

protista gravatar imageprotista ( 2019-11-15 16:12:11 +0100 )edit
Login/Signup to Answer

Question Tools

1 follower

Stats

Asked: 2019-11-14 21:33:56 +0100

Seen: 35 times

Last updated: Nov 15