Set default date format to keep MM/DD/YYYY instead of MM/DD/YY? (.csv file consideration)

gtMan · May 12, 2023, 11:24pm

Is there any way to change the default date format to keep the full year (MM/DD/YYYY) instead of it auto shortening to (MM/DD/YY)?

I know you can select the cells and change the date format, but this doesn’t save with .csv file types. It’s become a hassle to change the cell options each time I edit the many csv files I maintain.

I used to get around this by inserting a ’ before the date, but this doesnt work when pasting in multiple lines.

Really hoping I am just missing a setting somewhere, but have search google without much luck.

Wanderer · May 13, 2023, 9:14am

You can’t save additional info with .csv It is a very basic system to exchange data, but the user is “used” to store or recognize the formatting of the data.

You may change the type of the column during import. If you don’t need to calculate with the date you could change the column from Standard to Text, showing the usual ’

Try to convert the csv-data, either putting them im a database, or in other files.

Lupp · May 13, 2023, 9:13pm

To avoid 2-digit-year dates is always a good idea. Even much better is it to use the ISO 8601 format YYYY-MM-DD in the sheets, and all the more so when exporting to csv.The main advantages are:

unambiguous globally
telling not only the date value, but also “I am a date”
sortable also if in the textual representation

After all we also tell the millions before the thousands: Most Significant First.

Fortunately there now is a locale coming with this date format set as default.
It’s English - Canada.

gtMan · May 13, 2023, 9:32pm

Unfortunately I am working with a legacy application that only works with csv file types, and not actual databases. Its look like the only option I have is to uncheck the “Evaluate formulas” checkbox when multi line import paste settings, which at least seems to persist while the file is open.

Villeroy · May 14, 2023, 11:16am

Let’s try a quick and nice solution:

Create a spreadsheet template for database reports or use one of mine: SheetReport.ots (19.3 KB)
Create a document from the template where you create cell ranges with two pre-formatted rows. Format one header row for the column labels and a second data row below the header row with numbers, currencies, date, text. Cell styles help a lot. You can put them on one sheet (separated by blank columns and blank rows) or different sheets, just as you like.
Select such a pre-formatted table, call menu:Data>Define… give a name to the table and check two extra options: “Insert and remove cells” and “Keep formatting”. You may even add formulas next to the data row, for instance =C2*TaxRate or something.
Store the equally structured csv file(s) in question in a dedicated directory (same delimiters and encoding, not necessarily the same column types in same order).
menu:File>New>Database… Connect to existing database of type: Text
Specify the dedicated csv directory and delimiters etc. Yes, register the database.
Store the database document and close it for now.

That’s it. Nothing has been converted or copied. All data are in the csv files. The registered database document just presents these files as tables of a database.

Back to your spreadsheet document, call menu:View>Data Sources [Ctrl+Shift+F4] and drag table icons from the left pane of the data source window to the top-left corner of a prepared database range. Confirm the warning message (we don’t overwrite anything so far).
The csv data imported into the pre-formatted database ranges are linked.

After you replaced an old version of a csv file with a new version of the same csv file, click menu:Data>Refresh while the cell cursor is in the right database range. You get the new data set instantly without any dialog into the pre-formatted database range.
Formattings and adjacent formulas expand automatically to the new dimensions of the current data set. If the old csv had a column header and 100 data rows in A1:F101 and the new file has only 50 data rows, the pre-formatted database range shrinks down to A1:F51 and any adjacent formulas in subsequent columns will be reduced to rows 2 to 51.

A macro to refresh all ranges:

Sub refreshDatabaseRanges()
for each dbr in ThisComponent.DatabaseRanges
    dbr.refresh
next
End Sub

erAck · May 13, 2023, 9:36am

Apply the ISO 8601 date format YYYY-MM-DD once before saving to CSV.
Otherwise when reading CSV Calc will apply the current locale’s date format, which for en-US happens to be MM/DD/YY.

Yes it does. Tick the Edit filter settings box in the SaveAs dialog and in the following Export Text File dialog tick Save cell content as shown, and take a look at the saved file. But as said, when reading the file the current locale’s date format is applied. The ISO 8601 date format is persistent.

Stewart · May 14, 2023, 7:45am

On a related note, is it possible to specify that any field that is deemed by CALC to be a date is shown in a specific format (e.g. DD/MM/YYYY)?

The fact that it defaults to a 2 digit year has been a minor irritation since I started using LibreOffice.

Villeroy · May 14, 2023, 7:53am

Localized 2-digit years is a matter of fact since 30 years at least (Excel 4 of 1992?). Changing the default to 4 digit years would break many millions of spreadsheets where the cell widths would not fit anymore, showing a garden fence ##### until you widen the column, given that the column is not protected.
Any modern spreadsheet that does not need to be compatible with the outside world would use ISO dates as default date format.

Stewart · May 14, 2023, 8:32am

Thank you for the response.
I was not trying to suggest this as a global world wide change of behaviour. I was thinking more of a setting that could be set for an individual installation or file whereby when CALC identifies a field as being a date it defaults to showing it to whatever format is specified. A bit like Format Cell, but at a higher level.

Villeroy · May 14, 2023, 8:35am

Too complicated. If a file format supports number formats, the numbers are formatted accordingly. If not, some default format takes place.

Stewart · May 14, 2023, 8:43am

My question was “is it possible to”, not can the system be changed. Being too difficult is not really a relevant answer.

mikekaganski · May 14, 2023, 8:51am

IMO, this is exactly what the CSV import filter could possibly improve. Indeed, this could be impossible, given that CSV import is used in many places… Anyhow, my feeling is that this specific filter could try to do more with applying cell formatting according to the detected formats - numeric, date/time, whatever. Because this is the detail which makes CSVs “un-roundtrippable”.

The idea is that the “detected format” in the CSV is the same kind of information, as the cell format explicitly specified in ODS. The file format has no other way to express that, but when it was possible to detect - with the help of the user who specified locale, or column properties, or “detect special numbers” in the import - it should be treated the same way.

Villeroy · May 14, 2023, 9:08am

It might be possible to add new, convoluted setup options to the office suite, yes. It is also possible to load csv into pre-formatted spreadsheets which involves a database. CSV is a database exchange format but spreadsheet users hate to set up databases.
https://forum.openoffice.org/en/forum/viewtopic.php?t=23727

Villeroy · May 14, 2023, 9:58am

Well, IMHO spreadsheet formatting does not matter as much as correct values. In the rare cases where formatting really matters, I just apply some.

Setting up a registred text database and dragging a table/query icon into a preformatted sheet is very easy. Takes no more than 3 minutes, to be done only once for a distinct flavour of csv.

However, it may be possible that LibreOffice’s csv driver delivers wrong text values because it adheres to the global locale. In this case you need either a true SQL database doing the conversion or a refresh listener listening on the import range and converting the imported values.

mikekaganski · May 14, 2023, 10:11am

Consider the use of CSV.

For any file format, the most fundamental use is: open it (allow user to see the content in a form that is comprehendible / adequate); allow to edit it (if applicable); and allow to save it.

It is a normal practice, that change of data must be an explicit action (compare to the “cell formatting never changes cell data” principle).

Now look what a CSV user has.

One gets a CSV. They don’t have a slightest idea what “CSV” is (they can even have such a file with a wrong extension like XLS). They google how to handle it; they learn that LibreOffice can; they have it, or gladly install it.
They try to open the file in LibreOffice. LibreOffice shows an import dialog. This may scare users already, but this step is justified, because the specific format really misses many aspects that would be required to convert it to a spreadsheet. So users are asked for these aspects.
They somehow manage to provide the wanted details.
Now they do the edits they need, and save.

Any sensible user would expect, that at this point, after pressing Save button, they have a CSV having the same structure as the original, with the edits made. Why? Because they had provided everything that the program needed to make that happen. Did they? Yes they did, at step 3. And if the program can’t handle that information properly, this is the program’s fault, not the user’s.

There may be technical challenges here (actually, only coming from other uses of the same code, which may complicate things internally), but all the functionality is there. What prevents us from formatting all the cells with detected number format? Do you see how e.g. percent, currency, ISO dates get the format applied correctly? The filter can do more, there is nothing preventing it technically. Unless this breaks the important use. So the whole question is: why don’t we do that. The detection happens, but we simply discard that information - is that justified?

Compare to opening an ODS from e.g. me. I can use Russian-formatted dates. You would get the spreadsheet with Russian-formatted dates, because the file has that. But they would be the correct dates. And all the math with them will be OK.

Now you get a CSV. It has that 05/14/2023 in it. You told the import filter something, and the filter could figure that that string means 2023-05-14 date (serial date 45060). In the process, it indeed knew that the first was 2-digit month, then was a slash, then a 2-digit day, then a slash, then a 4-digit year. It used that knowledge. What prevented the filter, besides writing the proper serial date to the cell, also to apply the detected MM/DD/YYYY format? What would break for the user who opened it this way?

And without all that, when we save back, the user gets 05/14/23 in the file. Remember the “change of data must be an explicit action” principle? The change of data happened here silently. And it could be prevented, if the program was smarter. It didn’t require the user to know some unexpected things that required the user to do more to keep their data, beyond the things they already cared to do at step 3.

Villeroy · May 14, 2023, 10:35am

I would prefer a vanilla csv, expecting that any csv processor on the planet can deal with it.
If it is a date, store as ISO date (which LO preserves anyway).
If it is a time, store as ISO time.
If it is a boolean, store as English TRUE or FALSE.
If it is a decimal, store as English point decimal with no thousands separator, let alone currency symbol or percent.
Wrap strings in double-quotes.
Escape double-quotes in strings with double double-quotes.
Use comma as column separator.

mikekaganski · May 14, 2023, 10:37am

Sorry, do you say that “you would prefer, that after opening a given CSV, and saving it back, you get a CSV of a different structure”? Based on a obviously false claim that “any csv processor on the planet can deal with it”?

A CSV is not an abstract thing. It is always part of a specific workflow. And anyone’s ideas how that should had been implemented is never important: what is important is that the file on entry is the same structure as on exit, unless the user made explicit actions to change that.

Again - the same as with ODS. If you wrote that you preferred my explicit formats to be discarded by default, it would be nonsense. The program should allow you to re-format. But not do it by default.

Villeroy · May 14, 2023, 10:40am

Yes, in order to preserve data consistency (if any consistency was existent when opening the file). Csv does not store any formatting attributes. Feel free to produce well formed csv.

mikekaganski · May 14, 2023, 10:41am

Yes it does. Simply because a string 05/14/2023 is different from 05/14/23, and it is detectable. At the point the program converted the string into the serial date, it had the knowledge about the format. And if that required user intervention, it is even more important - to keep this user knowledge, and not require them to do the double work.

Try to read the CSV RFC. And think about the fact, that it explicitly stated that it tries to summarize some of the established pre-existing practices.

Villeroy · May 14, 2023, 10:49am

It is detectable if the user sets the correct locale, otherwise “15. Mai 23” could not be detected in non-German environments. In some cases, the date is detectable if the user declares the column explicitly (eg. “20230514” as YMD) AND if the user checked “special numbers” anyway, an option which causes the vast majority of csv related problems.