csv is always text, if the data is still text after the import, you have imported with the wrong options!
Importing your data properly is by far easier than writing a repair macro. You could also use your Python skills to preprocess the text file and convert the strings to ISO strings. But even then, there is a most simple rule: ALWAYS CHECK OPTION "DETECT SPECIAL NUMBERS" (this option should not even exist, let alone turned off by default)
Actually this option (ALWAYS CHECK OPTION “DETECT SPECIAL NUMBERS”) was disabled here for me and it was generating the error
And what would be the best approach for the locale? Set it static like this?
locale = uno.createUnoStruct('com.sun.star.lang.Locale')
locale.Language = 'pt'
locale.Country = 'BR'
locale.Variant = ''
An unsolicited remark:
Dates should always be strings (text) that comply with ISO 8601 using the separator "-"
between YYYY, MM and DD to make the date human readable, and to distinguish it from ordinary numerical text.
This is also the only way to get dates reliably and globally transferable using csv formats.
Any numeric representation formatted as date
depends on a NullDate
, and reading a date string from a csv (or alike) using the option “Detect special numbers” will apply the NullDate
currently set for the importing application. A later transfer as a number to a sheet using a different NullDate
will fail without notice. Cells of Writer tables having set a number format produce an additional mess.
(I just tried it again using a LibO V 7.5
sheet document where I had chosen the 1904-01-02 NullDate to make sure that my statement is correct.)
Thus:
- Import dates as strings.
- If they are complying with the mentioned ISO based format, you can even use them directly in calculations (automatic conversion).
Unfortunately this automatic conversion is currently (V 24.8.0) broken for pre-Gregorian dates."1582-10-15" - 1582-10-04" = 11
. Should be 1, imo, as youi get it using the DATEVALUE() function. - In different cases make sure that you know the used format, and convert it to ISO by formulas or user code.
- No program can decide reliably if 2/1/23 is meant to mean YYYY-01-02 or YYYY-02-01 or what century should be assumed.
The only possible NullDate
settings which might be globally introduced are the first day for which the Gregorian calender was defined (1582-10-15) or the day before (1582-10-04). I havent an Excel at hand, but I was told that it still can’t handle dates with a negative representation as numbers correctly. Please tell me if this has changed.
The difference between pre-Gregorian
and proleptic Gregorian
also seems to be not quite clear. (I’m using "pre-Gregorian"
as "Julian"
. However, Calc is treating 1200 correctly as a leap year the Julian way.
See also tdf#144699 created by @mikekaganski.
(Sorry for the long text. The situation was made complicated already in old times.)
… or 11, as suggested by tdf#144699
Anyway, evaluating the mentioned difference the automatic conversion and the explicit conversion using the DATEVALUE()
function should yield the same result.
In addition my opinion: Proleptic Gregorian
in the sense of =TEXT(DATEVALUE("1582-10-15") - 1; "YYYY-MM-DD")
returning 1582-10-14
is a very bad idea even if there should be a “standard” to this effect. As we know that’s an “undate” while “1200-02-29” is a real date in history. As we know standards also can be insufficiently considered (as 1899-12-30=0 was, e.g.).
Really? And 1582-10-15 is a date? In which specific country?
No, that’s the only reasonable idea out there. Pretending that 1582-10-15 is some special date is telling that a specific event in a specific country is more important than any other country, while using a specific calendar irrespective of any historic event (just because of its objective qualities) is a neutral technology.
Factually the Gregorian calendar was accepted as a “global” calendar by next to all countries and cultures. Even all the “orthodox” christian societies which still calculate their religious holidays the Julian way, accepted it. So did “anti-papal” societies at different times, so did Japan and China, and so did Saudi Arabia in 2016. As far as I can see, also countries which didn’t introduce it explicitly use it factually.
Of course. And that is why it is reasonable to be used in technology like LibreOffice, as the primary calendar.
But the date of 1582-10-15
in this calendar is just a historic artifact, not part of the technology. And treating it specially (a) creates technical problems; and (b) introduces a historical (in this case, only meaningful for a single place in the world) artifact into the technology. E.g., I could possibly argue (indeed, I won’t, but see the point) that since I use it in Russia, let me use Gregorian since the date when it was introduced here (1918-02-14), and Julian for the dates before, so that my use of dates corresponds to the dates in documents of my country?
Let’s not introduce additional and meaningless artifacts - it’s no better than the use of dates like 01/02/03 in the first place.
I can’t agree, but I also don’t want to argue any longer.
Sigh.
Remains the question of consistency (implicit conversion vs, conversion by dedicated function).
Hallo
For your FormulaArray you need to cast the date_strings into iso-formatted datestrings
strftime + strptime
@Villeroy is right, the CharLocale may be different than some number-format-locale
from datetime import datetime as dt
def test_date_format(*args):
doc = XSCRIPTCONTEXT.getDocument()
sheet = doc.getCurrentController().getActiveSheet()
source = sheet['A2:A30']
data = source.DataArray
iso_dates = [[f'{dt.strptime(entry,"%d/%m/%Y"):%Y-%m-%d}'
for entry in row ]
for row in data ]
numbers = doc.NumberFormats
locale = doc.CharLocale
number_format_key = numbers.queryKey('DD/MM/AAAA', locale, True)
target = sheet['D2:D30']
target.NumberFormat = number_format_key
target.setFormulaArray(iso_dates)
edit: 2024-08-22 changed the date-convert to f’string’-formatting
This is an approach that is also very useful when I import the contents of a csv file via python and write it to the spreadsheet.
Thanks
convert libreoffice date to python date:
I couldn’t find any BASIC solution, so here it is if someone wants it,
Sub FormatRangeAsNumber(oSheet, oLocale, ByRef oFormats, oRange, formatStr as String)
' BASIC equivalent of 'Text to Columns'
Dim formatNum as Long, i as Integer
Dim numberArray()
If oRange.getCellByPosition(0,0).getValue() = 0 Then
formatNum = oFormats.queryKey(formatStr, oLocale, False)
' add formatStr if it doesn't exist
If formatNum = -1 Then
formatNum = oFormats.addNew(formatStr, oLocale)
If formatNum = -1 Then
MsgBox "Cannot add " & formatStr & " as NumberFormat", 0, "Fatal"
Exit Sub
End If
End If
numberArray = oRange.getFormulaArray()
For i = LBound(numberArray) to UBound(numberArray)
' Convert string to actual date values
' You may have to parse and set it to numberArray(i)(0), if it's different from locale set in libreoffice
numberArray(i) = Array(CLng(CDate(numberArray(i)(0))))
Next
oRange.setDataArray(numberArray)
oRange.NumberFormat = formatNum
End If
End Sub
This function can be called as follows,
Dim oLocale as new com.sun.star.lang.Locale
oLocale .Language = 'pt'
oLocale .Country = 'BR'
FormatRangeAsNumber(oSheet, oLocale, oDoc.getNumberFormats(), someRange, "DD/MM/AAAA")
Basic is a problem, but the solution in this case. Just learn how to import csv data properly, and you never need a macro to fix wrongly imported data.
Your code won’t work in most cases of failure.
So your code relies solely on Basic’s CDate
, where you simply can’t choose anything like locale used for the conversion? Then for your case, the original setFormulaArray
approach would work better.
And generally, it would work better from the start, if you use an intermediate locale for cells for conversion, and then apply another final locale for representation.
I want to import csv properly, but I haven’t decided on a tool for preprocessing. I want this macro to run headless and regularly, so the import wizard is out of the question for now.
@mikekaganski I tried setFormulaArray
method, but it doesn’t seem to work for me. It just re-sets the same strings again. I set it after setting the NumberFomat.
Only search and replace seems to do the trick,
Dim oReplace
Dim oLocale : oLocale = new com.sun.star.lang.Locale
Dim formatStr : formatStr = "DD/MM"
Dim oFormats : oFormats = ThisComponent.getNumberFormats()
Dim formatNum as Long : formatNum = oFormats.queryKey(formatStr, _
oLocale, False)
oReplace = bookRange.createReplaceDescriptor()
If formatNum = -1 Then
formatNum = oFormats.addNew(formatStr, oLocale)
End If
' "19/08/2024" changes to "'19/08/2024" as the type is still string
bookRange.NumberFormat = formatNum
With oReplace
.searchString = ".+"
.replaceString = "&"
.SearchRegularExpression = True
End With
' After replace, the date type gets inserted
bookRange.replaceAll(oReplace)
And this is much simpler than the CDate one, thanks @Villeroy .
With Calc, you don’t need any pre-processing. It can import all kinds of reasonably valid csv.
How to import csv into Calc by macro:
- Open your specific flavor of csv manually with locale, char set and all options properly set.
- Run this code against the active csv document:
Sub showFilterOptions()
Dim args(),i%
args() = StarDesktop.CurrentComponent.getArgs()
for i = 0 to uBound(Args())
if args(i).Name = "FilterOptions" then inputbox args(i).Name,"",cStr(args(i).value)
next
End Sub
- Copy the FilterOptions into your code which may be as simple as this:
dim p(1) as new com.sun.star.beans.PropertyValue
p(0).Name = "FilterName"
p(0).Value = "Text - txt - csv (StarCalc)"
p(1).Name = "FilterOptions"
p(1).Value = "your_specific_filter_options_string"
doc = StarDesktop.loadComponentFromURL(convertToURL(sysPath), "_blank", 0, p())
- Select the column of wrongly imported date strings
- Call the number format dialog and change the number format language for the selected cells to “English (UK)” or any other non-US English.
- menu:Edit>Find&Replace…
3.1 Check “Current Selection Only”
3.2. Check “Regular Expression”
3.3. Search for:.+
3.4. Replace with:&
3.5. Button [Replace All…]
Now you have converted the strings into numbers. Apply any number format in any number format locale you want.