Macro Calc: Convert column to date - similar "Text to Columns..."

How to make a macro similar to the behavior of the command Main menu> Data> Text to Columns…

Have a lot of data that comes from a csv, some columns are dates but they are in text format. I need to convert to date format to do some sorting.

Via macro I was able to modify the column format to date, but the content is not changed by this formatting modification. Now I need a macro that does the same function as the “Text to Columns…” command, I tried using setFormulaArray to do this but the result was not as expected, some cells were in date format others in text format.

def test_date_format(*args):
    doc = XSCRIPTCONTEXT.getDocument()
    sheet = doc.getCurrentController().getActiveSheet()
    date_column = sheet['D:D']
    
    #Modify column D to date format (DD/MM/YYYY), it works OK
    #number_format_key = 36
    numbers = doc.NumberFormats
    locale = doc.CharLocale
    number_format_key = numbers.queryKey('DD/MM/AAAA', locale, True)
    date_column.NumberFormat = number_format_key

    #does not work as expected, some cells are in date format, others in text format
    content_range = sheet['D2:D30']
    content_value = content_range.DataArray
    content_range.setFormulaArray(content_value)

test_date.ods (31,2,KB)

Would anyone know how to do this, can it be vba or python macro.
Thanks

setFormulaArray:
https://forum.openoffice.org/en/forum/viewtopic.php?p=346911&sid=91487894ffe209a1c1dbd88975497cad#p346911

Create a c.s.s.util.ReplaceDescriptor and replace regex .+ with & within the text cells in question.

What makes you sure that the document’s char locale is the right one? Your macro would fail with most of my documents.

1 Like

date_conversion.ods (34.5 KB)
has some disjunct ranges with non-US English date strings (DD/MM/YYYY) but with USA number formats (style “en_US”). Some of them show a leading apostrophe in the formula bar, some don’t. This is because the text "01/02/2023 evaluates to second of January in US context whereas “13/02/2023” does not evaluate to anything in US context.

The Basic macro finds the date strings, applies cell style “en_UK” and replaces the found strings with themselves. This is like re-entering all the strings into the UK context.

csv is always text, if the data is still text after the import, you have imported with the wrong options!

1 Like

Importing your data properly is by far easier than writing a repair macro. You could also use your Python skills to preprocess the text file and convert the strings to ISO strings. But even then, there is a most simple rule: ALWAYS CHECK OPTION "DETECT SPECIAL NUMBERS" (this option should not even exist, let alone turned off by default)

2 Likes

Actually this option (ALWAYS CHECK OPTION “DETECT SPECIAL NUMBERS”) was disabled here for me and it was generating the error

And what would be the best approach for the locale? Set it static like this?

locale = uno.createUnoStruct('com.sun.star.lang.Locale')
locale.Language = 'pt'
locale.Country = 'BR'
locale.Variant = ''

Hallo

For your FormulaArray you need to cast the date_strings into iso-formatted datestrings
strftime + strptime

@Villeroy is right, the CharLocale may be different than some number-format-locale

from datetime import datetime as dt

def test_date_format(*args):
    doc = XSCRIPTCONTEXT.getDocument()
    sheet = doc.getCurrentController().getActiveSheet()
    source = sheet['A2:A30']
    data = source.DataArray
    iso_dates = [[dt.strptime(date_string,'%d/%m/%Y').strftime('%Y-%m-%d')
                 for date_string in row] for row in data]
    
    numbers = doc.NumberFormats
    locale = doc.CharLocale
    number_format_key = numbers.queryKey('DD/MM/AAAA', locale, True)

    target = sheet['D2:D30']
    target.NumberFormat = number_format_key   
    target.setFormulaArray(iso_dates)
1 Like

This is an approach that is also very useful when I import the contents of a csv file via python and write it to the spreadsheet.
Thanks

convert libreoffice date to python date: