I have a text-only document that is from a french source. I tried to import into columns but the data is only separated by spaces and will not fall into the appropriate columns, the only delimiter is the space. I have a need to convert the included dates into a usable english date format. Presently, they are formatted as DD-MMM-YY so a date looks like 20-août-02 for August 20, 2002, or as 14-mai-14. In french, the month abbreviation is not always 3 characters
How can I change the month into a numerical value so I can use it with Date(ValueYr, ValueMo, ValueDy)?
Well, I created a test file with contents:
word1 word2 word3 20-août-02 word5 14-mai-14 word7
word8 20-août-02 word10
and then replaced newlines (regex \n
) with <LF>
, after which replaced spaces with \n
, which gave me
word1
word2
word3
20-août-02
word5
14-mai-14
word7<LF>word8
20-août-02
word10
Then I saved this as a .csv, and opened in Calc. In conversion dialog, I chose language=French, and Detect special numbers
, and selected the only column’s type to date (DMY). Then I have set the required date format to the column, and saved the file. After which I only had to revert the changes (replace newlines with spaces, followed by replacing <LF>
with newlines).
But this would break any numbers there if they were. Depending on your data, more careful processing might be required.
Thanks for the reply. Still a lot of work. Looking around the net, I found several articles that gave me a clue to the method to proceed.
Source document was a multi-page PDF with a poorly formed table containing the many columns of data. Some cells were blank, some had several lines of text. Had to manually align data into columns.
Once I had the dates in a single column, I used the MATCH formula to give me a value for the french month.
MATCH(monname,{"janv";"févr";"mars";"avr";"mai";"juin";"juil";"août";"sept";"oct";"nov";"déc"},0)
I entered the following formula into F3 and copied down
=DATE(VALUE(RIGHT(E3,2)),MATCH(MID(E3,4,LEN(E3)-6),{"janv";"févr";"mars";"avr";"mai";"juin";"juil";"août";"sept";"oct";"nov";"déc"},0),VALUE(LEFT(E3,2)))
VALUE(RIGHT(E3,2)) - gives the numerical value of the year
VALUE(LEFT(E3,2)) - gives the numerical value of the day
MID(E3,4,LEN(E3)-6) - gives the middle text starting at the fourth character (3 characters on each end are not needed)
The last formula is put as ‘monname’ in the MATCH Formula.
Used ‘Paste Special as… Date and Time’ to copy entire column into G formatted as I want. Once verified, Columns E and F were deleted.
If you get the data in one column, then the proper tool is Data->Text to Columns. It allows you to convert this in one go.
Just a comment. Questions like this (and many others related) would be less frequent and less necessary if there were more emphasis in LibreOffice on using ISO standards for date formats and less emphasis on using national or local ad hoc “standards”. While users often have no control over the format of the data they must work with, it would help in the long term to reduce problems if there were more encouragement generally for businesses and developers to use ISO standard formats.
@ve3oat: This is just a wishful thinking. Doing that only makes users search for other solutions that do what they need, not what sw dictates.
You are right, Mike. The software should never dictate, at least within reason. But people should by nature want a standard format to avoid confusion, and that has to start with early education. I believe that a useful format should contain no ambiguities. As a genealogist, I struggle almost every day with other people’s day-month/month-day and 2-digit years. There is no end to it.