Formatting of external data

Hi everyone,
hope I can get some help here: I have some kind of similar problem like this guy:

I have full control about an html-table that needs to be imported (and updated) into LO Calc.
The table is very simple accounting stuff. But I just can’t make it work to detect the currency.
I can not use a helper table as table length his varying a lot.

Values are correctly formatted (German style with a comma) in the html and also in Calc. They are correctly identified as numbers (align right) but the currency symbol is missing and i am missing the second zero (14,10 € → 14,1)
I tried english formatting, and some other wierd stuff with html tags and styles too. Currency at the beginning, at the end. Maybe I am to stupid to see the obvious?

As I must do sum in the LO file, it is curcial that it is correctly shown as a € currency (red color is not important)

Version 7.3.7.2
Minimum (not) working example to be imported:

<!DOCTYPE html>

<html>
<body>
    <table style="font-family: Liberation Sans;">
        <tr>
            <th></th>
            <th><font face="Liberation Sans">Notiz/Name</font></th>
            <th><font face="Liberation Sans">Datum</font></th>
            <th><font face="Liberation Sans">Betrag</font></th>
            <th><font face="Liberation Sans">Distanz</font></th>
            <th></th>
        </tr>
        
        <tr>
            <td><a href="/someref"> </a></td>
            <td> 
                <font face="Liberation Sans">
                   Some test</font><br><font face="Liberation Sans">2nd row
                </font>
            </td>
            <td><font face="Liberation Sans">12.10.2020</font></td>
            <td><font face="Liberation Sans">14,10 €</font></td>
            <td><font face="Liberation Sans">
                47
            </font></td>
            <td><font face="Liberation Sans">km</font></td>
        </tr>
    </table>
</body>
</html>

The problem was resolved by restarting the application (libre office calc).
argh

— EDIT —
I think I found some buggy behaviour.
When opening the file in LO Calc including the referenced external data links, it displays some yellow bar, saying something like: “automatic loading has been disabled” and a button to enable it again.

If you click that button the reload will take some time but it will not parse/format the result properly.
How often you might try to reload it manually via the menu (Edit → external data → update) the formatting will not come back. Also not on fresh imports (delete old data and link, create new link).

When I reopen the file and do not click that button, instead I do the manual reloading process, all works fine.
Can someone confirm this behavior?

PPS: rowspan is also a not supported html property according my experiments.
Is there no list of supported/unspported tags?

Could you please upload your files (reduced or redacted, if necessary).
.
Years ago I developed a work-around this by always importing in a separate sheet and mirroring the data to another sheet by formula, so now I never see your problem. But my usual source for refresh is a database, not HTML.
.
If there is a different behaviour for the two ways you refresh, you should do a bug-report at bugzilla.
.
What would be your preferred beaviour: Calc to resemble style of the html-source or keeping all formatting in the sheet?

Hi Wnaderer,
thanks for your support.
I made a minimal working example:
Therefore I uploaded a minimal html to my server:
https://bbehlau.de/calc_external_data_test.html

When importing the data at first, everything works fine. When updating the source or manually reloading, everything works fine.
But when you close the file and open it up again, LO Calc will ask for allowance to update from external data.
If you do not allow it, it will not update anymore (not a bug but a feature).
If you do allow it, it will break formatting not detecting the currency anymore.

Test File:
upload2.ods (28.2 KB)


Answering your other questions:
I don’t mind if I format my html or format my ods. But I would prefer source formatting, as I can make sure that different size of data (# of columns / rows) would be formatted correctly by using the informations from html file.

On the other hand I can think of examples where the source and thereby styling is not within your hands. But independently how it is solved, one way you need to go. Both ways are fine, and you could even do both by implementing a checkbox on import (keep style informations).

Just the current state of no solution is unsatisfying. Especially if the format is getting lost everytime you reopen the file and reload data. This really breaks any kind of use case.


Using Data Import would be a great method too. Unfortunatley I could not find any tool for importing MYSQL- Data or doing mysql queries within Calc. I know it exists for M$ Excel, but that doesn’t help me here. Any suggestion?

I could even live with a macro updating the format before printing or with a shortcut. But I don’t want to over engineer solutions when there is (maybe) a simple bugfix or a way to format the source I just don’t know or cannot find the docs for.

Is there maybe way to load csv or somtihing like that from another source? Does it need to be html?

Holy crap. I did not encouter that it could be that stupid to push data into Libreo Office Calc.

So I tried today csv import via external data source too. It works in theory, but in theory only.
As I am German, the standard import method is German. So as soon as there is a period in the numbers they are detected as text. On the other hand, commas “,” are separators for csv files.
So I than ‘turned of’ manually the “separate on comma” checkbox when importing it at the first time.
But then the € sign again leads to text detection and puts an ’ (high comma/apostroph) at the beginning. Marking “import cells in high commas as text” doesn’t help either.

Puhhh I really need a break and a quick solution. My boss is really not amused me spending so much time on this before the aproaching deadline.

Is there any other option to push data into calc from a remote source (even mysql query would be fine) ?
.
PS: And I need the formatting as currency because of the trailing zeros !

Quick and dirty: As already mentioned by me (and Lupp in the other thread): Import to Sheet2/Tabelle2 for example, then use either simple =WENN(Tabelle2.A1="";"";Tabelle2.A1) to move data to the place, where it is needed. Or use more sophistcated formula with WECHSELN() for columns, where you need to remove special characters. (I assumed you use licalized german names for functions).
.

You open Base, and then say you wish to connect to an existing database of type text. Then you select the folder!! (not the file) where the csv-files are as database. The actual csv-files are the tables of the database. Save the .odb-file - it will not hold your data, as they stay in the csv-files, but the connection to the files (location), the type/charset etc. and simple queries, so you can rearrange data. Then you can simply drag a table or query from the database (use Datenquellenansicht, open with Ctrl-F4) to your sheet to access the data in Calc.
.

For the manual import stuff: “Detect special numbers” is your friend to read currencies as numbers AND you can set the locale for the import. You can even set the type for every column - just click on the column head. Very flexible and one of the reasons wich was my reason to abandon the Excel I found on my computer at work.
.

Remember: M$ does not include Access, so most people don’t have it and Exel provides replacements, but LibreOffice does have an integrated part (Base) to connect to databases, so you searcher at the wrong place. I recommend to use the german guide for Base, as it is quite up to date. RobertG is really doing a fine job there. After you connected to MySQL/MariaDB you can drag queries or Tables to Calc as mentioned above. Actually there is no difference for Calc, as Base hides usually the details of the datasource. So you can even change later.

1 Like

@Wanderer thank you very much for your reply.
I did not know about Wechseln function and I will try it. Still I am not really happy about using another table. I think that the functionality of import external data should be fixed in some proper way instead.
It would be great if you could e.g. confirm a misfunctionality for a proper bug report.
When format works on the first import, it should also work on any update when there is no change of the source (independently from closing and reopening the ods file).


But I really appreciate your suggestions as a proper bug report is one thing, but finishing the job is another thing.
I will report about testing Wechseln and the “reference to a second table” suggestion.

Maybe my statement was misleading. I did not import csv files or a number of csv files locally. I just used the “external data”-dialog with a csv file source (http://…/somefile.csv) instead of an html source.
Using local files is not an option to me. Those exports are created on the fly from the database. CSV, HTML is just some kind of protocol to push the data to Calc. I need to do the fine tuning for printing somewhere plus ods files (or exported pdfs/xlsx) are good to be send by mail.


Which might be true for manual import, but is very sadly not true for the external data dialog in my case. After entering an URL to an csv file and hitting enter the well known import dialog pops up. there you can only choose between standard, text, English and different dates for table columns. Of course I did turn on the “Detect special numbers”. Also tried different import languages while changing the source from period decimal delimiter to comma and vice versa. You can try it with those two csv files (csv cannot be uploaded here sorry):

http://bbehlau.de/import_separatorsemicolon_delimitercomma.csv
http://bbehlau.de/import_separatorcomma_delimitorperiod.csv


Which I will no try now. I will give you some feedback in short time. I hope Calc will not change formatting again when base delivers an updates.

Thanks again for your support and yeah maybe we can push this feature some where that someone can fix it?

Which database? Where? Can’t you connect directly to that database without using any exchange format?
[Whatever SQL] → [Base Config] → Calc/Writer

Even though this is not the way I wanted to do it, it seemed to be the best and unfortunately the only way to manage it.
The drop of formatting is now a known bug, listed in the bug repo of libre office.

I used the method of Villeroy to build a workaround (even though I came across another problem).

I hope that this can be resolved too. But that’s a question for another topic.
Thanks for your help!

A direct database connection is not a work-around. A text file (csv) is the work-around when a direct connection to some database is not possible.