Overall strategy for importing new data in .csv files into existing embedded LO database

Yes I do wonder why, given that I have already told Calc at the import stage that I am using D/M/Y dates in that column. You seem to have already forgotten point 2 in my post in this earlier thread where I already covered that point.

Public Request: Given the level of misunderstanding shown by you towards my posts in the last ten days I would prefer that you do not attempt to guess what I think, believe or wonder. Or at least if you make those guesses please do not share them in a public forum. The point you are making about this script is not improved by making a comment about me in passing, and especially so when your comment is inconsistent with what you have already read from me.

Likewise in Base there is an opportunity to choose date formats in Tools → Options → Language Settings → Language.

And turning to the script, it does not need anything of the complexity of an AI to load up a csv, it just needs an appropriate algorithm for the expected date formats or range of possible formats that are reasonably expected. That is no different from the View on the text table in your preferred method which likewise has to be hard coded in a similar way. That could mean that a script is hard coded for one particular kind of input, or it could be that there is some editable configuration variable in the script, or a prompt to the user.

1 Like

If you had dates in that column, the import into Base would not fail.
menu;:View>Highlight Values [Ctrl+F8] highlights numbers such as true dates in blue.
=A2-A1 returns the difference between 2 valid dates in days.
=ISNUMBER(A1) returns TRUE if the date is a date.

I want to add to what I said in public. As Mike pointed out to you, the issue of the date format is only one of many things that can make the paste into Base fail, or leave you with results you did not intend.

For example, see the points I made about not having a PK for the paste (the behaviour mentioned may be imporved in later releaeses: the info given is accurate and tested for LO 6.4)

The date format is only one of about a dozen points in my OP on that thread, but it is the only one you noticed.

Not only did you emphsis on just one point distract from the overall balance of the thread, but you even said that it was the most important one (or words to that effect).

Sorry, most of the steps are important, and getting error messages when trying to paste into a table with an existing PK that is not already in the csv file, or one where the rows in the file duplicate keys already existing can cause the upload to fail.

It is true that my workaround is more simple than it needs to be. You or I could figure out more than one way to upload data from an existing csv into an existing table. The reason I chose not to do that was that the intent was to make it as simple as possible for the relatively new user. As regards PKs I think my recipe will work for any dataset.

You and I are both incorrect (on a pedantic level) if we assert that setting Date and D/M/Y (etc) in the csv upload will always work. Again I oversimplified there. There exist date formats which would fail – the military dates you mentioned are an excellent example. Again that was a deliberate choice on my part: those formats are sufficiently rare that I chose to leave users with that data to ask a question when and if it failed.

It was to get away from the counter-productive (in my opinion) complexification of your comments in the first thread that I started the second one. Despite clearly indicating that the new thread was for an overview of strategies rather than detail, you chose to import complexity into the new thread.

Can you now understand why I was disappointed by your posts to my second question?

If you understand that, would you be willing to try to understand the skill level needed by the questionner in the case of most questions, and the skill level being targeted by a post like mine, which was to try to share an overview of methods to an unskilled audience?

I hope you would be willing to do so. If so, for my part I will offer to give help, advice, and support to do so by pm rather than on the public forums.

It is not easy to write for someone with a smaller skill set than our own: that abilty comes from my working life as a university lecturer (I am now retired). You can’t give a class of freshers (first year students) the same lecture you would give to final year students.

I am willing to coach you, behind the scenes, if you are willing to accept such coaching.

Even if you do not want such coaching behind the scenes, I still hope you will be willing to make the changes I suggest from your own efforts.

I hope this pm lands with you as it is intended: as an honest attempt to be helpful to you and to improve how helpful the forums are to newcomers.

Enjoy the rest of your day, whether you agree with me or not on any of the above :slight_smile:
River~~

This comment is an attempt to continue a misunderstanding from the same OP in the same earlier thread. There I had already said

It is inapproporate to try to continue an argument here that I walked away from on that earlier thread. But given that you chose to do that: exactly what part of that quote led you to think I did not already understand that? Your comment would have been inappropriate on the earlier thread.

Neither is it appropriate on this thread, where the title makes it clear that I am looking at overall strategies not the detail. I thought I had also made that clear in the OP here.

My observation is that you are, for the second time, ignoring an OP when responding to comments within a thread. I wonder how you think that will help the person who posted the OP? I wonder how that will help anyone arriving at this thread from a search on the internal search box or from a search engine.

My second personal request to you therefore is to try to keep all comments relevant to the title and the OP of whatever thread you are posting to. That helps to keep threads uncluttered.

This thread would have been better suited to the intended purpose had you not chosen to continue an argument that started elsewhere, and chosen to do so again after more than one person had commented about that.

In case it is not yet obvious, my hope was to create a concise summary from which people can choose which of the various strategies will best suit their particular needs; and yes indeed will best suit the particular data they have in their .csv.

There are four overall strategies for importing new data from .csv files into existing embedded HSQL:
1) via Calc: This applies to all databases with write access including embedded HSQL and FIrebird. It is doomed to fail if you are unfamiliar with the fundamental spreadsheet Basics which are the same since 4 decades.
2) via SQL text tables: Does not appy to embedded FIrebird because FB does not know any character separated text tables (only fixed lenght text tables). It requires moderate capabilities in SQL. You write a Base query mapping the right types to each other, store it as a view and every time when the csv file has changed you copy the view onto the target table.
3) via csv data source: You connect a second Base document to the csv source and copy from this Base document to the embedded database. I do not recommend this method, particularly when dates and numbers are involved. csv sources are OK for text only tasks such as mail merge.
4) programmatically by your own freely chosen means. This does not involve any office suite.

Any strategy requires that you are able to follow the rules that are imposed by the specific flavour of csv and the rules that are imposed by the database structure. A spreadsheet involves an intermediate layer. A spreadsheet is a simplified programming language with two data types text and numbers, Excel treates booleans as a third datatype, formulas return errors as an additional type.

Thank you very much for this post @Villeroy it provides me exactly the kind of feedback and suggestions I was hoping for :slight_smile:

As a result I have liked it :grinning:

That does not mean I agree totally with you: but I regard our remaining differences as honest professional differences, and totally repect the points you make including the ones on which we differ.

My main difference now is that from my perspective your version overdoes the doom and gloom stuff. I do not believe that the attempt is “doomed to fail” in the majority of cases where the user hasn’t got a clue about the character sets and so on, whereas you clearly do honestly think that.

The reason I disagree is that Calc and Base both use the same locale (located in Tools->Options->Language Settings->Languages) to decide how to interpret characters. Usually (and I agree, not always) that means that if the text fields look OK in Calc they will also do so in Base. In addition to that, if the locale in LO is set to the same as the locale on the computer that generated the csv file, then there is a sporting chance things like umlauts and cedillas will work the same in both systems.

It is not something I would bet my life on, but it is something where I feel comfortable encouraging the user to try it: if and only if it does not work then I would ask them to send in a sample of the file.

What I like most about your re-write of my OP is that it is more concise, and that even so you have given names to each of the four options. That makes discussing the options much easier than my approach of numbering them.

Thank you for this collaboration. Let’s do it again sometime…

Firebird 3.0 language reference
Page 127 - External tables
External Tables
The optional EXTERNAL [FILE] clause specifies that the table is stored outside the database in an external text file of fixed-length records. The columns of a table stored in an external file can be of any type except BLOB or ARRAY, although for most purposes, only columns of CHAR types would be useful.
All you can do with a table stored in an external file is insert new rows (INSERT) and query the data (SELECT). Updating existing data (UPDATE) and deleting rows (DELETE) are not possible.
…"

What a difference a day makes!

I was pleasantly surprised – astonished even – to see a post from @Villeroy that I honestly welcomed.

See Overall strategy for importing new data in .csv files into existing embedded LO database - #25 by Villeroy and my reply one post later. When I woke up this morning I never guessed I would be writing so warmly to him before bedtime.