Calc does not sort an imported .csv format bank statements correctly

My bank exports account data in .csv format with “Date” in the first column. The dates are in descending order, i.e. the most recent transaction is at the top and the earliest transaction at the bottom. I am trying to sort the whole sheet into date ascending order, i.e the earliest transaction at the top and the most recent transaction at the bottom. Calc will do this correctly provided there is only one transaction on each date.

Often, however, there is more than one transaction on some days, typically two or three and in that case Calc does not sort them correctly. It leaves the two or three transactions on each single day in their original order, i.e descending. The resultant table is then a mixture of ascending overall and descending within some days. This leaves very strange results in the balance column. I have previously used Excel and that sorts the whole table correctly. Somehow Excel must recognise that if it is sorting dates from descending to ascending it has to reverse the order of rows which have the same date but Calc does not recognise this.

Please upload an ODF type sample file here. At least with the columns Date "values.
(Maybe they are strings only.)
What is set as the second condition for sorting when the dates are same?

When importing csv always check “Detect special numbers”.

Or the “default” from Excel just fits YOUR needs. It may not be the best option for everybody. (Actually wrong imports by Excel were one of the reasons why I replaced it with OpenOffice years ago. Open-/LibreOffice is not perfect, but at least I’m asked and can preview the input.)
.
As @Zizi64 I’d ask for a sample csv. You may edit the columns with actual amounts of money and private data. We can then check, if Calc can sort this to fit your needs. If nothing else helps, add another column with the previous row number, then use this as second column to sort.

This can be seen as a feature, if you sort first with one, then with another column as condition for a sort. It preserves the first sort, wich was quite useful in older times, when we had only 3 possible columns to select. (No longer a problem in LibreOffice …)

I use to use a database for this and calculate the balance from auto-ID values (constant row numbers).

Thanks Villeroy. If I was trying to do my own accounting I would indeed use database software, e.g. Base or possibly a specific accounting application. All I am trying to do, however is create an annual record that is identical to the one held by the bank except that it reads forward in time from top to bottom. I guess I am quirky wanting to absorb information starting at the top of the page and proceeding to the bottom. I don’t want to start doing things like recalculating balances. I do want to do a small amount of data processing, e.g. calculating accumulative amounts for the year from the same source, without altering the original data, hence the use of a spreadsheet.

Villeroy, I see you’re still here with the same answers after 10 years! And after 10 years, this still doesn’t work! You’re simply creating confusion. There are no “special numbers” in a bank statement!

You are sure there was never a $ 123.45 or 456,78 € in a “bank statement”.?
.
If banks would actually follow some “standard” this could be solved. But they don’t…

This is a feature named “stable sorting”, which effectively means: “Keep everything as is when there is nothing to be sorted”.
Generate a column of constant row numbers and sort the list by this additional column.
Put a start value in the first cell adjacent to the actual data, then double-click the cell handle in the bottom-right corner while this cell is selected.

Bank statements contain dates. Quite often they contain figures with currency symbols. Choosing the right locale and checking the “special numbers” option solves the vast majority of csv related problems.

Sample File.ods (16.3 KB)

OK. Attached is a sample file in ODF format. I imported it to Calc from the csv file and set column 1 to date format, UK style, i.e. d/m/y,. I have deleted the contents of columns labelled “Transaction Type” which is just a 3 letter code that doesn’t relate to date in any way, “Sort Code”, “Account Number” and "Transaction Description"since I don’t want to reveal my bank details or details of the source of payments. Nothing in them relates to date information. The deposits, withdrawals and balance are real amounts. If you start at the bottom and read up you can see that the balance makes perfect sense. There was 500 in the account before the first transaction so a deposit of 437.77 increased the balance to 937.77. A withdrawal of 437.77 then reduced the balance back to 500. A deposit of 50 took it up to 550, a deposit of 100 took it up to 650, a deposit of 900.12 took it up to 1,550.12 and so on. All perfectly logical.

Sample File Sorted.ods (16.5 KB)

The second file is after sorting in ascending order using the first column. Notice that there is no second column that I can use to differentiate transactions on the same date. If you now start at the top and read down the first transaction is still correct but the next three, on the same day, make no sense at all. A deposit of 100 reduces a balance of 934.77 to 650, a further deposit of 50 reduces the balance to 550 and then a withdrawal of 434.77 reduces the balance to 500. This sort of garbage is repeated in various places on the sheet. It is clear that for multiple rows where the dates are the same it has kept the rows in their original order.

As far as I know it is common for banks to export account transaction data in csv format. I cannot imagine what anybody would do with such data other than import it into a spreadsheet. I cannot be unique in wanting to display data in ascending date order rather than descending? This must be a common problem?

I would challenge the idea that “it just isn’t doing what I want it to do”. If the sorting column contains dates then it is a series of events in time. Having multiple events on the same day cannot be that unusual? If the column is clearly in descending order, which it is, then surely it must be interpreted that the order of the events is from bottom to top. To change that to descending order, i.s. reverse the column, it is necessary to reverse the order of the events on the same day. Excel seems to be doing what is logical.

I am not interested in engaging in software wars. I have no affiliation to any particular software. The first spreadsheet I used was VisiCalc in the early 80s and I’ve used many variants since. My favourite was Lotus 123. I have been using MS Office for many years but swapped to Libre Office this year when I built a new PC and couldn’t afford to buy the latest version. I am happy with Libre Office and I am not criticising Calc, just trying to resolve an issue.

1 Like

I think the only solution is to add a column to the right of the balance with the numbers 1, 2, …, n, then sort the whole data range according to this column.

Thanks Steph1. I came to the same conclusion myself shortly after making my previous post. The nature of the responses so far suggested that there isn’t a solution within the software so a bodge solution in the data seems to be the only answer. I will probably insert it to the left rather than the right but that doesn’t really affect anything, I would just find it more convenient. I am actually pre-processing the data before transferring it to a bigger spreadsheet so I could then delete the column after sorting and before transferring.

I realise that the Calc software could be sorting any type of data so there is a limit to what decisions it can make itself. I could imagine a table constructed with a date column where the data is initially in a completely random order still with repeat dates. For example if somebody is just collecting data from a number of sources and simply enters them into the table in the order they obtain the information. In that case Calc could have no idea what to do about repeat dates other than group them together. It should be possible, however, for Calc to recognise when it is dealing with dates and those dates are already either in ascending or descending order. If the command is to reverse the order then I would argue the only logical thing to do is to reverse the order of all entries. Alternatively it could ask an ancillary question what you want to do with repeat entries. Perhaps it is something the designers could think about for a future improvement?

I agree with you that a “reverse” option could be interesting. The issue here is that the rows cannot be considered independently: the balance is the result of calculations between rows.

Therefore either you dynamically recalculate the balance (after sorting the rows) or you keep the rows in the initial order or reverse order. The former option is anyway the only possible solution in case of merging different groups.

The last option is to ask your bank to list the transactions in the ascending order!

I think the transactions are in time order. Ask the bank for the time field too

Many thanks for these two replies, Steph1 and EarnestAI. They kept me chortling for a whole day. I am guessing you don’t live in the UK? The idea that you can communicate with a UK bank is very funny. The idea that a UK bank would respond to a communication if you did manage to make one is hilarious. The idea that a UK bank would change its system in response to a client request it so far out there that I doubt if Mulder and Scully could find it.

I am sure you are right, EarnestAI that the transactions are also in time order and some banks do indeed provide a time column in their exported data so, for them, introducing a second sort column would indeed solve the problem. Unfortunately my bank doesn’t and, since I no longer live in the UK I cannot transfer banks.

I tried adding an index column to the left of the data then sorting on that column and I can confirm that it works as expected. Interestingly I generated the index column using an equation, e.g A2 = A1+1 etc and sorting completely screwed up the index column with lots of #Ref errors. It doesn’t matter since I don’t need the index column after the sorting is complete but it is perhaps better to create the index column in a dummy sheet and then just copy and paste the values.

I’ll sign off now. Thanks for all the suggestions everybody.

no need for Formula!
Start with some Number and pull down! incrementing Numbers by 1 is a builtin-feature!

Thanks for that Karolus. Over forty years using a variety of spreadsheet applications and I have never used that feature before. Very useful.

Well, I suppose spreadsheets have improved a bit since VisiCalc and/or Lotus 123.

1 Like

Start with some Number and pull down! incrementing Numbers by 1 is a builtin-feature!

Even faster: put 1 in the top cell, then double click on the little blue square at the bottom right of the cell (once the mouse cursor changes to a cross). I discovered this feature very recently by “mistake” and it took me time to understand how was I able to automatically fill the column!