Slow spreadsheet with use of INDEX and MATCH

LibreOffice version: 6.4.7.2 (latest in Ubuntu 20.04)

I have a single sheet spreadsheet which logs dividends received from investments, with some calculations to show the change from the previous year. These calculations are done using INDEX and MATCH, and whilst they work I get the following performance problems:

  • Opening the spreadsheet file takes over 60 seconds.
  • Inserting a new row into the sheet takes over 30 seconds.
  • Adding data to any of the columns covered by the INDEX or MATCH (or columns which rely on them) takes over 30 seconds.

The number of rows involved is less than 100, and only grows at a rate of 12-15 per year, so this isn’t a ‘big data’ issue. Other spreadsheets don’t have these performance issues, and I’m using a fairly powerful machine (8 cores, 32GB RAM, SSD drive).

The relevant columns are as follows (where {row} is replaced by the row number and any column without a formula is a manually-entered string or number):

A: Stock market ticker (short string in all caps)
B: Year
C: Quarter
H: Original dividend
K: Previous Year: B{row}-1
L: Previous original dividend: =IFERROR(INDEX(H:H,MATCH(1,(A{row}=A:A)*(K{row}=B:B)*(C{row}=C:C),0)), ""). Effectively this says ‘get the value from column H from the row where the Ticker (A) is the same as this row (A), the Year (B) is the same as the Previous Year (K) and the Quarter (C) is the same as this Quarter (C)’. In other words, the previous year’s same quarter dividend for the same ticker.
M: Original dividend change: =IF(L{row} <> "", (H{row}-L{row}) / L{row}, "")

Is there a way to address this performance issue, perhaps by rewriting the INDEX and MATCH formula? There will only ever be one match for the previous year, same quarter, same ticker dividend if that makes a difference.

…just guessing: replace all full column references like A:A with something like A$1:A$1000 and check, if the performance issues remain
Good luck,
ms777

Can you attach a sample file, I can’t see the issue with about 1000 rows.
In theory, MATCH shortcut the search at the last/row column with data.

The problem here may be that something like A{row}=A:A creates an interim array of an entire column (with TRUE and FALSE values of the comparison of all cells) because arguments to MATCH() are forced into array mode. So limiting that to the actual data range may indeed help.

I’ve tried doing that on a couple of cells, the problem is that LibreOffice doesn’t move the numbers along when I copy and paste or use Fill Down. For example, if I use A2:A11 (because the current row is row 12) when I try and pull that into the same column on the next row it ends up as A3:A12 instead of A2:A12.

… which means that you need to use absolute addressing in this case, instead of relative.

1 Like

so replace A2 with A$2 or $A$2 and read the Doc already givven by mikekaganski

Using $A$2 or A$2 breaks the MATCH and the cell containing the formula ends up empty.

Never seen before a case where the result of the formula was changed by changing from relative to absolute…

You do something wrong, but it is impossible to say what, because you never mentioned the exact formula that breaks. Did you e.g. forget to also use absolute addresses at the B and C, which were mentioned in the question, and so had a multiplication of different-size ranges?

I’ve rewritten all the formulas now and that seems to have done the trick, thanks (I had never come across absolute references before in 20+ years of using spreadsheets…)

That’s odd, because absolute vs relative references is a basic feature of all spreadsheet applications when it comes to copy-pasting or pulling formula cells.

Please check: Ten concepts that every Calc user should know:
https://forum.openoffice.org/en/forum/viewtopic.php?t=39529

If there is a reason you want to stay in (the old) Ubuntu 20.04, in order to encounter fewer bugs you should at least upgrade your LibreOffice to the latest version. If you want it to be stable I suggest 7.3.6 (and further upgrade to 7.3.7 two weeks later when it comes avalible). There are plenty of bug fixes and performance improvements between 6.4.7 and 7.3.6.

You can uninstall the LibreOffice provided by Ububtu by
sudo apt-get remove libreoffice*

Then download the latest libreoffice deb package provided by TDF from:

You may also want to download the language packs.

After download, unzip the files, open a terninal, then:
cd <path/to/your/unzipped/files/>
sudo apt-get install ./DEBS/*.deb

If the above apt-get install does not work, you can do the following instead:
sudo dpkg -i ./DEBS/*.deb

The above command will install all the .deb files you have downloaded. You can do the same for the lang packs, or you can move the lang oack deb files to the same folder as the main debs and run the above apt-get install command at once.

I had tested also with 6.4.7 and no delay on calculation.

Well, @pwaring1 since your data is about stock performance I assume it is public data, thus it would be better if you attach an example file here so others can take a look to identify the root cause of the problem.

I have updated LibreOffice as part of 22.04, I’m now using 7.3.6.2 with no improvement to performance.

Stock data is public, however which stocks I have is not public (and indeed is a security question sometimes asked if I call my broker), so I will have to anonymise the data first.

OK, but without some testing documents I can not help here. Maybe you can find-and-replace your stock ticker symbols with some random characters (i.e. prepare a public version).

Preprocess your date by creating some kind of index by appending colums like
=A2&"-"&B2&"-"

If you put this index (hidden) in sthe first column you could simply use VLOOKUP. As Calc needs to combine all collumns for each row to filter your cell there are a lot of cells to handle
.
Modified answer: put your data in a database. They are optmized for this…