Deleting all values in a column is slow.

asked 2018-09-11 03:40:36 +0200

hvankampen gravatar image

updated 2018-10-09 14:52:52 +0200

I have a Calc spreadsheet (LibreOffice version, Windows 7) with a single worksheet. The worksheet comprises 8 columns and 2925 rows. Each column is a named range.

I am trying to clear all the values in a single column ("Duplicates").

Attached are 2 files:

"Delete Cell Contents.ods" contains the data I am having trouble with. I have stripped out all extraneous macro code and a dialog.

Column H of "Delete Cell Contents.ods" contains the data of column A without the duplicate values.

"Delete Test1.ods" is a file using dummy data but using the same macro code as "Delete Cell Contents.ods".

I have created a toolbar ("Clear Cell Content") in each file to access the macros.

Running any of the macros in "Delete Test1.ods" results in almost instantaneous deletion of data in column H, while doing the same in "Delete Cell Contents.ods" takes about 4 seconds to delete the data in column H.

Undoing the delete in "Delete Test1.ods" takes about 8 seconds, while in "Delete Test1.ods" the undo is instantaneous.

The same delay is evident when selecting and deleting the data in column H manually.

"Delete Cell Contents.ods" has some direct cell formatting: vertical alignment is "top", horizontal alignment is "left" and "wrap text" is enabled. Removing this formatting speeds things up slightly.

C:\fakepath\Delete Cell Contents.ods(/upfiles/15368529059303209.ods)(/upfiles/1536852887630988.ods)

C:\fakepath\Delete Cell Contents.ods (/upfiles/15390892797953911.ods)

Any help in speeding up this delete process would be appreciated.

edit retag flag offensive close merge delete


Do you have formulas in adjacent columns that reference deleted data? Perhaps using the dispatch command like as .executeDispatch(doc, ".uno:Delete", "", 0, arguments("Flags","SVDFN")) will work faster?

JohnSUN gravatar imageJohnSUN ( 2018-09-11 11:23:33 +0200 )edit

Thanks for the tip. Your method seems to be marginally faster, about 3.5 seconds.

There are no formulae involved.



hvankampen gravatar imagehvankampen ( 2018-09-12 00:57:38 +0200 )edit

3.5 seconds instead of 4? Very good result! :-) It seems to me that without analyzing your file (data and macro code), we can not understand the reasons for this slow work. Can you show your spreadsheet?

JohnSUN gravatar imageJohnSUN ( 2018-09-12 07:13:26 +0200 )edit

Please see the revised original question.

Thanks in advance.


hvankampen gravatar imagehvankampen ( 2018-09-13 17:36:55 +0200 )edit

I'm sorry to upset you on the day of a professional holiday, but with a file of this size, Calc will not be able to run faster. I'm not talking about the dimensions of the table, but about the amount of data in cells. A very small gain in speed was achieved after removing the formatting from all cells of the table - but it is really quite small. :-(

JohnSUN gravatar imageJohnSUN ( 2018-09-13 19:10:55 +0200 )edit

Hmmm.... So, is there any specific reason that deleting the content in all cells at once (<ctrl+a>) is almost instantaneous? If I select columns A-F the delete takes approx. 1.5 seconds; if I select columns A-C, it takes approx. 3.5 seconds. It seems that the less data I select, the longer it takes. Is this not odd?


hvankampen gravatar imagehvankampen ( 2018-09-13 21:57:22 +0200 )edit

Yes, the removal of everything - instantly, but the Undo after that - the same 8-10-15 seconds. IMHO(!) there is a simple explanation for this. Calc sees the document as a tree structure. Deletion of all data is how to cut down the whole tree under the root. Removing cells in columns is creeping from branch to branch and cutting out individual fragments. If this analogy is correct, then the delays in processing the data are also understandable, are not they?

JohnSUN gravatar imageJohnSUN ( 2018-09-14 09:20:10 +0200 )edit

OK... following your tree analogy, I would picture a Calc file thus:

1: The Spreadsheet is the trunk. 2: Each Sheet is a major branch off the Spreadsheet. 3: Each Column is a minor branch off a Sheet, crosslinked with Rows (or the other way around). 4: Each Cell is a twig off a Column or Row.

I still find it odd that it takes less time to cut off 6 "minor branches" than it does 1. Calc doesn't seem to process them in order (from columns A to H), so that doesn't seem to be the issue. (cont.)

hvankampen gravatar imagehvankampen ( 2018-09-14 17:32:06 +0200 )edit

I have a similar delay issue with sorting on columns A and B.

Oh well, if that's the way it is, so be it.

Thanks for your input.


hvankampen gravatar imagehvankampen ( 2018-09-14 17:34:09 +0200 )edit

FWIW, I don't experience any delay with the sample file, deletion with or without using the macro is instantaneous.

erAck gravatar imageerAck ( 2018-10-09 11:21:19 +0200 )edit