Allow regexes and wildcards, but also pure text

I’m importing CSV data into a sheet, and some of the fields contain “confusing” text, in casu "-*" and "V*", that can be interpreted as wldcards or REGEX, and the same column also contains rows containing just “-” and “V”.

The problem is that I’m trying to sum each type independently with a

=COUNTIF(N_Type,"="&$P12)

where P12 contains the text value to be matched, and if I enable (Tools, Options, Calc, Calculate) “Enable regular expressions in formulas”, a text value of “-*” in P12 also matches the “-” text in column “N_Type”, which is NOT, NOT, NOT what I want.

Yet, for other formulae I DO need regex’es, so how the flipping 'ell do I solve these two obviously very much clashing requirements???

Have you considered substituting for a benign character on both sides of your COUNTIF() then entering that COUNTIF() as a CSE array formula? Like

=COUNTIF(SUBSTITUTE(N_Type,"*","&"),SUBSTITUTE("="&$P12,"*","&"))

You have to have the CSE array entry so that SUBSTITUTE acts across each cell in N_Type, even though you are entering the formula into just a single cell. I’m thinking of N_Type here as a named range.

Thanks!

I’m just a “simple” z/OS guy, things like array formulae are way over my head, I’m probably one of the many people for whom 10% of Calc provides 90% of what’s needed.

What you suggest works like a treat, even for the unsimplified formula, which was =COUNTIFS(A_trip,"="&Q$10,N_Type,"="&$P12) - on z/OS we always tend to simplify things to the essential.

However, your solution throws up another question, why does changing the contents in cell Q10 to an asterisk not lead to the first part of the countifs() operator to be true when (Tools, Options, Calc, Calculate) “Enable regular expressions in formulas” is selected? Or in other words, why for the first compare it’s treated as a literal, and for the second as a regex(/wildcard?)?

I’m curious…but not certain what you are saying. I think you are saying that now you are using something like

=COUNTIFS(SUBSTITUTE(A_Type,"*","&"),SUBSTITUTE("="&$Q10,"*","&"),SUBSTITUTE(N_Type,"*","&"),SUBSTITUTE("="&$P12,"*","&"))

but you could get by with something like just

=COUNTIFS(A_trip,"="&Q$10,SUBSTITUTE(N_Type,"*","&"),SUBSTITUTE("="&$P12,"*","&"))

as if the first parameter of COUNTIFS acted differently from the second parameter. Intuition upfront says that there is a difference somewhere else…not sure how. If you have a moment to upload a simplified example please do.

I’d suggest to try reading relevant help. Citing from there:

If you want to search for a text that is also a regular expression, you must either precede every character with a “\” character, or enclose the text into \Q…\E.

Which converts the formula in your question into

=COUNTIF(N_Type,"=\Q"&$P12&"\E")
2 Likes

Yes, it is where you tell me it is, but if someone, as here: me, is looking for regular expressions, wouldn’t it be also useful to mention those characters on the file:///C:/Program%20Files/LibreOffice/help/en-US/text/shared/01/02100001.html regular expressions help page?

Citing from that very help page:

For a full list of supported metacharacters and syntax, see ICU Regular Expressions documentation

1 Like

Going off to some potentially blocked site on the internet? You’d be amazed how many organisations block site likes github.io!

Well, that’s their choice. It’s just unreasonable to try to copy and keep up-to-date every relevant documentation from the authors/maintainers of respective technology.

Is this “=” needed in the second argument? I tested with a few samples without problem. Thanks.

Not to sound patronizing, because I’m no regexpert, but regex is a dark art. I wouldn’t worry too much about knowing more than you need for a certain task. I suppose use sites like https://regexr.com/ that let you build and test. I use paid software for testing, myself.

I’m more interested in seeing if there is any irregularity in COUNTIFS().

And you want to compete with that other office product? Would M$ ever tell you to go to the internet to get the answer you require?

How did we go from having a working answer to people getting upset here? FOSS has its goods and bads, as does BIG-BUCKS. I once submitted a LO bug with the HYPERLINK() function and the fix was in the next release within 48 hours. Lots to be said for MS, though…Office really is (usually) easier to use. Different strokes for different folks. Besides, you can get the full support deal, etc., from the fine folks at Collabora who do a lot to contribute back to the project. [I’m not an insider, and am not trying to start any flames about TDF/Collabora, okay.]

@LeroyG Likely no; use of criterion slightly changes the processing, so in some cases it may be important, but for most cases, it’s the same. I just didn’t want to complicate the matter, and followed the original formula structure.

1 Like

And you want to compete with that other office product?

Lol. Who told you that? Do you believe that the only reason something exists is to compete with another thing?

You don’t want to compete with that other office product?

Then tell me why the changelogs are filled to the brim with " Improvements of DOCX/XLSX/PPTX import/export filter" notes?

Or, from the latest release notes, " Document Conversion"

soffice --convert-to csv:"Text - txt - csv (StarCalc)":44,34,UTF8,1,,0,false,true,false,false,false,-1 sample.ods

Where the flipping 'ell am I supposed to find the meaning of those 12 parameters? Not in the Help! How hard would it be to make this interactive, asking, given the true/false parameters, for Y/N replies at the command prompt? Or add option-specific help, like “–help convert-to”?

You’re working for Collabora, which should "In that way, you can get long-term Service Level Agreements (SLA), personalised assistance, technical support, and custom new features. "

I read somewhere that there were worries that businesses were reverting to the “free” version, why am I not surprised?

Then tell me why the changelogs are filled to the brim with " Improvements of DOCX/XLSX/PPTX import/export filter" notes?

:smiley: Because that’s what users need. The office suite is to enable users making their job done. Opening files is important for users, and is unrelated to “competing” :wink:

1 Like

Why not? We are adding it there - it’s in 7.3 help. You may see that it was me who filed tdf#140779.

I also participated in tdf#98153 discussion. You might see that I agree with it being needed.

But honestly, I do not understand your point. I tried to help you; I suggested something that could solve your case, and pointed to the source that may help you in the future (the under-rated help system). You had been shown that the help also points to the complete syntax, which is defined by the library we use. You ask, and we answer, and you only seem to try to prove that everything is bad. Do you say that I should not answer your questions?

1 Like

Definitely!

2 Likes

LibreOffice refers to the documentation of the technology it uses, i.e. the ICU regular expressions engine. (Btw, MS-Office doesn’t know regular expressions at all, though ICU fwiw is also used by Microsoft, but for other projects, see https://icu.unicode.org/ ).

If you prefer to not visit any github pages you can read the outdated documentation following this old icu-project.org link that forwards to an interim google.com page.