An interesting question. For clarification, are your “duplicate paragraphs” contiguous (i.e., next to each other), or separated in the file? I’m assuming the former (e.g., a sorted file, where duplicates end up together).
In fact, I can’t seem to get your “working” expression to find anything in the AltSearch extension (assuming that is what you’re using?).
The help file for AltSearch suggests there might be issues in using back-references (and more widely on that page).
Some older OpenOffice forum threads suggest that searching across paragraph boundaries is impossible (also, an older one), but I’m not exactly sure if that’s the issue here.
After much experimenting, both with regular expressions in the “normal” CTRL-H dialog and with the AltSearch extension, I couldn’t manage to find duplicate paragraphs. I would be fascinated to see a solution to this one!
Update: On a different machine now, and the expression eaglgenes101 provided does “work”, in the sense that it finds two consecutive paragraphs. It finds any two contiguous paragraphs, rather than two successive identical paragraphs, because ([:print:]+)\p([:print:]+) does not contain a “back-reference”.
In other words, it finds one run of printable characters ([:print:]), followed by an end-of-paragraph, followed by another run of printable characters, but nothing in the expression forces those two [:print:] sequences to be the same. That’s the job of “grouping and back-references”: you would normally use \1 to refer back to the first grouped sequence (and \2 if there is a second grouped sequence, and so on). The expression ought, then, to look something like ([:print:]+)\p(\1) … but that doesn’t work in AltSearch.
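To illustrate the difference a back-reference makes, here is a minimal sketch in Python. It uses Python's `re` module as a stand-in for an engine where both newline matching and back-references work; it is not AltSearch or LibreOffice syntax. Lines separated by `\n` play the role of paragraphs, and the names `ANY_TWO`, `DUPLICATE`, `find_pairs` and `find_duplicates` are made up for this example.

```python
import re

# Two independent groups: matches ANY two consecutive non-empty lines,
# analogous to ([:print:]+)\p([:print:]+).
ANY_TWO = re.compile(r"^(.+)\n(.+)$", re.MULTILINE)

# \1 must repeat whatever group 1 captured, so only identical
# consecutive lines match, analogous to ([:print:]+)\p(\1).
DUPLICATE = re.compile(r"^(.+)\n\1$", re.MULTILINE)


def find_pairs(text):
    """Return (first, second) for each non-overlapping pair of consecutive lines."""
    return ANY_TWO.findall(text)


def find_duplicates(text):
    """Return each line that is immediately followed by an identical line."""
    return [m.group(1) for m in DUPLICATE.finditer(text)]


sample = "alpha\nbeta\nbeta\ngamma"
print(find_pairs(sample))       # [('alpha', 'beta'), ('beta', 'gamma')]
print(find_duplicates(sample))  # ['beta']
```

The first pattern happily pairs up unrelated lines, which is the behaviour described above; only the second, with `\1`, singles out the repeated line.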
So there’s a bit of a “Catch-22” here. AltSearch can find matches across paragraph boundaries, but its back-references seem to be broken (well, “limited”) in some situations, including this scenario. On the other hand, back-references work fine in LibO’s CTRL-H + regex searching, but there the limitation is that you (apparently) can’t search across paragraph boundaries.
It looks to me as though this problem has been registered in the bug tracker as fdo#58744 (to which I’ve added a comment and a link to this thread). It would be VERY good to have this “fixed”, enhanced, developed, whatever. Maybe in 2014? (Updating on New Year’s Day!)