LibreOffice Python HTTPS problem

kiloran · July 18, 2017, 9:16am

I’m having problems web scraping HTTPS sites using the Python bundled with LibreOffice.
I have LibreOffice 5.3.4.2 on Windows 7, and can demonstrate the problem with this simple script:

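try:
    import urllib.request
    myUrl = 'https://ask.libreoffice.org/c/english/5'
    # Send a browser-like User-Agent header with the request
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = urllib.request.Request(url=myUrl, headers=hdr)
    # The error described below is raised by this call
    response = urllib.request.urlopen(req)
except Exception as e:
    # Print whatever went wrong (import error, URL error, ...)
    print(e)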

This fails immediately with “urlopen error unknown url type: https”.
It works fine with an http URL, but fails with any https URL.
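
For anyone hitting the same message: in the standard library, urllib.request only defines its HTTPS handler when http.client.HTTPSConnection exists, and that class is only defined when "import ssl" succeeds. So “unknown url type: https” usually points at an ssl module that cannot be loaded rather than at the URL itself. A minimal diagnostic sketch follows; the function name https_support_report is made up for illustration, and it avoids f-strings so it should also run on the bundled Python 3.3:

```python
import http.client
import urllib.request


def https_support_report():
    """Return a dict describing whether this interpreter can do HTTPS.

    Hypothetical helper for illustration; not part of LibreOffice or Python.
    """
    report = {}
    try:
        import ssl  # fails if the _ssl extension module cannot be loaded
        report["ssl_import"] = "ok"
        report["openssl_version"] = ssl.OPENSSL_VERSION
    except ImportError as exc:
        report["ssl_import"] = "failed: {}".format(exc)
        report["openssl_version"] = None
    # Both names exist only when the ssl import inside http.client worked
    report["https_connection"] = hasattr(http.client, "HTTPSConnection")
    report["https_handler"] = hasattr(urllib.request, "HTTPSHandler")
    return report


if __name__ == "__main__":
    for key, value in sorted(https_support_report().items()):
        print(key, "=", value)
```

Running this with both the LibreOffice python.exe and a standalone Python should show where the two differ; the text after "failed:" is the underlying import error, if there is one.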

I tried the above as an embedded script in a LibreOffice Calc document and it failed. It also failed when I ran it in a terminal window using C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\bin\python.exe.

The script works fine with my standalone Python 3.3.2 running from a terminal window.
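
Since the outcome depends on which interpreter runs the script, it can help to print the interpreter's identity at the top of each test run. A small sketch using only standard-library calls; the function name interpreter_info is hypothetical:

```python
import struct
import sys


def interpreter_info():
    """Describe the running interpreter: path, version and 32/64-bit build."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Size of a C pointer in bytes times 8 gives the build's bitness
        "bits": struct.calcsize("P") * 8,
    }


if __name__ == "__main__":
    info = interpreter_info()
    print(info["executable"])
    print("Python", info["version"], "({}-bit)".format(info["bits"]))
```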

I’ve also tried various LibreOffice Portable installations I have:

4.0.2.2: Works OK
5.3.1.2: Fails
5.3.2.2: Fails

I’ve uninstalled and reinstalled 5.3.4.2 more times than I can count and cannot get it to work. Yet when I install it on Windows 10 in a virtual machine on the same PC, it works fine.

Any idea what is going on?

=================================

Further news 19Jul17:
I tried Safe Mode in LibreOffice 5 and the script works fine. I went back to normal mode and it failed again. I uninstalled LibreOffice 5.3.4.2 and then deleted everything I could find relating to LibreOffice. I reinstalled 5.3.4.2 x86 and the behaviour is unchanged: it works OK in Safe Mode and fails in normal mode.

kiloran · July 20, 2017, 6:29pm

Solved! Or at least I’ve found a work-around.
This pointed me in the right direction: https://www.reddit.com/r/learnpython/comments/3w7p31/python_macro_on_libreoffice_without_ssl/

I renamed _ssl.pyd in C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\lib\ to _ssl.pyd(old).

I then copied _ssl.pyd from my standalone Python installation at C:\Program Files (x86)\Python\DLLs\ and pasted it into the above folder.
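
The rename-and-copy steps above can be sketched as a small script. This is only an illustration of the manual procedure, not an official fix; the function name swap_ssl_module and its parameters are hypothetical, and the "(old)" suffix simply mirrors the rename described in this post:

```python
import os
import shutil


def swap_ssl_module(lo_lib_dir, donor_dll_dir, name="_ssl.pyd"):
    """Back up <lo_lib_dir>/<name> and replace it with the donor copy.

    Hypothetical helper: lo_lib_dir is the bundled Python's lib folder,
    donor_dll_dir is the DLLs folder of a standalone Python install.
    Returns the path of the backup file.
    """
    target = os.path.join(lo_lib_dir, name)
    donor = os.path.join(donor_dll_dir, name)
    backup = target + "(old)"
    # Check everything first so a failure leaves the install untouched
    if not os.path.isfile(donor):
        raise FileNotFoundError(donor)
    if os.path.exists(backup):
        raise FileExistsError(backup)
    os.rename(target, backup)    # keep the original for rollback
    shutil.copy2(donor, target)  # copy the donor file, preserving timestamps
    return backup
```

With the folders from this post, the two arguments would be C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\lib and C:\Program Files (x86)\Python\DLLs. Writing under Program Files will probably need an elevated prompt, and LibreOffice should be closed first. The donor file should presumably come from a Python with the same minor version and bitness as the bundled one.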

Note that my Python and LibreOffice are both 32-bit.

And that’s it. It works fine. I tried the same technique on one of my LibreOffice Portable installations which had failed (4.3.4.1) and this also solved the problem.

I noticed that the original _ssl.pyd was 48 kB and the replacement was 1162 kB, which did concern me. But it really does work, though I’ve no idea why.

I think I’ll report it to LibreOffice as a potential bug, even though it works OK on Windows 10.

kiloran · July 20, 2017, 6:41pm