LibreOffice Python https problem [closed]

asked 2017-07-18 11:16:22 +0200

kiloran

updated 2017-07-20 20:42:03 +0200

I'm having problems scraping https sites with LibreOffice's bundled Python. I have LibreOffice 5.3.4.2 on Windows 7, and can demonstrate the problem with this simple script:

import urllib.request

myUrl = 'https://ask.libreoffice.org/en/questions/'
hdr = {'User-Agent': 'Mozilla/5.0'}  # send a browser-like User-Agent header
try:
    req = urllib.request.Request(url=myUrl, headers=hdr)
    response = urllib.request.urlopen(req)  # this is the call that fails for https
except Exception as e:
    print(e)

This fails immediately with "urlopen error unknown url type: https". It works fine with an http url, but fails with any https url.

I embedded the script above in a LibreOffice Calc document and it failed. It also failed when I ran it in a terminal window using C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\bin\python.exe.

The script works fine with my standalone Python 3.3.2 running from a terminal window.
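Not part of the original post: a minimal diagnostic sketch, assuming the "unknown url type: https" message means the interpreter could not import its ssl module (as far as I know, urllib.request only defines its HTTPS handler when http.client has HTTPSConnection, which in turn needs ssl). The function name https_support_report is made up for illustration. Running it with both the LibreOffice python.exe and the standalone one should show where they differ.

```python
import sys


def https_support_report():
    """Return a dict describing whether this interpreter can open https URLs."""
    report = {}
    try:
        import ssl
        report["ssl_import"] = "ok"
        report["openssl"] = ssl.OPENSSL_VERSION
    except ImportError as exc:
        # On Windows a missing or broken _ssl.pyd typically surfaces here.
        report["ssl_import"] = "failed: {}".format(exc)

    import http.client
    import urllib.request
    # Both attributes are expected to be absent when ssl cannot be imported.
    report["HTTPSConnection"] = hasattr(http.client, "HTTPSConnection")
    report["HTTPSHandler"] = hasattr(urllib.request, "HTTPSHandler")
    return report


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for key, value in sorted(https_support_report().items()):
        print("{}: {}".format(key, value))
```

If ssl_import reports a failure, the text after "failed:" is usually more informative than the urlopen error itself.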

I've also tried various LibreOffice Portable installations I have:

4.0.2.2: Works OK
5.3.1.2: Fails
5.3.2.2: Fails

I've uninstalled and reinstalled 5.3.4.2 more times than I can count and cannot get it to work. Yet when I install it on Windows 10 in a virtual machine on the same PC, it works fine.

Any idea what is going on?

=================================

Further news 19Jul17: I tried Safe Mode in LibreOffice 5 and the script works fine. I went back to normal mode and it failed again. I uninstalled LibreOffice 5.3.4.2 and then deleted everything I could find relating to LibreOffice. I reinstalled 5.3.4.2 x86 and the behaviour is unchanged: it works OK in Safe Mode and fails in normal mode.

Closed for the following reason: the question is answered, right answer was accepted by kiloran
close date 2017-07-20 20:41:37.138237

1 Answer

answered 2017-07-20 20:29:16 +0200

kiloran

Solved! Or at least I've found a work-around. This pointed me in the right direction: https://www.reddit.com/r/learnpython/...

I renamed _ssl.pyd in C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\lib\ to _ssl.pyd(old).

I then copied _ssl.pyd from my standalone Python installation at C:\Program Files (x86)\Python\DLLs\ and pasted it into the above folder.
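The two manual steps above (rename, then copy) can be sketched as a small script. This is only an illustration of the same work-around, not something from the original answer: the function name swap_ssl_module and its parameters are hypothetical, and it assumes LibreOffice is closed and that you have write access to the folder (under Program Files that probably means an elevated prompt).

```python
import os
import shutil


def swap_ssl_module(lo_lib_dir, donor_dir, name="_ssl.pyd"):
    """Back up lo_lib_dir/name as name + '(old)', then copy donor_dir/name in.

    Returns the path of the backup file.
    """
    target = os.path.join(lo_lib_dir, name)
    donor = os.path.join(donor_dir, name)
    backup = target + "(old)"

    if not os.path.isfile(donor):
        raise FileNotFoundError(donor)
    if os.path.exists(backup):
        # Refuse to overwrite an earlier backup of the original file.
        raise FileExistsError(backup)

    os.rename(target, backup)    # keep the original so the change can be undone
    shutil.copy2(donor, target)  # copy the donor file, preserving timestamps
    return backup


# Example call with the paths from this post (adjust to your own installation):
# swap_ssl_module(
#     r"C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\lib",
#     r"C:\Program Files (x86)\Python\DLLs",
# )
```

To undo it, delete the copied _ssl.pyd and rename _ssl.pyd(old) back to _ssl.pyd.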

Note that my Python and LibreOffice are both 32-bit.
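The 32-bit note matters because a compiled extension module such as _ssl.pyd has to match the interpreter that loads it, in both architecture and Python minor version. A quick way to compare two interpreters is a sketch like the following, run once with each python.exe; the function name interpreter_fingerprint is made up for illustration.

```python
import struct
import sys


def interpreter_fingerprint():
    """Return (major.minor version string, pointer size in bits)."""
    version = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    bits = struct.calcsize("P") * 8  # size of a C pointer: 32 or 64
    return version, bits


if __name__ == "__main__":
    version, bits = interpreter_fingerprint()
    print("Python {} ({}-bit) at {}".format(version, bits, sys.executable))
```

If the two interpreters print different versions or bit widths, I would not expect a file swap between them to be safe.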

And that's it. It works fine. I tried the same technique on one of my LibreOffice Portable installations that had been failing (4.3.4.1), and it solved the problem there too.

I noticed that the original _ssl.pyd was 48 kB and the replacement was 1162 kB. The size difference did concern me, but the replacement really does work, though I've no idea why.

I think I'll report it to LibreOffice as a potential bug, even though it works OK in Windows 10.
