Automatically generated document is highlighted in black

Hi,

I’m using a transcription service (Happy Scribe) that generates a docx document with the transcribed text. This service has an online editor that allows me to highlight text. These highlights are then exported within the generated docx.

The problem I have is that when I open the document with LibreOffice, the NON-highlighted text appears highlighted in black in the document. The same document opened in MS-Word appears correctly.

Another strange thing is that when pressing the space bar to preview the file (I’m a Mac user), the document appears correctly, i.e. the black highlights are not visible.

Is there anything that can be done so that LibreOffice can open the files correctly? I know that I can search and replace the highlighted text… but I would prefer to be able to skip this if possible.

I attach here one of the documents I’ve had problems with.

english_accent.wav.docx

Thank you very much!

The “Happy Scribe” service generates documents with text fill explicitly set to black (in Word, you may see that under Home ribbon->Font dialog->Text Effects button->Text Fill). That is odd, and I’d ask the service why on earth they would want to do that at all.

However, there’s a problem in LibreOffice in handling that setting coming from DOCX. For another unclear reason, Word decides to treat the black background as either white (?) or absent (?), and LibreOffice does not follow that when importing. That would be worth reporting as an interoperability bug.
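For anyone who wants to check what a given file actually contains before contacting the service, here is a minimal Python sketch that counts the character-level highlight, shading and text-fill settings in the main document part. It only uses the standard library. Which of these elements Happy Scribe really writes is an assumption on my part; the function names are made up for illustration.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Standard WordprocessingML namespace and the Word 2010 extension namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W14 = "http://schemas.microsoft.com/office/word/2010/wordml"


def summarize_run_formatting(document_xml):
    """Count fill/highlight settings found in run properties (w:rPr)."""
    root = ET.fromstring(document_xml)
    counts = Counter()
    for rpr in root.iter(f"{{{W}}}rPr"):
        # Classic highlight colour, e.g. <w:highlight w:val="yellow"/>
        for hl in rpr.findall(f"{{{W}}}highlight"):
            counts["highlight=" + hl.get(f"{{{W}}}val", "?")] += 1
        # Character shading, e.g. <w:shd w:fill="000000"/>
        for shd in rpr.findall(f"{{{W}}}shd"):
            counts["shd fill=" + shd.get(f"{{{W}}}fill", "?")] += 1
        # Word 2010+ text fill (the "Text Effects" setting)
        if rpr.find(f"{{{W14}}}textFill") is not None:
            counts["w14:textFill"] += 1
    return counts


def summarize_docx(path):
    """Read word/document.xml from a .docx package and summarize it."""
    with zipfile.ZipFile(path) as package:
        return summarize_run_formatting(package.read("word/document.xml"))


if __name__ == "__main__":
    if len(sys.argv) > 1:
        for key, n in sorted(summarize_docx(sys.argv[1]).items()):
            print(n, key)
```

Run it with the path to the .docx as the only argument. A large count for a single black value next to a small count for yellow would match what is described above.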

Thank you very much for your feedback!

I will get in contact with Happy Scribe and tell them about it. Hopefully it will not be so difficult to correct.

tdf#132092

Just to update. Thanks to your feedback I got in touch with Happy Scribe and very quickly they corrected the issue.

From analysis of your sample file, it appears that the whole text has been “decorated” by the transcription service.

This is not a problem per se but for the fact that the .docx format has a much poorer notion of styles. M$ Word knows only of paragraph styles. Highlighting, which would preferentially be done with character styles in LO Writer, is done with direct formatting (= manual formatting) in Writer.

The properties are such:

  • font color: gray8
  • (text) background color: either black or yellow

Since everything is manual, I can’t see any automated procedure to correct the issue.
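Outside Writer, one conceivable workaround is to edit the .docx package directly before opening it. The following is only a rough sketch, not tested against the attached file. It assumes the unwanted black background is stored as `<w:highlight w:val="black"/>` with the usual `w` prefix, which nothing in this thread confirms; if the file uses shading or text fill instead, the pattern would need to change. The function names are hypothetical.

```python
import re
import zipfile

# Assumption: the black background is a self-closing highlight element
# written with the conventional "w" namespace prefix.
BLACK_HIGHLIGHT = re.compile(rb'<w:highlight\s+w:val="black"\s*/>')


def strip_black_highlight(document_xml: bytes) -> bytes:
    """Remove black highlight elements, leaving other colours untouched."""
    return BLACK_HIGHLIGHT.sub(b"", document_xml)


def clean_docx(src, dst):
    """Copy a .docx package to dst, rewriting only word/document.xml."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                data = strip_black_highlight(data)
            # Passing the original ZipInfo keeps name and compression type.
            zout.writestr(item, data)
```

The XML is edited as raw bytes on purpose: re-serializing it with a generic XML library can rename namespace prefixes, which Word files are sensitive to. Work on a copy, e.g. `clean_docx("in.docx", "out.docx")`.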

You might, however, avoid highlighting parts of the text in the transcription service editor. Get the raw transcribed text and import it into LO Writer. Create your own highlight character style and use it for your highlighting. Save the file as .odt, otherwise the character style will be converted to direct formatting and future editing will also be manual.

Export to .docx only for external co-workers and only if they can’t read .odt (Word claims it can! but won’t correctly export to character styles because it has no notion of it). Any conversion to/from .docx will create editing problems.

Thanks for your help!

The thing is that it is very useful to highlight and edit the transcript with their web tool, and export at the end of that process.

I will report this to Happy Scribe and they can hopefully correct it.