Create new custom document format

Editing my original question to add some more clarity

I wanted to build a custom document format where I could store, along with plain text, other data items such as audio clips, images, and SVG sketches. I could render all of these in an HTML page and display them to the user, but I was wondering whether I could leverage the existing XML document libraries.

Instead of HTML I wanted to use XML, since it seems that many people before me have already built libraries that use XML as a document format. I then learnt that the OpenOffice format is basically XML.

Coming to my question, suppose my XML document looked like:

<customDocumentTag1>
343442223322
</customDocumentTag1>
<annotation1>
Some Annotation Data
</annotation1>
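One detail worth noting about the fragment above: it has two top-level elements, and a well-formed XML document needs exactly one root. A minimal Python sketch of how a standard parser treats it follows. The wrapper element name `note` is a hypothetical choice made for illustration and is not part of the original question.

```python
import xml.etree.ElementTree as ET

# The fragment from the question, unchanged apart from whitespace.
FRAGMENT = """
<customDocumentTag1>343442223322</customDocumentTag1>
<annotation1>Some Annotation Data</annotation1>
"""


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def parse_fragment(fragment: str, root_tag: str = "note") -> ET.Element:
    """Wrap the fragment in one root element (hypothetical name) and parse it."""
    return ET.fromstring(f"<{root_tag}>{fragment}</{root_tag}>")


# Two sibling elements without a common root are rejected by the parser.
fragment_ok = is_well_formed(FRAGMENT)

# With a root element, the same content parses and can be walked.
root = parse_fragment(FRAGMENT)
items = [(child.tag, (child.text or "").strip()) for child in root]
```

Any XML library, whether for a viewer or an exporter, will expect that single root, so it is probably the first thing to settle in a custom format.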

I might add some new tags such as <audioClip>, <customData>, or something else to my format later on. I would want to leverage the LibreOffice SDK to build viewers for this document, so that opening it feels like opening a native document. I was wondering whether using an XML document rather than an HTML document might buy me more type safety, existing libraries for building native viewers, and an easy export to HTML for viewing in browsers as well.

Is it possible to do this with the LibreOffice SDK, or am I not thinking about this correctly? Thanks for any help!

Edit your question to make it less cryptic.

What do you want to do? Create an HTML5-like document for use in some web site (where the CSS sheet will format your custom element)? Organise some XML document of yours?

Writer’s format, called ODF, is a specific application of XML, with a standard and schema describing the syntax and the semantics. Writer then displays the document according to the semantic rules.

I doubt you can add new XML elements and have them correctly interpreted by an unmodified Writer.

If you think of your custom elements as paragraph styles, then use the standard tools in Writer. Of course, encoding is not as simple as in your example.

Hi ajlittoz, thanks for your reply.

I wanted to build a note-taking format that can be viewed easily on different devices. I could have just gone with an HTML document with CSS styling, but I wanted the format to be natively viewable on PC (Mac/Linux/Windows), and I was hoping that with the LibreOffice SDK I might not have to re-invent an XML document format with styling.

I would want to add my own custom elements and attributes to this new document format that I’m imagining. For example, I might add a custom tag that anchors an audio clip to a section of text, storing both the audio clip and the anchoring location.
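To make the audio-anchoring idea concrete, here is a minimal Python sketch of one possible encoding. Everything in it is an assumption made for illustration: the element names `note`, `text`, and `audioClip`, the attributes `anchorStart`, `anchorEnd`, and `mimeType`, and the choice of character offsets plus base64 for the audio payload. None of this is an existing standard or a LibreOffice feature.

```python
import base64
import xml.etree.ElementTree as ET


def make_audio_clip(audio: bytes, start: int, end: int,
                    mime: str = "audio/wav") -> ET.Element:
    """Build a hypothetical <audioClip> element anchored to text[start:end]."""
    clip = ET.Element("audioClip", {
        "anchorStart": str(start),  # hypothetical attribute: first character offset
        "anchorEnd": str(end),      # hypothetical attribute: end offset (exclusive)
        "mimeType": mime,
    })
    # Binary data cannot sit in XML directly, so it is base64-encoded here.
    clip.text = base64.b64encode(audio).decode("ascii")
    return clip


def read_audio_clip(clip: ET.Element) -> tuple[bytes, int, int]:
    """Recover the audio bytes and the anchor offsets from an <audioClip>."""
    audio = base64.b64decode(clip.text or "")
    return audio, int(clip.get("anchorStart")), int(clip.get("anchorEnd"))


# Build a small note, serialize it, and read it back.
note = ET.Element("note")
body = ET.SubElement(note, "text")
body.text = "Meeting notes: budget approved."
note.append(make_audio_clip(b"\x00\x01fake-audio", 15, 31))

serialized = ET.tostring(note, encoding="unicode")
restored = ET.fromstring(serialized)
audio, start, end = read_audio_clip(restored.find("audioClip"))
anchored_text = restored.findtext("text")[start:end]
```

Character offsets are fragile once the text is edited. Anchoring by an element identifier would be an alternative design, at the cost of more markup.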

I’m not sure if all of my requirements make sense, I was just doing some exploratory work. Let me know if what I’m saying makes any sense.

A debugged (or at least thought-upon) specification always makes sense to address a need.

Why don’t you start with a collection of paragraph styles? Your notes will be “typed” by the style. You can then export the document as “flat ODT” (.fodt extension), which is an uncompressed XML file. Of course, tagging is not as simple as in your XML example, but it can be read (though perhaps not interpreted) on any device.

I succeeded in inserting an audio file with Insert > Media > Audio or Video but found no way to play it.

Perhaps you could also try Impress. Similarly, it can save its presentation as “flat ODP” (extension .fodp).

Since this is a hobby project, I wanted to build a new XML document format from scratch to understand how a custom XML document format is built. (BTW, I just tried a hello world document in Writer and saved it as .fodt, and it looks pretty cool.)

Can you build a custom XML document and use existing OpenOffice XML software to display it? If so, are there any existing resources, tutorials, etc. that I can look at?

I assumed the LibreOffice SDK would show how to build an XML document programmatically, but I didn’t know where to look.

LibreOffice is not about creating file formats. It uses its own document model (which follows ODF), and any file format filter (import or export) does the job of mapping the model to the document format’s primitives.

If you need to learn the file formats, you may look at DLP (Document Liberation Project), which has infrastructure for different import/export filters used in LibreOffice (and other projects). Or you may look at XSLT filters, and create your own filter mapping any XML-based format you want into ODF. Here is a small sample of such a filter (more are in LibreOffice’s installation).
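To show the kind of mapping such a filter performs, here is a minimal sketch that turns each child of a custom XML document into an ODF paragraph whose style name is the original tag. It is written in plain Python rather than XSLT, so it is a stand-in for the idea and not an actual LibreOffice filter. The root name `note` and the tag-to-style mapping are assumptions, and whether Writer renders the output as intended (the style names are not defined in the file) has not been verified here.

```python
import xml.etree.ElementTree as ET

# ODF namespace URIs for the office: and text: prefixes.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Keep the conventional prefixes when serializing.
ET.register_namespace("office", OFFICE)
ET.register_namespace("text", TEXT)


def custom_to_flat_odt(custom_xml: str) -> str:
    """Map each child of the custom root to a text:p styled after its tag."""
    source = ET.fromstring(custom_xml)
    doc = ET.Element(f"{{{OFFICE}}}document", {
        f"{{{OFFICE}}}version": "1.2",
        f"{{{OFFICE}}}mimetype": "application/vnd.oasis.opendocument.text",
    })
    body = ET.SubElement(doc, f"{{{OFFICE}}}body")
    text = ET.SubElement(body, f"{{{OFFICE}}}text")
    for child in source:
        # Assumption: the custom tag name doubles as the paragraph style name.
        para = ET.SubElement(
            text, f"{{{TEXT}}}p", {f"{{{TEXT}}}style-name": child.tag}
        )
        para.text = (child.text or "").strip()
    return ET.tostring(doc, encoding="unicode")


# Hypothetical input: the question's fragment wrapped in a single root.
CUSTOM = (
    "<note>"
    "<customDocumentTag1>343442223322</customDocumentTag1>"
    "<annotation1>Some Annotation Data</annotation1>"
    "</note>"
)
flat_odt = custom_to_flat_odt(CUSTOM)
```

An XSLT import filter would express the same mapping declaratively as templates, and LibreOffice would then load the result into its own document model.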