For the benefit of others, this discussion relates to the threads here and here on this forum. One other thing I would like to point out is that hexadecimal rather than decimal values should be used when quoting file contents, as this is the standard and makes it much easier for others to interpret what is being indicated. It is also not possible to determine anything from content displayed erroneously in Writer (e.g., “##á#”), as what is shown is a translated rendering of the bytes rather than the bytes themselves.
Unfortunately there is no easy answer to the question asked here, as ClarisWorks, which later became AppleWorks[1], underwent several changes of version (in unrelated series numbering) across the MacOS and later Windows platforms. During these changes the components offered expanded from word processing, spreadsheet, and database, to later include page layout, graphics/drawing/painting, and equation editing, all of which used the CWK file extension. A CWK file therefore can originate from different platforms, versions, and components. Generally speaking, most CWK files will be from the later v5.x or v6.x series of the product for MacOS.
The Java source used by Terrence Curran indicates this byte format for the header in CWK files:
- 01-04: Version e.g., I have seen 05 02 7d 00, 05 02 91 00, 05 02 99 00 = ClarisWorks v5.x; 06 07 d0 00, 06 07 e1 00 = AppleWorks v6.x.[2]
- 05-08: File Creator ID e.g., 42 4f 42 4f = “BOBO” the reason for which is explained in the Notes at the foot of the page here by one of the creators of ClarisWorks.
- 09-12: Previous Version e.g., I have seen 04 07 97 00, 04 07 9e 00 = ClarisWorks v4.x; 05 02 91 00, 05 02 99 00, 05 07 ad 00 = ClarisWorks v5.x; 06 07 d0 00 = AppleWorks v6.x.
- 13-20: unknown e.g., appears to always be 00 00 00 00 00 00 00 00.
- 21-22: unknown e.g., appears to always be 00 01.
- 23-24: unknown, possible marker e.g., I have seen 00 0f, 00 ba, 00 bb, 00 b7, 00 c0, 01 d0.
- 25-26: unknown e.g., I have seen 0b f0, 10 8c, 31 6c, 42 c4, 80 ac, 86 a6, 88 32, 8f 04, 8f 12, dd 06.[3]
- 27-30: unknown e.g., appears to always be 00 00 00 00.
- 31-32: Page Height[4] e.g., 02 53 = 595pt; 02 64 = 612pt; 03 18 = 792pt; 03 4a = 842pt.
- 33-34: Page Width e.g., as above for Page Height.
- 35-46: Page Margins e.g., six two-byte values such as 00 12 = 18pt; 00 3a = 58pt; 00 48 = 72pt.
- 47-??: unknown
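For anyone who wants to check their own files against the list above, here is a minimal Python sketch that splits the first 46 bytes into those fields. It is only an illustration of the layout as I have described it, not the method used in the Java source. The function and key names (parse_cwk_header, major_version and so on) are my own, big-endian byte order is an assumption based on the example values, and the sample header is a synthetic one assembled from values quoted in the list rather than taken from a real file.

```python
import struct

# Offsets in the list above are 1-based, so "bytes 31-32" is offset 30 here.
# ">" = big-endian with no padding (an assumption that fits the examples,
# e.g. 03 18 = 792). Unknown fields are kept as raw byte strings.
HEADER_FORMAT = ">4s4s4s8s2s2s2s4sHH6H"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 46 bytes

# Page sizes from note [4], stored as (shorter side, longer side) in points.
PAGE_SIZES = {(595, 842): "A4", (612, 792): "US Letter"}


def hex_bytes(raw):
    """Format bytes as space-separated lowercase hex, e.g. '05 02 99 00'."""
    return " ".join(f"{b:02x}" for b in raw)


def parse_cwk_header(data):
    """Split the first 46 bytes of a CWK file into the fields listed above."""
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need at least {HEADER_SIZE} bytes, got {len(data)}")
    fields = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    version, creator, previous = fields[0], fields[1], fields[2]
    height, width = fields[8], fields[9]
    return {
        "version": hex_bytes(version),
        "major_version": version[0],  # 5 = ClarisWorks v5.x, 6 = AppleWorks v6.x
        "creator": creator.decode("ascii", errors="replace"),
        "previous_version": hex_bytes(previous),
        "page_height_pt": height,
        "page_width_pt": width,
        "page_size": PAGE_SIZES.get(tuple(sorted((height, width))), "unknown"),
        "margins_pt": fields[10:16],  # six two-byte values, in points
    }


# Synthetic header built from values quoted in the list (not a real file).
sample = bytes.fromhex(
    "05029900 424f424f 04079700 0000000000000000 0001 00ba 0bf0 00000000"
    " 0318 0264 004800480048004800480048"
)
header = parse_cwk_header(sample)
for name, value in header.items():
    print(f"{name}: {value}")
```

To try it on a real file, open the file in binary mode and pass the first HEADER_SIZE bytes, e.g. parse_cwk_header(open("example.cwk", "rb").read(HEADER_SIZE)), where example.cwk is a placeholder name.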
[1] Wikipedia.
[2] The “##á#” value quoted (decimal 35,35,225,35) is a translated value, as previously indicated. There are several example CWK files here. The 1998 Roster may be a spreadsheet form as these four bytes display in LO Writer v4.1.3.2 as ##�# (decimal 35,35,65533,35) but these four bytes in the file are 05 02 99 00 (i.e., v5.x). The third byte changes because its value is above the ASCII range and so is possibly interpreted differently in Windows and MacOS.
[3] Seems highly variable.
[4] A4 = 595x842pt; US Letter = 612x792pt.
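To illustrate note [2], the short Python sketch below decodes the third version byte (99) under three encodings. Which encoding any given program actually applies is an assumption on my part. The point is only that a byte above the ASCII range has no single character meaning, which is why the hexadecimal value should be quoted instead.

```python
# Version bytes quoted in note [2]; the third byte (0x99) is above ASCII.
raw = bytes.fromhex("05029900")
third = raw[2:3]

# The same byte decoded three ways. Which of these (if any) a given
# application uses is an assumption here.
as_mac_roman = third.decode("mac_roman")            # classic Mac OS encoding
as_cp1252 = third.decode("cp1252")                  # common Windows encoding
as_utf8 = third.decode("utf-8", errors="replace")   # invalid as UTF-8

for label, char in (("mac_roman", as_mac_roman),
                    ("cp1252", as_cp1252),
                    ("utf-8 (replace)", as_utf8)):
    print(f"{label}: U+{ord(char):04X} (decimal {ord(char)})")

# Decimal 225 from the quoted "##á#" is 0xE1 in Latin-1. That matches the
# third byte of 06 07 e1 00 in the version list, but whether the quoted
# file really contains those bytes is only a guess.
print(f"{225:02x}")
```

The UTF-8 case with errors="replace" yields U+FFFD, i.e. decimal 65533, the same value reported from Writer in note [2].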