I am typing in Punjabi with LibreOffice Writer 4.0. In Punjabi there are lots of words with 3-4 characters. Writer counts a Punjabi vowel sign as a character in Unicode. For example, ‘ਸਿੱਖੀ’ is ‘Sikhi’ in English and ‘ਹਰਮਨ’ is ‘Harman’ in English. Word completion works for ‘ਸਿੱਖੀ’ but not for ‘ਹਰਮਨ’: ‘ਸਿੱਖੀ’, being composed of 5 characters, qualifies, while ‘ਹਰਮਨ’, being a 4-character word, does not. Is there any way to change this limit to 4? Can I write a plugin for this?
@gagmani - I don’t know more about Punjabi than that it is the most spoken language in Pakistan.
Looking at how the Punjabi words in your question appear on my screen, I assume that Punjabi is, in computer terms, a double-byte language like Japanese or Chinese.
Here is the screenshot of what I see.
To me it currently appears that your question has to do with the conversion of keyboard entries to double byte words.
If you cannot get an answer to your question, I need to know how you write Punjabi and how it is supported by the software. In particular, I need to understand whether your word “character” refers to a single-byte character or a double-byte one.
I have also copied gagmani’s text into Writer and I can see the whole text perfectly. I think in your case it is a font problem. In my Writer, the non-English characters appear in the “Mangal” font. I don’t remember installing this font manually (I probably have no reason to do that, since I don’t speak Punjabi), so most probably some software installed it… Using LibreOffice v4.0.1.2 on Windows XP.
@froz and @gagmani - Could you possibly post a screenshot of how the characters should look, and possibly add information on the Character Encoding in your browser settings?
BTW, I am running Firefox 19.0.2
As ROSt52 pointed out, it looks to me as if Punjabi uses 2 bytes per character. If this is correct, then set Tools | AutoCorrect Options | Word Completion tab | Min. word length to 8.
It may also depend on the Writer language settings. Maybe you are using the wrong language settings… Under Tools | Language Settings | Languages | Locale settings, I think you should set it to Punjabi. For the current document, select all of the text with CTRL+A and then select from the menu Tools | Language | For all Text | Punjabi.
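A quick way to check the two-bytes-per-character assumption is to compare the number of Unicode characters in each word with its encoded size. The following is only a sketch: the helper name `sizes` is made up for illustration, and nothing here confirms whether Writer's Min. word length counts characters or bytes. It merely shows what the numbers would be under each reading.

```python
# Sketch only: compare character counts with byte counts for the two words.
# The words are written as escapes so the code points are unambiguous.
SIKHI = "\u0a38\u0a3f\u0a71\u0a16\u0a40"   # 'Sikhi'
HARMAN = "\u0a39\u0a30\u0a2e\u0a28"        # 'Harman'


def sizes(word):
    """Return the length of a word in code points, UTF-8 bytes and UTF-16 bytes."""
    return {
        "code_points": len(word),
        "utf8_bytes": len(word.encode("utf-8")),
        "utf16_bytes": len(word.encode("utf-16-be")),
    }


for name, word in (("Sikhi", SIKHI), ("Harman", HARMAN)):
    print(name, sizes(word))
```

Each Gurmukhi character takes 2 bytes in UTF-16 but 3 bytes in UTF-8, so a byte-based limit would depend on the encoding, whereas the character counts (5 and 4) match what the original poster describes.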
@froz - Please let me answer your comment here. I might need more characters than are available in a comment.
From the way Japanese (double-byte) characters and words are handled in software, the auto-completion of words has no influence. The conversion from Latin characters to hiragana (a basic Japanese character set) is fixed because there is a 1:1 relation. From hiragana to the complex kanji characters (they look very similar to Chinese characters), the conversion must be triggered by the writer, who selects the correct characters.
If I understand how the conversion from Latin characters to Punjabi characters is done, I MIGHT be able to give @gagmani a hint. I also need my assumption confirmed or corrected that the keyboard @gagmani is using has Latin characters that are used to enter the input leading to Punjabi characters.
@froz - thanks for joining this discussion. It could be a very tricky one. Let’s see!
The original question was about the Word Completion feature of LO Writer and, more specifically, the treatment of letters and vowel signs (in Punjabi/Gurmukhi) by this feature. I have no idea why the answers so far have spiralled off into a discussion about double-byte encoding. @ROSt52, you almost certainly do not have a supporting Punjabi/Gurmukhi font installed. Even my phone shows the characters fine. Here are the characters for ‘Sikhi’ and ‘Harman’ respectively:
$ unicode ਸਿੱਖੀ
U+0A38 GURMUKHI LETTER SA
UTF-8: e0 a8 b8 UTF-16BE: 0a38 Decimal: &#2616;
ਸ
Category: Lo (Letter, Other)
Bidi: L (Left-to-Right)
U+0A3F GURMUKHI VOWEL SIGN I
UTF-8: e0 a8 bf UTF-16BE: 0a3f Decimal: &#2623;
ਿ
Category: Mc (Mark, Spacing Combining)
Bidi: L (Left-to-Right)
U+0A71 GURMUKHI ADDAK
UTF-8: e0 a9 b1 UTF-16BE: 0a71 Decimal: &#2673;
Category: Mn (Mark, Non-Spacing)
Bidi: NSM (Non-Spacing Mark)
U+0A16 GURMUKHI LETTER KHA
UTF-8: e0 a8 96 UTF-16BE: 0a16 Decimal: &#2582;
ਖ
Category: Lo (Letter, Other)
Bidi: L (Left-to-Right)
U+0A40 GURMUKHI VOWEL SIGN II
UTF-8: e0 a9 80 UTF-16BE: 0a40 Decimal: &#2624;
ੀ
Category: Mc (Mark, Spacing Combining)
Bidi: L (Left-to-Right)
$ unicode ਹਰਮਨ
U+0A39 GURMUKHI LETTER HA
UTF-8: e0 a8 b9 UTF-16BE: 0a39 Decimal: &#2617;
ਹ
Category: Lo (Letter, Other)
Bidi: L (Left-to-Right)
U+0A30 GURMUKHI LETTER RA
UTF-8: e0 a8 b0 UTF-16BE: 0a30 Decimal: &#2608;
ਰ
Category: Lo (Letter, Other)
Bidi: L (Left-to-Right)
U+0A2E GURMUKHI LETTER MA
UTF-8: e0 a8 ae UTF-16BE: 0a2e Decimal: &#2606;
ਮ
Category: Lo (Letter, Other)
Bidi: L (Left-to-Right)
U+0A28 GURMUKHI LETTER NA
UTF-8: e0 a8 a8 UTF-16BE: 0a28 Decimal: &#2600;
ਨ
Category: Lo (Letter, Other)
Bidi: L (Left-to-Right)
As you can see, ‘Sikhi’ has two letters, two vowel signs and an addak (and does not auto-complete), while ‘Harman’ has four letters (and does auto-complete). The minimum word length for auto-completion can be adjusted under Tools > AutoCorrect Options… > Word Completion, but I imagine the original poster knows this. The problem is more likely to be whether a combining vowel sign is treated as a ‘letter’ by the LO Writer feature. I suspect not, but we need an expert to provide more information.
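For anyone without the unicode command-line tool, roughly the same breakdown can be reproduced with Python's standard unicodedata module. This is a minimal sketch: the function names `describe` and `count_letters_and_marks` are made up for illustration, and it says nothing about which counting rule Writer actually applies. It only shows how the totals differ depending on whether combining marks are counted.

```python
import unicodedata

# The two words from the question, written as escapes taken from the dump above.
SIKHI = "\u0a38\u0a3f\u0a71\u0a16\u0a40"   # 'Sikhi'
HARMAN = "\u0a39\u0a30\u0a2e\u0a28"        # 'Harman'


def describe(word):
    """List (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in word
    ]


def count_letters_and_marks(word):
    """Count characters whose category is a letter (L*) or a mark (M*)."""
    letters = sum(1 for ch in word if unicodedata.category(ch).startswith("L"))
    marks = sum(1 for ch in word if unicodedata.category(ch).startswith("M"))
    return letters, marks


for name, word in (("Sikhi", SIKHI), ("Harman", HARMAN)):
    for entry in describe(word):
        print(*entry)
    letters, marks = count_letters_and_marks(word)
    print(f"{name}: {letters} letters, {marks} marks, {len(word)} code points")
```

If a feature counted only letters, ‘Sikhi’ would have length 2 and ‘Harman’ length 4. If it counted every code point, the lengths would be 5 and 4.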
Thanks to @froz for the screenshot with the encoding settings. As @owen pointed out, it seems that I am missing the font to display Punjabi. Thus I am out of the discussion.