Bidi

1. left to right and right to left, perhaps for Arabic or Hebrew?
2. Understanding the bidi algorithm
3. Writing mode and Unicode bidi

1.

left to right and right to left, perhaps for Arabic or Hebrew?

J.Pietschmann



> Now that I am turning my attention to our language requirements (Hebrew & 
> Arabic), I would like this character to "flip" and point the opposite 
> direction. There is no corresponding left-pointing-arrow in the Dingbat 
> Font.

> Is there any fo:magic I could use to achieve this? 

XSL-FO supports Unicode BIDI. Check the BIDI properties of the arrow: if it is mirrored, it will flip automatically in an rl writing-mode context. I suspect it isn't mirrored, though. The properties can be found here
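
A minimal sketch of the mirroring behaviour described above, not a tested rendering. It assumes a formatter that implements Unicode bidi mirroring and a font that covers Hebrew. The parentheses (U+0028, U+0029) carry the Unicode Bidi_Mirrored property, so they should be drawn with their mirrored glyphs inside the right-to-left run. RIGHTWARDS ARROW (U+2192) is only a stand-in for the Dingbat arrow in the question; it does not carry that property, so it should keep pointing right. The page master name and page size are arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Hypothetical page master; name and dimensions are arbitrary -->
    <fo:simple-page-master master-name="page" page-height="297mm"
                           page-width="210mm" margin="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block-container writing-mode="rl-tb">
        <!-- Hebrew letters surround the neutral characters, so the
             parentheses and the arrow all resolve to right-to-left.
             Only the parentheses are mirrored characters. -->
        <fo:block>&#x5D0;&#x5D1; (&#x5D2;) &#x2192; &#x5D3;</fo:block>
      </fo:block-container>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```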

2.

Understanding the bidi algorithm

I'm generating index entries for Arabic documents (using XSL Formatter 2.3) and I'm running into a problem with how the bidi algorithm is being applied. I can solve the problem by adding a bidi-override element to the output FO, but that solution is not ideal because it would mean adding markup to the input XML document, which I would like to avoid.

Here is a typical example:

<fo:block-container writing-mode="rl-tb">
<fo:block>Arabic English1 English2<fo:leader 
leader-length="1em"/>123</fo:block>
</fo:block-container>

The desired presentation result is:

   123  English1 English2 cibarA

But what I get from XSL Formatter is:

   English1 English2  123 cibarA

That is, the "<fo:leader>123" is taken as part of the left-to-right sequence, which I think is a correct application of the Unicode bidi algorithm.

I'm wondering if there is a Unicode control character of some sort that I can insert into the output to end the recognition of the left-to-right text (as though a normal Arabic character occurred after the English text)? I looked through the Unicode database as presented by Unipad but I didn't see anything that looked like what I want.

I guess another question would be: does FO define any markup that I could wrap around the entire index entry text (but not including the page number citation) that would also have the defined semantic of limiting the scope of the bidi recognition? I have tried putting a bidi-override around the leader (between the index entry text and the page number citation), but it does not produce the desired effect.

I am continuing to work with Antenna House support on this issue but I was hoping there might be an easy solution that wouldn't require more work on anyone's part.

Answer:

> 2. It matters to AH whether or not the page number is literal text in 
> the FO instance or a page-number-citation.

What is the difference, and why?

Eliot answers: If the numbers are literal, they are included in the left-to-right sequence. If generated by page-number-citation, then they are not. I would expect the same behavior in both cases.

 > Here is an instance that produces the correct result using XSL Formatter:
 > 
 > <fo:block>
 >    <fo:bidi-override unicode-bidi="embed" direction="rtl"
 >    >ABC</fo:bidi-override
 >    <fo:leader leader-length="1em" leader-pattern="space"
 >    /><fo:page-number-citation ref-id="bcdfr234566"/>
 > </fo:block>
 > 
 > I'm still interested to know if there's a control character that changes 
 > the writing direction.

&#x200E;, LEFT-TO-RIGHT MARK (LRM)
&#x200F;, RIGHT-TO-LEFT MARK (RLM)
&#x202A;, LEFT-TO-RIGHT EMBEDDING (LRE)
&#x202B;, RIGHT-TO-LEFT EMBEDDING (RLE)
&#x202C;, POP DIRECTIONAL FORMATTING (PDF)
&#x202D;, LEFT-TO-RIGHT OVERRIDE (LRO)
&#x202E;, RIGHT-TO-LEFT OVERRIDE (RLO)

See UAX #9, The Bidirectional Algorithm, or Section 3.12, Bidirectional Behavior, of the Unicode Standard, Version 3.0. They're also covered in Chapter 6 of "Unicode: A Primer".

Section 5.8, Unicode BIDI Processing, of the XSL 1.0 Recommendation describes the behavior of the "direction" and "unicode-bidi" properties in terms of RLO/LRO and RLE/LRE.
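
As a rough illustration of that correspondence (a sketch, assuming the CSS2-style mapping that Section 5.8 builds on), each fo:bidi-override below should behave as if the control characters named in its comment were placed around its content. The sample text and the page master are made up for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Hypothetical page master; name and dimensions are arbitrary -->
    <fo:simple-page-master master-name="page" page-height="297mm"
                           page-width="210mm" margin="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>
        <!-- embed + rtl: as if wrapped in RLE (U+202B) and PDF (U+202C).
             Characters keep their own directionality inside a new
             right-to-left embedding level. -->
        <fo:bidi-override unicode-bidi="embed" direction="rtl">ABC 123</fo:bidi-override>
      </fo:block>
      <fo:block>
        <!-- bidi-override + rtl: as if wrapped in RLO (U+202E) and PDF (U+202C).
             Every character is forced right-to-left, so the content
             is displayed in reverse order. -->
        <fo:bidi-override unicode-bidi="bidi-override" direction="rtl">ABC 123</fo:bidi-override>
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

With direction="ltr" the same two values correspond to LRE (U+202A) and LRO (U+202D) respectively.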

Mixing markup and the formatting control characters is asking for trouble. LRM and RLM do what you want -- i.e., they force the directionality of neighbouring characters, such as numbers, that have only "weak" inherent directionality. However, here's what the HTML 4.0 Recommendation says about mixing markup and formatting control characters:

If both methods are used, great care should be exercised to insure proper nesting of markup and directional embedding or override, otherwise, rendering results are undefined.
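
The RLM suggestion can be sketched against the index-entry example from the question. This is an untested sketch: it assumes the formatter applies the Unicode bidi algorithm across the fo:leader and treats the leader as neutral. Under the algorithm, a run of digits takes its direction from the nearest preceding strong character, so placing RLM (U+200F) after the English text should stop the literal page number from joining the left-to-right run. The Arabic word is borrowed from a later example in this FAQ, the page master is arbitrary, and no bidi-override markup is used, so the nesting problem quoted above does not arise.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Hypothetical page master; name and dimensions are arbitrary -->
    <fo:simple-page-master master-name="page" page-height="297mm"
                           page-width="210mm" margin="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block-container writing-mode="rl-tb">
        <!-- &#x200F; (RLM) directly after "English2" ends the
             left-to-right run before the leader and the page number -->
        <fo:block>&#x625;&#x639;&#x62f;&#x627;&#x62f; English1 English2&#x200F;<fo:leader
          leader-length="1em"/>123</fo:block>
      </fo:block-container>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

If the assumptions hold, the line should read, from left to right: 123, the leader, "English1 English2", then the Arabic word at the right edge.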

3.

Writing mode and Unicode bidi

Tokushige Kobayashi



> <fo:block>&#x625;&#x639;&#x62f;&#x627;&#x62f;</fo:block>

> and

> <fo:block-container writing-mode="tb-rl">
>    <fo:block>&#x625;&#x639;&#x62f;&#x627;&#x62f;</fo:block>
> </fo:block-container>

> do very different things, namely switching 
> the start and end coordinates.
> This *has* to be done with writing-mode, correct? 
> If this is so, then one
> would indeed need to use the writing-mode when 
> switching from left-to-right
> to right-to-left, correct?

Yes

Unicode bidi only determines the directionality of characters. By default, Arabic characters are written from right to left and Latin characters from left to right.

When Arabic characters are placed in a block inside an lr-tb reference area, the start of the block is still the left edge, so an Arabic word is positioned at the left of the line.

Please refer to Section 5.3, Computing the Values of Corresponding Properties, of the XSL-FO specification.

For example, if the "writing-mode" specifies an inline-progression-direction of "left-to-right", then "left" maps to "start".

As a result, you must wrap Arabic paragraphs in an fo:block-container with writing-mode="rl-tb" (not tb-rl).

In my understanding, XSL Formatter does not change the behavior for writing-mode.
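
The difference described in this answer can be seen side by side in a small sketch. It assumes the initial writing-mode of lr-tb and a font that covers Arabic; the page master is arbitrary. Both blocks contain the same Arabic word, and in both the characters themselves run right to left. Only the second block sits in an rl-tb reference area, so only there does "start" map to the right edge.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Hypothetical page master; name and dimensions are arbitrary -->
    <fo:simple-page-master master-name="page" page-height="297mm"
                           page-width="210mm" margin="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <!-- Initial writing-mode (lr-tb): start is the left edge, so the
           word is expected at the left of the line -->
      <fo:block>&#x625;&#x639;&#x62f;&#x627;&#x62f;</fo:block>
      <!-- rl-tb reference area: start is the right edge, so the
           word is expected at the right of the line -->
      <fo:block-container writing-mode="rl-tb">
        <fo:block>&#x625;&#x639;&#x62f;&#x627;&#x62f;</fo:block>
      </fo:block-container>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```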