Saturday, August 25, 2007

Uniscribe and Delphi!

Microsoft has an extremely powerful API called Uniscribe that lets applications handle the typography of complex scripts. This is mostly of interest to those who want to make their programs multilingual.

But one big problem with Uniscribe is the lack of samples and documentation on the internet.

Problem:
I have a routine in my En2Fa library that returns a WideString. If this WideString is displayed in a right-to-left TLabel or TEditBox (i.e., in Delphi, a component with BIDI set to right-to-left), it is displayed correctly. (You won't see the difference unless your string mixes RTL and LTR text, in my case Persian and English.)
But if you display it in a window that doesn't have the RTL attribute, you will notice that some words are swapped.
I wanted to use Uniscribe to fix this problem.

But why?

Well, in my Y!En2FaMsngr program I have to output this text into the Yahoo Messenger edit box. Reading it correctly is not the only issue; sending it correctly is even more important.
So I have to reorder the text so that it both displays correctly and is sent to the other party correctly.
Making the Yahoo Messenger edit box right-to-left is not a problem. You can use this code to make an edit box right-to-left.

lAlgn := GetWindowLong(hYahooEditHnd, GWL_EXSTYLE);

// Set right alignment and right-to-left reading order
// (use xor instead of or if you want to toggle them)
lAlgn := lAlgn or WS_EX_RIGHT or WS_EX_RTLREADING;

SetWindowLong(hYahooEditHnd, GWL_EXSTYLE, lAlgn);
InvalidateRect(hYahooEditHnd, nil, False);

Searching the internet and looking at the Uniscribe documentation in your SDK, you will notice several functions that do the job of displaying text. But what I need is the reordered string in logical form, so that I can use a normal WM_SETTEXT to set the text of the Yahoo edit box.

Logical order is the order in which you typed the text; visual order is the order in which the text should be displayed.
Look here to see how logical and visual orders differ.
So I created the ApplyRightToLeft function, which tries to reorder the logical string so that it displays correctly in a left-to-right edit box.
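To make the difference concrete, here is a toy sketch that is not part of my library and does not use Uniscribe. It assumes that uppercase Latin letters stand in for right-to-left letters and that a space between two such letters is right-to-left too, and it shows how a logical string would appear in a left-to-right paragraph. The names IsRtl and VisualOrderLtr are made up for the illustration, and the real Unicode bidirectional algorithm handles many more cases than this.

```pascal
program LogicalVisualDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Assumption for this demo only: uppercase A..Z play the role of RTL letters.
function IsRtl(c: Char): Boolean;
begin
  Result := (c >= 'A') and (c <= 'Z');
end;

// Returns the left-to-right display order of Logical in an LTR paragraph.
function VisualOrderLtr(const Logical: string): string;
var
  i, j, k, n: Integer;
begin
  Result := Logical;
  n := Length(Logical);
  i := 1;
  while i <= n do
  begin
    if IsRtl(Logical[i]) then
    begin
      // Find the last RTL letter reachable by crossing only RTL letters
      // and spaces.
      j := i;
      k := i + 1;
      while (k <= n) and (IsRtl(Logical[k]) or (Logical[k] = ' ')) do
      begin
        if IsRtl(Logical[k]) then
          j := k;
        Inc(k);
      end;
      // Mirror the RTL stretch i..j in place.
      for k := i to j do
        Result[k] := Logical[i + j - k];
      i := j + 1;
    end
    else
      Inc(i);
  end;
end;

begin
  // Logical (typed) order in, visual (displayed) order out.
  WriteLn(VisualOrderLtr('abc DEF GHI xyz'));  // prints: abc IHG FED xyz
end.
```

The left-to-right words stay where they are, while the right-to-left stretch is mirrored as a whole, which is the kind of word swapping described above.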

To start, you need the Uniscribe Delphi unit files. You can download them from here.
The two important functions I used are ScriptItemize and ScriptLayout.

Look at the code below.

// Requires the Windows unit and the Uniscribe header unit in the uses clause.
function ApplyRightToLeft(inText: WideString): WideString;
var
  generated_items, max_items: Integer;
  text_len, i: Integer;
  hr: HRESULT;
  control: SCRIPT_CONTROL;
  state: SCRIPT_STATE;
  items: array of SCRIPT_ITEM;
  bidiLevel: array of Byte;
  LogicalToVisual, VisualToLogical: array of Integer;
  widths: array of Integer;
begin
  Result := '';
  text_len := Length(inText);
  if text_len = 0 then
    Exit;

  // Most applications won't need to set any control flags.
  ZeroMemory(@control, SizeOf(SCRIPT_CONTROL));

  // Initial state, you will probably want to keep this updated as you process
  // runs in order so that you can always give it the correct direction of the
  // surrounding text.
  ZeroMemory(@state, SizeOf(SCRIPT_STATE));
  state.uBidiLevel := 0; // 0 means that the surrounding text is left-to-right.

  max_items := 16;
  while True do
  begin
    // Make enough room for the output.
    SetLength(items, max_items);

    // We subtract one from max_items to work around a buffer overflow on some
    // older versions of Windows.
    generated_items := 0;
    hr := ScriptItemize(PWideChar(inText), text_len, max_items - 1, @control,
      @state, @items[0], @generated_items);
    if SUCCEEDED(hr) then
    begin
      // It generated some items, so resize the array. Note that we add
      // one to account for the magic last item.
      SetLength(items, generated_items + 1);
      Break;
    end;
    if hr <> E_OUTOFMEMORY then
      Exit; // Some kind of error.

    // The input array isn't big enough, double and loop again.
    max_items := max_items * 2;
  end;

  SetLength(bidiLevel, generated_items);
  SetLength(VisualToLogical, generated_items);
  SetLength(LogicalToVisual, generated_items);
  SetLength(widths, generated_items + 1);

  // Manually extract bidi-embedding-levels ready for ScriptLayout
  for i := 0 to generated_items - 1 do
    bidiLevel[i] := items[i].a.s.uBidiLevel;

  // Build the visual-to-logical and logical-to-visual mappings
  ScriptLayout(generated_items, @bidiLevel[0], @VisualToLogical[0],
    @LogicalToVisual[0]);

  // iCharPos is zero-based, Copy is one-based
  for i := 0 to High(items) do
    items[i].iCharPos := items[i].iCharPos + 1;

  // The width of a run is the distance to the start of the next item
  for i := 0 to High(items) - 1 do
    widths[i] := items[i + 1].iCharPos - items[i].iCharPos;
  widths[High(items)] := 1;

  // Assumed loop body (it is not shown in the listing): append the runs
  // in reverse visual order, as described in the text below.
  for i := generated_items - 1 downto 0 do
    Result := Result + Copy(inText, items[VisualToLogical[i]].iCharPos,
      widths[VisualToLogical[i]]);

  // Release the temporary arrays
  bidiLevel := nil;
  VisualToLogical := nil;
  LogicalToVisual := nil;
  widths := nil;
  items := nil;
end;

The job of ScriptItemize is to break a Unicode string into individually shapeable items and to give you SCRIPT_ITEM structures representing the items that have been processed.
Look at the result of ScriptItemize with the following text.

And here is how it is formed in logical order.

So in my function generated_items will be 3.
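As a small illustration of how the run lengths fall out of the item array, here is a sketch that does not call Uniscribe at all. It relies on the fact that ScriptItemize appends a terminal item whose iCharPos equals the text length, so the length of item i is the difference between two neighbouring positions. The positions below and the names ItemWidths and Starts are made up for the example.

```pascal
program ItemWidthsDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TIntArray = array of Integer;

// CharPos: zero-based start of every item, followed by the terminal entry
// whose position equals the text length.
function ItemWidths(const CharPos: array of Integer): TIntArray;
var
  i: Integer;
begin
  SetLength(Result, 0);
  if Length(CharPos) < 2 then
    Exit;
  // One width per real item; the terminal entry has no width of its own.
  SetLength(Result, Length(CharPos) - 1);
  for i := 0 to High(Result) do
    Result[i] := CharPos[i + 1] - CharPos[i];
end;

const
  // Made-up positions for a 16-character string split into 3 items.
  Starts: array[0..3] of Integer = (0, 6, 11, 16);
var
  W: TIntArray;
  i: Integer;
begin
  W := ItemWidths(Starts);
  for i := 0 to High(W) do
    Write(W[i], ' ');  // prints: 6 5 5
  WriteLn;
end.
```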
After that, things are easy: you calculate the width of each run and call ScriptLayout, which gives you the visual-to-logical and logical-to-visual tables, arrays that tell you in which order the runs should be displayed. Then all you have to do is output the runs in reverse order.
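For those who want to see what such a table looks like without calling the API, here is a rough sketch in plain Pascal. It follows my reading of rule L2 of the Unicode Bidirectional Algorithm (reverse every stretch of runs at a given level or higher, from the highest level down to the lowest odd one), which as far as I understand is what ScriptLayout applies to the run levels. LayoutRuns and MapToStr are hypothetical names, and this is not the Uniscribe implementation.

```pascal
program RunLayoutDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TIntArray = array of Integer;

// Builds both maps from the embedding level of each run (rule L2 sketch).
procedure LayoutRuns(const Levels: array of Byte;
  var VisualToLogical, LogicalToVisual: TIntArray);
var
  n, i, j, lo, hi, tmp, lev, maxLevel, minOdd: Integer;
begin
  n := Length(Levels);
  SetLength(VisualToLogical, n);
  SetLength(LogicalToVisual, n);
  maxLevel := 0;
  minOdd := 256;  // larger than any Byte level: "no odd level seen yet"
  for i := 0 to n - 1 do
  begin
    VisualToLogical[i] := i;
    if Levels[i] > maxLevel then
      maxLevel := Levels[i];
    if Odd(Levels[i]) and (Levels[i] < minOdd) then
      minOdd := Levels[i];
  end;
  // The loop does not run at all when every level is even (pure LTR).
  for lev := maxLevel downto minOdd do
  begin
    i := 0;
    while i < n do
      if Levels[VisualToLogical[i]] >= lev then
      begin
        // Extend the stretch of runs at this level or higher...
        j := i;
        while (j + 1 < n) and (Levels[VisualToLogical[j + 1]] >= lev) do
          Inc(j);
        // ...and reverse it.
        lo := i;
        hi := j;
        while lo < hi do
        begin
          tmp := VisualToLogical[lo];
          VisualToLogical[lo] := VisualToLogical[hi];
          VisualToLogical[hi] := tmp;
          Inc(lo);
          Dec(hi);
        end;
        i := j + 1;
      end
      else
        Inc(i);
  end;
  // The logical-to-visual map is the inverse permutation.
  for i := 0 to n - 1 do
    LogicalToVisual[VisualToLogical[i]] := i;
end;

function MapToStr(const A: TIntArray): string;
var
  i: Integer;
begin
  Result := '';
  for i := 0 to High(A) do
  begin
    if i > 0 then
      Result := Result + ' ';
    Result := Result + IntToStr(A[i]);
  end;
end;

const
  LtrParagraph: array[0..3] of Byte = (0, 1, 1, 0);  // LTR base, two RTL runs
  RtlParagraph: array[0..2] of Byte = (1, 2, 1);     // RTL base, one LTR run
var
  V2L, L2V: TIntArray;
begin
  LayoutRuns(LtrParagraph, V2L, L2V);
  WriteLn(MapToStr(V2L));  // prints: 0 2 1 3
  LayoutRuns(RtlParagraph, V2L, L2V);
  WriteLn(MapToStr(V2L));  // prints: 2 1 0
end.
```

The second call shows why reversing the runs matters in my case: with a right-to-left base level, three runs come out in the order 2, 1, 0.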

Here are links that can make life a little bit easier when using Uniscribe.
Unicode Script Processor for Complex Scripts
Supporting Multilanguage Text Layout and Complex Scripts with Windows NT 5.0
Design and Implementation of a Win32 Text Editor
Displaying Text with Uniscribe
Globalization Step-by-Step
Limitations of Uniscribe
Supporting multilanguage text layout and complex scripts with Windows 2000
Yes I know, I hate VB but...just for those interested
Tutorial Using Unicode in Visual Basic 6.0


maya said...

can you guide for 'ScriptBreak',please?
I want to do ScriptItemize,after that,I want to do softbreak.
Thanks for your helps

Arnaud said...

I used the whole Uniscribe API to render Arabic in our open source PDF engine.
Your article helped me starting!
Thanks for the reference!

Millan said...

Great about this topic. its very interesting about Delphi. thanks for sharing .