Discussion:
Mixing ANSI and Unicode windows and WM_CHAR issues with IME - BugTest.zip (0/1)
Pix
2006-02-06 12:37:10 UTC
I have written a small application to reproduce some issues I've been
having lately. It consists of a main application (ANSI) and a Unicode
DLL. The main app is an app-wizard generated, dialog-based application
that creates a child Unicode dialog using the Unicode DLL. Both the app
and the DLL use shared MFC.

When the app starts, two edit controls are displayed. One is ANSI, hosted
on the main app dialog, and the other is Unicode, hosted in the child
Unicode dialog.

In order for the PreTranslateMessage mechanism to work in the Unicode DLL,
the main app calls the DLL's TranslateAccelerator function, which calls
AfxPreTranslateMessage in the context of the Unicode DLL.

Now activate the IME and enter some, say, Japanese characters in the ANSI
edit box. If you are running with Japanese regional settings, you will get
what you typed. If you do the same in the other (Unicode) edit box,
you will get garbage.

If you just comment out the TranslateAccelerator call, even the Unicode
edit box works fine.

But I really need that TranslateAccelerator because I need the
PreTranslateMessage mechanism in both the app and the DLL.

I have narrowed the problem down to the WM_CHAR message, which can
contain either an MBCS or a Unicode character. If TranslateAccelerator
is commented out, PreTranslateMessage does not call IsDialogMessageW
and the message is dispatched by DispatchMessageA in the main app's
message loop, which recognizes that the target window is Unicode and
does the proper conversion.

On the other hand, if TranslateAccelerator is in place,
IsDialogMessageW is called at some point. It assumes that the WM_CHAR is
already Unicode and does no translation. Therefore, the Unicode edit
box gets an MBCS WM_CHAR, treats it as a Unicode character and inserts
garbage.
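
To make the failure concrete, here is a minimal sketch (not taken from
BugTest.zip) of what happens when MBCS bytes are taken as UTF-16 code
units with no conversion. It assumes a Shift-JIS (code page 932) ANSI
code page, where the hiragana あ (U+3042) is the byte pair 0x82 0xA0, and
it assumes each byte arrives as the character value of its own WM_CHAR.
The function name misreadAsUtf16 is hypothetical.

```cpp
#include <string>
#include <vector>

// Widen each MBCS byte to a 16-bit unit without any code-page conversion.
// This models (as an assumption) a Unicode window that receives ANSI
// WM_CHAR values unchanged and treats them as UTF-16.
std::vector<char16_t> misreadAsUtf16(const std::string& mbcsBytes)
{
    std::vector<char16_t> units;
    for (unsigned char byte : mbcsBytes) {
        units.push_back(static_cast<char16_t>(byte));
    }
    return units;
}
```

Feeding it the two bytes 0x82 0xA0 yields the units U+0082 and U+00A0
rather than U+3042, which is the kind of garbage described above.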

How does one proceed with mixed ANSI/Unicode windows in a single
application and still have the PreTranslateMessage mechanism in place
working normally?

Any help or suggestions would be highly appreciated!

Pix
Michael (michka) Kaplan [MS]
2006-02-07 03:07:58 UTC
For IMEs, always use WM_IME_CHAR, not WM_CHAR....
--
MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap

This posting is provided "AS IS" with
no warranties, and confers no rights.
Pix
2006-02-07 08:56:00 UTC
Post by Michael (michka) Kaplan [MS]
For IMEs, always use WM_IME_CHAR, not WM_CHAR....
Thanks! It is just that I want to have transparent support for Asian
languages without having to do anything special. This doesn't seem
possible if ANSI and Unicode windows are mixed.

Now forget about the IME... Use the test app and do an ALT+number to
generate a non-ANSI character. If you run Spy++ on both edit controls you
will see that they both get an IDENTICAL WM_CHAR message. One would
expect the ANSI window to get the MBCS WM_CHAR and the Unicode window
to get a Unicode WM_CHAR, but this does not happen. The Unicode window
also gets an MBCS WM_CHAR.

Pix
Norman Diamond
2006-02-08 00:24:25 UTC
Post by Pix
Now forget about the IME... Use the test app and do an ALT+number to
generate a non-ANSI character. If you run Spy++ on both edit controls you
will see that they both get an IDENTICAL WM_CHAR message. One would
expect the ANSI window to get the MBCS WM_CHAR and the Unicode window
to get a Unicode WM_CHAR, but this does not happen. The Unicode window
also gets an MBCS WM_CHAR.
Sorry I can only express an opinion about "one would expect" instead of what
happens, since I've never used ALT+number to generate anything.

If you use ALT+number to generate an ANSI character, i.e. a character that
exists in the current ANSI code page, then I would expect it to be handled
as your input requested it. But you aren't talking about such a case.

If you use ALT+number to generate a non-ANSI character, i.e. a codepoint
that doesn't represent a character in the current ANSI code page, then I
would expect garbage. If garbage includes having the MBCS value pass
unchanged instead of being converted to some Unicode character (even if a
question mark or an underscore or a square box is some common choice to use
as a substitute for bad input), I would not be surprised if that happens.

Of course the usual way to input a character that isn't directly on the
keyboard is to use the IME, and for some punctuation marks that are hard to
find using the IME, copying and pasting from the character code chart is
needed occasionally. I don't think there are many people who either
memorize or look up thousands of numeric codes to use with an ALT key.
Pix
2006-02-08 08:23:07 UTC
"Norman Diamond" <***@community.nospam> wrote:

[snip]
Post by Norman Diamond
Of course the usual way to input a character that isn't directly on the
keyboard is to use the IME, and for some punctuation marks that are hard to
find using the IME, copying and pasting from the character code chart is
needed occasionally. I don't think there are many people who either
memorize or look up thousands of numeric codes to use with an ALT key.
Thanks Norman!

I didn't mean to say that using ALT+number is a preferred way of
entering characters. I was merely using it as a different example (not
IME related) to reproduce the same problem.

Pix
Pix
2006-02-08 08:23:55 UTC
Post by Michael (michka) Kaplan [MS]
For IMEs, always use WM_IME_CHAR, not WM_CHAR....
Using WM_IME_CHAR directly doesn't seem like a solution because, as
with WM_CHAR, there is no way to determine if a particular message
contains MBCS or Unicode characters in order to perform the necessary
conversion.

Pix
Michael (michka) Kaplan [MS]
2006-02-08 15:54:06 UTC
If it is a Unicode window, it should be UTF-16; if not, it should be MBCS.
And there are no numpad-type bugs to worry about. Easy! :-)
Ben Bryant
2006-02-08 19:29:08 UTC
Did Michael even read this thread? You stated up front that that is what you
expected.

I think the reason you are getting the ANSI message in your Unicode window
might have to do with the MFC framework you are using (your function names
gave away that you are using MFC). If keyboard messages are all getting
routed through the framework, then maybe they are going to conform to the
fact that your frame window is ANSI. Either that, or the keyboard system
does not discriminate between windows within an application. The whole
keyboard system seems to have issues apart from the normal ANSI/UNICODE
dichotomy of Windows messaging. Then again, maybe you just didn't set up
something quite right with your extension DLL.

"Michael (michka) Kaplan [MS]" <***@microsoft.online.com> wrote in
message news:***@TK2MSFTNGP14.phx.gbl...
If it is a Unicode window, it should be UTF-16; if not, it should be MBCS.
And there are no numpad-type bugs to worry about. Easy! :-)
Pix
2006-02-09 09:32:18 UTC
Post by Ben Bryant
I think the reason you are getting the ANSI message in your Unicode window
might have to do with the MFC framework you are using (your function names
gave away that you are using MFC). If keyboard messages are all getting
routed through the framework, then maybe they are going to conform to the
fact that your frame window is ANSI. Either that, or the keyboard system
does not discriminate between windows within an application. The whole
keyboard system seems to have issues apart from the normal ANSI/UNICODE
dichotomy of Windows messaging. Then again, maybe you just didn't set up
something quite right with your extension DLL.
No, I don't think it has anything to do with any particular framework.
It is true I used MFC but I think it is irrelevant. You could write
code to reproduce the problem with just any framework or plain API
code.

The point is in having a) ANSI main app, b) Unicode plug-in DLL and c)
well-defined interface for a) to access b) including a method (that I
called TranslateAccelerator) to allow component b) to have a look at
the messages from the main message loop before they are dispatched.

This is actually very common. Take a look at
IShellView::TranslateAccelerator. It is the same thing. A shell view
is a window hosted in an app exposing a method to allow it to capture
some messages early. So, this is not my idea.

Basically, after all this time trying to debug this problem I believe
it would have been much cleaner if WM_CHAR message had an additional
flag to distinguish whether the character it carried is MBCS or
Unicode.

Currently I see two solutions: 1) handle WM_IME_CHAR and produce
WM_CHARs by *SENDING* them or 2) implement both
TranslateAcceleratorA() and TranslateAcceleratorW(). By using the
latter the calling app identifies itself as either ANSI or Unicode and
the plug-in can do the translation if necessary.
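
For option 2, the ANSI entry point has to turn ANSI WM_CHAR values back
into UTF-16 before the Unicode side looks at them. A rough sketch of that
step follows, under these assumptions: a double-byte character arrives as
two consecutive ANSI WM_CHAR messages (lead byte, then trail byte), and
the class AnsiCharAssembler and its callbacks are hypothetical names. On
Windows the callbacks would most likely wrap IsDBCSLeadByteEx and
MultiByteToWideChar; they are passed in here so the sketch stays portable.

```cpp
#include <functional>
#include <string>

// Hypothetical helper for the ANSI entry point of a Unicode plug-in.
// It collects the bytes of one MBCS character and converts them once
// the character is complete.
class AnsiCharAssembler {
public:
    using IsLeadByte = std::function<bool(unsigned char)>;
    using Decode = std::function<char16_t(const std::string&)>;

    AnsiCharAssembler(IsLeadByte isLead, Decode decode)
        : isLead_(isLead), decode_(decode) {}

    // Feed the character value of one ANSI WM_CHAR. Returns true and sets
    // `out` when a complete character is available; returns false while
    // still waiting for the trail byte.
    bool feed(unsigned char byte, char16_t& out)
    {
        pending_.push_back(static_cast<char>(byte));
        if (pending_.size() == 1 && isLead_(byte)) {
            return false;  // lead byte seen: wait for the trail byte
        }
        out = decode_(pending_);  // one or two bytes, converted together
        pending_.clear();
        return true;
    }

private:
    IsLeadByte isLead_;
    Decode decode_;
    std::string pending_;
};
```

TranslateAcceleratorA would feed each WM_CHAR value through such a helper
and hand the resulting UTF-16 value on, while TranslateAcceleratorW would
pass messages through unchanged. Characters that do not exist in the ANSI
code page are already lost by this point, so this cannot recover them.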

I have gone down route 1). Hopefully this has no new side effects.
I am assuming that DefView (the default shell view) does the same.

Thanks to all that helped!

Pix
Ben Bryant
2006-02-09 16:33:40 UTC
Pix, I was not suggesting MFC or you were doing anything wrong or unusual.
Here is a question: What makes you think the DLL has anything to do with it?
Would this WM_CHAR mixup not occur anytime you have a UNICODE window in an
ANSI app and do message translating/intercepting in the main message pump?
Note that once you get the message in ANSI you will have lost any Unicode
text that does not fit in your ANSI code page, so you wouldn't want your
plugin or Unicode window to do any translating from ANSI to Unicode.
Pix
2006-02-09 19:14:54 UTC
Post by Ben Bryant
Pix, I was not suggesting MFC or you were doing anything wrong or unusual.
Ok, it's a misunderstanding.
Post by Ben Bryant
Here is a question: What makes you think the DLL has anything to do with it?
Did I explicitly say it is a DLL issue? I apologize if it sounded like
that. I only introduced DLL to the whole story because I don't think
there is (an easy) way to have both ANSI and Unicode MFC windows in a
single app and I didn't want to go along the route of writing an
API-only app.
Post by Ben Bryant
Would this WM_CHAR mixup not occur anytime you have a UNICODE window in an
ANSI app and do message translating/intercepting in the main message pump?
Note that once you get the message in ANSI you will have lost any Unicode
text that does not fit in your ANSI code page, so you wouldn't want your
plugin or Unicode window to do any translating from ANSI to Unicode.
You are absolutely right but having an ANSI app and a Unicode DLL is
not something I can change right now. I just needed to find a way for
the two to coexist well and it seems I have found a satisfactory
solution.

Thanks for your help!

Pix
Ben Bryant
2006-02-09 22:24:07 UTC
Pix, thanks for the response, I am curious to understand your conclusion. So
you have gone with WM_IME_CHAR in your Unicode window and a SendMessage to
pass the Unicode value to WM_CHAR on the same window. Is this correct? And I
take it that solves the IME case, but does it solve the ALT+number issue you
mentioned or was that not important?
Pix
2006-02-10 11:31:22 UTC
Post by Ben Bryant
Pix, thanks for the response, I am curious to understand your conclusion. So
you have gone with WM_IME_CHAR in your Unicode window and a SendMessage to
pass the Unicode value to WM_CHAR on the same window. Is this correct? And I
take it that solves the IME case, but does it solve the ALT+number issue you
mentioned or was that not important?
Honestly, I have not tried this. As somebody said, ALT+number is not
something everybody uses every day. Most users do not even know
about it, so whether it works or not is not really my biggest
concern. I would expect it not to work unless WM_IME_CHAR is sent, and
that doesn't seem likely.

Pix
Mihai N.
2006-02-10 07:46:42 UTC
Post by Pix
This is actually very common. Take a look at
IShellView::TranslateAccelerator. It is the same thing. A shell view
is a window hosted in an app exposing a method to allow it to capture
some messages early. So, this is not my idea.
Yes, but it is not the same thing. IShellView is Unicode (like all COM
interfaces), and the hosting application (Explorer) is also Unicode.
It is not the mixture you are trying.
--
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Pix
2006-02-10 11:40:28 UTC
Post by Mihai N.
Yes, but it is not the same thing. IShellView is Unicode (like all COM
interfaces), and the hosting application (Explorer) is also Unicode.
It is not the mixture you are trying.
I think you prove nothing. Consider a MBCS application hosting a shell
view on e.g. Windows XP. This is another example of a MBCS app and a
Unicode "plug-in" window. On the other hand, TranslateAccelerator has
just a single argument of type MSG * and as far as I know there is no
MSGA and MSGW, just MSG. So basically, DefView (being Unicode) can
just as well receive MBCS WM_CHARs as my example did and effectively
get in trouble just as I did.

I have previously done tests on DefView and it seems that in-place
renaming of items (using an IME and Asian characters) does not suffer from
the problem I had. The only explanation I can think of is that DefView
employs the same or a very similar technique of handling WM_IME_CHAR and
posting itself WM_CHARs.

Pix
Pix
2006-02-09 09:14:22 UTC
Post by Michael (michka) Kaplan [MS]
If it is a Unicode window, it should be UTF-16; if not, it should be MBCS.
And there are no numpad-type bugs to worry about. Easy! :-)
You kind of gave a hint there. It seems that WM_IME_CHAR is *sent* to
the window while the WM_CHAR messages generated by the default handler of
WM_IME_CHAR are actually *posted*. Therefore, the WM_CHAR messages are
seen by the main app's message loop while WM_IME_CHAR goes directly to
the destination window.

The simple solution is to duplicate the work of the default window
procedure with respect to WM_IME_CHAR by *sending* WM_CHAR messages
instead of posting them. This way they never get to the app's message
queue but to the target window directly.
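
The difference between posting and sending can be modeled in a few lines
of portable C++. This is only a sketch of the idea, not Win32 code:
ModelThread, post, send and pump are hypothetical stand-ins for the
message queue, PostMessage, SendMessage and the host's message loop. Only
the two message values are the real ones from the Windows headers.

```cpp
#include <cstdint>
#include <deque>
#include <functional>

// Real message identifiers; everything else in this block is a model.
const unsigned kWmChar = 0x0102;     // WM_CHAR
const unsigned kWmImeChar = 0x0286;  // WM_IME_CHAR

struct Message {
    unsigned id;
    std::uint32_t wParam;
};

// Hypothetical stand-in for one UI thread with a single window.
class ModelThread {
public:
    using Handler = std::function<void(const Message&)>;

    ModelThread(Handler preTranslate, Handler windowProc)
        : preTranslate_(preTranslate), windowProc_(windowProc) {}

    // Models posting: the message waits in the queue for the loop.
    void post(const Message& m) { queue_.push_back(m); }

    // Models sending to a window on the same thread: a direct call to
    // the window procedure that never touches the queue.
    void send(const Message& m) { windowProc_(m); }

    // Models the host's message loop: every queued message passes the
    // pre-translation hook before it reaches the window procedure.
    void pump()
    {
        while (!queue_.empty()) {
            Message m = queue_.front();
            queue_.pop_front();
            preTranslate_(m);
            windowProc_(m);
        }
    }

private:
    std::deque<Message> queue_;
    Handler preTranslate_;
    Handler windowProc_;
};
```

A message passed to post() reaches the pre-translation hook, which is
where the ANSI/Unicode mix-up happens in the real application, while a
message passed to send() reaches only the window procedure. In a real
Unicode window procedure the equivalent step would presumably be a
WM_IME_CHAR handler that calls SendMessageW with WM_CHAR instead of
passing the message on to the default window procedure.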

This seems like a nice fix for something that looks like a *bug*. Why
does the default window procedure post WM_CHAR messages?

Pix
Michael (michka) Kaplan [MS]
2006-02-19 20:18:06 UTC
WM_CHAR is an informative message, nothing more. It can be ignored, and the
results of it do not change anything....

The difference is not a bug, it is simply the way things are....