VBA Unicode done right, redux

Some time ago I wrote a short article on forcing VBA to use some form of unicode to allow simple insertion of non-latin1 text into VBA modules. It sort of worked, for the most part, kind of. Well, it didn't. The problem is down to the VBA editor being locked to the local encoding of the machine it is running on. You can only type characters from BIG5 in China, Shift-JIS in Japan, and so on. If you have a need to make a VBA module that uses strings suitable for all locales ... you're pretty much stuffed.

Except you're not. If you do it properly, you can have any unicode character displayed in any VBA locale. So how do you do it properly? Well, you're supposed to use ChrW$() to generate unicode characters individually. Yes. Really.

Say you want to include the Chinese character 汉 in your latin1 text. The only way to do it is to use a string like this:

    MyString = "This is my string with a " &
    ChrW$(&H6C49) & " in it."

Imagine if you have entire sentences made up of such characters. Doesn't bear thinking about, does it? However, I've made a simple way to work with it. The trick is to write your strings outside the VBA editor using a unicode encoding (e.g. UTF-8). For example, take this little proto-module:

Sub Tomato
    MyForm.MyLabel.Caption = "Chinese or the Sinitic language(s) (汉语/漢語, pinyin: Hànyǔ; 中文, pinyin: Zhōngwén) can be considered a language or a language family and is originally the indigenous languages spoken by the Han Chinese in China."
End Sub

That won't work with VBA due to the Chinese characters. However, they display fine in unicode. Hrm. What to do. Well, you can use my little utility to fix it. It's a simple .NET app that should run on just about anything (Mono included, although I've not tested it).

Using it is very simple. Just paste your unicode string into the top box and click the Unicode to VBA button. The output should look like this:

You can also go the other way, so copy the already converted string back out of your VBA module into the top box and hit VBA to Unicode. You should see this:

That's basically it. Once the converted string is copied into your VBA, it will look like this.

Sub Tomato
    MyForm.MyLabel.Caption = "Chinese or the Sinitic language(s) (" & ChrW$(&H6C49) & ChrW$(&H8BED) & "/" & ChrW$(&H6F22) & ChrW$(&H8A9E) & ", pinyin: Hàny" & ChrW$(&H1D4) & "; " & ChrW$(&H4E2D) & ChrW$(&H6587) & ", pinyin: Zh" & ChrW$(&H14D) & "ngwén) can be considered a language or a language family and is originally the indigenous languages spoken by the Han Chinese in China."
End Sub

That should be fine for inclusion directly in VBA and all will work properly.

You can download VBAUnicodeConverter.exe directly, or the source files in VBAUnicodeCoverter.zip. There is also the original Perl script that you can find in convert.pl. There are no particular restrictions on the source, except that you should credit me if you nick bits of it. Not that there's much to nick.

Displays ?? for Chinese - Msgbox works though

Hi Craig,

Thanks for the nice and easy utility for converting the strings from Unicode to VBA.
I am using VBA in Client Builder environment. I tried using your Msgbox technic. and it worked very well ! But when I replaced my text strings with the VBA converted string, it displayed ?? for the chinese chars ( English was displayed as it is ).
I wonder why the text display on the form did not work while the Msgbox could display the chinese char properly .
Is there anything else needed to display simple text?

Thanks
Durga

ClientBuilder

Hi,

I've never used Client Builder, so I'm not sure if it uses the standard Windows forms. I've never had a problem with the actual forms, only with the message box, but the problem you're describing sounds like the same one as with MsgBox. If it only supports single byte characters, you're never going to get Chinese characters.

If you find a solution, can you post it on here? It'll be handy for anyone else with similar problems :)

Ta,

--
C

Client Builder

Sure, I will post once I get the solution !
Meanwhile , I got another problem.

The caption of the MessageBox does not display the chinese characters.

Here is what I tried.
MessageBoxW "Can you read this character: " & ChrW$(&H6C49) & "?", _
vbYesNo, "Test" & ChrW$(&H6C49) & "Box"

and I got : "Test ? Box"

Any idea how to get this ? This is also very important for me.

Thanks
Durga

Captions

Hmm ... I'm not sure. As far as I know it's related to the font that's being used for the caption, but it could be to do with the system locale. What version of Windows is this?

--
C

Captions

Windows XP
Professional
Version 2002
Service Pack 2

I had regional settings installed Asian language pack.

Thanks
Durga

Captions display in Client Builder

Hi Craig,

I got the solution for the ?? display problem in Client Builder environment.

You need to set the language as Chinese ( or whatever target language it is ) in the ControlPanel->Regional and LanguageOptions --> Advanced tab , for nonUnicode applications.
That lets the Client Builder display Chinese on all mimic screens and other VBA generated Unicode strings.

By the way , thanks once again for the little Utility 'Unicode to VBA'. We will be using it to generate VBA CharW() for our Unicode strings in the CB code.

Thanks
Durga

Good stuff!

Durga,

Really glad to hear you got it working! Also very glad you find the utility useful :)

--
C

Issue displaying chinese characters using a msgbox

Hello Craiga,

I need what you have just written. However I don´t get the chinese characters when using a msgbox.

This works fine...
Range("A1") = "This is my string with a " & ChrW$(&H6C49) & " in it."

This doesn´t
msgbox("This is my string with a " & ChrW$(&H6C49) & " in it.")
The excel message displays "This is my string with a ? in it."

I would really appreciate any help on this issue.
Kind regards,

Rodrigo Gomez
skype: rodrogg
rodrigo@arcadiaconsulting.es

Using MessageBoxW

The normal MsgBox function uses the Windows system font, and as such is tied to the OS's code page. You can display unicode characters in message boxes using the MessageBoxW function from User32.dll. Include this code before you use it:
Private Declare Function User32MsgBox Lib "user32" Alias "MessageBoxW" _
    (Optional ByVal hWnd As Long, Optional ByVal Prompt As Long, _
     Optional ByVal Title As Long, Optional ByVal Buttons As Long) As Long

Function MessageBoxW(cPrompt As String, _
    Optional cButtons As VbMsgBoxStyle = vbOKOnly, _
    Optional cTitle As String) As Long

    MessageBoxW = User32MsgBox(0, StrPtr(cPrompt), StrPtr(cTitle), cButtons)

End Function
You can then call MessageBoxW using just the text, or with the optional caption and button styles like this:
MessageBoxW "This is my string with a " & ChrW$(&H6C49) & " in it."

MessageBoxW "Can you read this character: " & ChrW$(&H6C49) & "?", _
            vbYesNo, "Test Message Box"

MessageBoxW does not work for me

Hello Craig,

Thank you for your solution. However it doesn´t work for me.
You are right, message box characters depend on the operating system. We have tried on a chinese pc and it´s displayed correctly.

I checked User32.dll and it´s in C:\WINDOWS\system32.
Any clue?.
Once more, I really appreciate your time.
Regards,

Rodrigo Gomez
skype: rodrogg
rodrigo@arcadiaconsulting.es

Interesting ...

I've put together a Word 2003 document that illustrates the MessageW example. You can download it here. See if that works for you. I've tested it on Word 2007 and Word 2003.