Improving language support
I'm thinking of making some changes that will improve support for languages that include special characters such as umlauts. My goal is that you will be able to use special characters, in profiles and waypoint files in particular, without having to use character codes.
I'm asking for feedback as to what you typically need to do to make things work in your language. For example, you might need to use character codes with the INV_AUTOSELL_TYPES option in your profile. I also want to know what changes you need to make to waypoint files you have downloaded that are supposedly multilingual but don't work until you change them.
- Please consider making a small donation to me to support my continued contributions to the bot and this forum. Thank you. Donate
- I check all posts before reading PMs. So if you want a fast reply, don't PM me but post a topic instead. PM me for private or personal topics only.
- How to: copy and paste in micromacro
________________________
Quote:- “They say hard work never hurt anybody, but I figure, why take the chance.”
- Ronald Reagan
- Eggman1414
- Posts: 111
- Joined: Sun Jun 17, 2012 2:27 pm
Re: Improving language support
Interesting idea. It would be pretty cool, if only to have a little bit more customization.
Re: Improving language support
My current idea that I'm exploring is to make the bot use all utf8 everywhere and just convert strings for printing. That would involve hooking into the print functions to convert before printing. Even though I believe this is a good idea and will make it the most compatible, it could still cause problems. I think whatever I decide to do will cause some initial problems.
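The "utf8 everywhere, convert only for printing" idea can be sketched roughly as below. This is a minimal illustration in Python, not the bot's Lua code. The names utf8_to_console and make_console_print are hypothetical, and cp850 is only an assumed console code page (the real one depends on the user's Windows locale).

```python
import io


def utf8_to_console(data: bytes, codepage: str = "cp850") -> bytes:
    """Re-encode UTF-8 bytes for the console; unmappable characters become '?'."""
    return data.decode("utf-8").encode(codepage, errors="replace")


def make_console_print(stream, codepage: str = "cp850"):
    """Build a print-like function that converts just before writing (hypothetical hook)."""
    def console_print(*parts: bytes) -> None:
        line = b" ".join(parts) + b"\n"
        stream.write(utf8_to_console(line, codepage))
    return console_print


# Everything inside the program stays UTF-8; only the output stream sees cp850.
captured = io.BytesIO()  # stands in for the real console
console_print = make_console_print(captured)
console_print("K\u00e4fer".encode("utf-8"))  # a word containing a-umlaut
```

The point of the design is that conversion happens in exactly one place, at the output boundary, so no other code needs to know which code page the console uses.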
Re: Improving language support
Aww, no feedback.
Ok I'll make my own list of situations that need to be considered. Don't blame me if I miss something.
In Profiles
- INV_AUTOSELL_IGNORE
- INV_AUTOSELL_TYPES and INV_AUTOSELL_TYPES_NOSELL
- <friends> and <mobs>
- PARTY_FOLLOW_NAME

- RoMScript and variations
- QuestByName functions
- ChoiceOptionByName
- getQuestStatus
- inventory:findItem
- pawn:hasBuff
- player:findNearestNameOrId
Code snippets
- if target.Name == "name with umlauts" then
- if target.Name == GetIdName(id of obj with umlauts in name) then
Next step is to see what sort of strings each of those situations expects. For the ones that expect ascii umlauts, come up with a plan on how to fix it with, hopefully, backward compatibility.
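To show why the code snippet cases are on the list, here is a small Python illustration (not bot code) of the same name held in two encodings. The variable names are made up, cp850 is an assumed console code page, and which side holds which encoding in the bot is an assumption for the sake of the example.

```python
name = "J\u00e4ger"  # a name containing a-umlaut

id_name = name.encode("utf-8")    # the name left as UTF-8 (assumed)
pawn_name = name.encode("cp850")  # the name converted to the console code page (assumed)

print(id_name)               # b'J\xc3\xa4ger'
print(pawn_name)             # b'J\x84ger'
print(id_name == pawn_name)  # False
```

The two byte strings spell the same name but never compare equal, so a plain == test only works when both sides use the same encoding.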
- spyfromsiochain
- Posts: 84
- Joined: Sun Aug 18, 2013 9:57 am
Re: Improving language support
Well, I don't want you to just speak alone, rock, but I don't think I can help you. I have the English client, no umlauts here <3
I look at those fans with no blades and it amazes me everytime, how can they push air without using blades lol - lisa (pro sentence)
Re: Improving language support
There have been a few times when I've explained things very carefully, mainly to help myself clarify what I'm doing. This is one of them. So even if no one helps, these posts are serving a purpose.
- spyfromsiochain
- Posts: 84
- Joined: Sun Aug 18, 2013 9:57 am
Re: Improving language support
Agreed.
Well, if there's anything I can help with, shout!
Re: Improving language support
Now, I'll list the ones that expect ascii characters and will cause problems.
- <friends> and <mobs>: The friends and mobs lists expect ascii character codes at the moment. Since these are read only when the profile is loaded, I can easily convert them to utf8 at load time. That would support old profiles that still use ascii codes as well as newer ones that use utf8 codes or characters.
- Also expects ascii codes. The easy option again would be to just convert it to utf8 when loaded, for backward compatibility.
- This one's a bit more trouble. I could convert any names used with it, but it's a more heavily used function so a conversion could slow the bot a bit. Although probably not too much.
- Commandline: On English PCs, whatever command you try on the commandline can be copied to a file and used there. But if you use strings with umlauts, you won't get the same results at the commandline as you would in a file. That's because the characters you type are ascii. I should be able to convert the whole command to utf8 before executing it. Funnily, if the command is a print statement it will be converted back to ascii for printing. Can't be helped, but it shouldn't matter as long as the convert functions are fast enough.
- if target.Name == "name with umlauts" then: Pawns and objects have always had their names converted to ascii. So if you had code like this, you would have had to use ascii codes for any umlauts in the name. Unfortunately there is nothing I can do about this. If you have such code you will have to change the name to use utf8 characters.
- if target.Name == GetIdName(id of obj with umlauts in name) then: I believe this never worked, as shown by the Spearmen in Yolius' Haunted minigame. I believe users have had to change it to use a string in their language (using ascii codes) to make it work. After these changes it will work. Which of course means those users who changed it will have to put it back the way it was to get it to work again.
- Should be able to take the same steps as for commandline.
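The "convert it to utf8 when it loads" approach for backward compatibility could look something like the sketch below. It is written in Python purely for illustration and is not the bot's conversion code. to_utf8 is a hypothetical helper, and cp850 is an assumption about which code page the old "ascii codes" came from.

```python
def to_utf8(raw: bytes, legacy_codepage: str = "cp850") -> bytes:
    """Return raw as UTF-8: keep it if it already is, otherwise treat it as legacy."""
    try:
        raw.decode("utf-8")  # pure ASCII and valid UTF-8 both pass unchanged
        return raw
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the bytes are old-style code page characters.
        return raw.decode(legacy_codepage).encode("utf-8")


# A friends list mixing an old-style entry, a new-style entry and a plain one.
loaded = [b"K\x84fer", "K\u00e4fer".encode("utf-8"), b"Wolf"]
friends = [to_utf8(entry) for entry in loaded]
print(friends)  # [b'K\xc3\xa4fer', b'K\xc3\xa4fer', b'Wolf']
```

One caveat of this heuristic: a legacy string whose bytes happen to form valid UTF-8 would be left alone. That is unlikely for ordinary names but not impossible.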
- Bill D Cat
- Posts: 555
- Joined: Sat Aug 10, 2013 8:13 pm
- Location: Deep in the Heart of Texas
Re: Improving language support
Probably the easiest update in all this will be getting createpath.lua to output in UTF-8 format. Though the input for Numpad-0 text will also have to be processed correctly. But generally, since the vast majority of the text that it creates is ASCII, there shouldn't be a huge impact on the performance. Once getTEXT() and GetIdName() fully support UTF-8, then those are the major steps conquered in getting it updated.
After that, it is just a matter of nudging people along in the right direction to use a text editor that supports it. Though I think Notepad, Wordpad and Notepad++ all handle it pretty well in Windows Vista and newer.
Re: Improving language support
Actually I didn't think of createpath. All the automated stuff should be alright because it will just leave all the original strings in utf8. The only thing we have to worry about is any strings entered in MM. I think that's only the "Add Code" option. So, like commandline, we just have to convert anything typed to utf8 before saving it. The conversion functions are turning out to be a bit of a pain but I'm still working on it.
- Bill D Cat
- Posts: 555
- Joined: Sat Aug 10, 2013 8:13 pm
- Location: Deep in the Heart of Texas
Re: Improving language support
I was thinking about some other issues that may or may not cause problems with the language conversion.
Would saving a file that was previously UTF-8 encoded as an ASCII file cause the bot to choke on it?
What I am getting at is this conversion would not be a one-and-done thing for profiles, waypoints and userfunctions. Would the load routines have to know what the saved format of the file was when it is opened so that it only converts to UTF-8 as needed?
I just don't know if every user would save an edited file in UTF-8 format every time, so some type of sanity check might need to be done any time the file was accessed. I guess I am just not all that familiar with the exact differences between them as far as opening and reading the files. I understand the 4-byte encoding method, but not how to initially tell if a file was saved in one format or the other.
Re: Improving language support
First of all any file that doesn't have any bytes above 127 will save the same if saved as ascii or utf8 without BOM (I don't think MM can handle files saved as utf8 with bom). Any file that already has bytes higher than 127 should open as utf8 regardless of how it was saved (in a code editor such as notepad++). The same goes for files with no bytes above 127. They will be opened as ascii regardless of how they were saved (excluding xml files, see note2 below).
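As a rough illustration of that kind of check, the Python sketch below sorts raw file contents into the cases described above. It is not how MM or any editor actually decides. classify and its labels are hypothetical names, and "legacy" simply means "has bytes above 127 that are not valid utf8".

```python
UTF8_BOM = b"\xef\xbb\xbf"


def classify(raw: bytes) -> str:
    """Label file contents as 'utf8-bom', 'ascii', 'utf8' or 'legacy'."""
    if raw.startswith(UTF8_BOM):
        return "utf8-bom"
    if all(byte < 128 for byte in raw):
        return "ascii"  # identical bytes whether saved as ascii or utf8
    try:
        raw.decode("utf-8")
        return "utf8"
    except UnicodeDecodeError:
        return "legacy"


# A file with no bytes above 127 is byte-for-byte the same in both encodings.
text = "<waypoints></waypoints>"
print(text.encode("ascii") == text.encode("utf-8"))  # True
print(classify("K\u00e4fer".encode("utf-8")))        # utf8
print(classify(b"K\x84fer"))                         # legacy
```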
When a file is open, switching between utf8 and ascii won't change the file. It only changes the way it displays the characters. utf8 characters will change to their individual bytes. You can even open and save a file with utf8 characters in notepad, which only supports ascii. It will just show the individual utf8 bytes. The only thing to be considered when dealing with utf8 vs ascii is when you type a special character.
When in utf8 mode, if you type a special character, e.g. ä (Alt 132), that is a utf8 character and is saved as 2 utf8 bytes in the file. If you are in ascii mode and type ä, that is an ascii character which is not supported by MM because MM's ascii codes are different from Windows' ascii codes. This is because Windows uses 2 character sets, one for Windows apps and one for console apps such as cmd.exe and MM.
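The byte values involved can be seen with a few lines of Python. This assumes a Western European setup, where the console character set is cp850 and the Windows app character set is cp1252. Other locales use different code pages.

```python
a_umlaut = "\u00e4"  # the character typed with Alt 132

utf8_bytes = a_umlaut.encode("utf-8")     # two bytes in a utf8 file
console_byte = a_umlaut.encode("cp850")   # console apps such as cmd.exe
windows_byte = a_umlaut.encode("cp1252")  # ordinary Windows apps

print(list(utf8_bytes))    # [195, 164]
print(list(console_byte))  # [132]
print(list(windows_byte))  # [228]
```

So the same character is byte 132 in one single-byte set and byte 228 in the other, which is why a character typed in an editor's ascii mode does not match what a console program expects.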
So it has always been the case that if users wanted to use actual special characters they had to save as utf8 without bom. This hasn't changed.
Note: if you use slash codes, eg. \132, this is unaffected by what encoding you save as.
Note2: profiles and waypoint files are xml files and should always start with
Code: Select all
<?xml version="1.0" encoding="utf-8"?>
This causes editors such as Notepad++ to open and save the file as utf8 automatically, regardless of whether there are bytes higher than 127 or not.
Hope that helps.