JP1 Remotes Forum

Unicode Character Representation in RemoteMaster Source Code

 
WagonMaster
Joined: 16 Apr 2009
Posts: 361

Posted: Sat Sep 12, 2009 10:01 pm    Post subject: Unicode Character Representation in RemoteMaster Source Code

While I was looking into fixing what I thought might be a bug in RemoteMaster, I was compiling the Java source code and ran into a small problem with the character representations and/or file encodings.

There are a couple of "special" characters in the RM source code that are causing the Java compiler to choke when I compile.

One is the "registered trademark" character ('®'), Unicode character 0x00AE, which, as I understand it, is used as a "line separator character" in a single line of notes in the '*.ir' files. (Is this sort of thing documented anywhere or is it just "assimilated knowledge"? I looked through the 'IR.exe' PDFs and found nothing referring to this.)

The other "special" character is some odd character which seems to translate (by my best efforts) to the Unicode 0xFFFD character, which is, per the documented Unicode character tables, "used to replace an incoming character whose value is unknown or unrepresentable in Unicode".

I think it would be wise to replace those 2 characters with their Java Unicode escape sequences, so the source no longer depends on the file encoding. Essentially this would mean editing 'PropertyReader.java' on line 100 (as of v1.96) to use this line:
Code:

   else if ( ch == '\u00AE' )

And editing 'RemoteConfiguration.java' on line 1615 to use this line:
Code:

   buff.append( '\uFFFD' );

and on line 1580 to use this line:
Code:

   StringTokenizer st = new StringTokenizer( text, "\uFFFD" );

Of course, this assumes that I've interpreted that 0xFFFD character correctly. In reality, I cannot see where/how that character is ever used! Please enlighten me as to its purpose.

Also note that the last recommended line change is for a line whose original form does not actually cause a compiler error, but I suspect the code being generated might not be correct without the change.
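
A minimal sketch of why the escaped forms are safer. The class and method names here are hypothetical and are not taken from RemoteMaster. A Java Unicode escape is plain ASCII in the source file, so the compiler sees the same character no matter which encoding the file is saved or compiled with:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; not part of RemoteMaster.
final class UnicodeEscapeDemo {
    // Written as escapes, so the source file itself stays pure ASCII.
    static final char REGISTERED = '\u00AE';   // registered trademark sign
    static final char REPLACEMENT = '\uFFFD';  // Unicode replacement character

    // Splits text on the replacement character, as in the proposed StringTokenizer line.
    static List<String> split(String text) {
        List<String> parts = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, "\uFFFD");
        while (st.hasMoreTokens()) {
            parts.add(st.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println((int) REGISTERED == 0x00AE);
        System.out.println(split("first" + REPLACEMENT + "second"));
    }
}
```

With a raw non-ASCII character in the literal instead, the result depends on the compiler decoding the file with the same encoding it was saved in.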

Regards,
Bill
Mark Pierson
Expert
Joined: 03 Aug 2003
Posts: 3017
Location: Connecticut, USA

Posted: Sun Sep 13, 2009 8:42 am    Post subject: Re: Unicode Character Representation in RemoteMaster Source

WagonMaster wrote:
One is the "registered trademark" character ('®'), Unicode character 0x00AE, which I've come to understand seems to be used as a "line separator character" in a single line of notes in the '*.ir' files.
KM uses ASCII 166, 0x00A6, which is the "broken bar" character in Windows ('¦'), as the note delimiter for IR. RM should be using the same thing. There are 2 others used to delimit notes created in KM and RM. They are ASCII 171, 0x00AB ('«') and ASCII 187, 0x00BB ('»').

These are used by KM/RM to embed notes that can be parsed by IR when code is pasted.

KM references a 4th special character: ASCII 182, 0x00B6 ('¶'), but I can neither remember nor find out whether it's internal to KM or actually used by IR.
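
For reference, the decimal and hex values above can be checked with a few lines of Java. This is only a sketch: the class and constant names are hypothetical and are not taken from KM, RM, or IR. All four values are above 127, so they fall in the Latin-1 / Windows-1252 range rather than 7-bit ASCII, which is why the file encoding matters for them.

```java
// Hypothetical constants holder; names are illustrative only.
final class NoteDelimiters {
    static final char BROKEN_BAR = '\u00A6';   // decimal 166
    static final char LEFT_QUOTE = '\u00AB';   // decimal 171
    static final char RIGHT_QUOTE = '\u00BB';  // decimal 187
    static final char PILCROW = '\u00B6';      // decimal 182

    // Formats a character as, for example, "U+00A6 (166)" for quick checking.
    static String describe(char ch) {
        return String.format("U+%04X (%d)", (int) ch, (int) ch);
    }

    public static void main(String[] args) {
        for (char ch : new char[] {BROKEN_BAR, LEFT_QUOTE, RIGHT_QUOTE, PILCROW}) {
            System.out.println(describe(ch));
        }
    }
}
```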
_________________
Mark
gfb107
Expert
Joined: 03 Aug 2003
Posts: 3411
Location: Cary, NC

Posted: Sun Sep 13, 2009 9:31 am

Mark's got it right. Those are various ASCII characters IR and KM use as delimiters in certain situations (line breaks in notes, notes in keymoves) that RM must also use. They are used when
  1. Importing .ir files into RMIR
  2. RM generating embedded key moves in a device upgrade for pasting into IR


I'll try switching to the Unicode equivalents Mark has suggested and see if it works.
_________________
-- Greg
Original RemoteMaster developer
JP1 How-To's and Software Tools
The #1 Code Search FAQ and it's answer (PLEASE READ FIRST)
WagonMaster

Posted: Sun Sep 13, 2009 2:40 pm

Thanks for the clarification, guys.

Further source code analysis (looking for both the raw character and the Unicode or hex representation of it) shows this:
  • IR.exe uses: vertical broken bar, registered trademark, and the left/right double angle quotes
  • IR.exe does not use: paragraph (pilcrow)
  • RMIR uses: vertical broken bar, registered trademark, the left/right double
    angle quotes, and something not understood (by me) that looks like Unicode 0xFFFD
    (see below)
  • RMIR does not use: paragraph (pilcrow)

Mark Pierson wrote:
KM references a 4th special character: ASCII 182, 0x00B6 ('¶') but I can't remember nor find if it's internal to KM or in fact used by IR.
Based on my findings, I'd say the paragraph/pilcrow symbol is only used by KM.

RMIR also treats the normal (solid) vertical bar symbol (ASCII 124, 0x7C) specially: the 'store()' function in file 'DeviceUpgrade.java' replaces occurrences of it with the Java Unicode representation '\u007c'. I'm not sure why that is, but I'm not too concerned.
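
To show what such a replacement looks like, here is a sketch that uses the standard `java.util.Properties` class as a stand-in. That is an assumption: RMIR has its own 'PropertyReader', which may behave differently, and the class, method, and key names below are hypothetical. The bar is written out as escape text and decoded back to '|' when the value is loaded.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo; not the actual store() code from DeviceUpgrade.java.
final class VerticalBarEscape {
    // Replaces each '|' with six characters of escape text: backslash, u, 0, 0, 7, c.
    static String escapeBars(String value) {
        return value.replace("|", "\\u007c");
    }

    // Loads one key=value line with java.util.Properties, which decodes such escapes.
    static String roundTrip(String value) {
        Properties props = new Properties();
        try {
            props.load(new StringReader("Notes=" + escapeBars(value)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("Notes");
    }

    public static void main(String[] args) {
        System.out.println(escapeBars("Power|Mute"));
        System.out.println(roundTrip("Power|Mute"));
    }
}
```

One plausible reason for escaping the bar is that it doubles as a field delimiter elsewhere in the file, but the thread does not confirm that.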

WagonMaster wrote:
Of course, this assumes that I've interpreted that 0xFFFD character correctly. In reality, I cannot see where/how that character is ever used! Please enlighten me as to its purpose.

This is still the case. I don't understand the use of this character in RM/RMIR source code. What is its purpose?

When I load up the original RM/RMIR source code file ('RemoteConfiguration.java') with (2 occurrences of) that symbol, I see 3 characters (Unicode 0x00EF, 0x00BF, 0x00BD, which in layman's terms is 'i' with diaeresis ['ï'], inverted question mark ['¿'], and fraction one-half ['½']) where clearly the code is expecting a single character.
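
Those three values match the UTF-8 byte sequence for U+FFFD (EF BF BD) when each byte is read as a separate Latin-1 character. The sketch below reproduces the effect with a hypothetical demo class; it shows the general mechanism and makes no claim about which editor or encoding was actually involved.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing the three-character misreading.
final class ReplacementCharBytes {
    // Encodes U+FFFD as UTF-8, then misreads those bytes as ISO-8859-1.
    static String misread() {
        byte[] utf8 = "\uFFFD".getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Prints U+00EF, U+00BF, U+00BD on separate lines.
        for (char ch : misread().toCharArray()) {
            System.out.printf("U+%04X%n", (int) ch);
        }
    }
}
```

If the source file really is UTF-8, passing `-encoding UTF-8` to `javac` tells the compiler to decode it that way; using escapes in the source avoids the question entirely.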

The other day when I was dealing with this, I used the SourceForge site to browse the CVS source to figure out what these characters were really supposed to be and found that the character in 'PropertyReader.java' is the registered trademark ('®') symbol and the other character (in 'RemoteConfiguration.java') is a question mark in a solid circle ('(?)'). That latter one looked to me like Unicode 0xFFFD, but maybe something is borked up there. Greg, can you look at that 'RemoteConfiguration.java' source file and say what that character (used in 2 spots) is actually supposed to be doing, please? I just cannot figure it out.

Regardless of what that character is doing, I'm confident that when you switch to the '\u00__' Java Unicode representation, it will fix my compilation issues for good. Thanks!

Bill
gfb107
Expert

Posted: Tue Sep 22, 2009 6:34 pm

That odd character is found in some IR files I have lying around, where it is used instead of the registered trademark symbol.

It is used when importing these odd IR files into RMIR.

The character in those files has no Unicode equivalent, so Java converts it to \uFFFD.

I'll clean this up.
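
One way a \uFFFD can arise, sketched under assumptions: the thread does not say which byte or encoding the odd files contain, so the single byte 0xAE decoded as UTF-8 is used here purely as an example of input Java cannot decode. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo; the actual bytes in the odd IR files are not known from the thread.
final class OddSeparatorDemo {
    // Decodes a lone 0xAE byte as UTF-8; it is malformed there, so Java substitutes U+FFFD.
    static String decodeLoneByteAsUtf8() {
        byte[] raw = {(byte) 0xAE};
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Treats both the registered sign and the replacement character as line separators.
    static String normalize(String text) {
        return text.replace('\uFFFD', '\n').replace('\u00AE', '\n');
    }

    public static void main(String[] args) {
        System.out.printf("U+%04X%n", (int) decodeLoneByteAsUtf8().charAt(0));
        System.out.println(normalize("one\uFFFDtwo\u00AEthree"));
    }
}
```

Accepting both characters on import lets the same code handle normal files and the odd ones.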
_________________
-- Greg
gfb107
Expert

Posted: Tue Sep 22, 2009 7:48 pm

Give v1.97 a try.
_________________
-- Greg
WagonMaster

Posted: Wed Sep 23, 2009 4:59 pm

Just a confirmation.... I've now compiled the new v1.97 release and the oddball-character problem is indeed fixed! Thanks!

Bill