Page 1 of 1

Some protocols are not decodable without repeating magic

Posted: Tue Feb 07, 2023 9:15 am
by seanyoung
I'm writing an IRP encoder/decoder in rust, see https://docs.rs/irp/ and https://github.com/seanyoung/cir/tree/main/irp

The encoder is working fine, and produces the same output as IrpTransmogrifier for the protocols defined in IrpProtocols.xml

The decoder generates a fine state machine like regexes. I've got NFA mostly working, with a few minor exceptions. DFA will be next. Note that my decoder works fine with protocols marked decodable=false in IrpProtocols.xml 8-)

Now, in order to generate a decoder, sometimes I have to jump through some hoops because the way the protocols are written in IRP.

Take DirecTV_3FG for example.

Code: Select all

{38k,600,msb}<1,-1|1,-2|2,-1|2,-2>(10,-2,(D:4,F:8,C:4,1,-30m,5,-2)*){C=7*(F:2:6)+5*(F:2:4)+3*(F:2:2)+(F:2)}[D:0..15,F:0..255]
If I split this into down, repeating, and up sections I get:

down: 10,-2
repeating: D:4,F:8,C:4,1,-30m,5,-2
up:

However, down does not work because 10,-2 does not define F and D. After all, I want a decoded down event to actually have usable values. So, my decoder has special code which goes something like this "if the down part does not set the required variables, add a single instance of repeat to down", e.g.

down: 10,-2,D:4,F:8,C:4,1,-30m,5,-2
repeating: D:4,F:8,C:4,1,-30m,5,-2
up:

Now the down decoder can actually give you the value of D & F. See https://github.com/seanyoung/cir/blob/m ... ariants.rs

So now effectively I've re-written:

Code: Select all

{38k,600,msb}<1,-1|1,-2|2,-1|2,-2>(10,-2,(D:4,F:8,C:4,1,-30m,5,-2)*){C=7*(F:2:6)+5*(F:2:4)+3*(F:2:2)+(F:2)}[D:0..15,F:0..255]
into

Code: Select all

{38k,600,msb}<1,-1|1,-2|2,-1|2,-2>(10,-2,(D:4,F:8,C:4,1,-30m,5,-2)+){C=7*(F:2:6)+5*(F:2:4)+3*(F:2:2)+(F:2)}[D:0..15,F:0..255]
Note * => +

In fact, another way to describe this problem is: if I render the IRP with 0 repeats, then the protocol is not decodable.

Code: Select all

$ cir transmit irp --dry-run '{38k,600,msb}<1,-1|1,-2|2,-1|2,-2>(10,-2,(D:4,F:8,C:4,1,-30m,5,-2)*){C=7*(F:2:6)+5*(F:2:4)+3*(F:2:2)+(F:2)}[D:0..15,F:0..255]' -f D=0,F=0 -r 0
info: carrier: 38000Hz
info: rawir: +6000 -1200
A somewhat related issue is the lack of repeat markers in many protocols, e.g. sony8:

Code: Select all

{40k,600}<1,-1|2,-1>(4,-1,F:8,^45m)[F:0..255]
So does this protocol really not permit any repeats, you cannot hold a key down on this remote? Seems like nonsense. So, I have a rule: if there is no repeat marker anywhere, assume the whole thing can be repeated. This essentially does:

Code: Select all

{40k,600}<1,-1|2,-1>(4,-1,F:8,^45m)+[F:0..255]
Hope this makes sense.

How do we feel about amending the protocols so they all make sense wrt having somethign sensible in the down and repeat part?

Posted: Tue Feb 07, 2023 12:55 pm
by The Robman
What do the acronyms NFA and DFA mean?

I wouldn't worry about Sony8, it's not a real protocol, it's just a small component of the Sony DSP protocol, so in that sense it doesn't repeat. (Sony DSP sends 3 repeats of Sony15 36,96, followed by three Sony8 entries and one more Sony15 entry)

What other protocols do you have listed as non-repeating?

Posted: Tue Feb 07, 2023 1:26 pm
by seanyoung
DFA => Deterministic Finiite Automation
NFA => Nondeterministic Finite Automation
This is used by regex engines. By using DFA we can get super-fast decoding, see https://shivankaul.com/blog/nfa-dfa-and-regexes

Here is the nec1 protocol converted to NFA (not DFA).

Image

Posted: Tue Feb 07, 2023 3:59 pm
by seanyoung
The state machine starts at A(0).

Note that there are two edges with 'flash 9024' coming out of A(0), one for repeat and one for down. This means that after flash 9024 the state machine is in two states, M(12) and B(1).

This is exactly the difference between NFA and DFA. In DFA, these two edges would be merged. A DFA state machine can only be in one state at any time. However, removing the double edges to get a minimal state machine is tricky, especially with manchester encoding protocols like rc5.

Posted: Sun Feb 12, 2023 1:21 pm
by Barf
Sorry for not replying earlier; I have been away for a few days.

The way new protocols are discovered/invented is often that someone discovers a device, that sends a pattern that has not been observed before. Then this is scrutinized, and added to our data base of protocols. (Sometimes this scutinization is done possibly too carefully, so that probable implementation quirks are modeled -- Samsung36 (has been fixed now).) It is just not the case that all protocol are sane, general-purpose protocols, but often exist to give a name to a, lets call it pattern, that occurs in IR communication. So, if for example, DirecTV_3FG is "silly", just don't use it, use something else instead. There is also an element of "hey, why can't your program decode this while <other program> can?" "Decoding" many different signals is often considered a strength, preventing false decodes is not really an issue...

To some extent, this can be clarified by verbal documentation of the protocols -- feel free to contribute.

Sean, if you list the protocols that you consider silly, we can try to sort it out.

IrpTransmogrifer does not build DFA or NFA explicitly, but uses the parser generator tool/language ANTLR instead. Have you considered using ANTLR in your tool? A Rust backend for ANTLR appears to exist. The whole parsing process is goverend by the grammar file, which is understandable both by man and machine.
Note that my decoder works fine with protocols marked decodable=false
decodable: false is rather a directive to the decoder not to try to decode using that protocol (for whatever reason), and does not a priori describe a limitation of the decoder.

Posted: Mon Feb 13, 2023 5:16 am
by seanyoung
Thanks for the feedback. Let me compose a list of problematic protocols and maybe some better wording of the problem.

NFA and DFA are state machines used for decoding IR, not for parsing IRP language. The rust code has a peg grammar for parsing IRP: https://github.com/seanyoung/cir/blob/m ... ser.rs#L15