The problem is that the code does not detect the initial pin state correctly, so the first half-step is missed.
The way I solved the problem was to add this line:
self._process_rotary_pins(None)
to the end of the __init__ function of the RotaryIRQ class.
I don't know why it works, but it does.
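
One possible explanation, as a toy model only. This is not the library's code: ToyHalfStepDecoder, its prime flag, the DIRECTION table and count_first_half_step are all made up for illustration. It assumes the encoder rests with both pins high while the decoder starts out assuming both pins low. If that is roughly what happens, running the pin handler once at the end of __init__ would sync the stored state with the real pins before the first interrupt arrives.

```python
# Toy half-step quadrature decoder (hypothetical, for illustration only).
# A pin pair is the 2-bit value (clk << 1) | dt; 0b00 and 0b11 are detents.

# Direction implied by leaving a detent into a given intermediate state,
# assuming the clockwise Gray sequence 00 -> 01 -> 11 -> 10 -> 00.
DIRECTION = {
    (0b00, 0b01): 1, (0b11, 0b10): 1,    # clockwise
    (0b00, 0b10): -1, (0b11, 0b01): -1,  # counter-clockwise
}


class ToyHalfStepDecoder:
    def __init__(self, read_pins, prime=False):
        self._read_pins = read_pins
        self._detent = 0b00   # assumed resting state, may not match the real pins
        self._pending = 0     # direction implied by the last intermediate state
        self.value = 0
        if prime:
            # Same idea as calling the pin handler once at the end of __init__.
            self.process()

    def process(self):
        pins = self._read_pins()
        if pins in (0b00, 0b11):
            # Count a half-step only when a *different* detent is reached.
            if pins != self._detent and self._pending:
                self.value += self._pending
            self._detent = pins
            self._pending = 0
        else:
            self._pending = DIRECTION.get((self._detent, pins), 0)


def count_first_half_step(prime):
    pins = [0b11]  # assumption: encoder rests with both pins high
    decoder = ToyHalfStepDecoder(lambda: pins[0], prime=prime)
    for reading in (0b10, 0b00):  # one clockwise half-step: 11 -> 10 -> 00
        pins[0] = reading
        decoder.process()
    return decoder.value


print(count_first_half_step(prime=False))  # 0: first half-step is missed
print(count_first_half_step(prime=True))   # 1: first half-step is counted
```

In this model the unprimed decoder thinks it started at 00, so arriving at 00 looks like a bounce back to the same detent and nothing is counted. The real transition table in RotaryIRQ may behave differently, so treat this as a guess at the mechanism, not a description of it.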