[APP CRASH, AArch64] aflags not properly restored using drreg API due to incomplete decoder
Using drreg's API (drreg_reserve_aflags() and drreg_unreserve_aflags()) to save and restore aflags on AArch64 clobbered the application's aflags. As a result, the control flow under DR diverged from the control flow of a native run, leading to an unexpected application-level assertion. The bug appeared after a couple of million instructions, with DR run with these arguments: -unsafe_build_ldstex -disable_traces -vm_size 2G -no_enable_reset -t drcachesim -offline -trace_after_instrs 3000G -exit_after_tracing 20G.
As an optimization, drreg restores aflags as late as possible to minimize spills/fills. This optimization works fine for all the recognized flags-sensitive instructions (those that read or write flags) in the application that I checked. However, the AArch64 decoder is incomplete, and unrecognized instructions are decoded as instances of a generic opcode, OP_xx. It seems that drreg's placement of the aflags restoration does not take these unrecognized instructions into account, and there are multiple cases like the following:
m4 ... subs ...
L3 ... xx ...
m4 ... msr %x1 -> %nzcv
In this example, DR clobbers the aflags (with subs). An unrecognized application instruction (xx), which may read or write aflags, then operates on the clobbered aflags. Finally, the flags are overwritten (msr), nullifying any effect the unrecognized instruction had on the flags. Instead, the flags should have been restored before the xx instruction.
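The placement rule I would expect can be sketched with a small standalone model. This is not drreg's code: `model_instr_t`, `flags_use_t`, and `find_restore_point()` are hypothetical names made up for illustration, and the "non-conservative" mode only reflects my guess at the current behavior. The idea is that, when scanning forward for the latest safe restore point, an instruction with unknown flags behavior (such as OP_xx) should be treated as if it reads the flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT DynamoRIO data structures. */
typedef enum {
    FLAGS_NONE,   /* neither reads nor writes the flags */
    FLAGS_READ,   /* reads the flags (e.g. a conditional branch) */
    FLAGS_WRITE,  /* overwrites all flags without reading them */
    FLAGS_UNKNOWN /* undecoded instruction: flags behavior not known */
} flags_use_t;

typedef struct {
    bool is_app;     /* true: application instr; false: meta (tool) instr */
    flags_use_t use; /* how the instruction uses the flags */
} model_instr_t;

/*
 * Scans il[start..n) and returns the index of the first application
 * instruction before which the application's flags must be restored.
 * Returns -1 if the application overwrites the flags first (the saved
 * value is dead), or n if no such instruction is found (restore at the
 * end of the list).
 *
 * With conservative == true, FLAGS_UNKNOWN is treated like a read.
 * With conservative == false, it is ignored, which is an assumption
 * about the behavior described in this issue.
 */
int
find_restore_point(const model_instr_t *il, size_t n, size_t start,
                   bool conservative)
{
    for (size_t i = start; i < n; i++) {
        if (!il[i].is_app)
            continue; /* meta instrs do not need the app's flags */
        switch (il[i].use) {
        case FLAGS_READ:
            return (int)i;
        case FLAGS_UNKNOWN:
            if (conservative)
                return (int)i;
            break;
        case FLAGS_WRITE:
            return -1;
        case FLAGS_NONE:
            break;
        }
    }
    return (int)n;
}

/* Mirrors the sequence above: meta subs, app xx, then an app flags reader. */
void
model_self_check(void)
{
    const model_instr_t il[] = {
        { false, FLAGS_WRITE },  /* m4 subs: tool clobbers the flags */
        { true, FLAGS_UNKNOWN }, /* L3 xx: undecoded app instruction */
        { true, FLAGS_READ },    /* a later app instr that reads flags */
    };
    /* Conservative: restore before xx (index 1). */
    assert(find_restore_point(il, 3, 1, true) == 1);
    /* Non-conservative: xx is skipped, restore lands after it (index 2). */
    assert(find_restore_point(il, 3, 1, false) == 2);
}
```

In the conservative mode the restore lands before xx, which matches the expected behavior described above. The cost is an earlier restore (and possibly an extra spill/fill) around every undecoded instruction, which seems acceptable until the decoder is complete.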