Hi,
When I experimented with the 1000 samples you provided from the AG dataset and the BERT model, I got the following results:
For dataset data/ag: For target model bert: original accuracy: 94.200%, adv accuracy: 29.500%, avg changed rate: 25.076%, num of queries: 481.7
However, the adv accuracy reported in your paper is 11.5%.
I'm not sure what causes this difference. Did you use the same 1000 examples in your paper?
Thanks.