Adversarial Attack on Neural Machine Translation System
Date
2019-06
Authors
Publisher
Indian Statistical Institute, Kolkata
Abstract
Nowadays, Deep Neural Network based solutions are deployed to solve numerous tasks, so it
has become crucially important to study the robustness of these systems. Machine Translation
is one of the popular applications of Deep Neural Networks. This thesis studies the robustness of
Neural Machine Translation systems by generating adversarial examples with the objective of fooling
the model. Whenever there is a change in the source, i.e. when a word in the input sentence is
replaced by an unrelated word, the translation system is supposed to reflect that change in its
translation. Invariance to such changes learned by the model is undesirable. To exploit
this undesirable property of a Neural Machine Translation system, we design an attack
called the Invariance-based targeted attack. This attack introduces multiple changes (replacements of
words) to the original input sentence while keeping the translation unchanged. To facilitate
the explanation of the attack's design, we introduce two methods: (i) the Min-Grad method, which
identifies the position where replacing a word makes the least change in the translation,
and (ii) the Soft-Attn method, which searches for a new word to substitute, given a list of choices.
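The position-selection idea behind Min-Grad can be sketched as follows. This is a hypothetical illustration, not the thesis's implementation: it assumes the per-token gradients of the translation loss with respect to the input embeddings have already been computed (in practice by backpropagation through the NMT model), and simply picks the position whose gradient has the smallest L2 norm, i.e. where a word replacement is expected to perturb the translation the least.

```python
import math

def min_grad_position(grad_embeddings):
    """Hypothetical Min-Grad sketch: grad_embeddings is a list of gradient
    vectors, one per input token (gradient of the translation loss w.r.t.
    that token's embedding). Returns the index of the token whose gradient
    has the smallest L2 norm, plus all the norms."""
    norms = [math.sqrt(sum(g * g for g in row)) for row in grad_embeddings]
    return min(range(len(norms)), key=norms.__getitem__), norms

# Toy gradients for a 4-token sentence; position 2 has the smallest norm,
# so it is the candidate position for an invariance-preserving replacement.
grads = [[0.5, 0.5], [0.9, 0.1], [0.01, 0.02], [0.3, 0.4]]
pos, norms = min_grad_position(grads)
```

In a real attack the gradients would come from the model's autograd engine, and Soft-Attn (not sketched here) would then score candidate replacement words for the chosen position.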
The initial part of the report explains the preliminary explorations we carried out to gain
insight into the problem formulation. These experiments are run on LSTM based models
with a single-replacement policy. Using the lessons from the first part, we extend the experiments
to Transformer and BLSTM based models, which are considered state-of-the-art systems for
machine translation.
Description
Dissertation under the supervision of Dr. Utpal Garain
Keywords
Neural Machine Translation, Machine Translation System Dissertation, Neural Networks, LSTM models
Citation
40p.
