RainProof: An Umbrella to Shield Text Generators from Out-Of-Distribution Data

As more and more conversational and translation systems are deployed in production, it is essential to develop and implement effective control mechanisms that ensure their proper functioning and security. A key component of safe system behavior is out-of-distribution (OOD) detection, which aims to detect whether an input sample is statistically far from the training distribution. While OOD detection is widely studied for classification tasks, it has received much less attention in text generation. This paper addresses the problem of OOD detection for machine translation and dialog generation from an operational perspective. Our contributions include (i) RAINPROOF, a Relative informAItioN Projection Out OF distribution detection framework, and (ii) a more operational evaluation setting for OOD detection. Surprisingly, we find that OOD detection is not necessarily aligned with task-specific measures: an OOD detector may filter out samples that the model processes well and keep samples that it does not, leading to weaker performance. Our results show that RAINPROOF breaks this curse and achieves good results in OOD detection while increasing system performance.