Named entity recognition (NER) aims to recognize mentions of rigid designators from text belonging to predefined semantic types, such as person, location, and organization. In this article, we focus on a fundamental subtask of NER, named entity boundary detection, which aims at detecting the start and end boundaries of an entity mention in the text, without predicting its semantic type. The entity boundary detection is essentially a sequence labeling problem. Existing sequence labeling methods either suffer from sparse boundary tags (i.e., entities are rare and nonentities are common) or they cannot well handle the issue of variable size output vocabulary (i.e., need to retrain models with respect to different vocabularies). To address these two issues, we propose a novel entity boundary labeling model that leverages pointer networks to effectively infer boundaries depending on the input sequence. On the other hand, training models on source domains that generalize to new target domains at the test time are a challenging problem because of the performance degradation. To alleviate this issue, we propose METABDRY, a novel domain generalization approach for entity boundary detection without requiring any access to target domain information. Especially, adversarial learning is adopted to encourage domain-invariant representations. Meanwhile, metalearning is used to explicitly simulate a domain shift during training so that metaknowledge from multiple resource domains can be effectively aggregated. As such, METABDRY explicitly optimizes the capability of ``learning to generalize,'' resulting in a more general and robust model to reduce the domain discrepancy. We first conduct experiments to demonstrate the effectiveness of our novel boundary labeling model. We then extensively evaluate METABDRY on eight data sets under domain generalization settings. The experimental results show that METABDRY achieves state-of-the-art results against the recent seven baselines.
@article{li21domaingen,
author = {Jing Li and Shuo Shang and Lisi Chen},
title = {Domain Generalization for Named Entity Boundary Detection via Metalearning},
journal = {IEEE Transactions on Neural Networks and Learning Systems (TNNLS)},
volume = {32},
number = {9},
pages = {3819--3830},
year = {2021},
url = {https://doi.org/10.1109/TNNLS.2020.3015912},
doi = {10.1109/TNNLS.2020.3015912},
}