Oanh Thi Tran, Bach Xuan Ngo, Professor Dr. Minh Le Nguyen, and Professor Dr. Akira Shimazu, all of the Japan Advanced Institute of Science and Technology, have published Automated reference resolution in legal texts, forthcoming in Artificial Intelligence and Law.
Here is the abstract:
This paper investigates the task of reference resolution in the legal domain. This is a new and interesting task in Legal Engineering research. The goal is to create a system which can automatically detect references and then extract their referents. Previous work limits itself to detecting and resolving references to document targets. In this paper, we go a step further in trying to resolve references to sub-document targets. The referents extracted are the smallest fragments of text in documents, rather than the entire documents that contain the referenced texts. Based on an analysis of the characteristics of reference phenomena in legal texts, we propose a four-step framework to deal with the task: mention detection, contextual information extraction, antecedent candidate extraction, and antecedent determination. We also show how machine learning methods can be exploited in each step. The final system achieves an F1 score of 80.06% for detecting references, 85.61% accuracy for resolving them, and an F1 score of 67.02% in the end-to-end setting on the Japanese National Pension Law corpus.
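To make the four steps named in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' system: the abstract says machine learning methods are used in each step, whereas this sketch swaps in simple regular expressions and hand-written rules. The sample document, the English reference patterns, and every function name are assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Toy document (invented for illustration): keys are (article, paragraph)
# positions, values are the text fragments at those positions.
DOCUMENT = {
    (1, 1): "This Act establishes the pension scheme.",
    (1, 2): "Benefits are paid monthly.",
    (2, 1): "The benefits in Article 1 paragraph 2 are revised yearly.",
    (2, 2): "The provisions of the preceding paragraph do not apply to minors.",
}


@dataclass
class Mention:
    text: str        # surface form of the reference
    position: tuple  # (article, paragraph) where the reference occurs


# Step 1: mention detection. A regex stands in for a learned detector.
MENTION_RE = re.compile(r"Article \d+ paragraph \d+|the preceding paragraph")


def detect_mentions(document):
    return [
        Mention(match.group(0), position)
        for position, text in sorted(document.items())
        for match in MENTION_RE.finditer(text)
    ]


# Step 2: contextual information extraction. Here the only context kept
# is the location of the mention itself.
def extract_context(mention):
    article, paragraph = mention.position
    return {"article": article, "paragraph": paragraph}


# Step 3: antecedent candidate extraction. Every fragment except the one
# containing the mention is treated as a candidate.
def extract_candidates(document, mention):
    return [pos for pos in sorted(document) if pos != mention.position]


# Step 4: antecedent determination. Hand-written rules stand in for a
# learned ranker: explicit references name their target, relative ones
# are resolved against the mention's own position.
def determine_antecedent(mention, context, candidates):
    explicit = re.fullmatch(r"Article (\d+) paragraph (\d+)", mention.text)
    if explicit:
        target = (int(explicit.group(1)), int(explicit.group(2)))
    else:
        target = (context["article"], context["paragraph"] - 1)
    return target if target in candidates else None


def resolve_references(document):
    results = []
    for mention in detect_mentions(document):
        context = extract_context(mention)
        candidates = extract_candidates(document, mention)
        antecedent = determine_antecedent(mention, context, candidates)
        results.append((mention.text, mention.position, antecedent))
    return results


if __name__ == "__main__":
    for text, source, target in resolve_references(DOCUMENT):
        print(f"{text!r} at {source} -> {target}")
```

The point of the sketch is the sub-document granularity the abstract emphasizes: each reference resolves to a single (article, paragraph) fragment rather than to a whole document.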
Filed under: Applications, Articles and papers, Research findings, Technology developments
Tagged: Akira Shimazu, Artificial intelligence and law, Bach Xuan Ngo, Legal citation detection, Legal citation extraction, Legal citation resolution, Legal citations, Legal machine learning, Legal natural language processing, Legal text processing, Machine learning and law, Machine learning in legal text processing, Minh Le Nguyen, Natural language processing and law, Oanh Thi Tran
via Legal Informatics Blog http://legalinformatics.wordpress.com/2013/12/01/tran-et-al-automated-reference-resolution-in-legal-texts/