Journal

Publication year: 2023

 
Paper information
Title: [Vol.18, No.2] Natural Language Processing-based Korean Summary System Considering Linguistic Features
Authors: Jongwon Lee, Sungjun Park, Hanjung Kim, Hoekyung Jung
Abstract: As large-scale data is distributed through the Internet, it has become difficult for Internet users to find the data they need. When the data is text, it also needs to be compressed and summarized. In this paper, we construct an efficient system for compressing and summarizing Korean text using KoBART (Korean Bidirectional and Auto-Regressive Transformers), a Transformer encoder-decoder model. The system consists of a preprocessor that performs extractive summarization and a KoBART model that performs abstractive (generative) summarization. Taking the linguistic characteristics of Korean into account, the preprocessor extracts a sentence as the summary when a specific phrase appears in it, and the KoBART model generates a summary for texts the preprocessor does not handle. By combining a preprocessor that considers the linguistic features of Korean with KoBART, a pre-trained language model, the proposed system showed superior performance compared to a general extractive summarization model and a general abstractive summarization model. This suggests that analyzing and exploiting the characteristics of a specific language in more detail can yield better results than relying only on a high-performing pre-trained language model. It is expected that this paper will serve as a leading study on combining linguistic features with high-performing pre-trained language models.
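The two-stage routing described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cue phrases in `CUE_PHRASES`, the naive sentence splitter, and the function names are all hypothetical, and the abstractive model is passed in as a plain callable so that any KoBART wrapper (or a stub) can be plugged in.

```python
import re
from typing import Callable, List, Optional, Sequence

# Hypothetical cue phrases. The abstract does not list the phrases
# the actual preprocessor looks for; these are placeholders.
CUE_PHRASES: Sequence[str] = ("결론적으로", "요약하면")


def split_sentences(text: str) -> List[str]:
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def extractive_summary(
    text: str, cues: Sequence[str] = CUE_PHRASES
) -> Optional[str]:
    """Return the first sentence containing a cue phrase, or None."""
    for sentence in split_sentences(text):
        if any(cue in sentence for cue in cues):
            return sentence
    return None


def summarize(
    text: str,
    abstractive: Callable[[str], str],
    cues: Sequence[str] = CUE_PHRASES,
) -> str:
    """Use the extracted sentence if a cue phrase is found;
    otherwise fall back to the abstractive model."""
    extracted = extractive_summary(text, cues)
    if extracted is not None:
        return extracted
    return abstractive(text)
```

In this sketch, `abstractive` stands in for a call into a fine-tuned KoBART summarizer. Keeping it as a parameter means the routing logic can be exercised without loading any model.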
Attached paper
   18-2-12.pdf (537.3K) [15] DATE : 2023-05-04 16:17:14