Society Journal

Publication year: [2020]

 
Paper Information
Paper title (Korean) [Vol.15, No.6] A Design and Implementation of Paragraph-based Focused Web Crawler Using Semantic Priority of Link
Authors Nam-Oh Kang, Jae-Ho Kim
Abstract A search engine must stay consistent with the entire Web in order to retrieve information correctly and efficiently. However, because the Web is growing rapidly and its content changes dynamically, a search engine cannot achieve this goal with limited resources such as hardware, network bandwidth, and computing time. To address this problem, the focused web crawler was introduced: it identifies and visits the most promising links related to a specific topic and avoids downloading off-topic documents, making efficient use of limited resources. In this research, we propose a paragraph-based focused web crawler that uses the semantic priority of links. The proposed system selects promising links from a downloaded web page by measuring the similarity between the topic and the data associated with each link, namely its anchor text and the paragraph that contains it. Unlike existing methods, we propose a novel similarity function that uses WordNet to calculate link priority, and we introduce a method for visiting high-priority links first. We conducted experiments on several topics to evaluate the performance of the proposed paragraph-based focused web crawler. The experimental results show that the paragraph-based focused web crawler using the semantic priority of links improves the term frequency of the retrieved documents.
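The link-selection idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's WordNet-based similarity function, so this sketch substitutes a plain Jaccard token overlap as a stand-in. The names `link_priority` and `Frontier` and the `anchor_weight` parameter are hypothetical and do not come from the paper.

```python
import heapq
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_priority(topic, anchor_text, paragraph, anchor_weight=0.5):
    """Score a link by comparing the topic with its anchor text and paragraph.

    Assumption: Jaccard overlap stands in for the paper's WordNet-based
    similarity, and the weighted sum is an illustrative way to combine scores.
    """
    topic_tokens = tokenize(topic)
    anchor_score = jaccard(topic_tokens, tokenize(anchor_text))
    paragraph_score = jaccard(topic_tokens, tokenize(paragraph))
    return anchor_weight * anchor_score + (1 - anchor_weight) * paragraph_score


class Frontier:
    """Crawl frontier that returns the highest-priority unseen URL first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker that keeps insertion order stable

    def push(self, url, priority):
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, self._counter, url))
        self._counter += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

In a crawler loop, each link extracted from a downloaded page would be scored with `link_priority` and pushed onto the frontier, and the next page to download would be taken with `pop`. Replacing `jaccard` with a WordNet-based measure would bring the sketch closer to the approach the abstract describes.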
Attached paper
   15-6-15.pdf (352.6K) [4] DATE : 2021-01-01 18:14:09