Publication year: [2018]
Paper information |
|
Paper title (Korean) |
[Vol.13, No.1] Improving Predictive Accuracy of User-based Collaborative Filtering Using Word2Vec |
|
Author |
Boo-Sik Kang |
|
Abstract |
Word2Vec has recently become one of the most popular methods in text mining. It converts words into vectors using the associations among words in sentences, so that similar words are located close together in the vector space. Improving the predictive accuracy of recommendation algorithms is a major line of work in recommender systems. User-based collaborative filtering recommends products using the product preferences of a user's neighbors. This study proposes a method that computes user similarity from user vectors produced by Word2Vec instead of by the traditional method. To use Word2Vec, the text is first split into sentences, and then the corpus, the set of meaningful words in each sentence, is extracted. To apply Word2Vec to a user-based movie recommender, we first find the users who have seen the same movie, then treat each user as a word and the user list of a movie as the corpus of one sentence. Sentences can be composed in several ways in a recommender system. This study considers two: the first constructs one sentence per movie, and the second can construct several sentences per movie. After sentence construction, the corpus of sentences is fed into Word2Vec to compute user vectors, user similarity is computed from those vectors with the correlation coefficient, and products are recommended by user-based collaborative filtering using that similarity. For validation, the proposed methods were applied to the FilmTrust dataset. The results of three repetitions of 10-fold cross-validation showed that, in terms of mean MAE, user-based collaborative filtering with Word2Vec (wvCF3.0) was considerably more accurate than the conventional collaborative filtering method (uCF). The sentence expansion method (wvCFthree), which constructs several sentences per movie, was in turn more accurate than the one-sentence method (wvCF3.0), which constructs one sentence per movie.
Paired t-tests between uCF and wvCF3.0, and between wvCF3.0 and wvCFthree, confirmed that the differences are statistically significant. |
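The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy ratings, the placeholder `USER_VECTORS`, all function names, and the mean-centered prediction formula are assumptions made for the example. In practice the user vectors would come from training a Word2Vec model on the sentences; here they are hard-coded so that the sketch runs with the standard library only. The abstract does not say how the several-sentences-per-movie variant splits a movie's users, so only the one-sentence construction is shown.

```python
from collections import defaultdict
from math import sqrt

# Toy (user, movie, rating) triples; values are made up for illustration.
RATINGS = [
    ("u1", "m1", 4.0), ("u2", "m1", 4.0), ("u3", "m1", 2.0),
    ("u1", "m2", 3.0), ("u2", "m2", 3.5), ("u3", "m2", 1.0),
    ("u2", "m3", 5.0), ("u3", "m3", 2.0),
]

# ASSUMPTION: placeholder vectors standing in for Word2Vec output.
USER_VECTORS = {
    "u1": [0.9, 0.1, 0.3],
    "u2": [0.8, 0.2, 0.35],
    "u3": [0.1, 0.9, 0.4],
}

def one_sentence_per_movie(ratings):
    """Treat each user as a 'word' and each movie's user list as one 'sentence'."""
    by_movie = defaultdict(list)
    for user, movie, _ in ratings:
        by_movie[movie].append(user)
    return list(by_movie.values())

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / sqrt(var_a * var_b)

def predict(target, movie, ratings, vectors):
    """ASSUMPTION: mean-centered weighted average over users who rated the movie."""
    by_user = defaultdict(dict)
    for user, m, rating in ratings:
        by_user[user][m] = rating
    mean = {u: sum(rs.values()) / len(rs) for u, rs in by_user.items()}
    num = den = 0.0
    for user, rs in by_user.items():
        if user == target or movie not in rs:
            continue
        sim = pearson(vectors[target], vectors[user])
        num += sim * (rs[movie] - mean[user])
        den += abs(sim)
    return mean[target] if den == 0 else mean[target] + num / den

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

sentences = one_sentence_per_movie(RATINGS)
estimate = predict("u1", "m3", RATINGS, USER_VECTORS)
```

With a library such as gensim, `sentences` could be passed to a Word2Vec model and each user's vector read back from the trained model in place of `USER_VECTORS`. The hyperparameters for that step are not given in the abstract.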
|