RESEARCH OF OPEN DATA SETS OF WEB RESOURCES IN THE CONTEXT OF THEIR APPLICATION FOR TESTING RECOMMENDATION SYSTEMS
DOI:
https://doi.org/10.26906/SUNZ.2019.4.110
Keywords:
recommendation systems, testing, data analysis, open data sets, digital marketing
Abstract
The subject matter of the article is the process of testing methods for building recommender systems on open data sets from the Internet. The goal is to study open data sets of web resources in the context of using them to test various methods for building recommender systems. The tasks to be solved are: to explore modern web platforms that host open data sets and to examine whether their data can be used to test the quality of various recommender systems. The following results were obtained. The most popular web platforms with open sets of various network data were considered. These platforms were compared in terms of free access to data downloads, functionality and geographical location, data format and convenience for subsequent use in machine learning, and suitability for testing recommender systems. The relevance of the data stored in freely accessible repositories and its availability over time were also assessed. Conclusions. Web platforms containing open data sets that can be used to test recommender systems were explored. The main advantages of most platforms are support for modern data formats and free or conditionally free access. Among the shortcomings of the considered platforms is the lack of structure in some data sets, in particular text data, which significantly limits their use for testing content-based filtering methods. Another factor that limits the use of open data sets is their relevance, since some of the sets stored on the platforms are outdated and no longer updated. All the considered data sets can be used for research purposes and for testing recommender systems.