Title: Estimation and Visualization of Digital Library Content Similarities
Publication Type: Conference Papers
Year of Publication: 2015
Authors: Reitsma, R, Hsieh, P-H, Robson, R
Conference Name: Intern. Conf. on Inf. Systems (ICIS) 2015
Date Published: 2015
Keywords: BIS, Supply Chain

We report on a process for similarity estimation and two-dimensional mapping of lesson materials stored in a Web-based K-12 Science, Technology, Engineering and Mathematics (STEM) digital library. The process starts with automated removal of all information that should not be included in the similarity estimations, followed by automated indexing. Similarity estimation itself is conducted through a natural language processing algorithm that relies heavily on bigrams. The resulting similarities are then used to compute a Sammon map; i.e., a projection in n dimensions, the item-to-item distances of which best reflect the input similarities. In this paper we concentrate on specification and validation of this process. The similarity results show almost 100% precision-by-rank in the top three to five ranks. Sammon mapping in two dimensions corresponds well with the digital library's table of contents.
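
The two computational steps named in the abstract, bigram-based similarity and Sammon mapping, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, so cosine similarity over bigram counts is an assumption, as is the conversion of similarity to distance as 1 - similarity. The function names and the sample texts are hypothetical. The last function only evaluates Sammon's stress for a given layout; a full Sammon mapping would iteratively adjust the coordinates to minimize that value.

```python
import math
from collections import Counter
from itertools import combinations


def bigrams(text):
    """Return a Counter of adjacent lowercase word pairs in text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))


def cosine_similarity(a, b):
    """Cosine similarity of two bigram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sammon_stress(target, points):
    """Sammon's stress of a layout.

    target[i][j] is the desired distance between items i and j
    (assumed strictly positive for i != j); points holds the
    low-dimensional coordinates of each item.
    """
    total = 0.0
    error = 0.0
    for i, j in combinations(range(len(points)), 2):
        d_star = target[i][j]
        d = math.dist(points[i], points[j])
        total += d_star
        error += (d_star - d) ** 2 / d_star
    return error / total


if __name__ == "__main__":
    # Hypothetical lesson snippets, for illustration only.
    docs = [
        "build a simple electric circuit with a battery",
        "build a simple electric motor with a battery",
        "measure the density of water and oil",
    ]
    grams = [bigrams(d) for d in docs]
    # Assumed conversion: distance = 1 - similarity.
    dist = [[1.0 - cosine_similarity(a, b) for b in grams] for a in grams]
    # An arbitrary two-dimensional layout whose stress we evaluate.
    layout = [(0.0, 0.0), (0.5, 0.0), (1.0, 1.0)]
    print(sammon_stress(dist, layout))
```

A lower stress value means the two-dimensional distances track the input distances more closely; a value of zero means they match exactly.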
