Besides developing new ways to search through large amounts of text, as described in our previous blog posts on the introduction of 'Active Learning' in our project search tool and on Semantic Searching, we are also constantly trying to improve our search algorithms. The current version of our project search is itself a result of this continuous improvement. But how exactly do we improve these algorithms, and how do we measure that improvement? The short answer: we incorporate state-of-the-art technologies and evaluate the algorithms both quantitatively and qualitatively.
For our search algorithms we use a number of techniques from natural language processing (NLP). NLP is a rapidly evolving research area that has seen major advances over the last decade. Moreover, there are many open source projects that implement these new techniques and make them easily available; Sentence Transformers and ASReview are two examples. We either incorporate them directly into our algorithms or use them as inspiration.
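To give a flavour of what this looks like in practice, here is a minimal sketch of semantic search built on the Sentence Transformers library. The model name and the example texts are illustrative placeholders, not our production setup.

```python
# Minimal sketch: semantic search with the sentence-transformers library.
# Model name and example texts are illustrative, not our production setup.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Active learning for screening abstracts in systematic reviews.",
    "Funding opportunities for renewable energy research.",
    "Semantic search over large collections of project descriptions.",
]
query = "machine learning methods for literature screening"

# Encode the query and the documents into dense vectors.
doc_embeddings = model.encode(documents, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank the documents by cosine similarity to the query.
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
for score, doc in sorted(zip(scores.tolist(), documents), reverse=True):
    print(f"{score:.3f}  {doc}")
```

The idea is simply that texts with similar meaning end up close together in the embedding space, so ranking by cosine similarity surfaces semantically related documents even when they share few exact words with the query.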
Evaluating a search algorithm quantitatively is not always easy. Ideally we would have a large set of search queries from researchers, together with the research projects, funding opportunities, and papers they would have liked to find. We could then measure accurately how each part of the algorithm affects the end result. In the absence of such a set, we measure the performance of our search algorithm on a number of substitute problems for which we can find data. For example, given a large set of articles from arXiv, we can measure how well our algorithm retrieves a full text when given its abstract as the search query. Or, given a labeled dataset from a systematic review, we can take one included abstract as the search query and check whether we can find all the other included papers (see the sketch below). The hope is that if the search algorithm performs well on these substitute problems, it will also perform well on the problems we are really interested in.
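As an illustration of the systematic-review substitute problem, here is a simplified sketch of how such a recall-at-k measurement could look. The function name, model choice, and data layout are assumptions for this example, not our actual evaluation pipeline.

```python
# Simplified sketch of one substitute evaluation: use one included abstract from a
# labeled systematic-review dataset as the query and check how many of the other
# included papers appear in the top-k results. Names and layout are assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def recall_at_k(query_abstract, included_abstracts, all_abstracts, k=10):
    """Fraction of the remaining included papers found in the top-k results."""
    corpus = [a for a in all_abstracts if a != query_abstract]
    targets = set(included_abstracts) - {query_abstract}

    corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
    query_embedding = model.encode(query_abstract, convert_to_tensor=True)

    # Rank the corpus by similarity to the query and take the top k.
    scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
    top_indices = scores.argsort(descending=True)[:k].tolist()
    top_k = [corpus[i] for i in top_indices]

    found = sum(1 for abstract in top_k if abstract in targets)
    return found / len(targets) if targets else 0.0
```

Averaging such a score over many queries and datasets gives us a number we can compare before and after a change to the algorithm.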
We also evaluate our search algorithms in a qualitative way. We try a number of test cases ourselves and check whether the results make sense. Do we get relevant texts back for a search query? Are there results we expected to get back, but that weren't there? Does the algorithm give better results when it gets user input? Of course, these questions are best answered by the actual researchers rather than by us: what makes a relevant result depends heavily on the research area and interests of the person searching. That's why we are always looking for feedback.
Have you used Impacter's search functions and were you happy, disappointed, or surprised? Let us know!