Fast Feature Selection for Learning to Rank

Andrea Gigli, University of Pisa, Italy
Claudio Lucchese, HPC Lab., ISTI-CNR, Pisa, Italy
Franco Maria Nardini, HPC Lab., ISTI-CNR, Pisa, Italy
Raffaele Perego, HPC Lab., ISTI-CNR, Pisa, Italy

July 02 2016

Accepted at ICTIR ’16: International Conference on the Theory of Information Retrieval [1].

Abstract. An emerging research area named Learning-to-Rank (LtR) has shown that effective solutions to the ranking problem can leverage machine learning techniques applied to a large set of features capturing the relevance of a candidate document for the user query. Large-scale search systems must, however, answer user queries very quickly, and the computation of the features for candidate documents must comply with strict back-end latency constraints. The number of features therefore cannot grow beyond a given limit, and Feature Selection (FS) techniques have to be exploited to find a subset of features that both meets latency requirements and leads to high effectiveness of the trained models.

In this paper, we propose three new algorithms for FS specifically designed for the LtR context, where hundreds of continuous or categorical features can be involved. We present a comprehensive experimental analysis conducted on publicly available LtR datasets, and we show that the proposed strategies outperform a well-known state-of-the-art competitor.

References

[1]   Andrea Gigli, Claudio Lucchese, Franco Maria Nardini, and Raffaele Perego. Fast feature selection for learning to rank. In ICTIR ’16: International Conference on the Theory of Information Retrieval, 2016.
