A Language-Independent Approach to Keyphrase Extraction and Evaluation


Mari-Sanna Paukkeri, Ilari T. Nieminen, Matti Pöllä, and Timo Honkela. A language-independent approach to keyphrase extraction and evaluation. In Coling 2008: Companion volume: Posters, pages 83–86, Manchester, UK, August 2008. Coling 2008 Organizing Committee.


We present Likey, a language-independent keyphrase extraction method based on statistical analysis and the use of a reference corpus. Likey has a very lightweight preprocessing phase and no parameters to be tuned. Thus, it is not restricted to any single language or language family. We test Likey with exactly the same configuration on 11 European languages. Furthermore, we present an automatic evaluation method based on Wikipedia intra-linking.

Suggested BibTeX entry:

    @inproceedings{Paukkeri2008,
        address = {Manchester, UK},
        author = {{Mari-Sanna} Paukkeri and Ilari T. Nieminen and Matti P\"{o}ll\"{a} and Timo Honkela},
        booktitle = {Coling 2008: Companion volume: Posters},
        month = {August},
        pages = {83--86},
        publisher = {Coling 2008 Organizing Committee},
        title = {A Language-Independent Approach to Keyphrase Extraction and Evaluation},
        year = {2008},
    }

