Data Mining: Theory and Applications

The Data Mining: Theory and Applications group at Aalto University conducts research on finding local patterns and global models in discrete high-dimensional data. Techniques for this task include both algorithmics in the traditional computer science sense and probabilistic methods.

We develop new concepts, algorithms, principles and frameworks for algorithmic data analysis. We believe that developing new concepts and algorithms is, at its best, an iterative process: interacting extensively with application experts, formulating computational concepts, analyzing the properties of those concepts, designing algorithms and analyzing their performance, implementing and experimenting with the algorithms, and applying the results in practice.

The group is currently not accepting new members.

The Data Mining research group is located at the Department of Information and Computer Science in the School of Science of Aalto University. We are members of the Helsinki Institute for Information Technology HIIT, the Finnish Centre of Excellence for Algorithmic Data Analysis Research (Algodan), and PASCAL2 (Pattern Analysis, Statistical Modelling and Computational Learning).

Contact information and how to get here


Selected publications in 2011 grouped by topic

Time series and biological sequences

Panagiotis Papapetrou, Vassilis Athitsos, Michalis Potamias, George Kollios, and Dimitrios Gunopulos. Embedding-based Subsequence Matching in Time Series Databases. In ACM TODS, 36(3): 17, 2011.

Aleksi Kallio, Niko Vuokko, Markus Ojala, Niina Haiminen, and Heikki Mannila. Randomization Techniques for Assessing the Significance of Gene Periodicity Results. BMC Bioinformatics, 12:330, 2011.

Panagiotis Papapetrou, Gary Benson, and George Kollios. Mining Poly-regions in DNA Sequences. International Journal of Data Mining and Bioinformatics, 2011 (to appear).

Orestis Kostakis, Panagiotis Papapetrou, and Jaakko Hollmén. Artemis: Assessing the Similarity of Event-Interval Sequences. In Proc ECML PKDD, 229-244, 2011.

Alexios Kotsifakos, Panagiotis Papapetrou, Jaakko Hollmén, and Dimitrios Gunopulos. A Subsequence Matching with Gaps-Range-Tolerances Framework: A Query-By-Humming Application. In Proc PVLDB, 4(11): 761-771, 2011.

Niko Vuokko and Petteri Kaski. Significance of Patterns in Time Series Collections. In Proc 11th SDM, 676-686, 2011.

Linguistics and word sequences

Terttu Nevalainen, Helena Raumolin-Brunberg, and Heikki Mannila. The diffusion of language change in real time: Progressive and conservative individuals and the time depth of change. Language Variation and Change, 23: 1-43, 2011.

Jefrey Lijffijt, Panagiotis Papapetrou, Kai Puolamäki, and Heikki Mannila. Analyzing word frequencies in large text corpora using inter-arrival times and bootstrapping. In Proc ECML PKDD, 2011.

Computational ecology

Aleksi Kallio, Kai Puolamäki, Mikael Fortelius, and Heikki Mannila. Correlations and co-occurrences of taxa: the role of temporal, geographic and taxonomic restrictions. Palaeontologia Electronica, 14(1), 2011.

Jussi T. Eronen, Kai Puolamäki, Hannes Heikinheimo, Heikki Lokki, Ari Venäläinen, Heikki Mannila, and Mikael Fortelius. The effect of scale, climate and environment on species richness and spatial distribution of Finnish birds. Annales Zoologici Fennici, 48(5):257-274, 2011.

Concepts and theory of data mining

Gemma Garriga, Esa Junttila, and Heikki Mannila. Banded structure in binary matrices. Knowledge and Information Systems, 28(1): 197-226, 2011.

Esa Junttila and Petteri Kaski. Segmented nestedness in binary data. In Proc 11th SDM, 235-246, 2011.

Sami Hanhijärvi. Multiple Hypothesis Testing in Pattern Discovery. In Proc DS, LNCS 6926/2011, 122-134, 2011.

User modeling and social networks

Antti Ajanki, Mark Billinghurst, Hannes Gamper, Toni Järvenpää, Melih Kandemir, Samuel Kaski, Markus Koskela, Mikko Kurimo, Jorma Laaksonen, Kai Puolamäki, Teemu Ruokolainen, and Timo Tossavainen. An augmented reality interface to contextual information. The International Journal of Virtual Reality, 15:161-173, 2011.

Panagiotis Papapetrou, Aristides Gionis, and Heikki Mannila. A Shapley-value Approach for Influence Attribution. In Proc ECML PKDD, 549-564, 2011.

...plus many peer-reviewed workshop papers and other publications.

Publications by past and current group members from 2008 onward, excluding the current year, are available in the Aalto Publication Database.


Former personnel: Ella Bingham, Sirkka Eloranta, Dumitru Erhan, Gemma Garriga, Robert Gwadera, Sami Hanhijärvi, Hannes Heikinheimo, Heli Hiisilä, Johan Himberg, Saara Hyvönen, Jaripekka Juhala, Esa Junttila, Mikko Katajamaa, Olli-Pekka Koistinen, Mikko Koivisto, Kalle Korpiaho, Aino Lahdenperä, Teemu Murtola, Markus Ojala, Anne Patrikainen, Panagiotis Papapetrou, Ville Pettersson, Antti Rasinen, Salla Ruosaari, Antti Savolainen, Jouni K. Seppänen, Nikolaj Tatti, Johanna Tikanmäki, Antti Ukkonen, Niko Vuokko.