Research and Code


Motto

“Data! Data! Data!” he cried impatiently.
“I can’t make bricks without clay.”
— Sherlock Holmes.

Pattern Mining

Below you will find descriptions of a few papers and their implementations.

PaNDa+

This is the result of our work on mining patterns, in particular dense “rectangles”, in noisy binary datasets. The implementation of the TKDE ’14 paper is available here.

Direct Local Pattern Sampling

This is the result of a collaboration with M. Boley, Sandy Moens, Daniel Paurat and Thomas Gärtner from the University of Bonn. website.

DCI-Closed

This is the result of our work on pattern mining algorithms for the discovery of closed frequent itemsets. It implements three of our papers (ICDM ’07, TKDE ’06 and SDM ’06) in one comprehensive software package specialized for dense datasets, and it supports both out-of-core and multi-core mining. download.
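
For readers unfamiliar with the term, a frequent itemset is closed when no proper superset has the same support. The sketch below illustrates only that definition by brute-force enumeration on a toy dataset; it is not the DCI-Closed algorithm, and the function names and data are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force enumeration, exponential in the number of items.

    Illustrates the definition only; NOT the DCI-Closed algorithm.
    """
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= min_support:
                frequent[candidate] = s
    # Keep an itemset only if no proper superset has the same support.
    return {
        x: s
        for x, s in frequent.items()
        if not any(x < y and s == sy for y, sy in frequent.items())
    }


# Toy dataset (hypothetical): four transactions over items 1, 2, 3.
transactions = [
    frozenset({1, 2, 3}),
    frozenset({1, 2}),
    frozenset({1, 2, 3}),
    frozenset({2, 3}),
]
closed = closed_frequent_itemsets(transactions, min_support=2)
for itemset, s in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), s)
```

Here {1} is frequent but not closed, because {1, 2} occurs in exactly the same three transactions.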

Find-Rules

This software generates association rules with a given minimum confidence from a collection of frequent itemsets. Unlike other tools, it does not require a downward-closed collection: it extracts all possible association rules without assuming that the supports of every subset of a frequent itemset are known. The input is the usual ASCII format, with each itemset followed by its support in parentheses, e.g. 1 2 3 (95). download.
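
The idea can be sketched in a few lines of Python. This is a minimal illustration, not the Find-Rules implementation: it assumes a rule X → Y can be scored whenever both X and X ∪ Y appear in the input, using confidence = supp(X ∪ Y) / supp(X). The names `parse_itemsets` and `find_rules` and the sample data are hypothetical.

```python
import re

# One itemset per line: space-separated integer items, then "(support)".
LINE = re.compile(r"^\s*([\d\s]+?)\s*\((\d+)\)\s*$")


def parse_itemsets(lines):
    """Map each itemset (as a frozenset of ints) to its support."""
    supports = {}
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # skip blank or malformed lines
        items = frozenset(int(tok) for tok in match.group(1).split())
        supports[items] = int(match.group(2))
    return supports


def find_rules(supports, min_conf):
    """Return (antecedent, consequent, confidence) triples.

    Only antecedents that are themselves listed in `supports` are used,
    so the collection does not have to be downward closed.
    """
    rules = []
    for whole, supp_whole in supports.items():
        for antecedent, supp_ante in supports.items():
            if antecedent < whole and supp_ante > 0:
                conf = supp_whole / supp_ante
                if conf >= min_conf:
                    rules.append((antecedent, whole - antecedent, conf))
    return rules


# Hypothetical input: itemsets {2}, {3} and {1, 3} are deliberately missing.
sample = ["1 (100)", "1 2 (96)", "2 3 (97)", "1 2 3 (95)"]
supports = parse_itemsets(sample)
rules = find_rules(supports, min_conf=0.97)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(conf, 3))
```

The pairwise scan is quadratic in the number of itemsets, which is fine for a sketch but would need an index on antecedents for large collections.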

Logistic PCA

This is a Python port of Andrew I. Schein’s MATLAB code from the paper “A Generalized Linear Model for Principal Component Analysis of Binary Data”. download.