...

import zgli
A clustering technique.

Compress and cluster your files using python!

Complearn

Zgli is inspired by the complearn tool available here. Show them some support if you find our tool useful!

Designed to be simple

We created zgli to make clustering by compression simple to use and easy to integrate into Python machine learning pipelines.

Many features

We've implemented 4 compression methods and a feature encoder, and we intend to implement the quartet method in the foreseeable future.

Github and Related Work

This code was only made possible by the awesome work that came before it.

Take a look at the source code
and the related work!

Github

All our Python source code is available on our Github.

Complearn

Take a look at the original command line tool that inspired our work here.

Clustering by Compression

Check out the work of Paul Vitányi et al. here for a closer look at the theory behind this method.

...

How does zgli work?

Take a look at the simple workflow

1

We start by collecting the files we want to cluster and placing them inside a folder.

Input

2

We take the files from the folder as input, compress them, and calculate the normalized compression distance between each pair of files.

Compress

3

After computing all the distances, the zgli library outputs a distance matrix.

Output
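
To make the three steps above concrete, here is a minimal sketch of the same idea using only Python's standard library. It is not zgli's own API: the function names (`compressed_size`, `ncd`, `distance_matrix`) are hypothetical, and `zlib` is assumed here as a stand-in compressor. The distance follows the usual definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size in bytes.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of `data` after zlib compression."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(folder: str):
    """Read every file in `folder` and return (file names, NCD matrix).

    The matrix is a list of rows; entry [i][j] is the NCD between
    file i and file j, with files taken in sorted name order.
    """
    names = sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )
    contents = []
    for name in names:
        with open(os.path.join(folder, name), "rb") as handle:
            contents.append(handle.read())
    # Real compressors are imperfect, so the diagonal is small but not exactly 0.
    matrix = [[ncd(a, b) for b in contents] for a in contents]
    return names, matrix
```

Calling `distance_matrix("my_folder")` on a folder of files (the folder name is a placeholder) returns the file names and a square matrix that can be passed to any clustering routine that accepts precomputed distances.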