We recently released the first version of Cloudster, a distributed implementation of k-means clustering that runs in the cloud on Windows Azure. This release includes a fully working environment and a set of samples showing how the API works, e.g. to cluster vectors, images, or DNA.

Website of the Cloudster project

For this release, I developed a text sample that clusters documents, using the vector space model of Salton et al. to represent them. Such clustering can be used, for instance, in query expansion for search engines; the Clusty search engine currently follows this approach.
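
To give a rough idea of what clustering documents with the vector space model involves, here is a minimal single-machine sketch in Python. It is not the Cloudster API and is not distributed. The function names (`tfidf_vectors`, `cosine`, `kmeans`), the TF-IDF weighting, the whitespace tokenizer, and the "first k documents" initialization are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def tfidf_vectors(docs):
    """Represent each document as a sparse {term: tf-idf weight} vector."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def kmeans(vectors, k, iterations=10):
    """k-means with cosine similarity; returns a cluster index per vector."""
    # Deterministic initialization from the first k vectors (illustrative choice).
    centroids = [dict(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its most similar centroid.
        assignment = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            total = defaultdict(float)
            for v in members:
                for t, w in v.items():
                    total[t] += w
            centroids[c] = {t: w / len(members) for t, w in total.items()}
    return assignment

docs = [
    "azure cloud storage queue",
    "dna gene sequence protein",
    "cloud azure worker queue",
    "gene protein dna mutation",
]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)  # [0, 1, 0, 1]
```

On this toy corpus the two cloud-related documents end up in one cluster and the two biology-related ones in the other. A distributed version would split the assignment step across workers and merge the partial sums to update the centroids, but that part is left out here.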

Warm thanks to our professor Joannes Vermorel for gathering the team and launching us on this very cool project!

