We recently released the first version of Cloudster, a distributed implementation of k-means clustering that runs in the cloud on Windows Azure. This release features a full working environment and a bunch of samples showing how the API works, for example to cluster vectors, images, or DNA sequences.

Website of the Cloudster project

In this release, I developed a text sample that clusters documents, using the vector space model of Salton et al. to represent each document as a vector of term weights. Such clustering can be used, for instance, in query expansion for search engines. At present, the Clusty search engine follows this approach.
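To give an idea of what the vector space model looks like in practice, here is a minimal Python sketch. It is not taken from the Cloudster code base: the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`) are hypothetical, and the TF-IDF weighting is an assumption, since it is only one common way of weighting terms in this model. Each document becomes a sparse vector of term weights, and two documents are compared by the cosine of the angle between their vectors.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document.

    Assumption: plain term count times log(N / document frequency).
    Other weighting schemes exist; this is only an illustration.
    """
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is zero)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "The cat chased the mouse.",
        "Stock markets fell sharply today.",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine_similarity(vecs[0], vecs[1]))
    print(cosine_similarity(vecs[0], vecs[2]))
```

Once documents are vectors, a clustering algorithm such as k-means can group them like any other set of points, typically with a distance derived from this similarity.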

Warm thanks to our professor Joannes Vermorel for gathering the team and launching us on this very cool project!

