Jacob Kaplan-Moss

Activity tagged “via:jkocherhans”

Bookmarks

Geeking with Greg: Clever method of near duplicate detection

A slick algorithm to “fingerprint” text based on chains of words following stop words.

(algorithm, data, detection, duplicate, similarity, via:jkocherhans)

Repository - directory - public: enfold.solr/trunk/server

This is awesome: a complete Solr/Jetty setup. This is similar to what I've been using, but even nicer. Thanks, Joseph!

(java, jetty, script, solr, via:jkocherhans)

Xapian: Theoretical Background

Some good notes on how relevancy algorithms actually work. Must read through this in more detail.

(algorithms, indexing, readloater, relevency, search, text, via:jkocherhans)

jerith.za.net: Socket programming in Erlang

Sweet - I've been looking for a quick tutorial of this nature.

(erlang, programming, server, sockets, tutorial, via:jkocherhans)