Jacob Kaplan-Moss

Activity tagged “detection”

Bookmarks

Geeking with Greg: Clever method of near duplicate detection

A slick algorithm to “fingerprint” text based on chains of words following stop words.

(algorithm, data, detection, duplicate, similarity, via:jkocherhans)
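
A rough sketch of the idea, not the linked post's actual implementation: every stop word anchors a short chain of the non-stop words that follow it, and the set of those chains serves as the text's fingerprint. The stop word list, the chain length of 2, and the use of Jaccard similarity to compare fingerprints are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stop word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "was", "of", "to", "and", "in", "that"}


def fingerprint(text, chain_length=2):
    """Return a set of (stop_word, next_word, ...) signatures for the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    signatures = set()
    for i, word in enumerate(words):
        if word not in STOP_WORDS:
            continue
        # Collect the next `chain_length` non-stop words after this stop word.
        chain = []
        for following in words[i + 1:]:
            if following not in STOP_WORDS:
                chain.append(following)
                if len(chain) == chain_length:
                    break
        # Keep only complete chains; partial ones near the end are dropped.
        if len(chain) == chain_length:
            signatures.add((word, *chain))
    return signatures


def jaccard(a, b):
    """Jaccard similarity of two signature sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Because signatures hang off stop words, which are common in natural prose but rare in navigation links and ad blurbs, two pages carrying the same article with different surrounding chrome tend to produce largely overlapping fingerprints. For example, jaccard(fingerprint(page_a), fingerprint(page_b)) close to 1.0 would suggest a near duplicate.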