
Subject identifiers for duke - fast deduplication engine
  • http://www.topicmapslab.de/projects/duke

duke - fast deduplication engine

Project category: Utilities and Components
Project status: Alpha


Duke is a fast and flexible deduplication engine (the task is also known as entity resolution or record linkage), written in Java on top of Lucene. As of 2011-04-07 it can process 1,000,000 records in 11 minutes on a standard laptop using a single thread.
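To make the term concrete, here is a minimal sketch of what deduplication means in general: compare records pairwise and flag pairs whose similarity exceeds a threshold. It is not Duke's code or API. The class name `DedupSketch`, the normalization step, the Levenshtein-based similarity, and the 0.85 threshold are all assumptions chosen for illustration, and the sketch does not use Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of record deduplication; not part of Duke.
class DedupSketch {

    // Classic dynamic-programming Levenshtein edit distance (two-row variant).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1.0 means identical strings.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(a, b) / longest;
    }

    // Assumed normalization: lower-case and trim surrounding whitespace.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).trim();
    }

    // Naive all-pairs comparison; returns index pairs judged to be duplicates.
    static List<int[]> findDuplicates(List<String> records, double threshold) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            for (int j = i + 1; j < records.size(); j++) {
                String a = normalize(records.get(i));
                String b = normalize(records.get(j));
                if (similarity(a, b) >= threshold) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("Acme Inc", "ACME Inc.", "Globex Corp");
        for (int[] p : findDuplicates(records, 0.85)) {
            System.out.println(records.get(p[0]) + " <-> " + records.get(p[1]));
        }
    }
}
```

The all-pairs loop is quadratic in the number of records, so it would not scale to the million-record figure quoted above. An engine of this kind needs some way to restrict which pairs get compared, for example a search index.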

Project Leader

Lars Marius Garshol

No contact information available. 


Lars Marius is project leader of TM Photo, Topic Maps Tools, and duke - fast deduplication...



