Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). It is necessary when joining data sets on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); a shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
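To make the idea concrete, here is a minimal sketch of linking two small data sets that share no common identifier. It uses only Python's standard-library `difflib` and is not the method of any project listed on this page. The function names (`normalize`, `similarity`, `link_records`), the sample records, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, threshold=0.85):
    """Pair each left record with its most similar right record.

    `left` and `right` are lists of (id, name) tuples. A pair is kept only
    when its score reaches `threshold` (an arbitrary, illustrative cutoff).
    """
    matches = []
    for left_id, left_name in left:
        best = max(right, key=lambda r: similarity(left_name, r[1]), default=None)
        if best is not None and similarity(left_name, best[1]) >= threshold:
            matches.append((left_id, best[0]))
    return matches


# Hypothetical sample data: two sources with different key schemes.
crm = [(1, "Acme Corp."), (2, "Globex Corporation"), (3, "Initech")]
billing = [("a", "ACME Corp"), ("b", "Globex Corporation, Inc."), ("c", "Umbrella Group")]

links = link_records(crm, billing)
print(links)  # [(1, 'a'), (2, 'b')]
```

This sketch compares every left record with every right record, which scales quadratically. Dedicated tools typically reduce the number of comparisons first (often called blocking) and score candidate pairs on several fields rather than a single name string.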
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
A powerful and modular toolkit for record linkage and duplicate detection in Python
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
One line of code for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
Insightful Tutorials and Papers about Knowledge Graphs
On-device Speech-to-Intent engine powered by deep learning
🆔 Command line tool for deduplicating CSV files
🆔 Examples for using the dedupe library
A list of free data matching and record linkage software.
Recent trends of Entity Linking, Disambiguation, and Representation.
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
An open source, high scalability toolkit in Java for Entity Resolution.
ReFinED is an efficient and accurate entity linking (EL) system.
🔎 Finds fuzzy matches between CSV files
Entity resolution for Elasticsearch.
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Resources for tackling record linkage / deduplication / data matching problems
OpenRefine reconciliation services for VIAF, ORCID, and Open Library, plus a framework for creating more.
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
Created by Halbert L. Dunn
Released 1946