This repository contains a Python notebook that automates matching district names between two datasets: people_of_india_clean_2014.csv and minority_conc_census_2011.csv. It splits each district name into n-grams, scores pairs of names by the Jaccard similarity of their n-gram sets, and identifies the most similar pairs.
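The matching idea can be sketched as follows. This is a minimal illustration, not the notebook's actual code. It assumes character bigrams (n = 2) and case-insensitive comparison with whitespace removed, neither of which is stated here. The function names `ngrams`, `jaccard`, and `best_match` are hypothetical, and the district names in the usage line are made-up inputs rather than rows from either CSV.

```python
def ngrams(name, n=2):
    """Return the set of character n-grams of a name (assumed: lowercased, whitespace removed)."""
    text = "".join(name.lower().split())
    if len(text) < n:
        # Names shorter than n yield a single token (or nothing if empty).
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(name, candidates, n=2):
    """Return (candidate, score) for the candidate most similar to `name`."""
    if not candidates:
        return None, 0.0
    target = ngrams(name, n)
    scored = [(jaccard(target, ngrams(c, n)), c) for c in candidates]
    score, match = max(scored)
    return match, score


# Illustrative usage with made-up spellings of district names.
match, score = best_match("Ahmadabad", ["Ahmedabad", "Allahabad"])
```

In practice a minimum score threshold would likely be needed so that districts with no real counterpart in the other dataset are left unmatched rather than paired with a poor candidate; the threshold value is a tuning choice not specified here.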
bishmaybarik/ngram-code