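from fuzzywuzzy import fuzz
from fuzzywuzzy import process

# SIMPLE RATIO
# Similarity of the two whole strings, scaled to 0-100.
fuzz.ratio("this is a test", "this is a test!")
# 96

# PARTIAL RATIO
# Scores the best-matching substring, so the trailing "!" costs nothing.
fuzz.partial_ratio("this is a test", "this is a test!")
# 100

# TOKEN SORT RATIO
# Plain ratio is sensitive to word order...
fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
# 90
# ...while token_sort_ratio sorts the words before comparing.
fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
# 100

# TOKEN SET RATIO
# Sorting alone still penalizes the repeated word...
fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
# 84
# ...while token_set_ratio compares the sets of distinct words.
fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
# 100

# PROCESS
# Rank a list of choices against a query; results are (choice, score) pairs.
choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
process.extract("new york jets", choices, limit=2)
# [('New York Jets', 100), ('New York Giants', 78)]
# extractOne returns only the best match.
process.extractOne("cowboys", choices)
# ('Dallas Cowboys', 90)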
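
To see why sorting the tokens turns a 90 into a 100, here is a rough standard-library sketch of the idea behind the simple and token sort ratios. It is not FuzzyWuzzy's own code: the helper names `simple_ratio` and `token_sort_ratio_sketch` are made up for illustration, `difflib.SequenceMatcher` is assumed as the underlying scorer, and the score is assumed to be truncated to an integer. The real library also cleans up punctuation before comparing, which this sketch skips.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """Hypothetical helper: similarity of two whole strings on a 0-100 scale."""
    # SequenceMatcher.ratio() returns 2 * matches / (len(a) + len(b)).
    return int(100 * SequenceMatcher(None, a, b).ratio())


def token_sort_ratio_sketch(a: str, b: str) -> int:
    """Hypothetical helper: compare the strings after sorting their words."""
    # Lowercase, split on whitespace, sort the words, and join them back,
    # so two strings with the same words in a different order become equal.
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    return simple_ratio(sorted_a, sorted_b)


if __name__ == "__main__":
    print(simple_ratio("this is a test", "this is a test!"))  # 96
    print(token_sort_ratio_sketch("fuzzy wuzzy was a bear",
                                  "wuzzy fuzzy was a bear"))  # 100
```

Both bear strings sort to "a bear fuzzy was wuzzy", so once word order is removed they are identical.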
May 24, 2013
Levenshtein distance (a string metric for measuring the difference between two sequences)
FuzzyWuzzy: Fuzzy String Matching in Python
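
For reference, Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. Below is a minimal sketch of the classic dynamic-programming computation; the function name `levenshtein` is chosen here for illustration and is not part of FuzzyWuzzy.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        # Distance between a[:i] and the empty string is i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))                  # 3
    print(levenshtein("this is a test", "this is a test!"))  # 1
```

Only two rows of the table are kept at a time, so memory use grows with the length of the shorter string rather than with the product of both lengths.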
Labels: algorithms, Levenshtein distance, strings difference