crandas.string_metrics
String metrics for approximate string matching, string comparison, and fuzzy string searching.
- crandas.string_metrics.edit_distance(left, right, distance_type='levenshtein', **type_opts) → CSeries
Computes the edit distance between two string columns.
By default, the Levenshtein edit distance is computed using the function levenshtein_distance().
- Parameters:
- Returns:
CSeries with the edit distances.
- Return type:
CSeries
- crandas.string_metrics.levenshtein_distance(left, right, score_cutoff=None) → CSeries
Computes the Levenshtein edit distance between two string columns, i.e., the minimum number of character insertions, deletions, and substitutions required to transform one string into the other.
- Parameters:
left (CSeries or str) – The first string column (or string) to compare.
right (CSeries or str) – The second string column (or string) to compare.
score_cutoff (int, optional) – Maximum edit distance to consider. If the edit distance is larger than score_cutoff, score_cutoff + 1 is returned. If None, no cutoff is applied. A lower value improves performance.
- Returns:
CSeries with the edit distances (integers).
- Return type:
CSeries
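To make the score_cutoff semantics concrete, the following is a minimal plain-Python sketch of the Levenshtein distance with an optional cutoff. It is an illustration only and does not use crandas or reflect how crandas computes the distance internally. The helper name `levenshtein` and the use of ordinary Python lists in place of string columns are assumptions made for this example.

```python
from typing import Optional


def levenshtein(a: str, b: str, score_cutoff: Optional[int] = None) -> int:
    """Hypothetical reference helper: Levenshtein distance with an optional cutoff."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # The length difference is a lower bound on the distance.
    if score_cutoff is not None and len(a) - len(b) > score_cutoff:
        return score_cutoff + 1

    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        # Row minima never decrease, so stop once the whole row exceeds the cutoff.
        if score_cutoff is not None and min(current) > score_cutoff:
            return score_cutoff + 1
        previous = current

    distance = previous[-1]
    if score_cutoff is not None and distance > score_cutoff:
        return score_cutoff + 1
    return distance


# Element-wise comparison of two plain lists standing in for string columns.
left = ["kitten", "flaw", "abc"]
right = ["sitting", "lawn", "abc"]
distances = [levenshtein(x, y) for x, y in zip(left, right)]
capped = [levenshtein(x, y, score_cutoff=1) for x, y in zip(left, right)]
```

With a cutoff of 1, any pair whose true distance exceeds 1 is reported as 2 (that is, score_cutoff + 1). This lets the computation stop early, which is why a lower cutoff can improve performance.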