Approximate string matching cheat sheet

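  1. Levenshtein distance : the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. If the pattern is coil, foil differs by one substitution, coils by one insertion, oil by one deletion, and foal by two substitutions.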
  2. Damerau–Levenshtein distance : like Levenshtein, but also counts the transposition of two adjacent characters as a single operation.
  3. Jaro–Winkler distance : designed and best suited for short strings such as person names.
  4. Smith–Waterman algorithm : performs local sequence alignment for determining similar regions between two strings, instead of looking at the total sequence.
  5. Needleman–Wunsch algorithm : performs global sequence alignment, comparing two strings end to end. It uses dynamic programming: it divides a large problem into a series of smaller problems and uses their solutions to reconstruct a solution to the larger one.
  6. Soundex : a phonetic algorithm for indexing names by sound, as pronounced in English.
  7. Metaphone : improves on Soundex by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding.
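
The first two entries are small enough to sketch in code. What follows is a minimal, unoptimized illustration, not a reference implementation. The function names levenshtein and osa_distance are my own, and the second function computes the restricted "optimal string alignment" variant of Damerau–Levenshtein, which never edits the same substring twice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep a match
            ))
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Levenshtein plus transposition of two adjacent characters.
    Assumption: this is the restricted (optimal string alignment) variant,
    not the unrestricted Damerau-Levenshtein distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Two adjacent characters swapped: count it as one operation.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


if __name__ == "__main__":
    for word in ("foil", "coils", "oil", "foal"):
        print("coil ->", word, levenshtein("coil", word))
    print("coil -> coli", levenshtein("coil", "coli"), osa_distance("coil", "coli"))
```

With the pattern coil, levenshtein reproduces the counts from the first entry (1, 1, 1 and 2). A swapped pair such as coli costs two substitutions under plain Levenshtein but a single transposition under osa_distance.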

A note to designers

  • Your job is to make things simpler, not cooler.
  • Restrain the product people from adding unnecessary features.
  • Save the engineering team time; don't make their lives harder.
  • Make sure that form follows function.

Forgive the early adopter

This is a letter to my friends and my colleagues.
I'm truly asking your forgiveness.
Forgive me for switching between messaging apps every few weeks.
Forgive me for forcing you to use my unfinished apps and services.
Forgive me for being mad at you when you buy the "wrong" phone, laptop or TV set.
Forgive the early adopter in me :-)

I really appreciate the fact that you're willing to go through all the beta phases and stick with it.
Your devotion will not be overlooked when our robot overlords take over and appoint me as one of their liaisons.