We discussed a paper Zhimin brought up on binning by Soundex to improve the performance of fuzzy matching (using Australian census data with networks of people's names). Zhimin will implement this as an available algorithm for FP fuzzy matching. Soundex is more sensitive to errors near the beginning of a word than near the end, so binning and then fuzzy matching is likely to lead to local optima for a match. We need ways to escape these, potentially by using the likely kinds of errors (e.g. typographic errors or OCR errors, each of which should produce a distinct signal in the data).
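To make the binning idea and its weakness concrete, here is a minimal Python sketch. It is not the method from the paper or the planned FP implementation: the function names (soundex, bin_by_soundex, best_match), the 0.8 threshold, and the use of difflib.SequenceMatcher as the fuzzy matcher are all assumptions chosen for illustration. The Soundex coding follows the common American Soundex rules.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Consonant classes of American Soundex; vowels, h, w and y have no code.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the 4-character Soundex code of a name ('' if it has no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # h and w do not separate two consonants with the same code.
        if ch not in "hw":
            prev = digit
    return (code + "000")[:4]


def bin_by_soundex(names):
    """Group names into bins keyed by their Soundex code."""
    bins = defaultdict(list)
    for name in names:
        bins[soundex(name)].append(name)
    return bins


def best_match(query, bins, threshold=0.8):
    """Fuzzy-match the query only against names in its own Soundex bin.

    Hypothetical helper: SequenceMatcher stands in for whatever fuzzy
    matcher is actually used, and the threshold is an arbitrary choice.
    """
    best, best_score = None, 0.0
    for candidate in bins.get(soundex(query), []):
        score = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


bins = bin_by_soundex(["Robert", "Rupert", "Smith"])
# An error at the end stays in the same bin, so the match is found.
end_error = best_match("Roberts", bins)
# An error in the first letter lands in a different bin, so the
# otherwise very similar name is never compared against "Robert".
start_error = best_match("Bobert", bins)
```

In this sketch "Roberts" still finds "Robert", while "Bobert" finds nothing because its first letter puts it in another bin. One possible way out, offered only as an illustration, would be to also probe the bins of query variants generated from a model of likely typographic or OCR errors.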
Paul has gotten IPT to build on a development machine and has started to adapt it to serve as an FP annotation injection client.
Next meeting: 2009Jul16