Improve patron address parser to better map data to the patron_record_address table
Sierra has a mechanism for parsing the free text entered into patron address fields in the client into separate fields for elements such as city, state, and postal code, which are stored in the patron_record_address table. Those distinct fields can then be searched in Create Lists or used in SQL queries. However, the algorithm used to parse out those data points is extremely rigid about the data it expects to see, and it is quite prone to mapping data incorrectly when an address entered into a patron record does not match the format it expects.
One common example in our system: if an extended ZIP code is entered in any format other than #####-#### (say, with a space instead of the hyphen, or even with a space before or after the hyphen), then the city, region, and postal_code values are all lumped together in the city field, leaving region and postal_code as NULL. Similar issues can occur if extra lines are used to separate these elements in the client.
This causes problems when searching patron records in Create Lists and can force quite convoluted SQL queries (not to mention causing an enormous amount of confusion for any third-party vendors expected to work with the query results).
The parser should be improved to at least account for some of these more common data entry scenarios.
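To illustrate the kind of tolerance being asked for, here is a minimal Python sketch. It is not Sierra's parser, whose internals are not described here. It assumes the city, region, and postal code all sit on one line, and the function name split_last_line is hypothetical. It accepts a ZIP+4 written with a hyphen, a space, or stray spaces around the hyphen, and a two-letter region abbreviation in any letter case, with or without a comma after the city.

```python
import re

# Assumed layout (not Sierra's actual rules):
#   "<city>[,] <2-letter region> <ZIP or ZIP+4>" on a single line.
_LAST_LINE = re.compile(
    r"""^\s*
        (?P<city>.+?)                  # city name, may contain spaces
        (?:\s*,\s*|\s+)                # comma and/or whitespace
        (?P<region>[A-Za-z]{2})        # two-letter abbreviation, any case
        (?:\s*,\s*|\s+)                # comma and/or whitespace
        (?P<zip5>\d{5})                # five-digit ZIP
        (?:\s*-?\s*(?P<plus4>\d{4}))?  # optional +4, hyphen and spaces optional
        \s*$""",
    re.VERBOSE,
)


def split_last_line(line):
    """Return (city, region, postal_code), or None if the line does not fit."""
    m = _LAST_LINE.match(line)
    if m is None:
        return None
    # Rebuild the postal code in the canonical #####-#### form.
    postal = m["zip5"] + ("-" + m["plus4"] if m["plus4"] else "")
    # The region is only upper-cased here, not checked against a state list.
    return m["city"], m["region"].upper(), postal


for sample in ("Boston, MA 02134-1234", "Boston ma 02134 1234", "Boston Ma 02134 - 1234"):
    print(split_last_line(sample))
```

All three sample lines come out as the same city, region, and postal_code values. Addresses that spread these elements over extra lines are outside what this sketch handles.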

-
Jeremy Goldstein commented
After a bit more experimenting I found another common occurrence that can lead to similar behavior. The parser seems to want the city and state to share a line of the address, and either to be separated by a comma or for the state abbreviation to have both letters capitalized.
"Boston, Ma" works, as does "Boston MA". However, neither "Boston ma" nor "Boston Ma" does.
-
Jennifer Nicolotti commented
I agree with all of the points Jeremy makes, and more. We allow customers to self-register, and we also have over 300 staff members who create customer records manually. One simple typo in the address fields can throw off the entire address and make it unusable both for searching within Sierra and for sharing this data with 3rd party applications, which can't parse our data because of the errors in the format.
-
Philip McNulty commented
Having a sound foundation for our patron data is important in a variety of our functions, from reporting, to 3rd party integrations, to patron communications. Sierra's weirdly flexible patron input forms require strong back-end tools to make sure this data is correct, and this idea will improve those tools.
-
Ruth Souto commented
The normalization of zip codes is important to ensure accuracy in the patron record and, in cases where notices are still physically mailed (we have quite a few that are), to ensure that the notice gets delivered by the USPS. Staff typos happen; if we could prevent those from slipping through, our patron database would be more useful and efficient.