Consider the input address here:
1700 Alondra Blvd, Compton, CA
Let's take a look at the address components that were entered (in this simple case, each component is delimited by spaces or a comma, although in general city names and street names often span multiple words):
primary_number: 1700
street_predirection: none
street_name: Alondra
street_suffix: Blvd
street_postdirection: none
secondary_number: none
secondary_designator: none
city_name: Compton
state_abbreviation: CA
zipcode: none
plus4_code: none
You definitely don't want to return an address that has fewer address components than the input address.
With that in mind, I would recommend considering both the US_RoofTop response and the US_Streets response. In this case, the US_Streets response has two comparable candidates, one East and one West, and there is no way for you to guess which one is preferred. The US_RoofTop response is a duplicate of a US_Streets response (based on the output address string), so it can be removed from what you present to the user.
No ZIP Code was input, which means the user is relying on your service to determine it. This is important because if the input had included a ZIP Code, either 90220 or 90221, you would have been able to narrow the response down to just one address.
So, in summary: take the response(s) that have the greatest number of address components, as they are most likely to be accurate; consolidate down to just the unique responses; and present those back to the user. You have then been as smart as you possibly can while still allowing your user to clarify when needed.
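To make that summary concrete, here is a minimal sketch in Python. It assumes each candidate has already been parsed into a dictionary keyed by the component names listed above, with missing components absent or set to None. The function names and the sample records are hypothetical and are not part of any geocoder's API.

```python
# Component names as listed above; a candidate is a dict keyed by these names.
COMPONENT_KEYS = [
    "primary_number", "street_predirection", "street_name", "street_suffix",
    "street_postdirection", "secondary_number", "secondary_designator",
    "city_name", "state_abbreviation", "zipcode", "plus4_code",
]


def component_count(address):
    """Count the components that are actually present (not None or empty)."""
    return sum(1 for key in COMPONENT_KEYS if address.get(key))


def address_string(address):
    """Build a simple output string, used here only to detect duplicates."""
    return " ".join(str(address[key]) for key in COMPONENT_KEYS if address.get(key))


def best_candidates(input_address, candidates):
    """Keep the unique candidates that have the most components.

    Candidates with fewer components than the input are never returned.
    """
    floor = component_count(input_address)
    eligible = [c for c in candidates if component_count(c) >= floor]
    if not eligible:
        return []
    most = max(component_count(c) for c in eligible)
    unique = {}
    for candidate in eligible:
        if component_count(candidate) == most:
            # setdefault keeps the first candidate seen for each address string
            unique.setdefault(address_string(candidate), candidate)
    return list(unique.values())


# Made-up records for illustration only (not real geocoder output).
entered = {
    "primary_number": "1700", "street_name": "Alondra", "street_suffix": "Blvd",
    "city_name": "Compton", "state_abbreviation": "CA",
}
east = dict(entered, street_predirection="E")
west = dict(entered, street_predirection="W")
rooftop_duplicate = dict(east)

for match in best_candidates(entered, [east, west, rooftop_duplicate]):
    print(address_string(match))
```

With these sample records the duplicate collapses into the East candidate, and both the East and West addresses remain for the user to choose between.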
expertise: I work with addresses all day long as a street genius at SmartyStreets.
Best Answer
The scores are based on a weighted numbering system, driven by the number of matching characters in each of the prioritized/configured address element areas. The more characters that match, the greater the likelihood of a high score.
When using ranged-address data such as street centerlines, the address range and parity also figure into the process. For example, if you have a range of 3000-6000 (even) and the address is 2998 but the rest of the street name matches, ArcGIS will make this a candidate but lower the score, since the number falls outside the expected range.
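As a rough illustration of the range and parity test, the sketch below checks a house number against a street segment. The function name and the "even"/"odd"/"both" parity labels are assumptions made for this example. How much ArcGIS actually lowers the score is not stated here, so the sketch only reports the two checks and does not compute a penalty.

```python
def check_house_number(number, low, high, parity):
    """Return (in_range, parity_matches) for a house number on a segment.

    parity is "even", "odd", or "both" (hypothetical labels for this sketch).
    """
    in_range = low <= number <= high
    if parity == "even":
        parity_matches = number % 2 == 0
    elif parity == "odd":
        parity_matches = number % 2 == 1
    else:
        parity_matches = True  # "both": any house number is acceptable
    return in_range, parity_matches


# The example from the text: 2998 against an even range of 3000-6000.
in_range, parity_matches = check_house_number(2998, 3000, 6000, "even")
print(in_range, parity_matches)  # False True
```

Here 2998 has the right parity but falls just below the range, which is the situation where the address remains a candidate with a reduced score.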
See Bruce Harold's response at Re: Geocoding Score Documentation: How is the score value determined?:
"Bruce Harold (Employee), Apr 10, 2015 2:25 PM (in response to Nathan Lowry)
Hello
Score calculation is not documented in detail, but I can give you a thumbnail.
If you open USAddress.lot.xml in Firefox from its installed location at file:///C:/Program Files (x86)/ArcGIS/Desktop10./Locators you will see a navigable tree.
In Top Level Elements navigate to FullNormalAddress; the superscript numbers for NormalAddress (70) and Zone (30) are the relative weights for score contributions from those elements. Coincidentally they sum to 100 but only the relative weight is relevant.
Navigating further from NormalAddress, you will see that its 70/100 of the score is contributed 15/75 by House and 60/75 by FullStreetName, where 75 is the sum of those weights. Further down you can see the elements prefix (5/92), pretype (6/92), StName (70/92), suftype (6/92) and suffix (5/92), where 92 is the sum of those weights. The individual score for any lowest-level element (for example, the score contribution from an imperfect street name) may be determined by the Spelling/Scoring section of the XML file if an anticipated spelling correction is required to match the reference data, or by a proprietary algorithm for unanticipated spelling errors, noise, or repeated characters, as when you have keybounce.
Scores are weight summed, with percentage normalization, from the bottom up. Missing elements do not penalize a score, they simply do not contribute.
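The bottom-up weighted sum described above can be sketched in Python as follows. The weights are the ones quoted from the locator file. The nesting of the street elements under FullStreetName, the 0-100 scale for leaf scores, and the choice to normalize over only the elements that are present are assumptions about how "missing elements do not contribute" works; this is not ArcGIS's actual implementation.

```python
# Each entry maps an element name to (relative weight, children or None).
WEIGHTS = {
    "NormalAddress": (70, {
        "House": (15, None),
        "FullStreetName": (60, {
            "prefix": (5, None),
            "pretype": (6, None),
            "StName": (70, None),
            "suftype": (6, None),
            "suffix": (5, None),
        }),
    }),
    "Zone": (30, None),
}


def rollup(children, leaf_scores):
    """Weighted average of child scores, computed from the bottom up.

    leaf_scores maps lowest-level element names to a 0-100 score.
    Elements without a score are skipped, so they add nothing and
    do not drag the result down. Returns None if nothing is present.
    """
    total_weight = 0
    weighted_sum = 0.0
    for name, (weight, sub) in children.items():
        if sub is None:
            score = leaf_scores.get(name)
        else:
            score = rollup(sub, leaf_scores)
        if score is None:
            continue  # missing element: no contribution, no penalty
        total_weight += weight
        weighted_sum += weight * score
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


# Hypothetical leaf scores: a perfect house number and street type,
# an imperfect street name, and no zone supplied.
print(rollup(WEIGHTS, {"House": 100, "StName": 80, "suftype": 100}))
```

Under these assumptions, a weak StName match dominates the result because it carries 70 of the 92 street-level weight units, while the absent zone leaves the score unchanged.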