The application exposes an API endpoint to standardize merchant names from raw transaction descriptions.
GET /payees/real-name
payee (string, required): The raw payee string from the transaction (e.g., "7-ELEVEN 23020 SEATTLE WA").
Returns a JSON object with:
real_name (string): The standardized merchant name.
confidence (float): A score between 0.0 and 1.0 indicating certainty.
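
A client needs to percent-encode the raw payee string and then read the two fields from the response. The following is a minimal sketch of both steps using only the Python standard library. The helper names (`build_request_path`, `parse_response`, `is_trusted`) and the 0.8 threshold are illustrative assumptions, not part of the API.

```python
import json
from typing import Tuple
from urllib.parse import quote

ENDPOINT = "/payees/real-name"  # path taken from the endpoint description


def build_request_path(payee: str) -> str:
    """Return the path plus query string for a raw payee string."""
    # quote() with safe="" percent-encodes spaces as %20 and escapes
    # reserved characters such as "&" or "/" inside the payee value.
    return f"{ENDPOINT}?payee={quote(payee, safe='')}"


def parse_response(body: str) -> Tuple[str, float]:
    """Extract (real_name, confidence) from the JSON response body."""
    data = json.loads(body)
    return data["real_name"], float(data["confidence"])


def is_trusted(confidence: float, threshold: float = 0.8) -> bool:
    """Hypothetical client-side policy: accept only high-confidence names."""
    return confidence >= threshold
```

A caller might use the standardized name only when `is_trusted` returns true and otherwise keep the raw payee string for manual review. Where to set the threshold depends on how the service calibrates its scores, which this document does not specify.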
Request:
GET /payees/real-name?payee=7-ELEVEN%2023020%20SEATTLE%20WA
Response:
{
  "real_name": "7-Eleven",
  "confidence": 0.95
}
Request:
GET /payees/real-name?payee=Unknown%20Store%20123
Response:
{
  "real_name": "Unknown Store",
  "confidence": 0.4
}

The resolution logic uses a hybrid approach:
- Cleaning: Removes digits, common location codes (states, countries), and expands abbreviations.
- Dictionary Match: Checks against a list of known merchants.
- Fuzzy Matching: Finds similar names using string distance metrics.
- Clustering (ML): Uses TF-IDF and K-Means clustering to identify common payee patterns that aren't in the manual dictionary.
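
The first three stages above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the application's actual implementation: the merchant dictionary, location tokens, abbreviation table, cutoff, and fallback confidence are all hypothetical sample values, and the confidence numbers it produces will not match the service's. It drops digit-only tokens rather than every digit so that names such as "7-ELEVEN" survive cleaning, and it uses `difflib` from the standard library as the string-distance metric. The clustering stage is omitted.

```python
from difflib import SequenceMatcher, get_close_matches
from typing import Tuple

# Hypothetical sample data; the real lists are not shown in this document.
KNOWN_MERCHANTS = {"7-ELEVEN": "7-Eleven", "STARBUCKS": "Starbucks", "WALMART": "Walmart"}
LOCATION_TOKENS = {"WA", "CA", "NY", "US", "USA", "SEATTLE"}  # city added for the example
ABBREVIATIONS = {"MKT": "MARKET", "STN": "STATION"}
FUZZY_CUTOFF = 0.8          # assumed minimum similarity for a fuzzy match
FALLBACK_CONFIDENCE = 0.4   # assumed score when nothing matches


def clean(raw: str) -> str:
    """Cleaning: drop digit-only tokens and location tokens, expand abbreviations."""
    kept = []
    for token in raw.upper().split():
        if token.isdigit() or token in LOCATION_TOKENS:
            continue
        kept.append(ABBREVIATIONS.get(token, token))
    return " ".join(kept)


def resolve(raw: str) -> Tuple[str, float]:
    """Return (real_name, confidence) for a raw payee string."""
    cleaned = clean(raw)

    # Dictionary match: exact hit on a known merchant.
    if cleaned in KNOWN_MERCHANTS:
        return KNOWN_MERCHANTS[cleaned], 1.0

    # Fuzzy match: closest known merchant at or above the cutoff.
    close = get_close_matches(cleaned, list(KNOWN_MERCHANTS), n=1, cutoff=FUZZY_CUTOFF)
    if close:
        score = SequenceMatcher(None, close[0], cleaned).ratio()
        return KNOWN_MERCHANTS[close[0]], round(score, 2)

    # No match: return the cleaned string in title case with a low score.
    return cleaned.title(), FALLBACK_CONFIDENCE
```

For example, `resolve("STARBUKS 0042")` falls through the dictionary lookup and is caught by the fuzzy stage, while `resolve("Unknown Store 123")` matches nothing and comes back as a title-cased fallback. Running the cheap exact lookup before the fuzzy comparison keeps the common case fast and reserves the more expensive similarity scan for misspelled or truncated names.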