sraftopo/StringSimilarity

String Similarity Server

A Node.js Express server that scores string similarity by combining five algorithms: Sørensen–Dice coefficient, Jaro-Winkler, Levenshtein distance, Jaccard similarity, and TF-IDF cosine similarity.

Features

  • Multiple Similarity Algorithms: Combines 5 different similarity metrics for accurate results
  • Weighted Scoring: Uses weighted combination of different similarity measures
  • RESTful API: Simple POST endpoint for string comparison
  • Flexible Input: Works with arrays of objects and custom property selection
  • JSON String Parsing: Automatically parses JSON strings sent as inputObject
  • Nested Object Access: Supports dot notation for accessing nested properties (e.g., 'practicetypes.practicetype_title')
  • Detailed Results: Returns similarity scores with detailed breakdowns
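
The JSON string parsing and dot-notation features listed above could be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the helper names `parseInputObject` and `getByPath` are hypothetical, and the real implementation may handle edge cases differently.

```javascript
// Hypothetical helpers for illustration; names are not taken from the server's source.

// Accept either a plain object or a JSON string and return an object.
function parseInputObject(inputObject) {
  if (typeof inputObject === "string") {
    return JSON.parse(inputObject); // throws SyntaxError on malformed JSON
  }
  return inputObject;
}

// Resolve a dot-separated path such as "practicetypes.practicetype_title".
// Returns undefined if any segment along the path is missing.
function getByPath(obj, path) {
  return path
    .split(".")
    .reduce((current, key) => (current == null ? undefined : current[key]), obj);
}

const parsed = parseInputObject('{"practicetypes": {"practicetype_title": "Example"}}');
console.log(getByPath(parsed, "practicetypes.practicetype_title")); // prints "Example"
```

Returning `undefined` for a missing segment, rather than throwing, lets the caller decide how to report an invalid `inputElement`.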

Installation

npm install

Usage

Start the Server

npm start

The server will start on port 3031.

API Endpoints

POST /compare

Compares the inputElement property of inputObject against the elementToCheck property of each object in arrayOfObjects.

Request Body:

{
  "inputObject": {
    "id": 1,
    "title": "Hello world, this is a test",
    "category": "Sample"
  },
  "inputElement": "title",
  "arrayOfObjects": [
    { "id": 1, "text": "Hello world, this is a test" },
    { "id": 2, "text": "Hello world, this is a sample" },
    { "id": 3, "text": "Hi there, this is a test" }
  ],
  "elementToCheck": "text"
}

Alternative: JSON String as inputObject:

{
  "inputObject": "{\"id\": 1, \"title\": \"Hello world, this is a test\", \"category\": \"Sample\"}",
  "inputElement": "title",
  "arrayOfObjects": [
    { "id": 1, "text": "Hello world, this is a test" },
    { "id": 2, "text": "Hello world, this is a sample" },
    { "id": 3, "text": "Hi there, this is a test" }
  ],
  "elementToCheck": "text"
}

Nested Object Access with Dot Notation:

{
  "inputObject": {
    "practicetypes": {
      "practicetype_title": "Εξωτερική ακτινοθεραπεία με ακτίνες Χ",
      "practicetype_number": 1
    },
    "practice_number": 1,
    "practice_code": "Γ1"
  },
  "inputElement": "practicetypes.practicetype_title",
  "arrayOfObjects": [
    { "PracticeTypeTitle": "Εξωτερική ακτινοθεραπεία με ακτίνες Χ" },
    { "PracticeTypeTitle": "Διαγνωστική ακτινοσκόπηση" }
  ],
  "elementToCheck": "PracticeTypeTitle"
}

Response:

{
  "inputObject": {
    "id": 1,
    "title": "Hello world, this is a test",
    "category": "Sample"
  },
  "inputString": "Hello world, this is a test",
  "totalCompared": 3,
  "resultsReturned": 3,
  "results": [
    {
      "index": 0,
      "score": 1.0,
      "targetString": "Hello world, this is a test",
      "originalObject": { "id": 1, "text": "Hello world, this is a test" }
    },
    {
      "index": 2,
      "score": 0.7234,
      "targetString": "Hi there, this is a test",
      "originalObject": { "id": 3, "text": "Hi there, this is a test" }
    },
    {
      "index": 1,
      "score": 0.6789,
      "targetString": "Hello world, this is a sample",
      "originalObject": { "id": 2, "text": "Hello world, this is a sample" }
    }
  ],
  "topMatch": {
    "index": 0,
    "score": 1.0,
    "targetString": "Hello world, this is a test",
    "originalObject": { "id": 1, "text": "Hello world, this is a test" }
  }
}
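
The shape of the response above suggests that each candidate is scored, the results are sorted by descending score, and the first entry is repeated as `topMatch`. A minimal sketch of that ranking step follows. It assumes a scoring function is passed in; `rankMatches` and `exactMatchScore` are hypothetical names, and the toy scorer stands in for the server's weighted combination.

```javascript
// Hypothetical sketch of the ranking step; not the server's actual implementation.
function rankMatches(inputString, arrayOfObjects, elementToCheck, scoreFn) {
  const results = arrayOfObjects
    .map((originalObject, index) => {
      const targetString = String(originalObject[elementToCheck] ?? "");
      return { index, score: scoreFn(inputString, targetString), targetString, originalObject };
    })
    .sort((a, b) => b.score - a.score); // highest score first

  return {
    inputString,
    totalCompared: arrayOfObjects.length,
    resultsReturned: results.length,
    results,
    topMatch: results[0] ?? null, // null when there is nothing to compare
  };
}

// Toy scorer used only for this example: 1 for an exact match, 0 otherwise.
const exactMatchScore = (a, b) => (a === b ? 1 : 0);

const ranked = rankMatches(
  "Hello world",
  [{ id: 1, text: "Hi there" }, { id: 2, text: "Hello world" }],
  "text",
  exactMatchScore
);
console.log(ranked.topMatch.originalObject.id); // prints 2
```

The `index` field keeps each result's position in the original array, so callers can map a match back to their own data after sorting.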

GET /health

Health check endpoint.

GET /

Server information and usage instructions.

Similarity Algorithms Used

  1. String Similarity (Dice Coefficient): 30% weight - Wikipedia: Sørensen–Dice coefficient
  2. Jaro-Winkler Distance: 20% weight - Wikipedia: Jaro–Winkler distance
  3. Levenshtein Distance: 20% weight - Wikipedia: Levenshtein distance
  4. Jaccard Similarity: 15% weight - Wikipedia: Jaccard index
  5. TF-IDF Cosine Similarity: 15% weight - Wikipedia: TF-IDF and Wikipedia: Cosine similarity
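
The weights above sum to 100%, so the final score can be read as a weighted average of the five metrics. The sketch below shows one way to combine them, plus one way to turn a Levenshtein distance into a 0 to 1 similarity. The metric keys, the function names, and the normalization by the longer string's length are assumptions for illustration; the README does not state how the server normalizes distances.

```javascript
// Weights as listed above; the metric key names are hypothetical.
const WEIGHTS = {
  dice: 0.3,
  jaroWinkler: 0.2,
  levenshtein: 0.2,
  jaccard: 0.15,
  tfidfCosine: 0.15,
};

// Weighted sum of per-metric similarities, each expected in the range [0, 1].
// A metric missing from `scores` contributes 0.
function combineScores(scores) {
  let total = 0;
  for (const [metric, weight] of Object.entries(WEIGHTS)) {
    total += weight * (scores[metric] ?? 0);
  }
  return total;
}

// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshteinDistance(a, b) {
  const s = Array.from(a);
  const t = Array.from(b);
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[t.length];
}

// Assumed normalization: 1 - distance / length of the longer string.
function levenshteinSimilarity(a, b) {
  const maxLen = Math.max(Array.from(a).length, Array.from(b).length);
  return maxLen === 0 ? 1 : 1 - levenshteinDistance(a, b) / maxLen;
}

console.log(levenshteinDistance("kitten", "sitting")); // prints 3
console.log(combineScores({ dice: 1, jaroWinkler: 1, levenshtein: 0.5, jaccard: 1, tfidfCosine: 1 }));
```

With every other metric at 1 and a Levenshtein similarity of 0.5, the combined score is about 0.9, since that metric only carries a 20% weight.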

Testing

Run the test script to verify the server functionality:

node test-server.js

Development

For development with auto-restart:

npm run dev

Example Usage with curl

curl -X POST http://localhost:3031/compare \
  -H "Content-Type: application/json" \
  -d '{
    "inputObject": {
      "id": 1,
      "title": "Hello world",
      "category": "Greeting"
    },
    "inputElement": "title",
    "arrayOfObjects": [
      {"id": 1, "text": "Hello world"},
      {"id": 2, "text": "Hi there"},
      {"id": 3, "text": "Goodbye world"}
    ],
    "elementToCheck": "text"
  }'
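
The same request can be sent from Node.js. The sketch below assumes Node 18 or later, where `fetch` is available globally, and a server running locally on port 3031 as described above. The function names `buildCompareRequest` and `compare` are hypothetical.

```javascript
// Assumes the server from this README is running locally on port 3031.
const BASE_URL = "http://localhost:3031";

// Build the URL and fetch options for POST /compare.
function buildCompareRequest(payload) {
  return {
    url: `${BASE_URL}/compare`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Send the request and return the parsed JSON response.
async function compare(payload) {
  const { url, options } = buildCompareRequest(payload);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Example call (requires a running server):
// compare({
//   inputObject: { id: 1, title: "Hello world", category: "Greeting" },
//   inputElement: "title",
//   arrayOfObjects: [{ id: 1, text: "Hello world" }, { id: 2, text: "Hi there" }],
//   elementToCheck: "text",
// }).then((data) => console.log(data.topMatch));
```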

Error Handling

The server includes comprehensive error handling for:

  • Missing or invalid input parameters
  • Empty arrays
  • Invalid object structures
  • Server errors

All errors return appropriate HTTP status codes and descriptive error messages.
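
As an illustration of the checks listed above, request validation could look something like the following. This is a sketch only: the function name `validateCompareRequest`, the 400 status code, and the error messages are assumptions, since this README does not document the exact codes or messages the server returns.

```javascript
// Hypothetical validator; status codes and messages are illustrative assumptions.
// Returns null when the body is valid, otherwise an object describing the error.
function validateCompareRequest(body) {
  const { inputObject, inputElement, arrayOfObjects, elementToCheck } = body ?? {};

  if (inputObject == null || typeof inputElement !== "string" || typeof elementToCheck !== "string") {
    return { status: 400, error: "inputObject, inputElement and elementToCheck are required" };
  }
  if (!Array.isArray(arrayOfObjects)) {
    return { status: 400, error: "arrayOfObjects must be an array" };
  }
  if (arrayOfObjects.length === 0) {
    return { status: 400, error: "arrayOfObjects must not be empty" };
  }
  return null;
}

console.log(validateCompareRequest({ inputObject: {}, inputElement: "title", arrayOfObjects: [], elementToCheck: "text" }));
```

In an Express handler, a non-null result would typically be sent back with `res.status(result.status).json({ error: result.error })` before any scoring takes place.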
