Test File Formats

Julian Weise edited this page Aug 12, 2019 · 6 revisions

Test Configuration

Schema

configuration:
    type: object
    required:
        - task_configurations
        - categories
    properties:
        task_configurations:
            type: array
            items:
                $ref: '#/task_configuration'
        categories:
            type: array
            items:
                $ref: '#/category'
            
task_configuration:
    type: object
    required:
        - type
        - enabled
    properties:
        type:
            type: string
            enum: [cosine_similarity, euclidean_similarity, analogy, cosine_neighborhood, euclidean_neighborhood, cosine_outlier_detection, euclidean_outlier_detection]
        enabled:
            type: boolean            

category:
    type: object
    required:
        - name
        - enabled
        - entities
        - tasks
        - categories
    properties:
        name:
            type: string
        enabled:
            type: boolean
        entities:
            type: string
            description: Path to the file listing all linked entities used to calculate the base noise for tasks.
        tasks:
            type: array
            items:
                $ref: '#/task'
        categories:
            type: array
            items:
                $ref: '#/category'

task:
    type: object
    required:
        - name
        - type
        - test_set
    properties:
        name:
            type: string
        type:
            type: string
            enum: [cosine_similarity, euclidean_similarity, analogy, cosine_neighborhood, euclidean_neighborhood, cosine_outlier_detection, euclidean_outlier_detection]
        test_set:
            type: string
            description: Path to local test-set file.
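
The required keys and allowed task types in the schema above can be checked with a few lines of Python. This is a minimal sketch, not part of the project: it assumes the configuration has already been parsed into plain dictionaries in the shape the schema describes (list items are the category and task objects themselves), and the function names validate_task and validate_category are hypothetical.

```python
# Task types from the schema's enum, plus `analogy`, which the example
# configuration and the Analogy test-set format also use.
TASK_TYPES = {
    "cosine_similarity", "euclidean_similarity", "analogy",
    "cosine_neighborhood", "euclidean_neighborhood",
    "cosine_outlier_detection", "euclidean_outlier_detection",
}
TASK_REQUIRED = ("name", "type", "test_set")
CATEGORY_REQUIRED = ("name", "enabled", "entities", "tasks", "categories")


def validate_task(task, path):
    """Return a list of error messages for one task mapping."""
    errors = [f"{path}: task is missing '{key}'"
              for key in TASK_REQUIRED if key not in task]
    if "type" in task and task["type"] not in TASK_TYPES:
        errors.append(f"{path}: unknown task type '{task['type']}'")
    return errors


def validate_category(category, path=""):
    """Recursively validate a category, its tasks, and its subcategories."""
    here = f"{path}/{category.get('name', '?')}"
    errors = [f"{here}: category is missing '{key}'"
              for key in CATEGORY_REQUIRED if key not in category]
    # Empty `tasks:` / `categories:` entries parse as None, so fall back to [].
    for task in category.get("tasks") or []:
        errors += validate_task(task, here)
    for child in category.get("categories") or []:
        errors += validate_category(child, here)
    return errors


if __name__ == "__main__":
    category = {
        "name": "Geography",
        "enabled": True,
        "entities": "all_entities.txt",
        "tasks": [{"name": "is_capital_of", "type": "cosine_neighborbood"}],
        "categories": None,
    }
    for message in validate_category(category):
        print(message)
```

The sketch collects every problem instead of stopping at the first one, which makes misspelled type names easier to spot in large configurations.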

Example

configuration:
    task_configurations:
        - task_configuration: 
            type: cosine_similarity
            enabled: true
        - task_configuration:
            type: euclidean_similarity
            enabled: true
        - task_configuration:
            type: analogy
            enabled: true
        - task_configuration:
            type: cosine_neighborhood
            enabled: false
        - task_configuration:
            type: euclidean_neighborhood
            enabled: false
        - task_configuration:
            type: cosine_outlier_detection
            enabled: true
        - task_configuration:
            type: euclidean_outlier_detection
            enabled: true
    categories:
        - category:
            name: Geography
            enabled: true
            entities: all_entities.txt
            tasks:
                - task:
                    name: is_country_similar_cosine
                    type: cosine_similarity
                    test_set: is_country_similar/data.csv
                - task:
                    name: is_country_similar_euclidean
                    type: euclidean_similarity
                    test_set: is_country_similar/data.csv
                - task:
                    name: is_capital_of
                    type: analogy
                    metric: cosine
                    test_set: is_capital_of/data.csv
            categories:
                - category:
                    name: Europe
                    enabled: false
                    entities: geography_entities.txt
                    tasks:
                        - task:
                            name: is_capital_of
                            type: cosine_neighborhood
                            test_set: is_capital_of/data.csv
                    categories:
                - category:
                    name: North_America
                    enabled: true
                    entities: geography_entities.txt
                    tasks:
                        - task:
                            name: is_capital_of
                            type: cosine_outlier_detection
                            test_set: is_capital_of/data.csv
                    categories:
        - category:
            name: Food
            enabled: false
            entities: all_entities.txt
            tasks:
            categories:
                - category:
                    name: Meat
                    enabled: true
                    entities: food_entities.txt
                    tasks:
                        - task:
                            name: is_similar_food
                            type: cosine_similarity
                            test_set: food/data.csv
                    categories:
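
The nesting in the example can be walked recursively to list the tasks that would run. The sketch below is only an illustration under two assumptions that this page does not state: a task runs only if its type is enabled under task_configurations, and a disabled category also disables its subcategories. The helper names unwrap and enabled_tasks are hypothetical, and the input is the configuration after parsing into dictionaries and lists.

```python
def unwrap(item, key):
    """Return the inner mapping of a `- key: {...}` list item, or the item itself."""
    if isinstance(item, dict) and set(item) == {key}:
        return item[key]
    return item


def enabled_tasks(config):
    """Return (category path, task name) pairs for all tasks considered enabled."""
    enabled_types = set()
    for item in config.get("task_configurations") or []:
        task_configuration = unwrap(item, "task_configuration")
        if task_configuration.get("enabled"):
            enabled_types.add(task_configuration["type"])

    found = []

    def walk(items, prefix):
        for item in items or []:  # an empty `categories:` parses as None
            category = unwrap(item, "category")
            if not category.get("enabled"):
                continue  # assumption: subcategories of a disabled category are skipped
            path = prefix + [category["name"]]
            for task_item in category.get("tasks") or []:
                task = unwrap(task_item, "task")
                if task["type"] in enabled_types:
                    found.append(("/".join(path), task["name"]))
            walk(category.get("categories"), path)

    walk(config.get("categories"), [])
    return found


if __name__ == "__main__":
    config = {
        "task_configurations": [
            {"task_configuration": {"type": "cosine_similarity", "enabled": True}},
        ],
        "categories": [
            {"category": {
                "name": "Geography", "enabled": True, "entities": "all_entities.txt",
                "tasks": [{"task": {"name": "is_country_similar_cosine",
                                    "type": "cosine_similarity",
                                    "test_set": "is_country_similar/data.csv"}}],
                "categories": None,
            }},
        ],
    }
    print(enabled_tasks(config))
```

If the evaluation tool treats enabled differently, for example by still running enabled subcategories of a disabled parent, only the continue line needs to change.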

Test Sets

Similarity

Schema:

a [:knowledgebase_id], b [:knowledgebase_id], group_id [:int], rank [:int - 0 is most similar]

Example:

wd:Q567, wd:Q71359, 1, 0 # Angela Merkel, Annegret Kramp-Karrenbauer
wd:Q567, wd:Q9671, 1, 1 # Angela Merkel, Michael Schumacher
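
A similarity test set in this format can be read into ranked groups as follows. This is a hedged sketch: it assumes fields are separated by commas and that anything after a # is a comment, as in the example; whether real test-set files may contain such comments is not stated here. The names parse_rows and load_similarity are hypothetical.

```python
from collections import defaultdict


def parse_rows(text):
    """Split each non-empty line into stripped fields, dropping `#` comments."""
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            rows.append([field.strip() for field in line.split(",")])
    return rows


def load_similarity(text):
    """Map each group_id to its (a, b) pairs, ordered by rank (0 = most similar)."""
    groups = defaultdict(list)
    for a, b, group_id, rank in parse_rows(text):
        groups[int(group_id)].append((int(rank), a, b))
    return {group_id: [(a, b) for _, a, b in sorted(rows)]
            for group_id, rows in groups.items()}


if __name__ == "__main__":
    sample = (
        "wd:Q567, wd:Q9671, 1, 1 # Angela Merkel, Michael Schumacher\n"
        "wd:Q567, wd:Q71359, 1, 0 # Angela Merkel, Annegret Kramp-Karrenbauer\n"
    )
    print(load_similarity(sample))
```

Sorting by rank inside each group means the rows may appear in any order in the file.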

Analogy

Schema:

a [:knowledgebase_id], b [:knowledgebase_id]

Example:

wd:Q64, wd:Q183 # Berlin, Germany
wd:Q90, wd:Q142 # Paris, France

Neighborhood

Schema:

a [:knowledgebase_id], group_id [:int], is_similar [:boolean]

Example:

wd:Q64, 1, true    # Berlin
wd:Q90, 1, true    # Paris
wd:Q567, 2, false  # Angela Merkel
wd:Q9671, 2, false # Michael Schumacher

Outlier Detection

Schema:

a [:knowledgebase_id], group_id [:int], is_outlier [:boolean]

Example:

wd:Q567, 1, false   # Angela Merkel
wd:Q71359, 1, false # Annegret Kramp-Karrenbauer
wd:Q9671, 1, true   # Michael Schumacher
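
Rows in the outlier detection format can be grouped by group_id and split into inliers and outliers. The sketch below assumes comma-separated fields, optional # comments, and the literal values true and false for the boolean column; the function names are hypothetical. The same approach would apply to the neighborhood format, with is_similar in place of is_outlier.

```python
from collections import defaultdict


def parse_bool(value):
    """Convert the literals 'true' / 'false' (any case) to a Python bool."""
    value = value.strip().lower()
    if value not in ("true", "false"):
        raise ValueError(f"expected 'true' or 'false', got {value!r}")
    return value == "true"


def load_outlier_groups(text):
    """Map each group_id to its lists of inlier and outlier entity IDs."""
    groups = defaultdict(lambda: {"inliers": [], "outliers": []})
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        entity, group_id, is_outlier = (field.strip() for field in line.split(","))
        key = "outliers" if parse_bool(is_outlier) else "inliers"
        groups[int(group_id)][key].append(entity)
    return dict(groups)


if __name__ == "__main__":
    sample = (
        "wd:Q567, 1, false   # Angela Merkel\n"
        "wd:Q71359, 1, false # Annegret Kramp-Karrenbauer\n"
        "wd:Q9671, 1, true   # Michael Schumacher\n"
    )
    print(load_outlier_groups(sample))
```

Rejecting values other than true and false keeps a misspelled flag from being read silently as false.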

Linking

Schema:

embedding_label [:string], knowledgebase_id [:knowledgebase_id]

Example:

*_entity_angela_merkel, wd:Q567

Entities

Schema:

knowledgebase_id [:knowledgebase_id]

Example:

wd:Q567
wd:Q71359
wd:Q9671
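
The linking and entities formats can be combined to look up the embedding label for each listed entity. This is an illustrative sketch with hypothetical function names. It assumes one record per line, splits linking rows at the last comma in case a label itself contains commas, and keeps the last label when an ID occurs more than once; none of these rules is stated on this page.

```python
def load_linking(text):
    """Map each knowledgebase_id to its embedding_label."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the last comma so commas inside the label are preserved.
        label, knowledgebase_id = (field.strip() for field in line.rsplit(",", 1))
        mapping[knowledgebase_id] = label  # a later duplicate overwrites an earlier one
    return mapping


def resolve(entities_text, linking):
    """Return (linked, missing): labels for known IDs and the list of unknown IDs."""
    linked, missing = {}, []
    for knowledgebase_id in entities_text.split():
        if knowledgebase_id in linking:
            linked[knowledgebase_id] = linking[knowledgebase_id]
        else:
            missing.append(knowledgebase_id)
    return linked, missing


if __name__ == "__main__":
    linking = load_linking("*_entity_angela_merkel, wd:Q567\n")
    linked, missing = resolve("wd:Q567\nwd:Q71359\n", linking)
    print(linked)
    print(missing)
```

Reporting the unlinked IDs separately makes it easy to see which entities in an entities file have no embedding.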
