Inconsistent results #2

@tpellard

The distance computed between the same two sites (Aalsmeer NH and Aalst BeLb) differs depending on the method used.

With distance_matrix on the whole data set:

library(dialectR)
data(Dutch)
distDutch <- distance_matrix(Dutch, "leven", alignment_normalization = TRUE)
distDutch[1:2,1:2]
            Aalsmeer NH Aalst BeLb
Aalsmeer NH   0.0000000  0.3609137
Aalst BeLb    0.3609137  0.0000000

With distance_matrix on a subset of the data set:

distance_matrix(Dutch[1:2,], "leven", alignment_normalization = TRUE)
            Aalsmeer NH Aalst BeLb
Aalsmeer NH           0        Inf
Aalst BeLb          Inf          0

The average of the output of the leven function on the same subset:

mean(leven(unlist(Dutch[1,]), unlist(Dutch[2,]), alignment_normalization = TRUE), na.rm = TRUE)
0.3937826
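
A note on reading the last number: in base R, `na.rm = TRUE` drops `NA` and `NaN` but not `Inf`. A finite mean therefore suggests that the per-word `leven` output for these two sites contains no infinite values, which would point at a later aggregation step rather than at `leven` itself. This is an inference, not something checked against the package source. A minimal sketch with made-up values (the vectors `d` and `d_inf` are hypothetical and not taken from `Dutch`):

```r
# Hypothetical per-word distances, NOT values from the Dutch data set
d     <- c(0.25, NA, 0.5)        # one missing comparison
d_inf <- c(0.25, NA, 0.5, Inf)   # same, plus one infinite distance

# na.rm = TRUE removes NA/NaN only, so the first mean is finite
m_na <- mean(d, na.rm = TRUE)

# An Inf survives na.rm = TRUE and makes the whole mean infinite
m_inf <- mean(d_inf, na.rm = TRUE)

# Restricting to finite values drops both the NA and the Inf
m_finite <- mean(d_inf[is.finite(d_inf)])

print(c(m_na = m_na, m_inf = m_inf, m_finite = m_finite))
```

Applying `sum(is.na(x))` and `sum(is.infinite(x))` to the actual `leven` output would show whether the real data behave the same way.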
