PEP-4: General Discussion #8

@clnsmth

During our recent meeting, we discussed PEP-4 and raised several important points that are summarized here for further consideration:

Item 1: Representation of Date and Time Components

  • Issue: Date and time components should not be described as dateTime types in EML.

  • Proposal: Individual components of a date or time (e.g., year, hour) should be described with a numeric measurementScale (EML AttributeType) rather than the dateTime type. A numeric type allows a unit from the EML standard unit dictionary to be assigned to the component (e.g., nominalYear, nominalHour).

  • Rationale: A year value, for example, is neither a complete date nor a time. Using dateTime for these components goes against the schema definition for that type. Assigning them to a numeric type allows for correct unit specification.

  • Action: The supported list of formats used by the ECC and ezEML congruence checkers needs to be reviewed and updated accordingly.
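As a minimal sketch of the proposal in Item 1, a checker could route single-component formats to a numeric description and leave everything else as dateTime. The function name, the dictionary keys, and the returned structure are hypothetical and are not taken from ECC or ezEML; only the two units named above are mapped.

```python
# Hypothetical mapping from a single-component format string to an EML unit.
# Only the two units mentioned in the discussion are included here.
COMPONENT_UNITS = {
    "YYYY": "nominalYear",
    "hh": "nominalHour",
}


def describe_attribute(format_string):
    """Suggest how an attribute with this format string could be described.

    Returns a plain dict (an illustrative structure, not an EML API).
    """
    unit = COMPONENT_UNITS.get(format_string)
    if unit is not None:
        # A lone component is not a complete date or time: treat it as numeric.
        return {"measurementScale": "numeric", "standardUnit": unit}
    # Anything else is left as a dateTime with its declared format string.
    return {"measurementScale": "dateTime", "formatString": format_string}


print(describe_attribute("YYYY"))
print(describe_attribute("YYYY-MM-DD"))
```

A real implementation would need the reviewed list of supported formats to decide which strings count as single components.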

Item 2: Promoting Automation in Data Reading

  • Goal: We aim to promote automation for reading data to streamline usage by applications and researchers.

  • Challenge: This requires converting the format string declared in the EML into one that is compatible with the target application. This conversion can be complex, as shown by our experience with ECC and DEX applications.

  • Recommendation: ISO-8601 is widely recognized across many applications and remains a strong candidate for a date and time standard. However, we need to survey other commonly used formats to evaluate whether they should also be supported, while balancing automation needs with format flexibility.

  • Tentative Agreement: We should aim for a balance between enabling automation and extending support to widely used formats that may not be fully automatable.
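To illustrate the conversion challenge in Item 2, here is a rough sketch that translates a small subset of format strings into Python `strptime` directives. The token table and the function name `eml_to_strptime` are assumptions for illustration; this is not the conversion used by ECC or DEX, and it deliberately rejects anything it cannot translate.

```python
import re
from datetime import datetime

# Assumed token subset; a real converter would cover the full supported list.
_TOKEN_MAP = {
    "YYYY": "%Y", "MM": "%m", "DD": "%d",
    "hh": "%H", "mm": "%M", "ss": "%S",
}
_TOKEN_RE = re.compile("|".join(_TOKEN_MAP))


def eml_to_strptime(fmt):
    """Translate a format string into strptime syntax, or raise ValueError."""
    out = []
    pos = 0
    for match in _TOKEN_RE.finditer(fmt):
        out.append(_literal(fmt[pos:match.start()]))
        out.append(_TOKEN_MAP[match.group()])
        pos = match.end()
    out.append(_literal(fmt[pos:]))
    return "".join(out)


def _literal(text):
    # Letters other than the ISO separators "T" and "Z" are treated as
    # unsupported tokens (e.g. a month name), so the format is rejected.
    if any(ch.isalpha() and ch not in "TZ" for ch in text):
        raise ValueError(f"unsupported token in {text!r}")
    return text.replace("%", "%%")


directive = eml_to_strptime("YYYY-MM-DDThh:mm:ss")
print(directive)
print(datetime.strptime("2024-03-05T14:07:09", directive))
```

Formats that raise `ValueError` here are the ones that would fall on the "not fully automatable" side of the balance described above.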

Item 3: Zero-Padded Dates and Times

  • Current State: The current list of supported formats requires zero padding for date and time values. Some programs drop leading zeros when writing to file, but anecdotal evidence suggests that this doesn’t affect how readily the data can be read.

  • Action: This behavior should be verified, as it may have implications for format checking.
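One way to start verifying the behavior in Item 3 is to compare a strict zero-padding check against a lenient parser. The sketch below uses Python's `datetime.strptime` as one example reader, and the helper `is_zero_padded` is hypothetical. Other applications may behave differently, so this covers a single data point rather than the general case.

```python
import re
from datetime import datetime


def is_zero_padded(value, fmt):
    """Return True if value matches fmt with every component at full width."""
    # Each token becomes a fixed-width digit group of the same length.
    pattern = re.sub(
        "YYYY|MM|DD|hh|mm|ss",
        lambda m: r"\d{%d}" % len(m.group()),
        re.escape(fmt),
    )
    return re.fullmatch(pattern, value) is not None


padded = "2024-03-05"
unpadded = "2024-3-5"

print(is_zero_padded(padded, "YYYY-MM-DD"))
print(is_zero_padded(unpadded, "YYYY-MM-DD"))

# strptime reads both values as the same date when separators are present.
print(datetime.strptime(padded, "%Y-%m-%d") == datetime.strptime(unpadded, "%Y-%m-%d"))
```

Note that this leniency depends on separators: in a format without them (e.g., YYYYMMDD) a dropped zero makes the value ambiguous, which is one reason the format check may still need to flag it.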

Item 4: Case Sensitivity in Date and Time Formats

  • Observation: Data packages in production at the repository show significant variance in case usage for date and time formats (e.g., yyyy-MM-dd vs. YYYY-MM-DD).

  • Standard: ISO-8601 specifies that date components should use uppercase letters and time components should use lowercase letters for consistency across human and machine readers.

  • Discussion: We might not need to enforce strict case sensitivity, as context (e.g., MM in a date) can differentiate between date and time components without confusion.
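The context-based disambiguation mentioned in Item 4 could look roughly like the following sketch. The rule used here is an assumption, not an agreed policy: an "m" run counts as minutes when it directly follows an hour run or precedes a seconds run, and as a month otherwise. The function name `normalize_case` is hypothetical. The sketch also discards any meaning that case carries in other conventions (such as 12-hour versus 24-hour clocks).

```python
import re


def normalize_case(fmt):
    """Rewrite a format string with uppercase date and lowercase time tokens."""
    # Split into runs of one repeated letter, ignoring case (e.g. "yyyy", "Mm").
    runs = list(re.finditer(r"([a-z])\1*", fmt, flags=re.IGNORECASE))
    kinds = [m.group(1).lower() for m in runs]
    out = []
    pos = 0
    for i, match in enumerate(runs):
        out.append(fmt[pos:match.start()])
        text = match.group()
        kind = kinds[i]
        if kind in ("y", "d"):
            text = text.upper()
        elif kind in ("h", "s"):
            text = text.lower()
        elif kind == "m":
            prev_kind = kinds[i - 1] if i > 0 else ""
            next_kind = kinds[i + 1] if i + 1 < len(kinds) else ""
            is_minute = prev_kind == "h" or next_kind == "s"
            text = text.lower() if is_minute else text.upper()
        # Any other letter (such as the ISO "T" separator) is left unchanged.
        out.append(text)
        pos = match.end()
    out.append(fmt[pos:])
    return "".join(out)


print(normalize_case("yyyy-MM-dd"))
print(normalize_case("yyyy-mm-dd HH:MM:SS"))
```

Formats where an "m" run has no date or time neighbor at all (a lone "mm" column, for instance) remain ambiguous under this rule and default to month.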
