Case sensitivity issues with size magnitudes (KB vs kb) #83

@ryanhamel

There seems to be a bug in CycleCloud PBS that my customer was complaining about (CC version 8.7.1-3364 .... with pbspro):

Any user can make autoscaling stop working, i.e. no further jobs are handled, by specifying a memory size with capitalized units. For example:

$ echo sleep 300 | qsub -l select=1:ncpus=120:mem=4194304KB:mpiprocs=120

or

$ echo sleep 300 | qsub -l nodes=1:ppn=120 -l mem=4194304KB

OpenPBS handles this just fine, but /opt/cycle/pbspro/autoscale.log will have Python exceptions, and autoscaling will never proceed for the entire cluster.

The workaround is a manual qalter like this:
$ qalter -l select=1:ncpus=120:mem=4194304kb:mpiprocs=120 JOBID
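To build the corrected resource string for the qalter workaround above without retyping it, something like the following could be used. This is only a sketch: `lowercase_size_units` is a hypothetical helper, not part of CycleCloud or OpenPBS, and it assumes size values look like digits followed by a unit such as KB, MB or GB.

```python
import re

# Matches "=<digits><unit>" where the value ends at ':' or '+' or end of string.
# Units assumed here: b, kb, mb, gb, tb, pb and the word-based w, kw, mw, ...
_SIZE_VALUE = re.compile(r"(?<==)(\d+)([kmgtp]?[bw])(?=[:+]|$)", re.IGNORECASE)


def lowercase_size_units(resource_spec: str) -> str:
    """Return resource_spec with the unit suffix of size-like values lowercased.

    Hypothetical helper for illustration only; other values are left untouched.
    """
    return _SIZE_VALUE.sub(
        lambda m: m.group(1) + m.group(2).lower(), resource_spec
    )


if __name__ == "__main__":
    spec = "1:ncpus=120:mem=4194304KB:mpiprocs=120"
    # Prints the string to pass to: qalter -l select=... JOBID
    print(lowercase_size_units(spec))
```

Note that this would also lowercase a string-valued resource that happens to look like a size, so the result should be checked before running qalter.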


Traceback (most recent call last):
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/resource.py", line 154, in parse
    return HPCSize.value_of(expr)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/hpc/autoscale/hpctypes.py", line 106, in value_of
    return Size._value_of(Size, value)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/hpc/autoscale/hpctypes.py", line 142, in _value_of
    ).format(mag, mag)
RuntimeError: Unknown SizeMagnitude 'KB'. To register custom magnitudes, call hpc.autoscale.hpctypes.add_magnitude_conversion('KB', N), where N is the number of bytes.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/hpc/autoscale/clilib.py", line 1892, in main
    args.func(**kwargs)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/hpc/autoscale/clilib.py", line 436, in autoscale
    output_columns = output_columns or self._get_default_output_columns(config)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/hpc/autoscale/clilib.py", line 1077, in _get_default_output_columns
    default_cmd = self._default_output_columns(config, cmd_name)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/cli.py", line 76, in _default_output_columns
    env = self._pbs_env(driver)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/cli.py", line 110, in _pbs_env
    self.__pbs_env = environment.from_driver(pbs_driver.config, pbs_driver)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/environment.py", line 58, in from_driver
    jobs = pbs_driver.parse_jobs(queues, default_scheduler.resources_for_scheduling)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/driver.py", line 594, in parse_jobs
    self.pbscmd, self.resource_definitions, queues, resources_for_scheduling
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/driver.py", line 747, in parse_jobs
    rdict = parser.convert_resource_list(res_list)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/parser.py", line 38, in convert_resource_list
    ret["select"] = self.parse_select(str(raw_dict["select"]))
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/parser.py", line 101, in parse_select
    value = self.resource_definitions[key].type.parse(value)
  File "/opt/cycle/pbspro/venv/lib/python3.6/site-packages/pbspro/resource.py", line 157, in parse
    "Could not parse '{}' as type size (e.g. 1mb)".format(expr)
pbspro.resource.ResourceParsingError: Could not parse '4194304KB' as type size (e.g. 1mb)
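
The traceback suggests the magnitude lookup is case-sensitive. As an illustration of the behaviour I would expect, here is a minimal, standalone sketch of a size parser that lowercases the unit before looking it up. It is not the actual hpctypes code; `parse_size_bytes` and the `_MAGNITUDES` table are hypothetical names, and the table assumes the usual PBS convention that kb, mb, gb are powers of 1024.

```python
import re

# Hypothetical magnitude table: bytes per unit, keyed by lowercase suffix.
_MAGNITUDES = {
    "b": 1,
    "kb": 1024,
    "mb": 1024 ** 2,
    "gb": 1024 ** 3,
    "tb": 1024 ** 4,
    "pb": 1024 ** 5,
}

_SIZE_RE = re.compile(r"^\s*(\d+)\s*([a-zA-Z]*)\s*$")


def parse_size_bytes(expr: str) -> int:
    """Parse a size such as '4194304KB' or '1mb' into a number of bytes.

    The unit is lowercased first, so 'KB', 'Kb' and 'kb' are equivalent.
    A bare number is treated as bytes.
    """
    match = _SIZE_RE.match(expr)
    if match is None:
        raise ValueError("Could not parse {!r} as a size".format(expr))
    number, unit = match.groups()
    unit = unit.lower() or "b"
    if unit not in _MAGNITUDES:
        raise ValueError("Unknown size magnitude {!r} in {!r}".format(unit, expr))
    return int(number) * _MAGNITUDES[unit]


if __name__ == "__main__":
    # Both spellings from this issue parse to the same number of bytes.
    print(parse_size_bytes("4194304KB") == parse_size_bytes("4194304kb"))
```

Independently of how sizes are parsed, it would also help if one unparsable job were skipped with a logged warning rather than stopping autoscaling for every job in the cluster.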
