This repository was archived by the owner on Jun 4, 2024. It is now read-only.

[autoscaler] Documentation instructions for mounting EFS do not work when docker is specified #3

@jennakwon06

Description

The EFS instructions on this documentation page only work when docker is not specified: https://docs.ray.io/en/latest/cluster/aws-tips.html#

When docker is specified, the EFS-related commands in the setup_commands array run inside the Docker container and fail because sudo is not installed there.
I suggest updating the documentation page to state that the instructions only work when docker is not specified. It would also be great to include a working example for when a Docker container is used.

Sample yaml:

cluster_name: jkkwon_ray_test

min_workers: 5

max_workers: 10

upscaling_speed: 1.0

docker: 
    image: "048211272910.dkr.ecr.us-west-2.amazonaws.com/barsecrrepo-1cda8d0d3d9ee1867bae37291b6adc586a3f650c:308796b3-5c89-4a7c-83d0-5ce0abad3094_MiamiMLImage_main"
    container_name: "ray_container"
    pull_before_run: True
    run_options: []

idle_timeout_minutes: 5

provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2a,us-west-2b,us-west-2c,us-west-2d
    cache_stopped_nodes: False

auth:
    ssh_user: ubuntu
    ssh_private_key: miami_dev_dask_emr_key_pair.pem

head_node:
    InstanceType: r5.12xlarge
    ImageId: latest_dlami
    SecurityGroupIds:
        - "sg-08ed97f6d08d451f6"
    SubnetIds: [
        "subnet-02876545b671b57b0"
    ]
    # You can provision additional disk space with a conf as follows
    BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
              VolumeSize: 100
    KeyName: "miami_dev_dask_emr_key_pair"

worker_nodes:
    InstanceType: r5.12xlarge
    ImageId: latest_dlami
    SecurityGroupIds:
        - "sg-08ed97f6d08d451f6"
    SubnetIds: [
        "subnet-0180e9267b994bf97",  # us-west-2a, 8187 IP addresses. 10.0.32.0/19
        "subnet-073e6e0338bf209cb",  # us-west-2b, 8187 IP addresses. 10.0.64.0/19
        "subnet-03caa10b59288efae",  # us-west-2c, 8187 IP addresses. 10.0.96.0/19
        "subnet-06dd6dbb8caf5c310",  # us-west-2d, 8187 IP addresses. 10.0.128.0/19
    ]
    # Run workers on spot by default. Comment this out to use on-demand.
    InstanceMarketOptions:
        MarketType: spot
    KeyName: "miami_dev_dask_emr_key_pair"
    
file_mounts_sync_continuously: False

rsync_exclude:
    - "**/.git"
    - "**/.git/**"

rsync_filter:
    - ".gitignore"

initialization_commands:
    - aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 048211272910.dkr.ecr.us-west-2.amazonaws.com;

# List of shell commands to run to set up nodes.
setup_commands:
      - pip install -U https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-2.0.0.dev0-cp37-cp37m-manylinux2014_x86_64.whl
      - sudo kill -9 `sudo lsof /var/lib/dpkg/lock-frontend | awk '{print $2}' | tail -n 1`;
        sudo pkill -9 apt-get;
        sudo pkill -9 dpkg;
        sudo dpkg --configure -a;
        sudo apt-get -y install binutils;
        cd $HOME;
        git clone https://github.com/aws/efs-utils;
        cd $HOME/efs-utils;
        ./build-deb.sh;
        sudo apt-get -y install ./build/amazon-efs-utils*deb;
        cd $HOME;
        mkdir efs;
        sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-098309a3.efs.us-west-2.amazonaws.com:/ efs;
        sudo chmod 777 efs;    
        

head_setup_commands: []

worker_setup_commands: []

head_start_ray_commands:
    - ray stop
    - ray start --head --port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml

worker_start_ray_commands:
    - ray stop
    - ray start --address=$RAY_HEAD_IP:6379 --object-manager-port=8076

Sample output:



WARNING: You are using pip version 20.3.3; however, version 21.0 is available.
You should consider upgrading via the '/usr/local/bin/python3.7 -m pip install --upgrade pip' command.
Shared connection to 34.220.27.124 closed.
    (1/2) sudo kill -9 `sudo lsof /var/l...
bash: sudo: command not found
bash: sudo: command not found
bash: sudo: command not found
bash: sudo: command not found
bash: sudo: command not found
bash: sudo: command not found
Cloning into 'efs-utils'...
remote: Enumerating objects: 142, done.
remote: Counting objects: 100% (142/142), done.
remote: Compressing objects: 100% (74/74), done.
remote: Total 792 (delta 79), reused 100 (delta 51), pack-reused 650
Receiving objects: 100% (792/792), 234.63 KiB | 6.02 MiB/s, done.
Resolving deltas: 100% (462/462), done.
+ pwd
+ BASE_DIR=/root/efs-utils
+ BUILD_ROOT=/root/efs-utils/build/debbuild
+ VERSION=1.28.2
+ RELEASE=1
+ DEB_SYSTEM_RELEASE_PATH=/etc/os-release
+ UBUNTU18_REGEX=Ubuntu 18
+ UBUNTU20_REGEX=Ubuntu 20
+ DEBIAN11_REGEX=Debian GNU/Linux bullseye
+ echo Cleaning deb build workspace
Cleaning deb build workspace
+ rm -rf /root/efs-utils/build/debbuild
+ mkdir -p /root/efs-utils/build/debbuild
+ echo Creating application directories
Creating application directories
+ mkdir -p /root/efs-utils/build/debbuild/etc/amazon/efs
+ mkdir -p /root/efs-utils/build/debbuild/etc/init/
+ mkdir -p /root/efs-utils/build/debbuild/etc/systemd/system
+ mkdir -p /root/efs-utils/build/debbuild/sbin
+ mkdir -p /root/efs-utils/build/debbuild/usr/bin
+ mkdir -p /root/efs-utils/build/debbuild/var/log/amazon/efs
+ mkdir -p /root/efs-utils/build/debbuild/usr/share/man/man8
+ [ -f /etc/os-release ]
+ grep -e Ubuntu 18 -e Debian GNU/Linux bullseye+  -e Ubuntu 20
grep PRETTY_NAME /etc/os-release
+ echo PRETTY_NAME="Ubuntu 18.04.5 LTS"
PRETTY_NAME="Ubuntu 18.04.5 LTS"
+ echo Correcting python executable
Correcting python executable
+ sed -i -e s/python|python2/python3/ dist/amazon-efs-utils.control
+ sed -i -e 1 s/^.*$/\#!\/usr\/bin\/env python3/ src/watchdog/__init__.py
+ sed -i -e 1 s/^.*$/\#!\/usr\/bin\/env python3/ src/mount_efs/__init__.py
+ echo Copying application files
Copying application files
+ install -p -m 644 dist/amazon-efs-mount-watchdog.conf /root/efs-utils/build/debbuild/etc/init
+ install -p -m 644 dist/amazon-efs-mount-watchdog.service /root/efs-utils/build/debbuild/etc/systemd/system
+ install -p -m 444 dist/efs-utils.crt /root/efs-utils/build/debbuild/etc/amazon/efs
+ install -p -m 644 dist/efs-utils.conf /root/efs-utils/build/debbuild/etc/amazon/efs
+ install -p -m 755 src/mount_efs/__init__.py /root/efs-utils/build/debbuild/sbin/mount.efs
+ install -p -m 755 src/watchdog/__init__.py /root/efs-utils/build/debbuild/usr/bin/amazon-efs-mount-watchdog
+ echo Copying install scripts
Copying install scripts
+ install -p -m 755 dist/scriptlets/after-install-upgrade /root/efs-utils/build/debbuild/postinst
+ install -p -m 755 dist/scriptlets/before-remove /root/efs-utils/build/debbuild/prerm
+ install -p -m 755 dist/scriptlets/after-remove /root/efs-utils/build/debbuild/postrm
+ echo Copying control file
Copying control file
+ install -p -m 644 dist/amazon-efs-utils.control /root/efs-utils/build/debbuild/control
+ echo Copying conffiles
Copying conffiles
+ install -p -m 644 dist/amazon-efs-utils.conffiles /root/efs-utils/build/debbuild/conffiles
+ echo Copying manpages
Copying manpages
+ install -p -m 644 man/mount.efs.8 /root/efs-utils/build/debbuild/usr/share/man/man8/mount.efs.8
+ echo Creating deb binary file
Creating deb binary file
+ echo 2.0
+ echo Setting permissions
Setting permissions
+ find /root/efs-utils/build/debbuild -type d
+ xargs chmod 755
+ echo Creating tar
Creating tar
+ cd /root/efs-utils/build/debbuild
+ tar czf control.tar.gz control conffiles postinst prerm postrm --owner=0 --group=0
+ tar czf data.tar.gz etc sbin usr var --owner=0 --group=0
+ cd /root/efs-utils
+ echo Building deb
Building deb
+ DEB=/root/efs-utils/build/debbuild/amazon-efs-utils-1.28.2-1_all.deb
+ ar r /root/efs-utils/build/debbuild/amazon-efs-utils-1.28.2-1_all.deb /root/efs-utils/build/debbuild/debian-binary
ar: creating /root/efs-utils/build/debbuild/amazon-efs-utils-1.28.2-1_all.deb
+ ar r /root/efs-utils/build/debbuild/amazon-efs-utils-1.28.2-1_all.deb /root/efs-utils/build/debbuild/control.tar.gz
+ ar r /root/efs-utils/build/debbuild/amazon-efs-utils-1.28.2-1_all.deb /root/efs-utils/build/debbuild/data.tar.gz
+ echo Copying deb to output directory
Copying deb to output directory
+ cp /root/efs-utils/build/debbuild/amazon-efs-utils-1.28.2-1_all.deb build/
bash: sudo: command not found
bash: sudo: command not found
bash: sudo: command not found
Shared connection to 34.220.27.124 closed.
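
One possible shape for the requested Docker example, as an untested sketch rather than a confirmed fix: mount EFS on the host and bind-mount the directory into the container. This assumes that initialization_commands run on the host before the container starts, that the host image already has an NFS client, and that each string in run_options is passed through to `docker run`. The container path /root/efs is a hypothetical choice, based on the log above showing $HOME as /root inside the container.

```yaml
# Sketch only: changed keys shown; everything else stays as in the sample yaml.
docker:
    # Assumption: the host mount point is /home/ubuntu/efs (ssh_user is ubuntu).
    # /root/efs inside the container is a hypothetical target path.
    run_options:
        - "-v /home/ubuntu/efs:/root/efs"

# Assumption: these commands run on the host, where sudo is available.
initialization_commands:
    - aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 048211272910.dkr.ecr.us-west-2.amazonaws.com;
    - mkdir -p $HOME/efs
    # Skip the mount if the directory is already a mount point (e.g. on re-run).
    - "mountpoint -q $HOME/efs || sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-098309a3.efs.us-west-2.amazonaws.com:/ $HOME/efs"
    - sudo chmod 777 $HOME/efs

# The EFS commands are removed from setup_commands, which run inside the container.
setup_commands:
    - pip install -U https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-2.0.0.dev0-cp37-cp37m-manylinux2014_x86_64.whl
```

With this layout nothing inside the container needs sudo or mount privileges, since the container only sees an ordinary bind-mounted directory.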

Labels: documentation, enhancement
