42 commits
39db78d
Include base ROOT image with Madgraph installed (#298)
Soap2G Dec 5, 2024
76448a0
Update version of image
Soap2G Dec 5, 2024
3eb4534
Temporary set latest tag
Soap2G Dec 5, 2024
dea6a29
Conditional poststarthook creation (#299)
Soap2G Dec 6, 2024
fc7ec02
Update description of environment
Soap2G Dec 6, 2024
d06358d
Changing pull policy to cache images (#300)
Soap2G Dec 6, 2024
c68b70f
Update jhub-release.yaml
Soap2G Dec 9, 2024
c69721c
Update jhub-release.yaml
Soap2G Dec 9, 2024
fc52aac
Update links in README.md (#301)
garciagenrique Dec 12, 2024
8103d41
flix conflicts
garciagenrique Dec 13, 2024
350d069
Add def env (#302)
garciagenrique Dec 13, 2024
259f8a4
Merge branch 'main' of github.com:vre-hub/vre
garciagenrique Dec 13, 2024
c80d287
fix rucio-root-client
garciagenrique Jan 20, 2025
cfc1beb
change rses.txt file with working rses (#304)
garciagenrique Jan 20, 2025
b1c6e7d
upgrade fts servers URL
garciagenrique Jan 21, 2025
6197aee
add rucio-iam-connected-client pod (#303)
garciagenrique Jan 23, 2025
2b91044
Update daemons schema
garciagenrique Jan 23, 2025
7d3d264
change conveyor usercert to /opt/proxy path
garciagenrique Jan 29, 2025
c3d15e1
change num of daemons to 1 count for easier debug
garciagenrique Jan 29, 2025
25fcc54
improve verbosity and loops of rucio noise container (#305)
garciagenrique Jan 29, 2025
2d391b1
update version rucio-noise-pod-and-rucio-ewp2c01
garciagenrique Jan 29, 2025
929f8b9
forgot to end if in produce_noise.sh (#306)
garciagenrique Jan 29, 2025
9144522
uncomment line (#307)
garciagenrique Jan 30, 2025
65d7634
upgrade rucio noise container version
garciagenrique Jan 30, 2025
a59acb2
WIP: add ingress for CERN prometheus configuration (#278)
garciagenrique Jan 30, 2025
e9ab0e9
test jhub helm release
garciagenrique Jan 30, 2025
d6a714d
jhub release stuck on weird revision, returning to normal resources
garciagenrique Jan 30, 2025
fa09652
start MONIT deployment on the VRE
garciagenrique Jan 30, 2025
6a6286f
first version of monit helm charts
garciagenrique Jan 30, 2025
37c5b56
test WDF dev environment
garciagenrique Feb 17, 2025
f8ace98
updated jhub with latests environment images
garciagenrique Feb 18, 2025
6b6c873
update servers, auth and ui grid host certificates
garciagenrique Feb 26, 2025
d60ab4f
update publications on readme
garciagenrique Mar 5, 2025
aa3a9ad
increase num of reapers and hermes to 3
garciagenrique Mar 5, 2025
a390934
forgot to push daemons new fts certs
garciagenrique Mar 12, 2025
3c6c0d9
add label component: singleuser-server to singleuser
garciagenrique Mar 12, 2025
848ad23
increase rucio servers and UI limits and jupyter memory
garciagenrique Mar 14, 2025
453ce08
reduce jhub user resources
garciagenrique Mar 17, 2025
e9cda83
fix: change user attributes and user account type on the sync-rucio-i…
garciagenrique Mar 24, 2025
bcc7fb3
change rucio-iam-sync container to newer version and update rucio-noi…
garciagenrique Mar 24, 2025
17039a7
Merge branch 'main' into monit_helm
garciagenrique Mar 25, 2025
321b987
continue dev of monit release
garciagenrique Mar 25, 2025
2 changes: 2 additions & 0 deletions .github/workflows/merge-check-paths.yml
@@ -5,10 +5,12 @@ on:
pull_request:
paths:
- 'infrastructure/cluster/flux/**'
- '**.tf'
push:
branches:
- main
paths:
- '**.tf'
- 'infrastructure/cluster/flux/**'

jobs:
9 changes: 6 additions & 3 deletions README.md
@@ -10,9 +10,11 @@ VRE links:
- Code: https://github.com/vre-hub/vre/
- User documentation: https://vre-hub.github.io/
- Technical documentation: https://github.com/vre-hub/vre/wiki
- :construction: Ongoing migration: https://vre-hub.github.io/docs/tech-docs/home
- VRE file transfer monitoring: https://monit-grafana-open.cern.ch/d/PJ65OqBVz/vre-rucio-events?orgId=16
- Live status of the VRE services: https://vre-hub.github.io/status/
- VRE Slack channel: [invitation link](https://join.slack.com/t/eosc-escape/shared_invite/zt-1zd76ivit-Z2A2nszN0qfn4VF6Uk6UrQ).
- ESCAPE Mattermost Team: [invitation link](https://mattermost.web.cern.ch/signup_user_complete/?id=zqaa9p5fqfd9bnnc64at4b5aye&md=link&sbr=su).
- :exclamation: Afterwards please join the `VRE Support` channel


[![flux check pipeline](https://github.com/vre-hub/vre/actions/workflows/merge-check-paths.yml/badge.svg)](https://github.com/vre-hub/vre/actions/workflows/merge-check-paths.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
@@ -26,9 +28,10 @@ VRE links:

To cite us, please use the latest publication:

- The Virtual Research Environment: towards a comprehensive analysis platform - [arXiv link](https://arxiv.org/abs/2305.10166).
- Data discovery, analysis and reproducibility in Virtual Research Environments. CHEP 2024 proceedings - [arXiv link](https://arxiv.org/abs/2503.02483).
- The Virtual Research Environment: A multi-science analysis platform. CHEP 2023 Proceedings - [EPJ Web of Conf.](https://doi.org/10.1051/epjconf/202429508023) and [arXiv link](https://arxiv.org/abs/2305.10166).


## Contact

Email the CERN VRE team: `escape-cern-ops'at'cern.ch`
Email the CERN VRE team: `escape-cern-ops'at'cern.ch`
9 changes: 4 additions & 5 deletions containers/iam-rucio-sync/sync_iam_rucio.py
@@ -135,7 +135,7 @@ def sync_accounts(self, iam_users):

if not account.account_exists(InternalAccount(username)):
account.add_account(InternalAccount(username),
AccountType.SERVICE, email)
AccountType.USER, email)
logging.debug(
'Created account for User {} ***'.format(username))

@@ -144,10 +144,9 @@ def sync_accounts(self, iam_users):
set_local_account_limit(InternalAccount(username),
rse_obj['id'], 1000000000000)

# Make the user an admin & able to sign URLs
# Make the user able to sign URLs
# Admins are added by hand
try:
add_account_attribute(InternalAccount(username), 'admin',
'True')
add_account_attribute(InternalAccount(username), 'sign-gcs',
'True')
except Exception as e:
@@ -300,4 +299,4 @@ def make_gridmap_compatible(self, certificate):
logging.info("Synchronising X509 identities")
syncer.sync_x509(iam_users)

logging.info("IAM -> RUCIO synchronization successfully completed.")
logging.info("IAM -> RUCIO synchronization successfully completed.")
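The net effect of the `sync_accounts` change above can be sketched in plain Python. This is only an illustration under stated assumptions: it does not import Rucio, and `FakeAccountStore`, `sync_user`, the local `AccountType` enum and the example user are hypothetical stand-ins for the real Rucio calls in the diff. It shows that a synced IAM user is now created as a `USER` account and receives the `sign-gcs` attribute but no `admin` attribute.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccountType(Enum):
    # Mirrors the two account types named in the diff; not Rucio's own enum.
    USER = "USER"
    SERVICE = "SERVICE"


@dataclass
class FakeAccountStore:
    """Hypothetical in-memory stand-in for the Rucio account backend."""
    accounts: dict = field(default_factory=dict)

    def account_exists(self, name):
        return name in self.accounts

    def add_account(self, name, account_type, email):
        self.accounts[name] = {"type": account_type, "email": email, "attributes": {}}

    def add_account_attribute(self, name, key, value):
        self.accounts[name]["attributes"][key] = value


def sync_user(store, username, email):
    """Create the account if it is missing and grant only URL-signing rights."""
    if not store.account_exists(username):
        store.add_account(username, AccountType.USER, email)
    # Admin rights are no longer granted here; admins are added by hand.
    store.add_account_attribute(username, "sign-gcs", "True")
    return store.accounts[username]


store = FakeAccountStore()
account = sync_user(store, "jdoe", "jdoe@example.org")
print(account["type"].name, sorted(account["attributes"]))
```

Keeping the `admin` attribute out of the automated path means that privilege escalation always requires a manual step.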
42 changes: 24 additions & 18 deletions containers/rucio-noise/produce_noise.sh
@@ -21,32 +21,38 @@ echo '* RUCIO_SCOPE = '"$RUCIO_SCOPE"''
echo '* FILE_LIFETIME = '"$FILE_LIFETIME"''

upload_and_transfer_and_delete () {

for (( i=0; i<$len; i++ )); do

if [ $1 != $i ]; then
echo '*** ======================================================================== ***'
echo '*** '"${rses[$i]}"' ***'

RANDOM_STRING=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
echo '*** generated random file identifier: '"$RANDOM_STRING"' ***'
filename=/home/auto_uploaded_${RANDOM_STRING}_source${rses[$i]}
did=auto_uploaded_${RANDOM_STRING}_source${rses[$i]}

echo '*** generating '"$FILE_SIZE"' file on local storage ***'
head -c $FILE_SIZE < /dev/urandom > $filename
echo '*** filename: '"$filename"' ***'

echo '*** uploading filename: '"$filename"' to '"${rses[$i]}"' ***'
rucio -v upload --rse ${rses[$i]} --lifetime $FILE_LIFETIME --scope $RUCIO_SCOPE $filename

for (( j=0; j<$len; j++ )); do

echo '*** ======================================================================== ***'
if [ $i != $j ]; then

RANDOM_STRING=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
echo '*** generated random file identifier: '"$RANDOM_STRING"' ***'
filename=/home/auto_uploaded_${RANDOM_STRING}_source${rses[$1]}
did=auto_uploaded_${RANDOM_STRING}_source${rses[$1]}

echo '*** generating '"$FILE_SIZE"' file on local storage ***'
head -c $FILE_SIZE < /dev/urandom > $filename
echo '*** filename: '"$filename"''
echo '*** adding rule from '"${rses[$i]}"' to '"${rses[$j]}"' ***'
rucio -v add-rule --lifetime $FILE_LIFETIME --activity "Functional Test" $RUCIO_SCOPE:$did 1 ${rses[$j]}

echo '*** uploading to rse '"${rses[$1]}"' and adding rule to rse '"${rses[$i]}"''
rucio -v upload --rse ${rses[$1]} --lifetime $FILE_LIFETIME --scope $RUCIO_SCOPE $filename && rucio add-rule --lifetime $FILE_LIFETIME --activity "Functional Test" $RUCIO_SCOPE:$did 1 ${rses[$i]}
fi

#echo 'sleeping' sleep 3600
done

echo '*** removing all replicas and dids associated to from rse '"${rses[$1]}"' and adding rule to rse '"${rses[$i]}"''
echo '*** testing if `rucio erase` is able to remove all the replicas too ***'
rucio -v erase $RUCIO_SCOPE:$did
echo '*** Uploaded files and replicas should disappear after '${FILE_LIFETIME}' seconds ***'
# echo '*** Otherwise do a `rucio -v erase $RUCIO_SCOPE:$did` ***'

rm -f $filename
fi
done
}
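Because the added and removed lines are interleaved above, the control flow of the reworked function is hard to follow. As far as the diff can be read, each RSE in turn receives one freshly uploaded file, and a replication rule is then added towards every other RSE. The Python sketch below only models that ordering; `plan_noise` and the RSE names are hypothetical, and no Rucio command is executed.

```python
def plan_noise(rses):
    """Return the ordered actions for one noise round over the given RSEs."""
    plan = []
    for source in rses:
        # One upload per source RSE (outer loop over i in the shell script).
        plan.append(("upload", source))
        for destination in rses:
            # One rule per other RSE (inner loop over j, skipping i == j).
            if destination != source:
                plan.append(("add-rule", source, destination))
    return plan


plan = plan_noise(["RSE-A", "RSE-B", "RSE-C"])
for step in plan:
    print(step)
```

For `n` RSEs this yields `n` uploads and `n * (n - 1)` rules per round, which is worth keeping in mind when adding entries to `rses.txt`.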

3 changes: 1 addition & 2 deletions containers/rucio-noise/rses.txt
@@ -1,5 +1,4 @@
CERN-EOS
CESNET-S3
CERN-EOSPILOT
CNAF-STORM
CC-DCACHE
PIC-DCACHE
40 changes: 20 additions & 20 deletions infrastructure/cluster/flux/jhub/jhub-configmap-profiles.yaml
@@ -8,24 +8,32 @@ data:
singleuser:
profileList:
- display_name: "Default environment"
description: "Based on a scipy notebook environment with a python-3.11 kernel, the rucio jupyterlab extension and the reana client installed."
description: "Based on a scipy notebook environment with a python-3.11 kernel, the Rucio jupyterlab extension and the Reana client installed."
default: true
- display_name: "Default environment - python 3.8"
description: "Same environment as the default one except for a python-3.8 kernel installed. This environment will be deprecated soon."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-py38:sha-7ed7d80
- display_name: "Default environment - python 3.9"
description: "Same environment as the default one except for a python-3.9 kernel installed. This environment will be deprecated soon."
- display_name: "ROOT Higgs 2024 environment"
description: "ROOT v6.32.04, and a python-3.11 kernel."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser:sha-423e01a
image: ghcr.io/vre-hub/vre-singleuser-root-base:latest
- display_name: "ROOT environment"
description: "ROOT v6.26.10 as well as a ROOT C++ and a python-3.8 kernel."
description: "Legacy ROOT v6.26.10 as well as a ROOT C++ and a python-3.8 kernel."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-root:sha-c94d95a
- display_name: "VIRGO - WDF environment"
description: "Contains the full WDF v2.2.1 environment - Python 3.9 kernel."
description: "Contains the full WDF v2.2.3 environment and a Python 3.11 kernel."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-wdf:sha-ba497d3
- display_name: "Python 3.11 environment"
description: "quay.io/jupyter/scipy-notebook:python-3.11 image"
kubespawner_override:
image: gitlab-registry.in2p3.fr/escape2020/virtual-environment/docker-images/datalake-singleuser-wdf:cd832522
image: quay.io/jupyter/scipy-notebook:python-3.11.8
- display_name: "Default environment - python 3.9"
description: "Same environment as the default one except for a python-3.9 kernel installed. This environment will be deprecated soon."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser:sha-423e01a
- display_name: "Default environment - python 3.8"
description: "Same environment as the default one except for a python-3.8 kernel installed. This environment will be deprecated soon."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-py38:sha-7ed7d80
- display_name: "KM3Net Science Project environment"
description: "Contains gammapy=1.1, km3irf and km3net-testdata libraries - Python 3.9 kernel."
kubespawner_override:
@@ -38,19 +46,11 @@ data:
description: "Contains the MLFermiLATDwarfs and fermitools libraries - Python 3.9 kernel."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-microomega:sha-5cbf4f4
- display_name: "DEV environment"
- display_name: "VRE DEV environment"
description: "Development environment with various tools installed."
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-dev:latest
- display_name: "Reana DEV environment"
description: "For testing purposes"
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-reana-dev:latest
- display_name: "Zenodo extension DEV environment"
description: "For testing purposes"
kubespawner_override:
image: ghcr.io/vre-hub/vre-singleuser-zen_ext-dev:latest
- display_name: "Python 3.11 environment"
description: "quay.io/jupyter/scipy-notebook:python-3.11 image"
kubespawner_override:
image: quay.io/jupyter/scipy-notebook:python-3.11.8
49 changes: 28 additions & 21 deletions infrastructure/cluster/flux/jhub/jhub-release.yaml
@@ -119,33 +119,40 @@ spec:
c.GenericOAuthenticator.enable_auth_state = True

singleuser:
extraLabels:
component: singleuser-server
defaultUrl: "/lab"
# The lifecycle hooks are used to create the Rucio configuration file,
# and the token file by copying the REFRESH_TOKEN from the environment variable to the token file.
startTimeout: 600
startTimeout: 1200
lifecycleHooks:
postStart:
exec:
command:
- "sh"
- "-c"
- >
mkdir -p /certs /tmp;
echo -n $RUCIO_ACCESS_TOKEN > /tmp/rucio_oauth.token;
echo -n "oauth2:${EOS_ACCESS_TOKEN}:iam-escape.cloud.cnaf.infn.it/userinfo" > /tmp/eos_oauth.token;
chmod 0600 /tmp/eos_oauth.token;
mkdir -p /opt/rucio/etc;
echo "[client]" >> /opt/rucio/etc/rucio.cfg;
echo "rucio_host = https://vre-rucio.cern.ch" >> /opt/rucio/etc/rucio.cfg;
echo "auth_host = https://vre-rucio-auth.cern.ch" >> /opt/rucio/etc/rucio.cfg;
echo "ca_cert = /certs/rucio_ca.pem" >> /opt/rucio/etc/rucio.cfg;
echo "account = $JUPYTERHUB_USER" >> /opt/rucio/etc/rucio.cfg;
echo "auth_type = oidc" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_audience = rucio" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_polling = true" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_issuer = escape" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_scope = openid profile offline_access" >> /opt/rucio/etc/rucio.cfg;
echo "auth_token_file_path = /tmp/rucio_oauth.token" >> /opt/rucio/etc/rucio.cfg;
- |
if [ "${SKIP_POSTSTART_HOOK}" = "true" ]; then
echo "hello world";
else
mkdir -p /certs /tmp;
echo -n $RUCIO_ACCESS_TOKEN > /tmp/rucio_oauth.token;
echo -n "oauth2:${EOS_ACCESS_TOKEN}:iam-escape.cloud.cnaf.infn.it/userinfo" > /tmp/eos_oauth.token;
chmod 0600 /tmp/eos_oauth.token;
mkdir -p /opt/rucio/etc;
echo "[client]" >> /opt/rucio/etc/rucio.cfg;
echo "rucio_host = https://vre-rucio.cern.ch" >> /opt/rucio/etc/rucio.cfg;
echo "auth_host = https://vre-rucio-auth.cern.ch" >> /opt/rucio/etc/rucio.cfg;
echo "ca_cert = /certs/rucio_ca.pem" >> /opt/rucio/etc/rucio.cfg;
echo "account = $JUPYTERHUB_USER" >> /opt/rucio/etc/rucio.cfg;
echo "auth_type = oidc" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_audience = rucio" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_polling = true" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_issuer = escape" >> /opt/rucio/etc/rucio.cfg;
echo "oidc_scope = openid profile offline_access" >> /opt/rucio/etc/rucio.cfg;
echo "auth_token_file_path = /tmp/rucio_oauth.token" >> /opt/rucio/etc/rucio.cfg;
fi;

networkPolicy:
enabled: false
storage:
@@ -188,9 +195,9 @@ spec:
# operator: Equal
# value: singleuser
# effect: NoSchedule
# memory:
# limit: 3.5G #4G
# guarantee: 3G #2G
memory:
limit: 2G #4G
guarantee: 1G #2G

cmd: null
extraEnv:
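The `postStart` hook in this file builds `/opt/rucio/etc/rucio.cfg` one `echo ... >>` line at a time. As a rough illustration of the file it produces, the sketch below renders the same `[client]` section with Python's standard `configparser`. The function name `render_rucio_cfg` and the example account are hypothetical; the keys and values are copied from the hook, with `$JUPYTERHUB_USER` passed in as a parameter.

```python
import configparser
import io


def render_rucio_cfg(account, token_path="/tmp/rucio_oauth.token"):
    """Build the [client] section that the postStart hook writes line by line."""
    cfg = configparser.ConfigParser()
    cfg["client"] = {
        "rucio_host": "https://vre-rucio.cern.ch",
        "auth_host": "https://vre-rucio-auth.cern.ch",
        "ca_cert": "/certs/rucio_ca.pem",
        "account": account,  # $JUPYTERHUB_USER in the hook
        "auth_type": "oidc",
        "oidc_audience": "rucio",
        "oidc_polling": "true",
        "oidc_issuer": "escape",
        "oidc_scope": "openid profile offline_access",
        "auth_token_file_path": token_path,
    }
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


text = render_rucio_cfg("jdoe")
print(text)
```

Since the hook appends with `>>`, running it a second time against an existing file would repeat the `[client]` section. Rendering the whole file in one step, as in this sketch, would be one way to keep the result the same on every run.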
3 changes: 3 additions & 0 deletions infrastructure/cluster/flux/monit/README.md
@@ -0,0 +1,3 @@
# CERN VRE Monitoring

The CERN VRE uses the upstream [CERN k8s Monitoring Helm Chart](https://gitlab.cern.ch/monitoring/helm-charts/kubernetes-monitoring), collecting metrics and logs, and forwarding them to the CERN MONIT infrastructure.
8 changes: 8 additions & 0 deletions infrastructure/cluster/flux/monit/monit-helm_repository.yaml
@@ -0,0 +1,8 @@
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
name: cern-monit
namespace: monit
spec:
interval: 10m
url: oci://registry.cern.ch/monit/cern-it-monitoring-kubernetes
6 changes: 6 additions & 0 deletions infrastructure/cluster/flux/monit/monit-namespace.yaml
@@ -0,0 +1,6 @@
kind: Namespace
apiVersion: v1
metadata:
name: monit
labels:
name: monit
37 changes: 37 additions & 0 deletions infrastructure/cluster/flux/monit/monit-release.yaml
@@ -0,0 +1,37 @@
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: cern-monit
namespace: monit

spec:
releaseName: cern-monit
interval: 5m
chart:
spec:
sourceRef:
kind: OCIRepository
name: cern-monit
namespace: monit
chart: cern-it-monitoring-kubernetes
interval: 5m
version: 2.1.0 # Monit releases on CERN harbor site
valuesFrom:
- kind: Secret
name: monit-escape-vre-tenant
valuesKey: password
targetPath: tenant.password

values:
tenant:
name: escape-vre

kubernetes:
clusterName: vre

# All the rest comes from the upstream MONIT charts
metrics:
enabled: true

logs:
enabled: false # We would need to enable this