
PVFS Filesystem Setup enables a distributed file system for high-performance computing by configuring metadata, I/O servers, and clients. It ensures parallel I/O, scalability, and fault tolerance.


Surajkumar4-source/PVFS_Filesysytem_Guide


PVFS Filesystem Setup (Server and Client Implementation)

1. Introduction to PVFS (Parallel Virtual File System):

  • PVFS is a distributed file system designed to provide high-performance storage for parallel computing environments. It enables multiple servers to work together to handle large amounts of data, ensuring that the system scales efficiently with increasing workload demands.

Key Features:

  • High Scalability: PVFS can scale from a few nodes to thousands, offering seamless growth for data-intensive applications.

  • Parallel I/O: It allows for simultaneous data access across multiple nodes, improving performance for large-scale applications.

  • Fault Tolerance: Built-in mechanisms ensure data integrity and availability in case of node failures.

  • Clustered Architecture: PVFS operates on a client-server model where the server provides data storage, and clients access the data through a mounted filesystem.

2. Key Components in PVFS:

  • Metadata Server (MDS): Manages file metadata, such as file names, directories, and file attributes. It doesn't store file data but instead keeps track of where data resides on I/O servers.

  • I/O Server (OSS): Responsible for the actual storage of the file data. Each file's data is distributed across multiple I/O servers to improve parallel access.

  • Client: Provides the interface to access the file system. The client interacts with the metadata and I/O servers to read and write data.

3. Workflow in PVFS:

  • File Access: When a client requests a file, it first contacts the metadata server to learn where the file's data is located. The client then communicates directly with the appropriate I/O servers, where the actual data is stored.

  • Data Distribution: PVFS distributes file data across multiple I/O servers, ensuring load balancing and high throughput. The data is typically split into chunks and stored on different servers.

  • Concurrency: Multiple clients can access the same file simultaneously, improving the overall performance. This parallel access is one of the core strengths of PVFS.
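The data distribution described above can be illustrated with a little arithmetic. This is a minimal sketch of round-robin striping, not OrangeFS's actual distribution code: the function name server_for_offset, the 64 KiB strip size, and the count of 4 I/O servers are all assumptions chosen for the example.

```shell
#!/bin/sh
# Illustrative round-robin striping: which I/O server holds a given byte offset?
# strip_size and num_servers are assumed values, not read from a real deployment.
server_for_offset() {
    offset="$1"; strip_size="$2"; num_servers="$3"
    echo $(( (offset / strip_size) % num_servers ))
}

# Example: 64 KiB strips spread across 4 hypothetical I/O servers.
for off in 0 65536 131072 200000 262144; do
    echo "offset $off -> server $(server_for_offset "$off" 65536 4)"
done
```

Consecutive strips land on consecutive servers, which is why a large sequential read can be served by several I/O servers at once.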

4. PVFS Installation and Configuration Steps:

  • Server-Side Setup:

    • Configure repositories, disable firewalls, and install necessary dependencies.

    • Install the OrangeFS server software (OrangeFS is the successor to PVFS2 and is the implementation used in this guide).

    • Configure the storage (extra HDD) and set up directories for storing data and metadata.

    • Enable and start the OrangeFS server on the server machine.

  • Client-Side Setup:

    • Install the OrangeFS client on the client machine, ensuring it can interact with the server.

    • Mount the shared file system on the client and verify connectivity to the server.

    • Test I/O operations by creating test files on the mounted directory.

5. Verification and Testing:

  • After the setup, verifying the configuration is crucial:

    • Server Verification: Ensure the server is up and running, check that the file system is mounted correctly, and confirm connectivity with the pvfs2-ping command.

    • Client Verification: Ensure the client can mount the file system, communicate with the server, and perform read/write operations. Use test files to confirm proper functionality.

6. Use Cases of PVFS:

  • High-Performance Computing (HPC): PVFS is widely used in HPC environments, where large amounts of data need to be processed simultaneously by multiple computational nodes.

  • Scientific Research: Research applications that require large data sets (e.g., simulations, data analysis) benefit from the parallel access and fault tolerance provided by PVFS.

  • Big Data: PVFS is well-suited for Big Data applications that require efficient storage and retrieval of large datasets.

Conclusion:

PVFS provides an effective solution for distributed, high-performance storage in environments requiring scalable and fault-tolerant file systems. By distributing data across multiple nodes, PVFS maximizes parallelism and throughput, making it a popular choice in HPC and Big Data applications.



Prerequisites for PVFS Implementation

  1. System Requirements:

    • CentOS 8 OS.
    • Server with an extra HDD and client with sufficient storage.
    • Static IPs for all nodes in the same network.
  2. Software Requirements:

    • Access CentOS vault and ELRepo repositories.
    • Install on the server: orangefs, orangefs-server.
    • Install on the client: orangefs, orangefs-fuse.
    • Latest kernel from ELRepo.
  3. User Privileges:

    • Root access on both server and client nodes.
  4. Network Setup:

    • Set hostnames (ofs-srv-1, ofs-client-1).
    • Update /etc/hosts with static IPs.
  5. Security:

    • Disable firewall and SELinux.
  6. Storage:

    • Attach, format, and mount additional HDD on the server.
  7. Tools:

    • Install nano and epel-release. The pvfs2-ping tool is provided by the orangefs package, and dd is normally already present as part of coreutils.

  8. Connectivity:

    • Verify communication between server and client nodes.


*********** Implementation Steps ************



1. Prepare Server and Client

  • Server Configuration:

    • Add the extra HDD.
    • Set the hostname to ofs-srv-1.
    sudo hostnamectl set-hostname ofs-srv-1
  • Client Configuration:

    • Set the hostname to ofs-client-1.
    sudo hostnamectl set-hostname ofs-client-1

2. Hosts File Configuration (For Both Server and Client)

  • Update /etc/hosts on both the server and client to add an entry for each host, replacing 192.168.1.x with the actual IP addresses:
192.168.1.x ofs-srv-1
192.168.1.x ofs-client-1
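
If you prefer to script the hosts file update, the following sketch appends an entry only when the hostname is not already listed, so it can be run more than once. The helper name add_host_entry and the addresses 192.168.1.10 and 192.168.1.11 are placeholders invented for this example; the demo writes to a scratch file rather than the real /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME: append "IP NAME" unless NAME is already listed.
add_host_entry() {
    file="$1"; ip="$2"; name="$3"
    if grep -Eq "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        echo "skip: $name already present in $file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "added: $ip $name"
    fi
}

# Demonstrate on a scratch file; the IP addresses are placeholders.
scratch="$(mktemp)"
add_host_entry "$scratch" 192.168.1.10 ofs-srv-1
add_host_entry "$scratch" 192.168.1.11 ofs-client-1
add_host_entry "$scratch" 192.168.1.10 ofs-srv-1   # second call is a no-op
cat "$scratch"
```

On a real node you would pass /etc/hosts as the first argument and run the script as root.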

*********** Implementation Setup ************

Server Implementation:

1. YUM Repository Configuration:

  • Run the following to configure the YUM repositories:
cd /etc/yum.repos.d/
sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-* 
sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*
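
To see what the two sed substitutions do before touching the real repository files, you can try them on a scratch file first. This is a sketch only: the helper name point_repo_at_vault is made up for the example, and the sample stanza merely imitates a stock CentOS 8 repo file.

```shell
#!/bin/sh
# point_repo_at_vault FILE: apply the same two substitutions to a single file.
point_repo_at_vault() {
    sed -i -e 's/mirrorlist/#mirrorlist/g' \
           -e 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' "$1"
}

# Scratch file imitating a CentOS 8 repo stanza (assumed content, for illustration).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
[baseos]
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=BaseOS
#baseurl=http://mirror.centos.org/$contentdir/$releasever/BaseOS/$basearch/os/
EOF
point_repo_at_vault "$sample"
cat "$sample"
```

The mirrorlist line ends up commented out and the baseurl line becomes active and points at vault.centos.org.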

2. Disable Firewall:

  • Stop and disable firewalld:
systemctl stop firewalld
systemctl disable firewalld

3. Install Dependencies:

  • Install necessary packages:
yum install -y nano
yum -y install epel-release

4. Disable SELinux:

  • Edit the SELinux configuration to disable it. The change takes effect after the next reboot:

nano /etc/selinux/config

# Change SELINUX=enforcing to SELINUX=disabled
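
As a non-interactive alternative to editing the file in nano, the same change can be made with sed. This is a sketch under the assumption that the file contains a line starting with SELINUX=; the helper name disable_selinux_in is invented here, and the demo runs against a scratch copy.

```shell
#!/bin/sh
# disable_selinux_in FILE: set the SELINUX= line to "disabled".
# The pattern is anchored on "SELINUX=" so the SELINUXTYPE= line is left alone.
disable_selinux_in() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Demonstrate on a scratch file instead of /etc/selinux/config.
sample="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$sample"
disable_selinux_in "$sample"
cat "$sample"
```

On the real system you would pass /etc/selinux/config as the argument and run it as root.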

5. Add ELRepo for Kernel Installation:

  • Add the ELRepo repository:
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
echo "[elrepo]
name=ELRepo.org Community Enterprise Linux Repository - el8
baseurl=http://elrepo.org/linux/elrepo/el8/\$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-elrepo.org

[elrepo-kernel]
name=ELRepo.org Community Enterprise Linux Kernel Repository - el8
baseurl=http://elrepo.org/linux/kernel/el8/\$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-elrepo.org" | sudo tee /etc/yum.repos.d/elrepo.repo

6. Install Latest Kernel and Reboot:

  • Install the latest kernel:
yum -y --enablerepo=elrepo-kernel install kernel-ml
reboot

7. Install OrangeFS Server:

  • Install OrangeFS server and required dependencies:
yum -y install orangefs orangefs-server

8. Format and Mount Extra HDD:

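  • Format the extra HDD, create the mount point, and mount it (replace /dev/nvme0n2 with the device name on your system):
mkfs.ext4 /dev/nvme0n2
mkdir -p /mnt/ofsmnt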
mount -t ext4 /dev/nvme0n2 /mnt/ofsmnt/
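
Because mkfs.ext4 destroys whatever is on the target device, a quick guard before formatting can be worthwhile. The sketch below is one possible check, not part of OrangeFS: the helper name safe_to_format is hypothetical, and it only confirms that the path is a block device and is not currently listed in /proc/mounts.

```shell
#!/bin/sh
# safe_to_format DEV: succeed only if DEV is a block device that is not mounted.
safe_to_format() {
    dev="$1"
    [ -b "$dev" ] || { echo "not a block device: $dev" >&2; return 1; }
    if grep -q "^$dev " /proc/mounts; then
        echo "already mounted: $dev" >&2
        return 1
    fi
    return 0
}

dev=/dev/nvme0n2   # device name used in this guide; adjust for your system
if safe_to_format "$dev"; then
    echo "$dev looks safe to format"
else
    echo "refusing to format $dev"
fi
```

Running lsblk beforehand is still a good idea, since the check cannot tell whether an unmounted disk holds data you care about.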

9. Configure OrangeFS:

  • Generate and configure OrangeFS settings:
pvfs2-genconfig /etc/orangefs/orangefs.conf
  • When prompted, enter the following details:

    • Protocol type: tcp
    • Port number: 3334
    • Directory name: /mnt/ofsmnt
    • Hostnames: ofs-srv-1

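10. Initialize OrangeFS Storage:

  • Create the storage space described in the configuration file. The -f option only initializes the storage and then exits; the server daemon itself is started in step 12:
pvfs2-server -f /etc/orangefs/orangefs.conf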

11. Auto-mount Configuration:

  • Open /etc/pvfs2tab and add the entry shown below:
nano /etc/pvfs2tab

tcp://ofs-srv-1:3334/orangefs /mnt/ofsmnt pvfs2
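
The first field of the pvfs2tab entry packs the protocol, server host, port, and file system name into one URL-like string. The sketch below splits it apart with plain shell parameter expansion to make each piece visible; the function name parse_fs_spec is invented for this example and is not an OrangeFS tool.

```shell
#!/bin/sh
# parse_fs_spec SPEC: split "proto://host:port/fsname" into its parts.
parse_fs_spec() {
    spec="$1"
    proto="${spec%%://*}"      # text before "://"
    rest="${spec#*://}"        # host:port/fsname
    hostport="${rest%%/*}"     # host:port
    fsname="${rest#*/}"        # file system name
    host="${hostport%%:*}"
    port="${hostport##*:}"
    echo "protocol=$proto host=$host port=$port fsname=$fsname"
}

parse_fs_spec "tcp://ofs-srv-1:3334/orangefs"
# prints: protocol=tcp host=ofs-srv-1 port=3334 fsname=orangefs
```

The values match what was entered during pvfs2-genconfig: protocol tcp, port 3334, and host ofs-srv-1.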

12. Enable and Start OrangeFS Server:

  • Enable and start the OrangeFS server:
systemctl enable orangefs-server
systemctl start orangefs-server
systemctl status orangefs-server

13. Test Connection to Server:

  • Verify the connection to the server:

pvfs2-ping -m /mnt/ofsmnt


Client Implementation:

1. YUM Repository Configuration:

  • Perform the same repository configuration as on the server:
cd /etc/yum.repos.d/
sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-* 
sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*

2. Install Dependencies:

  • Install necessary packages:
yum -y install orangefs orangefs-fuse

3. Load OrangeFS Module:

  • Load the OrangeFS module:
modprobe orangefs

4. Configure the Mount Point:

  • Create the mount directory, then open /etc/pvfs2tab and add the entry shown below:
mkdir -p /mnt/ofsmnt

nano /etc/pvfs2tab

tcp://ofs-srv-1:3334/orangefs /mnt/ofsmnt pvfs2

5. Verify Connection to Server:

  • Test the connection to the OrangeFS server:
pvfs2-ping -m /mnt/ofsmnt
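
If pvfs2-ping fails, it can help to first rule out a plain network problem. The sketch below checks whether a TCP connection to the server port can be opened at all. It assumes bash and the coreutils timeout command are installed; the helper name port_open is made up for this example.

```shell
#!/bin/sh
# port_open HOST PORT: succeed if a TCP connection can be opened within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash must be available.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Host name and port as configured earlier in this guide.
if port_open ofs-srv-1 3334; then
    echo "ofs-srv-1:3334 is reachable"
else
    echo "ofs-srv-1:3334 is not reachable"
fi
```

A failure here points at name resolution, routing, or the server not listening, rather than at the OrangeFS configuration on the client.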

6. Mount OrangeFS on Client:

  • Mount the OrangeFS filesystem on the client:
pvfs2fuse /mnt/ofsmnt/ -o fs_spec=tcp://ofs-srv-1:3334/orangefs

7. Verify Mount and Disk Space:

  • Check disk space:
df -h
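
Reading the df -h output by eye works, but a scripted check is handy when repeating the setup. This sketch looks for the mount point in /proc/mounts; the helper name is_mounted is hypothetical, and the approach assumes a Linux system where /proc is available.

```shell
#!/bin/sh
# is_mounted DIR: succeed if DIR appears as a mount point in /proc/mounts.
is_mounted() {
    awk -v d="$1" '$2 == d { found = 1 } END { exit !found }' /proc/mounts
}

# Mount point used throughout this guide (no trailing slash).
if is_mounted /mnt/ofsmnt; then
    echo "/mnt/ofsmnt is mounted"
else
    echo "/mnt/ofsmnt is NOT mounted"
fi
```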

8. Test File Creation:

  • Create a test file inside the mounted directory to ensure proper functionality:
dd if=/dev/zero of=/mnt/ofsmnt/1GB-file bs=1MB count=1024
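
Writing zeros confirms that writes succeed, but it does not show that the data reads back intact. The sketch below writes random data, copies it into a target directory, and compares the two files byte for byte. The helper name roundtrip_check and the TARGET_DIR variable are assumptions made for this example; when TARGET_DIR is unset, a temporary directory is used so the script can be tried anywhere.

```shell
#!/bin/sh
# roundtrip_check DIR [SIZE_MB]: copy random data into DIR and compare it.
# Prints "ok" if the copy in DIR is identical to the source, else "mismatch".
roundtrip_check() {
    dir="$1"; count="${2:-4}"
    src="$(mktemp)" || return 1
    dst="$dir/ofs-roundtrip-test.$$"
    dd if=/dev/urandom of="$src" bs=1M count="$count" 2>/dev/null || return 1
    cp "$src" "$dst" || { rm -f "$src"; return 1; }
    if cmp -s "$src" "$dst"; then echo "ok"; else echo "mismatch"; fi
    rm -f "$src" "$dst"
}

# On a real client, run with TARGET_DIR=/mnt/ofsmnt.
target="${TARGET_DIR:-$(mktemp -d)}"
roundtrip_check "$target" 4
```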

Final Verification:

  • Both the server (ofs-srv-1) and client (ofs-client-1) should now be successfully configured to communicate via OrangeFS.

  • Test by accessing /mnt/ofsmnt on both nodes and ensuring that the filesystem is mounted and operational.





Crafted by: Suraj Kumar Choudhary | Feel free to DM for any help: csuraj982@gmail.com

