14. AWS Setup
This page describes a simple setup process for Bodo on Amazon EC2 instances. You need to have an account on Amazon Web Services (AWS) and be familiar with the general AWS EC2 instance launch interface. The process below is for demonstration purposes only and is not recommended for production usage due to security, performance and other considerations.
1. Launch instances
a. Select a Linux instance type (e.g. Ubuntu Server 18.04, c5n types for high network bandwidth).
b. Select the number of instances (e.g. 4).
c. Select the placement group option for better network performance (check “add instance to placement group”).
d. Enable all ports in the security group configuration to simplify MPI setup (add a new rule with “All traffic” Type and “Anywhere” Source).
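For readers who prefer scripting the launch over clicking through the console, the choices in this step might be expressed with boto3 roughly as follows. This is a minimal sketch and not part of the original instructions: the AMI ID, key pair name, placement group name and security group ID are hypothetical placeholders, the placement group and the “All traffic” rule are assumed to exist already, and the actual launch call assumes boto3 is installed with AWS credentials configured.

```python
def build_launch_params(ami_id, key_name, placement_group, security_group_id,
                        instance_type="c5n.xlarge", count=4):
    """Collect the launch choices as keyword arguments for EC2 run_instances."""
    return {
        "ImageId": ami_id,                        # e.g. an Ubuntu Server 18.04 AMI
        "InstanceType": instance_type,            # c5n types for network bandwidth
        "MinCount": count,                        # launch exactly `count` instances
        "MaxCount": count,
        "KeyName": key_name,                      # key pair used for ssh later
        "Placement": {"GroupName": placement_group},
        "SecurityGroupIds": [security_group_id],  # group with the "All traffic" rule
    }


def launch(params, region="us-east-2"):
    """Launch the instances and return their IDs (needs boto3 and credentials)."""
    import boto3  # imported lazily so build_launch_params works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**params)
    return [inst["InstanceId"] for inst in response["Instances"]]


if __name__ == "__main__":
    # Hypothetical placeholder values; replace them with real identifiers.
    params = build_launch_params(
        ami_id="ami-xxxxxxxx",
        key_name="user",
        placement_group="bodo-demo",
        security_group_id="sg-xxxxxxxx",
    )
    print(params)  # inspect first; call launch(params) to actually start instances
```

Keeping the parameter dictionary separate from the API call makes it easy to review what will be launched before any instances are started.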
2. Set up password-less SSH between instances
a. Copy your key from your client to all instances. For example, on a Linux client, run this for each instance (find the public host names in the AWS portal, and substitute each instance's login user and public host name in the command):
scp -i "user.pem" user.pem firstname.lastname@example.org:~/.ssh/id_rsa
b. Disable the ssh host key check by running this command on all instances:
echo -e "Host *\n StrictHostKeyChecking no" > ~/.ssh/config
c. Create a host file listing the private hostnames of the instances in the home directory of all instances:
echo -e "ip-11-11-11-11.us-east-2.compute.internal\nip-11-11-11-12.us-east-2.compute.internal\n" > hosts
d. Set permissions for ~/.ssh/config:
chmod 600 ~/.ssh/config
e. Set permissions for ~/.ssh/id_rsa:
chmod 400 ~/.ssh/id_rsa
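The host file created in this step lists one private host name per line. If only the private IP addresses are at hand, a short script could build it. The sketch below assumes the default EC2 private DNS naming seen in the example above (`ip-<address with dashes>.<region>.compute.internal`, with `ec2.internal` for us-east-1); the function names are illustrative and not part of Bodo or AWS tooling.

```python
from pathlib import Path


def private_hostname(private_ip, region):
    """Build the default EC2 private DNS name for an IPv4 address (assumed naming)."""
    if region == "us-east-1":
        suffix = "ec2.internal"
    else:
        suffix = f"{region}.compute.internal"
    return "ip-" + private_ip.replace(".", "-") + "." + suffix


def write_hosts_file(private_ips, region, path="hosts"):
    """Write one private host name per line, the format passed to mpiexec -f."""
    names = [private_hostname(ip, region) for ip in private_ips]
    Path(path).write_text("\n".join(names) + "\n")
    return names


if __name__ == "__main__":
    # Example addresses matching the placeholder host names used on this page.
    for name in write_hosts_file(["11.11.11.11", "11.11.11.12"], "us-east-2"):
        print(name)
```

The resulting file still has to be present in the home directory of every instance, as described above.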
3. Install the Miniconda Python distribution and Bodo on all instances:
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
chmod +x miniconda.sh
./miniconda.sh -b
export PATH=$HOME/miniconda3/bin:$PATH
conda create -n Bodo python
source activate Bodo
conda install bodo h5py scipy hdf5=*=*mpich* -c file:///path-to-bodo-package/bodo-inc/ -c conda-forge
4. Copy the Pi example to a file called pi.py in the home directory of all instances, then run it with and without MPI and compare the execution times. You should see a speedup when running on more cores (the “-n 2” and “-n 4” cases):
python pi.py # Execution time: 2.119
mpiexec -f hosts -n 2 python pi.py # Execution time: 1.0569
mpiexec -f hosts -n 4 python pi.py # Execution time: 0.5286
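The Pi example itself is not reproduced on this page. A Monte Carlo estimate along the following lines is a plausible stand-in, offered as a sketch and not as the exact file from the Bodo examples: the function name `calc_pi` and the sample count are assumptions, and the `try`/`except` fallback to plain NumPy is added here only so the script also runs where Bodo is not installed.

```python
import time

import numpy as np

try:
    import bodo
    jit = bodo.jit  # compile the function and distribute the arrays across MPI ranks
except ImportError:
    def jit(func):  # fallback (assumption): run as ordinary serial NumPy code
        return func


@jit
def calc_pi(n):
    """Estimate pi from the fraction of random points inside the unit circle."""
    t1 = time.time()
    x = 2 * np.random.random(n) - 1  # uniform samples in [-1, 1)
    y = 2 * np.random.random(n) - 1
    pi = 4 * np.sum(x ** 2 + y ** 2 < 1) / n
    print("Execution time:", time.time() - t1, "\nresult:", pi)
    return pi


if __name__ == "__main__":
    calc_pi(10 ** 7)  # sample count chosen for illustration only
```

Saved as pi.py, this can be run with the commands shown above. The reported times depend on the instance type and the sample count, so they will not match the figures listed on this page.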
A possible next experiment is running a more complex example, such as the logistic regression example. Attaching a shared EFS storage volume and experimenting with parallel I/O in Bodo is also recommended.