Skip to content

ynshung/mpi-kubernetes

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

mpi-kubernetes

A simple example of running an MPI program on Kubernetes. Contains example program for:

See also: Implementation in Hadoop Docker

Prerequisites

Usage

To modify the amount of workers, change the replicas value in kubernetes/statefulset.yaml. Default is set to 4 (including the master).

Setup

# Check if docker image is available
docker pull ynshung/mpi-docker:latest

# Generate ssh keys (optional)
./generate-ssh-secret.sh

# Apply all Kubernetes configurations
kubectl apply -f kubernetes/

# Check the status of the pods
kubectl get pods
# Make sure all pods are running and the copied IP addresses for next command are only from mpi-workers-N

# Get the IP addresses of the pods, copy them temporarily
kubectl get pods -o custom-columns="IP:.status.podIP" --no-headers

# Enter the master pod
kubectl exec -it mpi-worker-0 -- bash

# Create a hostfile
echo "<paste copied ip addresses here>" > hostfile
# (Optional) You may remove the master IP address from the hostfile

Word Count

The script counts the number of each words in a directory of text file containing a sample of NoW corpus and saves the results to a CSV file while tracking execution time.

# Run the MPI program for word count (with 3 workers)
mpirun --allow-run-as-root \
    --hostfile hostfile \
    -np 3 python3 /app/word_count_mpi.py

# Run word count without MPI
python3 /app/word_count.py

Netflix Ratings

The script calculates and sorts Netflix movie ratings by average rating, saving the results to a CSV file while tracking execution time.

# Run the MPI program for Netflix ratings (with 3 workers)
mpirun --allow-run-as-root \
    --hostfile hostfile \
    -np 3 python3 /app/netflix_mpi.py

# Run calculation without MPI
python3 /app/netflix.py

Get Results

The results are saved in the output directory in the master pod. You can copy the results to your local machine with the following command:

kubectl cp -n default mpi-worker-0:/app/output output

About

A simple example of running an MPI program on Kubernetes

Resources

Stars

Watchers

Forks