mpi-kubernetes

A simple example of running an MPI program on Kubernetes. Contains example programs for:

Word count of the NOW corpus
Average rating calculation for each Netflix movie

See also: Implementation in Hadoop Docker

Prerequisites

Docker Desktop
Kubernetes
- In Docker Desktop, enable Kubernetes in the settings
- [Untested] Use microk8s or minikube

Usage

To change the number of workers, edit the replicas value in kubernetes/statefulset.yaml. The default is 4 pods, including the master.

Setup

# Pull the Docker image to confirm it is available
docker pull ynshung/mpi-docker:latest

# Generate ssh keys (optional)
./generate-ssh-secret.sh

# Apply all Kubernetes configurations
kubectl apply -f kubernetes/

# Check the status of the pods
kubectl get pods
# Make sure all pods are running; in the next step, copy IP addresses only from the mpi-worker-N pods

# Get the IP addresses of the pods, copy them temporarily
kubectl get pods -o custom-columns="IP:.status.podIP" --no-headers

# Enter the master pod
kubectl exec -it mpi-worker-0 -- bash

# Create a hostfile
echo "<paste copied ip addresses here>" > hostfile
# (Optional) You may remove the master IP address from the hostfile
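
The hostfile step above can also be scripted. The following is a minimal sketch, not part of this repository: the function name build_hostfile is hypothetical, and it assumes the IP list is ordered with the master (mpi-worker-0) first, which you should verify against the output of kubectl get pods.

```python
def build_hostfile(pod_ips, include_master=True):
    """Return hostfile text with one pod IP per line.

    Assumption: pod_ips is ordered master-first (mpi-worker-0 first).
    Blank entries and surrounding whitespace are dropped.
    """
    ips = [ip.strip() for ip in pod_ips if ip.strip()]
    if not include_master:
        # Optional step from the setup notes: leave the master out
        ips = ips[1:]
    return "".join(f"{ip}\n" for ip in ips)


if __name__ == "__main__":
    # Example IPs are placeholders, not real pod addresses
    copied = ["10.1.0.5", "10.1.0.6", "10.1.0.7", "10.1.0.8"]
    print(build_hostfile(copied, include_master=False), end="")
```

The returned text can be written to a file named hostfile inside the master pod, replacing the manual echo command.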

Word Count

The script counts the occurrences of each word in a directory of text files containing a sample of the NOW corpus, saves the results to a CSV file, and tracks execution time.

# Run the MPI program for word count (with 3 workers)
mpirun --allow-run-as-root \
    --hostfile hostfile \
    -np 3 python3 /app/word_count_mpi.py

# Run word count without MPI
python3 /app/word_count.py
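
To illustrate the general idea behind a parallel word count, here is a single-process sketch of the split, count, and merge pattern. It is an assumption about the overall approach, not the code in /app/word_count_mpi.py; the function names and the tokenizing regular expression are hypothetical, and no MPI library is used. In an MPI program, each chunk would be handled by a separate process and the partial counts gathered on the master.

```python
import re
from collections import Counter


def split_round_robin(items, n_parts):
    """Deal items into n_parts lists, one per (simulated) worker."""
    return [items[i::n_parts] for i in range(n_parts)]


def count_words(texts):
    """Count lowercase word occurrences across a list of strings."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def merge_counts(partials):
    """Add up the partial counters produced by each worker."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Tiny in-memory stand-in for the text files
docs = ["the cat sat", "the dog sat", "a cat"]
chunks = split_round_robin(docs, 3)
total = merge_counts(count_words(chunk) for chunk in chunks)
```

Because word counts are additive, merging the partial counters gives the same result as counting everything in one pass.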

Netflix Ratings

The script calculates the average rating of each Netflix movie, sorts the movies by that average, saves the results to a CSV file, and tracks execution time.

# Run the MPI program for Netflix ratings (with 3 workers)
mpirun --allow-run-as-root \
    --hostfile hostfile \
    -np 3 python3 /app/netflix_mpi.py

# Run calculation without MPI
python3 /app/netflix.py
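
One detail worth spelling out for a distributed average: partial results should carry a sum and a count per movie, since averaging per-worker averages gives wrong answers when workers see different numbers of ratings. The sketch below shows this in a single process. It is an assumption about the general technique, not the code in /app/netflix_mpi.py; the function names and the (movie_id, rating) tuple layout are hypothetical.

```python
from collections import defaultdict


def partial_sums(ratings):
    """Map each movie id to [sum of ratings, number of ratings]."""
    acc = defaultdict(lambda: [0.0, 0])
    for movie_id, rating in ratings:
        acc[movie_id][0] += rating
        acc[movie_id][1] += 1
    return acc


def merge_and_average(partials):
    """Combine per-worker sums and counts, then rank by average (highest first)."""
    total = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for movie_id, (rating_sum, count) in part.items():
            total[movie_id][0] += rating_sum
            total[movie_id][1] += count
    averages = {m: s / n for m, (s, n) in total.items()}
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)


# Two simulated workers, each holding (movie_id, rating) pairs
chunk_a = [(1, 4), (2, 3), (1, 5)]
chunk_b = [(2, 4), (3, 2), (1, 3)]
ranked = merge_and_average([partial_sums(chunk_a), partial_sums(chunk_b)])
```

Here movie 1 has ratings 4, 5, and 3 split across both chunks, and the merged result still yields its true average of 4.0.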

Get Results

The results are saved in the output directory in the master pod. You can copy the results to your local machine with the following command:

kubectl cp -n default mpi-worker-0:/app/output output
