Question 7
Imagine you are running a large e-commerce operation where shipping times are recorded daily at multiple warehouse hubs worldwide. Each local warehouse manager computes the minimum and maximum shipping times for their region. Your goal is to determine the global minimum and maximum shipping times across all warehouses to optimize your company’s delivery logistics. Using MPI for parallel processing, write a program flow with proper MPI system calls that collect these local minimum and maximum shipping times from each regional hub and compute the overall global minimum and maximum.
In the previous post we saw how global computations can be done using the MPI_Reduce() and MPI_Allreduce() APIs, and demonstrated how data can be aggregated efficiently from multiple nodes. In this post, we’ll take things a step further by solving a slightly more complex problem: the one stated at the beginning.
The challenge involves multiple warehouses, each tracking its minimum and maximum shipping times. Our goal is to collect these values from all warehouses and compute the global minimum and maximum shipping times across the entire network.
Just like in the previous example, we will simulate the minimum and maximum shipping times using C’s built-in random number generator. Each warehouse (represented as an MPI node) will generate its own pair of minimum and maximum values. Once these values are distributed across the nodes, we’ll apply MPI_Reduce() twice: first to compute the global minimum, then to compute the global maximum. This lets us determine both the shortest and the longest shipping times across all warehouses in a distributed computing environment.
Simulation of min and max times
This problem involves two sets of values, one for the minimum and one for the maximum, so it needs a little more care than the previous problem. Specifically, we must ensure that the minimum value is always smaller than the maximum value on each node.
There are multiple ways to generate these values, but the core principle remains the same: leveraging C’s random number generator. To ensure better randomness, we need to properly seed the generator. In this case, we’ll initialise it as follows:
srand(time(NULL) ^ (rank * 12345));
There are different ways to fine-tune this, but the approach above works well for our example.
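If you’d like the seed to depend a little less on the wall-clock second, one alternative (purely illustrative, not part of the final program) is to add the rank scaled by a large odd constant instead of XOR-ing it:
srand((unsigned int)time(NULL) + (unsigned int)rank * 2654435761u); // hypothetical alternative seeding
Either way, the important point is that every rank ends up with a different seed, otherwise all warehouses would generate identical values.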
Now, let’s move on to simulating the minimum shipping time for each node using the following method:
double min = ((double)rand() / RAND_MAX) * 50.0 + 1.0; // Between 1 and 51
As mentioned earlier, you can experiment with different ways to manipulate and randomise these values. However, the approach above should work well for this case.
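For instance, if you wanted to simulate shipping times in a different range, say between 2 and 10, you could simply change the scale and offset (an illustrative variation, not part of the final program):
double min = ((double)rand() / RAND_MAX) * 8.0 + 2.0; // Between 2 and 10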
Now, let’s move on to simulating the maximum shipping time using the following method:
double max = min + ((double)rand() / RAND_MAX) * 20.0; // Ensures max > min
This ensures that the maximum value is always greater than the minimum, maintaining the correct relationship between the two.
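If you prefer to keep the simulation logic in one place, you could also wrap both steps in a small helper function. The name simulate_shipping_times below is hypothetical, not part of the final program, but it uses exactly the same formulas:
// Hypothetical helper: fills *min and *max with simulated shipping times for one rank
void simulate_shipping_times(int rank, double *min, double *max)
{
    srand(time(NULL) ^ (rank * 12345));
    *min = ((double)rand() / RAND_MAX) * 50.0 + 1.0;  // Between 1 and 51
    *max = *min + ((double)rand() / RAND_MAX) * 20.0; // Ensures max > min
}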
Applying MPI_Reduce() on the dataset
Finally, we need to apply MPI_Reduce() twice—once to aggregate the minimum values from all nodes:
MPI_Reduce(&min, &global_min, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
And then, we apply MPI_Reduce() once more to aggregate the maximum values from all nodes:
MPI_Reduce(&max, &global_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
As you may have noticed, we’ve also changed the datatype to MPI_DOUBLE to accommodate fractional, more precise values. With that adjustment, the rest of the program remains largely unchanged.
Now, let’s put everything together and write our solution for the problem stated at the beginning of this post.
#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>
#include <time.h>

int main()
{
    int size = 0;
    int rank = 0;

    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Seed the random number generator differently on each rank
    srand(time(NULL) ^ (rank * 12345));

    // Simulate this warehouse's local shipping times
    double min = ((double)rand() / RAND_MAX) * 50.0 + 1.0; // Between 1 and 51
    double max = min + ((double)rand() / RAND_MAX) * 20.0; // Ensures max > min

    double global_min = 0.0;
    double global_max = 0.0;

    /*
     * int MPI_Reduce(const void *sendbuf,   // Pointer to local data (input)
     *                void *recvbuf,         // Pointer to result buffer (output, valid only in root process)
     *                int count,             // Number of elements to be reduced
     *                MPI_Datatype datatype, // Data type of elements (e.g., MPI_INT, MPI_DOUBLE)
     *                MPI_Op op,             // Reduction operation (e.g., MPI_SUM, MPI_MIN, MPI_MAX)
     *                int root,              // Rank of process that receives the final result
     *                MPI_Comm comm)         // MPI communicator (usually MPI_COMM_WORLD)
     */
    MPI_Reduce(&min, &global_min, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
    MPI_Reduce(&max, &global_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    // Every rank prints its local values
    printf("rank = %d: min = %.2f max = %.2f\n", rank, min, max);

    // The reduced results are only valid on the root process (rank 0)
    if(rank == 0)
    {
        printf("global_min = %lf\n", global_min);
        printf("global_max = %lf\n", global_max);
    }

    MPI_Finalize();
    return 0;
}
Compiling and Running the Program
To compile the above program, you can use the following command on the command line:
mpicc mpi_reduce_problem_7.c -o mpi_reduce_problem_7
And once compiled, you can run the program in the following way:
mpiexec -n 4 ./mpi_reduce_problem_7
rank = 2: min = 18.40 max = 26.21
rank = 3: min = 35.64 max = 37.02
rank = 1: min = 18.14 max = 21.37
rank = 0: min = 38.12 max = 52.70
global_min = 18.14
global_max = 52.70
Understanding the output
In this output, each of the four MPI ranks (0 to 3) generates its own local minimum and maximum values using the random number generator. The MPI_Reduce() operation is then applied twice, once to compute the global minimum and once to compute the global maximum. Here, rank 1 has the lowest minimum value (18.14), making it the global minimum across all nodes, while rank 0 has the highest maximum value (52.70), making it the global maximum. This confirms that MPI_Reduce() correctly aggregates the values and delivers the expected results at the root process (rank 0).
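As a side note, the two reductions could also be folded into a single MPI_Reduce() call. The snippet below is not part of the program above; it is just a sketch of a common trick: pack the local minimum and the negated local maximum into a two-element array, reduce with MPI_MIN, and negate the second result back on the root.
// Hypothetical variant: one MPI_Reduce() call instead of two
double local_vals[2]  = { min, -max }; // negating max lets MPI_MIN find the largest max
double global_vals[2] = { 0.0, 0.0 };
MPI_Reduce(local_vals, global_vals, 2, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
if(rank == 0)
{
    printf("global_min = %lf\n", global_vals[0]);
    printf("global_max = %lf\n", -global_vals[1]); // undo the negation
}
With only two values per rank the difference is negligible, but combining reductions can cut communication overhead when many quantities are aggregated at once.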
Source Code Repository
You can access and download all the MPI-related example codes from the following repository:
🔗 GitHub Repository: mpi_programming
This repository contains various MPI examples, including implementations of MPI_Reduce(), MPI_Allreduce(), and other essential MPI operations. Feel free to explore, clone, and experiment with the code!
In this post, we addressed a real-world distributed computing problem using MPI_Reduce(), demonstrating how to aggregate minimum and maximum values efficiently across multiple nodes. We also saw how to generate the random values properly, ensuring the correct relationship between min and max on each node, and how to apply the MPI operations in a structured way. With this, we’ve built a solid foundation for handling global computations in an MPI environment.
If you’re following along, I encourage you to experiment further: try different ways to randomise the data, test with more MPI processes, and explore MPI_Allreduce() to see how it differs (a minimal sketch is shown below). And as always, you can find all the example codes in the GitHub repository linked above.
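For reference, here is roughly what the two reductions would look like with MPI_Allreduce(). This is only a sketch, not code from the repository: MPI_Allreduce() takes no root argument, and every rank, not just rank 0, receives the global results.
// Sketch: with MPI_Allreduce() every rank receives the global results
MPI_Allreduce(&min, &global_min, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);
MPI_Allreduce(&max, &global_max, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD);
printf("rank = %d sees global_min = %.2f global_max = %.2f\n", rank, global_min, global_max);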
Stay tuned for the next post, where we’ll take on an even more challenging MPI problem!
