Hey there 👋!
With multicore processors now standard, it’s essential for high-performance applications to harness this power through effective concurrency. Java’s Fork/Join framework, part of the java.util.concurrent package since JDK 7, is built to run divide-and-conquer workloads efficiently across multiple threads, making full use of modern hardware. In this article, I’m going to delve into the Fork/Join framework, explaining its purpose, significance, and application, all illustrated with a practical example.
What the fork?
The Fork/Join framework is an implementation of the ExecutorService interface that helps developers solve problems using divide-and-conquer algorithms. These algorithms work by breaking down a task into smaller, more manageable pieces, solving each piece separately, and then combining the results. In this framework, any task can be forked (split) into smaller tasks, and the results can be joined subsequently, hence the name “Fork/Join.”
You may wonder, “Why opt for the Fork/Join Framework when traditional threading is an option?”
Hmm, why?
Picture multitasking, but it’s Java juggling your tasks with finesse! And it’s all about efficiency. Here’s why:
- Efficient Thread Utilization: It makes use of a work-stealing algorithm, where idle threads “steal” tasks from busy threads’ queues. This ensures that all threads are actively used, reducing overhead and improving performance.
- Handling Recursive Tasks: The framework excels at handling recursive computations, a common scenario in divide-and-conquer algorithms.
- Improved Scalability: It’s designed to scale well to available parallelism, which means better performance on multicore processors (you can check the numbers on your own machine with the quick sketch below).
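If you’re curious how much parallelism the framework actually has to work with, the pool itself will tell you. Here’s a tiny sketch (the ParallelismCheck class name is just for illustration) that prints the common pool’s parallelism level, the number of available processors, and the pool’s running steal count, the counter behind the work-stealing behaviour described above:

import java.util.concurrent.ForkJoinPool;

public class ParallelismCheck {
    public static void main(String[] args) {
        ForkJoinPool commonPool = ForkJoinPool.commonPool();

        // Target parallelism of the common pool (typically availableProcessors - 1)
        System.out.println("Common pool parallelism: " + commonPool.getParallelism());
        System.out.println("Available processors:    " + Runtime.getRuntime().availableProcessors());

        // Running total of tasks "stolen" by idle workers from other workers' queues
        System.out.println("Steal count so far:      " + commonPool.getStealCount());
    }
}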
How does it work?
Consider a scenario where you’re faced with an array of 10 tasks, each representing operations that are resource-intensive, such as database or I/O operations.
Your objective is to expedite the processing of these tasks efficiently without overburdening the system resources. Sequential execution is off the table due to time constraints, and a single thread can handle a maximum of two tasks consecutively. Intriguing challenge, isn’t it? Let’s tackle this problem by employing a divide-and-conquer strategy.

As demonstrated, our goal is to systematically divide tasks until they’re manageable enough, aligning with our defined threshold. Wondering how this translates into Java? Stay with me; it’s simpler than it seems.
First, let’s establish our base scenario. Imagine a Task class, responsible for handling time and resource-intensive operations—think bulk updates, I/O, network calls, and more. Here’s a glimpse of what it looks like:
import java.util.concurrent.TimeUnit;

public class Task {

    private final int id;
    // Other necessary fields...

    public Task(int id) {
        this.id = id;
    }

    public void process() {
        System.out.printf("Processing task %d...%n", id);
        try {
            // Simulate a slow, resource-intensive operation
            Thread.sleep(TimeUnit.SECONDS.toMillis(3));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
However, we can enhance our approach by abstracting the process() method, leading us to define a Computable functional interface and implement it in our Task class, like so:
@FunctionalInterface
public interface Computable {
    void process();
}

public class Task implements Computable {
    // Existing code...
}
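A nice side effect of marking Computable as a @FunctionalInterface: lightweight, one-off tasks don’t need a dedicated class at all; a lambda will do. A quick, hypothetical snippet:

// Hypothetical quick task expressed as a lambda (not part of the main example)
Computable quickTask = () -> System.out.println("Doing something lightweight...");
quickTask.process(); // prints the message immediately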
Noice 😎! That’s so much neater, isn’t it? Next, we generate an array of random tasks, shifting our focus to the crux of the issue: parallel execution.
private Task[] generateTasks(int count) {
    Task[] tasks = new Task[count];
    for (int i = 0; i < tasks.length; i++) {
        tasks[i] = new Task(i + 1);
    }
    return tasks;
}
But how do we execute these tasks in parallel? Enter Java’s Fork/Join framework. Here’s a simplified guide:
- Extend RecursiveTask<R> or RecursiveAction, depending on whether you need a result (there’s a short RecursiveTask sketch right after this list).
- Override the compute() method to specify the task’s logic and the conditions for further splitting or direct execution.
- Engage a ForkJoinPool to invoke the root task.
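Since the tasks in our scenario don’t return anything, we’ll use RecursiveAction below. For completeness, here’s a minimal, hypothetical sketch of the result-bearing variant: a RecursiveTask<Long> that sums an array by halving it (the SumTask class and its threshold are illustrative, not part of this article’s main example):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumTask extends RecursiveTask<Long> {

    private static final int THRESHOLD = 1_000;

    private final long[] numbers;
    private final int start;
    private final int end;

    public SumTask(long[] numbers, int start, int end) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
        if (end - start <= THRESHOLD) {
            long sum = 0;
            for (int i = start; i < end; i++) {
                sum += numbers[i];
            }
            return sum;
        }
        int middle = start + (end - start) / 2;
        SumTask left = new SumTask(numbers, start, middle);
        SumTask right = new SumTask(numbers, middle, end);
        right.fork();                  // run the right half asynchronously
        long leftSum = left.compute(); // compute the left half in this thread
        return leftSum + right.join(); // combine the two partial results
    }

    public static void main(String[] args) {
        long[] numbers = new long[1_000_000];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = i + 1;
        }
        Long total = ForkJoinPool.commonPool().invoke(new SumTask(numbers, 0, numbers.length));
        System.out.println("Sum: " + total); // 500000500000
    }
}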
Let’s start with step 1, creating a class named TaskProcessor extending RecursiveAction, and override the compute() method:
import java.util.concurrent.RecursiveAction;

public class TaskProcessor extends RecursiveAction {

    // We do our initialization here...

    @Override
    protected void compute() {
        // Task execution logic...
    }
}

In step 2, we refine our TaskProcessor to accept an array of tasks and employ a loop within compute() to handle tasks sequentially. But there’s a catch: we haven’t set the base condition for task division. Here’s where start and end come into play, marking the range of tasks processed by each TaskProcessor instance. This range helps us determine when to divide tasks further or process them directly, based on a THRESHOLD.
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveAction;

public class TaskProcessor extends RecursiveAction {

    private static final int THRESHOLD = 2; // This can vary

    private final Computable[] tasks;
    private final int start;
    private final int end;

    public TaskProcessor(Computable[] tasks, int start, int end) {
        this.tasks = tasks;
        this.start = start;
        this.end = end;
    }

    @Override
    protected void compute() {
        if (end - start <= THRESHOLD) {
            // Small enough: process this slice of tasks directly
            for (int i = start; i < end; i++) {
                tasks[i].process();
            }
        } else {
            // Too big: split the range in half and process both halves in parallel
            int middle = start + (end - start) / 2;
            TaskProcessor leftProcessor = new TaskProcessor(tasks, start, middle);
            TaskProcessor rightProcessor = new TaskProcessor(tasks, middle, end);
            ForkJoinTask.invokeAll(leftProcessor, rightProcessor);
        }
    }
}
The middle calculation (start + (end - start) / 2) ensures a consistent split of tasks¹, with leftProcessor and rightProcessor handling each segment. The call to ForkJoinTask.invokeAll() initiates parallel execution.
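Under the hood, invokeAll() handles the fork-and-join choreography for us. As a rough sketch of the pattern (not the JDK’s exact implementation), the same step could be written by hand inside TaskProcessor.compute() like this:

// A hand-rolled equivalent of ForkJoinTask.invokeAll(leftProcessor, rightProcessor)
rightProcessor.fork();   // schedule the right half asynchronously on the pool
leftProcessor.compute(); // process the left half in the current thread
rightProcessor.join();   // wait for the right half to complete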
After setting up our tasks and creating the TaskProcessor class, we need to instantiate a ForkJoinPool and start our tasks. Here’s how we can do it:
import dev.yasint.labs.forkjoin.Task;
import dev.yasint.labs.forkjoin.TaskProcessor;
import org.junit.jupiter.api.Test;

import java.util.concurrent.ForkJoinPool;

public class ForkJoinTest {

    @Test
    public void forkJoin() {
        final Task[] tasks = generateTasks(10);
        TaskProcessor taskProcessor = new TaskProcessor(tasks, 0, tasks.length);

        ForkJoinPool pool = ForkJoinPool.commonPool();
        pool.invoke(taskProcessor); // blocks until every task has been processed

        // Note: shutdown() has no effect on the common pool; it matters when you
        // create your own ForkJoinPool instance.
        pool.shutdown();
    }

    private Task[] generateTasks(int count) {
        Task[] tasks = new Task[count];
        for (int i = 0; i < tasks.length; i++) {
            tasks[i] = new Task(i + 1);
        }
        return tasks;
    }
}
And just like that, it works! Super quick, almost like magic; your tasks are done before you know it. But how? Let’s take a quick look at the main code that makes this happen:
- We first generate our tasks using the generateTasks(10) method, which returns an array of ten Task objects.
- We create an instance of TaskProcessor, passing the tasks along with the start and end indices.
- We then instantiate the ForkJoinPool. We can either use ForkJoinPool.commonPool(), which reuses the common pool shared among all ForkJoinTasks, or create a new instance with a specific number of threads using new ForkJoinPool(numberOfThreads).

Pool Selection?
The common pool is generally recommended unless you have specific reasons for wanting to separate the tasks from the common pool (like different thread settings, priority, etc.). If you do go the dedicated-pool route, there’s a short sketch right after this list.

- Then, we initiate the task with pool.invoke(taskProcessor). This starts the process, invoking the compute() method of TaskProcessor. The invoke() method is synchronous: it blocks until the task is complete.
- Finally, it’s a best practice to shut down the pool after all tasks are complete using pool.shutdown(), especially if you created a new instance of ForkJoinPool. Not shutting it down can lead to resource leaks.
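For the dedicated-pool case, here’s a minimal sketch, assuming the Task and TaskProcessor classes defined earlier are on the classpath (the parallelism value of 4 is arbitrary):

import java.util.concurrent.ForkJoinPool;

public class DedicatedPoolDemo {
    public static void main(String[] args) {
        Task[] tasks = new Task[10];
        for (int i = 0; i < tasks.length; i++) {
            tasks[i] = new Task(i + 1);
        }

        // A dedicated pool with 4 worker threads, separate from the common pool
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            pool.invoke(new TaskProcessor(tasks, 0, tasks.length));
        } finally {
            pool.shutdown(); // release the pool's threads once we're done
        }
    }
}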
Moving from best practices to performance considerations, it’s essential to evaluate the efficiency of our implementation. Analyzing the runtime, the recursive halving contributes O(log n) levels of splitting and processing the tasks themselves contributes O(n) work, giving a time complexity of O(n + log n) overall, with that work spread across the pool’s worker threads.
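If you want to see those numbers for yourself, here’s a small, hypothetical timing harness (the TimingDemo class name is mine) that reuses the Task and TaskProcessor classes from above. With THRESHOLD = 2, each leaf runs at most two 3-second tasks back to back, so with enough worker threads the batch should finish in roughly 6 seconds instead of the ~30 seconds sequential execution would take:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

public class TimingDemo {
    public static void main(String[] args) {
        Task[] tasks = new Task[10];
        for (int i = 0; i < tasks.length; i++) {
            tasks[i] = new Task(i + 1);
        }

        ForkJoinPool pool = ForkJoinPool.commonPool();

        long startNanos = System.nanoTime();
        pool.invoke(new TaskProcessor(tasks, 0, tasks.length));
        long elapsedSeconds = TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - startNanos);

        // Sequentially this would take about 10 * 3 s = 30 s; in parallel,
        // expect roughly 6 s when enough worker threads are available.
        System.out.printf("Processed %d tasks in ~%d s%n", tasks.length, elapsedSeconds);
    }
}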
Conclusion
The ForkJoinPool handles the heavy lifting of worker thread management, task synchronization, and other low-level details. The invoke() method of the pool executes the specified task and any subtasks that it may create, utilizing the available threads in the pool.
The framework ensures balanced distribution of tasks among threads and efficient execution, leveraging the divide-and-conquer principle we implemented in the compute() method of our TaskProcessor.
I hope this guide made Java’s Fork/Join framework easier for you! Thanks for sticking around 🥰.
Footnotes
1. Consistent split of tasks: The divisor for middle typically is 2, dividing the task list in half, but it can be adjusted depending on how you want to split tasks.

   int oneThird = start + (end - start) / 3;
   int twoThirds = start + 2 * (end - start) / 3;
   TaskProcessor firstThird = new TaskProcessor(tasks, start, oneThird);
   TaskProcessor secondThird = new TaskProcessor(tasks, oneThird, twoThirds);
   TaskProcessor finalThird = new TaskProcessor(tasks, twoThirds, end);
   invokeAll(firstThird, secondThird, finalThird); // Fork new subtasks

   Above is a hypothetical example if you were to split the task into three subtasks instead of two.