Task/Multi-thread processing, queue many tasks while detecting/using your available cores

Godzilla
Posts: 38
Joined: Wednesday 26th September 2018 11:20am

Post by Godzilla » Saturday 19th October 2019 9:12am

For anyone interested, I've been continuing my research into multi-processing across all available cores.

Previously, I came up with a way of communicating post-processed data from detached Task processes back to the parent Gambas program, making multi-processing into something usable.

But I was left with two questions: how to automatically detect and use only the number of cores that a given computer has (without needing to hard-code it on a computer-by-computer basis), and how to queue a large number of operations to be processed in parallel, because capping the number of operations at the number of cores you have to work with is too restrictive.

So after a lot of trial and error, I've solved both those problems and I'm attaching a sample project to demonstrate how it works. I had to come up with a different approach to program structure that centers around the Task process and CPUs.

The CPU_Test project detects the number of cores that are available on your system (from 1 to 32), and enqueues 40 operations to be performed by your number of cores. Whenever any one core has finished an operation, it immediately begins working on the next operation in the queue, until all 40 operations have completed. In this example, each operation is simply a counter from 0 to 200000000.
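For readers who want to see the queueing idea in code, here is a minimal sketch in Python rather than Gambas. It is not the code from the attached CPU_Test project: `run_queue`, `CHILD_CODE`, and the worker "slot" numbers are hypothetical names chosen for illustration. It detects the core count with `os.cpu_count()`, starts one child process per free slot, and hands the next queued operation to whichever slot finishes first. A slot here is only a worker number; the operating system decides which physical core actually runs each child.

```python
import os
import subprocess
import sys
import time
from collections import deque

# Child program (hypothetical): count from 0 up to the limit passed as argv[1].
CHILD_CODE = """
import sys
limit = int(sys.argv[1])
n = 0
while n < limit:
    n += 1
print(n)
"""


def run_queue(limits, workers=None):
    """Run one counting operation per entry in `limits`, at most `workers` at a time."""
    workers = workers or os.cpu_count() or 1   # detect cores if not given
    pending = deque(enumerate(limits))         # (operation number, limit)
    running = {}                               # slot -> (operation number, process)
    results = {}                               # operation number -> (slot, final count)

    while pending or running:
        # Give every free slot the next operation in the queue.
        for slot in range(workers):
            if slot not in running and pending:
                op, limit = pending.popleft()
                proc = subprocess.Popen(
                    [sys.executable, "-c", CHILD_CODE, str(limit)],
                    stdout=subprocess.PIPE, text=True)
                running[slot] = (op, proc)

        # Collect any operation that has finished, freeing its slot.
        for slot, (op, proc) in list(running.items()):
            if proc.poll() is not None:
                out, _ = proc.communicate()
                results[op] = (slot, int(out))
                del running[slot]

        time.sleep(0.01)  # avoid a busy loop while children are working

    return results


if __name__ == "__main__":
    # 40 operations as in the post, but with a much smaller counter so it finishes quickly.
    for op, (slot, count) in sorted(run_queue([200_000] * 40).items()):
        print(f"operation {op + 1}: slot {slot + 1} counted to {count}")
```

Passing a `workers` value larger than the real core count still works in this sketch; the extra child processes simply share the available cores.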

The form contains 6 radio buttons, a Go button, an Exit button, and 40 textboxes. The Go button simply starts the operations. As each operation finishes, the text boxes indicate it, along with the number of the CPU core that was used for that operation.

The radio buttons are for core selection, which is automatically set to the correct number for your specific computer on Form_Open. The options are 1, 2, 4, 8, 16, and 32 cores (the code can be expanded to support 64, 128, 256, 512 cores, etc). So you can use the default number of true cores you have and press the Go button. Then, if you like, select a different number of cores and press the Go button again to see the results of the same set of operations with a different number of cores.

Regardless of which radio button you choose, your actual number of cores will remain indicated by parentheses around that number. For example, 1 2 (4) 8 16 32 if your computer has 4 cores.

One of my computers has an older 4-core Celeron CPU, and my other has an 8-core i7 CPU (4 cores + 4 hyper-threads). Here are some interesting benchmarks I got while testing this project on each computer.

1 core Celeron completes in 3m 36s
2 cores Celeron completes in 1m 28s
4 cores Celeron completes in 52s

1 core i7 completes in 1m 05s
2 cores i7 completes in 30s
4 cores i7 completes in 15s
8 cores i7 completes in 11s

The i7 result for 8 cores raised a question. Why didn't 8 cores complete in half the time vs 4 cores (as was the case in 4 cores vs 2 cores)? I suppose this is due to the 4 hyper-threads this CPU uses. They're not actual physical cores (each one shares a physical core's execution resources), so they probably add much less throughput than a real core would. But heck, I'll take it. :D
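To put numbers on that comparison, here is a small Python sketch that turns the completion times listed above into speedup factors relative to the 1-core run. The times are the ones from this post, rounded to whole seconds, so the factors are only approximate; `TIMES` and `speedups` are names made up for this example.

```python
# Completion times from the post, converted to seconds.
TIMES = {
    "Celeron": {1: 3 * 60 + 36, 2: 1 * 60 + 28, 4: 52},
    "i7": {1: 1 * 60 + 5, 2: 30, 4: 15, 8: 11},
}


def speedups(times):
    """Return {cores: speedup relative to the 1-core run}."""
    base = times[1]
    return {cores: base / seconds for cores, seconds in times.items()}


if __name__ == "__main__":
    for cpu, times in TIMES.items():
        for cores, factor in speedups(times).items():
            print(f"{cpu} {cores} core(s): {factor:.2f}x")
```

On these figures the step from 4 to 8 on the i7 is 15s / 11s, roughly 1.36 times faster, well short of the doubling seen between 2 and 4 cores.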

I wasn't sure what might happen if I chose the 16 cores or 32 cores radio buttons, but I tried it anyway. I expected a crash or an error. But to my surprise, it performed and completed the operations without incident. However, there was no time decrease beyond the number of actual cores the CPU has. If you're fortunate enough to have access to something like a server with 32 cores, I estimate these 40 operations would be completed between 1 and 2 seconds. Or maybe even less.
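As a rough cross-check of that estimate, here is a back-of-the-envelope Python sketch. It rests on assumptions that are not measured anywhere in this thread: every core runs one operation at the speed of the i7 in its 1-core run (65s / 40 operations), there is no start-up or GUI overhead, and an operation cannot be split between cores. `estimate_seconds` is a hypothetical helper name.

```python
import math


def estimate_seconds(operations, cores, seconds_per_operation):
    """Operations cannot be split, so the run takes ceil(operations / cores) rounds."""
    rounds = math.ceil(operations / cores)
    return rounds * seconds_per_operation


if __name__ == "__main__":
    per_op = 65 / 40  # i7 single-core run: 1m 05s for 40 operations
    print(estimate_seconds(40, 32, per_op))
```

Under those assumptions, 40 operations on 32 cores need two rounds, which comes to about 3.25 seconds. Getting into the 1 to 2 second range would take faster cores or at least 40 of them, so treat both figures as rough guesses.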

One thing to remember if using multi-processing in your programs is that your program will continue to perform the rest of its operations after starting a Task process, without waiting for any Task processes to complete. So if your program depends on receiving and working with any post-processed data from a Task process, you'll have to code in some sort of wait loop to delay program continuation until it gets the data it needs. File.Load and File.Save do not appear to work within Task processes. But object serialization works seamlessly.
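The "start, carry on, then wait for the data" pattern can be sketched in Python as follows. This is an analogy, not Gambas code and not how the Task class is implemented: the child process, the pickled dictionary, and the names `WORKER_CODE`, `start_task`, and `wait_for_result` are all assumptions made for the example. The child serializes its result to standard output, and the parent polls in a wait loop before deserializing it.

```python
import pickle
import subprocess
import sys
import time

# Child program (hypothetical): do some work, then write the pickled result to stdout.
WORKER_CODE = """
import pickle, sys
result = {"total": sum(range(1000))}
sys.stdout.buffer.write(pickle.dumps(result))
"""


def start_task():
    """Start the child process and return immediately, without waiting for it."""
    return subprocess.Popen([sys.executable, "-c", WORKER_CODE],
                            stdout=subprocess.PIPE)


def wait_for_result(proc, poll_interval=0.01):
    """Wait loop: sleep until the child has exited, then deserialize its output."""
    while proc.poll() is None:
        time.sleep(poll_interval)
    data, _ = proc.communicate()  # read what the child wrote and close the pipe
    return pickle.loads(data)


if __name__ == "__main__":
    task = start_task()
    print("parent keeps running while the task works")
    print(wait_for_result(task))
```

Polling before reading is only safe here because the result is small; a child that writes more than the pipe buffer holds would need its output read while it is still running.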

Lastly, it should be stated that (to my understanding) Gambas creator Benoît Minisini frowns on using multi-processing. And it's not that he doesn't agree that running complex operations in parallel saves time. It's simply that he believes that needing multi-processing is an indicator of sloppy, unoptimized programming, or may contribute to the creation of sloppy, unoptimized programming. And I agree to an extent. Multi-processing is something the average Gambas programmer should never have any need for. But for power users needing to process ungodly amounts of data as quickly as possible, the Task process is a godsend. Once again, I thank cogier for letting me know it exists.

I hope any power users might find this project useful. Thanks for reading.
Attachments
CPU_Test.tar.gz
(28.77 KiB)

sjsepan
Posts: 40
Joined: Saturday 12th October 2019 10:11pm
Location: Leeper, PA, USA

Re: Task/Multi-thread processing, queue many tasks while detecting/using your available cores

Post by sjsepan » Saturday 19th October 2019 1:19pm

Thank You, Godzilla, for continuing to shine a light on this area. 8-)
As Gambas users figure out 'when' to use multi-processing, hopefully much of the 'how' will be demonstrated by folks like you.

Godzilla
Posts: 38
Joined: Wednesday 26th September 2018 11:20am

Re: Task/Multi-thread processing, queue many tasks while detecting/using your available cores

Post by Godzilla » Sunday 20th October 2019 1:11am

sjsepan wrote (Saturday 19th October 2019 1:19pm):
> Thank You, Godzilla, for continuing to shine a light on this area. 8-)
> As Gambas users figure out 'when' to use multi-processing, hopefully much of the 'how' will be demonstrated by folks like you.

Hey sjsepan, thanks for your reply and for your interest in this. I had a lot of fun working on that project. And I'm glad to give back to the Gambas community with what I've discovered. If it wasn't for cogier and the Gambas community, I wouldn't have known that any multi-processing option for Gambas even existed.

I'm sure there are people out there more experienced with Gambas multi-processing than I am, who might scoff at how I approached and solved the problems I was tackling. And that's fine. I'd welcome the scoffing, if they'd also offer a better approach than my own. :D I'd be happy to learn from them.

As for the question of when to use multi-processing: for the vast majority of Gambas programmers, multi-processing would needlessly over-complicate the code and be overkill. Gambas is already very well-optimized for speed. And whatever operations their programs perform will certainly be done more or less instantaneously.

I think the only conditions where users might need to consider multi-processing are when:

1) Vast amounts of data need to be processed, to the point that even modern processors slow down to a crawl when using the traditional single thread.

2) It's critical that the processing be completed in as little time as possible, and the current time is unacceptable. (Even slowed to a crawl, the processing will still eventually get done, so this only applies where speed really matters.) Examples may be: server operations, 3D gaming, scientific simulations.

Spending thousands of dollars on a new, more powerful computer is one option. But multi-processing, using the multi-core hardware you already have, is now a cheaper and more viable option. The reduction in time needed for processing, when multiple operations are performed in parallel, is proven to be very significant.

Thank you once again sjsepan for your reply and your interest.
