Ok, so I got this working 100%. Quite a feat!
And I must give credit to cogier and his friend for letting me know that the Task method even existed, along with an example project to work with.
FYI, in this post I use Task, Fork, parallel processing, multi-processing interchangeably. Its all referring to the same thing.
In my earlier experimental project Task_Test, everything worked great. Except for one huge problem.
Public/Global arrays that were filled with hundreds of sorted records (proven with Print statements), called from within the Task/Fork processes, became inexplicably reset to Null, the instant the Task/Fork processes completed.
After looking into this matter, it turns out that Fork processes are copies of parts of the parent Gambas program, which are given over to the system, thereby allowing multi-processing to occur. The problem with this is, these Forked processes don't appear to have any way to communicate the results of their work back to the parent Gambas program. The Forks complete their duties, then POOF they're gone. Along with any variables containing post-processed data your Gambas program may have been expecting.
Well, if we can't get any post-processed work out of the Fork processes, then what good are they? (I know, right?) But, I've come up with a working solution, thanks to cogier. One simply needs to File.Save the post-processed resulting variable from within the Forked process, and then File.Load that saved variable into your Gambas program. This gives you all the benefits of multi-processing without Gambas needing to strictly have built-in support for it.
All fine and good. But what if the resulting variable is an array? What if its a monster structured class array from h*ll, like Godzilla has a fondness for working with, in his Task_Test project? Something like that is beyond the capabilities of File.Save and File.Load.
I thought I was out of luck. However, I've found a solution thanks to code posted by Jussi Lahtinen in 2013. Object serialization, which is just a fancy term for a way of saving and loading variables of essentially infinite complexity, using very fast binary files (the routines are SaveValues and LoadValues). So thanks to Jussi, Task_Test is now a 100% working project, updated here as Task_Test_Working.
Jussi's SaveValues and LoadValues routines were, however, incomplete. In that they were missing support for the variable types: Single, Float, Variant, Object, and Pointer. I've completed his routines to include all that were left out. I haven't strictly tested each of these additional variable types. But in theory, any variable type of any level of complexity you can throw at SaveValues and LoadValues should work.
Instructions for the Task_Test_Working project:
When you run the program, it will tell you how many threads your CPU is capable of using. "Available CPU threads: 8" for me, though 4 are hyper-threads which is fine. Pressing the Go button will generate a structured array containing 5000 (by default) records generated at random. You can change it to whatever number you like in the textbox. Higher numbers increase the duration of CPU load (exponentially) and lower numbers decrease the duration of CPU load. Once you decide on a new number, simply press Go again (pressing Go additional times has been fixed).
Assuming your CPU has 4 cores or hyper-threads, you can watch all the various subroutines being computed in parallel on your System Monitor > Resources Tab > CPU History graph. Fantastic!
If you tick the Parallel vs. Sequential checkbox, then press the Go button, it will run a normal multi-process Task, and print a time benchmark in the console on completion. Immediately after, it runs the exact same set of data using the exact same subroutines, but in sequence, and also giving a time benchmark in the console on completion.
Parallel vs. Sequential's two time benchmarks allow you to see and appreciate how much time you save using parallel processing over sequential processing, when it comes to CPU-intensive processes. I didn't do extensive tests, but it seems the time saved using multi-processing seems to increase as the number of records you choose to process increases. But I don't really know. Play around with it if you like. Run a controlled series of tests to find out whatever there is to find out.
You can, of course, call any set of subroutines to run in parallel that you want. Your subroutines don't have to use Public/Global variables (scope is immaterial in Forked processes), they don't have to return complex variables or arrays, and you don't have to use the SaveValues and LoadValues routines. You may not want any information returned from Fork processes at all. Just use it however it would be beneficial to you, if you need powerful computing done as quickly as possible.
As it is in this project, the Task method is hard-coded to use 4 threads, regardless of your CPU capability. It would be a nice addition to this code if the number of threads could be assigned dynamically according to how many are available to the CPU.
Also, let's say you have 16 CPU-intensive subroutines to call. I'm not sure how one could "queue" subroutines and have them be executed as CPU threads become available. It would be very interesting to be able to do this.
So there's much more to multi-processing in Gambas to think about. But at least we have our foot in the door now. This thread is very happily [SOLVED] thanks to cogier and his friend. But lets continue to develop and build on ideas for Tasking/Forking. Its something I couldn't be happier with or excited about.
Feel free to ask questions if you're trying to implement any of this into your own project. We're here to help.
- (34.46 KiB) Downloaded 33 times