There are roughly two kinds of threads: the ones which spend most of their time waiting, and the ones that work hard for a short period of time.
The waiting threads should just be created on-demand. If you want to communicate with them, create a thread-safe queue and give them a pointer to it. Or use a socket. It doesn't matter much. Try hard not to use shared memory.
You can mostly ignore them after creation; it doesn't matter if they wait for days at a time. And if the application terminates, they will be removed automatically. But it is better to have them periodically check whether they should terminate, so they can clean up after themselves.
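As a rough sketch of such a waiting thread (Python here, purely for illustration; the names waiter, inbox, outbox and stop are made up for this example), the thread blocks on a thread-safe queue but wakes up every 0.1 seconds to check a stop flag:

```python
import queue
import threading

def waiter(inbox, outbox, stop):
    """Wait on the inbox; wake up regularly to see whether we should stop."""
    while not stop.is_set():
        try:
            item = inbox.get(timeout=0.1)  # blocks for at most 0.1 s
        except queue.Empty:
            continue  # nothing arrived, go back and re-check the stop flag
        outbox.put(item.upper())  # stand-in for the real work
    # Clean-up (closing files, flushing buffers, ...) would go here.

def run_waiter_demo():
    inbox, outbox, stop = queue.Queue(), queue.Queue(), threading.Event()
    thread = threading.Thread(target=waiter, args=(inbox, outbox, stop))
    thread.start()
    inbox.put("hello")
    inbox.put("world")
    replies = [outbox.get(timeout=5), outbox.get(timeout=5)]
    stop.set()     # ask the thread to terminate...
    thread.join()  # ...and wait until it has done so
    return replies
```

The only things the two sides share are the queues and the stop flag; the timeout on get() is what makes the periodic check possible.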
The threads that work hard for a short period of time are often collected in a thread pool, so that there is a maximum number allowed to be active at any time. Often they are used by having them all run the same task, each on a different piece of some shared memory, while you wait until they have all finished.
That is completely wrong. In that case it is usually faster to just use a single thread to do the calculations.
The only good way to divide a big task and run it in parallel is to first split the data into separate, independent pieces, and then start a separate thread for each piece. Give each thread a way to post back its result (a queue, a socket, or a file on disk) and forget about it. It should terminate and clean up when it is done.
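A minimal sketch of that pattern, assuming the "big task" is just summing a list (sum_chunk and parallel_sum are hypothetical names for this example). Each thread gets its own copy of one piece and posts its result to a queue:

```python
import queue
import threading

def sum_chunk(index, chunk, results):
    """Work on one independent piece and post the result back."""
    results.put((index, sum(chunk)))

def parallel_sum(data, pieces):
    # Split the data first. Slicing copies, so no piece is shared between threads.
    size = max(1, -(-len(data) // pieces))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    results = queue.Queue()
    for index, chunk in enumerate(chunks):
        threading.Thread(target=sum_chunk, args=(index, chunk, results)).start()
    # One blocking get() per piece; results may arrive in any order.
    partial = [results.get() for _ in chunks]
    return sum(value for _index, value in partial)
```

Note that in standard CPython the global interpreter lock keeps CPU-bound threads like these from actually running in parallel, so this only shows the structure, not a speed-up.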
And because splitting the data takes time, as does rebuilding it afterwards, it won't cause many problems if you just start a task (thread) as soon as you're ready to do so.
Even in that case, the bottleneck is in the process that coordinates it. So it's probably not faster than just doing it all in a single thread.
A good example of how to do it is a server process: it sits on a socket, waiting for activity. When something happens, it starts a new thread to handle the connection, which in turn probably spawns a new thread to process each transaction.
All those threads do their thing, terminate and clean themselves up, except perhaps the server process itself.
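Roughly, such a server could look like the sketch below. It is a toy that upper-cases whatever it receives; handle_connection, serve and echo_once are names invented for this example, and it spawns one thread per connection only (no extra thread per transaction):

```python
import socket
import threading

def handle_connection(conn):
    """Runs in its own thread; terminates and cleans up when the peer closes."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break  # peer closed the connection
            conn.sendall(data.upper())

def serve(listener, stop):
    """Sit on the socket; start a new thread for every incoming connection."""
    listener.settimeout(0.2)  # wake up regularly to check the stop flag
    with listener:
        while not stop.is_set():
            try:
                conn, _addr = listener.accept()
            except socket.timeout:
                continue
            threading.Thread(target=handle_connection, args=(conn,)).start()

def echo_once(message):
    """Start the server on a free local port, send one message, shut down."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    stop = threading.Event()
    server = threading.Thread(target=serve, args=(listener, stop))
    server.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        reply = client.recv(1024)
    stop.set()
    server.join()
    return reply
```

The server thread never keeps track of the handler threads it starts; each one ends by itself when its connection is closed.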
And if you want them to communicate with their parent, give them a pointer to a queue, or make sure the parent does not terminate until all the child threads have terminated.
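Both options can be combined. In this sketch (child and run_children are hypothetical names) the parent owns the queue, hands it to every child, and joins all of them before it reads the results and returns:

```python
import queue
import threading

def child(n, results):
    results.put(n * n)  # report back to the parent through the queue

def run_children(count):
    results = queue.Queue()  # owned by the parent, so it outlives the children
    children = [threading.Thread(target=child, args=(n, results))
                for n in range(count)]
    for thread in children:
        thread.start()
    for thread in children:
        thread.join()  # the parent does not go on until every child has terminated
    # All children are done, so every result is already in the queue.
    return sorted(results.get_nowait() for _ in range(count))
```

Because the parent joins first, it knows exactly how many results to expect and never has to guess whether a child is still alive.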
Inter-thread communication is tricky business, because you either don't know whether the receiver still exists (self-terminating threads) or whether it is still doing the same task (thread pool).