I'm writing a Windows service whose continued operation is important. To deal with crashes, the OS can be configured to restart it automatically. However, the service may also simply get stuck. I've already seen this happen when network calls go wrong; once, OpenSSL got into a faulty state and blocked forever. I'm now looking for the best way to terminate the process when it appears to have done no work for too long. A few options I can think of:
1. Doing the work in a separate thread that is monitored from the main thread
2. Doing the work in a child process that is monitored by the parent
3. Doing the work in the parent process that is monitored by a child
4. Doing the work in a process that is scheduled to be killed by Windows
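To make option 1 concrete, this is roughly the heartbeat pattern I have in mind. It is only a minimal, portable sketch: `Heartbeat` and `watch` are names I made up for illustration, the timeouts are placeholders, and it assumes the worker can call `beat()` at sensible points in its loop.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper: the worker calls beat() whenever it makes progress,
// and the monitor asks stalled() whether the last beat is too old.
class Heartbeat {
public:
    Heartbeat() : last_(now_ms()) {}

    // Called by the worker thread after each unit of work.
    void beat() { last_.store(now_ms(), std::memory_order_relaxed); }

    // True if more than `limit` has elapsed since the last beat.
    bool stalled(std::chrono::milliseconds limit) const {
        return now_ms() - last_.load(std::memory_order_relaxed) > limit.count();
    }

private:
    // Monotonic clock, so wall-clock adjustments cannot trigger a false alarm.
    static std::int64_t now_ms() {
        using namespace std::chrono;
        return duration_cast<milliseconds>(
                   steady_clock::now().time_since_epoch()).count();
    }

    std::atomic<std::int64_t> last_;
};

// Monitor loop, meant to run on the main (or a dedicated) thread.
// Returns true if the worker stalled, false if `stop` was set first.
bool watch(const Heartbeat& hb, const std::atomic<bool>& stop,
           std::chrono::milliseconds limit, std::chrono::milliseconds poll) {
    assert(poll.count() > 0);  // a zero poll interval would busy-spin
    while (!stop.load()) {
        if (hb.stalled(limit)) {
            return true;
        }
        std::this_thread::sleep_for(poll);
    }
    return false;
}
```

If `watch` returns true, the service would end itself (for example with `TerminateProcess(GetCurrentProcess(), 1)`) and rely on the configured recovery action to restart it. This obviously only helps if the monitoring thread itself keeps running while the worker is stuck, which is what the questions below are about.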
First of all, I'm not sure whether I need process isolation. Can one thread cause the whole process to hang, i.e. all threads, including those it has no interaction with? Process isolation might be ideal, but it also brings complications (e.g. logging to the same file) that I could do without.
If 1 is not good enough, do I really need to go for 2 or 3, or could 4 work well enough? If a scheduled timer-queue timer fires to terminate the process when it is stuck, does the fact that the callback runs on a Windows-owned thread make it any different (in terms of reliability) from 1? Would it still work when the process is completely stuck? Is there another, better way to let Windows handle the situation?
Between 2 and 3, is there any meaningful difference?