Question for the HPP team:
I understand people can rent a virtual host (Xen) and CPU time for parallel processing jobs (OpenCL) on the HPP platform. In the case of virtual host sharing, what happens if there is a power failure or the physical host gets powered down? Is there some kind of "high availability" mechanism for these cases? Will it be possible to implement a virtual machine high-availability cluster on top of the HPP platform?
David
Hi usr64,
Of course, high availability is a major concern for the HPP platform, and it is ensured by combining three techniques:
- Automatic node deployment: if a node gets powered down, the Tasks' Scheduler performs an automatic deployment to another node using the replicated provider data.
- Node migration: if a node receives a normal shutdown signal from the user or OS, it sends the Tasks' Scheduler a snapshot of its current state before shutting down, and the Tasks' Scheduler migrates the tasks to an available node.
- Provider data replication: the HPP platform uses replication so that data can be recovered if a node gets powered down without migration.
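To make the difference between the three cases concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not HPP code: the `Node` and `Scheduler` classes, their method names, and the "least-loaded node" placement rule are all hypothetical. A graceful shutdown hands over the live snapshot, while a power failure can only fall back on the last replicated state.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    alive: bool = True
    tasks: dict = field(default_factory=dict)  # task id -> live state on this node


class Scheduler:
    """Toy stand-in for a task scheduler (hypothetical, not the HPP API)."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.placement = {}  # task id -> name of the node running it
        self.replicas = {}   # task id -> last replicated state

    def _place(self, task_id, state):
        # Assumed policy: pick the live node with the fewest tasks.
        candidates = [n for n in self.nodes.values() if n.alive]
        if not candidates:
            raise RuntimeError("no available node")
        target = min(candidates, key=lambda n: len(n.tasks))
        target.tasks[task_id] = state
        self.placement[task_id] = target.name
        self.replicas[task_id] = state  # replicate on every placement
        return target.name

    def submit(self, task_id, state):
        return self._place(task_id, state)

    def graceful_shutdown(self, name):
        # Node migration: the node sends a snapshot before going down.
        node = self.nodes[name]
        snapshot = dict(node.tasks)
        node.alive = False
        node.tasks.clear()
        return {tid: self._place(tid, st) for tid, st in snapshot.items()}

    def power_failure(self, name):
        # Automatic deployment: live state is gone, so use the replicas.
        node = self.nodes[name]
        node.alive = False
        node.tasks.clear()
        lost = [tid for tid, where in self.placement.items() if where == name]
        return {tid: self._place(tid, self.replicas[tid]) for tid in lost}


sched = Scheduler([Node("a"), Node("b"), Node("c")])

sched.submit("t1", 10)                 # lands on node "a"
sched.nodes["a"].tasks["t1"] = 25      # progress made, not yet replicated
moved = sched.graceful_shutdown("a")   # snapshot carries the value 25

sched.submit("t2", 5)                  # lands on node "c"
sched.nodes["c"].tasks["t2"] = 9       # progress made, not yet replicated
recovered = sched.power_failure("c")   # only the replicated value 5 survives
```

In this sketch the migrated task keeps its latest progress, whereas the task recovered after the power failure restarts from the last replicated value, so the replication frequency decides how much work can be lost.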
I noticed from your previous posts that you are very interested in technical details. Are you trying to build a competing platform?

Just kidding.

You are welcome.
Also, if you work in the same field (distributed systems), you are welcome to join the HPP team.
Best regards
E. Ramlin
HPP Lead Developer
Haha, I am just curious. I don't know much about distributed systems or HPC (practically nothing). I am more into Linux and know a little bit about systems programming, but nothing about parallel or GPU programming. I am not as smart as you guys!