What is the SimGrid way of modeling abnormal situations in a network of hosts? For example, how do I model a broken link on a route? The master creates a task and dsends it to a worker. If the link is broken, the task is lost, but the master does not know about the broken link and may continue to dsend new tasks to the worker.
UPDATED
I added a .fail file through the state_file attribute in platform.xml. Its contents:
PERIODICITY 10.0
1.0 1
2.0 0
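To make the state file above easier to read, here is a minimal Java sketch that parses it into a list of state changes. It is not part of SimGrid: the `StateTrace` class and its fields are hypothetical names, and the sketch assumes that each line after the header is a `time state` pair in which 1 means the resource is on and 0 means it is off. The PERIODICITY value is only stored, not interpreted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a SimGrid class): reads the state file shown above. */
public class StateTrace {
    /** One line of the file: a timestamp and the state that starts at that time. */
    public static final class Event {
        public final double time;
        public final boolean up; // assumption: 1 = resource on, 0 = resource off

        Event(double time, boolean up) {
            this.time = time;
            this.up = up;
        }
    }

    /** Value after PERIODICITY, stored as-is and not interpreted; -1 if absent. */
    public double periodicity = -1.0;
    public final List<Event> events = new ArrayList<>();

    public static StateTrace parse(String text) {
        StateTrace trace = new StateTrace();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Unexpected line: " + line);
            }
            if (parts[0].equals("PERIODICITY")) {
                trace.periodicity = Double.parseDouble(parts[1]);
            } else {
                double time = Double.parseDouble(parts[0]);
                boolean up = Integer.parseInt(parts[1]) != 0;
                trace.events.add(new Event(time, up));
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        StateTrace trace = parse("PERIODICITY 10.0\n1.0 1\n2.0 0\n");
        for (Event e : trace.events) {
            System.out.println(e.time + " -> " + (e.up ? "ON" : "OFF"));
        }
    }
}
```

Under these assumptions the file turns the resource off at time 2.0, which lines up with the 2.000000 timestamp in the log.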
The following error occurs when the worker stops working.
How should I handle it?
** SimGrid: UNCAUGHT EXCEPTION received on java(2): category: action canceled; value: 0
** �;
** Thrown by LHCb.Tier1() in this process
[Tier1_1:LHCb.Tier1:(2) 2.000000] /builds/workspace/SimGrid-Multi/build_mode/Debug/node/simgrid-ubuntu-trusty-64/build/SimGrid-3.13/src/xbt/ex.c:140: [xbt_ex/CRITICAL] �;
** In _ZN7simgrid4java11JavaContext4stopEv() at /builds/workspace/SimGrid-Multi/build_mode/Debug/node/simgrid-ubuntu-trusty-64/build/SimGrid-3.13/src/bindings/java/JavaContext.cpp:144
** In SIMIX_process_yield() at /builds/workspace/SimGrid-Multi/build_mode/Debug/node/simgrid-ubuntu-trusty-64/build/SimGrid-3.13/src/simix/smx_process.cpp:1014
** In simcall_execution_wait() at /builds/workspace/SimGrid-Multi/build_mode/Debug/node/simgrid-ubuntu-trusty-64/build/SimGrid-3.13/src/simix/libsmx.cpp:276
** In MSG_parallel_task_execute() at /builds/workspace/SimGrid-Multi/build_mode/Debug/node/simgrid-ubuntu-trusty-64/build/SimGrid-3.13/src/msg/msg_gos.cpp:90
** In MSG_host_del_task() at /builds/workspace/SimGrid-Multi/build_mode/Debug/node/simgrid-ubuntu-trusty-64/build/SimGrid-3.13/src/msg/msg_vm.cpp:521
** In ExceptionOccurred() at /usr/lib/jvm/java-7-openjdk-amd64/include/jni.h:825
** In ?? at [0x7f7aa8e09d98]
Everything you need for that exists in SimGrid under the name of failures, although it is unfortunately sparsely documented. You want to attach a state trace file to your hosts or links.
Please refer to the documentation or the platform tutorial. You can find an example of use in the archive, in the file examples/platforms/faulty_host.xml.
Note that this example describes host failures, but link failures work exactly the same way: a link can be given a state file in the XML too.