Mail from Chris to Bilge 14th Oct 2005
> I am nearly finished with writing the cosmics config files. I was
> wondering what changes must be made when we run in multicrate mode. I
> suppose there are two partitions...
> Can you send me an example?
To the extent that multicrate functionality can be claimed to
work, the only things you have to do are
(1) tell the system it needs to start up the necessary extra
SctApiCrateServers, and
(2) tell the top controller that the extra CrateControllers
exist.
The hierarchy goes:
Partitions contain
Crates which contain
Rods.
The so-called "multi crate" use of SctRodDaq has only (thus far)
concerned itself with how to run multiple crates within one
partition.
NOTHING has considered multiple partitions interacting with each
other. I believe separate partitions are intended to work quite
separately from each other ....
You do modifications (1) and (2) purely by editing the OKS config
database files, since these are the places where it is defined which
processes should start up.
Let's look at (1) first:
SctApplications.data.xml defines a
SctApiCrateServer0
object.
Effectively one needs to duplicate this entry in the file, but in the
second copy changing most "0"s to "1"s.
The new addition would look like:
<obj class="Application" id="SctApiCrateServer1">
<attr name="Name" type="string">"SctApiCrateServer1"</attr>
<attr name="Parameters" type="string">"-u UCID:0.1"</attr>
<attr name="RestartParameters" type="string">"-u UCID:0.1"</attr>
<attr name="ControlledByOnline" type="bool">1</attr>
<attr name="IfDies" type="enum">"Ignore"</attr>
<attr name="IfFailed" type="enum">"Ignore"</attr>
<attr name="StartAt" type="enum">"Boot"</attr>
<attr name="StopAt" type="enum">"Shutdown"</attr>
<attr name="InitTimeout" type="u32">15</attr>
<attr name="CommTimeout" type="u32">10</attr>
<attr name="InputDevice" type="string">""</attr>
<attr name="OutputDevice"
type="string">"${TDAQ_LOGS_PATH}/SctApiCrateServer1.out"</attr>
<attr name="ErrorDevice"
type="string">"${TDAQ_LOGS_PATH}/SctApiCrateServer1.err"</attr>
<attr name="StartIn" type="string">"${SCT_DAQ_ROOT}/SctApi"</attr>
<rel name="RunsOn">"Computer" "SBC-NUMBER-2"</rel>
<rel name="ProcessEnvironment" num="0"></rel>
<rel name="InitializationDependsFrom" num="2">
"RunControlApplication" "config_server"
"Application" "SCTAPIServerISServer"
</rel>
<rel name="ShutdownDependsFrom" num="0"></rel>
<rel name="Program">"Binary" "SctApiCrateServer"</rel>
<rel name="ExplicitTag">"" ""</rel>
<rel name="Uses" num="0"></rel>
</obj>
Note the "UCID:0.1" lines which were "UCID:0.0" lines in
SctApiCrateServer0.
Those are "UniqueCrateIDentifiers". Essentially they are
UCID:PartitionNumber.CrateNumberWithinThatPartition.
You have to use this to tell the crate controller which crate it
should manage.
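As a rough illustration of the duplication described above, the
following Python sketch copies the text of the SctApiCrateServer0 entry
and rewrites the name, the UCID and the RunsOn target. The function name
clone_crate_server, the abbreviated TEMPLATE entry and the
"SBC-NUMBER-1" id are made up for this example; in practice you would
paste the full entry from your own file and check the result by eye.

```python
import re

# Abbreviated stand-in for the real SctApiCrateServer0 entry (assumption:
# the real entry has more attributes, and "SBC-NUMBER-1" is a made-up id).
TEMPLATE = '''<obj class="Application" id="SctApiCrateServer0">
<attr name="Name" type="string">"SctApiCrateServer0"</attr>
<attr name="Parameters" type="string">"-u UCID:0.0"</attr>
<attr name="RestartParameters" type="string">"-u UCID:0.0"</attr>
<attr name="OutputDevice" type="string">"${TDAQ_LOGS_PATH}/SctApiCrateServer0.out"</attr>
<rel name="RunsOn">"Computer" "SBC-NUMBER-1"</rel>
<rel name="Program">"Binary" "SctApiCrateServer"</rel>
</obj>'''


def clone_crate_server(obj_xml, partition, crate, sbc_id):
    """Return a copy of the crate-0 entry adapted for another crate."""
    # Rename the object, its Name attribute and its log files.
    out = obj_xml.replace("SctApiCrateServer0", f"SctApiCrateServer{crate}")
    # UCID is PartitionNumber.CrateNumberWithinThatPartition.
    out = out.replace("UCID:0.0", f"UCID:{partition}.{crate}")
    # Point RunsOn at the Computer object describing the new crate's SBC.
    out = re.sub(
        r'(<rel name="RunsOn">"Computer" ")[^"]*(")',
        lambda m: m.group(1) + sbc_id + m.group(2),
        out,
    )
    return out


if __name__ == "__main__":
    print(clone_crate_server(TEMPLATE, partition=0, crate=1, sbc_id="SBC-NUMBER-2"))
```

The "Program" relation is left alone on purpose, since the binary name
"SctApiCrateServer" carries no crate number.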
Note also the "SBC-NUMBER-2" line -- this should actually refer to
the name of the object that defines the physical SBC relevant to that
crate. If not already defined there, this will need to be defined in
SctHardware.data.xml (called Hardware.data.xml prior to tdaq-01-04-00)
in the same manner that you define your existing/other SBC.
The corresponding entry might look like:
<obj class="Computer" id="SBC-NUMBER-2">
<attr name="Type" type="string">"linux"</attr>
<attr name="Name" type="string">"vme4.hep.phy.cam.ac.uk"</attr>
<attr name="RLogin" type="string">"ssh"</attr>
<attr name="HW_Tag" type="enum">"i686-slc3"</attr>
</obj>
Note that the host name (in the above example "vme4.hep.phy.cam.ac.uk")
must be EXACTLY as returned by /bin/hostname as run on the relevant
machine. Sometimes this is the fully qualified name (as above) and
sometimes it is the short form (e.g. "vme4"). A mismatch is hard to
debug, so make doubly sure you get it right when you create the entry.
[Aside: this silly process has been removed in tdaq-01-07 - from there
onwards one only has to specify the FQDN hostname (hostname -f).
Thankfully.]
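One way to catch a name mismatch early is to compare the configured
"Name" with what the machine itself reports. The sketch below is only a
suggestion and must be run on the SBC in question; the function names
diagnose and check_computer_name are invented here. It relies on
Python's socket.gethostname(), which normally reports the same string as
/bin/hostname.

```python
import socket


def diagnose(configured, reported):
    """Classify how a configured name relates to the reported host name."""
    if configured == reported:
        return "exact"
    # Same first label: typically one side is short and the other is the FQDN.
    if configured.split(".")[0] == reported.split(".")[0]:
        return "same-short-name"
    return "different"


def check_computer_name(configured):
    """Compare a configured name against this machine's own host name."""
    reported = socket.gethostname()
    return diagnose(configured, reported), reported


if __name__ == "__main__":
    verdict, reported = check_computer_name("vme4.hep.phy.cam.ac.uk")
    print(f"host reports {reported!r}: {verdict}")
```

Only the "exact" verdict satisfies the requirement in the text;
"same-short-name" is the short-form versus fully-qualified confusion
described above.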
Now let's look at (2):
Again, in SctApplications.data.xml you see the top level SctApi
object defined:
<obj class="Application" id="SctApiServer">
<attr name="Name" type="string">"SctApiServer"</attr>
<attr name="Parameters" type="string">"-crate UCID:0.0"</attr>
<attr name="RestartParameters" type="string">"-crate UCID:0.0"</attr>
<attr name="ControlledByOnline" type="bool">1</attr>
<attr name="IfDies" type="enum">"Restart"</attr>
<attr name="IfFailed" type="enum">"Restart"</attr>
<attr name="StartAt" type="enum">"Boot"</attr>
<attr name="StopAt" type="enum">"Shutdown"</attr>
<attr name="InitTimeout" type="u32">15</attr>
<attr name="CommTimeout" type="u32">10</attr>
<attr name="InputDevice" type="string">""</attr>
<attr name="OutputDevice"
type="string">"${TDAQ_LOGS_PATH}/SctApiServer.out"</attr>
<attr name="ErrorDevice"
type="string">"${TDAQ_LOGS_PATH}/SctApiServer.err"</attr>
<attr name="StartIn" type="string">"${SCT_DAQ_ROOT}/SctApi"</attr>
<rel name="RunsOn">"Computer" "Host"</rel>
<rel name="ProcessEnvironment" num="0"></rel>
<rel name="InitializationDependsFrom" num="3">
"RunControlApplication" "config_server"
"Application" "SCTAPIServerISServer"
"Application" "SctApiCrateServer0"
</rel>
<rel name="ShutdownDependsFrom" num="0"></rel>
<rel name="Program">"Binary" "SctApiServer"</rel>
<rel name="ExplicitTag">"" ""</rel>
<rel name="Uses" num="0"></rel>
</obj>
If you want it to manage more crates, then list them as extra
arguments, e.g. as follows:
<attr name="Parameters" type="string">"-crate UCID:0.0 -crate UCID:0.1 -crate UCID:0.5"</attr>
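If the crate list changes often, the argument string can be generated
rather than typed by hand. The following Python sketch builds the value
from (partition, crate) pairs; the helper names server_parameters and
parameters_attr are made up for this example. It assumes you also want
the same value in "RestartParameters", as in the SctApiServer entry
shown earlier.

```python
def server_parameters(crates):
    """Build the SctApiServer argument string from (partition, crate) pairs."""
    return " ".join(f"-crate UCID:{p}.{c}" for p, c in crates)


def parameters_attr(crates, name="Parameters"):
    """Format the complete <attr> line for the OKS file."""
    return f'<attr name="{name}" type="string">"{server_parameters(crates)}"</attr>'


if __name__ == "__main__":
    crates = [(0, 0), (0, 1), (0, 5)]
    print(parameters_attr(crates))
    print(parameters_attr(crates, name="RestartParameters"))
```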