Introduction
All DAQ-related servers, SBCs and ROSs are in USA15. We have 9 servers, pc-sct-mon-0n where n = 1 to 9. Servers 1, 2, 3, 4 and 9 are generally used for the barrel DAQ. Servers 5 and 6 are used for the endcapC DAQ. Servers 7 and 8 are used for the endcapA DAQ. For the physical layout of the ROD crates, SBCs, servers, and associated ROSs, see the sections below.
Barrel ROD crates
The diagram below shows which crates service which barrel modules, when viewed from the front of the crates:
Rack Y.23             | Rack Y.22
-------------------------------------------
SBC0                  | SBC1
sbc-sct-rcc-bc-02     | sbc-sct-rcc-ba-02
C-side USA15 modules  | A-side USA15 modules
ROS pc-sct-ros-bc-01  | ROS pc-sct-ros-ba-01
ROBid 0x2201nn        | ROBid 0x2101nn
REB port 16           | REB port 14
-------------------------------------------
SBC2                  | SBC3
sbc-sct-rcc-bc-01     | sbc-sct-rcc-ba-01
C-side US15 modules   | A-side US15 modules
ROS pc-sct-ros-bc-00  | ROS pc-sct-ros-ba-00
ROBid 0x2200nn        | ROBid 0x2100nn
REB port 15           | REB port 13
-------------------------------------------
Server pc-sct-mon-08  | Server pc-sct-mon-06
-------------------------------------------
Server pc-sct-mon-07  | Server pc-sct-mon-05
-------------------------------------------
EndCap ROD crates
The diagram below shows which crates service which endcap modules, when viewed from the front of the crates:
Rack Y.24-14          | Rack Y.25-14
-------------------------------------------
SBC7                  | SBC5
sbc-sct-rcc-eca-02    | sbc-sct-rcc-ecc-02
ROS pc-sct-ros-eca-01 | ROS pc-sct-ros-ecc-01
12 RODs               | 11 RODs
ROBid 0x2301nn        | ROBid 0x2401nn
-------------------------------------------
SBC6                  | SBC4
sbc-sct-rcc-eca-01    | sbc-sct-rcc-ecc-01
ROS pc-sct-ros-eca-00 | ROS pc-sct-ros-ecc-00
11 RODs               | 12 RODs
ROBid 0x2300nn        | ROBid 0x2400nn
-------------------------------------------
Server pc-sct-mon-02  | Server pc-sct-mon-04
-------------------------------------------
Server pc-sct-mon-01  | Server pc-sct-mon-03
-------------------------------------------
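The two layouts above suggest a regular pattern in the ROBids: the first byte (21, 22, 23, 24) follows barrel A, barrel C, endcap A, endcap C, and the second byte (00 or 01) follows the ROS number, with the SBC number one higher. The sketch below encodes only that pattern as read off the tables. The function name robid_to_hosts is hypothetical (it is not part of the SCT DAQ software), and the meaning of the trailing "nn" is not stated here, so it is simply ignored.

```shell
#!/bin/sh
# robid_to_hosts: hypothetical helper, derived only from the rack layout tables.
# Prints "<ROS host> <SBC host>" for a ROBid such as 0x220105.
robid_to_hosts() {
    local robid=$(( $1 ))                                   # hex input is accepted by shell arithmetic
    local det=$(printf '%02x' $(( (robid >> 16) & 0xff )))  # first byte: 21, 22, 23 or 24
    local idx=$(( (robid >> 8) & 0xff ))                    # second byte: 00 or 01
    local part
    case "$det" in
        21) part=ba ;;    # barrel, A side
        22) part=bc ;;    # barrel, C side
        23) part=eca ;;   # endcap A
        24) part=ecc ;;   # endcap C
        *)  echo "ROBid $1 does not match any crate in the tables" >&2; return 1 ;;
    esac
    if [ "$idx" -gt 1 ]; then
        echo "ROBid $1 does not match any crate in the tables" >&2
        return 1
    fi
    # In the tables, ROS nn pairs with SBC nn+1 (e.g. ros-bc-01 with rcc-bc-02).
    printf 'pc-sct-ros-%s-%02d sbc-sct-rcc-%s-%02d\n' "$part" "$idx" "$part" $(( idx + 1 ))
}
```

For example, robid_to_hosts 0x220105 prints the ROS and SBC of the top crate in Rack Y.23.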
Working with the Barrels
The barrel DAQ is an SLC4 system running tdaq-01-08-03.
- ssh sctswinsaller@pc-sct-mon-09
- source /det/sct/tdaq-01-08-03/setup.SCT.sh
- setup_daq -p $TDAQ_PARTITION -i logger
Working with Endcap A
The endcapA DAQ is an SLC3 system running tdaq-01-07-00.
- ssh sctswinsaller@pc-sct-mon-07
- source /det/sct/tdaq-01-07-00/setup.endcapA.sh
- RUNDAQ
- run logs are in /alldisks/pc-sct-mon-07/logs
Working with Endcap C
The endcapC DAQ is an SLC3 system running tdaq-01-07-00.
- ssh sctswinsaller@pc-sct-mon-05
- source /det/sct/tdaq-01-07-00/setup.endcapC.sh
- RUNDAQ
- run logs are in /alldisks/pc-sct-mon-05/logs
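The three start-up recipes above differ only in the host, the setup script and the start command. As a memory aid, a small lookup could print the right steps for a given DAQ. This is only a sketch: the function name daq_login_steps is hypothetical, and it just prints the commands listed in the sections above rather than running them.

```shell
#!/bin/sh
# daq_login_steps: hypothetical helper that prints (does not run) the
# start-up commands for one of the three DAQ systems described above.
daq_login_steps() {
    local host setup start
    case "$1" in
        barrel)
            host=pc-sct-mon-09
            setup=/det/sct/tdaq-01-08-03/setup.SCT.sh
            start='setup_daq -p $TDAQ_PARTITION -i logger'   # printed literally, expanded later by the user's shell
            ;;
        endcapA)
            host=pc-sct-mon-07
            setup=/det/sct/tdaq-01-07-00/setup.endcapA.sh
            start=RUNDAQ
            ;;
        endcapC)
            host=pc-sct-mon-05
            setup=/det/sct/tdaq-01-07-00/setup.endcapC.sh
            start=RUNDAQ
            ;;
        *)
            echo "usage: daq_login_steps barrel|endcapA|endcapC" >&2
            return 1
            ;;
    esac
    printf 'ssh sctswinsaller@%s\nsource %s\n%s\n' "$host" "$setup" "$start"
}
```

Calling daq_login_steps endcapC prints the three lines to type, in order.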
What to do in case of problems
First of all note there is NO ROOT access to point 1 machines, but there is some limited root functionality available via 'sudo'.
- "quota exceeded" error message after RUNDAQ or running the setup script
Your /tmp directory is full. Clear out some unneeded files and try again. For example, remove the ConfigurationLog file if there is one; it is also probably safe to empty the /tmp/backup directory.
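The clean-up just described could be scripted roughly as follows. This is a sketch under assumptions: the function name clean_tmp is hypothetical, and it assumes the ConfigurationLog file sits directly in /tmp, which is not stated above. The directory is a parameter so the function can be tried on a scratch directory first.

```shell
#!/bin/sh
# clean_tmp: hypothetical helper for the "quota exceeded" case.
# Assumption: ConfigurationLog lives directly in the given directory (default /tmp).
clean_tmp() {
    local tmpdir="${1:-/tmp}"
    # Remove the ConfigurationLog file if there is one (-f: no error if absent).
    rm -f -- "$tmpdir/ConfigurationLog"
    # Empty the backup directory, including hidden files, but keep the directory itself.
    if [ -d "$tmpdir/backup" ]; then
        find "$tmpdir/backup" -mindepth 1 -maxdepth 1 -exec rm -rf -- {} \;
    fi
}
```

Running df -h /tmp before and after shows whether enough space was recovered.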
- The TDAQ IGUI or SCT GUI is partly blanked out.
Exit from the host machine and ssh back to it including the '-Y' option, e.g. ssh -Y pc-sct-mon-07
- MRS error at CONFIGURE stating 'PVSS project not running'
This is a known DDC problem that happens intermittently. A fix is pending. Select ddc_ct in the Run Control hierarchy in the IGUI and click the 'Clear error' button. Then do the same for the top level SCT, and then the RootController. DDC commands cannot be sent from SctGUI, but DCS data will continue to be displayed.
- SCT GUI seems to start but doesn't open on the screen
The GUI has probably hung because it is writing to a directory on a disk which has a stale NFS mount. Normally this results in an exception and the GUI still opens, but under certain conditions the GUI can hang. Try 'ls /alldisks/pc-sct-mon-09' and, if nothing is visible, reboot pc-sct-mon-09. If you do this, you might then need to reboot other machines (see below).
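Because an ls on a stale mount can itself block, the check above can be wrapped in a time limit. The following is a sketch only: the function name dir_visible is hypothetical, and it assumes the timeout command from GNU coreutils is installed, which may not be the case on older systems. On some mounts a blocked ls cannot be interrupted at all, in which case this wrapper will not help either.

```shell
#!/bin/sh
# dir_visible: hypothetical helper. Succeeds only if the directory can be
# listed within the time limit AND contains at least one entry.
# Assumption: timeout(1) from GNU coreutils is available.
dir_visible() {
    local dir="$1" wait_s="${2:-10}"
    local listing
    # ls -A lists everything except . and ..; a failure or a timeout both return non-zero.
    listing=$(timeout "$wait_s" ls -A "$dir" 2>/dev/null) || return 1
    [ -n "$listing" ]
}
```

For instance, dir_visible /alldisks/pc-sct-mon-09 || echo "nothing visible" reports the condition under which the text above suggests a reboot.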
- TDAQ GUI doesn't allow you to send commands (don't do this in global runs!)
rm_free_all_resources -p $TDAQ_PARTITION
- Some processes don't start
Check that all the relevant servers and SBCs can see the logs directory. For example, for the endcapC DAQ, the logs are in /alldisks/pc-sct-mon-05/logs. If this is not visible from the SBC, first try to ssh to pc-sct-mon-05 and see if the directory is there. If not, reboot pc-sct-mon-05 by typing 'sudo reboot'. Then reboot all the other machines in the endcapC DAQ (ie pc-sct-mon-06, sbc-sct-rcc-ecc-01 and sbc-sct-rcc-ecc-02). Another possibility is a temporary failure of the central file servers, which means that some directories are not visible or have a stale NFS mount. A general reboot of our machines should rectify any problems.
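The visibility check above can be run over all machines of one DAQ in a single loop. This is a sketch, not an existing tool: the function name check_logs_visible and the variable ENDCAPC_HOSTS are hypothetical, and it assumes you can ssh to each machine without a password prompt (BatchMode makes ssh fail instead of asking).

```shell
#!/bin/sh
# Machines of the endcapC DAQ, as listed in the text above.
ENDCAPC_HOSTS="pc-sct-mon-05 pc-sct-mon-06 sbc-sct-rcc-ecc-01 sbc-sct-rcc-ecc-02"

# check_logs_visible DIR HOST...: hypothetical helper.
# Reports, per host, whether DIR exists there; returns non-zero if any host fails.
check_logs_visible() {
    local dir="$1"
    shift
    local host bad=0
    for host in "$@"; do
        # </dev/null stops ssh from swallowing the caller's standard input.
        if ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" "test -d '$dir'" </dev/null; then
            echo "$host: OK"
        else
            echo "$host: cannot see $dir"
            bad=1
        fi
    done
    return "$bad"
}
```

A typical call would be check_logs_visible /alldisks/pc-sct-mon-05/logs $ENDCAPC_HOSTS. A host that is down and a host that lacks the directory are reported the same way, so follow up by hand.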
- Booting hierarchy
In principle it should not matter which machines are booted in what order. But I would advise booting pc-sct-mon-09 first, then wait for at least 2 minutes, before booting all other machines.
- What to do after a power cut
Start or reboot the machines relevant to the current DAQ. These machines are all in USA15:
  pc-sct-mon-01   Bottom of Rack Y.24-14
  pc-sct-mon-02   Bottom of Rack Y.24-14
  pc-sct-mon-03   Bottom of Rack Y.25-14
  pc-sct-mon-04   Bottom of Rack Y.25-14
  pc-sct-mon-05   Bottom of Rack Y.22-11
  pc-sct-mon-06   Bottom of Rack Y.22-11
  pc-sct-mon-07   Bottom of Rack Y.23-11
  pc-sct-mon-08   Bottom of Rack Y.23-11
  pc-sct-mon-09   Lower section of Rack Y.24-11
**Do pc-sct-mon-09 first, and wait for 2 minutes before booting the others**
Then the SBCs (Single Board Computers) which are in the ROD crates and in the TTC vme crate:
  sbc-sct-tcc-01       In vme crate in top half of Rack Y.24-11
  sbc-sct-rcc-ba-01    Bottom ROD crate rack Y.22-11
  sbc-sct-rcc-ba-02    Top ROD crate rack Y.22-11
  sbc-sct-rcc-bc-01    Bottom ROD crate rack Y.23-11
  sbc-sct-rcc-bc-02    Top ROD crate rack Y.23-11
  sbc-sct-rcc-eca-01   Bottom ROD crate rack Y.24-14
  sbc-sct-rcc-eca-02   Top ROD crate rack Y.24-14
  sbc-sct-rcc-ecc-01   Bottom ROD crate rack Y.25-14
  sbc-sct-rcc-ecc-02   Top ROD crate rack Y.25-14
It might take 2 minutes to boot. Then try to ssh to the machine(s). If, sometime after booting, there is still no response from a machine (ie you cannot ping or ssh to it), then either:
  1) the network connection to that machine is down, OR
  2) the boot server is down or there is no connection to it, so the machine cannot load the boot image.
The only way to verify whether 1) or 2) is valid is to connect a monitor to the machine and watch the boot sequence (watch it search for and find, or fail to find, the boot image). There is a monitor for this purpose below the TTC crate. Also check the status of network LEDs on the back of the machines.
After everything is back, ssh to the relevant servers and SBCs to check that you can see the home directory and the /det/sct directory. If not, then the central file servers are still down, and you have to wait (and then reboot our machines yet again when the file servers are available).
Also check that /alldisks/pc-sct-mon-09 is visible. NOTE: we have no ssh access to our ROS machines, these are the responsibility of the TDAQ sysadmins team.
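After a power cut, a quick pass over all servers and SBCs shows which ones still do not answer. The sketch below keeps the host lists from this entry, with pc-sct-mon-09 first to match the recommended boot order. The names SCT_SERVERS, SCT_SBCS and report_unreachable are hypothetical, and the ping options (-c for the packet count, -W for the timeout in seconds) assume the Linux iputils ping.

```shell
#!/bin/sh
# Hosts from the power-cut entry above; pc-sct-mon-09 is listed first on purpose.
SCT_SERVERS="pc-sct-mon-09 pc-sct-mon-01 pc-sct-mon-02 pc-sct-mon-03 pc-sct-mon-04 pc-sct-mon-05 pc-sct-mon-06 pc-sct-mon-07 pc-sct-mon-08"
SCT_SBCS="sbc-sct-tcc-01 sbc-sct-rcc-ba-01 sbc-sct-rcc-ba-02 sbc-sct-rcc-bc-01 sbc-sct-rcc-bc-02 sbc-sct-rcc-eca-01 sbc-sct-rcc-eca-02 sbc-sct-rcc-ecc-01 sbc-sct-rcc-ecc-02"

# report_unreachable HOST...: hypothetical helper.
# Prints the name of every host that does not answer a single ping.
report_unreachable() {
    local host
    for host in "$@"; do
        # Assumption: Linux ping, one packet, 2 second wait.
        ping -c 1 -W 2 "$host" >/dev/null 2>&1 || echo "$host"
    done
}
```

Running report_unreachable $SCT_SERVERS $SCT_SBCS a few minutes after booting lists the machines that need the monitor-and-LED inspection described above. An empty output only means the machines answer ping; the directory checks still have to be done by ssh.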
- DDC is not working
Check that DDC is enabled in the configuration. From the main DAQ GUI, open the panel entitled "Segment and Resource". Double click on the SCT_Segment to view its constituent members. One of these will be named "SCT_DDC_Segment" or similar. If it is disabled, right click to enable it, then click on the "green arrow and grey cylinder" icon to save this change in the database. The change will be picked up when the DAQ is next booted.
Of course the DDC segment may already be enabled, yet DDC is not working. This is most often true for commands. Close down the DAQ, then restart the DNS running on pcatlsctcr01. Hopefully DDC will now work for a little while, but if it does not, see the entry "MRS error at CONFIGURE stating 'PVSS project not running'" above, which describes how to clear the ddc_ct error and continue.
ROD firmware
Originally loaded:
- ROD Controller: f20
- ROD Formatter: f23
- ROD EFB: f1d
- ROD Router: f1b
Then the following were loaded to the ROD in slot 16 of crate 0 (sbc-sct-rcc-bc-02) (14/08/07):
- ROD Controller: f23
- ROD Formatter: f23
- ROD EFB: f1f
- ROD Router: f1b
Then the following were uploaded to slot 16 crate 0 on 07/09/07, and seem to work fine:
- Controller: v2ef
- Formatter: s23f
- EFB: v22f
- Router: v1ef
Now (12/11/07) we have the following on all RODs:
- Controller f30
- Fmt f25
- EFB f25
- RTR f1e
Installing an updated SctRodDaq
See also the HowToCompile notes.
If tdaq-01-08, compile on pcatsct06 with /daqsoft/sct/setup_tdaq18_slc4.sh
To copy to point1, use $SCT_DAQ_ROOT/MiniUtils/makePointTarBall and follow the scp instructions given by the script. Then in point1:
- cd /det/sct/tdaq-01-08-03
- mkdir mySctRodDaq_directory
- cd mySctRodDaq_directory
- copy the tar ball to mySctRodDaq_directory and tar -xvzf
- cd /det/sct/tdaq-01-08-03
- rm SctRodDaq
- ln -s mySctRodDaq_directory SctRodDaq
NOTIFY other experts (Bruce, Dave etc) because the main DAQ will use your installation, including the main ATLAS DAQ !! If this unnerves you, please make your own private installation and create your own private OKS database (make a copy of /det/sct/tdaq-01-08-03/oks/sct) with your own partition etc. and make it point to your new copy of SctRodDaq (the Variable object called SCT_DAQ_ROOT).
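The point1 part of the installation (unpack into a new directory, then repoint the SctRodDaq link) can be sketched as one function. The name install_sctroddaq and its three arguments are hypothetical. It unpacks with tar -C rather than copying the tar ball into the directory first, which gives the same result, and it takes the base directory as a parameter so it can be tried on a scratch area before touching /det/sct/tdaq-01-08-03.

```shell
#!/bin/sh
# install_sctroddaq BASE TARBALL NAME: hypothetical helper.
# Unpacks TARBALL into BASE/NAME and makes BASE/SctRodDaq a link to NAME.
install_sctroddaq() {
    local base="$1" tarball="$2" name="$3"
    mkdir -p "$base/$name" || return 1
    tar -xzf "$tarball" -C "$base/$name" || return 1
    # rm -f removes an existing link; it refuses to remove a real directory,
    # so a non-link SctRodDaq makes the function fail instead of losing data.
    ( cd "$base" && rm -f SctRodDaq && ln -s "$name" SctRodDaq )
}
```

A call such as install_sctroddaq /det/sct/tdaq-01-08-03 /path/to/tarball mySctRodDaq_directory (the tar ball path is a placeholder) switches the main DAQ to the new installation immediately, so the notification to the other experts still applies.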
Miscellaneous
- To see the USB memory stick on (say) pc-sct-cr-01, mount /mnt/flash. When finished, umount /mnt/flash
- Getting working on pcatsct06 as sctroddq: [pcatsct06] /work/srsctdaq1 > source /work/pcphsctr04/daqsoft/sct/setup_tdaq17.sh