
Point 1 Setup

See also PointOneDcs.

Introduction

All DAQ-related servers, SBCs and ROSs are in USA15. We have 9 servers, pc-sct-mon-0n, where n = 1 to 9. Servers 1, 2, 3, 4 and 9 are generally used for the barrel DAQ, servers 5 and 6 for the endcapC DAQ, and servers 7 and 8 for the endcapA DAQ. For the physical layout of the ROD crates, SBCs, servers and associated ROSs, see the sections below.
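
As a quick reference, the server assignment above can be written as a small lookup. This is only a sketch: the function name sct_server_role is made up for illustration, and the mapping reflects the "generally used for" assignment stated above, which may change.

```shell
# Map a pc-sct-mon server number (1-9) to the DAQ it is generally used for.
# sct_server_role is a hypothetical helper name, not an existing tool.
sct_server_role() {
    case "$1" in
        1|2|3|4|9) echo "barrel" ;;
        5|6)       echo "endcapC" ;;
        7|8)       echo "endcapA" ;;
        *)         echo "unknown"; return 1 ;;
    esac
}

# List every server together with its usual role.
for n in 1 2 3 4 5 6 7 8 9; do
    printf 'pc-sct-mon-0%s  %s\n' "$n" "$(sct_server_role "$n")"
done
```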

Barrel ROD crates

The table below shows which crates service which barrel modules, as viewed from the front of the crates:

Rack Y.23            | Rack Y.22
-------------------------------------------
SBC0                 | SBC1
sbc-sct-rcc-bc-02    | sbc-sct-rcc-ba-02
C-side USA15 modules | A-side USA15 modules
ROS pc-sct-ros-bc-01 | ROS pc-sct-ros-ba-01 
ROBid 0x2201nn       | ROBid 0x2101nn
REB port 16          | REB port 14
-------------------------------------------
SBC2                 | SBC3
sbc-sct-rcc-bc-01    | sbc-sct-rcc-ba-01
C-side US15 modules  | A-side US15 modules
ROS pc-sct-ros-bc-00 | ROS pc-sct-ros-ba-00
ROBid 0x2200nn       | ROBid 0x2100nn
REB port 15          | REB port 13
-------------------------------------------
Server pc-sct-mon-08 | Server pc-sct-mon-06
-------------------------------------------
Server pc-sct-mon-07 | Server pc-sct-mon-05
-------------------------------------------

EndCap ROD crates

The table below shows which crates service which endcap modules, as viewed from the front of the crates:

Rack Y.24-14          | Rack Y.25-14
---------------------------------------------
SBC7                  | SBC5
sbc-sct-rcc-eca-02    | sbc-sct-rcc-ecc-02
ROS pc-sct-ros-eca-01 | ROS pc-sct-ros-ecc-01
12 RODs               | 11 RODs
ROBid 0x2301nn        | ROBid 0x2401nn
---------------------------------------------
SBC6                  | SBC4
sbc-sct-rcc-eca-01    | sbc-sct-rcc-ecc-01
ROS pc-sct-ros-eca-00 | ROS pc-sct-ros-ecc-00
11 RODs               | 12 RODs
ROBid 0x2300nn        | ROBid 0x2400nn
---------------------------------------------
Server pc-sct-mon-02  | Server pc-sct-mon-04
---------------------------------------------
Server pc-sct-mon-01  | Server pc-sct-mon-03
---------------------------------------------
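
The ROBid patterns in the two tables follow a regular scheme: the top byte (0x21 to 0x24) distinguishes barrel A, barrel C, endcap A and endcap C, and the middle byte (00 or 01) matches the ROS number. The sketch below decodes a ROB id into the corresponding ROS host on that basis. The byte layout is inferred from the tables only, not from any specification, and rob_to_ros is a made-up helper name.

```shell
# Decode an SCT ROB id (e.g. 0x220105) into the ROS host implied by the
# tables above. The layout is an inference from the tables, not a spec.
rob_to_ros() {
    id=$(( $1 ))                    # accepts hex input such as 0x220105
    top=$(( (id >> 16) & 0xff ))    # 0x21..0x24 selects the detector part
    mid=$(( (id >> 8) & 0xff ))     # 0x00 or 0x01 selects the ROS
    case "$top" in
        33) part=ba ;;              # 0x21: barrel A
        34) part=bc ;;              # 0x22: barrel C
        35) part=eca ;;             # 0x23: endcap A
        36) part=ecc ;;             # 0x24: endcap C
        *)  echo "unknown ROB id: $1" >&2; return 1 ;;
    esac
    case "$mid" in
        0|1) ;;
        *)  echo "unknown ROS index in ROB id: $1" >&2; return 1 ;;
    esac
    printf 'pc-sct-ros-%s-%02d\n' "$part" "$mid"
}

rob_to_ros 0x220105
```

The low byte (the "nn" in the tables) is ignored here, since the tables do not say what it encodes.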

Working with the Barrels

The barrel DAQ is an SLC4 system running tdaq-01-08-03.

Working with Endcap A

The endcapA DAQ is an SLC3 system running tdaq-01-07-00.

Working with Endcap C

The endcapC DAQ is an SLC3 system running tdaq-01-07-00.

What to do in case of problems

First of all, note that there is NO root access to Point 1 machines, although some limited root functionality is available via 'sudo'.

Your /tmp directory is full. Clear out some garbage and try again.
E.g. remove the ConfigurationLog file if there is one; it is also probably
safe to empty the /tmp/backup directory.

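
Before deleting anything, it can help to confirm that /tmp really is full and to see what is taking the space. The sketch below only inspects; it deletes nothing. The helper name tmp_usage_percent is made up for illustration, and the parsing assumes the POSIX output format of df -P.

```shell
# Print the percentage of /tmp (or of another directory given as $1)
# that is currently in use, as a bare number.
tmp_usage_percent() {
    df -P "${1:-/tmp}" | awk 'NR==2 { sub("%", "", $5); print $5 }'
}

echo "/tmp is $(tmp_usage_percent)% full"

# Largest entries first (sizes in kB), to help decide what to clear out.
du -sk /tmp/* 2>/dev/null | sort -rn | head -n 10
```
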
Exit from the host machine and ssh back to it, including the '-Y' option,
e.g. ssh -Y pc-sct-mon-07

This is a known DDC problem that happens intermittently. A fix is pending.
Select ddc_ct in the Run Control hierarchy in the IGUI and click the
'Clear error' button.
Then do the same for the top-level SCT, and then the RootController.
DDC commands cannot be sent from SctGUI, but DCS data will continue
to be displayed.

The GUI has probably hung because it is writing to a directory on a disk which
has a stale NFS mount. Normally this results in an exception and the GUI
still opens, but under certain conditions the GUI can hang.
Try 'ls /alldisks/pc-sct-mon-09' and, if nothing is visible, reboot pc-sct-mon-09.
If you do this, you might then need to reboot other machines (see below).

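
The 'ls' check above can itself hang on a stale mount, so a time limit is useful. The following is a minimal sketch, assuming GNU timeout is installed (it may not be on older SLC releases); dir_visible is a made-up helper name. A process stuck on a hard NFS mount may not respond to the timeout signal, so treat a hang here as a failed check too.

```shell
# Succeed only if the directory can be listed within a few seconds and is
# not empty. $1 is the directory, $2 an optional time limit in seconds.
dir_visible() {
    listing=$(timeout "${2:-5}" ls -A "$1" 2>/dev/null) || return 1
    [ -n "$listing" ]
}

if dir_visible /alldisks/pc-sct-mon-09; then
    echo "/alldisks/pc-sct-mon-09 looks fine"
else
    echo "/alldisks/pc-sct-mon-09 not visible: consider rebooting pc-sct-mon-09"
fi
```
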
rm_free_all_resources -p $TDAQ_PARTITION

Check that all the relevant servers and SBCs can see the logs directory.
For example, for the endcapC DAQ, the logs are in /alldisks/pc-sct-mon-05/logs.
If this is not visible from the SBC, first try to ssh to pc-sct-mon-05 and
see if the directory is there.
If not, reboot pc-sct-mon-05 by typing 'sudo reboot'.
Then reboot all the other machines in the endcapC DAQ (i.e. pc-sct-mon-06,
sbc-sct-rcc-ecc-01 and sbc-sct-rcc-ecc-02).

Another possibility is a temporary failure of the central file
servers, which means that some directories are not visible or
have a stale NFS mount. A general reboot of our machines should rectify
any problems.
In principle it should not matter in which order the machines are booted,
but I would advise booting pc-sct-mon-09 first and then waiting at least
2 minutes before booting all other machines.

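
The recommended order can be written down as a small plan generator. This is a sketch only: it prints the steps and does not touch any machine, and boot_plan is a made-up helper name. The 120 second wait is the "at least 2 minutes" from the advice above.

```shell
# Print a reboot plan for the given hosts: pc-sct-mon-09 always comes
# first, followed by a 2 minute pause, then the remaining hosts in the
# order given. Nothing is rebooted; the output is just a checklist.
boot_plan() {
    echo "boot pc-sct-mon-09"
    echo "wait 120"
    for host in "$@"; do
        if [ "$host" = "pc-sct-mon-09" ]; then
            continue
        fi
        echo "boot $host"
    done
}

# Example: the endcapC DAQ machines named elsewhere on this page.
boot_plan pc-sct-mon-05 pc-sct-mon-06 sbc-sct-rcc-ecc-01 sbc-sct-rcc-ecc-02
```
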
Start or reboot the machines relevant to the current DAQ.
These machines are all in USA15:
pc-sct-mon-01  Bottom of Rack Y.24-14
pc-sct-mon-02  Bottom of Rack Y.24-14
pc-sct-mon-03  Bottom of Rack Y.25-14
pc-sct-mon-04  Bottom of Rack Y.25-14
pc-sct-mon-05  Bottom of Rack Y.22-11
pc-sct-mon-06  Bottom of Rack Y.22-11
pc-sct-mon-07  Bottom of Rack Y.23-11
pc-sct-mon-08  Bottom of Rack Y.23-11
pc-sct-mon-09  Lower section of Rack Y.24-11 
**Do pc-sct-mon-09 first, and wait for 2 minutes before booting the others**

Then the SBCs (Single Board Computers), which are in the ROD crates and
in the TTC VME crate:
sbc-sct-tcc-01 In VME crate in top half of Rack Y.24-11
sbc-sct-rcc-ba-01 Bottom ROD crate rack Y.22-11
sbc-sct-rcc-ba-02 Top ROD crate rack Y.22-11
sbc-sct-rcc-bc-01 Bottom ROD crate rack Y.23-11
sbc-sct-rcc-bc-02 Top ROD crate rack Y.23-11
sbc-sct-rcc-eca-01 Bottom ROD crate rack Y.24-14
sbc-sct-rcc-eca-02 Top ROD crate rack Y.24-14
sbc-sct-rcc-ecc-01 Bottom ROD crate rack Y.25-14
sbc-sct-rcc-ecc-02 Top ROD crate rack Y.25-14

It might take 2 minutes to boot.
Then try to ssh to the machine(s).

If, some time after booting, there is still no response from a machine
(i.e. you cannot ping it or ssh to it), then either:
1) the network connection to that machine is down, OR
2) the boot server is down or there is no connection to it,
so the machine cannot load the boot image.

The only way to tell whether 1) or 2) applies is to connect
a monitor to the machine and watch the boot sequence (watch
it search for and find, or fail to find, the boot image).
There is a monitor for this purpose below the TTC crate.
Also check the status of the network LEDs on the back of the machines.
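
The ping and ssh checks mentioned above can be combined into one helper, sketched below. check_host is a made-up name. The sketch assumes the Linux iputils form of ping (-W is a timeout in seconds) and OpenSSH; with BatchMode=yes, ssh also fails if it would need to prompt for a password, so "ssh failed" does not necessarily mean the machine is down. Neither result distinguishes cause 1) from cause 2); that still needs the monitor.

```shell
# Report whether a host answers one ping and, if so, whether a
# non-interactive ssh login succeeds. Returns non-zero on any failure.
check_host() {
    if ! ping -c 1 -W 2 "$1" >/dev/null 2>&1; then
        echo "$1: no ping response"
        return 1
    fi
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1; then
        echo "$1: ping ok, ssh ok"
    else
        echo "$1: ping ok, ssh failed"
        return 1
    fi
}
```

To check several machines in one go, loop over them, for example: for h in pc-sct-mon-05 pc-sct-mon-06; do check_host "$h"; done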

After everything is back, ssh to the relevant servers and SBCs
to check that you can see the home directory and the /det/sct directory.
If not, then the central file servers are still down, and you have
to wait (and then reboot our machines yet again when the file servers
are available).
Also check that /alldisks/pc-sct-mon-09 is visible.

NOTE: we have no ssh access to our ROS machines; these are the responsibility
of the TDAQ sysadmin team.

Check that DDC is enabled in the configuration.  From the main DAQ GUI, open
the panel entitled "Segment and Resource".  Double click on the SCT_Segment 
to view its constituent members.  One of these will be named 
"SCT_DDC_Segment" or similar.  If it is disabled, right click to enable it,
then click on the "green arrow and grey cylinder" icon to save this change 
in the database. The change will be picked up when the DAQ is next booted.

Of course the DDC segment may already be enabled, yet DDC is not working.
This is most often true for commands.  Close down the DAQ, then restart the
DNS running on pcatlsctcr01.  Hopefully DDC will now work for a little while,
but if it does not, see the entry "MRS message at CONFIGURE" which describes
how to ignore the ddc_ct error and continue...

ROD firmware

Originally loaded:

Then the following were loaded to the ROD in slot 16 of crate 0 (sbc-sct-rcc-bc-02) (14/08/07):

Then the following were uploaded to slot 16 crate 0 on 07/09/07, and seem to work fine:

Now (12/11/07) we have the following on all RODs:

Installing an updated SctRodDaq

See also the HowToCompile notes.
For tdaq-01-08, compile on pcatsct06 using /daqsoft/sct/setup_tdaq18_slc4.sh.

To copy to point1, use $SCT_DAQ_ROOT/MiniUtils/makePointTarBall
and follow the scp instructions given by the script.

Then at point1:
cd /det/sct/tdaq-01-08-03
mkdir mySctRodDaq_directory
cd mySctRodDaq_directory
(copy the tarball into mySctRodDaq_directory and unpack it with tar -xvzf)
cd /det/sct/tdaq-01-08-03
rm SctRodDaq
ln -s mySctRodDaq_directory SctRodDaq
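
The steps above can be wrapped in one function so that the symlink is only switched once the tarball has unpacked cleanly. This is a sketch under assumptions: install_sctroddaq is a made-up name, the tarball is assumed to be gzip-compressed (as implied by tar -xvzf), and tar is assumed to support -C (GNU tar does). The extra guard against replacing a real directory is an addition, not part of the original procedure.

```shell
# Unpack a SctRodDaq tarball into its own directory under the release
# area and repoint the SctRodDaq symlink at it.
# Usage: install_sctroddaq <release_dir> <tarball> <new_dir_name>
install_sctroddaq() {
    release_dir=$1; tarball=$2; name=$3
    mkdir -p "$release_dir/$name" || return 1
    tar -xzf "$tarball" -C "$release_dir/$name" || return 1
    # Only ever replace an existing symlink, never a real directory.
    if [ -e "$release_dir/SctRodDaq" ] && [ ! -L "$release_dir/SctRodDaq" ]; then
        echo "SctRodDaq is not a symlink, not touching it" >&2
        return 1
    fi
    rm -f "$release_dir/SctRodDaq"
    # Relative link, equivalent to running ln -s inside release_dir.
    ln -s "$name" "$release_dir/SctRodDaq"
}
```

At point1 this would be called with /det/sct/tdaq-01-08-03 as the release directory, the path of the copied tarball, and mySctRodDaq_directory as the new directory name.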

NOTIFY the other experts (Bruce, Dave etc.), because the main DAQ, including
the main ATLAS DAQ, will use your installation!!
If this unnerves you, please make your own private installation and create
your own private OKS database (make a copy of /det/sct/tdaq-01-08-03/oks/sct)
with your own partition etc., and make it point to your new copy of SctRodDaq
(via the Variable object called SCT_DAQ_ROOT).

Miscellaneous