Alan's findings as of 12/7/05
The TIM needs to be set up as described in http://www.hep.ucl.ac.uk/atlas/sct/tim/. The event counts were only intermittently right on some slots, which hinted at a mismatch between the trigger (synched to all slots) and the trigger counter (bussed). In at least one TIM 3B, the switch logic was upside down.
Observations (before fixing the TIM with the upside-down switch logic)
- With 2 modules, HV at 20 V (i.e. relatively high occupancy), 50k trigs/bin and 101 bins, scan.tim=1, scan.nth=15, single L1A, Qth=0.8 fC:
- 100 kHz triggers, 0.01 Hz reset: errors and overflow, then no histogramming
- 100 kHz triggers, 1 Hz reset: ran OK.
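To put the two reset rates in context, here is a rough timing estimate for the scan settings listed above. This is only a sketch: `ScanTiming` and `scanTiming` are hypothetical helpers (not part of SctApi), and it assumes triggers are sent continuously at the nominal rate with no dead time between bins.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of SctApi: rough timing of a trigger scan.
struct ScanTiming {
    double totalTriggers;     // triggers summed over all bins
    double durationSeconds;   // time needed to send them at the trigger rate
    double resetsDuringScan;  // expected number of periodic resets in that time
    double triggersPerReset;  // triggers sent between two consecutive resets
};

// Assumes back-to-back triggers at trigRateHz (no dead time between bins).
ScanTiming scanTiming(double trigsPerBin, int nBins,
                      double trigRateHz, double resetRateHz) {
    assert(trigRateHz > 0.0 && resetRateHz > 0.0);
    ScanTiming t;
    t.totalTriggers = trigsPerBin * nBins;
    t.durationSeconds = t.totalTriggers / trigRateHz;
    t.resetsDuringScan = t.durationSeconds * resetRateHz;
    t.triggersPerReset = trigRateHz / resetRateHz;
    return t;
}
```

For 50k trigs/bin, 101 bins and 100 kHz this gives 5.05 million triggers in about 50.5 s. Under that assumption a 0.01 Hz reset (one per 100 s) fires at most once during the whole scan, while a 1 Hz reset fires roughly 50 times.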
At the end of a working scan as above, but with 150V HV and 1 Hz reset we get:
******************* Post triggers status 2 ******************
Reading from dsp 0
TrapStatus:
Event word count: 44141
IFrame tail: 26 XFrame tail: 32
IFrame head: 26 XFrame head: 32
Trap Command/Status: Trailer Header ISR active
"Error count": 0
"Event count": 218
Bin: 40 Cal: 0
Num events: 0
Histogram status: 0x0
Bin err: 0 proc time: 15
Events recvd: 136666
Checking TIM scan complete
With the TIM problem fixed, a software problem became obvious: distSlave was not set to 0 for this scan in Release4 and later :( With this fixed, the scan also runs OK from the GUI.
Also, don't forget that the TIM register is only 16 bits wide. Either check for this or implement a work-around.
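The note does not say which TIM register or quantity is meant, so the following is only a sketch of the two options (check, or work around), assuming the value being written is a count that may exceed 65535. All function names are hypothetical and none of this is TIM or SctApi code.

```cpp
#include <cassert>
#include <cstdint>

// True if the value can be written to a 16-bit register without truncation.
bool fitsIn16Bits(std::uint32_t value) { return value <= 0xFFFFu; }

// What a 16-bit register would actually hold if the value were written unchecked.
std::uint16_t truncatedTo16Bits(std::uint32_t value) {
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}

// Option 1 (check): refuse values that do not fit.
std::uint16_t checkedCount16(std::uint32_t value) {
    assert(fitsIn16Bits(value));
    return static_cast<std::uint16_t>(value);
}

// Option 2 (hypothetical work-around): number of bursts of at most 65535
// needed to cover the requested total.
std::uint32_t burstsNeeded(std::uint32_t total) {
    const std::uint64_t maxPerBurst = 0xFFFFu;
    return static_cast<std::uint32_t>((total + maxPerBurst - 1) / maxPerBurst);
}
```

The 50k trigs/bin used above fits in 16 bits, but a request of 70000 would silently become 4464 if written unchecked, which is the kind of failure the check is meant to catch.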
Bruce (and Alan) debugging synch triggers 29/6/2005
See also SynchTriggerNoise !
Working in Release4-updated at Oxford on Barrel5 LMT02->05 Z-, upper crate (atlassbc2) Rod SN F77, slot 6, order 02->05. TIM 2B s/n 212.
Debugging of SynchTriggerNoise test
Problems identified:
- The events contain BC and L1 errors
- EFB error mask bits can be set so that they are not reported as errors
- Events containing BC and L1 errors are not histogrammed
- Got distracted far too long by this...
- It would be nice if the DSP code applied the appropriate mask when looking for headers
- The ROD buffer seems to fill up, so that after a certain number of events we get a nearly-full error. This happens even if events are sent at 1 kHz.
- The L1 and BC counters from the modules seem to do pretty much what we expect.
- The L1 counter from the ROD seems to be stuck at 9.
- The BC counter on the ROD does not correspond to that on the modules.
- Its increment looked roughly like the correct increment for the L1 counter (sometimes out by 1?)
- The ECR counter (ROD) seems to increment by 2 rather than 1 between bins.
- The triggers-sent-per-bin-counter on the ROD seems to be offset by 1 bin from the resets of the module L1 and BC counters.
- Looking at the bin divisions expected from the ECR counter, this doesn't correspond with the output histogram
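Several of the problems above come down to a counter not stepping as expected between events or bins (stuck, or incrementing by 2 rather than 1). A minimal sketch of an offline check on counter values copied out of the event dumps follows. `unexpectedSteps` is a hypothetical helper, and the counter width is a parameter because the notes do not state the widths.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the indices i (i >= 1) at which values[i] - values[i-1], taken
// modulo 2^widthBits, differs from expectedStep. Wrap-around of the counter
// is therefore not reported as an error.
std::vector<std::size_t> unexpectedSteps(const std::vector<std::uint32_t>& values,
                                         unsigned widthBits,
                                         std::uint32_t expectedStep) {
    assert(widthBits >= 1 && widthBits <= 32);
    const std::uint32_t mask =
        (widthBits == 32) ? 0xFFFFFFFFu : ((1u << widthBits) - 1u);
    std::vector<std::size_t> bad;
    for (std::size_t i = 1; i < values.size(); ++i) {
        const std::uint32_t step = (values[i] - values[i - 1]) & mask;
        if (step != (expectedStep & mask)) {
            bad.push_back(i);
        }
    }
    return bad;
}
```

With an expected step of 1, a stuck counter (9, 9, 9) or a counter stepping by 2 (0, 2, 4) is flagged at every index, while a sequence that simply wraps (254, 255, 0, 1 for an assumed 8-bit width) is accepted.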
Debugging tools used:
- .X synchNoise.cxx
- With small numbers of triggers and bins
- Aim to fit all events in the 32 frames of the event buffer
- Also useful to know what the mask pattern is:
- tapi.modifyABCDVar(14, 1.0);
- tapi.modifyABCDVar( 9, 120.0); // 8 channels per chip
- tapi.scanEvents
- Summarises event buffers on all slaves
- tapi.decodeEvents
- Decodes one event from a buffer
- .L Functs.cxx
- set debug option("scan_error_trap_all")
- Switch to error event mode (histogramming probably won't work)
- tapi.decodeEvent needs to be told that it's an error event.
- e.g. tapi.decodeEvent(0, 0, 0, 0, 3, 0, 1); // Where 3 is the index into the buffer
- print_calib()
- There's probably something useful somewhere about what state the formatters get into that causes "ALMOST_FULL" (ie why the FIFOs don't empty)
Further investigations:
- Forgot to try setting nth = 1 and a low trigger rate
- This would help when looking at L1 and BC counters in the error event dumps
Dual Triggers
- Correct behaviour and output for six modules in
- groups 1,2,3,4
- groups 0,1,4,5
- [ Which configuration created errors? ]
- Tested both scanning, and one or other scanning.
- N.B. configure2 forces use of SP_BOTH
- also set trigger sequence 2
- if only one then the DSPs might interpret things differently
- also set trigger sequence 2
- However, if USE_DUAL_PORTS is #defined to 1, normal nmask seems to generate BC errors, even with only groups 0->3.
- But it did get started, and didn't fail until part way through bin 3
- It's possible we tested the wrong thing; we should try
- groups 0, 1, 6, 7
Off-ROD redundancy
- Fixed in principle for sending configuration to all modules, and for modifyABCDVarROD on all modules
- AJB to merge to HEAD
- AJB to do for single modules
- needs to be tested
Firmware etc
Started TIM status
Serial Number: 18 Version: 9
L1ID: ffffff BCID: 6000 status: aa80
Setting TIM frequencies: 100kHz 0.01Hz
trigPower 2 trigBase 10 trigNibble 6 (6, 0)
rstPower -2 rstBase 0.1 rstNibble 31 (7, 3)
TIM initialise successful
Found configuration for 1 rods
Load configuration for rod
Initialise ROD 0 at 0x6000000
Constructed
Doing full ROD reset...
... done
Initialised
Started ROD
Slot: 6
Serial Number: 589
Number of slave DSPs: 4
Status registers[0-2]: 2201 0 0
Command registers[0-1]: 0 0
Primitive state: 0 Text State: 0
Got some text: crate: 0 rod slot: 6
Text INFO : 540
[MDSP: rodConfiguration.c, 735]:: Configuring ROD: initialization mode.
[MDSP: rodConfiguration.c, 207]:: SDSP #0 clock set to 160 MHz.
[MDSP: rodConfiguration.c, 207]:: SDSP #1 clock set to 160 MHz.
[MDSP: rodConfiguration.c, 207]:: SDSP #2 clock set to 160 MHz.
[MDSP: rodConfiguration.c, 207]:: SDSP #3 clock set to 160 MHz.
[MDSP: utilities.c, 614]:: SCT DSP code version 1.1.1 Oct 15 2004
Formatter code version SCT e 20 40 MHz
EFB code version e 2b
Router code version e 21
Controller code version e 1c
ROD 0 initialised
Initialise BOC 0
Production BOC - Revision C: status
Module Type: 46 Serial Number: 42
Hardware Version: 1 Firmware Version: 5
Manufacturer: cb
Status Register: e
Slave code from /work/ppatlas1/config/DspImages/SlaveDspCode.051104_dualFixed