Triggers are sent by the TIM at 100 kHz (the default if no value is set in the configuration). One event out of 15 is transferred to the Slave DSPs and histogrammed. This is assumed to be limited by the transfer of data from the Router to the SDSP: most events in this test have no hits, but a complete frame is still transferred over DMA.
NB: this only works with distSlave = 0, so all events are histogrammed by Slave DSP 0. It may be possible to implement distSlave = 2 by setting the trap match for different slaves. Depending on whether the bottleneck is in the Router or on the way into the Slaves, this might make it possible to histogram more events in total. Either way, the module group settings will be ignored.
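To put the figures above into numbers: at the default 100 kHz trigger rate with one event in 15 histogrammed, roughly 6.7 kHz of events reach the histogram. The sketch below is only arithmetic on the values quoted in the text; the function names are hypothetical and do not correspond to anything in SctApi.

```python
# Values quoted in the text above.
TRIGGER_RATE_HZ = 100_000  # default TIM trigger rate
PRESCALE = 15              # one event out of 15 is histogrammed


def histogrammed_rate(trigger_rate_hz: float, prescale: int) -> float:
    """Rate (Hz) of events that are actually histogrammed."""
    return trigger_rate_hz / prescale


def triggers_needed(events_wanted: int, prescale: int) -> int:
    """Triggers to send so that about `events_wanted` events are histogrammed."""
    return events_wanted * prescale


if __name__ == "__main__":
    rate = histogrammed_rate(TRIGGER_RATE_HZ, PRESCALE)
    print(f"histogrammed rate: {rate / 1000:.1f} kHz")
    print(f"triggers for 1000 histogrammed events: {triggers_needed(1000, PRESCALE)}")
```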
Some interesting findings from the online analysis of SynchTriggerNoise are reported at SynchTriggerNoiseAnalysis.
Problems
In an older version (release 3?):
The API does not read out the total number of triggers sent, so the total is set to 0. In the OPE for this test, the y-axis ("Null Variable") represents readout in time, while the x-axis is the number of hits recorded in an event. Bins 1-5 record the true occupancy, whereas bin 0 is calculated by subtracting the number of events from the total number of triggers and is hence negative.
The problems in the paragraph above with no trigger counting should be fixed now, but there is a new problem: the scan doesn't run.
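The negative bin 0 seen in the older version follows directly from the subtraction described above. A minimal sketch, assuming that "number of events" means the sum of the events recorded in the hit bins (1 and up); the function name and the example numbers are hypothetical and purely illustrative:

```python
def fill_bin_zero(hit_bins, total_triggers):
    """Return a copy of `hit_bins` with bin 0 derived from the trigger count.

    hit_bins[n] for n >= 1 holds the number of events with n hits.
    Bin 0 (events with no hits) is not recorded directly; it is taken
    as total_triggers minus the events seen in the other bins.
    """
    events_with_hits = sum(hit_bins[1:])
    filled = list(hit_bins)
    filled[0] = total_triggers - events_with_hits
    return filled


if __name__ == "__main__":
    bins = [0, 40, 7, 2, 1, 0]  # made-up occupancy in bins 1-5

    # Trigger count read out correctly: bin 0 is a sensible positive number.
    print(fill_bin_zero(bins, total_triggers=1000))

    # Trigger count left at 0, as in the older version: bin 0 goes negative.
    print(fill_bin_zero(bins, total_triggers=0))
```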
See also SynchTriggerNoiseDebugging
Output from the complete B6 scan is in this file [1]. The output is quite a long way down the file:
[daquser@ppatlas1 daquser]$ grep -an modifyABCDVarROD SctApiCrateServer0.out
146492:modifyABCDVarROD bank 1
148759:modifyABCDVarROD bank 1
152305:modifyABCDVarROD bank 1
153737:modifyABCDVarROD bank 1
155169:modifyABCDVarROD bank 1

Seems to have been broken between 4 March 2005:

Date: Fri, 4 Mar 2005 19:05:45 +0000 (GMT)
From: Alan Barr
To: Bruce
Cc: Christopher Lester
Subject: Trigger counter

The trigger counters on the ROD seem to be working correctly for both cal and non-cal scans, as well as for SynchTrigger. :)

and Dave's tag sr1_working_0405.
Test program run with dev4 at Oxford: http://www-pnp.physics.ox.ac.uk/~daquser/STN/
Possible sources of the problems
Source | excluded by |
SctApi | |
CORBA impl | |
Configuration | |
CalibrationController | |
TIM hardware | Running B6 with R3 (ok) R4-updated (not ok) |
TIM firmware | Running B6 with R3 (ok) R4-updated (not ok) |
TimModule | |
ROD hardware | Running B6 with R3 (ok) R4-updated (not ok) |
ROD firmware | Running B6 with R3 (ok) R4-updated (not ok) |
RodModule | |
online code | |
dataflow | |
external code | |
timing? | |
some combination | |
act of God | Archbishop of Canterbury |
If it's a combination of these, it's going to be hard to work out!
Date: Wed, 13 Jul 2005 19:40:19 +0100 (BST)
From: Alan Barr <barr@hep.ucl.ac.uk>
Subject: Synchronous Triggers - description of the problems

After much gnashing of teeth ...

There were at least five separate problems of varying seriousness and scope with synchTrigs, which was why the debugging has been difficult. Some of these may have been (probably were) specific to Oxford. I've tried to summarise my knowledge below:

Problem 1
---------
The TIM trigger burst register is only 16 bits, so anyone requesting more than 2^16-1 triggers per bin got the wrong number of triggers.
Affects: All versions of code.
Fix: Either adding an adjacent register to make 32 bits, or getting the API to break down the request into chunks. Neither of which done yet. Currently I've only put a MRS warning message which should tell you you've got problems.

Problem 2
---------
One of our TIM modules had its switches set-up upside down, so that MSB and LSB were swapped. These affected the relative timing of triggers and their counters arriving at the ROD, and so led to data errors in some ROD slots, and in particular the one I was originally using for debugging.
Affects: One of our TIMs - 3B-324. Perhaps others?
Solution: Fix up your TIM switches according to Matt's web page:
http://www.hep.ucl.ac.uk/atlas/sct/tim/
being careful about which breed of TIM you have, and MSB vs LSB and 1s vs 0s.

Problem 3
---------
In versions 1.5 to 1.7 inclusive of the SynchTrig script
CalibrationController/src/scripts/SynchTriggerNoiseTest.h
the distSlave option is commented out. This was an error introduced by me when trying to understand Problem 2 :-(
Affects: No official releases, but was in Release4-updated in Oxford, which is why Alan and Bilge had to use R3 for Barrel 6 cold. Now fixed in R4-updated too. Was fixed in CVS head on 11th May 05.

Problem 4
---------
It seems that the ROD error state "Buffer Full" (which is presumably reached if the histogramming can't keep up with the events, but perhaps also if there are error events?) is not cleared at the end of the bin in "physics mode". I believe that this is why in the SynchTrigNoise runs done on B3 and B6 some rods drop out part way through. I don't yet know why some had no events in the first bin!
Affects: I'm not sure of the scope - certainly the November 2004 DSP code seems to suffer from this, which is used with Release 3 and 4.
Fix: I don't know - ROD experts?

Problem 5
---------
This is another I don't understand too well. It seems that synchronous trigger running causes problems with histogram readout, but only when running with assembler histogramming. The effect is a failure to correctly read out the histogram at the end of the scan, so all histo information is garbage (and the API often suffers badly too). The resultant messages to screen are at the bottom of this mail.
Affects: The November 2004 DSP code is ok when using ccode for histogramming, has this problem when using assembler. The most recent DSP code distributed by Trevor on 6 Jul 2005 also has this problem when doing assembler histogramming, but with c-histogramming it is also ok. This problem is probably not noticeable in public releases because in all official releases ccode histogramming is the default. However assembler has been the CVS HEAD default for some months now.
Fix: Is there some thread synchronisation problem with assembler histogramming and tim triggers? Is there some bug in the assembler histogramming?

I suppose there is a lesson in here that we should test synch triggers more frequently during development!

Alan (with thanks to Bruce and Matt W).

=======================================================================
Error messages from Problem 5 (assembler histogramming).

Histogramming complete, 13 seconds, tidying up...
Reading histograms...
Reading out modules from ROD URID:0.0.0 of 1
Modules in group 0 of 3: count= 1
Attempting to find mid for [20220170100038]
serial number is [20220170100038]
mid = 2
Module: 2 = 20220170100038
debug line 4
Got some text: crate: UCID:0.0 rod slot: 12
Text TRANSFER : 76
[SDSP 0: rodRun.c, 298]:: caching on!
L_TRIG . e histogram functio
SEND DATA from dsp 0 says ptr 0xa000be00 length 0x100
SendData returned 2684403200 256
need offset 0 length 2048
Asked for chunk +0 length 2048 in 64words (continuing)
Found bad end of histogram bin ( èøøp@)
Asked for chunk +16384 length 2048 in 64words (continuing)
Asked for chunk +32768 length 2048 in 64words (continuing)
Asked for chunk +49152 length 2048 in 64words (continuing)

and I think we can assume that the resultant histograms are garbage.
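
One of the fixes suggested for Problem 1 in the mail above is for the API to break a large trigger request into chunks that each fit the 16-bit TIM burst register. A minimal sketch of that splitting, written in Python for brevity: the name `split_burst` is hypothetical and not part of SctApi, and how each chunk would actually be written to the TIM is not shown.

```python
MAX_BURST = 2**16 - 1  # largest count a 16-bit burst register can hold (65535)


def split_burst(n_triggers, max_burst=MAX_BURST):
    """Split a trigger request into bursts of at most `max_burst` triggers.

    Returns a list of burst sizes whose sum equals `n_triggers`.
    """
    if n_triggers < 0:
        raise ValueError("number of triggers must be non-negative")
    chunks = []
    remaining = n_triggers
    while remaining > 0:
        chunk = min(remaining, max_burst)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


if __name__ == "__main__":
    # A request that does not fit in a single 16-bit burst.
    bursts = split_burst(200_000)
    print(bursts, sum(bursts))
```

Each element of the returned list would be sent as a separate burst for the same bin, so that the total number of triggers per bin matches the request.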