
SynchTriggerNoise

In the SynchTriggerNoise test, a trigger is sent to the modules at the same time. An analysis of this data is now being implemented.

Triggers are sent by the TIM at 100 kHz (the default value if none is set in the configuration). One event out of 15 is transferred to the slave DSPs and histogrammed. This is assumed to be limited by the transfer of data from the Router to the SDSP: most events in this test have no hits, but a complete frame is still transferred over DMA.
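As a rough worked example of these numbers: at the default 100 kHz trigger rate, with one event in 15 histogrammed, the effective histogramming rate is about 6.7 kHz. The sketch below only does this arithmetic. The function names are made up for illustration, and it assumes the 1-in-15 ratio holds exactly.

```python
def histogrammed_rate_hz(trigger_rate_hz=100_000, prescale=15):
    """Rate of events reaching the slave DSP if 1 in `prescale` is transferred."""
    return trigger_rate_hz / prescale


def expected_histogrammed_events(n_triggers, prescale=15):
    """Approximate number of events histogrammed out of n_triggers sent."""
    return n_triggers // prescale


print(round(histogrammed_rate_hz()))         # 6667
print(expected_histogrammed_events(60_000))  # 4000
```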

NB: this only works with distSlave = 0, so all events are histogrammed by slave DSP 0. It may be possible to implement distSlave = 2 by setting the trap match for different slaves. Depending on whether the bottleneck is out of the Router or into the slaves, this may make it possible to histogram more events in total. Either way, the module group settings will be ignored.

Some interesting findings from the online analysis of SynchTriggerNoise are reported at SynchTriggerNoiseAnalysis.

Problems

In an older version (release 3?):

The API does not read out the total number of triggers sent, so that number is set to 0. In the OPE for this test, the y-axis ("Null Variable") represents readout in time, while the x-axis is the number of hits recorded in an event. Bins 1-5 record the true occupancy, whereas bin 0 is calculated by subtracting the number of events from the total number of triggers and is hence negative.
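A minimal sketch of why bin 0 comes out negative, assuming bin 0 is filled exactly as described (triggers minus recorded events). The function name and the counts are hypothetical and are not taken from the SctApi code.

```python
def fill_zero_hit_bin(hit_bins, n_triggers):
    """Return the full occupancy histogram given the bins for 1..N hits.

    hit_bins[i] is the number of recorded events with i + 1 hits.
    Bin 0 (events with no hits) is inferred as triggers minus recorded events.
    """
    n_events = sum(hit_bins)
    return [n_triggers - n_events] + list(hit_bins)


recorded = [120, 30, 8, 2, 1]  # hypothetical counts for bins 1-5

print(fill_zero_hit_bin(recorded, 10_000))  # [9839, 120, 30, 8, 2, 1]
print(fill_zero_hit_bin(recorded, 0))       # [-161, 120, 30, 8, 2, 1]
```

With the trigger count stuck at 0, bin 0 is simply minus the number of recorded events.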

The trigger-counting problems described in the paragraph above should be fixed now, but there is a new problem: the scan doesn't run.

See also SynchTriggerNoiseDebugging

Output from the complete B6 scan is in this file [1]. The output is quite a long way down the file!

[daquser@ppatlas1 daquser]$ grep -an modifyABCDVarROD SctApiCrateServer0.out
146492:modifyABCDVarROD bank 1
148759:modifyABCDVarROD bank 1
152305:modifyABCDVarROD bank 1
153737:modifyABCDVarROD bank 1
155169:modifyABCDVarROD bank 1
Seems to have been broken between 4 March 2005:
Date: Fri, 4 Mar 2005 19:05:45 +0000 (GMT)
From: Alan Barr 
To: Bruce
Cc: Christopher Lester 
Subject: Trigger counter
The trigger counters on the ROD seem to be working correctly for both cal
and non-cal scans, as well as for SynchTrigger.
:)
and Dave's tag sr1_working_0405.

Test program run with dev4 at Oxford: http://www-pnp.physics.ox.ac.uk/~daquser/STN/

Possible sources of the problems

Source                  Excluded by
SctApi
CORBA impl
Configuration
CalibrationController
TIM hardware            Running B6 with R3 (ok), R4-updated (not ok)
TIM firmware            Running B6 with R3 (ok), R4-updated (not ok)
TimModule
ROD hardware            Running B6 with R3 (ok), R4-updated (not ok)
ROD firmware            Running B6 with R3 (ok), R4-updated (not ok)
RodModule
online code
dataflow
external code
timing?
some combination
act of God              Archbishop of Canterbury

If it's a combination of these, it's going to be hard to work out!

Date: Wed, 13 Jul 2005 19:40:19 +0100 (BST)
From: Alan Barr <barr@hep.ucl.ac.uk>
Subject: Synchronous Triggers - description of the problems


After much gnashing of teeth ...

There were at least five separate problems of varying seriousness and
scope with synchTrigs, which was why the debugging has been difficult.
Some of these may have been (probably were) specific to Oxford. I've tried
to summarise my knowledge below:


  Problem 1
  ---------

The TIM trigger burst register is only 16 bits, so anyone requesting more
than 2^16-1 triggers per bin got the wrong number of triggers.

Affects:
All versions of code.

Fix:
 Either adding an adjacent register to make 32 bits, or getting the API to
break down the request into chunks. Neither of which is done yet. Currently
I've only put in an MRS warning message which should tell you you've got
problems.
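
A minimal sketch of the second option (breaking the request into chunks), assuming only that a single burst cannot exceed 2^16-1 triggers. The function name is hypothetical and this is not the SctApi implementation.

```python
MAX_BURST = 2**16 - 1  # largest count a 16-bit burst register can hold


def split_into_bursts(n_triggers, max_burst=MAX_BURST):
    """Split a trigger request into bursts that each fit the register."""
    if n_triggers < 0:
        raise ValueError("n_triggers must be non-negative")
    bursts = []
    remaining = n_triggers
    while remaining > 0:
        burst = min(remaining, max_burst)
        bursts.append(burst)
        remaining -= burst
    return bursts


print(split_into_bursts(200_000))  # [65535, 65535, 65535, 3395]
```

Each burst would then be sent in turn, so the per-bin total still matches the request.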


  Problem 2
  ---------

One of our TIM modules had its switches set up upside down, so that MSB
and LSB were swapped. This affected the relative timing of triggers and
their counters arriving at the ROD, and so led to data errors in some ROD
slots, and in particular the one I was originally using for debugging.

Affects:
One of our TIMs - 3B-324. Perhaps others?

Solution:
  Fix up your TIM switches according to Matt's web page:
 http://www.hep.ucl.ac.uk/atlas/sct/tim/ being careful about which breed
of TIM you have, and MSB vs LSB and 1s vs 0s.


  Problem 3
  ---------

In versions 1.5 to 1.7 inclusive of the SynchTrig script
 CalibrationController/src/scripts/SynchTriggerNoiseTest.h
 the distSlave option is commented out. This was an error introduced by me
when trying to understand Problem 2 :-(

Affects:
 No official releases, but was in Release4-updated in Oxford, which is why
Alan and Bilge had to use R3 for Barrel 6 cold. Now fixed in R4-updated
too. Was fixed in CVS head on 11th May 05.


  Problem 4
  ---------

It seems that the ROD error state "Buffer Full" (which is presumably
reached if the histogramming can't keep up with the events, but perhaps
also if there are error events?) is not cleared at the end of the bin in
"physics mode". I believe that this is why in the SynchTrigNoise runs done
on B3 and B6 some RODs drop out part way through. I don't yet know why some
had no events in the first bin!

Affects:
 I'm not sure of the scope; certainly the November 2004 DSP code, which is
used with Release 3 and 4, seems to suffer from this.

Fix:
 I don't know - ROD experts?


  Problem 5
  ---------

This is another one I don't understand too well. It seems that synchronous
trigger running causes problems with histogram readout, but only when
running with assembler histogramming. The effect is a failure to correctly
read out the histogram at the end of the scan, so all histo information is
garbage (and the API often suffers badly too).

The resultant messages to screen are at the bottom of this mail.

Affects:

 The November 2004 DSP code is ok when using ccode for histogramming, but has
this problem when using assembler.

 The most recent DSP code distributed by Trevor on 6 Jul 2005 also has
this problem when doing assembler histogramming, but with c-histogramming
it is also ok.

 This problem is probably not noticeable in public releases because in all
official releases ccode histogramming is the default. However assembler
has been the CVS HEAD default for some months now.

Fix:
 Is there some thread synchronisation problem with assembler histogramming
and tim triggers? Is there some bug in the assembler histogramming?


I suppose there is a lesson in here that we should test synch triggers
more frequently during development!

Alan (with thanks to Bruce and Matt W).


=======================================================================

Error messages from Problem 5 (assembler histogramming).

Histogramming complete, 13 seconds, tidying up...
Reading histograms...
Reading out modules from ROD URID:0.0.0 of 1
Modules in group 0 of 3: count= 1
Attempting to find mid for [20220170100038]
serial number is [20220170100038] mid = 2
Module:  2 = 20220170100038
debug line 4
Got some text: crate: UCID:0.0 rod slot: 12
Text TRANSFER : 76
[SDSP 0: rodRun.c,   298]::
caching on!

   L_TRIG

.
e histogram functio
SEND DATA from dsp 0 says ptr 0xa000be00 length 0x100
SendData returned 2684403200 256
 need offset 0 length 2048
Asked for chunk +0 length 2048 in 64words (continuing)
Found bad end of histogram bin (     èøøp@)
Asked for chunk +16384 length 2048 in 64words (continuing)
Asked for chunk +32768 length 2048 in 64words (continuing)
Asked for chunk +49152 length 2048 in 64words (continuing)

and I think we can assume that the resultant histograms are garbage.
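
As a quick cross-check of the log above: the two numbers in "SendData returned 2684403200 256" are the decimal forms of the pointer and length printed on the line before it (0xa000be00 and 0x100), and the requested chunk offsets step by 16384 (0x4000). The snippet below only does that conversion and is not part of the SctApi code.

```python
# Values copied from the log output above.
ptr, length = 2684403200, 256
print(hex(ptr), hex(length))  # 0xa000be00 0x100

offsets = [0, 16384, 32768, 49152]
print([hex(o) for o in offsets])  # ['0x0', '0x4000', '0x8000', '0xc000']
```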