SctRodDaq Wiki: SystemTests

This is a list of notes from my testing of SctRodDaq using the SystemTests.

Using the ResponseCurve tests, the FittingService once hung after completing 980 scans. I am not sure why, but it looked like some sort of memory overwrite problem. I killed the FittingService, restarted it and manually told it to fit the missing data. Everything then worked correctly, and it did another 1000 scans without problems.
- I have come across this problem again whilst testing the Trim Test: after 1316 scans the FittingService hung. I think there is a subtle bug here, perhaps a threading issue where something is accessed without holding a lock, or a double delete or the like. It is likely to be FittingService specific, as I haven't seen similar problems with any of the other programs.

I had to insert a delay of 0.1 s between publishing test data in the FullBypass test. There seems to be a problem if new data is published too quickly: the FittingService was never called by the is_server, and eventually the is_server hung. I suspect it ran out of threads or its queue grew too long. Inserting the delay appeared to fix the problem. This might not have happened if I had run on multiple machines, but there is certainly some resource conflict (probably CPU) there.
- Hmmm, actually, the delay didn't fix the problem. I was running all processes at nice level 10, and setting the nice level to 0 did fix it, so perhaps it is a scheduling issue. Certainly the FittingService did not seem to be getting a chance to run.

Notes

setup stage

A typical "setup" stage consists of running

 $SCT_DAQ_ROOT/SystemTests/scripts/restartIS.sh

which restarts a very large number of IS servers leaving them in an empty state, before then starting whatever Analysis/Fitting/Archiving/etc services are required.

run stage

A typical "run" stage consists of simply dropping data into the relevant part of the analysis roller coaster. This may use a "RetrieveRaws.bsh" BeanShell script that can be started with

 bsh_for_sct RetrieveRaws.bsh

There are two main stages to the above script as of 14th Dec 2007. Firstly, "control data" must be prepared, as there are no address labels on the roller coaster envelopes. This is done by first pointing the archiving service's retrieval arm at the "ControlData" IS server, and then asking it to retrieve some control data, which usually lives in a file called "TestData*":

 import Sct.*;
 import Sct.IS.SctNames;
 import ArchivingServiceI.*;
 import AnalysisServiceI.*;
 import GuiComponents.System.*;

 si=SystemInterface.getInstance();

 a = si.getArchivingService();
 f = si.getFittingService();
 an= si.getAnalysisService();

 System.out.println("<> Retrieve Control Data");

 // get the control data
 a.setRetrieveIsServer("ControlData");
 a.suspendCallbacks(true);
 a.retrieveArchName("TestData.*.*.*.gz");
 Thread.sleep(500);
 while (a.busy()>0){
  Thread.sleep(100);
 }

The second stage is the actual dropping of EventData onto the rollercoaster. This is done again using the ArchivingService. This time its retrieval arm is pointed at the EventData IS server and the RawScanResult files are inserted.

 System.out.println("<> Retrieve Event Data");

 // get the event data
 a.setRetrieveIsServer("EventData");
 Thread.sleep(500);
 a.retrieveArchName("SctData::RawScanResult.*.*.*.gz");

 Thread.sleep(500);
 while (a.busy()>0 || f.busy()>0 || an.busy()>0){
  Thread.sleep(100);
  System.out.print(".");
  System.out.print(a.busy()+f.busy()+an.busy());
 }

The information then flows through the roller coaster under gravity, and fitting, analysis etc all begin as/where appropriate, as indicated in the control data. The wait loop above waits until ALL services are no longer busy. Progress can be watched in the ServiceStatusGUI.
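
Because the FittingService has occasionally hung during these tests, a wait loop with no upper bound can block forever. The following is a minimal sketch, in plain Java, of the same polling loop with a timeout. The names ServiceWait and waitUntilIdle are hypothetical and not part of SctRodDaq, and the sketch assumes that busy() returns an integer count, as the loop above suggests.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not part of SctRodDaq: polls "busy" counters with a deadline.
final class ServiceWait {
    private ServiceWait() {}

    /**
     * Polls every counter until they all report 0 or timeoutMs has elapsed.
     * Returns true if everything went idle, false on timeout or interruption.
     */
    static boolean waitUntilIdle(long timeoutMs, long pollMs, IntSupplier... busyCounters) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            int total = 0;
            for (IntSupplier counter : busyCounters) {
                total += counter.getAsInt();
            }
            if (total == 0) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a service whose busy count drains over a few polls.
        int[] remaining = {3};
        IntSupplier draining = () -> remaining[0] > 0 ? remaining[0]-- : 0;
        // Stand-in for a hung service that never goes idle.
        IntSupplier hung = () -> 1;

        System.out.println("draining went idle: " + waitUntilIdle(1000, 10, draining));
        System.out.println("hung went idle: " + waitUntilIdle(200, 10, hung));
    }
}
```

Against the real services this might be called as ServiceWait.waitUntilIdle(600000, 100, () -> a.busy(), () -> f.busy(), () -> an.busy()), where the ten-minute timeout is an arbitrary choice. BeanShell may not accept lambda syntax, in which case the same deadline check would have to be written inline in the script's while loop.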

Where does the roller coaster data actually go?

and

How do I view the results of the fits etc?

When the ArchivingService extracts data from the .gz files and puts it into IS, it mostly uses IS proxy objects. (This might change in future.) That is, the data is read off the disk and converted to a more "in-memory" form, which is placed into SCT_SCRATCH_DIR; placeholder entries are then put in IS that are effectively symbolic links to the path in SCT_SCRATCH_DIR where the data is actually located. The analysed and fitted data still in IS at the end of the SystemTest is likewise stored in IS proxy objects. This means that you can view the results of the test by finding the relevant file in SCT_SCRATCH_DIR and running DataDisplayer on it.

e.g. for a MarkSpaceRatio test result, you can do:

 DataDisplayer $SCT_SCRATCH_DIR/SctData\:\:MarkSpaceRatioTestResult.5067.64.20220170100029

and see a pretty picture ... or

 SummaryExtractor $SCT_SCRATCH_DIR/SctData\:\:MarkSpaceRatioTestResult.5067.64.20220170100029

and see the summary.
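
Finding the relevant file in SCT_SCRATCH_DIR can also be scripted. The sketch below lists every scratch file whose name starts with a given class name, so that the paths can be handed to DataDisplayer or SummaryExtractor. ScratchResults and find are hypothetical names, not part of SctRodDaq. The only assumption about file naming is the one visible in the example above: the class name followed by a dot and further fields.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of SctRodDaq.
final class ScratchResults {
    private ScratchResults() {}

    /** Returns the sorted entries in scratchDir named "<className>.<anything>". */
    static List<Path> find(Path scratchDir, String className) {
        List<Path> matches = new ArrayList<>();
        // Glob pattern: the class name, a literal dot, then anything.
        try (DirectoryStream<Path> entries =
                 Files.newDirectoryStream(scratchDir, className + ".*")) {
            for (Path entry : entries) {
                matches.add(entry);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) {
        String scratch = System.getenv("SCT_SCRATCH_DIR");
        if (scratch == null) {
            System.err.println("SCT_SCRATCH_DIR is not set");
            return;
        }
        for (Path p : find(Paths.get(scratch), "SctData::MarkSpaceRatioTestResult")) {
            System.out.println(p);
        }
    }
}
```

Each printed path can then be passed to DataDisplayer or SummaryExtractor as shown above. The class name is used directly as a glob prefix, so it must not contain glob metacharacters such as * or [.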