We need to be careful with terminology, so shift-takers don't get
confused about how the DAQ system is (mis)behaving, or about how to fix
problems.
I've talked with Tavi, and CODA was NOT crashing all night. For me,
CODA crashing means that RunControl is frozen so you can't execute state
transitions (like Prestart, Go, Download, End), or the EB/ER/ET windows
disappear, or the FASTBUS readout controllers reboot.
For Tavi, his first problem was that although the physics trigger was
counting in visual_scal, he saw no events coming through RunControl. I
had him check the overlap of the physics trigger (PT) with the common
strobe (CS) input to the trigger supervisor. Indeed, the common strobe
did NOT come after the start of the physics trigger, so of course there
were no events being read out. The overlap WAS good earlier on 9/10 for
real physics events, because Adrian and I looked at it on the scope; we
had to adjust the relative PT/CS timing in order to get the RunControl
rate consistent with the visual scalers. Somehow the timing of the
flasher trigger is different.
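To make the PT/CS condition concrete, here is a minimal sketch of the timing check you do by eye on the scope. It assumes the trigger supervisor only accepts an event when the CS leading edge arrives after the PT leading edge and while PT is still asserted. The function name, the nanosecond units, and the numbers in the usage note are hypothetical illustrations, not values from our setup.

```python
def strobe_overlaps_trigger(pt_start_ns, pt_width_ns, cs_start_ns):
    """Return True if the CS leading edge falls inside the PT gate.

    Assumption (not taken from the TS documentation): the strobe must
    arrive at or after the start of PT and before PT ends.
    """
    pt_end_ns = pt_start_ns + pt_width_ns
    return pt_start_ns <= cs_start_ns < pt_end_ns
```

For example, with a PT gate that starts at 0 ns and is 50 ns wide, a strobe at 10 ns passes the check, while a strobe at -5 ns (before PT starts) fails, which is the situation Tavi saw with the flasher trigger.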
Tavi reports that End always fails. I'll look at this today.
Tavi also was concerned because RunControl was reporting that blastROC
(or maybe blastROCr) had not reported status in n seconds. This status
report comes through a different mechanism than the CODA data flow, so
lack of a status report DOES NOT automatically imply there will be no
data. I haven't yet found out why we get this report so often. You can
check that the ROC is indeed up in a number of ways:
1) Use the opitrig1 screen that setup opens, type ROC, and you are
connected to whatever ROC is attached to opitrig1's serial port.
blrocf2 is blastROC, blrocf3 is blastROCr. Now you can watch the ROC's
response to state transitions like Prestart, Go, etc. Just hitting
carriage return will show you the VxWorks prompt ->. If you see that,
the ROC is probably ok.
2) You can telnet to blrocf2 or blrocf3 and watch the state transitions.
3) You can see the event count increasing on the RunControl screen while
monitoring blastER, blastEB (but those include the scaler and EPICS
events), or blastROC, blastROCr. If you DON'T see physics triggers
here, check the PT and CS inputs to the TS before you decide the ROCs
are dead.
4) You can use dpsh to query the ROC:
    cd ~/commis/coda
    source coda_user.setup
    dpsh
    % DP_ask blastROC status
The response depends on what state you're in with RunControl.
    % DP_ask blastROC part_stats_all
This shows all sorts of information on the ROC data buffers; the
interesting one is COMMIS_L_PRI:pool, how many are free and busy.
Another problem we have had (but Tavi didn't have last night) is that
at rates above about 60 kB/sec, the differential trigger rate will
suddenly drop to 0 (or whatever the scaler+EPICS rate is). After a
minute or so, it pops back up again to the physics trigger rate. I have
been calling this "ROC sleep". When the ROC is sleeping, all its data
buffers (see 4 above) are busy; the ROC has nowhere to store the data,
so it won't accept more triggers until the buffers free up. I'm still
trying to understand why the buffers fill up in the first place. If the
ROC does not come back awake in a reasonable time, End Run will work,
and then you can start another run.
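The "ROC sleep" behavior can be pictured with a toy model of a fixed buffer pool. This is only a sketch of the idea that a trigger is refused when no buffer is free. It is not how the CODA ROC code is actually written, and the pool size, tick-based timing, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BufferPool:
    """Toy stand-in for a ROC data buffer pool (sizes are made up)."""
    total: int
    busy: int = 0

    @property
    def free(self):
        return self.total - self.busy

    def accept_trigger(self):
        """Take one buffer for a new event; refuse if none is free."""
        if self.free == 0:
            return False
        self.busy += 1
        return True

    def release(self, n):
        """Return up to n buffers to the pool once they have been drained."""
        self.busy = max(0, self.busy - n)


def simulate(pool, triggers_per_tick, drain_per_tick):
    """Return the number of triggers accepted in each tick.

    drain_per_tick lists how many buffers are freed at the start of
    each tick (hypothetical numbers, chosen to show a stall).
    """
    accepted = []
    for drained in drain_per_tick:
        pool.release(drained)
        count = sum(pool.accept_trigger() for _ in range(triggers_per_tick))
        accepted.append(count)
    return accepted


# Nothing is drained for four ticks, so the pool fills and the accepted
# rate drops to 0 until buffers free up again.
print(simulate(BufferPool(total=4), 2, [0, 0, 0, 0, 4, 2]))
# prints [2, 2, 0, 0, 2, 2]
```

The zeros in the middle of the output correspond to the sleeping period: every buffer is busy, so no trigger is accepted until something frees the pool. The model says nothing about why the buffers stop draining, which is the part still to be understood.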
Karen Dow