System Failure

SomeTekk · June 2019

Twice now a running RW 5.15 system has entered a system failure state for no detectable reason.

No hardware issues required correction. No programming or parameter changes occurred prior to the failure. Both times a B-Start was able to get the system operational.

A strange detail common to both incidents is a gap in the event log: nearly 3 weeks the first time, and approximately 10 days in the most recent occurrence. The system operated without issue during the days when no event logging seemed to be happening.

2711101	E	System	20195	System data from last shutdown is lost	6/15/2019 15:02	Normally, all system data is saved on shutdown. During the last shutdown saving data has failed. The system has been started using last good system data saved earlier at WED JUN 05 07:14:50 2019.	{args: "WED JUN 05 07:14:50 2019", "WED JUN 05 07:14:50 2019"}
2711100	E	System	20205	Auto Stop open	6/5/2019 7:14	The Automatic Mode Safeguarded Stop circuit has been broken.	{args: }
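
For anyone wanting to check other controllers for the same symptom, here is a rough Python sketch that scans exported event log rows for silent periods. It assumes the rows are tab-separated in the same column order as the two lines above (sequence number, type, category, code, title, timestamp, ...), and the function name and the one-day threshold are my own choices, not anything from ABB tooling.

```python
from datetime import datetime, timedelta

def find_log_gaps(rows, max_gap=timedelta(days=1)):
    """Return (earlier_seq, later_seq, gap) for consecutive events
    whose timestamps are further apart than max_gap."""
    events = []
    for row in rows:
        fields = row.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        seq = int(fields[0])
        # Assumed timestamp layout, e.g. "6/15/2019 15:02"
        stamp = datetime.strptime(fields[5].strip(), "%m/%d/%Y %H:%M")
        events.append((seq, stamp))
    events.sort()  # order by sequence number
    gaps = []
    for (seq_a, t_a), (seq_b, t_b) in zip(events, events[1:]):
        if t_b - t_a > max_gap:
            gaps.append((seq_a, seq_b, t_b - t_a))
    return gaps

# The two rows from this post, with the description column shortened.
rows = [
    "2711101\tE\tSystem\t20195\tSystem data from last shutdown is lost\t6/15/2019 15:02\t(description omitted)",
    "2711100\tE\tSystem\t20205\tAuto Stop open\t6/5/2019 7:14\t(description omitted)",
]

for seq_a, seq_b, gap in find_log_gaps(rows):
    print(f"No events logged between {seq_a} and {seq_b}: {gap.days} days")
```

Run against a full export, this would flag every pair of sequential events separated by more than the threshold, which makes it easy to see whether a quiet stretch precedes each failure.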

There are more than 20 other systems similar to this system and none of the others, which have the same environmental conditions (mains power is stable), have experienced this production-killing problem.

This is an ABB non-robot manipulator system of the FlexLean track and lifter combo type.

Has this scenario happened to anyone else, and if so was a resolution found?

lemster68 · June 2019

I haven't heard of that before. Did you look at all your event logs to see if there is an error causing the sysfail? Also, did you try B-Start? I-Start seems harsh, but if you have to, you have to.

SomeTekk · June 2019

Hello Lemster,

Yikes, I had to edit the original post. A B-Start solved (temporarily?) both instances.

All logs were examined and both times no "smoking gun" was found. The events listed in the original post are sequential: the last one before the gap was on 6/5 (2711100), and the next one came after the B-Start on 6/15 (2711101).

The first time this happened, logs, backups and system diagnostics were sent to the local ABB office and a R## ###k case was opened. Unfortunately, no root cause was found.

lemster68 · June 2019

Maybe from time to time you can check on the program resources: CPU load and memory usage. One thing I have seen before is an execution stack overflow due to a recursive program. Maybe you can check whether there is any chance of recursion in your program.

SomeTekk · June 2019

Sounds good, will do.

I am familiar with how to do that in RS; you don't happen to know if there's RAPID that can accomplish that, do you?

lemster68 · June 2019

I am not aware of any RAPID instructions to read that data. Just the system info button on the pendant, and somewhere in there under the properties.

SomeTekk · May 2020

FYI - This was resolved.

The 4-pole Phoenix terminal connector that carries power to the Main CPU was barely maintaining contact. As the control sits on a mezzanine in close proximity to a stamping press, it is a wonder the problem was as isolated as it was.

Reseating the connector solved the problem and became a maintenance item for the ABB control population.

MDFB · March 2022

Hi there, thanks for your previous comments.
I have a similar problem: only two IRB 6700 robots (RW 6.08.01), out of more than ten identical robots connected to the same network, entered a system failure state, apparently after a network loop. After restoring the system we also have a gap in the event log, and the only error is '20195: System data from last shutdown is lost'.
Does anyone know why only two of more than ten robots are having this problem? What differences could there be from the others? Or what can we do to avoid this? We had the same problem months ago, again not on all robots, after changing the network configuration (all device IPs in the network...).

Thanks in advance!

SomeTekk · March 2022

It's been my experience that 20195 errors occur when a valid image.bin is not available during a reboot.

Perhaps compare the affected and unaffected robots' parameters under the I/O topic | Device Trust Level type | Action when Disconnected parameter to see whether they are configured differently.
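
If clicking through each controller is tedious, one rough way to compare is to export the configuration from a good robot and a problem robot and list the lines that appear in only one of them. The Python sketch below is generic text comparison only; the function name is made up, and the sample lines are placeholders, not real ABB configuration syntax.

```python
def config_differences(lines_a, lines_b):
    """Return (only_in_a, only_in_b): non-blank lines, whitespace-trimmed,
    that appear in one export but not the other."""
    a = {ln.strip() for ln in lines_a if ln.strip()}
    b = {ln.strip() for ln in lines_b if ln.strip()}
    return sorted(a - b), sorted(b - a)

# Placeholder content standing in for two exported configuration files.
good_robot = ["trust_level=DefaultTrustLevel", "device=Local_IO", ""]
problem_robot = ["trust_level=DefaultTrustLevel"]

only_good, only_problem = config_differences(good_robot, problem_robot)
print("Only in good robot:", only_good)
print("Only in problem robot:", only_problem)
```

Because it compares sets of lines, this ignores ordering and will not show which section a differing line belongs to; it is only meant as a first pass before looking at the parameters on the pendant.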

MDFB · March 2022

Thanks,
these parameters are identical.
The only difference is that the robots with the problem don't have a Local_IO (EtherNet/IP Device) defined, while the robots that didn't have the system failure do. We don't know if this could be the reason for the difference in behavior.

Also, we don't know why the robot executes a reboot when detecting a network loop/changes in switches (we didn't reboot them).
