
Safety System vs. Hardwire / Human Intervention


Ashereng (Petroleum), Nov 25, 2005
Hello Everyone:

In a previous posting, there was some discussion on the hierarchy of computer vs. human as the ultimate decision maker in a safety system.

From some of the responses, there seems to be passionate advocates on this issue.

So, I thought I would start a new post on this issue.

Your comments, experiences, and insights are greatly appreciated.



"Do not worry about your problems with mathematics, I assure you mine are far greater."
Albert Einstein
 
Are you locking out a circuit to treat it as de-energized prior to work? Then visibly open disconnects are required. Or are you taking some other control action that has a safety implication? Whether performed by hard wire or logic, verification is required. Toggle switches and microprocessors can both fail. Microprocessors that self test can indicate when they have failed, or will fail to respond to a query. The guy at risk always has the final say. Didn't you see 2001?
 
stevenal, I was thinking more of a control action that has safety implications than a lockout board.

I agree that physical switches and microprocessors can fail. Operators can also make mistakes.

Given that, which would you have as the final say in a safety system? Does the operator reserve the right to override a safety system, or does the safety system override the operator's input?

"Do not worry about your problems with mathematics, I assure you mine are far greater."
Albert Einstein
Have you read FAQ731-376 to make the best use of Eng-Tips Forums?
 
Depends on the criticality and the effects of false alarms. In military systems, there are usually thermal shutdown controls that can be overridden by the operator. There's usually a warning of an impending thermal shutdown.

Seems to me that a safety system would be similar, particularly if there is an operator involved. A warning of an impending safety shutdown, and the option for the operator to override.

Maintenance overrides are pretty common for lots of safety systems.

TTFN



 
I can only draw on my experience in power plants:

Generally an operator cannot override safety devices. For example, the operator sees high bearing temperatures approaching the trip level but doesn't believe it. He has no easy way to stop the impending trip. The best he can do is make a quick call to the I&C guy, who may stop the trip if he can get to it in time.

Trip and protection disabling or bypassing is not typically provided as a mode of operation. It is necessary to go at least one level into the control system beyond the operating level.

Therefore in my experience when it comes to the hierarchy of computer vs human, computer comes first.
 
I designed a system some years ago that I think demonstrates the appropriate division between computer control and operator control.
This was an oil heater in a small refinery.
The existing control was prone to failure. The relay control logic was difficult to follow and harder to troubleshoot.
One day we had a small leak of hot hydrocarbons that turned into an auto-ignition fire. Unfortunately the leak was right behind the control panel and when it was finally extinguished the control panel was gutted.
I did a cost comparison on replacing the relays compared to installing a PLC. The PLC won 2:1.
The main safety controls were
1> A flame safety relay.
2> A high stack temperature monitor and shutdown controller.
3> Fuel high-pressure shutdown.
4> Fuel low-pressure shutdown.
5> Low product flow monitor and shutdown controller.
6> Low pilot gas pressure.
7> Manual emergency stops.
These were considered safety issues and were hard wired to close both the fuel valve and the pilot gas valve.

One of the main components was a flame safety relay.
Hardwired: If this relay did not see a flame within 7 seconds of startup, it closed the fuel and pilot gas valves.
PLC: This relay reported to the PLC when it saw a flame, and the PLC proceeded to the next step.
However, if the normally fast light-off (a second or less) took more than 3 seconds, a warning light indicated "Slow Light Off". This usually gave a week or more of warning of an impending failure, plenty of time to clean the lens and check the condition of the igniter electrodes before a failure.
Step 2>: If a flame failure shutdown occurred, the fuel valves were closed. The PLC was notified and a light indicated "Flame Failure".
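For illustration, a minimal sketch of that timing logic in Python (the 7-second and 3-second figures are from above; the function and message names are invented, and this is of course not the original ladder logic):

```python
# Sketch of the flame-proving window: hardwired trip at 7 s without flame,
# PLC "Slow Light Off" warning if light-off takes longer than 3 s.
FLAME_PROVING_LIMIT_S = 7.0   # relay closes fuel and pilot gas valves after this
SLOW_LIGHTOFF_WARN_S = 3.0    # warning threshold for a normally sub-second light-off

def evaluate_lightoff(flame_proved_at_s):
    """flame_proved_at_s: seconds from start of ignition until flame was seen,
    or None if no flame was ever proved."""
    if flame_proved_at_s is None or flame_proved_at_s > FLAME_PROVING_LIMIT_S:
        return "Flame Failure: fuel and pilot gas valves closed (hardwired)"
    if flame_proved_at_s > SLOW_LIGHTOFF_WARN_S:
        return "Slow Light Off: clean the lens, check the igniter electrodes"
    return "Flame proved: proceed to next step"

print(evaluate_lightoff(0.5))   # normal light-off
print(evaluate_lightoff(4.0))   # early warning, typically a week before failure
print(evaluate_lightoff(None))  # hardwired trip
```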

High stack temperature.
Over time the inside of the process piping would slowly accumulate a coating of sludge or coke. This insulated the process fluid and slowed the transfer of heat to the process fluid. This condition resulted in higher exhaust gas temperatures.
Stack temperature controller.
Hardwired:
If the stack temperature exceeded the set point, the fuel valve was closed. The PLC was notified.
PLC: The stack temperature controller reported to the PLC if the stack temperature reached a warning set point. A light indicated "High Stack Temp Warning". This usually gave several weeks of warning. Cleaning could be planned or production could be curtailed.
If the stack temperature reached the danger point, the controller shut the fuel off directly and reported to the PLC. A light indicated "High Stack Temp Shutdown".
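The same warn-then-trip pattern, sketched with invented setpoint values (only the two-level structure comes from the description above):

```python
# Two-setpoint stack temperature scheme: a PLC warning level reached weeks
# ahead of the hardwired shutdown level. Numeric setpoints are illustrative.
WARN_SETPOINT_C = 450.0  # "High Stack Temp Warning": plan cleaning or curtail production
TRIP_SETPOINT_C = 500.0  # controller closes the fuel valve directly, then notifies the PLC

def stack_temp_status(temp_c):
    if temp_c >= TRIP_SETPOINT_C:
        return "High Stack Temp Shutdown"  # hardwired action; the PLC only annunciates
    if temp_c >= WARN_SETPOINT_C:
        return "High Stack Temp Warning"
    return "Normal"
```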

Fuel high-pressure shutdown.
Fuel low-pressure shutdown.
These conditions usually originated elsewhere in the plant and could not be anticipated.
Hardwired: Direct shutdown of the fuel valves. Notify PLC.
PLC: Indicated "High Fuel Pressure shutdown" or "Low Fuel Pressure shutdown" as appropriate.

Low product flow shutdown:
Hardwired: Direct closure of the fuel valve. Notify PLC.
PLC: At warning level, the PLC indicated "Warning, Reduced Flow".
At shut down the PLC indicated "Warning, Low Flow Shutdown".

Low pilot gas pressure.
Pilot gas was supplied from a propane cylinder. It was only used to light off the main fuel supply.
Hardwired: Insufficient pressure during the start cycle would terminate the start cycle and require a manual restart. Notify PLC.
PLC: At the warning level the PLC would indicate "Low Pilot Gas Pressure". At the shutdown level the PLC would indicate "Low Pilot Pressure Shutdown".

Manual emergency stops: The manual emergency stop dropped out a control relay that de-energized everything except the PLC. The PLC indicated "Emergency Stop".
What did the PLC do? In the event of a flame failure it would claim control of the combustion air damper and force it full open and purge the heater with fresh air. It would then relinquish control to the instrumentation system and attempt a restart. It closed the pilot gas valve after a successful start. There was actually quite a bit of logic that the PLC handled, but I can no longer remember the details.
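A rough sketch of that recovery sequence, with the damper and instrumentation actions passed in as callables (the purge time and all names are assumptions; the original logic is not recorded here):

```python
import time

PURGE_TIME_S = 60  # assumed; in practice sized to the air changes of the firebox

def on_flame_failure(claim_damper, open_damper_full, release_damper,
                     attempt_restart, close_pilot_valve):
    claim_damper()            # PLC claims the combustion air damper...
    open_damper_full()        # ...and forces it full open
    time.sleep(PURGE_TIME_S)  # purge the heater with fresh air
    release_damper()          # relinquish control to the instrumentation system
    if attempt_restart():     # instrumentation attempts a relight
        close_pilot_valve()   # pilot gas closed after a successful start
        return "running"
    return "flame failure lockout"
```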
respectfully
 
Thank you everyone for sharing your experiences and thoughts.

IRstuff

You touch on two topics that I feel are very critical: 1) the operator can override the shutdown system, and 2) maintenance bypass / operator override, etc.

These two topics, in the subset of safety incidents that I have seen (a subset because, of course, we did not look at every incident in history), have caused more than 50% of the accidents in my industry (petroleum, oil, gas).

The operator, during an upset, is pretty busy and stressed. Not the best climate for making detailed complex decisions. Often, a sub-optimal decision is made - sometimes leading to accidents.

Putting things in bypass/override/forced values/etc. is also a common practice. Unfortunately, this sometimes leads to trouble, because the information that is missing is sometimes needed by others, or other users are not informed of the bypass/forced value.

I no longer work in a plant, and unfortunately do not have access to data to substantiate these thoughts.

Has anyone else come across similar data or experiences?

GTstartup

You are advocating something that, for my industry, is quite new - that the safety system (computer) can/should override the human operator (or lock them out). Many of the safety experts that I worked with on my last job advocate this approach. In times of stress, the operator often does not make the correct decisions; hence, the safety system will do it for him.

The last power plant project that I worked on (2 gas-fired GE generators and a Dresser-Rand steam generator) I believe followed this approach. I didn't realise that this was prevalent in the industry.

waross

Bad data is the bane of all control/safety systems. How we deal with it, is more of what I am looking for.

In your example, your control system/PLC gives a timed window to address a failing measurement (in this case, a high stack temperature warning for several weeks).

In some of the accidents, the problem was that operations could not distinguish between a failing instrument giving misleading readings and a true high temperature. The result was that the reading was ignored, leading to a failure/accident.

One of the key takeaways for me from working with the safety experts is: people must have confidence in the safety system. Otherwise, the safety system will be useless. Operations will bypass the safety system if they do not trust it, potentially resulting in a greater problem than if the safety system were removed altogether.


All your posts are appreciated.

This is a complex and passionate topic for many in the industry.

I certainly welcome and invite more comments, thoughts, questions...


"Do not worry about your problems with mathematics, I assure you mine are far greater."
Albert Einstein
Have you read FAQ731-376 to make the best use of Eng-Tips Forums?
 
Without a great deal of analysis, the question of who gets to lock out whom is tricky at best.

A computer safety system can only behave, if it's working correctly, as the designer intended and for whatever conditions the designer foresaw. Any deviation from the designer's groundrules could be a potential disaster if the safeties cannot be overridden.

Only if there's been sufficient analysis of ways of screwing up the system should such a decision be made. If the designer overlooked some rather trivial ways of entering an unforeseen condition, then the safety system may wind up injuring or killing someone.

While stress is indeed a factor in these situations, the inability to make decisions in those conditions is usually due to poor or insufficient training. Companies like to claim safety, but seldom pay for the degree of training needed to make things run smoothly in crisis situations. So they rely on mechanisms and computers that can only react to threshold conditions. But, invariably, someone will die from these threshold conditions applied to an unforeseen situation, also because the companies didn't pay for sufficient analysis to uncover all possible contingencies.

TTFN



 
IRstuff,

I agree. The philosophy of the approach is what we are currently debating. Do we trust the "system" or do we trust the "operator"?

Yes, a system is only as good as the designer - who is only as good as the data he/she is given - which is only as good as the data collection/records - which is only as good as the instrument collecting it. Of course, interpretation of the data is another matter.

Yes, you need to train your operators. Unfortunately, training for emergency response is usually an annual thing. People usually train to control normal operations, not to deal with catastrophes.

Even a very well trained operator, under stress, can make a mistake.

Therein lies my quandary, philosophically speaking.



"Do not worry about your problems with mathematics, I assure you mine are far greater."
Albert Einstein
Have you read FAQ731-376 to make the best use of Eng-Tips Forums?
 
With a high-reliability SIS (triple, quad, etc.) I am satisfied with, and perhaps even advocate, software maintenance bypass switches. The use of bypass switches must be logged by the system, as well as in a manual operator log with supervisor authorization or acknowledgement. The system also requires a screen that identifies all points in bypass.
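A minimal sketch of that bookkeeping, assuming invented tag and field names (a real SIS would do this in the logic solver and HMI, not a script):

```python
from datetime import datetime

class BypassRegister:
    """Logs every bypass with its authorization and lists all points in bypass."""
    def __init__(self):
        self.active = {}  # tag -> record of the bypass currently applied
        self.log = []     # append-only audit trail

    def apply(self, tag, operator, supervisor):
        record = {"tag": tag, "operator": operator,
                  "supervisor": supervisor, "applied_at": datetime.now()}
        self.active[tag] = record
        self.log.append(("BYPASS APPLIED", record))

    def remove(self, tag):
        self.log.append(("BYPASS REMOVED", self.active.pop(tag)))

    def points_in_bypass(self):
        """The 'screen that identifies all points in bypass'."""
        return sorted(self.active)

reg = BypassRegister()
reg.apply("TT-101", operator="jsmith", supervisor="mlee")
print(reg.points_in_bypass())  # ['TT-101']
```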

Safety systems require a great deal of commissioning and testing for every little change as well as tracking with the management of change system. Most of my projects fall within 29 CFR 1910.119 process safety management and ISA S84 rules.

 
If the rest of the infrastructure is not there or not reliably available, then the most reliable will be the machine, since its responses are both predictable and repeatable.

I would then concentrate on running through as many failure scenarios as possible and design for the contingencies accordingly.

TTFN



 
JL,

Yes, I agree.

On one system I have worked on, each bypass has a "bypass timer". The bypass must be "re-bypassed" every 8 hours. If not, after the 8 hours the bypass is removed and the point is returned to full service. A warning alerts the user when the bypass timer is about to run down. I actually rather like this idea - it prevents a bypass from being accidentally "forgotten".
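Sketched in Python (the 8-hour period is from the system described above; the warning margin and names are assumptions):

```python
import time

BYPASS_PERIOD_S = 8 * 3600   # bypass must be re-asserted every 8 hours
WARNING_MARGIN_S = 30 * 60   # warn this long (assumed) before the timer runs out

class TimedBypass:
    def __init__(self):
        self.expires_at = None

    def assert_bypass(self):  # initial bypass, or the periodic "re-bypass"
        self.expires_at = time.time() + BYPASS_PERIOD_S

    def status(self):
        if self.expires_at is None or time.time() >= self.expires_at:
            self.expires_at = None
            return "IN SERVICE"  # timer ran out: point returned to full service
        if self.expires_at - time.time() < WARNING_MARGIN_S:
            return "BYPASSED (timer running down - re-bypass or lose it)"
        return "BYPASSED"
```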



"Do not worry about your problems with mathematics, I assure you mine are far greater."
Albert Einstein
Have you read FAQ731-376 to make the best use of Eng-Tips Forums?
 
IRstuff writes…

“A computer safety system can only behave, if it's working correctly, as the designer intended and for whatever conditions the designer foresaw. Any deviation from the designer's ground rules could be a potential disaster if the safeties cannot be overridden.”

This must represent a very badly designed safety system. Safety systems by definition are designed to keep personnel safe and protect equipment. The ultimate safety is the E-Stop button. If the normal operation of a safety system created potential disasters, then a reevaluation of the safety system's control logic is in order.

It may be useful to easily bypass and override safeties during the startup phase of a project. But once the project has been commissioned, allowing operators to override safeties is a dangerous and ill-advised practice. Overriding safeties invariably creates less safe situations. The most common reason for overriding a safety is economic ("We'll lose production time"; "We can get a little more out of it"; "I don't want to wait for maintenance", etc.). Any situation that would benefit from allowing operators the ability to bypass safeties is, in my mind, a candidate for a control review and redesign. If safety overrides are determined to be necessary and are allowed, then the control system designer must ensure that all interested parties are continuously aware of the abnormal condition.

Aside from the philosophical debate of who/what you trust, there are also legal and psychological aspects. If, for example, an operator has the ability to override a gate safety on a packing machine just so he can keep it going to the end of the shift, then he absolutely needs to convey this info to the next operator, or bad things could happen. The next operator "expects" things to be as safe today as they were yesterday. If he forgets which safeties are bypassed that day, it's not his fault. In the end the company is always responsible, and accidents caused by safety overrides could easily be considered gross negligence.
 
You miss the point. E-Stop assumes that an operator is observant and can react quickly enough to prevent harm. In many situations, that may not be possible. An automatic safety system is designed to prevent injury or failure, but as with anything designed to a set of requirements, the requirements may be incomplete, or the operator may subvert the proper operation of the automatic safety by doing something outside of the design parameters.

Anti-lock brake systems (ABS) were designed to allow the operator to concentrate on steering the vehicle without also worrying about pumping the brakes and by how much.

However, if the operator is manually pumping the brakes, then the ABS function will not operate correctly, since it's specifically designed for a constant brake command. Since most newer drivers are no longer taught to pump their brakes, that fits the design parameters.


TTFN



 
LAsludge said:
Safety systems by definition are designed to keep personnel safe and protect equipment. The ultimate safety is the E-Stop button.

Correct me if I'm wrong, but this seems contradictory. If the safety system by definition is designed to keep personnel safe and protect equipment, then it would be the ultimate safety device, not the E-Stop button (or emergency stop button).

Or, am I missing something?

"Do not worry about your problems with mathematics, I assure you mine are far greater."
Albert Einstein
Have you read FAQ731-376 to make the best use of Eng-Tips Forums?
 
The Basic Process Control System controls the plant within the normal operating range. Mechanical devices such as pressure safety relief valves protect from transient situations. The independent safety system kicks in where the regulatory control system and operator attention did not keep the measurements within limits. A big red pull to trip button next to the gate could be a good thing. A substation shunt trip could work too.
 
There are many parts of a control system that affect safety.
But when we have a "BIG RED STOP BUTTON" we expect it to stop the operation. I disagree with running the ultimate stop signal through the computer. If I am trying to stop a process because the I/O card on the computer has shorted, I may not have time to call technical support.
I actually saw a system in which the operator would have to hold down the stop button and call for someone else to pull the plug in the event of a computer failure or I/O card failure. If an operator tried to stop any motor in the plant, the stop button would drop out the motor contactor. It would also send a signal to the computer requesting that the motor be stopped. When the button was released, the computer would restart the motor and run it until the computer got around to processing the request and shutting it down again. The response was very slow.

Ask in the motor forum if it is a good idea to interrupt the current to a large motor for a few seconds and then re-apply it. It can break motor shafts. The problem could have been corrected quite easily, but to do so would have violated the computer group's "Total Control" philosophy, and they refused to consider changes.
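To make the flaw concrete, a sketch contrasting that momentary pass-through with a latched (seal-in) stop; all names are illustrative:

```python
def motor_should_run_flawed(stop_button_held, computer_run_command):
    # The design described above: the stop is honored only while the button
    # is physically held, so the computer restarts the motor on release.
    return computer_run_command and not stop_button_held

class LatchedStop:
    """A seal-in latch: one press holds the stop until deliberately reset."""
    def __init__(self):
        self.latched = False

    def update(self, stop_button_held):
        if stop_button_held:
            self.latched = True  # seal in the stop request
        return self.latched

    def reset(self):  # a deliberate, separate operator action
        self.latched = False

def motor_should_run(latch, stop_button_held, computer_run_command):
    return computer_run_command and not latch.update(stop_button_held)
```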
This system was so flaky that the supplier had to have a technician living on site to keep the software running. It also had an over 50% failure rate on I/O cards, out of the box.
Mistakes are made and stuff happens. However, denial prevents corrective action. In this case the reaction of the computer group seemed to be, "Trust us. We know what we're doing and everything is good."
The denial of obvious problems I think was a sign of b-lls bigger than brains and the b-lls aren't very big.
I have often wondered what happened to that design group a few years later when their parent company bought out another company with an excellent reputation for both hardware and software so that they could stay in the DCS and SCADA business.
Re jumpers. Jumpers are going to happen. I find it rare to go into an old plant and not find at least a few jumpers.
Bypassing in software gives a way to record and monitor bypassed circuits. I have no problem with computer control or with software bypass rather than hardwired jumpers. I prefer software bypass if automatic logging is included in the package. However, I have a problem with computer techs who decide that every circuit must be controlled by the computer, with no exceptions.
Do what you want with the computer control but give me a hard wired way to stop the thing when the I/O card shorts out.
To sum up my feelings about computer control versus hardwired control: for most circuits, I prefer computer control, with the ability to bypass many of the circuits if needed, as long as there is good logging and notification.
However, when you are considering the implications of high pressures, high temperatures, and a host of other potentially dangerous situations that may result from the failure of some equipment or another, consider that the computer and/or the I/O interface are also subject to failure.
If you think that computers and software never fail, how have you gotten this far without using MS products?
respectfully
 
I don't think that anyone is arguing that an E-Stop shouldn't be a direct-wired thing.

TTFN



 
Waross addresses real concerns. Many use communication connections such as DeviceNet for normal MCC start/stop accessories. The safety system can power the contactor string as a permissive to run. In this manner, either an operator pushbutton or the safety shutdown system I/O can trip the starter/contactor. If an operator pushbutton trips the motor directly, another contact should be an input to the safety system to identify that an operator tripped the equipment.

In my industry a critical safety system is not a simple computer. The I/O and logic processing are triplicated and voted. Most measurement inputs are analog values from transmitters, not on/off alarm or status switches. Depending upon the hazard criticality of the equipment area, the measurement instruments may be triplicated. The triple measurements and triple electronics improve reliability by reducing nuisance trips. The I/O includes supervisory testing that identifies outputs failed "on" and other such failures where the command and actual status do not match.
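An illustrative 2-out-of-3 (2oo3) vote on triplicated analog transmitters, the scheme described above (the setpoint and readings are invented):

```python
def vote_2oo3(readings, trip_setpoint):
    """readings: three analog values from triplicated transmitters.
    Trip only when at least two channels exceed the setpoint, so a single
    failed transmitter causes neither a missed trip nor a nuisance trip."""
    votes = sum(1 for r in readings if r >= trip_setpoint)
    return votes >= 2  # True -> trip

print(vote_2oo3([498.0, 503.5, 501.2], 500.0))  # True: two channels agree
print(vote_2oo3([512.0, 471.0, 468.9], 500.0))  # False: lone high channel ignored
```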

Usually the logic to limit starts is within the motor control system or a PLC or such process control system AND outside the safety system.

Normally the management of change for the safety group includes operations and engineering, with a much wider scope than a computer or process control group's domain. Arrogance can enter the picture. However, even an arrogant programmer would normally be influenced by a management of change committee with broad experience and the ear of top management.

Some companies will fire operators, technicians or engineers who use jumpers to bypass the software. TUV hardware certification should help to assure that the hardware will trip upon a command to trip. The triple hardware and voting circuits attempt to minimize unintentional trips.

Neither the Basic Process Control System nor the safety shutdown system should use MS products for control. MS products are common for the operator interface. The safety system must be completely separate from the regulatory or sequential process control. As stated, most things electrical or mechanical are subject to failure.
 
Hello JLSeagull
I agree with you completely.
My last experience was quite a while ago. There was no particular redundancy, and motor control was at the whim of the computer. The stop buttons did directly stop a motor, but as soon as they were released, the computer re-asserted control and restarted the motor.

A couple of years later, I worked on a similar project where the control systems were excellent. I didn't have a complaint, and I am not easy to satisfy when it comes to safety systems. There were a lot of the same people on the second project. When it came time to install the I/O interface cards, the electricians were anticipating a lot of overtime. It didn't happen. In the whole project I was aware of only one I/O card that didn't function. The engineer who liaised with the computer group said he'd try to get a replacement card the next morning. He was back the next day and told us that the card in question was a late addition and the software had not yet been set up. "Now it's working." 100% good cards, out of the box.
The company responsible for the very good system was later purchased by the company responsible for the disastrous system. Go figure?
It sounds as if your system has been born out of the experience of previous mistakes.
I am happy to hear that the systems have been much improved since my last exposure.
I don't think it does any harm to remember the earlier disasters. It goes a long way to combat the arrogance you mentioned.
Respectfully
 