Red Hat 5 / vi usage causes swap space to be exhausted

Source: Internet  Editor: 程序博客网  Time: 2024/05/05 16:16

Please be aware that Red Hat has found a problem using vi with large files in the latest RH5 kernel, which caused Thursday's TPS sev 1 outage (EMS 73990).  The majority of the CM Linux estate is on RH5, so we are very exposed.

 

Trigger

The bug manifests itself when vi is used with a large file: swap space is exhausted and the machine becomes unresponsive within a few minutes.  Red Hat will not define a "large" file at this time and does not have a fix available yet.

 

Action Required

1. Avoid using vi completely during business hours. 

 

2. Use the “less” or “more” commands to view logs. (This should be standard practice regardless of the vi bug.)

 

3. SAs remove the "vim-minimal" package and install the standard "vim" package.

 

4. Use the vi -n option, which runs vi without a swap file.

 

I will be working with infrastructure teams to determine how best to track exposure and remediation.  This may result in a more formal PEC record.

 

Please contact your Linux SA for more details on the prescribed workarounds. 

 

 

======================

The short summary is that the root cause is probably pilot error, and the recommendation is NOT to make any changes to the env (i.e. aliasing vi to vi -n).  There is also a recommendation to move to the latest SOE builds of Red Hat for general maintenance purposes at the earliest possible convenience.

 

·         Open to feedback from the group on the alias / CATE’s recommendations

·         David – In relation to the specific incident, we need to discuss the alternative root cause presented here, along with what alternative tasks are necessary

 

 

 

 

As promised, here is CATE's report on the issue that occurred with the RHEL 5 server and vi.

 

Based on all forensic evidence and information provided by the app folks and the SAs, CATE is unable to determine that this issue was caused by any known bug within our SOE build of Red Hat. Any previous information provided by Red Hat pointing to a particular bug was premature and not conclusive. As relayed by Bob Mader, who is our Red Hat SME, the actions taken by the user that led to the issue do NOT match the nature of that specific bug report. Therefore, we are fairly certain that this bug was not the root cause.

 

For review, the user opened a 15 MB XML file and an empty file in vi, searched for the string 7323920, used the vi yank line command, switched to the buffer of the empty file, and then used the vi put command to paste the yanked line. Bob explains that the scenario that played out could occur if the string 7323920 was in the paste buffer of the terminal session: if that number was pasted right before the put command, vi would treat it as a count and repeat the put that many times. The line was 8174 bytes long, so 8174 * 7323920 = 59,865,722,080 bytes, or about 60 GB (roughly 56 GiB). This would obviously be "driver error" and not a bug.
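The multiplication in this scenario can be checked directly. The following is a minimal sketch in Python; the function and constant names are illustrative only, and the calculation assumes each repeated put costs exactly the line length, ignoring line terminators and any editor overhead.

```python
# Figures taken from the report; names are illustrative, not from any tool.
LINE_BYTES = 8174        # length of the yanked line, in bytes
REPEAT_COUNT = 7323920   # search string interpreted by vi as a count for "put"


def put_size_bytes(line_bytes: int, count: int) -> int:
    """Approximate bytes needed to hold `count` copies of one line.

    Assumption: ignores newline characters and editor bookkeeping.
    """
    return line_bytes * count


total = put_size_bytes(LINE_BYTES, REPEAT_COUNT)
print(f"{total} bytes")                      # 59865722080 bytes
print(f"{total / 10**9:.1f} GB (decimal)")   # 59.9 GB (decimal)
print(f"{total / 2**30:.1f} GiB (binary)")   # 55.8 GiB (binary)
```

Whichever unit convention is used, the result is far beyond the memory and swap of a typical server, which is consistent with the machine becoming unresponsive within minutes.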

 

With this hypothetical scenario aside, we are still confident that moving to vim-enhanced or using the -n option will NOT fix anything here. Making any changes like these at this point would actually be introducing more unknowns to the environment. Therefore we are asking that no such actions be taken.

 

One recommendation CATE makes is to upgrade the servers to the most recent SOE builds of Red Hat for general maintenance purposes at the earliest possible convenience.

 

Please let me know if there are further questions.

 

 

=====================================

Summary:

Issue is believed to have been caused by IT user error – a sequence of events from a “cut / paste” action. 

 

Recommendations:

·         Review the info below detailing the circumstances in which the error happened

·         Don’t use vi in business hours if avoidable.  Don’t use vi for viewing files; “less” and “more” are better

·         If you do use vi, use vi -n

·         Be cautious when pasting in terminal sessions.   

·         Teams need to plan to upgrade to the most recent SOE builds as recommended below

 

It goes without saying that general caution and care should be exercised when in the production environment.  The detail below shows how easily a seemingly innocuous mistyped command can have a far-reaching impact.

 

Detail:

Based on all forensic evidence and information provided by the app folks and the SAs, CATE is unable to determine that this issue was caused by any known bug within our SOE build of Red Hat. Any previous information provided by Red Hat pointing to a particular bug was premature and not conclusive. As relayed by Bob Mader, who is our Red Hat SME, the actions taken by the user that led to the issue do NOT match the nature of that specific bug report. Therefore, we are fairly certain that this bug was not the root cause.

 

Sequence

·         the user opened a 15 MB XML file and an empty file in vi

·         searched for the string 7323920

·         used the vi yank line command

·         switched to the buffer of the empty file

·         used the vi put command to paste the yanked line
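For comparison, the same extraction can be done without opening an editor at all. This is a minimal sketch in Python, assuming the intent of the sequence above was simply to copy the line containing the search string into a new file; the function name and any file paths are hypothetical and not taken from the incident.

```python
from typing import Optional


def copy_first_matching_line(src_path: str, dst_path: str, needle: bytes) -> Optional[int]:
    """Copy the first line of src_path that contains `needle` into dst_path.

    The source is read one line at a time in binary mode, so memory use is
    bounded by the longest line rather than by the size of the file.
    Returns the 1-based line number of the match, or None if nothing matched
    (in which case dst_path is not written).
    """
    with open(src_path, "rb") as src:
        for line_number, line in enumerate(src, start=1):
            if needle in line:
                with open(dst_path, "wb") as dst:
                    dst.write(line)
                return line_number
    return None


# Example call (hypothetical paths):
# copy_first_matching_line("/tmp/input.xml", "/tmp/extracted.txt", b"7323920")
```

Because no interactive keystrokes are involved, stray terminal paste content cannot be interpreted as a repeat count.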

 

The scenario that played out could occur if the string 7323920 was in the paste buffer of the terminal session: if that number was pasted right before the put command, vi would treat it as a count and repeat the put that many times. The line was 8174 bytes long, so 8174 * 7323920 = 59,865,722,080 bytes, or about 60 GB (roughly 56 GiB). This would obviously be "driver error" and not a bug.

 

With this hypothetical scenario aside, we are still confident that moving to vim-enhanced or using the -n option will NOT fix anything here. Making any changes like these at this point would actually be introducing more unknowns to the environment. Therefore we are asking that no such actions be taken.

 

One recommendation CATE makes is to upgrade the servers to the most recent SOE builds of Red Hat for general maintenance purposes at the earliest possible convenience.

 

 
