How to Analyze Problems Related to Internal Errors (ORA-600) and Core Dumps (ORA-7445) using My Orac


In this Document
  Purpose
  Last Review Date
  Instructions for the Reader
  Troubleshooting Details
  References


Applies to:

Oracle Server - Enterprise Edition - Version: 8.1.7.4 to 11.2.0.2 - Release: 8.1.7 to 11.2
Information in this document applies to any platform.
*** Checked for relevance on 16-Nov-2011 ***

Purpose

1.1 Abstract
============
This document provides guidelines for customers to do an initial analysis of problems related to internal errors (ORA-600) and core dumps (ORA-7445) by using My Oracle Support keyword searches. After finding a set of documents, either bug database entries or notes, these must be correlated to the specific circumstances to further narrow down the search results. Hints to do this are given.

1.2 Introduction
=================
Many problems have already been discovered and are documented in My Oracle Support notes and in published bug information. With the proper techniques it is often possible to narrow a search down to a particular bug that matches your problem and to find documented workarounds or patch information. This document is mainly aimed at rediscovering bugs in Oracle code that cause internal errors or core dumps, but it may be equally applicable to other kinds of problems. The following paragraphs show how to extract the relevant keywords from the trace file in order to find the documents that are relevant to the specific error condition. They also help in further narrowing down the list of relevant documents by correlating them to the specific circumstances of the error.

Last Review Date

November 16, 2011

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

2.1 ORA-00600: internal error code, arguments: [argument1] [argumentX] ....
==========================================================================
The internal error ORA-600 is raised within the Oracle kernel when an exceptional condition occurs. Inside the kernel code, at various stages of processing, so-called assertions are executed. These are conditions that must be true for processing to proceed. The assertions are internal health checks that guard the integrity of the memory and data of the instance and the database. When such an assertion fails, an ORA-600 error is raised with either a numeric or alphanumeric first argument and possibly more arguments, depending on the particular error. Note that not all ORA-600 errors are necessarily fatal errors that cause the session to terminate; some are quite benign. Others, however, can be severe, so they must always be carefully investigated.

2.1.1 The First Argument of the Internal Error ORA-600
======================================================
The single most important piece of information is the first argument of the internal error, either numeric or alphanumeric; it uniquely identifies the specific module where it was raised and what assertion was failing. Always include this first argument in your keyword search when trying to rediscover known problems.
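
As an illustration of picking out that first argument, here is a minimal Python sketch. It assumes the error line follows the pattern shown in the heading of section 2.1, with each argument in square brackets. The function name `first_argument` and the argument values in the sample line are made up for this example.

```python
import re


def first_argument(error_line):
    """Return the first bracketed argument of an ORA-600 line, or None."""
    # The error may be written as ORA-600 or ORA-00600.
    error = re.search(r"ORA-0*600\b", error_line)
    if error is None:
        return None
    # Look for the first [...] group that follows the error code.
    argument = re.search(r"\[([^\]]*)\]", error_line[error.end():])
    return argument.group(1) if argument else None


# Hypothetical error line; the argument values are invented.
sample = "ORA-00600: internal error code, arguments: [99999], [1], [2]"
print(first_argument(sample))
```

The returned value is the keyword to include in every search for this error.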

2.2 ORA-07445: exception encountered: core dump
==============================================
A core dump is an exceptional condition similar to the internal error ORA-600; the big difference, however, is that the kernel did not anticipate the error. In the case of the internal error, the exceptional condition was discovered by an assertion, which is a predefined check. The core dump, by contrast, happens because the operating system at some point aborts the process for performing a forbidden action, such as trying to access an area of memory that does not belong to the process. This is why core dumps are often referred to as access violations. The term 'core dump' stems from a period when memory was built from magnetic cores; in computer terminology 'core' equates to 'memory'. A core dump means that the memory of the process was dumped to a file named 'core' on the file system.

2.2.1 Identify the failing module
=================================
It is important to know in which internal module the core dump occurred. This is often, but not always, printed together with the error. If it is not, refer to section 3.2.5 Call Stack Trace. If the failing module is known, include it in your keyword search.
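
When the module is printed with the error, it can be extracted mechanically. The Python sketch below assumes that the module appears as the first bracketed value after the error text, in a form such as `[name()+offset]`. That layout is an assumption and may differ by platform and version. The function name `failing_module` and the module name `examplefn` are hypothetical.

```python
import re


def failing_module(error_line):
    """Return the module named in an ORA-7445 line, or None if absent."""
    error = re.search(r"ORA-0*7445\b", error_line)
    if error is None:
        return None
    bracket = re.search(r"\[([^\]]*)\]", error_line[error.end():])
    if bracket is None:
        return None
    # Keep only the leading identifier, dropping "()" and any "+offset".
    name = re.match(r"[A-Za-z_$][\w$]*", bracket.group(1).strip())
    return name.group(0) if name else None


# Hypothetical error line; examplefn is not a real module name.
sample = "ORA-07445: exception encountered: core dump [examplefn()+52] [SIGSEGV]"
print(failing_module(sample))
```

A result of `None` signals that the module was not printed with the error, in which case the call stack trace is the place to look.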

3.1 Trace files
===============
For both internal errors and core dumps a trace file is written, either to user_dump_dest for user processes or to background_dump_dest for background processes. In 11g all trace files are written to the location defined by the parameter diagnostic_dest. The trace files for both types of error are treated in the same manner, although the particular information in the trace file, such as whether a certain section is present, may depend on the particular error. In the case of a core dump it is still possible for the kernel to dump relevant information by calling dump routines such as ksedmp(), because the error is trapped by a signal handler. On the other hand, for most internal errors, the process is crashed (aborted) when the dump is ready.

3.2 Relevant sections in Trace files
====================================
The following list is not intended to provide a detailed explanation or in-depth understanding, but these sections can usually be identified in the trace file. They will help you to glean the relevant keywords for the search and to understand the specific conditions under which the error occurred, so that you can narrow down your search results.

3.2.1 Header
============
The header section includes such information as the name of the trace file, the specific Oracle version, ORACLE_HOME, system name, node name, OS version, instance name and process information. While not directly relevant for your keyword search, this information is vital later to correlate the documents to your problem.

3.2.2 Error Section
===================
The error as it was written in the alert.log is usually repeated, possibly along with some extra information dependent on the error. When an internal error occurs a developer may have decided to write some crucial state information apart from the ORA-600 arguments directly to the trace file. When this is the case, those are usually gems for your search.

3.2.3 Current SQL statement for this session
============================================
You may or may not know what was being conducted at the time of the error. While common keywords like 'insert', 'update' or 'delete' may not be beneficial to your search (they are simply too common) this is of course very useful to correlate documented rediscovery information to your problem. If for example a certain bug has in its rediscovery information that it happens on insert only and you are performing a delete statement, then it is safe to say that this bug is not your problem.

See also 3.2.6 Cursor Dump in case the current SQL statement is not present in the trace file.

3.2.4 PL/SQL Call Stack
=======================
This section is present if the session was performing PL/SQL. It shows which user or internal PL/SQL packages were called. Call stacks are read bottom up; if there are Oracle packages in there, include them in your keyword search.

3.2.5 Call Stack Trace
======================
The modules listed in the call stack trace provide excellent keywords that can be used for rediscovery; they usually uniquely identify specific bugs. However, care must be taken to discard the top and bottom modules. Modules such as the following must be ignored: sigacthandler(), ssexhd(), ksedmp(), ksesicX() (where X is the number of extra arguments to an ORA-600 besides the first). All kse* and kge* modules can in general be ignored; they stand for Kernel Service Error and Kernel Generic Error respectively. These modules are invoked AFTER the error has occurred and perform such tasks as dumping the trace file. As such they are common to most internal errors and core dumps and will not help in narrowing down your search. The same goes for the bottom modules, which are always present because they are used to initiate process startup; everything below (and including) opiexe() can safely be ignored. When you include three or four relevant modules from the top of the call stack trace (together with some other relevant keywords), a My Oracle Support search will usually return the relevant documents. If none are returned, you may have to reduce the number of modules searched upon, removing the modules further down the stack from the search. If you still have no results, you may have hit an as yet undiscovered problem.
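
The filtering rules above can be sketched in Python as follows. This is an illustration only. It assumes the call stack has already been reduced to a list of function names ordered from the top of the stack downwards. The function name `search_modules` and the `hypo_*` module names are invented for the example; the ignored names come from the paragraph above.

```python
IGNORED_NAMES = {"sigacthandler", "ssexhd"}   # signal handling modules
IGNORED_PREFIXES = ("kse", "kge")             # error handling modules
BOTTOM_MARKER = "opiexe"                      # this module and below are ignored


def search_modules(call_stack, limit=4):
    """Return up to `limit` relevant module names, topmost first."""
    relevant = []
    for entry in call_stack:
        # Strip "()" and any "+offset" that follows the function name.
        name = entry.strip().split("(")[0]
        if name == BOTTOM_MARKER:
            break                              # discard the bottom of the stack
        if name in IGNORED_NAMES or name.startswith(IGNORED_PREFIXES):
            continue                           # discard error handling modules
        relevant.append(name)
    return relevant[:limit]


# Hypothetical stack; the hypo_* names are placeholders, not real modules.
stack = ["ksedmp()", "ksesic2()", "hypo_top()", "hypo_mid()",
         "hypo_low()", "opiexe()", "hypo_startup()"]
print(search_modules(stack))
```

Keeping the result in stack order matters, because the modules furthest down are the first ones to drop when a search returns nothing.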

3.2.6 Cursor Dump
=================
Even when the current SQL statement is not listed in the top section of the trace file together with the error, the cursor dump can still reveal the SQL statement being performed at the time of the error. Simply search for 'current cursor', identify the cursor number and scroll down until you find the cursor with that number. As a bonus, you may also be able to identify the bind variables (if any) used for the statement in this section. See Note:154170.1 for further information.

3.2.7 Process State - Session State Object
==========================================
A process state is a list of the process state objects. It is beyond the scope of this document to explain these in detail; for now it is enough to understand that state objects are used to organize, in a hierarchical manner, the memory objects that contain the relevant state information of a session. One of the state objects contains session information that helps in narrowing down the specifics of the error condition. Simply search for 'program' in your trace file and you will find a section similar to the following:


SO: 7000000223acfe0, type: 4, owner: 700000022357868, flag: INIT/-/-/0x00
(session) trans: 7000000230d47a0, creator: 700000022357868, flag: (18100041) USR/- BSY/-/-/-/-/-
DID: 0001-0012-00000083, short-term DID: 0000-0000-00000000
txn branch: 0
oct: 2, prv: 0, sql: 70000002831a5a0, psql: 70000002831a5a0, user: 48/ISIS
O/S info: user: someuser, term: SOMETTY01, ospid: 628:1948, machine: BOX\SOMETTY01
program: someprogram.exe
application name: someprogram.exe, hash value=0
last wait for 'db file sequential read' blocking sess=0x0 seq=1054 wait_time=16729
file#=1, block#=2ec3, blocks=1
temporary object counter: 0

When a specific program is being used, you may want to include that in your keyword search, whether it is an oracle client program or not; we sometimes provide information on third party products in relation to Oracle on an 'as is' basis. Try to identify what type of program is being used, these include JDBC (thin or OCI), Pro*C, OCI etc. Some problems are for example specific to JDBC and this will help greatly in identifying the problem.
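
The program and application names in the excerpt above can be collected with a few lines of Python. This sketch assumes the `program:` and `application name:` lines look like those in the excerpt; other versions may lay them out differently. The function name `session_keywords` is hypothetical.

```python
def session_keywords(trace_text):
    """Collect program and application names from a session state object."""
    info = {}
    for raw_line in trace_text.splitlines():
        line = raw_line.strip()
        if line.startswith("program:"):
            info["program"] = line.split(":", 1)[1].strip()
        elif line.startswith("application name:"):
            value = line.split(":", 1)[1]
            # Drop the trailing ", hash value=..." part.
            info["application"] = value.split(",", 1)[0].strip()
    return info


# Lines copied from the excerpt above.
excerpt = """\
oct: 2, prv: 0, sql: 70000002831a5a0, psql: 70000002831a5a0, user: 48/ISIS
program: someprogram.exe
application name: someprogram.exe, hash value=0
"""
print(session_keywords(excerpt))
```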

4.1 Use 'Advanced Search'
=========================
To find relevant bug database entries (more likely to contain call stack trace modules) always perform the Advanced search and make sure to check the 'Bug Database' check box in the 'sources' section on the right of the page, in addition to the 'Knowledge Base', which consists of the Notes written by Oracle personnel. Uncheck the 'Technical Forum' checkbox at first; when nothing is found in the knowledge base and bug database, some relevant information may be found in the forums (this is not to downplay the forums, they just have a different use). The tips on the advanced search page provide further guidelines on how to search efficiently.

4.2 General Comments on Keyword Searches
========================================
The important thing is trial and error. When a search returns an overwhelming number of documents, try to narrow it down by including another keyword that is unique to your problem. On the other hand, when no results are returned, omit a few keywords. If you have included too many modules from a call stack trace, delete some from your search but retain the topmost module(s). When the bug was discovered, the specific module may have been called from a different module than in your case, or the call stack trace may not have been clearly documented in the bug.
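
The trial and error described above can be organized as a sequence of progressively broader keyword sets. The following Python sketch is one possible way to generate them; it does not call My Oracle Support. The function name `candidate_queries` and the `hypo_*` module names are made up, and the modules are assumed to be ordered topmost first.

```python
def candidate_queries(base_keywords, modules, min_modules=1):
    """Yield keyword lists from the most specific to the least specific."""
    # Drop modules from the bottom of the list first, so that the
    # topmost module(s) are retained in every query.
    for count in range(len(modules), min_modules - 1, -1):
        yield list(base_keywords) + list(modules[:count])


# Hypothetical keywords; try each query in turn until results appear.
for query in candidate_queries(["ORA-600"], ["hypo_top", "hypo_mid", "hypo_low"]):
    print(" ".join(query))
```

Going the other way, from too many results to fewer, means adding a keyword that is unique to your problem rather than removing one.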

4.3 No Relevant Bug or Document is Found
========================================
You may of course be the first customer to have hit an as yet undiscovered problem. In that case, file a
service request using My Oracle Support and try to describe in detail what the problem is. You may also include
a summary of the analysis that you have already performed based on these guidelines.

4.4 Provide Feedback
====================
If you find documents that are unclear or you think contain errors, or if you think you can add some relevant
information based on your experience using our products, please use the feedback button and state the
document number and your comment (use the 'Technical Library feedback/questions' radio button in step 2). We are very grateful for quality feedback as it improves our knowledge base.

5.1 Correlate Bugs and Documents to Your Problem
================================================
By now you have gained more understanding of the problem by browsing through the relevant sections of the trace file, identified suitable keywords in terms of both quantity and quality (uniqueness), executed your search, and been presented with some relevant documents. The Notes are usually clear enough; they go through a well defined process of QA and will provide you with detailed circumstances to match your problem.

Bugs can be more cumbersome to read; try scrolling down to the bottom immediately (click 'Go to End') to find the 'Rediscovery' or 'Release Note' section (or search for it) to match the bug description with your specifics. A closed published bug should have such a section unless it is a duplicate of another bug. In that case the base bug must be checked. For bugs resolved in 9.2 onwards there is often a summary note, with the reference <bug number>.8, which is easier to read.

You are hitting a certain bug if you can match all circumstances listed in the rediscovery information to your problem AND the designated fix (either a patch or workaround) solves the problem.

6.0 Summary
===========
Good search keywords include error messages, first arguments of internal errors and relevant internal module names. Program names and types may help to narrow the search down further. Correlate the results with the documented rediscovery information. If unsure, file a service request and include your findings; this helps the analyst.

References

NOTE:153788.1 - ORA-600/ORA-7445 Error Look-up Tool
NOTE:154170.1 - How to Find the Offending SQL from a Trace File
NOTE:156657.1 - How To Find Known Issues or BUGs Through a MetaLink Search
NOTE:1812.1 - TECH: Getting a Stack Trace from a CORE file
NOTE:211909.1 - Customer Introduction to ORA-7445 Errors
