
You Probably Don’t Need RAC

If you’ve been holidaying in Siberia or similar places for about a year, you have
probably not talked to an Oracle Sales rep about RAC yet. But you will no doubt find
that there’s a voice mail waiting for you when you turn your mobile phone on again
after returning home from the vacation.

RAC is being pushed very hard by Oracle. You will get high availability, incredible
scalability, a much improved personal life, the ability to partition workloads, buy
cheap Linux servers and what have you.

It sounds pretty good. How can anyone say no to that kind of offer?

RAC is not OPS

No, RAC is not OPS (Oracle Parallel Server), but it looks a lot like it. Oracle
Marketing tries really hard to distance RAC from OPS, and I don’t understand why.
I mean: If the basic code has been around for many years it means it’s stable,
debugged and tried. If it’s all new, who dares install it in a critical system?
Fortunately, it’s not true that RAC is not OPS. The basic parts of the code – GES
and GCS – are pretty much the same as they’ve always been.

GES stands for Global Enqueue Service and GCS stands for Global Cache Service.
More about that later.

A little history: OPS was created for version 6 of Oracle. The only clusters around
then were VAX/VMS clusters, but unfortunately the VAX/VMS Distributed Lock
Manager (DLM) was created originally to handle the coordination of relatively few
resources, such as files and devices, not 1000s of buffers in a buffer cache (Oracle or
others). It proved way too slow for OPS.

So Oracle had to create their own DLM for VAX/VMS, which they did. It took a
while, though, so it wasn’t until 6.0.35 (which was called “6.2” to celebrate the OPS
feature) that it finally came out.

I remember taking one of the first OPS classes (in Chicago) shortly after joining
Oracle and thinking that Oracle Development had gone mad – creating their own
DLM instead of letting the Digital guys do it (they had, after all, created the clusters
and the whole concept).

I was wrong. Oracle’s own DLM worked very well, and Digital adopted Oracle’s
technology and ideas in their own DLM, so when version 7 of Oracle came out, it was
again Digital’s native DLM that was used.

The UNIX vendors then started doing clusters (well, NCR had done it for a while).
And they mostly got the DLM technology from Oracle.

Microsoft certainly didn’t get their DLM technology from Oracle when they started
making Windows clusters. Oh no. They got it from Digital.

Fun Fact: The GES/GCS code was already in Oracle version 5. Bjørn Engsig, who has
worked with Oracle source code since 1983, found out about this and implemented his
own, very crude, lock manager on a Danish Unix system running version 5. He got it
to work, but only for demonstration purposes – his home-written lock manager
basically used database-level locking, which is not really useful.

PING

Oracle had to make sure that a buffer wasn’t modified by two different processes at
the same time – which one should then be written to disk later? So instead of “just”
serialising the access to one copy of the block in one buffer (which can be achieved
with the combination of hash buckets, chains and latches that we know so well),
Oracle had to coordinate several copies in several buffer caches across nodes.
This was achieved using a new kind of locking (called Parallel Cache Management or
PCM locks) which was coordinated across nodes/instances using the DLM and
various background processes.

When there was a “conflict”, ie the same block/buffer was requested by more than
one instance, the “exclusive” lock held by the first “holder” had to be down-graded
to a “shared” lock held by all “holders”. This down-grade/sharing could only be
done by first making sure that all holders were seeing the same image of the
block/buffer.

So the copy of the block that was in the buffer cache of the first holder was written to
disk and then that copy of the block was read into the other buffer caches. The term
“ping” was introduced to describe other instances requesting a buffer held
exclusively by one instance.
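The disk-based ping described above can be sketched as a toy model. This is only an illustration of the sequence of events (write by the holder, lock down-grade, read by the requester); the names Disk, Instance and request_shared are hypothetical and have nothing to do with Oracle's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    blocks: dict = field(default_factory=dict)  # block id -> contents on disk
    writes: int = 0
    reads: int = 0


@dataclass
class Instance:
    name: str
    cache: dict = field(default_factory=dict)   # block id -> buffered contents
    locks: dict = field(default_factory=dict)   # block id -> "X" or "S"


def request_shared(requester, holder, disk, block_id):
    """Requester wants a block that the holder may hold exclusively."""
    if holder.locks.get(block_id) == "X":
        # The ping: the holder first writes its current image to disk ...
        disk.blocks[block_id] = holder.cache[block_id]
        disk.writes += 1
        # ... and its exclusive lock is down-graded to shared.
        holder.locks[block_id] = "S"
    # The requester then reads that same image from disk into its own cache.
    requester.cache[block_id] = disk.blocks[block_id]
    disk.reads += 1
    requester.locks[block_id] = "S"


disk = Disk(blocks={42: "old image"})
a = Instance("A", cache={42: "new image"}, locks={42: "X"})
b = Instance("B")
request_shared(b, a, disk, 42)
print(a.locks[42], b.locks[42], b.cache[42], disk.writes, disk.reads)
# prints: S S new image 1 1
```

One request from the second instance costs one physical write plus one physical read, which is the whole problem with pinging via disk.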

Pinging via disk is slow. If you had an index on a column that kept growing on the
right-hand side, the right-most leaf block could get pinged back and forth non-stop
between instances. Pinging via disk could kill your system’s performance.

The workarounds included data partitioning, temporary tablespaces (introduced in
7.3) where each instance had their own latch instead of a shared Dictionary lock (the
ST lock – remember the ORA-1575?), reverse key indexes (7.3), which meant that it was
effectively random which leaf block you would hit even if you had monotonically
increasing key values, and other tricks.
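To see why reversing the key helps, here is a small sketch. It assumes a toy index with four leaf blocks that split the key space evenly by leading byte, which is a simplification made up for this example and not how a real B-tree allocates leaves. The helper names reverse_key and leaf_for are hypothetical.

```python
from collections import Counter


def reverse_key(n: int, width: int = 4) -> bytes:
    """Big-endian bytes of n, stored in reversed byte order."""
    return n.to_bytes(width, "big")[::-1]


def leaf_for(key_bytes: bytes, leaves: int = 4) -> int:
    """Toy assumption: leaves split the key space evenly by leading byte."""
    return key_bytes[0] * leaves // 256


keys = range(1000, 1256)  # 256 monotonically increasing key values

plain = Counter(leaf_for(k.to_bytes(4, "big")) for k in keys)
rev = Counter(leaf_for(reverse_key(k)) for k in keys)

print(sorted(plain.items()))  # [(0, 256)]
print(sorted(rev.items()))    # [(0, 64), (1, 64), (2, 64), (3, 64)]
```

With plain keys every insert lands in the same leaf, so that one block is the one every instance fights over. With reversed keys the fastest-changing byte comes first and the inserts spread across all leaves.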

Oracle 8i: Cache Fusion introduced

Oracle 8i (that’s 8.1 where the dot is moved on top of the 1) introduced a new way of
pinging via the HSI (High-Speed Interconnect) or similar mechanism, ie a kind of
memory-to-memory transport instead of memory-to-disk-to-memory. It’s not easy to
do, and it was initially only done for CR (consistent read) blocks/buffers.

It worked for some and didn’t work for others. On several OPS installations here in
Denmark they had to deliberately turn it off in 8.1.6 and 8.1.7.

By the way: Oracle had introduced their own, generic Lock Manager (LM)
mechanism in Oracle 8.0, signalling that they would soon be pretty independent of the
DLM code from the various vendors.

You could say that the LM was the equivalent of the Oracle source code being OS
independent and then having a small layer in the code known as the OSD (Operating
System Dependent). With the introduction of the integrated LM Oracle only had to
manage a small OS-dependent layer for each port – the rest was generic code. Respect
again to the engineers at Oracle Development.

Oracle9i: Cache Fusion all over – and a new name

With Oracle9i (called 9.0 and 9.2 just to confuse the enemy) all pinging is done via
the memory channel or high-speed interconnect. That’s it.

But just as it was time to call the version 6.2 instead of 6.0.35 back then, it was time
to call it RAC instead of OPS.

Oracle sales people actually started dissing OPS which they had been promoting for a
decade. At least they did here in Denmark.

A lot of the way RAC works is of course just like OPS worked (and works in many
installations still).

Of course RAC is smarter. Way smarter. Much improved technology, etc.
But don’t forget that the engineers at Oracle build on solid, tried and tested code
which they then improved. For instance the GES and GCS layers in the code.

So why is RAC better than OPS?

For two main reasons:
First, pinging via disk limits scalability, so pinging via memory channels will improve
scalability because it’s faster.

How much faster? That’s a very, very good question. Oracle needs to do a lot of
checking, latching, etc. in order to ensure coherency in many ways. For the best
answer available at the moment, please see Cary Millsap’s article on
www.hotsos.com on why one should focus on logical IO’s instead of physical IO’s.
It would appear that logical IO’s are in the vicinity of 100 times faster than physical
ones.

This is in stark contrast to pure memory IO done by operating systems – they are
perhaps up to 10000 times faster than a disk IO. But Oracle has to do lots of things
(for which we love it), and that has an impact. This necessary overhead is of course
higher with RAC since more checking still has to be performed.

Second, clever tricks have been put into the code in order to make all sorts of
coordination tasks between instances faster, easier – and sometimes even avoidable.
The best tuning is as always not to do it at all. If Oracle doesn’t have to send a copy
of a buffer across to another instance it will try not to.

Does that mean that RAC will give you a better life? Yes and No. Or as any good
consultant will say: “It depends”.

Here are the things to consider before you go RAC’ing all over the world: Price,
availability, scalability, manageability, skills required and troubleshooting.

Price

This section talks about Oracle list prices. Discounts may vary.
Oracle Enterprise Edition costs US$40.000,- per cpu or US$800,- per named user plus
(NUP), as it’s called now. RAC costs 50% on top of that, which means US$60.000,-
per cpu or US$1.200,- per NUP.

As I write this, I’m aware that RAC has been offered at a 50% discount, ie
US$10.000,- per cpu, on the American market since around January or February. But
it’s not something officially reflected in the global price list.

(By the way: The Partitioning option costs 25% on top of the cpu/NUP price. OLAP
and Data Mining are 50% each. Spatial, Advanced Security and Label Security are
25% each. Diagnostics Pack, Tuning Pack, Change Management Pack and
Management Pack for SAP R/3 are US$3.000,- and US$60,- per cpu/NUP.)

So let’s play around with Larry’s vision of cheap Intel-based Linux clusters. Let’s
buy those two cheap, 4-cpu Intel boxes and put them together in a cluster with
Oracle9i and RAC on top:

Price for the hardware: About US$15.000,- or so.
Price for the OS (Linux): About US$0.50,- or thereabouts (it depends!)
Price for Oracle w/ RAC: US$480.000,-

So that’s half a million to Oracle. Put another way: It’s 1 dollar to the box movers
for every 32 dollars Oracle gets.
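The arithmetic behind those figures, as a small sketch using only the list prices quoted in this section (the variable names are made up for the example):

```python
# List prices quoted in this section, in US dollars
EE_PER_CPU = 40_000     # Enterprise Edition, per cpu
RAC_UPLIFT_PCT = 50     # RAC costs 50% on top of that
NODES = 2               # two boxes in the cluster
CPUS_PER_NODE = 4       # 4-cpu Intel boxes
HARDWARE = 15_000       # rough price for the hardware

per_cpu_with_rac = EE_PER_CPU * (100 + RAC_UPLIFT_PCT) // 100
oracle_total = per_cpu_with_rac * NODES * CPUS_PER_NODE
ratio = oracle_total / HARDWARE

print(per_cpu_with_rac)  # 60000
print(oracle_total)      # 480000
print(ratio)             # 32.0
```

Change NODES or CPUS_PER_NODE and the licence bill scales linearly, while the hardware price barely moves.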

Psychologically it’s hard for the customers to understand that they have to buy
something that expensive to run on such cheap hardware. The gap is too big, and
Oracle will need to address it soon.

There’s nothing like RAC on the market, but that doesn’t mean you have to buy
RAC. I usually joke that it’s like buying a car for US$10.000,- that has all the
facilities you need from a good and stable car. Airbags and ABS brakes are
US$500.000,- extra, by the way. Well, airbags and ABS are wonderful to have and
they increase your safety. But it’s a lot of money compared to the basic car price.

There are other indirect costs associated with going RAC: You’ll need more skills in
your organisation, both with respect to RAC and clusters. If your organisation is not
familiar with clusters you’ll need to learn a lot, for instance.

You’ll also have to consider having a development environment (and maybe a test
environment) that consists of both a cluster and RAC. Sometimes Oracle will let you
run Oracle for free on those systems, sometimes not (it depends).

RAC is very cool technology. But it’s expensive.

Availability, Part 1: 99.x% > 98%

I think there are two ways of looking at availability (but one day I will as usual be
proved wrong, of course).

One way of looking at availability is this: If you have a standalone Unix box it will
usually give you 99.9% availability over a year (some say 99.5, some say 99.9). It just
runs. And so does Oracle usually. If you have a two-node Unix cluster the availability
over a year drops to 98%.

Yep, quite surprising, but the two main reasons are that the increased complexity
(extra layers of code, extra hardware, etc.) introduced with clusters and RAC will
cause some additional downtime – and that it just takes longer to boot a cluster,
start up RAC, and perform some other management tasks.

And if you believe what Gartner Group, Oracle and others are saying, namely that
70% or more of downtime is caused by human errors and lack of knowledge… well,
what will happen when you introduce more complex hardware and more complex
software?

Another way of looking at availability is this: Your standalone box is available 99.9%
of the time. That 0.1% is what the other nodes in a cluster are for.
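To put the percentages in this section into hours, here is a minimal sketch. It assumes a 365-day year and treats availability as a plain fraction of wall-clock time; the function name downtime_hours is made up for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year


def downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year for a given availability percentage."""
    return HOURS_PER_YEAR * (100 - availability_pct) / 100


for pct in (99.9, 99.5, 98.0):
    print(f"{pct}% available -> {downtime_hours(pct):.1f} hours down per year")
```

By this arithmetic 99.9% is a bit under nine hours a year, 99.5% is just under two days, and 98% is more than a week.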

I have worked with clusters (VMS, Unix and now Windows) since around 1988 or
1989 or whenever 6.0.35 came out, and clusters are just harder to set up, manage and
run. If you need to run clusters in your business (for whatever valid business reasons)
you will also need extra personnel, extra consultants and extra skills in your
organisation.

As a technical director in a small company that sells rather specialized consultants,
I’m of course delighted when the complexity of a customer’s setup increases. It
means they’ll call us (or Oracle) sooner or later. Since we live off catastrophes,
problems and troubles, I’m looking at a bright future, I’m sure.

What are the alternatives? Various standby database options. That can be either the
standard standby database introduced in 7.3, or Data Guard, or some 3rd party tool like
Shareplex from Quest (I have absolutely no knowledge of this specific product – I just
know it exists, so don’t buy it without listening to people more clever than me).

Note that Oracle’s pricing policy changed recently regarding standby options (as
pointed out by one alert contributor on the Oracle-L list) so suddenly you now have to
pay full license for the standby nodes if you use them more than 10 days a year. And
always full price for the Data Guard nodes.

You could of course also create something fancy and creative yourself. We used to do
standby databases back in version 6 by applying archive logs manually on another
database in constant recovery mode. Lots of issues, of course. But it was done. Or you
could use LogMiner to extract DML from the archive logs and apply it on a
standby database. Or you could have system triggers that caught all DDL and DML
on a system and put them in load files that were then loaded in real time, near real
time or much later on another system.

Those kinds of alternatives will need a little work, but they have one thing in
common: You could even do it with Oracle Standard Edition, which means that the
price drops from US$40.000,- per cpu to US$15.000,-.

Availability, Part 2: 25/8/370?

Recently, a client of mine was down for close to five hours because they needed to
upgrade their RAC from 9.2.0.1.0 to 9.2.0.3.0. They were well prepared and
everything went as planned. It just took close to five hours. Here are the steps they
went through:

0. Shutdown of Weblogic
1. Shutdown of both instances
2. Put SW on
3. Patch one node, which automatically patches the other node
4. Start one instance in non-clustered mode (parameter cluster_database = false)
5. startup migrate (ie underscore parameters are set)
6. run catpatch.sql
7. shutdown
8. startup

I’m sure it could have been done faster, but all this time the server/disk system was
busy doing stuff it’s supposed to do. There was no “waste” or “errors” during the
upgrade. And it was done according to the Oracle documentation (which worked
completely as described). It just took five hours.

Many people I talk to are surprised by this. But when you think about it, you don’t
patch nodes or instances. You patch databases. Oracle has one database in a RAC
installation.

So there really is no such thing as 24/7/365 (or 25/8/370 which is about as realistic in
the real world). There are HA options to the database plus scheduled downtimes for
maintenance.

Oracle doesn’t have rolling upgrades. Oracle doesn’t have “online patching”. So
your database needs to be down while you upgrade or patch it.

Could you duplicate your database, eg with Oracle Data Guard, so your users could be
running on one database while the other is being patched? I’m sure it’s possible, but
I can’t see how since Data Guard requires you to be on the exact same patch level on
both Oracle and the OS. So it would appear that you need to shut down and patch both
databases at the same time.

If it’s supported to let Data Guard run while upgrading the primary database to a
higher version or patch level, I’d be interested in the details.

Did you notice what was missing from the list of actions above? The client didn’t
take a backup before upgrading. For very good reasons Oracle recommends that you
perform a full and valid backup before applying patches or upgrading your system.
This client didn’t, but you should. Oracle might even recommend (again, for very
good and valid reasons) that you take a new backup when you’re done with your
upgrade/patch actions.

So there’s RAC plus scheduled downtimes. But there are also emergency patches that
need to be applied fast. This could be due to an error encountered in the environment or
it could be a security patch. The time needed to apply emergency patches is hard to
plan.

With RAC you get duplicate nodes, duplicate instances and one database. That
database can be hit by dictionary corruptions (I’m sure we’ve all seen one) or it can
be hit by the need for patching and upgrading. That’s downtime for your whole RAC
system.

Scalability

Scalability is of course much better with RAC than with OPS, and you don’t need as
many fancy tricks in order to make it scale well.

But. If you remove a bottleneck in any system (IT or other) a new bottleneck will now
be present. It might be smaller than the old bottleneck (hopefully and usually, but not
always), but it’s still a bottleneck.

With RAC pinging is done using CPU resources. Yes, that’s much faster than disk
resources. But what if you are strapped for CPU in your system and RAC therefore
cannot get enough CPU for the pinging?

Mario Broodbakker from Digital/Compaq/HP in Holland has done some interesting
benchmarks on RAC that prove two things:

It’s important to have enough CPU left for the RAC pinging activity.

And you can still get into situations where traditional OPS workarounds are needed
(data partitioning, etc.) in order to achieve maximum performance. Even the
wonderfully complex and mythic GC_FILES_TO_LOCKS parameter can be useful at
times. It has deliberately been removed from the documentation because it was seen
by Oracle Marketing to send the wrong message.

So you say: Of course you should have the necessary CPU available for the RAC
pinging activities – hey, you should always size your system professionally. Yes. But
what if you suddenly have a bunch of batch jobs or batch-like processes fighting over
the available CPU resources (PX, DBMS_JOB, backup, file copying, whatever)?
Yes, you can plan for many situations, but sooner or later your system will be in a
situation where the system is running at 100% CPU, and that’s when you’ll see some
really bad performance with RAC.
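A rough way to picture why a saturated CPU hurts so much is the textbook single-server (M/M/1) queueing formula, R = S / (1 - U), where S is the service time and U the utilisation. It is a generic model assumed here purely for illustration. It is not taken from Mario’s benchmarks and says nothing about how Oracle actually schedules its work, and the 1 ms of CPU per ping is a hypothetical figure.

```python
def response_time(service_ms: float, utilization: float) -> float:
    """Mean response time of a single M/M/1 server: R = S / (1 - U)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_ms / (1 - utilization)


# Hypothetical: each ping needs 1 ms of CPU work
for u in (0.50, 0.80, 0.90, 0.99):
    print(f"CPU {u:.0%} busy -> about {response_time(1.0, u):.0f} ms per ping")
```

In this model the same 1 ms of work takes about 2 ms at 50% busy and about 100 ms at 99% busy, and the formula has no answer at all for 100%.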

If you’re interested in Mario’s whitepaper about his RAC testing let me know, and
I’ll be happy to send it to you. RAC Development has already addressed several of
the issues he pointed out, or is planning to. Yet the lack of required CPU
resources is not something Oracle or RAC currently can do anything about.

Manageability

OPS has actually never been that bad to manage. Assign an instance number to each
instance, start up and shut down the instances in the same order every time, create
simple scripts for these things, and you’re pretty much rolling.

It’s now possible to do very clever things with groups of instances, and OEM has
been greatly enhanced to handle RAC – but most customers will still run a two-node
cluster with two Oracle instances on it and can do fine with the good, old features
from the OPS days.

But it still requires more skills and more time to manage RAC than not. Added
complexity means additional skills and additional time. You can of course define your
way out of it by calling it “planned downtime” or “maintenance” or “service time”
instead of plain downtime.

The end goal, though, is often to have the database (or rather the applications that
depend on the database) available most of the time.

Skills Required

I have already touched on this in several places in the paper, so let’s just repeat here
that it’s not only RAC skills that are needed, but also (and probably most of all)
cluster skills if your organisation is going RAC.

There’s one external Oracle RAC class (three days) and one so-called DSI (Data
Server Internals) class for internal Oracle consumption available out there.
There are also RAC classes available from various external companies.

And there are lots of other people out there that know about OPS and RAC. Listen
carefully to the bitter, twisted old men who’ve worked with OPS. When RAC is
pushed to the limit, you could still need to do the same things that were required with
OPS.

Troubleshooting

Ah yes, troubleshooting. I’ve seen many clusters that just froze for no apparent
reason in my time. It’s always possible to make the OS or Cluster software dump a
trace/log file when it happens.

The resulting trace/log file from the cluster will normally be the size of Texas, and
only one or two people in the entire vendor organisation can truly understand it,
you will be told.

Then the files (often with sizes measured in GB) are shipped to the vendor and some
months later they will report back that it wasn’t possible to pinpoint the exact reason
for the complete cluster freeze or crash, but that this parameter was probably a bit low
and that parameter was probably a bit high.

That’s what always happens. I have never – really: never – seen a vendor who could
correctly diagnose and explain a hanging cluster or a cluster that kept crashing.

As to Oracle troubleshooting I’m not so worried. Oracle will either have a
performance problem, which is easy to diagnose using the Wait Interface, or you’ll
get ORA-600 errors that are fairly easy to diagnose, although you’ll need to spend the
required 42 hours logging and maintaining an iTAR or SR or whatever the name is
these days.

In other words: Finding out what’s wrong (if anything) in Oracle is much easier than
finding out what’s wrong with a cluster.

Conclusion

If you have a system that needs to be up and running a few seconds after a crash, you
probably need RAC.

If you cannot buy a big enough system to deliver the CPU power and/or memory you
crave, you probably need RAC.

If you need to cover your behind politically in your organisation, you can choose to
buy clusters, Oracle, RAC and what have you, and then you can safely say: “We’ve
bought the most expensive equipment known to man. It cannot possibly be our fault if
something goes wrong or the system goes down”.

Otherwise, you probably don’t need RAC. Alternatives will usually be cheaper,
easier to manage and quite sufficient.

Now please prove me wrong.

Mogens Nørgaard (mno@miracleas.dk) was with Oracle Support in Denmark for 10
years (three as an RDBMS analyst, four as head of RDBMS Support and three as head
of Premium Services). He is co-founder and technical director of Miracle A/S
(www.miracleas.dk), which provides consulting, support and training on Oracle and
SQL Server, in Maaloev, Denmark.

First claim to fame: First manager within Oracle to demand that his team (about 40
people in Premium) used the YAPP performance diagnostics method created by Anjo
Kolk.

Second claim to fame: The OakTable Network (www.oaktable.net) was named after
his dining table where some of the better Oracle scientists will gather a couple of
times each year.

Mogens and his co-director Lasse (lch@miracleas.dk) will use the profits from
Miracle A/S to start up a micro brewery that can stop Carlsberg from taking over the
world. He believes Carlsberg is the Danish equivalent to the American Budweiser. If
nothing else is available, though, he’ll drink both.