You Probably Don’t Need RAC
If you’ve been holidaying in Siberia or similar places for about a year, you have
probably not talked to an Oracle Sales rep yet about RAC. But you will no doubt find
that there’s a voice mail waiting for you when you turn your mobile phone on again
after returning home from the vacation.
RAC is being pushed very hard by Oracle. You will get high availability, incredible
scalability, a much improved personal life, the ability to partition workloads, buy
cheap Linux servers and what have you.
It sounds pretty good. How can anyone say no to that kind of offer?
RAC is not OPS
No, RAC is not OPS, but it looks a lot like it. Oracle Marketing tries really hard to
distance RAC from OPS, and I don’t understand why. I mean: If the basic code has
been around for many years it means it’s stable, debugged and tried. If it’s all new,
who dares install it in a critical system? Fortunately, it’s not true that RAC is not
OPS. The basic parts of the code – GES and GCS – are pretty much the same as
they’ve always been.
GES stands for Global Enqueue Service and GCS stands for Global Cache Service.
More about that later.
A little history: OPS was created for version 6 of Oracle. The only clusters around
then were VAX/VMS clusters, but unfortunately the VAX/VMS Distributed Lock
Manager (DLM) was created originally to handle the coordination of relatively few
resources, such as files and devices, not 1000s of buffers in a buffer cache (Oracle or
others). It proved way too slow for OPS.
So Oracle had to create their own DLM for VAX/VMS, which they did. It took a
while, though, so it wasn’t until 6.0.35 (which was called “6.2” to celebrate the OPS
feature) that it finally came out.
I remember taking one of the first OPS classes (in Chicago) shortly after joining
Oracle and thinking that Oracle Development had gone mad – creating their own
DLM instead of letting the Digital guys do it (they had, after all, created the clusters
and the whole concept).
I was wrong. Oracle’s own DLM worked very well, and Digital adopted Oracle’s
technology and ideas in their own DLM, so when version 7 of Oracle came out, it was
again Digital’s native DLM that was used.
The UNIX vendors then started doing clusters (well, NCR had done it for a while).
And they mostly got the DLM technology from Oracle.
Microsoft certainly didn’t get their DLM technology from Oracle when they started
making Windows clusters. Oh no. They got it from Digital.
Fun Fact: The GES/GCS code was already in Oracle version 5. Bjørn Engsig, who has
worked with Oracle source code since 1983, found out about this and implemented his
own, very crude, lock manager on a Danish Unix system running version 5. He got it
to work, but only for demonstration purposes – his home-written lock manager
basically used database-level locking, which is not really useful.
PING
Oracle had to make sure that a buffer wasn’t modified by two different processes at
the same time (otherwise, which copy should be written to disk later?). So instead of “just”
serialising the access to one copy of the block in one buffer (which can be achieved
with the combination of hash buckets, chains and latches that we know so well),
Oracle had to coordinate several copies in several buffer caches across nodes.
This was achieved using a new kind of locking (called Parallel Cache Management or
PCM locks) which was coordinated across nodes/instances using the DLM and
various background processes.
When there was a “conflict”, i.e. the same block/buffer was requested by more than
one instance, the “exclusive” lock held by the first “holder” had to be down-graded
to a “shared” lock held by all “holders”. This down-grade/sharing could only be
done by first making sure that all holders were seeing the same image of the
block/buffer.
So the copy of the block that was in the buffer cache of the first holder was written to
disk and then that copy of the block was read into the other buffer caches. The term
“ping” was introduced to describe other instances requesting a buffer held
exclusively by one instance.
Pinging via disk is slow. If you had an index on a column that kept growing on the
right-hand side, the right-most leaf block could get pinged back and forth non-stop
between instances. Pinging via disk could kill your system’s performance.
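To make the cost concrete, here is a minimal Python sketch of the disk-ping protocol described above. It is a toy model, not Oracle code: the function name `count_disk_ios` is made up, and it assumes that every change of holder costs exactly one disk write (by the old holder) plus one disk read (by the requester).

```python
def count_disk_ios(requests):
    """Count disk I/Os caused by pinging one block between instances.

    `requests` is the sequence of instance names that want the block
    exclusively, in arrival order. Toy assumption: whenever the block
    moves to a different instance, the current holder writes it to disk
    (1 write) and the requester reads it back (1 read).
    """
    holder = None
    writes = reads = 0
    for instance in requests:
        if holder is None:
            reads += 1            # first access: read the block from disk
        elif instance != holder:
            writes += 1           # holder flushes its copy to disk
            reads += 1            # requester reads the fresh copy
        holder = instance
    return writes, reads


# Two instances alternately inserting into the same right-most leaf block.
alternating = ["A", "B"] * 500
# The same 1000 inserts, but partitioned so each instance works in one batch.
partitioned = ["A"] * 500 + ["B"] * 500

print(count_disk_ios(alternating))   # (999, 1000)
print(count_disk_ios(partitioned))   # (1, 2)
```

In this simplified model the alternating workload pays for a disk round trip on practically every insert, while the partitioned workload pays for it once, which is why data partitioning was the classic OPS workaround.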
The workarounds included data partitioning, temporary tablespaces (introduced in
7.3), where each instance had its own latch instead of a shared Dictionary lock (the ST
lock – remember the ORA-1575?), reverse-key indexes (7.3), which meant that it was
random which leaf block you would hit even if you had monotonically increasing
key values, and other tricks.
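Why reversing the key helps can be shown with a small Python sketch. It is an illustration only: keys are stored here as 4-byte big-endian integers (Oracle’s real NUMBER format is different), and the leading byte of the stored value stands in for “which leaf block you hit”. The function name `leaf_bucket` is made up.

```python
from collections import Counter


def leaf_bucket(key, reverse=False, width=4):
    """Return a toy 'leaf block' id for an integer key.

    The key is stored as `width` big-endian bytes; with reverse=True the
    bytes are reversed first, as a reverse-key index would do. The leading
    byte of the stored value stands in for the leaf block (an assumption
    made purely for illustration).
    """
    stored = key.to_bytes(width, "big")
    if reverse:
        stored = stored[::-1]
    return stored[0]


keys = range(100_000, 100_256)    # 256 monotonically increasing keys

normal = Counter(leaf_bucket(k) for k in keys)
reversed_ = Counter(leaf_bucket(k, reverse=True) for k in keys)

print(len(normal))      # 1   -> every insert hits the same block
print(len(reversed_))   # 256 -> inserts are spread over 256 blocks
```

With the normal byte order all new keys land next to each other on the right-hand side; with the bytes reversed, consecutive keys end up far apart, so the instances stop fighting over one block.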
Oracle 8i: Cache Fusion introduced
Oracle 8i (that’s 8.1 where the dot is moved on top of the 1) introduced a new way of
pinging via the HSI (High-Speed Interconnect) or similar mechanism, i.e. a kind of
memory-to-memory transport instead of memory-to-disk-to-memory. It’s not easy to
do, and it was initially only done for CR (consistent read) blocks/buffers.
It worked for some and didn’t work for others. On several OPS installations here in
Denmark they had to deliberately turn it off in 8.1.6 and 8.1.7.
By the way: Oracle had introduced their own, generic Lock Manager (LM)
mechanism in Oracle 8.0, signalling that they would soon be pretty independent of the
DLM code from the various vendors.
You could say that the LM was the equivalent of the Oracle source code being OS
independent and then having a small layer in the code known as the OSD (Operating
System Dependent). With the introduction of the integrated LM Oracle only had to
manage a small OS-dependent layer for each port – the rest was generic code. Respect
again to the engineers at Oracle Development.
Oracle9i: Cache Fusion all over – and a new name
With Oracle9i (called 9.0 and 9.2 just to confuse the enemy) all pinging is done via
the memory channel or high-speed interconnect. That’s it.
But just as it was time to call the version 6.2 instead of 6.0.35 back then, it was time
to call it RAC instead of OPS.
Oracle sales people actually started dissing OPS, which they had been promoting for a
decade. At least they did here in Denmark.
A lot of the way RAC works is of course just like OPS worked (and works in many
installations still).
Of course RAC is smarter. Way smarter. Much improved technology, etc.
But don’t forget that the engineers at Oracle build on solid, tried and tested code
which they then improved. For instance the GES and GCS layers in the code.
So why is RAC better than OPS?
For two main reasons:
First, pinging via disk limits scalability, so pinging via memory channels will improve
scalability because it’s faster.
How much faster? That’s a very, very good question. Oracle needs to do a lot of
checking, latching, etc. in order to ensure coherency in many ways. For the best
answer available at the moment, please see Cary Millsap’s article on
www.hotsos.com on why one should focus on logical IOs instead of physical IOs.
It would appear that logical IOs are in the vicinity of 100 times faster than physical
ones.
This is in stark contrast to pure memory IOs done by operating systems – they are
perhaps up to 10000 times faster than a disk IO. But Oracle has to do lots of things
(for which we love it), and that has an impact. This necessary overhead is of course
higher with RAC since more checking still has to be performed.
Second, clever tricks have been put into the code in order to make all sorts of
coordination tasks between instances faster, easier – and sometimes even avoidable.
The best tuning is as always not to do it at all. If Oracle doesn’t have to send a copy
of a buffer across to another instance, it will try not to.
Does that mean that RAC will give you a better life? Yes and No. Or as any good
consultant will say: “It depends”.
Here are the things to consider before you go RAC’ing all over the world: Price,
availability, scalability, manageability, skills required and troubleshooting.
Price
This section talks about Oracle list prices. Discounts may vary.
Oracle Enterprise Edition costs US$40.000,- per cpu or US$800,- per named user plus
(NUP), as it’s called now. RAC costs 50% on top of that, which means US$60.000,-
per cpu or US$1.200,- per NUP.
As I write this, I’m aware that RAC has been offered at a 50% discount, i.e.
US$10.000,-, on the American market since around January or February. But it’s not
something officially reflected in the global price list.
(By the way: The Partitioning option costs 25% on top of the cpu/NUP price. OLAP
and Data Mining are 50% each. Spatial, Advanced Security and Label Security are
25% each. Diagnostics Pack, Tuning Pack, Change Management Pack and
Management Pack for SAP R/3 are US$3.000,- per cpu or US$60,- per NUP.)
So let’s play around with Larry’s vision of cheap Intel-based Linux clusters. Let’s
buy those two cheap, 4-cpu Intel boxes and put them together in a cluster with
Oracle9i and RAC on top:
Price for the hardware: About US$15.000,- or so.
Price for the OS (Linux): About US$0.50,- or thereabout (it depends!)
Price for Oracle w/ RAC: US$480.000,-
So that’s half a million to Oracle. Put another way: It’s 1 dollar to the box movers
for every 32 dollars Oracle gets.
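The arithmetic behind those numbers can be checked with a few lines of Python. This is only a sketch using the list prices quoted in this section; the constant and function names are made up for the illustration.

```python
EE_PER_CPU = 40_000        # Enterprise Edition list price per cpu (US$)
RAC_UPLIFT = 0.50          # RAC costs 50% on top of the EE price
HARDWARE = 15_000          # two 4-cpu Intel boxes, approximate (US$)


def rac_licence_cost(nodes, cpus_per_node):
    """List price for Enterprise Edition plus RAC on every cpu in the cluster."""
    per_cpu = EE_PER_CPU * (1 + RAC_UPLIFT)
    return nodes * cpus_per_node * per_cpu


oracle = rac_licence_cost(nodes=2, cpus_per_node=4)
print(oracle)              # 480000.0
print(oracle / HARDWARE)   # 32.0
```

Eight cpus at US$60.000,- each give the US$480.000,- above, and dividing by the hardware price gives the 32-to-1 ratio.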
Psychologically it’s hard for the customers to understand that they have to buy
something that expensive to run on such cheap hardware. The gap is too big, and
Oracle will need to address it soon.
There’s nothing like RAC on the market, but that doesn’t mean you have to buy
RAC. I usually joke that it’s like buying a car for US$10.000,- that has all the
facilities you need from a good and stable car. Airbags and ABS brakes are
US$500.000,- extra, by the way. Well, airbags and ABS are wonderful to have and
they increase your security. But it’s a lot of money compared to the basic car price.
There are other indirect costs associated with going RAC: You’ll need more skills in
your organisation, both with respect to RAC and clusters. If your organisation is not
familiar with clusters you’ll need to learn a lot, for instance.
You’ll also have to consider having a development environment (and maybe a test
environment) that consists of both a cluster and RAC. Sometimes Oracle will let you
run Oracle for free on those systems, sometimes not (it depends).
RAC is very cool technology. But it’s expensive.
Availability, Part 1: 99.x% >98%
I think there are two ways of looking at availability (but one day I will as usual be
proved wrong, of course).
One way of looking at availability is this: If you have a standalone Unix box it will
usually give you 99.9% availability over a year (some say 99.5, some say 99.9). It just
runs. And so does Oracle usually. If you have a two-node Unix cluster the availability
over a year drops to 98%.
Yep, quite surprising, but the two main reasons are that the increased complexity
(extra layers of code, extra hardware, etc.) introduced with clusters and RAC will
cause some additional downtime – and that it just takes longer to boot a cluster,
start up RAC, and perform some other management tasks.
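To put those percentages in perspective, they can be converted into hours of downtime per year. A small Python sketch (the function name is made up for illustration, and a 365-day year is assumed):

```python
HOURS_PER_YEAR = 365 * 24   # 8760, ignoring leap years


def downtime_hours(availability_pct):
    """Expected hours of downtime per year for a given availability in %."""
    return (100 - availability_pct) / 100 * HOURS_PER_YEAR


for pct in (99.9, 99.5, 98.0):
    print(f"{pct}% available -> about {downtime_hours(pct):.1f} hours down per year")
```

So 99.9% means less than nine hours of downtime in a year, while 98% means roughly 175 hours, or a bit more than a week.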
And if you believe what Gartner Group, Oracle and others are saying, namely that
70% or more of downtime is caused by human errors and lack of knowledge… well,
what will happen when you introduce more complex hardware and more complex
software?
Another way of looking at availability is this: Your standalone box is available 99.9%
of the time. That 0.1% is what the other nodes in a cluster are for.
I have worked with clusters (VMS, Unix and now Windows) since around 1988 or
1989 or whenever 6.0.35 came out, and clusters are just harder to set up, manage and
run. If you need to run clusters in your business (for whatever valid business reasons)
you also will need extra personnel, extra consultants and extra skills in your
organisation.
As a technical director in a small company that sells rather specialized consultants,
I’m of course delighted when the complexity of a customer’s setup increases. It
means they’ll call us (or Oracle) sooner or later. Since we live off catastrophes,
problems and troubles, I’m looking into a bright future, I’m sure.
What are the alternatives? Various standby database options. That can be either the
standard standby database introduced in 7.3, or Data Guard, or some 3rd party tool like
Shareplex from Quest (I have absolutely no knowledge of this specific product – I just
know it exists, so don’t buy it without listening to people more clever than me).
Note that Oracle’s pricing policy changed recently regarding standby options (as
pointed out by one alert contributor on the Oracle-L list) so suddenly you now have to
pay full license for the standby nodes if you use them more than 10 days a year. And
always full price for the Data Guard nodes.
You could of course also create something fancy and creative yourself. We used to do
standby databases back in version 6 by applying archive logs manually on another
database in constant recovery mode. Lots of issues, of course. But it was done. Or you
could use LogMiner to extract DML from the archive logs and apply it on a
standby database. Or you could have system triggers that caught all DDL and DML
on a system and put them in load files that were then loaded in real time, near real
time or much later on another system.
Those kinds of alternatives will need a little work, but they have one thing in
common: You could even do it with Oracle Standard Edition, which means that the
price drops from US$40.000,- per cpu to US$15.000,-.
Availability, Part 2: 25/8/370?
Recently, a client of mine was down for close to five hours because they needed to
upgrade their RAC from 9.2.0.1.0 to 9.2.0.3.0. They were well prepared and
everything went as planned. It just took close to five hours. Here are the steps they
went through:
0. Shutdown of Weblogic
1. Shutdown of both instances
2. Put SW on
3. Patch one node, which automatically patches the other node
4. Start one instance in non-clustered mode (parameter cluster_database = false)
5. startup migrate (i.e. underscore parameters are set)
6. run catpatch.sql
7. shutdown
8. startup
I’m sure it could have been done faster, but all this time the server/disk system was
busy doing stuff it’s supposed to do. There was no “waste” or “errors” during the
upgrade. And it was done according to the Oracle documentation (which worked
completely as described). It just took five hours.
Many people I talk to are surprised by this. But when you think about it, you don’t
patch nodes or instances. You patch databases. Oracle has one database in a RAC
installation.
So there really is no such thing as 24/7/365 (or 25/8/370 which is about as realistic in
the real world). There are HA options to the database plus scheduled downtimes for
maintenance.
Oracle doesn’t have rolling upgrades. Oracle doesn’t have “online patching”. So
your database needs to be down while you upgrade or patch it.
Could you duplicate your database, e.g. with Oracle Data Guard, so your users could be
running on one database while the other is being patched? I’m sure it’s possible, but
I can’t see how since Data Guard requires you to be on the exact same patch level on
both Oracle and the OS. So it would appear that you need to shut down and patch both
databases at the same time.
If it’s supported to let Data Guard run while upgrading the primary database to a
higher version or patch level, I’d be interested in the details.
Did you notice what was missing from the list of actions above? The client didn’t
take a backup before upgrading. For very good reasons Oracle recommends that you
perform a full and valid backup before applying patches or upgrading your system.
This client didn’t, but you should. Oracle might even recommend (again, for very
good and valid reasons) that you take a new backup when you’re done with your
upgrade/patch actions.
So there’s RAC plus scheduled downtimes. But there are also emergency patches that
need to be applied fast. This could be due to an error encountered in the environment or
it could be a security patch. The time needed to apply emergency patches is hard to
plan.
With RAC you get duplicate nodes, duplicate instances and one database. That
database can be hit by dictionary corruptions (I’m sure we’ve all seen one) or it can
be hit by the need for patching and upgrading. That’s downtime for your whole RAC
system.
Scalability
Scalability is of course much better with RAC than with OPS, and you don’t need as
many fancy tricks in order to make it scale well.
But. If you remove a bottleneck in any system (IT or other) a new bottleneck will now
be present. It might be smaller than the old bottleneck (hopefully and usually, but not
always), but it’s still a bottleneck.
With RAC pinging is done using CPU resources. Yes, that’s much faster than disk
resources. But what if you are strapped for CPU in your system and RAC therefore
cannot get enough CPU for the pinging?
Mario Broodbakker from Digital/Compaq/HP in Holland has done some interesting
benchmarks on RAC that prove two things:
It’s important to have enough CPU left for the RAC pinging activity.
And you can still get into situations where traditional OPS workarounds are needed
(data partitioning, etc.) in order to achieve maximum performance. Even the
wonderfully complex and mythic GC_FILES_TO_LOCKS parameter can be useful at
times. It has deliberately been removed from the documentation because it was seen
by Oracle Marketing to send the wrong message.
So you say: Of course you should have the necessary CPU available for the RAC
pinging activities – hey, you should always size your system professionally. Yes. But
what if you suddenly have a bunch of batch jobs or batch-like processes fighting over
the available CPU resources (PX, DBMS_JOB, backup, file copying, whatever)?
Yes, you can plan for many situations, but sooner or later your system will be in a
situation where the system is running at 100% CPU, and that’s when you’ll see some
really bad performance with RAC.
If you’re interested in Mario’s whitepaper about his RAC testing let me know, and
I’ll be happy to send it to you. RAC Development has already addressed several of the
issues he has pointed out, or is planning to address them. Yet the lack of required CPU
resources is not something Oracle or RAC currently can do anything about.
Manageability
OPS has actually never been that bad to manage. Assign an instance number to each
instance, start up and shut down the instances in the same order every time, create
simple scripts for these things, and you’re pretty much rolling.
It’s now possible to do very clever things with groups of instances, and OEM has
been greatly enhanced to handle RAC – but most customers will still run a two-node
cluster with two Oracle instances on it and can do fine with the good, old features
from the OPS days.
But it still requires more skills and more time to manage RAC than not. Added
complexity means additional skills and additional time. You can of course define your
way out of it by calling it “planned downtime” or “maintenance” or “service time”
instead of plain downtime.
The end goal, though, is often to have the database (or rather the applications that
depend on the database) available most of the time.
Skills Required
I have already touched on this in several places in the paper, so let’s just repeat here
that it’s not only RAC skills that are needed, but also (and probably mostly) cluster
skills if your organisation is going RAC.
There’s one external Oracle RAC class (three days) and one so-called DSI (Data
Server Internals) class for internal Oracle consumption available out there.
There are also RAC classes available from various external companies.
And there are lots of other people out there that know about OPS and RAC. Listen
carefully to the bitter, twisted old men who’ve worked with OPS. When RAC is
pushed to the limit, you could still need to do the same things that were required with
OPS.
Troubleshooting
Ah yes, troubleshooting. I’ve seen many clusters that just froze for no apparent
reason in my time. It’s always possible to make the OS or Cluster software dump a
trace/log file when it happens.
The resulting trace/log file from the cluster will normally be the size of Texas, and
only one or two people in the entire vendor organisation can truly understand it,
you will be told.
Then the files (often with sizes measured in GB) are shipped to the vendor and some
months later they will report back that it wasn’t possible to pinpoint the exact reason
for the complete cluster freeze or crash, but that this parameter was probably a bit low
and this parameter was probably a bit high.
That’s what always happens. I have never – really: never – seen a vendor who could
correctly diagnose and explain a hanging cluster or a cluster that kept crashing.
As to Oracle troubleshooting, I’m not so worried. Oracle will either have a
performance problem, which is easy to diagnose using the Wait Interface, or you’ll
get ORA-600 errors that are fairly easy to diagnose, although you’ll need to spend the
required 42 hours logging and maintaining an iTAR or SR or whatever the name is
these days.
In other words: Finding out what’s wrong (if anything) in Oracle is much easier than
finding out what’s wrong with a cluster.
Conclusion
If you have a system that needs to be up and running a few seconds after a crash, you
probably need RAC.
If you cannot buy a big enough system to deliver the CPU power and/or memory you
crave, you probably need RAC.
If you need to cover your behind politically in your organisation, you can choose to
buy clusters, Oracle, RAC and what have you, and then you can safely say: “We’ve
bought the most expensive equipment known to man. It cannot possibly be our fault if
something goes wrong or the system goes down”.
Otherwise, you probably don’t need RAC. Alternatives will usually be cheaper,
easier to manage and quite sufficient.
Now please prove me wrong.
Mogens Nørgaard (mno@miracleas.dk) was with Oracle Support in Denmark for 10
years (three as an RDBMS analyst, four as head of RDBMS Support and three as head
of Premium Services). He is co-founder and technical director of Miracle A/S
(www.miracleas.dk), which provides consulting, support and training on Oracle and
SQL Server, in Maaloev, Denmark.
First claim to fame: First manager within Oracle to demand that his team (about 40
people in Premium) used the YAPP performance diagnostics method created by Anjo
Kolk.
Second claim to fame: The OakTable Network (www.oaktable.net) was named after
his dining table where some of the better Oracle scientists will gather a couple of
times each year.
Mogens and his co-director Lasse (lch@miracleas.dk) will use the profits from
Miracle A/S to start up a micro brewery that can stop Carlsberg from taking over the
world. He believes Carlsberg is the Danish equivalent of the American Budweiser. If
nothing else is available, though, he’ll drink both.