Archive for June, 2007

What Happens When You Leave an Oracle Database in Backup Mode

June 13th, 2007 Alex Gorbachev No comments

While I was at that fine conference in Scotland, one of our clients did some maintenance on their Windows server where several databases were running. We had just begun supporting this machine, and hadn’t had a chance to test a reboot. And for some reason, backup/recovery wasn’t — until recently — a DBA responsibility at that organization, so that wasn’t under our supervision.

So one fine evening, there was a scheduled maintenance, and one of the databases didn’t shut down cleanly (thanks to misconfigured Windows services, if I recall correctly). Consequently, the database crashed and later didn’t come back up. That’s a bit odd: crash recovery should have worked with no problems, but instead the database required media recovery.

My team-mate Neil tried recovery, and found that the database was requesting two-week-old archivelogs. Weird. We tried to restore from tape, but you know how it goes when it’s someone else who does the backups and knows how the tape manager is configured, and all those details. After a while it was clear that rushing didn’t make any sense in the middle of the night, and the storage people were not available until the morning.

When I looked at it in the morning, the error message rang a bell: “Datafile 1 needs media recovery” in combination with the request for very old archivelogs. My immediate guess: the database had been started with an old copy of the controlfile (I had seen that happen before, after someone messed around with relocations and screwed up init.ora).

On closer examination, we figured out that the controlfile SCN was actually current while the SCNs of the datafiles were way off. There were no copies of the datafiles on the server so it seemed like someone had restored the datafiles. Weird…

After more detailed investigation, Alex Fatkulin figured out that the database had been put into backup mode two weeks ago and, bingo, the datafile headers were frozen. (By the way, Alex has just joined Pythian and started in my team. A great addition I should say!)

An attempt to restore all archivelogs failed: a few gaps couldn’t be restored from tape. What a surprise! Anyway, to this day, we can’t fully explain why that happened, or what was going on with backups. But, at least the responsibility for backup/recovery is moving to the DBA team. Who would have thought of that? ;-)

The moral of the story: do not leave datafiles in backup mode. If you use hot backups outside of RMAN, such as snapshot technologies, take care to implement monitoring so that the database doesn’t stay in backup mode for long. We usually set up this check in our monitoring tool when backup mode is used.
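Here is a minimal sketch of what such a check could look like in Python. It assumes the monitoring tool has already run a query against V$BACKUP (where STATUS is 'ACTIVE' for a datafile in backup mode and TIME is when the backup began) and hands the rows over as tuples. The function name, the row layout, and the four-hour threshold are my own illustrative choices, not what our monitoring tool actually uses.

```python
from datetime import datetime, timedelta

# Query a monitoring job might run; V$BACKUP reports STATUS = 'ACTIVE'
# for every datafile that is currently in backup mode.
BACKUP_MODE_QUERY = """
SELECT file#, status, time
  FROM v$backup
 WHERE status = 'ACTIVE'
"""


def stale_backup_files(rows, now, max_age=timedelta(hours=4)):
    """Return (file_no, age) for datafiles in backup mode longer than max_age.

    rows    -- iterable of (file_no, status, begin_time) tuples (hypothetical
               layout matching the query above)
    now     -- current time as a datetime
    max_age -- alert threshold; four hours is an arbitrary assumption
    """
    stale = []
    for file_no, status, begin_time in rows:
        # Skip files that are not in backup mode or have no begin time.
        if status != "ACTIVE" or begin_time is None:
            continue
        age = now - begin_time
        if age > max_age:
            stale.append((file_no, age))
    return stale


if __name__ == "__main__":
    # Made-up sample rows: file 1 was left in backup mode two weeks ago.
    now = datetime(2007, 6, 13, 9, 0)
    sample = [
        (1, "ACTIVE", datetime(2007, 5, 30, 2, 0)),
        (2, "NOT ACTIVE", None),
        (3, "ACTIVE", datetime(2007, 6, 13, 8, 0)),
    ]
    for file_no, age in stale_backup_files(sample, now):
        print(f"ALERT: datafile {file_no} in backup mode for {age}")
```

The threshold should be comfortably longer than your longest legitimate hot backup; otherwise the check will page someone every night.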

Another moral: let everyone do his job. Database backup/recovery is part of the DBA’s responsibilities.

Another interesting story is how someone lost 5 databases, but that might be a good topic for another post.

Categories: Alex @ Pythian Tags:

Changing Hostnames in Oracle RAC

June 11th, 2007 Alex Gorbachev No comments

Sometimes there is a desperate need to change hostnames for one or all nodes of an Oracle RAC cluster. However, this operation is not officially supported. From Metalink Note 220970.1 RAC Frequently Asked Questions:

Can I change the public hostname in my Oracle Database 10g Cluster using Oracle Clusterware?

Hostname changes are not supported in Oracle Clusterware (CRS), unless you want to perform a deletenode followed by a new addnode operation.
The hostname is used to store among other things the flag files and CRS stack will not start if hostname is changed.

One way to do it is to remove a node from a cluster, change its hostname, and then add it back to the cluster as a new node. You will need to make sure that ORACLE_HOME is also added to this node as well as the database instance configuration.

If you are brave enough, there is another way to do this. It’s not described anywhere on Metalink, but there are no major hacks needed to implement it. The idea is to simply re-run the configuration of CRS (including re-formatting the OCR and voting disks) and re-create the CRS resources after that. Obviously, this is not an online operation, and the whole cluster is down for the duration of the rename.

I assume that we have an Oracle RAC cluster running, with database(s) running already, optionally including ASM instances.

(more…)

Categories: Alex @ Pythian Tags:

Oracle 10.2 RMAN Backup on NFS

June 9th, 2007 Alex Gorbachev 8 comments

You probably wouldn’t expect a technical Oracle post here. ;-) But I can’t call it an Oracle blog without relevant content, so here it goes. Let’s call it a late birthday present: my first post came on the 2nd of May, 2006, so the blog is now one year, one month and one week old.

Today, purely by accident, I came across Metalink Note 413098.1 Extremely Poor RMAN Backup Performance to NFS After Upgrade to 10.2 on Solaris. It seems that it also affects HP-UX at least.

What I love best about that note is the workaround for RAC environments:

Do not write backups to NFS at 10.2. Backup to tape using Oracle Secure Backup or an alternative Media Manager or backup to a local disk drive.

Oracle Secure Backup?!?! Does that mean it has fewer bugs? Nice ad.

In other words, if my backup infrastructure is based on NFS, I’m screwed. The advice basically turns into “do not back up your databases, it’s slow”. Where are the other workarounds, like staying on 10.1 or 9i? “Don’t use big databases” would be just as appropriate. :-)

If I have time, I’m going to test it on Linux, but if anyone is interested, benchmark backups from 10.1 and 10.2 on NFS and let us know. Better yet, run it through strace and check the system calls.
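For anyone who does try the strace route, here is a rough Python sketch for summarizing the write calls in a trace. It assumes the trace was captured with strace's -T option (which appends the time spent in each call in angle brackets), optionally with -f, and that the interesting calls are write and pwrite64. Which system calls RMAN actually issues on your platform is something to verify, not something I'm asserting. The function name and the sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   pwrite64(5, "..."..., 8192, 16384) = 8192 <0.000045>
# as printed by `strace -T`; an optional leading PID (from -f) is tolerated.
CALL_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"   # optional "[pid 1234] " or "1234 "
    r"(?P<name>\w+)\(.*\)"             # syscall name and its arguments
    r"\s+=\s+(?P<ret>-?\d+)"           # return value (bytes written, or -1)
    r".*<(?P<secs>\d+\.\d+)>\s*$"      # elapsed time added by -T
)


def summarize(lines, calls=("write", "pwrite64")):
    """Return {syscall: {"calls", "bytes", "seconds"}} for the given calls."""
    stats = defaultdict(lambda: {"calls": 0, "bytes": 0, "seconds": 0.0})
    for line in lines:
        m = CALL_RE.match(line)
        if not m or m.group("name") not in calls:
            continue
        entry = stats[m.group("name")]
        entry["calls"] += 1
        ret = int(m.group("ret"))
        if ret > 0:  # failed calls return -1 and wrote nothing
            entry["bytes"] += ret
        entry["seconds"] += float(m.group("secs"))
    return dict(stats)


if __name__ == "__main__":
    # Made-up sample lines in strace -T format.
    sample = [
        'pwrite64(5, "\\0\\0"..., 8192, 16384) = 8192 <0.000045>',
        'write(3, "abc", 3) = 3 <0.000010>',
        'read(4, "", 4096) = 0 <0.000005>',
    ]
    for name, s in summarize(sample).items():
        avg = s["bytes"] / s["calls"]
        print(f"{name}: {s['calls']} calls, avg {avg:.0f} bytes, {s['seconds']:.6f}s")
```

Comparing the average write size and total time between a 10.1 and a 10.2 run against the same NFS mount would be a reasonable first thing to look at.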

Categories: Oracle

Oracle Coherence vs. Oracle RAC or Coherence + MySQL = MySQL RAC?

June 8th, 2007 Alex Gorbachev No comments

I came across Oracle Coherence today. It seems to be a different approach to clustering than Oracle RAC. Here is the marketing quote from the Oracle website:

Oracle Coherence is a JCache-compliant in memory distributed data grid solution for clustered applications and application servers. Oracle Coherence makes sharing and managing data in a cluster as simple as on a single server. It accomplishes this by coordinating updates to the data using cluster-wide concurrency control, replicating and distributing data modifications across the cluster using the highest performing clustered protocol available, and delivering notifications of data modifications to any servers that request them.

Seems like this is a way to scale middle tiers that require shared data without actually using the central database for that. On the other hand, it looks like a clustering framework with rules defined by developers, as opposed to Oracle RAC, which is designed and built to be a black box delivering database services (which it is not; otherwise why would anyone talk about RAC readiness, workload partitioning, and so on?).

Perhaps, Oracle Coherence is the way to bridge shared nothing and shared everything architectures. What do you think?

How about applications that partition the data across small Oracle SE servers and merge it all using Oracle Coherence? Or, get this, what about Coherence with MySQL on the backend?

Has anyone (yeah, I’m asking developers reading the blog) played with it and knows how it feels? It’s available for Java and .NET so those of you who have some spare time on your hands - go ahead and try it. And don’t forget to share your experience with the rest of us! ;-)

Categories: Alex @ Pythian Tags:

MSDBF 07: Final Wrap Up

June 8th, 2007 Alex Gorbachev No comments

A few days ago, I did a short post about the start of the Miracle Scotland Database Forum 2007. I decided to wait until getting back home to complete a full-blown description of the event. I’m starting this blog on board Air Canada’s London-Ottawa flight. It’s late morning in Ottawa and I’m full of energy after a short nap. I should warn you, this will be quite lengthy, so draw a deep breath.

As you probably know, MSDBF 2007 took place in Edinburgh Castle. The castle is a beautiful place with a long history, and you probably need at least a couple of days to explore it all. The place is so inspiring, and you feel something very special while you are there; it seems to have preserved all the aromas and sounds of the times when the castle was built. Some days the sun fills the castle with light and the breeze whispers through the bottlenecked passages, and you can hear the music as if a bagpiper is playing gently round the corner. During rainy days it feels like a fortress, impregnable even against hundreds of years of strong winds and heavy rains. Here are some photos from the castle.

So what could be a better place to hold an educational event? The castle disposes your mind to absorbing something new, be it Oracle technology or the unforgettable aroma and taste of Glenfarclas 105, kindly provided by the Glenfarclas Distillery. Needless to say, it was the only conference where excellent single malt whisky was available next to the water for the duration of the whole event.

As you can probably already imagine, it was a special event. A few organizational overlaps didn’t spoil the friendly atmosphere and high quality of presentations. Mounting this database forum for the first time in Edinburgh required a lot of effort. MSDBF 07 was organized by Thomas Presslie, and I would like to thank him for his unlimited dedication, for sacrificing everything he could to get excellent speakers from all over the world, for making everyone comfortable, and for making the event an absolute success.

(more…)

Categories: Alex @ Pythian Tags: