Exadata Bundle Patch 5 Gotchas

A couple of months ago we became the proud parents of a bouncing baby Exadata V2 (1/4 rack). I had about two weeks to configure it and get up to speed on the high points before we were to begin work on our first proof of concept (POC) for a local client. I had just enough time to patch the storage cells up to version 11.2.1.2.6, configure storage, and create the databases. We’ve been working it pretty hard for the last couple of months with customer POCs and our own testing, but today I finally got a chance to catch up on our patch sets for Exadata. The most current bundle patch releases are 4 and 5. After hearing about widespread problems with bundle patch 4, I decided BP5 was probably the best fit for us. I did a few searches and didn’t find any reports of serious complications with BP5. However, during the installation I did come across a couple of issues I thought I’d share here.

The first issue…
The first issue in the install process involved DBFS. We use DBFS as a high-performance clustered file system to stage and load data. When we ran the "opatch apply" command, we got the following error message after a few screens of output:

—- —– —– —– —– —– —– —–
The following actions have failed: Copy failed from ‘/home/oracle/stage/9870547/files/bin/dbfs_client’ to ‘/u01/app/oracle/product/11.2.0/dbhome_1/bin/dbfs_client’…
—- —– —– —– —– —– —– —–

We checked permissions on the file and directory but that didn’t appear to be the problem. So we renamed the file and retried the install.

—- —– —– —– —– —– —– —–
mv /u01/app/oracle/product/11.2.0/dbhome_1/bin/dbfs_client \
   /u01/app/oracle/product/11.2.0/dbhome_1/bin/dbfs_client.orig

—- —– —– —– —– —– —– —–

Once the file was out of the way, the installation of BP5 continued without error.
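
The rename above can be wrapped in a small guard so it does not overwrite an earlier backup. What follows is only a sketch: the function name `move_aside` is made up for illustration, and the "in use" note rests on an unconfirmed guess that a running dbfs_client (for example, a mounted DBFS file system) could keep the binary busy and block the copy. The post never established the actual cause.

```shell
#!/bin/sh
# Sketch only: move a file aside before retrying "opatch apply".
# move_aside is a hypothetical helper name, not part of OPatch.

# move_aside FILE
#   Renames FILE to FILE.orig. Refuses to overwrite an existing backup.
#   If fuser is installed and reports the file as in use, prints a note
#   (a possible, unconfirmed reason for the failed copy) and still renames.
move_aside() {
    target=$1
    backup="${target}.orig"

    if [ ! -e "$target" ]; then
        echo "nothing to move: $target does not exist" >&2
        return 1
    fi
    if [ -e "$backup" ]; then
        echo "backup already exists: $backup" >&2
        return 1
    fi
    # fuser exits 0 when at least one process is using the file.
    if command -v fuser >/dev/null 2>&1 && fuser "$target" >/dev/null 2>&1; then
        echo "note: $target is in use by a running process" >&2
    fi
    mv "$target" "$backup"
}
```

With the function loaded, `move_aside /u01/app/oracle/product/11.2.0/dbhome_1/bin/dbfs_client` would perform the same rename shown above.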

The second issue…
You might classify the second issue as a documentation problem. The instructions in the README file appear to recommend turning on RDS at this juncture. Here’s an excerpt from the README.

—- —– —– —– —– —– —– —–
**************
SPECIAL NOTE :
**************
The patch will activate RDS by default, if the applying system is using UDP,
user need to switch manually by doing:

cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk ipc_g

—- —– —– —– —– —– —– —–

Now it may be just the way I read this note, but there appear to be two problems with this "special note". The first thing I noticed was that this is not the command to activate RDS. It is in fact the command to configure the home back to UDP. See MOS Doc ID 751343.1. So maybe the instructions meant for me to turn off RDS for the remainder of the patch install. That is more likely than not what was meant. But if that is the case, the second problem is that there are no instructions to turn RDS back on at the end of the install. Anyway, I knew from experience that I would have to turn off RDS in order to start up the database in upgrade mode. You see, a few weeks ago I went through the process of configuring RDS for the database home and the ASM home. I don’t know if it’s a bug or a feature, or if I simply missed something when I configured RDS (I don’t think so), but after that I found I could no longer start up my databases using SQL*Plus. When I tried, I got this error:

—- —– —– —– —– —– —– —–
SQL> startup
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:bind-rds failed with status: 99
ORA-27301: OS failure message: Cannot assign requested address
ORA-27302: failure occurred at: skgxpvaddr14
ORA-27303: additional information: Could not bind RDS socket to 192.168.8.201. Check interfaces specified for Oracle IPC (RAC/Exadata). [pid: 24249]
—- —– —– —– —– —– —– —–

Normally this wouldn’t be a problem because I can supply the -o option to srvctl to pass in startup options like "mount" or "force". Unfortunately "upgrade" is not one of the valid options for "srvctl start database". Consequently, to run the post patch scripts in upgrade mode you must first switch back to UDP. Your database needs to be down during this process. At this point in the patching process your databases are all offline, at least on the node being patched (for rolling upgrades). Per the README, the following command will reconfigure your database home for UDP:

—- —– —– —– —– —– —– —–
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk ipc_g

—- —– —– —– —– —– —– —–

After this is done you can start the database in upgrade mode and run the post patch scripts. The README states that you should have only one instance of the database up for this process and that you should have the cluster_database parameter set to ‘FALSE’. Since this is something you can complete from just one node of your cluster, you don’t have to reconfigure RDS/UDP on all nodes. You need only run the post patch scripts from one running instance of each of your databases. Of course, after the patch installation is complete you will want to rebuild the IPC library for RDS using the following command:

—- —– —– —– —– —– —– —–
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk ipc_rds ioracle

—- —– —– —– —– —– —– —–
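
Because the two relink commands differ only in their make targets, it is easy to run the wrong one. Here is a minimal sketch of a wrapper that selects the targets by name. The function name `relink_ipc` and the `DRY_RUN` variable are assumptions made for illustration. The make targets are exactly the ones quoted in this post.

```shell
#!/bin/sh
# Sketch only: choose the IPC relink targets by protocol name.
# relink_ipc and DRY_RUN are hypothetical names, not Oracle tooling.

# relink_ipc MODE
#   MODE is "udp" or "rds". Prints the make command for that mode.
#   The command is only executed when DRY_RUN=0 is set in the environment.
relink_ipc() {
    case "$1" in
        udp) targets="ipc_g" ;;            # target from the BP5 README note
        rds) targets="ipc_rds ioracle" ;;  # targets used in this post for RDS
        *)   echo "usage: relink_ipc udp|rds" >&2; return 2 ;;
    esac
    echo "make -f ins_rdbms.mk $targets"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # $targets is left unquoted on purpose so it splits into make targets.
        ( cd "${ORACLE_HOME:?ORACLE_HOME is not set}/rdbms/lib" &&
          make -f ins_rdbms.mk $targets )
    fi
}
```

By default the function only prints the command, so `relink_ipc rds` can be checked before running `DRY_RUN=0 relink_ipc rds` with the databases down.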

Aside from these two relatively minor issues, BP5 installed cleanly. There are several pages of bug fixes in this patch, and from what I’ve seen so far I can’t think of any technical reason not to go ahead and install it. By the way, one of the first things you will be instructed to do is unlock the Grid Infrastructure home using the "$GI_HOME/crs/install/rootcrs.pl -unlock" command. This process will also shut down your cluster and all the Grid Infrastructure resources (databases, listeners, SCANs, etc.). Once the bundle patch is applied (and before you run the post patch scripts) you will run the "$GI_HOME/crs/install/rootcrs.pl -patch" command, which will restart the entire Oracle stack.
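
Once the stack is back up, one way to confirm which protocol an instance is really using is to look for the "interconnect IPC version" line that the database writes to its alert log at startup. The sketch below assumes the line contains that phrase. The function name `ipc_from_alert_log` is hypothetical, and the alert log path depends on your diagnostic destination.

```shell
#!/bin/sh
# Sketch only: report the IPC protocol recorded in a database alert log.
# ipc_from_alert_log is a hypothetical helper name.

# ipc_from_alert_log FILE
#   Prints the most recent line in FILE that mentions
#   "interconnect IPC version", or returns 1 if there is none.
ipc_from_alert_log() {
    line=$(grep -i 'interconnect IPC version' "$1" | tail -n 1)
    [ -n "$line" ] || return 1
    printf '%s\n' "$line"
}
```

Pass it the path to the alert log of the instance you just started. A line mentioning "Oracle RDS/IP (generic)" suggests the relink for RDS took effect.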

About Randy Johnson

Comments

9 Responses to “Exadata Bundle Patch 5 Gotchas”
  1. Yasin Baskan says:

    Hi Randy,

    The readme states, “if the applying system is using UDP”. So why not just skip this section as the db machine uses RDS by default?

    We had no problems applying the patch if you skip this step. No issues in startup upgrade either.

    • Randy Johnson says:

      Yasin, thanks for your comment. That is the way I read it too. But like I said since we configured RDS on our Oracle Home and Grid Home we have not been able to start our database using sqlplus. When we do we get the “Could not bind RDS socket” error. Have you verified that you really have RDS configured? How did you verify that? The documentation states that you should see something like “CELL interconnect IPC version: Oracle RDS/IP (generic)” in your database alert log when you startup the database. Would you mind checking and let me know? Like I mentioned in the article, I’m not sure at this point whether or not this is normal behavior for an RDS configured database or if it was some kind of problem with our environment.

      On another note are you running DBFS? Did you have any issues with BP5 relating to that?

    • Randy Johnson says:

      Okay, so this week I’m at a client site working on a half rack Exadata and I’m afraid I’m going to have to eat some crow. Good thing I skipped breakfast today. This client is running Cell OS 11.2.1.2.6 and Oracle 11.2.1.0. The Oracle kernel is configured for RDS over the interconnect, so I tried shutting down and restarting their database using SQL*Plus. The database started without any sign of the RDS bind error I’m seeing back in my lab environment. So this looks like it was a bug in our configuration. I still believe there is a problem with the BP5 install when DBFS is being used but haven’t heard from anyone else on the subject. We’ll be installing BP5 here at this client in the next few days. I’ll follow up with a post on my findings.

  2. >I still believe there is a problem with the BP5 install when DBFS is being used but haven’t heard from anyone else on the subject.

    …can you be a little more specific?

    • Randy Johnson says:

      Hey Kevin. Here’s the error we got when installing BP5.

      The following actions have failed: Copy failed from ‘/home/oracle/stage/9870547/files/bin/dbfs_client’ to ‘/u01/app/oracle/product/11.2.0/dbhome_1/bin/dbfs_client’…

      After checking permissions on the target file and directory I just tried moving the file out of the way and re-trying the step. It executed without any further errors.
      I mentioned this in the post so is there anything else I can tell you about our environment that might help?

  3. Yasin Baskan says:

    Hi Randy,

    Sorry I could not reply earlier. I did not check whether the machine I patched was using RDS, but it should be, since the default Exadata installation uses RDS. I have applied BP5 on a system having BP4 and using DBFS with no problems. I have another one that will move to BP5 from BP3 and which is also using DBFS. Let me see how it will go.

    • Randy Johnson says:

      The error seemed to be that OPatch couldn’t copy the new version of dbfs_client to the target directory. The fix was pretty straightforward, so I’m not sure why it would affect us and not you, but maybe it had to do with the patch level of Oracle we were applying the patch to.

  4. Randy Johnson says:

    Okay, so this morning I figured out what was causing my database startup from SQL*Plus to fail after relinking for RDS. It’s a little embarrassing, but in all fairness to Oracle I felt I owed them a follow-up. I also learned something a little interesting about RDS. As you may recall, after relinking the Oracle kernel so that it uses RDS for interconnect traffic I started getting the following error when using SQL*Plus to start up my database:

    SQL> startup ORA-27504: IPC error creating OSD context

    After we finished configuring our Exadata machine I proceeded to configure my shell environment. When I did, I copied over my profile script from another server. Well, as you might have already guessed, I missed resetting a variable to the new server environment. The variable is ORA_CRS_HOME. It was still set to /opt/oracle/… while on our Exadata database servers it is the standard /u01/app/11.2.0/grid. What is interesting about this is that it had never been set properly, yet it never caused an error during startup using SQL*Plus while the databases were using UDP for the interconnect. It only started causing a problem after I relinked the kernel for RDS.

    My thanks to everyone who offered their comments on this post.
