Sunday, 11 August, 2013 08:50 Written by Brian B
If you are on 11i and planning to upgrade to R12, make sure you review the links below on the Consolidated Upgrade Patch 2 (CUP2). http://ow.ly/2yTvWl
Virtualization and Cloud Made Simple and Easy with Oracle’s Latest Engineered Systems – Webcast http://t.co/HFu9lzsbD8
Linux Container (LXC) Part 2: Working With Containers http://t.co/pDkVzHyYwk
e-book Engineered for Extreme Performance http://t.co/Yht6oLOQUA
Oracle Launches New Oracle Linux 6 Certifications; Oracle Linux 5 Exams To Retire http://t.co/rQNHGGrBG7
Oracle is Unveiling the Latest Engineered System for Enterprise Virtualization http://t.co/I46E2oi3dy
Ready for detailed info on Oracle Multitenant? Read this technical white paper http://t.co/VZso6WMRdH
The Case for Running Oracle Database 12c on Oracle Solaris http://t.co/0KEMnSocix
10 Things CIOs Should Know About The World’s First Cloud Database http://t.co/sm0KrQbMkj
Oracle VM Templates for Oracle Database http://t.co/nrO4OavkMi
Wednesday, 07 August, 2013 11:02 Written by Brian B
The Solaris Crash Analysis Tool (SCAT), available from “My Oracle Support” (MOS), is a fantastic aid for those without a strong background in Solaris internals who need to investigate a system that has panicked.
The built-in modular debugger (mdb) can also augment SCAT, and at times works faster.
Here is a very basic walkthrough that I provide to our Collier IT engineers to assist them with initial diagnostics.
There’s much more, and I’ll add more walkthroughs later.
1. The stack backtrace often contains function names that are worth searching as keywords in MOS. Sometimes you get lucky here.
> $c
vpanic(127def0, 2a100ed40c0, 0, 0, 3effffff8000000, 1869c00)
cpu_deferred_error+0x568(ecc1ecc100000000, 2, 1000060000003a, 600000000, 0, 30001622360)
ktl0+0x48(29fff982000, 2a100ed4d78, 30000, 16, 60, 30)
pp_load_tlb+0x1e4(29fff980000, 29fff9822c0, 1d00, 29fff980300, 1822f00, 2)
ppcopy_common+0x12c(70001d32500, 700030b2500, 1, 1, 29fff982000, 29fff980000)
ppcopy+0xc(70001d32500, 700030b2500, 0, 0, 1822348, 70001d32500)
do_page_relocate+0x228(2a100ed5120, 2a100ed5128, 700030b2500, 2a100ed53e0, 0, 2a100ed4fb0)
page_relocate+0x14(2a100ed5120, 2a100ed5128, 1, 1, 2a100ed53e0, 0)
page_lookup_create+0x244(60017811400, 6007c570000, 70001d32500, 0, 2a100ed53e0, 0)
swap_getconpage+0xb4(60017811400, 6007c570000, 2000, 0, 2a100ed53c8, 2000)
anon_map_getpages+0x474(60010c02008, 0, 200, 109a420, 2a100ed53e0, 1)
segvn_fault_anonpages+0x32c(0, 800000, 0, 1, 6001753c2a8, 3)
segvn_fault+0x530(300034bc3c0, 300012abc20, 1, 1, 892000, ffffffffff76e000)
as_fault+0x4c8(300012abc20, 6001766b9d0, 890000, 60016881390, 186c0b0, 0)
pagefault+0xac(890000, 0, 1, 0, 60016881318, 1)
trap+0xd50(2a100ed5b90, 8903bb, 0, 1, fea0ad6c, 0)
utl0+0x4c(1e, fe8f8104, 9e58, fe8fee34, 7aebd8, fe8fa524)
>
2. ::status also reports details such as the hostname and the kernel revision the system was running:
> ::status
debugging crash dump vmcore.0 (64-bit) from sunbkpsrv5
operating system: 5.10 Generic_142900-13 (sun4u)
panic message: UE CE Error(s)
dump content: kernel pages only
>
3. ::cpuinfo -v shows what was running on each CPU when the system panicked:
> ::cpuinfo -v
 ID ADDR        FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD      PROC
  0 0000183a620  1b    7    0  60   no    no t-0    3000371fb20 java
                  |    |
       RUNNING <--+    +-->  PRI THREAD      PROC
         READY                60 2a1000c7ca0 sched
        EXISTS                59 30001e121e0 java
        ENABLE                59 30001d293e0 in.mpathd
                              59 3000371d480 java
                              59 3000371ce00 java
                              59 3000371c440 java
                              59 3000371f4a0 java

 ID ADDR        FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD      PROC
  1 0000180c000  1d    6    0  59  yes    no t-0    30001dc01c0 syslogd
                  |    |
       RUNNING <--+    +-->  PRI THREAD      PROC
      QUIESCED                99 2a100237ca0 sched
        EXISTS                60 2a100a83ca0 sched
        ENABLE                53 3000371c100 java
                              53 3000371c780 java
                              51 3000371aaa0 java
                              50 300032a9940 savecore
>
4. ::ps lists every process that existed at the time of the crash:
> ::ps
S PID PPID PGID SID UID FLAGS ADDR NAME
R 0 0 0 0 0 0x00000001 0000000001838150 sched
R 3 0 0 0 0 0x00020001 0000060012dab848 fsflush
R 2 0 0 0 0 0x00020001 0000060012dac468 pageout
R 1 0 0 0 0 0x4a004000 0000060012dad088 init
R 808 1 807 807 0 0x42000000 0000060016acf890 nbevtmgr
R 805 1 7 7 60002 0x4a304102 0000060016746038 java
R 764 1 764 764 0 0x42000000 0000060016acec70 dbsrv11
R 712 1 711 711 0 0x42000000 0000060016ad04b0 bpcd
R 709 1 708 708 0 0x42000000 00000600167fa040 vnetd
R 386 1 385 385 0 0x42000000 0000060016ad10d0 snmpd
R 382 1 382 382 25 0x52010000 00000600169a2048 sendmail
R 381 1 381 381 0 0x52010000 00000600169a2c68 sendmail
R 334 1 334 334 0 0x42000000 0000060016747878 syslogd
R 327 1 327 327 0 0x42000000 00000600161c0490 sshd
R 324 1 323 323 0 0x42010000 00000600167fb880 smcboot
R 326 324 323 323 0 0x42010000 0000060013fba018 smcboot
R 325 324 323 323 0 0x42010000 00000600167fac60 smcboot
R 275 1 275 275 0 0x42000000 0000060016748498 utmpd
R 267 1 266 266 0 0x42000000 00000600159bb860 pbx_exchange
R 263 1 263 263 0 0x42000000 00000600159bac40 inetd
R 257 1 257 257 0 0x42000000 0000060013e26c30 automountd
R 259 257 257 257 0 0x42000000 0000060015d02488 automountd
R 251 1 251 251 1 0x42000000 0000060013fbc478 rpcbind
R 234 1 234 234 0 0x42010000 00000600161c10b0 cron
R 208 1 208 208 0 0x42000000 0000060015d00c48 xntpd
R 185 1 7 7 0 0x42000000 0000060013fbd098 iscsid
R 155 1 154 154 0 0x42000000 0000060013e28470 in.mpathd
R 144 1 144 144 0 0x42000000 00000600159ba020 picld
R 139 1 139 139 1 0x42000000 00000600159bd0a0 kcfd
R 136 1 136 136 0 0x42000000 0000060012daac28 nscd
R 120 1 120 120 0 0x42000000 0000060015d030a8 syseventd
R 80 1 79 79 0 0x42020000 0000060013e26010 dhcpagent
R 61 1 61 61 0 0x42000000 0000060013fbb858 devfsadm
R 9 1 9 9 0 0x42000000 0000060013e29090 svc.configd
R 7 1 7 7 0 0x42000000 0000060012daa008 svc.startd
R 357 7 7 7 0 0x4a004000 0000060016746c58 rc2
R 702 357 7 7 0 0x4a004000 00000600167490b8 lsvcrun
R 703 702 7 7 0 0x4a004000 0000060013e27850 sh
R 809 703 7 7 0 0x4a004000 00000600169a3888 pdde
R 812 809 7 7 0 0x4a004000 0000060016ace050 pdde
R 813 812 7 7 0 0x4a004000 00000600169a44a8 sleep
R 342 7 7 7 0 0x4a004000 0000060015d00028 svc-webconsole
R 717 342 7 7 0 0x4a004000 00000600169a50c8 sjwcx
R 720 717 7 7 0 0x4a004000 00000600167fc4a0 java
R 304 7 304 304 0 0x4a004000 0000060013fbac38 ttymon
R 290 7 7 7 0 0x4a004000 00000600167fd0c0 svc-dumpadm
R 293 290 7 7 0 0x4a004000 00000600161bf870 savecore
R 269 7 269 269 0 0x4a014000 00000600161be030 sac
R 278 269 269 269 0 0x4a014000 0000060015d01868 ttymon
5. ::panicinfo shows more detail on the panic itself, including the panicking CPU, the thread, and the register state:
> ::panicinfo
cpu 0
thread 3000371fb20
message UE CE Error(s)
tstate 80001606
g1 1270ce4
g2 127dc00
g3 3effffff8000000
g4 fbfffffe
g5 1
g6 0
g7 3000371fb20
o0 127def0
o1 2a100ed4098
o2 0
o3 0
o4 fc30ffffffffffff
o5 3cf000000000000
o6 2a100ed3761
o7 11020dc
pc 104982c
npc 1049830
y 0
>
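6. Find the address of the thread that was executing when the system panicked. (The sample output in steps 6 to 8 appears to come from a different dump than the one above, so the thread address differs from the ::panicinfo output.)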
> panic_thread/K
panic_thread:
panic_thread: 3003acf7020
>
7. Run the thread macro against the pointer value from above. Look for the t_procp member.
> 3003acf7020$<thread
t_link = 0
t_stk = 0x2a108333ae0
t_startpc = 0
t_bound_cpu = 0x30004b42000
t_affinitycnt = 0
t_bind_cpu = 0xffff
t_flag = 0x1800
t_proc_flag = 0x104
...
t_procp = 0x3005a6713e0 <== use the value here
...
>
8. Run the proc2u macro against the pointer from the t_procp member. Look for the value stored in p_user.u_psargs. This is the command line (command name plus arguments) of the process that was running on the CPU at the time of the system panic.
> 0x3005a6713e0$<proc2u
p_user.u_execsw = execsw+0x28
p_user.u_auxv = [
    {
        a_type = 0x7d8
        a_un = {
            a_val = 0xffffffff7fffff90
            a_ptr = 0xffffffff7fffff90
            a_fcn = 0xffffffff7fffff90
        }
...
p_user.u_start = {
    tv_sec = 2007 Jun 11 00:00:00
    tv_nsec = 0xcf77e0
}
p_user.u_ticks = 0x191b148
p_user.u_comm = [ "bgscollect" ]
p_user.u_psargs = [ "bgscollect -I noInstance -B /usr/adm/best1_7.3.00" ] <== use the value here
p_user.u_argc = 0x5
p_user.u_argv = 0xffffffff7ffffc98
...
>
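When several dumps need a first pass, the facts from step 2 can be pulled out of saved ::status output with a small filter. This is only a sketch: the function name status_summary and the key=value output format are my own inventions for illustration, and it assumes the ::status text has the same layout as the sample in step 2.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdb): reads "::status" output on stdin
# and prints host=, os= and panic= lines that can be pasted into a MOS search.
# Assumes the line layout shown in step 2 of this walkthrough.
status_summary() {
    sed -n \
        -e 's/^debugging crash dump .* from \(.*\)$/host=\1/p' \
        -e 's/^operating system: *\(.*\)$/os=\1/p' \
        -e 's/^panic message: *\(.*\)$/panic=\1/p'
}

# Possible use against a dump, assuming the files are named unix.0 and
# vmcore.0 in the current directory (left commented out on purpose):
#   echo '::status' | mdb unix.0 vmcore.0 | status_summary
```

Fed the ::status text from step 2, this should print the hostname, the OS revision string, and the panic message on three lines. Using sed with basic regular expressions keeps the filter usable with the older tools on a Solaris 10 host.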
Saturday, 03 August, 2013 08:19 Written by Brian B
MultitenantX2 with WebLogic on DB12c. Please join OracleCAF launch to learn more – http://bit.ly/1b5SrgS
SPARC at 25: Past, Present, and Future Register @ http://bit.ly/17NIC41 to hear the story of SPARC from the people who shaped it.
Need a backup solution for SPARC compute assets? Check out the refreshed Oracle Optimized Solution for Backup & Recovery http://www.oracle.com/us/solutions/oos/oracle-backup-and-recovery/oos-bur-bwp-1847104.pdf
Oracle Engineered Systems eBook Now Available – http://ow.ly/2yJboP
End-of-Life for SPARC SuperCluster T4-4 – http://ow.ly/2yJnAM
Oracle Solaris Cluster Product Bulletin, July 2013 – http://ow.ly/2yJJU3
Very cool videos about Upgrade to Oracle 12c – http://ow.ly/2yJTlX
How to Get Best Performance From the Oracle ZFS Storage Appliance http://www.oracle.com/technetwork/articles/servers-storage-admin/sto-recommended-zfs-settings-1951715.html
What makes WebLogic 12c the most optimized App Server for Oracle Database 12c? Read the WP to find out http://bit.ly/18Nf7Ef
Some facts about SPARC T5 CPU architecture and “Software on Silicon”. What do you think about this new technology? http://pub.vitrue.com/va6o
Excellent Reading! Part 2 – DB12c and WLS – Application Continuity – http://bit.ly/133gF8H
Unveiling Oracle’s Latest Engineered System. Live webcast August 13th, 2013, 10am PT/1pm ET – http://bit.ly/18PyL2q
Beta testing begins this week for the new “Upgrade to Oracle Database 12c” certification exam (1Z1-060) http://bit.ly/1cb7APR
IDC White Paper: Oracle Virtual Networking Delivering Fabric Virtualization and Software Defined Networks http://ow.ly/2yLBAJ
Double Maximum Memory Capacity for SPARC T5-1B & T5-2 Servers http://ow.ly/2yMAvN
YouTube Video: Hands-On Labs for Oracle VM – http://pub.vitrue.com/4a3v
Complete integration, continuous innovation: See how Oracle Solaris and systems are evolving: – http://pub.vitrue.com/OUER
Database-as-a-Service and Platform-as-a-Service – http://ow.ly/2yNKEK
The Oracle Linux System Administration course is on Oracle’s top-selling course list. Check it out yourself: https://blogs.oracle.com/linux/entry/best_system_administration_training_for
New Friday tip: removing unwanted networks in Oracle VM – http://bit.ly/1clXH1Q
READ_ME_FIRST: What Do I Do with All of Those SPARC Threads? http://ow.ly/2yQrfq
Oracle Technology Day: Plug into the Cloud with Oracle Database 12c – Kolkata – http://ow.ly/2yRhB2
Using Ksplice for diagnostic purposes – http://pub.vitrue.com/POwROracle
Friday, 26 July, 2013 19:35 Written by Brian B
Join the live Oracle Solaris and Oracle’s Systems forum: The Best Platform for Oracle Software on 8/7 @ 9am PT!
http://bit.ly/13jr5NS
New Friday tip: Live Migrating the Oracle Database with Oracle VM Server for SPARC.
http://bit.ly/1bVXEd4
Why Sun ZFS Storage Appliance for Oracle Database can help you reap all the benefits of your database functions:
http://medianetwork.oracle.com/video/player/2549481537001
Needing Oracle WebLogic Server 12c installed on Solaris? Check out our VM templates for zones!
http://www.oracle.com/technetwork/server-storage/solaris11/downloads/zone-templates-1954157.html
Oracle Solaris Forum: Secrets to the latest enhancements to the world’s #1 UNIX OS unveiled. Learn more here:
http://pub.vitrue.com/d1P7
The key benefits of running Oracle Solaris on Oracle’s x86 systems:
http://pub.vitrue.com/ewzE
Introducing the new Oracle Storage Expert Center for Database Management, your 1-stop shop for optimizing your DB
http://pub.vitrue.com/dYR5
Need to explain Oracle Multitenant to your mother? Or to your boss? Use this simple infographic!
http://pub.vitrue.com/EL9S
Oracle to Unveil the Latest Engineered System for Enterprise Virtualization:
http://pub.vitrue.com/1ndv
The next Oracle Solaris web forum is coming: Wednesday, August 7th. Learn more about Oracle Solaris development
http://bit.ly/1c0STgB
Oracle Database 12c Interactive Quick Reference
http://pub.vitrue.com/Ea1c
Thursday, 18 July, 2013 15:41 Written by Brian B
A quick demo of the ZFS hot spare feature. We cover ZFS in the Oracle University course at our Minneapolis location.
After the install completed, I added four 2 GB drives so ZFS had some disks to work with.
bash-3.00# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c0d0
          /pci@0,0/pci-ide@7,1/ide@0/cmdk@0,0
       1. c0d1
          /pci@0,0/pci-ide@7,1/ide@0/cmdk@1,0
       2. c1d1
          /pci@0,0/pci-ide@7,1/ide@1/cmdk@1,0
       3. c2t0d0
          /pci@0,0/pci1000,30@10/sd@0,0
       4. c2t1d0
          /pci@0,0/pci1000,30@10/sd@1,0
There were no existing ZFS pools:
bash-3.00# zpool list
no pools available
So I created a pool named brian with two mirrored drives and a third drive as a hot spare:
bash-3.00# zpool create brian mirror c0d1 c1d1 spare c2t0d0
bash-3.00# zpool status brian
  pool: brian
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        brian       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
        spares
          c2t0d0    AVAIL

errors: No known data errors
Note that a spare is now listed in the zpool status output. A spare can be shared by multiple pools. Eric Schrock, who wrote this code, notes that there is now an FMA agent, zfs-retire, which subscribes to vdev failure faults and automatically initiates a replacement if a hot spare is available.
Now I force a failure and use zpool replace so the spare takes over:
bash-3.00# zpool offline brian c0d1
Bringing device c0d1 offline
bash-3.00# zpool replace brian c0d1 c2t0d0
bash-3.00# zpool status brian
  pool: brian
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: resilver completed with 0 errors on Sun Jun 22 11:55:46 2008
config:

        NAME          STATE     READ WRITE CKSUM
        brian         DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            spare     DEGRADED     0     0     0
              c0d1    OFFLINE      0     0     0
              c2t0d0  ONLINE       0     0     0
            c1d1      ONLINE       0     0     0
        spares
          c2t0d0      INUSE     currently in use

errors: No known data errors
Note that the spare is now marked as INUSE but is still listed as a spare. The replacement is only temporary: once the original device is replaced, the spare goes back to being an available spare for the pool.
Now I replace the “failed” drive and the spare returns to the AVAIL state.
bash-3.00# zpool replace brian c0d1 c2t1d0
bash-3.00# zpool status brian
  pool: brian
 state: ONLINE
 scrub: resilver completed with 0 errors on Sun Jun 22 11:58:02 2008
config:

        NAME        STATE     READ WRITE CKSUM
        brian       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
        spares
          c2t0d0    AVAIL

errors: No known data errors
Finally, I remove the spare from the pool, since it is no longer required:
bash-3.00# zpool remove brian c2t0d0
bash-3.00# zpool status brian
  pool: brian
 state: ONLINE
 scrub: resilver completed with 0 errors on Sun Jun 22 11:58:02 2008
config:

        NAME        STATE     READ WRITE CKSUM
        brian       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c1d1    ONLINE       0     0     0

errors: No known data errors
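To watch the spare move between AVAIL and INUSE without reading the whole status report each time, the spares section can be filtered out of the zpool status text. The sketch below is an illustration only: the function name spare_states is made up, and it assumes the usual layout where a line containing just "spares" introduces the spare devices and a blank line ends the section.

```shell
#!/bin/sh
# Hypothetical helper (not a zpool subcommand): reads "zpool status" output
# on stdin and prints "<device> <state>" for each hot spare it finds.
# Assumes a lone "spares" line starts the section and a blank line ends it.
spare_states() {
    awk '
        $1 == "spares" && NF == 1 { in_spares = 1; next }  # section header
        in_spares && NF == 0      { in_spares = 0 }        # blank line closes it
        in_spares                 { print $1, $2 }         # device and state
    '
}

# Possible use on a live system (left commented out on purpose):
#   zpool status brian | spare_states
```

Run against the three status reports in this demo, it should report the spare as AVAIL, then INUSE, then AVAIL again, and print nothing once the spare has been removed from the pool.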