Tuesday, July 8, 2014

Flash Disk Replacement due to poor performance in Exadata X3-2 Environment


We have an Exadata X3-2 environment where one of our flash disks was showing a poor performance status.


To identify a poor performance flash disk, use the following command:


CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS='warning - poor performance' DETAIL



name:                   FLASH_1_0
diskType:               FlashDisk
luns:                   1_0
makeModel:              "Sun Flash Accelerator F40 PCIe Card"
physicalFirmware:       TI35
physicalInsertTime:     2012-10-01T13:45:57-07:00
physicalSerial:         5L0039YS
physicalSize:           93.13225793838501G
slotNumber:             "PCI Slot: 1; FDOM: 0"
status:                 warning - poor performance


This flash disk is in poor performance status.

Recommended Action  

The flash disk has entered poor performance status. A white cell locator LED has been lit to help locate the affected cell. Please replace the flash disk.
If the flash disk is used for flash cache, then flash cache will be disabled on this disk thus reducing the effective flash cache size. If the flash disk is used for flash log, then flash log will be disabled on this disk thus reducing the effective flash log size. If the flash disk is used for grid disks, then Oracle ASM rebalance will automatically restore the data redundancy.

Sun Oracle Exadata Storage Server is equipped with four PCIe cards. Each card has four flash disks (FDOMs), for a total of 16 flash disks. The four PCIe cards are installed in PCI slots 1, 2, 4, and 5. The PCIe cards are not hot-pluggable, so the Exadata cell must be powered down before replacing flash disks or cards.
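To hand the datacenter team an unambiguous location, the slotNumber attribute from the DETAIL output can be split into its PCI slot and FDOM numbers. The following is a minimal Python sketch, not part of any Oracle tooling. The function name is hypothetical, and the check on FDOM numbers assumes they run from 0 to 3, which matches the disk names in the listings here but is not stated explicitly.

```python
import re
from typing import Tuple

# Layout described above: four PCIe cards in PCI slots 1, 2, 4 and 5.
VALID_PCI_SLOTS = {1, 2, 4, 5}
# Assumption: the four FDOMs on each card are numbered 0..3.
FDOMS_PER_CARD = 4


def parse_slot_number(slot_number: str) -> Tuple[int, int]:
    """Return (pci_slot, fdom) from a value like '"PCI Slot: 1; FDOM: 0"'."""
    match = re.fullmatch(
        r'\s*"?PCI Slot:\s*(\d+);\s*FDOM:\s*(\d+)"?\s*', slot_number
    )
    if match is None:
        raise ValueError(f"unrecognised slotNumber: {slot_number!r}")
    pci_slot, fdom = int(match.group(1)), int(match.group(2))
    # Reject locations that do not exist in the layout described above.
    if pci_slot not in VALID_PCI_SLOTS or not 0 <= fdom < FDOMS_PER_CARD:
        raise ValueError(f"slot {pci_slot} / FDOM {fdom} is outside the expected layout")
    return pci_slot, fdom


if __name__ == "__main__":
    print(parse_slot_number('"PCI Slot: 1; FDOM: 0"'))
```

For the disk shown above, FLASH_1_0, this yields PCI slot 1 and FDOM 0.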


Because the flash disk was in poor performance status, the datacenter team replaced it in coordination with us (the DBA team), following the steps below.



1. Shut down the cell.

The following procedure describes how to power down the Exadata cell. Run the following command to check whether there are offline disks on other cells that are mirrored with disks on this cell:


CellCLI> LIST GRIDDISK ATTRIBUTES name WHERE asmdeactivationoutcome != 'Yes'

If any grid disks are returned, then it is not safe to take the storage server offline because proper Oracle ASM disk group redundancy will not be intact. Taking the storage server offline when one or more grid disks are in this state will cause Oracle ASM to dismount the affected disk group, causing the databases to shut down abruptly.
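If this check is scripted, the decision rule is simply "no rows returned means safe". The following Python sketch assumes the command output has already been captured as text, with one grid disk name per line. How the output is captured is outside the sketch, and the function name is hypothetical.

```python
def grid_disks_blocking_shutdown(cellcli_output: str) -> list:
    """Return the grid disk names listed by the asmdeactivationoutcome check.

    An empty list means no grid disk was returned, so the storage server
    can be taken offline without breaking ASM disk group redundancy.
    """
    names = []
    for line in cellcli_output.splitlines():
        text = line.strip()
        # Skip blank lines and an echoed prompt line, if the capture has one.
        if not text or text.startswith("CellCLI"):
            continue
        names.append(text.split()[0])
    return names


if __name__ == "__main__":
    blocking = grid_disks_blocking_shutdown("")
    print("safe to take offline" if not blocking else f"NOT safe: {blocking}")
```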

Once it is safe to take the Oracle Exadata Storage Server offline, inactivate all the grid disks using the following command:

CellCLI> ALTER GRIDDISK ALL INACTIVE

The preceding command will complete once all disks are inactive and offline. Depending on the storage server activity, it may take several minutes for this command to complete.

Verify that all grid disks are INACTIVE, so that the storage server can be shut down safely, by running the following command:

CellCLI> LIST GRIDDISK

If all grid disks are INACTIVE, then the storage server can be shut down without affecting database availability.

Stop the cell services using the following command:

CellCLI> ALTER CELL SHUTDOWN SERVICES ALL

Shut down the cell.

2. Replace the failed flash disk based on the PCI number and FDOM number.


3. Power up the cell. The cell services will be started automatically.


4. Bring all grid disks online using the following command:

CellCLI> ALTER GRIDDISK ALL ACTIVE

5. Verify that all grid disks have been successfully put online using the following command:

CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus

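        Wait until asmmodestatus changes from SYNCING to ONLINE for all grid disks.
        The following is an example of the output, first confirming that all flash disks report normal status and then listing the grid disks: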

         CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk

         FLASH_1_0       FL0034E1        normal
         FLASH_1_1       FL0034LC        normal
         FLASH_1_2       FL0034LL        normal
         FLASH_1_3       FL0034KL        normal
         FLASH_2_0       FL00339T        normal
         FLASH_2_1       FL00330H        normal
         FLASH_2_2       FL0032HH        normal
         FLASH_2_3       FL0033DA        normal
         FLASH_4_0       FL0033SS        normal
         FLASH_4_1       FL00347V        normal
         FLASH_4_2       FL0034PB        normal
         FLASH_4_3       FL0034PS        normal
         FLASH_5_0       FL0032JZ        normal
         FLASH_5_1       FL0034KC        normal
         FLASH_5_2       FL0035VV        normal
         FLASH_5_3       FL00365H        normal

CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
         DATA_DR_CD_00_inblrdrceladm03   ONLINE
         DATA_DR_CD_01_inblrdrceladm03   SYNCING
         DATA_DR_CD_02_inblrdrceladm03   ONLINE
         DATA_DR_CD_03_inblrdrceladm03   ONLINE
         DATA_DR_CD_04_inblrdrceladm03   ONLINE
         DATA_DR_CD_05_inblrdrceladm03   ONLINE
         DBFS_DG_CD_02_inblrdrceladm03   ONLINE
         DBFS_DG_CD_03_inblrdrceladm03   ONLINE
         DBFS_DG_CD_04_inblrdrceladm03   ONLINE
         DBFS_DG_CD_05_inblrdrceladm03   ONLINE
         RECO_DR_CD_00_inblrdrceladm03   ONLINE
         RECO_DR_CD_01_inblrdrceladm03   ONLINE
         RECO_DR_CD_02_inblrdrceladm03   ONLINE
         RECO_DR_CD_03_inblrdrceladm03   ONLINE
         RECO_DR_CD_04_inblrdrceladm03   ONLINE
         RECO_DR_CD_05_inblrdrceladm03   ONLINE

Oracle ASM synchronization is only complete when all grid disks show attribute asmmodestatus=ONLINE. Before taking another storage server offline, Oracle ASM synchronization must complete on the restarted Oracle Exadata Storage Server. If synchronization is not complete, then the check performed on another storage server will fail.
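The wait for synchronization can be checked mechanically by parsing the two-column listing above. The following Python sketch assumes the output of LIST GRIDDISK ATTRIBUTES name, asmmodestatus has been captured as text. The function name is hypothetical and no CellCLI call is made from the code.

```python
def disks_not_online(cellcli_output: str) -> dict:
    """Map grid disk name -> asmmodestatus for every disk that is not ONLINE."""
    pending = {}
    for line in cellcli_output.splitlines():
        fields = line.split()
        # Data rows have exactly two columns: name and asmmodestatus.
        # Blank lines and the echoed command line are skipped.
        if len(fields) != 2:
            continue
        name, status = fields
        if status.upper() != "ONLINE":
            pending[name] = status
    return pending


if __name__ == "__main__":
    sample = """CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
         DATA_DR_CD_00_inblrdrceladm03   ONLINE
         DATA_DR_CD_01_inblrdrceladm03   SYNCING
         DATA_DR_CD_02_inblrdrceladm03   ONLINE
    """
    # One disk is still resynchronizing in this sample.
    print(disks_not_online(sample))  # {'DATA_DR_CD_01_inblrdrceladm03': 'SYNCING'}
```

An empty dictionary means every grid disk on the cell reports ONLINE, so it is safe to move on to the next storage server.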


The new flash disk will be automatically used by the system. If the flash disk is used for flash cache, then the effective cache size will increase. If the flash disk is used for grid disks, then the grid disks will be recreated on the new flash disk. If those grid disks were part of an Oracle ASM disk group, then they will be added back to the disk group and the data will be rebalanced onto them based on the disk group redundancy and the asm_power_limit parameter.


Oracle ASM rebalance occurs when dropping or adding a disk. To check the status of the rebalance, do the following:

    • The rebalance operation may have been successfully run. Check the Oracle ASM alert logs to confirm.
    • The rebalance operation may be currently running. Check the GV$ASM_OPERATION view to determine if the rebalance operation is still running.
    • The rebalance operation may have failed. Check the ERROR_CODE column of the GV$ASM_OPERATION view to determine if the rebalance operation failed.
    • Rebalance operations from multiple disk groups can be done on different Oracle ASM instances in the same cluster if the physical disk being replaced contains ASM disks from multiple disk groups. One Oracle ASM instance can run one rebalance operation at a time. If all Oracle ASM instances are busy, then rebalance operations will be queued.
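The three cases in the list above can be expressed as a small decision function. The following Python sketch assumes the rows of GV$ASM_OPERATION have already been fetched by whatever database client is in use and converted to dictionaries with lowercase keys. The keys state and error_code, and the ERRS state value, reflect my understanding of the view and should be checked against the documentation for your release. The function name is hypothetical.

```python
def rebalance_status(rows: list) -> str:
    """Classify rebalance progress from rows fetched from GV$ASM_OPERATION.

    Each row is assumed to be a dict with 'state' and 'error_code' keys.
    """
    if not rows:
        # No operation is listed: it either finished or never started.
        # The ASM alert log is the place to confirm which.
        return "no operation listed"
    for row in rows:
        # Assumption: a populated error_code or state ERRS marks a failure.
        if row.get("error_code") or row.get("state") == "ERRS":
            return "failed"
    return "in progress"


if __name__ == "__main__":
    print(rebalance_status([]))
    print(rebalance_status([{"state": "RUN", "error_code": None}]))
    print(rebalance_status([{"state": "ERRS", "error_code": "ORA-15041"}]))
```

The sketch deliberately does not distinguish a queued operation from a running one, since both show up as a row without an error.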

My Oracle Support documents referred to:


HALRT-02011: Flash disk poor performance status (Doc ID 1206015.1)


Steps to shut down or reboot an Exadata storage cell without affecting ASM (Doc ID 1188080.1)


