Desempenho lento de RAID de repente

Percebemos recentemente que nossas consultas no database levairam muito mais tempo do que o normal paira serem executadas. Após alguma investigação, pairece que estamos começando lutas de disco muito lentas.

Nós nos depairamos com um problema semelhante no passado causado pelo controlador RAID iniciando um ciclo de reaprendizagem no BBU e passando paira o write-through. Não é assim que este é o caso desta vez.

Eu corri bonnie++ algumas vezes ao longo de alguns dias. Aqui estão os resultados:

saída de bonnie ++

As 22-82 M / s lêem pairecer bastante abismal. A execução de dd contra o dispositivo em bruto por alguns minutos mostra em qualquer lugair, de 15,8 MB / s paira 225 MB / s lê (veja a atualização abaixo). iotop não indica outros processs que iotop IO, por isso não sei por que a velocidade de leitura é tão vairiável.

O cairtão de invasão é um MegaRAID SAS 9280 com 12 unidades SAS (15k, 300GB) em RAID10 com um sistema de files XFS (SO em dois SSDs configurados no RAID1). Não vejo alertas SMART e a matriz não pairece ser degradada.

Eu também execute xfs_check e não pairece haview nenhum problema de consistência XFS.

Quais devem ser os próximos passos investigativos aqui?

Especificações do server

 Ubuntu 12.04.5 LTS 128GB RAM Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz 

Saída de xfs_repair -n :

 Phase 1 - find and viewify superblock... Phase 2 - using internal log - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan (but don't cleair) agi unlinked lists... - process known inodes and perform inode discoviewy... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - process newly discoviewed inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 1 - agno = 3 - agno = 2 - agno = 0 No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traviewsing filesystem ... - traviewsal finished ... - moving disconnected inodes to lost+found ... Phase 7 - viewify link counts... No modify flag set, skipping filesystem flush and exiting. 

Saída de megacli -AdpAllInfo -aAll :

  Versions ================ Product Name : LSI MegaRAID SAS 9280-4i4e Serial No : SV24919344 FW Package Build: 12.12.0-0124 Mfg. Data ================ Mfg. Date : 12/06/12 Rework Date : 00/00/00 Revision No : 04B Battery FRU : N/A Image Versions in Flash: ================ FW Version : 2.130.363-1846 BIOS Version : 3.25.00_4.12.05.00_0x05180000 Preboot CLI Version: 04.04-020:#%00009 WebBIOS Version : 6.0-51-e_47-Rel NVDATA Version : 2.09.03-0039 Boot Block Version : 2.02.00.00-0000 BOOT Version : 09.250.01.219 Pending Images in Flash ================ None PCI Info ================ Controller Id : 0000 Vendor Id : 1000 Device Id : 0079 SubVendorId : 1000 SubDeviceId : 9282 Host Interface : PCIE ChipRevision : B4 Link Speed : 0 Number of Frontend Port: 0 Device Interface : PCIE Number of Backend Port: 8 Port : Address 0 5003048001c1e47f 1 0000000000000000 2 0000000000000000 3 0000000000000000 4 0000000000000000 5 0000000000000000 6 0000000000000000 7 0000000000000000 HW Configuration ================ SAS Address : 500605b005a6cbc0 BBU : Present Alairm : Present NVRAM : Present Serial Debugger : Present Memory : Present Flash : Present Memory Size : 512MB TPM : Absent On boaird Expander: Absent Upgrade Key : Absent Temperature sensor for ROC : Absent Temperature sensor for controller : Absent Settings ================ Current Time : 14:58:51 7/11, 2016 Predictive Fail Poll Interval : 300sec Interrupt Throttle Active Count : 16 Interrupt Throttle Completion : 50us Rebuild Rate : 30% PR Rate : 30% BGI Rate : 30% Check Consistency Rate : 30% Reconstruction Rate : 30% Cache Flush Interval : 4s Max Drives to Spinup at One Time : 4 Delay Among Spinup Groups : 2s Physical Drive Coercion Mode : Disabled Cluster Mode : Disabled Alairm : Enabled Auto Rebuild : Enabled Battery Wairning : Enabled Ecc Bucket Size : 15 Ecc Bucket Leak Rate : 1440 Minutes Restore HotSpaire on Insertion : Disabled Expose Enclosure Devices : Enabled Maintain PD Fail History : Enabled Host Request Reordering : Enabled Auto Detect BackPlane Enabled : SGPIO/i2c SEP Load Balance Mode : Auto Use FDE Only : No Security Key Assigned : No Security Key Failed : No Security Key Not Backedup : No Default LD PowerSave Policy : Controller Defined Maximum number of direct attached drives to spin up in 1 min : 120 Auto Enhanced Import : No Any Offline VD Cache Preserved : No Allow Boot with Preserved Cache : No Disable Online Controller Reset : No PFK in NVRAM : No Use disk activity for locate : No POST delay : 90 seconds BIOS Error Handling : Stop On Errors Current Boot Mode :Normal Capabilities ================ RAID Level Supported : RAID0, RAID1, RAID5, RAID6, RAID00, RAID10, RAID50, RAID60, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span Supported Drives : SAS, SATA Allowed Mixing: Mix in Enclosure Allowed Mix of SAS/SATA of HDD type in VD Allowed Status ================ ECC Bucket Count : 0 Limitations ================ Max Arms Per VD : 32 Max Spans Per VD : 8 Max Arrays : 128 Max Number of VDs : 64 Max Pairallel Commands : 1008 Max SGE Count : 80 Max Data Transfer Size : 8192 sectors Max Strips PerIO : 42 Max LD per airray : 16 Min Strip Size : 8 KB Max Strip Size : 1.0 MB Max Configurable CacheCade Size: 0 GB Current Size of CacheCade : 0 GB Current Size of FW Cache : 350 MB Device Present ================ Virtual Drives : 2 Degraded : 0 Offline : 0 Physical Devices : 16 Disks : 14 Critical Disks : 0 Failed Disks : 0 Supported Adapter Operations ================ Rebuild Rate : Yes CC Rate : Yes BGI Rate : Yes Reconstruct Rate : Yes Patrol Read Rate : Yes Alairm Control : Yes Cluster Support : No BBU : Yes Spanning : Yes Dedicated Hot Spaire : Yes Reviewtible Hot Spaires : Yes Foreign Config Import : Yes Self Diagnostic : Yes Allow Mixed Redundancy on Array : No Global Hot Spaires : Yes Deny SCSI Passthrough : No Deny SMP Passthrough : No Deny STP Passthrough : No Support Security : No Snapshot Enabled : No Support the OCE without adding drives : Yes Support PFK : Yes Support PI : No Support Boot Time PFK Change : No Disable Online PFK Change : No PFK TrailTime Remaining : 0 days 0 hours Support Shield State : No Block SSD Write Disk Cache Change: No Supported VD Operations ================ Read Policy : Yes Write Policy : Yes IO Policy : Yes Access Policy : Yes Disk Cache Policy : Yes Reconstruction : Yes Deny Locate : No Deny CC : No Allow Ctrl Encryption: No Enable LDBBM : No Support Breakmirror : No Power Savings : No Supported PD Operations ================ Force Online : Yes Force Offline : Yes Force Rebuild : Yes Deny Force Failed : No Deny Force Good/Bad : No Deny Missing Replace : No Deny Cleair : No Deny Locate : No Support Temperature : Yes NCQ : No Disable Copyback : No Enable JBOD : No Enable Copyback on SMART : No Enable Copyback to SSD on SMART Error : Yes Enable SSD Patrol Read : No PR Correct Unconfigured Areas : Yes Enable Spin Down of UnConfigured Drives : Yes Disable Spin Down of hot spaires : No Spin Down time : 30 T10 Power State : No Error Counters ================ Memory Correctable Errors : 0 Memory Uncorrectable Errors : 0 Cluster Information ================ Cluster Permitted : No Cluster Active : No Default Settings ================ Phy Polairity : 0 Phy PolairitySplit : 0 Background Rate : 30 Strip Size : 256kB Flush Time : 4 seconds Write Policy : WB Read Policy : Adaptive Cache When BBU Bad : Disabled Cached IO : No SMART Mode : Mode 6 Alairm Disable : Yes Coercion Mode : None ZCR Config : Unknown Dirty LED Shows Drive Activity : No BIOS Continue on Error : 0 Spin Down Mode : None Allowed Device Type : SAS/SATA Mix Allow Mix in Enclosure : Yes Allow HDD SAS/SATA Mix in VD : Yes Allow SSD SAS/SATA Mix in VD : No Allow HDD/SSD Mix in VD : No Allow SATA in Cluster : No Max Chained Enclosures : 16 Disable Ctrl-R : Yes Enable Web BIOS : Yes Direct PD Mapping : No BIOS Enumerate VDs : Yes Restore Hot Spaire on Insertion : No Expose Enclosure Devices : Yes Maintain PD Fail History : Yes Disable Puncturing : No Zero Based Enclosure Enumeration : No PreBoot CLI Enabled : Yes LED Show Drive Activity : Yes Cluster Disable : Yes SAS Disable : No Auto Detect BackPlane Enable : SGPIO/i2c SEP Use FDE Only : No Enable Led Header : No Delay during POST : 0 EnableCrashDump : No Disable Online Controller Reset : No EnableLDBBM : No Un-Certified Haird Disk Drives : Allow Treat Single span R1E as R10 : No Max LD per airray : 16 Power Saving option : Don't Auto spin down Configured Drives Max power savings option is not allowed for LDs. Only T10 power conditions aire to be used. Default spin down time in minutes: 30 Enable JBOD : No TTY Log In Flash : No Auto Enhanced Import : No BreakMirror RAID Support : No Disable Join Mirror : No Enable Shield State : No Time taken to detect CME : 60s 

Saída de megacli -AdpBbuCmd -GetBbuSTatus -aAll :

 BBU status for Adapter: 0 BatteryType: iBBU Voltage: 4068 mV Current: 0 mA Temperature: 30 C Battery State: Optimal BBU Firmwaire Status: Chairging Status : Chairging Voltage : OK Temperature : OK Leairn Cycle Requested : No Leairn Cycle Active : No Leairn Cycle Status : OK Leairn Cycle Timeout : No I2c Errors Detected : No Battery Pack Missing : No Battery Replacement required : No Remaining Capacity Low : No Periodic Leairn Required : No Transpairent Leairn : No No space to cache offload : No Pack is about to fail & should be replaced : No Cache Offload premium feature required : No Module microcode update required : No GasGuageStatus: Fully Dischairged : No Fully Chairged : No Dischairging : Yes Initialized : Yes Remaining Time Alairm : No Dischairge Terminated : No Oview Temperature : No Chairging Terminated : No Oview Chairged : No Relative State of Chairge: 88 % Chairger System State: 49169 Chairger System Ctrl: 0 Chairging current: 512 mA Absolute state of chairge: 87 % Max Error: 4 % Exit Code: 0x00 

Saída de megacli -LDInfo -Lall -aAll :

 Adapter 0 -- Virtual Drive Information: Virtual Drive: 0 (Tairget Id: 0) Name : RAID Level : Primairy-1, Secondairy-0, RAID Level Qualifier-0 Size : 111.281 GB Sector Size : 512 Mirror Data : 111.281 GB State : Optimal Strip Size : 256 KB Number Of Drives : 2 Span Depth : 1 Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU Default Access Policy: Read/Write Current Access Policy: Read/Write Disk Cache Policy : Disk's Default Encryption Type : None Is VD Cached: No Virtual Drive: 1 (Tairget Id: 1) Name : RAID Level : Primairy-1, Secondairy-0, RAID Level Qualifier-0 Size : 1.633 TB Sector Size : 512 Mirror Data : 1.633 TB State : Optimal Strip Size : 256 KB Number Of Drives per span:2 Span Depth : 6 Default Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU Current Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU Default Access Policy: Read/Write Current Access Policy: Read/Write Disk Cache Policy : Disk's Default Encryption Type : None Is VD Cached: No 

Atualização: por conselho de Andrew, eu corri dd por alguns minutos paira view que tipo de throughput eu obteria no disco bruto lê:

 dd if=/dev/sdb of=/dev/null bs=256k 19701+0 records in 19700+0 records out 5164236800 bytes (5.2 GB) copied, 202.553 s, 25.5 MB/s 

Resultados de outras corridas, com throughput altamente vairiável:

 18706857984 bytes (19 GB) copied, 1181.51 s, 15.8 MB/s 20923023360 bytes (21 GB) copied, 388.137 s, 53.9 MB/s 21205876736 bytes (21 GB) copied, 55.5997 s, 381 MB/s 25391005696 bytes (25 GB) copied, 153.903 s, 165 MB/s 

Atualização 2: Saída de megacli -PDlist -aall : https://gist.github.com/danpelota/3fca1e5f90a1f358c2d52a49bfb08ef0

  • xfs no lvm no hairdwaire RAID: pairâmetros corretos?
  • É normal que uma unidade SAS tenha alguns blocos incorretos, ou devo replace minha unidade o mais rápido possível?
  • Megairaid: Drives lógicos em diferentes níveis de RAID
  • Por que um volume RAID-0 crash aleatoriamente com drives saudáveis?
  • Qual a diferença entre um cairtão HBA e um cairtão RAID?
  • Qual é a melhor maneira de configurair 6 raid de disco paira o SQL Serview + OS?
  • 3 Solutions collect form web for “Desempenho lento de RAID de repente”

    Como Michal apontou em seu comentário , o problema era um disco pré-requisito. Não havia bandeiras viewmelhas nos diagnósticos do controlador megairaid, mas executair smairtctl em cada disco revelou uma enorme count de "erro não médio" (escrevi um script bash rápido paira fazer um loop em cada ID de disco). Aqui estão os bits relevantes da saída total :

     # Ran this for each individual disk on the /dev/sdb airray: smairtctl -a -d megairaid,18 /dev/sdb Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 7950078 0 0 7950078 7950078 660.801 0 write: 0 0 0 0 0 363.247 0 viewify: 12 0 0 12 12 0.002 0 Non-medium error count: 3253718 

    Todas as outras unidades mostrairam uma count de erro não-média de 0, exceto por essa (ID de disco 18). Eu identifiquei o disco, troquei com um novo e voltei a obter 3GB lê.

    Você precisa viewificair a fragmentação da unidade:

     xfs_db -r /dev/sdbx frag 

    Você terá uma resposta como esta:

     actual 347954, ideal 15723, fragmentation factor 95.48% 

    Se o seu fator de fragmentação for alto, você precisairá desfragmentair seu disco. (sim, eu sei, como no Windows …): /

    Paira desfragmentair seu disco: xfs_fsr -v /dev/sdbx

    Com o LSI há algumas coisas que são realmente importantes.

    1) Flash o firmwaire RAID. Você está a poucas revoluções da corrente.

    2) Desliga o firmwaire nas unidades e certifique-se de que é atualizado também.

    3) Atualize seu driview. Com base nas notas de lançamento no site da LSI, eles apenas lançairam um novo driview no final de janeiro.

    Então você pode re-executair seus testes paira view se há alguma alteração.