Veeam Backup Repository Best Practices Session Notes

After a couple days off I’m back to some promised VeeamON content. A nice problem that VeeamON had this year is the session choices were much more diverse and there were a lot more of them. Unfortunately this led to some overlap of some really great sessions. A friend of mine, Jaison Bailey of vBrisket fame and fortune, got tied up in another session and was unable to attend what I considered one of the best breakout sessions all week, Anton Gostev‘s Backup Repository Best Practices so he asked me to post my notes.

For those not too familiar with Veeam repos they can essentially be any manner of addressable disk space, whether local, DAS, NAS, SAN or even cloud, but when you start taking performance into account you have to get much more specific. Gostev, who is the Product Manager for Backup & Replication, lines out the way to do it right.

Anyway, here’s the notes including links to information when possible. Any notations I have are in bold and italicized.

Don’t underestimate the importance of Performance

  • Performance issues may impact RTOs

Five Factors of choosing Storage

  • Reliability
  • Fast backups
  • Fast restores
  • DR from complete storage loss
  • Lowest Cost

Ultimate backup Architecture

  • Fast, reliable primary storage for fastest backups, then backup copy to Secondary storage both onsite AND offsite
  • Limit number of RP on primary, leverage cheap secondary
  • Selectively create offsite copies to tape, dr site, or cloud

Best Repo: Low End

  • Any Windows or Linux Server
    • Can also serve as backup /backup proxy server
  • Physical server storage options
    • Local Storage
    • DAS (JBOD)
    • SAN LUN
  • Virtual
    • iSCSI LUN connected to in guest Volume

Best Backup Repo: High End

Backup Repos to Avoid

  • Low-end NAS  & appliances
    • If stuck with it, use iSCSI instead of other protocols * Ran into this myself with a Qnap array as my secondary storage, not really even feasible to run anything I/O heavy on it
  • SMB (CIFS) network shares
    • Lots of issues with existing SMB clients
    • If share is backed up by server, add actual server instead
  • VMDK on VMFS *Nothing wrong with running a repo from a virtual machine, but don’t store backups within, instead connect an iSCSI LUN directly to the VM and format NTFS
    • Extra logic on the data path- more chances for data corruption
    • Dependent on vSphere being functional
  • Windows Server 2012 Deduplication (scalability) *I get his rationale, but honestly I live and die by 2012 R2 deduplication, it just takes more care and feeding than other options. See my session’s slides for notes on how I implement it.

Immediate Future: Technologies to keep in mind

  • Server 2016 Deduplication
    • Same deduplication, far greater performance and scale (64 TB files) *This really will be a big deal in this space, there is a lot of upside to a simple dedupe ability rolled into a Windows server
  • ReFS 2.0
    • Great fit for backup repos because it has built in data corruption protection
    • Veeam is currently working on some things with it

Raw Disk

  • Raid10 whenever you can (2x write penalty, but capacity suffers)
  • Raid5 4x write penalty, greater risks)
  • Raid6 severe performance overhead (6x write penalty
  • Lookup Maximum performance per spindle
  • A single job can only keep about 6-8 spindles busy- use multiple jobs if you have them to saturate
  • RAID volume
    • Stripe Size
      • Typical I/O for Veeam is 25k-512KB
      • Windows Server 2012 defaults to 64KB
      • At least 128 KB stripe size is highly recommended
        • Huge change for things like Synthetics, etc
    • RAID array
      • Fill as many drives as possible from the start to avoid expansion
      • Low-end sorage systems have significant performance problems
    • File System
      • NTFS (Best Option)
        • Larger block size does not affect performance, but it helps avoid excessive fragmentation so 64KB block size recommend
        • Format with /L to enable larger file records
        • 16 TB max file size limit before 2012 (now 256)
        • * Full string of best practices for format NTFS partition from CLI: Format <drive:> /L /Q /FS:NTFS /A:8192
      • ReFS not ready for prime time yet
      • Other
    • Backup Job Settings
      • Always a performance vs disk space choice
      • Reverse incremental backup mode is 3x I/O per block
      • Consider forever incremental instead
      • Evaluate transform performance
      • Repository load
        • Limit concurrent jobs to a reasonable amount
        • Use ingest rate throttling for cross-SAN backups

Dedupe Storage: Pains and Gains

  • Gains
    • True global dedupe
    • Lowest cost/ TB
  • Do not use deduplicating storage as your primary backup repository!
  • But if you must leverage vendor-specific integrations, use backup modes without full backup transformation, us active fulls instead of synthetics
  • If backup performance is still bad, consider VTL
  • 16TB+ backup storage optimization for 4MB blocks (new)
  • Parallel processing may impact  dedupe ratios

Secondary Storage Best Practices

  • Vendor-specific integrations can make performance better
  • Test Backup Copy retention processing performance. If too slow consider Active Full option of backup copy jobs (new in v9)
  • If already invested and stuck
    • Use as primary storage and leverage native replication to copy backups to DR

Backup Job Settings BP

Built-In deduplication

  • Keep ON for best performance (except lowest end devices) even if it isn’t going to help you with Per VM backup files
  • Compression
    • Instead of disabling keep Optimal enabled in job and use “decompress before storing- even locally
    • Dedupe-friendly isn’t very friendly any more (new)
      • Will hinder faster recovery in v9
  • Vendor recommendations are sometimes self-serving  to achieve higher dedupe ratios but negatively effect performance

Disk-based Storage Gotchas

  • Gostev loves tape
    • Cheaper
    • Reliable
    • Read-only
    • Customer Success is the biggie
    • Tape is dead
      • Amazon, Google & 50% of Veeam customers disagree
  • Storage-level corruption
    • RAID Controllers are your worst enemies
    • Firmware and software bugs are common, too
    • VT402 Data Corruption tomorrow at 1:30 for more
  • Ransomware  possible

The 2 Part of the 3-2-1 Rule

  • 3 copies, 2 different medias, 1 offsite
  • Completely different storage type!

Storage based replication

  • Betting exclusively on storage-based replication will cost you your job
  • Pros:
    • Fantastic performance
    • Efficient bandwidth utilization
  • Cons:
    • Replicates bad data too
    • Backups remain in a single fault domain

Backup Copy vs. Storage-Based Copy

  • Pros:
    • Breaks the data loop (isolated source and target storage)
    • Implicitly validates all source data during its operation
    • Includes backup files health check
  • Cons:
    • Higher load on backup storage

Make Tape out of drives

  • Low End:
    • Use rotated drives
    • Supported for both primary & backup copy jobs
  • Mid-range:
    • Keep an off-site copy off-prem (cloud)
  • High End:
    • Use hardware-based WORM solutions

Virtualize your Repository (SOBR)

  • simplify backup storage and backup job management
  • Reduce storage hardware spending by allowing disks to be fully utilized
  • Improve backup storage performance and scalability

 

One Comment

  1. Pingback: Veeam options/Cloud connect