I’ve just finished watching the Veeam v12 Data Platform Launch Event keynote, and one theme getting a lot of chatter is the new “direct to object storage” capability of the release. What this means is that as I create backups of my production data, the first copy that is made is written to object storage. I’ve written quite a bit about v12 and object storage before, both on how to set up and consume it in the performance tier of Scale-Out Backup Repositories and on comparing copy mode object storage with Veeam Cloud Connect Backup, but this is not that. Direct to object storage is the idea that you write your first backup copy of production data to object storage, rather than as a secondary copy.
Anton Gostev himself highlighted this as a favorite feature and outlined that there are limited use cases where it should be used. As a long-time Veeam administrator and, of late, an architect at their largest Service Provider partner, I have some thoughts on what those use cases should and should not be that I’d like to share.
The Do’s
✅ On Premises Backup Repository
The first and, in my opinion, most exciting use case for direct to object storage is as a replacement for your old block-based on-premises backup repositories. Object storage has numerous benefits over traditional storage servers using NTFS, ReFS or even XFS. These include true object-lock backed immutability, real scalability (you simply continue to add nodes to the cluster as needed) and API-first, automation-driven provisioning. All of these combine into a scalable, secure, and durable on-premises backup storage location.
✅ Distributed Workforce Agent Backups
As we find ourselves in the world of remote work in 2023, systems administrators have more concerns than in the past about protecting the laptops our remote workers are using, as critical company data further escapes the datacenter. By combining Veeam Backup & Replication v12, Veeam Service Provider Console v7 and public object storage services such as 11:11 Systems’ offering, we can create a robust solution for managing these backups securely and at scale, with self-service capabilities included.
✅ NAS Backups
As much as cloud-based storage services such as OneDrive, Google Drive and Dropbox have become prevalent in modern IT, there are still a lot of traditional Network Attached Storage file servers around. The problem is these are usually measured in hundreds of terabytes, and backups of them don’t scale well in any case. Here I believe it’s appropriate to back them up via the NAS backup feature of VBR directly to immutable cloud-based object storage. Recoverability is in terms of individual files rather than entire virtual machines that must be written back to a hypervisor, so the approach is far more appropriate.
The Don’ts
Now that we’ve covered the good use cases, let’s talk about some things that are technically possible but should not be considered from a best-practices point of view.
❌ Cloud-Based Object Storage for First Backup
There are going to be people who want to get rid of on-premises backups altogether and replace them with low-cost cloud object storage. While the immutability and ease of scale are tempting, don’t. Just don’t. Recovery from this for any backup platform would be painful at best, unsuccessful at worst. Further, you lose much of the power of Veeam Backup & Replication by doing this, with capabilities such as Instant Recovery and SureBackup instantly going away.
❌ Remote Office/Branch Office (ROBO) Server Backups
One of the use cases specifically called out by Gostev for direct to object storage was the ROBO server. While I respect him very much, I do have to disagree with him here. As someone who has had to restore this scenario in anger, the pressure will be on in most if not all of these situations to get that server back up and running as fast as possible, probably in your production datacenter with some creative networking, so that the remote location is back in business as soon as possible. By and large, direct to cloud-based backups for virtual machines or server-grade agents should not, in my opinion, be considered, as you quickly become limited in recoverability choices. If you cannot back these up first to a small NAS unit at the ROBO or to storage available in your production location, then VCC-B may be interesting as a first-write backup. While definitely not preferred, at least things like Instant Recovery to Cloud Director can be an option to get these workloads available sooner.
Conclusion
While writing backups from Veeam Backup & Replication v12 directly to object storage is an exciting innovation to the platform, it is a capability that should be given a great deal of consideration before putting into production-grade use. Features such as immutability and scalability will drive adoption, but what you are using it for is far more important than the hype.
With the upcoming version 12 release of Veeam Backup & Replication, the platform will start to support object storage as a primary location for backups. This is not a new idea, it feels like we’ve been talking about it forever, but it is a radical shift for the company and frankly one that many in the Veeam Cloud & Service Provider (VCSP) community have been dreading for a number of reasons. While I believe object storage has its place, even within a VCSP, it should not in general practice be a replacement for Veeam Cloud Connect, as VCC-B and VCC-R offer capabilities and enhancements that object storage services by themselves will never be able to replicate.
What is Veeam Cloud Connect?
It’s appropriate to level set on the technologies at hand first. Veeam Cloud Connect was first announced at VeeamON 2014 and became available through partners shortly thereafter. There are two sides to today’s Cloud Connect Service: Veeam Cloud Connect Backup (VCC-B) and Veeam Cloud Connect Replication (VCC-R).
Veeam Cloud Connect Backup allows a service provider to carve out a slice of cloud storage and securely provision it to a customer as a tenant. That storage can have further enhancements such as immutability (on Linux repositories or, with v12, object storage) and Insider Protection, an out-of-band temporary holding location for deleted cloud backups as part of a ransomware mitigation strategy. Once the tenant is provisioned on the SP side, a customer only needs to enter three pieces of information to add that storage as a repository or repositories to their backup infrastructure. Afterwards they can simply target that repo for backup copy jobs or, in some situations, even direct backup jobs.
Veeam Cloud Connect Replication is the same general concept as VCC-B, but with VMs being stored directly in IaaS. Your Service Provider provisions quotas in a VMware Cloud Director or vCenter environment into which powered-off copies of the source VMs are replicated. These replicas maintain a shorter maximum number of restore points, stored as snapshots on the replica. In conjunction with either a VPN connection or the Veeam Network Extension Appliance, you can extend your on-prem environment into your replicated IaaS one and run those systems remotely when needed.
Object Storage Review
Veeam has been slowly building support for object storage over a few releases now. In short order the progression has been:
9.5u4 (2019): Support for the ability to archive or dehydrate older restore points to object storage, known as move mode. This functionality is powered by the Scale-Out Backup Repository (SOBR) construct in VBR, defining a performance tier (on-prem block storage) and a cloud or archive tier (object storage). In this release the cloud tier can only support a single bucket per SOBR, which can cause scaling issues.
10.0 (2020): Copy mode is added, which allows that same SOBR to immediately copy any restore point that hits the performance tier to the cloud tier. This is commonly seen as the capability that best competes with VCC-B. Support for immutability and for mounting backups directly from the cloud tier arrives in this release as well.
11.0 (2021): Addition of support for Google Cloud Platform (GCP) Storage.
12.0 (2023): Host of improvements to object storage support.
Object storage in the performance tier, with multiple buckets supported as extents.
Support for multiple buckets of the same type in the cloud tier.
Reconfiguration of the bucket folder structure to allow for optimized performance especially around API calls.
In all, there has been a steady line of improvement and innovation toward object storage, making it the first-class citizen it will be with v12’s release.
Verdict
All of that raises the question: should I, as a Veeam Backup & Replication administrator, be migrating my offsite backups from Cloud Connect Backup to object storage? In my opinion, probably not. The compelling use case today for offsite backups is the same one that’s been around since v10: if you are only sending a copy of backups offsite for the purpose of compliance, checking the box if you will, then object storage may be a good fit for you. It is exceptional in how it supports Object Lock (immutability). Further, it is more efficient than anything else to date in how it writes data, which makes it well suited for on-premises storage if available, but for cloud copy usage the capabilities quickly fall off.
In the end, my biggest reason why I would still choose VCC-B today is the restore scenarios. With object storage, unless you are utilizing a VCSP like 11:11 Systems, you aren’t going to be able to have the object storage close enough to VMware-native compute to facilitate timely restores. Your other options sit at opposite ends of the spectrum: low-cost providers that have no compute capabilities and may add extra charges for deletion, data egress, or API calls, and hyperscalers that do have compute adjacent to the storage but run a completely different architecture than vSphere. Those scenarios require conversion of backups to instances and rely on your IT staff having the skill set to securely create an environment for these workloads to run from.
When you consider a tiered approach to Disaster Recovery, pairing a broad-spectrum copy of backups to VCC-B with VCC-R targeted at Tier 0 or 1 workloads, you gain 2 things by sticking with VCC-B:
Support for seeding replicas from VCC Backups
A defined DRaaS environment pre-created for your organization, so that if on-prem is not available you can boot replicas quickly and then, less quickly, begin restoring backups into it.
While not necessarily as important as restorability, one other major difference to be aware of is the ability to control how much of your on-prem backup data you keep offsite. Keep in mind that copy mode works on the principle that you want to keep everything you have on-site in an offsite location, so if you want 30 days on-prem and 7, 14, 60, or 90 days offsite there really isn’t a path besides Grandfather-Father-Son (GFS) to get there with just copy mode. Further, it’s going to copy everything that targets the given repository; if you have VMs you don’t necessarily want to keep off-prem (think of a door control system that does you no good anywhere but in the physical location), your only option is to create separate repositories and jobs. With VCC-B and backup copy jobs, all of these things can be handled in a very granular manner.
Finally, you have the issue of expertise. While many of us might be conversant in object storage and how it works, you may not necessarily know the ins and outs of bucket policies, ACLs, versioning, etc. It’s a technology you do need to know, but when it comes to backups, and especially the need to restore them in anger, Veeam Cloud Connect gets you an easy path to restore. A VCSP also employs experts in Veeam software, and access to that expertise and to best practices for Veeam backups and replicas can be critical, ensuring a more seamless restoration of services when the pressure is already on.
Conclusion
While the promise and the capabilities of object storage are exciting, as of the v12 release Veeam Cloud Connect Backup offers many more options when it comes to the restore and management of your backup data, and for that reason it should still be the go-to for offsite backup copies. If you are wanting to begin leveraging the technology, an on-premises object storage platform for your local backups is a far better fit and can give you an out-of-the-box upgrade to what you are using today.
For those that haven’t been playing in the RHEL/CentOS/Fedora space for a while, Red Hat finally got around to making CentOS a more “productized” Linux distribution and replaced it with CentOS Stream. That then spawned a few forked distributions to fill the gap, most notably Rocky Linux, which recently released version 9 to track the current cycle of RHEL.
Aside from the name, pretty much everything in Rocky looks and feels the same as what you are used to with CentOS at this point, but because it is such a newly named operating system, VMware hasn’t really known how to deal with it until the 7.0u3c release. As you may be aware, the 7.0u3 releases have had a bit of a troubled history, so many organizations have chosen to stay on 7.0u2 for now, and upgrading just to get the named guest OS isn’t really going to work. This is quite the problem for those of us who are used to using the combination of VMware templates and Customization Specifications to rapidly deploy virtual machine instances.
Luckily there are a couple of options that VMware has finally documented in KB59557. The first is kind of janky: edit the /etc/redhat-release file and change the word “Rocky” to “CentOS.” Sure, this works, but any time you run a system upgrade it’s going to change it back. The better option is to install and configure Canonical’s cloud-init. If you haven’t looked at cloud-init yet it’s definitely time, as it allows you to be a bit more distribution agnostic in how you automatically provision virtual machines and cloud instances. Since it comes from the same maker, Ubuntu has supported it and installed it by default for some time now, so having support here is just nice.
So first things first: when you go to build your template source VM, set the distribution to CentOS 8. This is as close as you are going to get until you upgrade to 7.0u3c or later. You can then build out your VM template as you normally would. Once the system is up you simply need to install cloud-init.
sudo yum makecache --refresh
sudo yum -y install cloud-init
Once installed it’s worth checking /etc/cloud/cloud.cfg to ensure that the disable_vmware_customization parameter is set to false. This will allow VMware to utilize cloud-init if it is available.
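If you want to quickly verify this from the shell, a check along these lines should do it (where exactly the parameter lives in cloud.cfg can vary by distribution):
grep disable_vmware_customization /etc/cloud/cloud.cfg
# expected result:
# disable_vmware_customization: false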
Now that you have installed cloud-init you can shut down the VM (after making sure the system is up to date, of course) and convert it to a template. It is then just a matter of using a VMware Customization Specification under Policies and Profiles to supply your rules for how you will automate the deployment.
In my last post, Configuring Veeam Backup & Replication SOBR for Non-immutable Object Storage, I covered the basics of how to consume object storage in Veeam Backup & Replication (VBR). In general this is done through the concept of Scale-out Backup Repositories (SOBR). In this post we are going to build upon that and layer in object storage’s object-lock feature, which is commonly referred to in Veeam speak as Immutability.
First, let’s define immutability. What the backup/disaster recovery world thinks of as immutability is much like the old Write Once, Read Many (WORM) technology of the early 00’s, you can write to it but until it ages out it cannot be deleted or modified in any way. Veeam and other backup vendors definitely treat it this way but object-lock under the covers actually leverages the idea of versioning to make this happen. I can still delete an object through another client but the net effect is that a new version of that object is created with a delete marker attached to it. This means that if that were to occur you could simply restore to the previous version and it’s like it never happened.
With VBR, once immutability is enabled objects are written with Compliance mode retention for the duration of the protected period. VBR recognizes that the bucket has object-lock enabled, and as it writes, each block is written with the retention policy applied directly rather than relying on a bucket-level policy. If you attempt to delete a restore point whose retention has not yet expired, it won’t let you delete it and instead gives you an error.
Setting up immutability with object storage in Veeam is mostly the same as without it, with a few differences. This starts with how we create the bucket. In the last post we simply used the s3 mb command to create a bucket, but when you need to work with object-lock you need to use the s3api create-bucket command.
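As a rough sketch of what that looks like with the AWS CLI (the profile, endpoint, and bucket name here are lab examples, not requirements, and some providers may also want a LocationConstraint):
aws --profile=premlab --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api create-bucket --bucket premlab-sobr-locked --object-lock-enabled-for-bucket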
Once your bucket is created you will go about adding your backup repository as we’ve done previously but with one difference, when you get to the Bucket portion of the New Object Store Repository wizard you are going to check the box for “Make recent backups immutable” and set the number of days desired.
You now have an immutable object bucket that can be linked to a traditional repository in a SOBR. Once data is written (still in the same modes), anything written is un-deletable via the VBR server until the retention period expires. Finally, if I examine any of the objects in the bucket with the s3api get-object-retention command, I can see that the object’s retention is set.
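For example, something along these lines, where the object key is purely a placeholder for one of the keys returned by s3api list-objects:
aws --profile=premlab --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api get-object-retention --bucket premlab-sobr-locked --key Veeam/example-backup-object.blk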
Veeam Backup & Replication (VBR) currently makes use of object storage through the concept of Scale-Out Backup Repositories, SOBR. A SOBR in VBR version 11 can contain any number of extents as the performance tier (made up of traditional repositories) and a single bucket for the capacity tier (object storage). The purpose of a SOBR from Veeam’s point of view is to allow for multiple on-premises repositories to be combined into a single logical repository to allow for large jobs to be supported and then be extended with cloud based object storage for further scalability and retention.
There are two general modes for object storage to be configured in a SOBR:
Copy Mode- any and all data that is written by Veeam to the performance tier extents will be copied to the object storage bucket
Move Mode- Only restore points that have aged out of a defined window will be evacuated to object storage, or, as a failure safeguard, when the performance tier extents reach a used capacity threshold. In this mode, within the Veeam UI the restore points all still appear as being local, but the local files only contain metadata that points to where the data chunks reside in the bucket. The process of this occurring in Veeam is referred to as dehydration.
In this post let’s demonstrate how to create the necessary buckets and how to create SOBRs for both Copy and Move modes without object-lock (Immutability) enabled. If you haven’t read my previous post about how to configure the aws cli to be used for object storage, you may want to check that out first.
1. Create buckets that will back our Copy and Move mode SOBRs. In this example I am using AWS CLI with the s3 endpoint to make the buckets.
2. Now access your VBR server and start with adding the access key pair provided for the customer. You do this in Menu > Manage Cloud Credentials.
3. Click on Backup Infrastructure then right click on Backup Repositories, selecting Add Backup Repository.
4. Select Object Storage as type.
5. Select S3 Compatible as object storage type
6. Provide a name for your object repository and hit next.
7. For the Account Settings screen enter the endpoint, region and select your created credentials.
8. In the Bucket settings window click Browse and select your created bucket then click the Browse button beside the Folder blank and create a subfolder within your bucket with the “New Folder…” button. I’ll note here do NOT check the box for “Make recent backups immutable for…” here as the bucket we have created above does not support object-lock. Doing so will cause an error.
9. Click Apply.
10. Create or select from existing traditional, Direct Storage repository or repositories to be used in your SOBR. Note: You cannot choose the repository that your Configuration Backups are targeting.
11. Right click on Scale-out Backup Repositories and select “Add Scale-out Backup Repository…”
12. Name your new SOBR.
13. Click Add in your Performance Tier screen and select your repository or repositories desired. Hit Ok and then Next.
14. Leave Data Locality selected as the Placement Policy for most scenarios.
15. In the Capacity Tier section check to Extend capacity with object storage and then select the desired bucket.
16. (Optional but highly recommended): Check the Encrypt data uploaded to object storage and create an encryption password. Hit Apply.
17. This will have the effect of creating an exact copy of any backup jobs that target the SOBR, both on premises and in the object store. To leverage Move mode rather than (or in addition to) Copy mode, you simply check the other box and set the number of days you would like to keep on premises.
Now you simply need to target a job at your new SOBR to have it start working.
In conclusion, let’s cover a bit about how we are going to see our data get written to object storage. For the Copy mode example it should start writing data to the object store immediately upon completion of each run. When leveraging Move mode you will see objects written on the first run after the specified move window has passed; for example, if you set it to move anything older than 7 days on-prem, dehydration will occur after the run on day 8. These operations can be seen in the Storage Management section of the History tab in the VBR console.
Further if I recursively list my bucket via command line I can see lots of data now, data is good. 😉
% aws --endpoint=https://us-central-1a.object.ilandcloud.com --profile=premlab s3 ls s3://premlab-sobr-unlocked-copy/ --recursive
In my last post I worked through quite a few things I’ve learned recently about interacting with S3 Compatible storage via the CLI. Now that we know how to do all that fun stuff it’s time to put it into action with a significant Service Provider/Disaster Recovery slant. Starting with this post I’m going to highlight how to get started with some common use cases of object storage in Backup/DR scenarios. Here we’re going to look at a fairly mature use case: object storage backing Veeam Backup for Office (now Microsoft) 365.
Veeam Backup for Microsoft 365 v6, which was recently showcased at Cloud Field Day 12, has been leveraging object storage as a way to make its storage consumption more manageable since version 4. Object also provides a couple more advantages in relation to VBM, namely an increase in data compression as well as a method to enable encryption of the data. With the upcoming v6 release they will also support offloading backups to AWS Glacier for a secondary copy of this data.
VBM exposes its use of object storage under the Object Storage Repositories section of Backup Infrastructure, but it consumes it as a step of the Backup Repository configuration itself, which is nested within a given Backup Proxy. I personally like to at a minimum start by scaling out repositories by workload (Exchange, OneDrive, SharePoint, and Teams) as each data type has a different footprint. When you really need to scale out VBM, say anything north of 5000 users in a single organization, you will want to use that as a starting point for how you break down and customize the proxy servers.
Let’s start by going to the backup proxy server, in this case the VBM server itself, and creating the folder structure for our desired Backup Repositories.
Now that we have folders, let’s go create some corresponding buckets to back them. We’ll do this via the AWS S3 CLI as I showed in my last post. At this point VBM does not support advanced object features such as Immutability, so there’s no need to be fancy and use the s3api, though I just prefer its command structure.
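For reference, the bucket creation looks something like this; the Exchange bucket name matches what we list later in this post, while the other names, endpoint, and profile are just illustrative:
aws --profile=premlab --endpoint-url=https://us-central-1a.object.ilandcloud.com s3 mb s3://premlab-ilandproduct-vbm365-exch
aws --profile=premlab --endpoint-url=https://us-central-1a.object.ilandcloud.com s3 mb s3://premlab-ilandproduct-vbm365-onedrive
aws --profile=premlab --endpoint-url=https://us-central-1a.object.ilandcloud.com s3 mb s3://premlab-ilandproduct-vbm365-sp
aws --profile=premlab --endpoint-url=https://us-central-1a.object.ilandcloud.com s3 mb s3://premlab-ilandproduct-vbm365-teams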
Ok, so now we have folders and buckets; time to hop into Veeam. First we need to add our object credentials to the server. This is a simple setup and most likely you will only need one set of credentials for all your buckets. Because in this example I will be consuming iland Secure Cloud Object Storage, I need to choose “S3 Compatible access key” under the “Add…” button in Cloud Credential Manager (Menu > Cloud Credentials). These should be the access key and secret provided to you by your service provider.
Now we need to go to Backup Infrastructure > Object Storage Repositories to add our various buckets. Start by right clicking and choose “Add Object Storage.”
1. Name your Object Repository
2. Select the S3 Compatible option
3. Enter your endpoint URL, region, and select credentials
4. Select your bucket from the dropdown menu
5. Create a folder inside of your bucket for this repository’s data and hit finish
Now simply repeat the process above for any and all buckets you need for this task.
Now that we have all our object buckets added, we need to pair them up with our on-premises repository folders. It’s worth noting that the on-prem repo is a bit misleading: as long as you use the defaults, no backup data will ever live locally in that repository. Rather, it holds a metadata file in the form of a single JetDB file that serves as pointers to the objects that are the actual data. For this reason the storage consumption here is really, really low and shouldn’t be part of your design constraints.
Under Backup Infrastructure > Backup Repositories we’re going to click “Add Repository..” and let the wizard guide us.
1. Name our repository
2. Specify the hosting proxy server and the path to the folder you wish to use
3. If you don’t already have one created, you can add an encryption secret to encrypt the data when specifying your object repository
4. Specify the object storage repository and the encryption key to use
5. Specify the retention period and retention level and hit finish
One note on that final step above. Often organizations will take the “Keep Forever” option that is allowed here, and I highly advise against this. You should specify a retention policy that is agreed upon with your business/organization stakeholders, as keeping any backup data longer than needed may have unintended consequences should a legal situation arise; data the organization believes to be long since gone is now discoverable through these backups.
Also worth noting: item-level retention is great if you are using a service provider that does not charge egress fees, because it gives you more granular control in terms of retention. If you use a hyperscaler such as Amazon S3 you may find this option drives your AWS bill up because of the much higher egress load each time the job runs.
Once you’ve got one added, rinse and repeat for any other repositories you need to add.
Finally, the only step left is to create jobs targeting our newly created repositories. This is going to have way more variables based on your organization size, retention needs, and other factors than I can truly do justice to in the space of this blog post, but I will show how to create a simple, entire-organization, single-workload job.
You can start the process under Organizations > Your Organization > Add to backup job…
1. Name your backup job
2. Select Organization as your source
3. Check the box for your desired organization
4. Select your organization and click the edit button, allowing you to deselect all the workloads not in this job
5. Once edited you’ll see just the workload you want for your organization before hitting next
6. I don’t have any exclusions for this job but you may have this need
7. Select your desired proxy server and backup repository
8. Finally, do any customization needed to the run schedule
Once again you’d want to repeat the above steps for all your different workload types, but that’s it! If we do an s3 ls on the full s3://premlab-ilandproduct-vbm365-exch/Veeam/Backup365/ilandproduct-vbm365-exch/ path, we’ll see a full folder structure where it’s working with the backup data, proving that we’re doing what we set out to do!
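For reference, that listing is just a recursive s3 ls against the path above (endpoint and profile as in my earlier posts):
aws --profile=premlab --endpoint-url=https://us-central-1a.object.ilandcloud.com s3 ls s3://premlab-ilandproduct-vbm365-exch/Veeam/Backup365/ilandproduct-vbm365-exch/ --recursive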
In conclusion, I went fairly deep into what is needed here, but in practice it isn’t that difficult considering the benefits you gain by using object storage for Veeam Backup for Microsoft 365. These benefits include large-scale storage, encryption and better data compression. Hope you find this helpful and check back soon for more!
Recently a good portion of my day job has been focused on learning and providing support for s3 compatible object storage. What is s3 compatible, you say? While Amazon’s AWS may have created S3, at its root today it is an open framework of API calls and commands known as s3. While AWS S3 and its many iterations are the 5000-pound gorilla in the room, many other organizations have created either competing cloud services or storage systems that let you leverage the technology in your own environments.
So why am I focusing on this, you may ask? Today we are seeing more and more enterprise/cloud technologies rely on object storage. Any time you even think of mentioning Kubernetes you are going to be consuming object. In the Disaster Recovery landscape we’ve had the capability for a few years now to send our archive or secondary copies of data to object “buckets,” as they are both traditionally cheaper than other cloud-based file systems and provide a much larger feature set. With their upcoming v12 release Veeam is going to provide the first iteration of their Backup & Replication product that can write directly to object storage, with no need for the first repository to be a Windows or Linux file system.
To focus specifically on the VBR v12 use case, many customers are going to choose to start dipping their toes into on-prem s3 compatible object storage. This can be as full featured as a Cloudian physical appliance or as open and flexible as a MinIO or Ceph based architecture. The point is that as Veeam’s and other enterprise tech’s needs for object storage mature, your systems will grow out of the decisions you make today, so it’s a good time to start learning about the technology and how to do the basics of management from an agnostic point of view.
So please excuse the long-windedness of this post as I dive into the whys and the hows of s3 compatible object storage.
Why Object Then?
Before we go further it’s worth taking a minute to talk about the reasons why these technologies are looking to object storage over the traditional block (NTFS, ReFS, XFS, etc.) options. First and foremost, it is designed to be a scale-out architecture. With block storage, while you can do things like create RAID arrays to join multiple disks, you aren’t really going to make a RAID across multiple servers. So for the backup use case, rather than be limited by the idea of a single server or have to use external constructs such as Veeam’s SOBR to stitch servers together, you can target an object storage gateway that then writes to a much more scalable, much more tunable infrastructure of storage servers underneath.
Beyond the scale-out you have a vast feature set. Things that we use every day such as file versioning, security ACLs, least-privilege design and the concept of immutability are extremely important in designing a secure storage system in today’s world, and most object storage systems are going to be able to provide these capabilities. Beyond this we can look at capabilities such as multi-region synchronization as a way to ensure that our data is secure and highly available.
Connecting to S3 or S3 Compatible Storage
Regardless of which client you are using, you are going to need 4 basic pieces of information to connect to the service at all.
Endpoint: This will be an internet-style https or http URL that defines the address of the gateway that your client will connect to.
Region: This defines the datacenter location within the service provider’s system that you will be storing data in. For example the default for AWS s3 is us-east-1 but can be any number of other locations based on your geography needs and the provider.
Access Key: this is one half of the credential set you will need to consume your service and is most akin to a username, or, if you are used to consuming Office 365, the AppID.
Secret Key: this is the other half and is essentially the generated password for the access key.
Regardless of the service, you will be consuming all of those parts. With things that have native AWS integration you may not necessarily be prompted for the endpoint, but be assured it’s being used.
To get started at connecting to a service and creating basic, no frills buckets you can look at some basic GUI clients such as CyberDuck for Windows or MacOS or WinSCP for Windows. Decent primers for using these can be found here and here.
Installing and Configuring AWS CLI Client
If you’ve ever used AWS S3 to create a bucket before, you are probably used to going to the console website and pointy-clicky creating a bucket, setting attributes, uploading files, etc. As we talk more and more about s3 compatible storage, that UI may or may not be there, and if it is, it may be wildly different than what you use at AWS because it’s a different interpretation of the protocol’s uses. What is consistent, and in some cases or situations may be your only option, is consuming s3 via the CLI or the API.
Probably the easiest and most common client for consuming s3 via CLI is the AWS CLI. This can easily be installed via the package manager of your choice, but for quick and easy access:
Windows via Chocolatey
choco install -y awscli
MacOS via Brew
brew install awscli
Once you have it installed you are going to need to interact with 2 local files in the .aws directory of your user profile: config and credentials. You can get these created by using the aws configure command. Further, the aws cli supports the concept of profiles, so you can create multiple connections and accounts. To get started you would simply use aws configure --profile obj-test, where obj-test is whatever name you want to use. This will then walk you through prompts for 3 of those 4 pieces of information: access key, secret key and default region. Just as an FYI, this command impacts 2 files within your user profile regardless of OS, ~/.aws/config and ~/.aws/credentials. These are worth reviewing after you configure to become familiar with the format and security implications.
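If you are curious, the two files end up looking roughly like this (the key values and region below are placeholders):
# ~/.aws/credentials
[obj-test]
aws_access_key_id = AKIAEXAMPLEKEY
aws_secret_access_key = notARealSecretKey
# ~/.aws/config
[profile obj-test]
region = us-east-1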
Getting Started with CLI
Now that we’ve got our CLI installed and authentication configured, let’s take a look at a few basic commands that will help you get started. As a reference, here are the living command references you will be using.
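Creating a basic, no-frills bucket looks something like this; the profile, endpoint, and bucket name are from my lab and purely examples (depending on your provider and region you may also need to pass a LocationConstraint):
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api create-bucket --bucket test-bucket-1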
Awesome! We’ve got our first bucket in our repository. That’s cool, but I want my bucket to be able to leverage this object lock capability Jim keeps going on about. To do that you use the same command but add the --object-lock-enabled-for-bucket parameter.
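In other words, something like this (again, the names are examples; test-bucket-2-locked is the bucket we’ll keep using through the rest of this post):
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api create-bucket --bucket test-bucket-2-locked --object-lock-enabled-for-bucket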
So yeah, good to go there. Next let’s dive into that s3api list-buckets command seen in the previous screenshot. Listing buckets is a good example for understanding that when you access s3 or s3 compatible storage you are really talking about 2 things: the s3 protocol commands and the s3api. For listing buckets you can use either the plain s3 command:
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3 ls
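or the s3api equivalent, which returns the same information as JSON:
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api list-buckets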
While these are similar, it’s worth noting the returns will not be the same. The ls command will return data much like it would in a standard Linux shell, while s3api list-buckets will return JSON formatted data by default.
Enough About Buckets, Give Me Data
So buckets are great, but they are nothing without data inside them. Let’s get to work writing objects.
Again, writing data can feel very similar if you are familiar with the *nix way of doing things. I can use s3 cp or s3 mv to copy or move data to my s3://test-bucket-2-locked/ bucket, to any other bucket I’ve created, or between them.
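For example, uploading a local file (the file name is just a placeholder):
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3 cp ./testfile.txt s3://test-bucket-2-locked/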
Now that we’ve written a couple of files, let’s look at what we have. Once again you can do the same actions via both methods; it’s just that the s3api way will consistently give you more information and more capability. Here’s the s3api way of listing the same objects.
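Something like this, using the same lab endpoint and profile as above:
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api list-objects --bucket test-bucket-2-locked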
Take note of a few things here. While the s3 ls command gives more traditional file-system-style output, s3api refers to each object by its entire “path,” known as the key. Essentially, object storage still presents the concept of a file and folder structure for our benefit, but it views each unique object as a single flat thing without a true tree. The key is also important because as we start to consider more advanced object storage capabilities such as object lock, encryption, etc., the key is often what you need to supply to complete the commands.
A Few Notes About Object Lock/Immutability
To round out this post let’s take a look at where we started, the why: immutability. Sure, we’ve enabled object lock on a bucket, but all that really does is enable versioning; it’s not enforcing anything. Before we get crazy with creating immutable objects it’s important to understand there are 2 modes of object lock:
Governance Mode – In Governance Mode users can write data and not be able to truly delete it as expected, but there are roles and permissions that can be set (and are inherited by root) that allow that to be overridden and data to be removed.
Compliance Mode – This is the firmer option where even the root account cannot remove data/versions and the retention period is hard set. Further, once a retention date is set on a given object you cannot shorten it in any way, only extend it.
Object lock is actually applied in one of two ways (or a mix of both): creating a policy and applying it to a bucket so that anything written to that bucket assumes that retention, or actually applying a retention period to an object itself, either while writing the object or after the fact.
Let’s start with applying a basic policy to a bucket. In this situation, for my test-bucket-2-locked bucket I’m going to enable compliance mode and then set retention to 21 days. A full breakdown of the formatting of the object-lock-configuration parameter and the options it provides can be found in the AWS documentation.
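A sketch of that command is below; the JSON structure follows the object-lock-configuration format in the AWS documentation, and the endpoint and profile remain my lab examples:
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api put-object-lock-configuration --bucket test-bucket-2-locked --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":21}}}'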
Cool. Now, to check that compliance we can simply use s3api get-object-lock-configuration against the bucket to check what we’ve done. I’ll note that for either the “put” above or the “get” below there is no plain s3 command equivalent; these are some of the more advanced features I’ve been going on about.
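That check looks like this against my example bucket:
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api get-object-lock-configuration --bucket test-bucket-2-locked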
Ok, so we’ve applied a baseline policy of compliance with a retention of 21 days to our bucket and confirmed that it’s set. Now let’s look at the objects within. You can view a particular object’s retention with the s3api get-object-retention command. As we are dealing with advanced features at the object level, you will need to capture the key for the object you want to test; if you’ll remember, we found those using the s3api list-objects command.
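For example, against the file uploaded earlier (the key shown is a placeholder; use whatever key list-objects returned for your object):
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api get-object-retention --bucket test-bucket-2-locked --key testfile.txt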
So as you can see, we have both the mode and the retention date set on the individual object. What if we wanted this particular object to have a different retention period than the bucket itself? Let’s put the s3api put-object-retention option to work and try to set that down to 14 days instead. While we use a general-purpose number of days when creating the bucket policy, when we set object-level retention it’s done by specifying the actual date stamp, so we’ll simply pick a day 14 days from today.
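Something along these lines, where the RetainUntilDate is just an example of a date 14 days out and the key is the same placeholder as before:
aws --profile jimtest --endpoint-url=https://us-central-1a.object.ilandcloud.com s3api put-object-retention --bucket test-bucket-2-locked --key testfile.txt --retention '{"Mode":"COMPLIANCE","RetainUntilDate":"2022-02-15T00:00:00Z"}'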
Doh! Remember what we said about compliance mode? That you cannot make the retention shorter than what was previously set? We are running into that here, and can see that the protection in fact works! Instead let’s try this again and set it to 22 days.
As you can see, now not only did we not get an error, but when you check the retention it shows the newly defined timestamp, so it definitely worked.
This feels like a good time to note that object locking is not the same as deletion protection. If I create an object lock enabled bucket and upload some objects to it, setting the object retention flag with the right info along the way, I am still going to be able to use a basic delete command to delete that file. In fact if I use CyberDuck or WinSCP to connect to my test bucket I can right click on any object there and successfully choose delete. What is happening under the covers is that a new version of that object is spawned, one with the delete marker applied to it. For standard clients that will appear that the data is gone but in reality it’s still there, it just needs to be restored to the previous version. In practice most of the UIs you are going to use to consume s3 compatible storage such as Veeam or developed consoles will recognize what is going on under the covers and essentially “block” you from executing the delete, but feel secure that as long as you have object lock enabled and the data is written with a retention date the data has not actually gone away and can be recovered.
All of this is a somewhat long-winded answer to the question “How does S3 Object Lock work?”, which Amazon has thoughtfully answered in this post. I recommend you give it a read.
Conclusion
In the end you are most likely NOT going to need to do all of the above steps via the command line. More likely you will be using some form of UI, be it Veeam Backup and Replication, the AWS console or that of your service provider, but it is very good to know how to do these things, especially if you are considering on-premises object storage as we move into this next evolution of IT and BCDR. Learning and testing the above is a relatively low-cost exercise, as most object services are literally pennies per GB, possibly plus egress data charges depending on your provider (hey AWS…), but it’s money well spent to get a better understanding.
If there is any part of being an IT professional that I actively dislike, it is the need to know the intricacies of how various vendors license their software products. As bad as it was when I was a “Geek of Many Hats” back when I worked in the SMB space, believe it or not it’s even worse in the Service Provider space. This is because when I typically have to get involved with licensing questions or tasks it’s because something has gone sideways for a customer organization or they are wanting to do “creative” things with their licensing dollars. I get it, I really do; I used to be there, and you have to get as much out of every budget dollar you possibly can, because with “Enterprise IT” at any scale those licensing spends usually equate to efficiencies or compliance check boxes, not products sold in a linear fashion like in other scenarios.
That said, today I have the joy of trying to clean up after a human error situation with Veeam Backup for O365. While many of you may know and/or love this product in a one-off scenario where the only organization you are protecting is your own, VBO is also a Service Provider product that runs at scale, such as at iland. While it is good for this, it’s a whole different world when you need to architect solutions where many organizations share the same backup infrastructure and are then billed for just their use. In this case we have a situation where a customer has licensed a very small subset of their much larger overall Office 365 organization for backup, but through any number of ways a job was created that captured the entire organization. This resulted in the given server pulling almost twice as many licensed users as the license file allows, so yeah, good times.
When talking about how Veeam licensing determines that a given user should be allotted a licensed seat, it is based on a) backup data existing on disk (or in objects) and b) the user actively being part of backup jobs. Both of these situations need to be cleared up before you can begin to actually remove licenses from users. Unfortunately, aside from actually deleting the jobs (which does not have an option to delete the data underneath them), very little of this process can be done in the UI; it has to be done via PowerShell. That said, here’s the basic process for purging all data related to a given organization. If you need to be more fine-tuned than that, let me know and I’ll write that up as well.
This tutorial assumes that you are not sharing named repositories between tenant organizations. If you are doing that PLEASE CONSIDER THIS A GOOD TIME TO RETHINK THAT DECISION. Let me just say from experience it’s bad man, it’s bad. If you need to do this with a shared repository I would recommend you give the fine folks at Veeam support a call and have them assist you.
Remove the jobs associated with the given organization. This should be as simple as selecting the organization in VBO, then selecting all jobs within it, right click, delete.
Next we need to ensure that all data associated with the given organization is purged. Luckily Niels Engelen has a handy script up on VeeamHub called VBO-ClearRepo.ps1 that will take a given repo name and purge all data from it. You should be able to take this and just feed it all the repositories that are relevant, over and over again, and it will purge the data.
Finally we need to go through and verify that all the licenses have been removed for the given organization. If the organization is of any size this has most likely not been cleaned up and will necessitate you doing so manually. Luckily it can be done in a relatively easy manner with this:
#Specify the organization to set scope
$org = Get-VBOOrganization -Name "myawesomeorg.onmicrosoft.com"
#Get a count of how many mailboxes are involved before
Get-VBOLicensedUser -organization $org | measure
#Purge all users for our given organization
Get-VBOLicensedUser -organization $org | Remove-VBOLicensedUser
In my experience this last cmdlet can take quite a while to run; for larger organizations I’m seeing a rate of about 10 licenses per minute, but your mileage may vary. There is most likely a faster way to do it, but it would probably involve hacking on the Config DB, something that should never be tried alone, but that’s up to your scenario.
In this post we are going to continue on our journey of building your own, full featured, Veeam Backup and Replication v10 environment. As a reminder of how this series is going here’s the list:
Episode 2: On-Prem Windows Components (VBR, Windows Proxy, Windows Repo)
Episode 3: On-Prem Linux Components (Proxy, Repo), Create Local Jobs
Episode 4: Build Service Provider pod
Episode 5: Create cloud jobs to both Service Provider and Copy Mode to S3
Episode 6: Veeam Availability Console
Episode 7: Veeam Backup for Office 365
In this installment we are going to focus on building out the Microsoft Windows side of your customer environment, the one that is deployed on-premises with the customer data. We will begin by deploying a Veeam Backup and Replication server, followed by setting up a Windows Proxy server and a Repository based on ReFS.
I was recently honored to be a guest on Vince Wood‘s IT Reality Podcast for episodes 7 and 8. One of the things that came up in those episodes is that while Veeam will work right out of the box for small environments, for most end customers of the software there is a requirement to scale the components out correctly. One of the advantages of Veeam Backup and Replication is that it is hardware and operating system agnostic for many of its components, but with choice comes complexity.
One of the other things that came up was the fact that for most Veeam Cloud Connect customers, myself included up to 5 months ago when I began working for a Service Provider, the way that side is set up is a mystery. While I won’t exactly be giving up the special sauce here, I was struck with the thought that a series of posts showing you how to correctly set up a Veeam Backup and Replication v10 environment, both on premises and for the Cloud Connect environment, would be pretty educational. Each of these will have video walkthroughs of the actual installations so you can follow along as well if you like.
In total we’ll be covering the following topics:
Episode 1: Intro and Common components
Episode 2: On-Prem Windows Components (VBR, Windows Proxy, Windows Repo)
Episode 3: On-Prem Linux Components (Proxy, Repo), Create Local Jobs
Episode 4: Build Service Provider pod
Episode 5: Create cloud jobs to both Service Provider and Copy Mode to S3
Episode 6: Veeam Availability Console
Episode 7: Veeam Backup for Office 365
In this episode we are mostly talking about setting up your external SQL server for Veeam use. The Veeam Backup and Replication installer includes MS SQL Server Express 2016 as part of the simple mode installation, but according to Veeam best practices you wouldn’t want to use this if you are backing up more than 500 instances, are extensively using tape, or want to leverage a common SQL product for other parts of the Veeam environment, including Enterprise Manager or VeeamONE.
Intro to Lab Environment Design
Blog Post Steps to Come
Mount SQL Server 2016 ISO
Run Setup.exe
Allow for Updates check before installing
Rules check, and open up Windows Firewall for communication, either by disabling it entirely or at least creating a port-based rule for TCP 1433 (see the example command after this list)
In Feature Selections choose Database Engine Services only unless you or your SQL administrator have other needs
Either name your SQL Instance or leave it with the default MSSQLSERVER
Based on need or security practices enable or leave disabled the SQL Server Agent and SQL Server Browser services
In Database Engine Configuration
Set your authentication mode. I prefer to leave it Windows only but there are valid use cases to choose Mixed mode and create SQL only credentials
Add SQL server administrator accounts. Domain Admins for small environments but also want to add the service account you plan to run Veeam services under
Set your Data Directories. For the lab I will leave default but in production you should create 3 additional volumes for your SQL Server, 1 each for databases, logs and backups
Hit Install
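For the firewall rule mentioned above, if you would rather script it than click through the GUI, something like the following from an elevated prompt should do the trick (the rule name is arbitrary):
netsh advfirewall firewall add rule name="SQL Server TCP 1433" dir=in action=allow protocol=TCP localport=1433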
Once completed you will be ready to proceed to installing your Veeam server itself. One thing that is omitted from the video but is probably a good idea is to go ahead and install MS SQL Server Management Studio. It’s a separate download, but often when you need to call Veeam support it is database related, and SSMS is the way to manage the databases. You can download it directly from Microsoft, but if you happen to have Chocolatey available on the server, installation is a choco install sql-server-management-studio -y away.
With that we’ll consider this step done and move on to installing the on-premises Windows components in the next post.
Hi there and welcome to koolaid.info! My name is Jim Jones, a Geek of Many Hats living in West Virginia.
This site was created for the purpose of being a locker full of all the handy things I’ve learned over the years, know I’m going to need again and know I’ll forget. It’s morphed a bit over the years as all things do but still that’s the main purpose. If you’d like to know more about me check out any of the social links at the top left of the site, I’m pretty much an open book.
If you’ve found this page I hope you find its contents helpful. Finally, anything written here is solely my own view and does not reflect those of my employer.