Tuesday, May 5, 2015

The 7th Guest: Placing eight queens on a chess board

On the return leg of my Narita to Sydney flight, I was passing time playing the classic DOS game, The 7th Guest. This game was one of the reasons I spent hundreds of dollars buying a CD-ROM drive! Today, it's available in the iTunes Store and the Google Play Store for a fraction of the cost. Today, the only way you'd spend a hundred dollars on this game would be from excess charges on a bad cellular plan.

The 7th Guest broke new ground in 1993: it was one of the first games to ship on a CD-ROM, and it used every megabyte available! You were treated to full motion video as you walked between the rooms of the beautiful haunted house, which was revolutionary at the time. Bill Gates once described it as "the new standard in interactive entertainment". Gamers around the world (including myself) scared themselves half to death as they wandered around the haunted house, solving puzzles and trying to unlock the mystery of the house.

Rated 15 and above. FOR A VERY GOOD REASON.
Enter the Queen's Puzzle: place 8 Queens on a standard 8x8 chess board such that they can't attack each other.

If you've read this blog, you'll notice I like chess. One of my old hobbies was writing a chess simulator in C.

Chess in C (Part 1)
Chess in C (Part 2) - Insert Pawn Pun Here
Chess in C (Part 3) - Rook, Rooks, Rookies, Wookies, same thing
Chess in C (Part 4) - I'm asking for input
Chess in C (Part 5) - Potential moves of a bishop: up-left, cardinal, pope

When I saw the Queen's Puzzle, my immediate thinking was to write an app that brute forced the solution. The solution space was fairly limited:

          1. Create a 8x8 board
          2. Place a Queen in position (x,y)
          3. Mark each square reachable by the Queen as attackable
          4. Iterate through the remainder of the board until you reach a square that cannot be attacked
          5. Place a Queen in this square
          6. Go to step 4 and repeat until there are 8 Queens on the board.

For step 2, position (x,y) would start as (1,1).
For step 4, the next square that could not be attacked would be position (x+2, y+1). So, if the first Queen is in (1,1), the next Queen would be placed in (3,2).

Unfortunately, I was on a plane and didn't have access to an IDE so I simulated with pen and paper.

Solving problems the old fashion way: pen, paper and swearing.
Queens placed at (1,1), (2,3), (3,5), (4,7), (5,2), (6,4), (7,6) and DARN IT!
Close, but no cigar! Only seven Queens fit. The algorithm fails at step 4: there are no squares that cannot be attacked. I refined the algorithm with two more steps:

          7. Clear the board
          8. Go to step 2, and place a Queen in the next available square.

This meant that instead of placing the Queen in position (1,1), placing it in position (1,2).

Great success!
I solved it and the returned to the next 7th Guest puzzle: swapping the position of 8 bishops on a 4x5 board. That puzzle was AWFUL.

You want to know what's worse than flying 10 hours on a budget carrier that hates you?
More on that in a later blog post.

But, being stuck on 10 hour NRT-SYD flight I thought...what would happen if the chess board was 3D and had a Z-dimension? If you can place 8 Queens on a chessboard of size 8x8, how many Queens can you place on a chessboard of size x-y-z? There is such thing as 3D chess: one of the more common configurations is the Raumschach board which is a 5x5x5 board. The inventor believed that chess should be like warfare: you can be attacked from the plane you are on, but also from above (aerial) and below (underwater).

Board size reduced from 8x8, otherwise you'd spend
months figuring out whether your move was legal.
I started by drawing a 8x8x3 board to get a ballpark idea of the complexity of the problem. Then I placed the 8 Queens on the top layer, and drew the possible attack spaces throughout the other layers.

After diagramming, it becomes clear that there are lots of places for a Queen to hide on an 8x8x3 board. While the Queen can move diagonally over a Z dimension, it has a weakness: the further you are away on the Z dimension, the more clear spots appear. And it's at that point I fell asleep and enjoyed the rest of my flight. The moral of the story: if you need to burn time on a flight, The 7th Guest as a great time waster. But if you want to have hair when you depart the plane, download the strategy guide as well.

Monday, January 26, 2015

Automating Certificate Signing Requests (CSR) generation for Dell iDRAC

I've been trying to get Puppet to automate the issuing of certificates to the iDRAC (Dell Remote Access Controller) for PowerEdge servers. One of the problems with Dell iDRACs is that on a certain batch of servers, the default key length was too short (1024 bit), rather than the minimum key length required by most Issuing Certificate Authorities (2048 bit).

Bumping the Certificate Signing Request (CSR) key length to 2048 bits requires the use of the racadm.exe utility: there is no way to change the CSR key length from the iDRAC UI, at least not in version 7.

Here are the steps you'll need to automate the generation of CSRs for all new servers that identify themselves as Dell.

Changing the CSR cryptographic key length size

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrKeySize 2048

Changing the CSR Common Name

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrCommonName "dellServer.myCloud.local"

Changing the CSR Organisation Name

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrOrganizationName "BURGER BURGER BURGER Pty Ltd"

Changing the CSR Organization Unit

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrOrganizationUnit "Security Operations"

Changing the CSR Locality

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrLocalityName "Sydney"

Changing the CSR State

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrStateName "NSW"

Changing the CSR Country

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrCountryCode "AU"

Changing the CSR e-mail address

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" config -g cfgRacSecurity -o cfgRacSecCsrEmailAddr "pki@burgerburgerburger.com"

Resetting the iDRAC unit

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" racreset soft

Generating the CSR

racadm.exe -r dellServer.myCloud.local -u "DRAC_USERNAME" -p "DRAC_PASSWORD" sslcsrgen -g -f "C:\temp\dellServer.mycloud.local.csr"

Once you've approved the CSR, you'll get a nice minted certificate you can use to eliminate those pesky iDRAC errors. It should look something like this (if you're using Chrome on OSX)

I've blacked out the Issuing CA details, but all the details in the certificate Subject Name
match with the script above.

Some other areas that you may want to automate in your environment include
  • Configuration of SNMP (for hardware alerting)
  • Uploading the certificate
  • Renaming the default iDRAC user account and setting a strong password
  • Disabling features that are not required
  • Changing the default IPMI key
Remember, once you've automated it for one server, the next 1000 servers are easy!

One caveat: I think iDRAC is unstable or has a memory leak: generating a Certificate Signing Request (CSR) only works reliably if you reset the iDRAC beforehand. Once I added this step in, the CSR generation process became more reliable.

Friday, January 23, 2015

Certificate Templates not appearing in Windows Server 2012 R2-based Microsoft Certificate Authority (CertUtil error 0x80070057)

You may have created some certificate templates in your Microsoft Certificate Authority (CA), such as a template for your VMware hosts. Derek Seaman has a good blog post on the exact settings and extensions required.

After creating a certificate template, I had a problem enabling it in the CA. While the certificate template appeared in the Certificate Templates console, it couldn't be enabled. The certificate template just wasn't appearing in the Certification Authority MMC snapin.

It appears in Certificate Templates..

...but you can't enable it. Because it doesn't appear.
IT JUST DOESN'T APPEAR. WHY??!?!?!?!?!?!?!

I tried using the certutil.exe command to enable the certificate template manually

certutil.exe -SetCATemplates VMware-SSL

Unfortunately, same problem: certificate template wasn't enabled, but this time I got a deceptive and nonsensical error message complaining that the "parameter" was "incorrect".

CertUtil: -SetCATemplates command FAILED: 0x80070057 (WIN32: 87 ERROR_INVALID_PARAMETER).
CertUtil: The parameter is incorrect.

When you create a certificate template, it needs time to replicate to all domain controllers. A certificate template is just another object in Active Directory, just like a user or computer account. So if the certificate template doesn't appear immediately, just wait the same amount of time you'd wait for a user to replicate across your DCs.

Back to our problem: why isn't the certificate template appearing? Well, it turns out that every online certificate enrolment service has to have contacted Active Directory and downloaded the certificate templates before it can be enabled. If you've previously configured an issuing CA and then destroyed it without cleaning up its entries, you'll never be able to enable the certificate template.

Performing a cleanup of issuing CAs in Active Directory Certificate Services

It's ADSI Edit Time!

Open ADSI Edit and connect to the Configuration context.

Select a well known Naming Context like Configuration, or Paul, or Jimmy.
If you see the names of OUs, you connected to the wrong context.

Navigate to CN=Certification Authorities,CN=Public Key Services,CN=Services,CN=Configuration in your domain.

CN=Certification Authorities contains your root CAs and CN=Enrollment Services contains your issuing CAs. If there are any extra CAs listed that no longer exist, you'll need to delete them.

In my case, I had an additional issuing CA in CN=Enrollment Services that no longer existed. When I deleted the CA, I could enable the template.


But I want to do everything from the command line because I want to use Server Core in the future.

Now you understand why the original error message "The Parameter is incorrect" is deceptive.
This is the same command that was run last time.

Unfortunately there are no event log error messages for this error. Microsoft just expect you to figure it out.

Monday, January 12, 2015

Errors installing VMware ESXi dump collector: it's probably your complex password!

The VMware ESXi dump collector installer has some vague error messages.

Error 1: Login failed due to a bad user name or password.

"Login probably failed due to a bad user name or password" would
be a more accurate error message.

  • The username is incorrect.I'm assuming you've verified the username and password are correct. If you haven't done this, try logging into Windows with the credentials and see if they are valid.
  • The user account does not permission in vCenter.Ensure the user account has permissions in vCenter. For the duration of troubleshooting, you may wish to give the user account administrative access. If you look at the netdump-reg-debug.txt file, you can see the following error.

    ERROR:ndreg-app:error: cannot connect to VC -- (vim.fault.NoPermission) {
       dynamicType = <unset>,
       dynamicProperty = (vmodl.DynamicProperty) [],
       msg = 'Permission to perform this operation was denied.',
       faultCause = <unset>,
       faultMessage = (vmodl.LocalizableMessage) [],
       object = 'vim.Folder:group-d1',
       privilegeId = 'System.View'
  • Your password contains the special character "VMware haven't escaped parameters correctly. Remove the " from your password and try again.

Error 2: Error 29457. A specified parameter was not correct.

Of the 30,000 error messages, I received error 29457.


  • Your password contains the character ;Fool me once, shame on you. If you look at the vminst.txt log file, you'll see something like

    esxiInstUtil: 01/12/15 13:06:12 ExecuteCmd::Cmd:  --register --address "vc.cloudlab.local" --user "svcvmwaredump@cloudlab.local" --password "*****" -s "vUq<~[" --thumbprint "C:\ProgramData\VMware\VMware ESXi Dump Collector\vmconfig-netdump.xml"

    The passwords I use are generated by a password management tool, which makes long non-sensical passwords with lots of special characters like !@#$%^&*();. Unfortunately, VMware haven't properly escaped the password field so installation will fail if the password contains the character ;

    In this case, my password was JI@$QH$7*@eie$Hhg8;vUq<~[. The installer thinks my password is JI@$QH$7*@eie$Hhg8, ignored the ;  and has left the vUq<~[ dangling.

Monday, December 1, 2014

Upgrading Dell PowerEdge R710 firmware without an OS installed (how hard could it be?!)

I've been automating the firmware update process for the Dell PowerEdge R710-series of servers. The intent of this automation is to ensure that all servers in the data centre have the exact same firmware levels, and to ensure that the automated installation of VMware ESXi on the servers will successfully complete without human intervention.

Before automating this process, I first had to understand how the manual Dell firmware update process was performed. I was disappointed to find that the firmware update process for Dell servers was poorly documented, not reliably reproducible (the anathema of scripting and process automation) and simply downright buggy.

The process was not as straightforward as I thought it would be: how hard could it be to update the firmware of a commodity Dell server? Well, it turns out that many Dell R710s ship with an expired Lifecycle Manager certificate, which prevents the application of Dell updates signed after a certain date! The process involved:

1) Updating the iDRAC firmware
2) Updating the expired Lifecycle Manager certificate using a Lifecycle Manager Repair Package
3) Updating other firmware within the server

There are bugs in the installation of Dell Update Packages (DUPs). If at first they don't apply, just try again! I've pointed out where this occurs to help you script around it. It's fairly disappointing from Dell: after eleven generations of servers, Dell still haven't figured out how to streamline the firmware update process. Oh well.

To proceed, you'll need to have made an update repository using the Dell Repository Manager.

Step 1. Download the latest iDRAC6 firmware

If you go to the iDRAC6 page on the Dell TechCenter, you'll have a choice between downloading a monolithic or blade version of iDRAC. Because you are upgrading firmware on an R710 (rackmount), you'll want the monolothic version. Monolithic is Dell's term for standalone server as opposed to blade server.

The latest version of the Dell iDRAC 6 is v1.98 and the filename is firmimg.d6. You can download it here.

Step 2. Download the Lifecycle Manager Repair Package (only for Dell R710)

If you have a Dell PowerEdge R710, the certificates used by the Dell Lifecycle Manager have expired. Lifecycle Manager is a component on Dell servers that manages the application of firmware updates to the BIOS, motherboard, network adapters, et cetera. If you try to apply any updates without applying the Lifecycle Manager Repair Package, you'll get the error message "The updates you are trying to apply are not Dell-authorized updates."

The latest version of the Dell Repair Package is V 1.5.5, A0 and the filename is BDF_1.5.5_BIN-12.usc. You can download it here.

Step 3. Update the iDRAC firmware

The iDRAC firmware needs to be updated to at least 1.97 so the Lifecycle Manager Repair Package can be applied. Updating iDRAC firmware can be done remotely or via the console (if you feel like freezing to death in your data centre/server closet/broom closet).

Step 3.1. Log into the iDRAC

If you don't know the password for your Dell iDRAC, try the default password combination.
Username: root
Password: calvin
I'm not sure who Calvin at Dell is. I might check on LinkedIn later when I'm waiting 40 minutes for a firmware update to complete.

Step 3.2. In the iDRAC, click iDRAC Settings (in the left menu bar)

On this page, verify the iDRAC firmware version.

Step 3.3. Click on the Update tab

For the record, Google Chrome on Mac works for uploading files.

Step 3.4. Select the iDRAC update package.

Click Choose File, and select the iDRAC update package downloaded in step 1.
The latest iDRAC 6 update package should be called firmimg.d6

Step 3.5. Confirm the old and new version, then click Next

Verify that the New Version is newer than the Current Version, then click Next.

Step 3.6. Wait for the iDRAC Firmware Image to be updated

This typically takes less than 5 minutes. After the iDRAC firmware is updated, the iDRAC will restart and may become unresponsive for a minute. You will need to login again.

Step 3.7. Verify the new iDRAC version has been installed

Once the firmware update is complete, log into the iDRAC again and verify that the existing iDRAC version matches the new version.

Step 4. Repair the Lifecycle Manager (for R710 only)

Updating the Lifecycle Manager will allow you to apply firmware updates to the rest of the system. You must have an iDRAC firmware version of at least 1.97 to continue.

Step 4.1. Upload the Lifecycle Repair Package

In the iDRAC interface, go to the Firmware Update screen and upload the Lifecycle Repair package. The filename should be BDF_1.5.5_BIN-12.usc.

Step 4.2. Confirm the package name

The package name should be System Services Recovery Image. Click Next to continue.

Step 4.3. Confirm upload

Click OK to proceed with the update.

Step 4.4. Wait for the Lifecycle Manager to update

It is common for the update to be stuck at 10% for approximately 3-4 minutes.

Step 4.5. If the upload fails, restart the iDRAC.

It is common for the update to fail. If this is the case, try applying the update multiple times. It is not uncommon for the update to take 3-4 attempts. If applying the update still fails, restart the iDRAC and try again. The link to restart the iDRAC is in the Quick Links section on the System Summary page.

Step 4.6. Complete update

When the update is complete, leave the iDRAC open. You may need to use it.

Step 5. Update the remainder of the server firmware

Lifecycle Controller allows you to update the other firmware in the server. This includes
  • Diagnostic utilities
  • Dell Lifecycle Controller
  • BIOS
  • PERC 6/i Integrated (Embedded)
  • Broadcom NetXtreme II Gigabit Ethernet (Embedded)
Here's an image of the typical firmware components that can be upgraded on a Dell server.

Step 5.1. Boot the server to the Unified Server Configurator

When the server is booting, press F10 to boot to the Unified Server Configurator. Dell also labels this as System Services.
If you have pressed F10 in time, you will see the message Entering System Services. To cancel, enter the IDRAC6 Configuration Utility
You can skip the memory test by pressing Esc.

Step 5.2. Wait for Unified Server Configurator to start

This can take several minutes.

Step 5.3. Start the Platform Update

You see the message reading "Warning: A system update is recommended since some components are potentially out of date. Please go to Platform Update to view and run availabile updates."? It's useless. It always appears due to a bug in the way Dell compares version numbers for the PERC 6/i.
At the Unified Server Configurator screen, click Platform Update.

Step 5.4. Launch the Platform Update

On the Platform Update screen, click Launch Platform Update.

Step 5.5. Select the update repository source

If you have a small number of servers (less than 5), it is easier to update via USB. Updating via FTP server or network share is possible, but introduces complexity: there needs to be appropriate network connectivity and credentials configured.

Step 5.6. Select the source

You need to have a repository file or folder that contains all the Dell updates relevant to your server. Repositories are created using Dell Repository Manager.

Step 5.7. Confirm use of the existing catalog file

This error is normal and will appear for any ISO created by the Dell Repository Manager. Click Yes to continue.

Step 5.8. Wait for the image to be verified

This can take up to 2 minutes. They're not lying.

Step 5.9. Review the list of firmware updates to be applied

When you have reviewed the list of firmware updates being applied, click Apply to begin.

Step 5.10. Wait for all Dell Update Packages (DUP) to be copied and verified

Step 5.11. Wait for the updates to be applied

This can take up to 45 minutes. The elapsed time may freeze: this is normal. During this process, there will be multiple reboots. Do not interrupt the reboots. You may click Esc to cancel the memory test during the reboots to speed the process.

Step 5.12. Wait while the server reboots multiple times

During the reboots, the screen may be blank for several minutes. This is normal.

Step 5.13. Wait to be returned to the Unified Server Configurator screen

Wait to be returned to the Unified Server Configurator screen.

Step 5.14. Verify that all updates have been applied

When all updates have been applied, the server will return to the Unified Server Configurator screen. You can verify that updates have been applied by comparing the Current version with the Available version. These should be the same, with the exception of the PERC 6/i Integrated (Embedded). Due to a bug in the way Dell compares the versions, it will appear as requiring an update (the PERC 6/i reports it version as, while the update package has the version 6.3.3-0002 which it thinks is older). A messaging saying everything is up to date would have been nice, but hey, that'd require a focus on the user experience!

If all the updates have been applied successfully, click the Cancel button.

Step 5.15. Exit the USC

At the Unified Server Configurator screen, click Exit and Reboot to boot the server normally.

Step 5.16 Confirm the exit

Click Yes to exit the USC.

And there you have it: an updated Dell PowerEdge R710 server! Next step: automate it.

Tuesday, March 11, 2014

An irreverent look at VMware's Software-Defined Data Centre (SDDC)

The intent of this blog post is to explain the SDDC in plain language. I get a lot of questions about SDDC so I'll address them here in an irreverent manner and hopefully you'll find it entertaining or educational. Preferably both, but I'll settle for the former.

What the heck is the Software-Defined Data Centre?
The Software-Defined Data Centre is VMware's strategy for delivering data centre services as a set of capabilities implemented in software. In VMware's SDDC vision, compute is delivered with vSphere, networking is delivered with NSX, management is delivered with vCenter and vCloud, and storage with vSAN. The SDDC is distinctly different from competing data centre architectures where network capabilities (such as VLANs, security, load balancing, etc) and storage capabilities (VMDK storage, storage replication, storage availability, etc.) are implemented in hardware.

The goal of the SDDC is to deliver a "fully automated, zero-downtime infrastructure for any application, and any hardware, now and in the future". While it is possible to deliver these goals in hardware (using orchestration and integration), VMware believe that software is a more appropriate mechanism and delivers higher levels of flexibility. And I tend to agree.

The SDDC consists of green and blue boxes.
Can I buy the SDDC?
The SDDC is a state your data centre can achieve, rather than a product. Don't worry, you'll be buying VMware licenses as your data centre matures from an SDDC 1.0 "basic virtualization" state to an SDDC 3.0 "Fully Cloud Ready" state. As you progress through your SDDC journey, you'll be buying licenses to unlock the capabilities your data centre requires (whether it be multi-tenancy, chargeback, self-service). If in doubt, just buy the vCloud Enterprise Suite.

Ignore the word "SAP" on the slide. I did and my life improved.
rom VMware Consulting blog article SDDC + SAP = CapEx/OpEx Savings)
Is the SDDC cheaper?
In most cases, SDDC will reduce and shift spending. Virtualization of servers and network devices can result in incredible reductions in capital and operational spending. For organisations transitioning to an SDDC model, network and storage infrastructure refresh spending will shift to vendors which support SSDC. An example is Nutanix customers who have consolidated their storage and compute spending into "converged infrastructure" spending. Another example is Amazon Web Services (AWS) using SDN to slash a $1b Cisco spend to $11m.

Sorry Cisco.
The other benefit of the SDDC is the increased agility of the IT organisation: people can actually get the infrastructure they need, when they need it. A case could be made that the capability and flexibility of AWS is not feasible to implement in hardware.

Where does cloud fit in with SDN?
The SDDC is one method of achieving cloud. The NIST definition of cloud computing includes on-demand self-service, broad network access, resource pooling, rapid elasticity and measured service. As long as the service you provide has those qualities, you have a cloud (regardless of the underlying technology). In fact, it's entirely possible to implement "as a service" offerings without any virtualization at all (I'd hate to do it though!). VMware believe the easiest way for enterprises to provide a cloud-like service is to pursue an SDDC architecture.
There's more than one way to implement an SDDC architecture.
Let's not talk about the other ways.

Isn't the SDDC just server virtualization?
Server virtualization is one component of the SDDC and it's neat for delivering more virtual servers with less spending and management overhead. But the delivery chain is only as strong as the weakest link: delivering a server in 10 minutes is of no use if it takes two weeks for firewall changes to be applied to make the server active. Provisioning a VM is just one part of delivering usable infrastructure.

So to deliver the network quicker, we virtualize the network?
Yes. This is known as Software-Defined Networking (SDN). Generally speaking, data centre capabilities which exist purely in software are more flexible, simpler, easier to test and can be integrated more seamlessly than hardware-defined solutions. This is also true with networks: many existing network architectures are device-centric and don't easily provide the provisioning flexibility and ease of integration required to implement on-demand cloud services such as rapid spin-up and teardown of networks.

Because software-defined networking solutions aren't constrained by physical network topology and are more programmable, more cloud-style flexible and programmatic approaches to networks are possible. This enables data centres to become less device-centric and more service-centric. VMware's SDN product is called VMware NSX.

But network devices can be orchestrated to provide what I need!
An alternative to SDN is to use an orchestration system to orchestrate VM and network changes (an example could be updating the perimeter firewall when a VM is provisioning/deprovisioning, or the ability to spin-up a new test network). If the orchestration system is implemented well, you'll get the same result as the SDDC: infrastructure services delivered quickly. If it isn't, you'll have a Rube Goldberg frankencloud. I'm not discounting the completeness or capability of physical network devices over SDN, I'm saying that SDN enables organisations to provide network capabilities (such as firewalls, site-to-site VPN, load balancing) in the hypervisor (which is more flexible and cost-effective) rather than the physical network.

A market-leading orchestration platform.
Why should I virtualize storage?
While vSAN has amazing infrastructure benefits (which I'll outline in another blog post), the strategic importance of vSAN is for storage to be managed with same flexibility and integration as compute. Storage today is a pain: storage administrators are either struggling to keep up with providing the amount of storage the data centre needs, and they're struggling to manage it. The presentation of "as a Service" IT models which enable the business to consume IT more easily make this problem worse. Instead of trying to optimise your storage procurement, provisioning and management processes, vSAN allows you to manage them the same way you would your compute capacity. When you run out of storage, simply buy another server.

But storage can be orchestrated today using robust interfaces provided by storage vendors!
Yes, it can. The majority of storage vendors have SDKs you can use to enable integration with orchestration tools or monitoring tools. If you already have this level of integration in your environment, you are already experiencing the benefits of the SDDC. If you are struggling with integration, or find that your home-grown integration doesn't deliver the feature completeness present with out-of-the-box solutions such as vSAN, it may be worth pursuing another strategy. Implementing technologies (like VAAI and VASA) which bring storage closer to compute aren't as easy as they should be. With the amazing capabilities of SANs, it feels strange that configuring array integration requires reading 30 pages guides, deploying vApps, create service accounts, configuring certificates, etc. You don't need to worry about any of this with vSAN, or any hyper-converged infrastructure. It just works seamlessly.

I followed a 32 page guide, submitted two firewall change requests, one storage change,
and one VMware change so that VASA provider would provide
a single concatenated string of disk capabilities. I guess it's a start.

Physical SANs are more fully featured than vSAN.
Horses for courses. Tradeoffs are involved with all data centre architectural decisions. In the majority of cases, choosing vSAN over a traditional physical SAN will be involve a tradeoff between features and seamless integration. Some customers may consider the lack of a deduplication capability in vSAN to be a glaring omission. Other customers are willing to choose vSAN over a physical SAN for the ease of management. I expect that over time, VMware will make vSAN feature-competitive with physical offerings (as they already have with VMware NSX and physical networks).

How will I know when I achieve the SDDC?
The CEO of VMware will personally hand you a key which will unlock over 600 airport lounges worldwide. SDDC is the journey and delivery of IT as a Service is the destination. Just because a data centre uses an SDDC architecture doesn't mean it's any good; it could be atrocious!

There's all the other usual KPIs for measuring success: amount of administrators per VM, current versus historical infrastructure spend, turnaround time on VM/firewall change request, etc. A good barometer of your ability to deliver IT as a Service is the stress level of project managers whose projects require IT infrastructure. In every organisation I've worked in, project managers are acutely aware of the lead times for delivery of IT infrastructure. Buy them a coffee and ask what they think about delivery of IT infrastructure. Another barometer is whether your developers use Amazon Web Services. Buy them a coffee as well, but understand that they'll likely not admit to using AWS!