James O'Neill's Blog

May 23, 2016

Good and bad validation in PowerShell

Filed under: Powershell — jamesone111 @ 10:35 am
Tags:

I divide input validation into good and bad. image

Bad validation on the web makes you hate being a customer of a whichever organization. It’s the kind which says “Names can only contain alphabetic characters” so O’Neill isn’t a valid name.
Credit card companies think it’s easier to write blocks of 4 digits but how many web sites demand an unbroken string of 16 digits?

Good validation tolerates spaces and punctuation and also spots credit card numbers which are too short or don’t checksum properly and knows the apostrophe needs special handling. Although it requires the same care on the way out as on the way in as this message from Microsoft shows.
And bad validation can be good validation paired with an unhelpful message  – for example telling your new password you chose isn’t acceptable without saying what is.

In PowerShell, parameter declarations can include validation, but keep in mind validation is not automatically good.
Here’s good validation at work: I can write parameters like this. 
     [ValidateSet("None", "Info", "Warning", "Error")]
     [string]$Icon = "error"

PowerShell’s intellisense can complete values for the -Icon parameter, but if I’m determined to put an invalid value in here’s the error I get.
Cannot validate argument on parameter 'Icon'.
The argument "wibble" does not belong to the set "None,Info,Warning,Error" specified by the ValidateSet attribute.
Supply an argument that is in the set and then try the command again.

It might be a bit a verbose, but it’s clear what is wrong and what I have to do to put it right. But PowerShell builds its messages from templates and sometimes dropping in the text from the validation attribute gives something incomprehensible, like this 
Cannot validate argument on parameter 'Path'.
The argument "c:" does not match the "^\\\\\S*\\\S*$" pattern.
Supply an argument that matches "^\\\\\S*\\\S*$" and try the command again.

This is trying to use a regular expression to check for a UNC path to a share ( \\Server\Share), but when I used it in a conference talk none of 50 or 60 PowerShell experts could work that out quickly. And people without a grounding in regular expressions have no chance.
Moral: What is being checked is valid but to get a good message, do the test in the body of the function.

Recently I saw this – or something like it via a link from twitter.

function Get-info {
  [CmdletBinding()]
  Param (
          [string]$ComputerName
  )
  Get-WmiObject –ComputerName $ComputerName –Class 'Win32_OperatingSystem'
}

Immediately I can see too things wrong with the parameter.
First is “All parameters must have a type” syndrome. ComputerName is a string, right? Wrong! GetWmiObject allows an array of strings, most of the time you or I or the person who wrote the example will call it with a single string, but when a comma separated list is used the “Make sure this is a string” validation concatenates the items into a single string.
Moral. If a parameter is passed straight to something else, either copy the type from there or don’t specify a type at all.

And Second, because the parameter isn’t mandatory and doesn’t’ have a default, so if we run the function with no parameter, it calls Get-WmiObject with a null computer name, which causes an error. I encourage people to get in the habit of setting defaults for parameters.

The author of that article goes on to show that you can use a regular expression to validate the input. As I’ve shown already regular expression give unhelpful error messages, and writing comprehensive ones can be and art in itself in the example, the author used
  [ValidatePattern('^\w+$')]
But if I try
Get-info MyMachine.mydomain.com
Back comes a message to
Supply an argument that matches "^\w+$" and try the command again
The author specified only “word” characters (letters and digits), no dots, no hyphens and so on. The regular expression can be fixed, but as it becomes more complicated, the error message grows harder to understand.

He moves on to a better form of validation, PowerShell supports a validation script for parameters, like this
[ValidateScript({ Test-Connection -ComputerName $_ -Quiet -Count 1 })]
This is a better test, because it checks whether the target machine is pingable or not. But it is still let down by a bad error message.
The " Test-Connection -ComputerName $_ -Quiet -Count 1 " validation script for the argument with value "wibble" did not return a result of True.
Determine why the validation script failed, and then try the command again.

In various PowerShell talks I’ve said that a user should not have to understand the code inside a function in order to use the function. In this case the validation code is simple enough that someone working knowledge of PowerShell can figure out the problem but, again, to get a good message, do the test in the body seems good advice, in simple form the test would look like this
if (Test-Connection -ComputerName $ComputerName -Quiet -Count 1) {
        Get-WmiObject –ComputerName $ComputerName –Class 'Win32_OperatingSystem'
}
else {Write-Warning "Can't connect to $computername" }

But this doesn’t cope with multiple values in computer name – if any are valid the code runs so it would be better to run.
foreach ($c in $ComputerName) {
    if (Test-Connection -ComputerName $c -Quiet -Count 1 ) {
        Get-WmiObject –ComputerName $c –Class 'Win32_OperatingSystem'
    }
    else {Write-Warning "Can't connect to $c"}
}

This doesn’t support using “.” to mean “LocalHost” in Get-WmiObject – hopefully by now you can see the problem: validation strategies can either end up stopping things working which should work or the validation becomes a significant task. If a bad parameter can result in damage, then a lot validation might be appropriate. But this function changes nothing so there is no direct harm if it fails; and although the validation prevents some failures, it doesn’t guarantee the command will succeed. Firewall rules might allow ping but block an RPC call, or we might fail to logon and so on. In a function which uses the result of Get-WmiObject we need to check that result is valid before using it in something else. In other words, validating the output might be better than validating the input.

Note that I say “Might”: validating the output isn’t always better. Depending on the kind of things you write validating input might be best, most of the time. Think about validation rather than cranking it out while running on autopilot. And remember you have three duties to your users

  • Write help (very simple, comment-based help is fine) for the parameter saying what is acceptable and what is not. Often the act of writing “The computer name must only contain letters” will show you that you have lapsed into Bad validation
  • Make error messages understandable. One which requires the user to read code or decipher a regular expression isn’t, so be wary of using some of the built in validation options.
  • Don’t break things. Work the way the user expects to work. If commands which do similar things take a list of targets, don’t force a single one.
    If “.” works, support it.
    If your code uses SQL syntax where “%” is a wildcard, think about converting “*” to “%”, and doubling up single apostrophes (testing with my own surname is a huge help to me!)
    And if users want to enter redundant spaces and punctuation, it’s your job to remove them.

Blog at WordPress.com.