James O'Neill's Blog

August 9, 2012

Getting to the data in Adobe Lightroom–with or without PowerShell

Filed under: Databases / SQL,Photography,Powershell — jamesone111 @ 7:01 am

Some Adobe software infuriates me (Flash), I don’t like their PDF reader and use Foxit instead, and apps which use Adobe AIR always seem to leak memory. But I love Lightroom. It does things right – like installations – which other Adobe products get wrong. It maintains a “library” of pictures and creates virtual folders of images (“collections”), but it maintains metadata in the image files so data stays with pictures when they are copied somewhere else – something some other programs still get badly wrong. My workflow with Lightroom goes something like this.

  1. If I expect to manipulate the image at all I set the cameras to save in RAW, DNG format not JPG (with my scuba diving camera I use CHDK to get the ability to save in DNG)
  2. Shoot pictures – delete any where the camera was pointing at the floor, lens cap was on, studio flash didn’t fire etc. But otherwise don’t edit in the camera.
  3. Copy everything to the computer – usually I create a folder for a set of pictures and put DNG files into a “RAW” subfolder. I keep full memory cards in filing sleeves meant for 35mm slides.
  4. Using PowerShell I replace the IMG prefix with something which tells me what the pictures are but keeps the camera assigned image number. 
  5. Import Pictures into Lightroom – manipulate them and export to the parent folder of the “RAW” one. Make any prints from inside Lightroom. Delete “dud” images from the Lightroom catalog.
  6. Move dud images out of the RAW folder to their own folder. Backup everything. Twice. [I’ve only recently learnt to export the Lightroom catalog information to keep the manipulations with the files]
  7. Remove RAW images from my hard disk

There is one major pain. How do I know which files I have deleted in Lightroom? I don’t want to delete them from the hard disk; I want to move them later. It turns out Lightroom uses a SQLite database and there is a free Windows ODBC driver for SQLite available for download. With this in place one can create an ODBC data source, point it at a Lightroom catalog and poke about with the data. Want a complete listing of your Lightroom data in Excel? ODBC is the answer. But let me issue these warnings:

  • Lightroom locks the database files exclusively – you can’t use the ODBC driver and Lightroom at the same time. If something else is holding the files open, Lightroom won’t start.
  • The ODBC driver can run UPDATE queries to change the data: do I need to say that is dangerous ? Good.
  • There’s no support for this. If it goes wrong, expect Adobe support to say “You did WHAT ?” and start asking about your backups. Don’t come to me either. You can work from a copy of the data if you don’t want to risk having to fall back to one of the backups Lightroom makes automatically.

I was interested in 4 sets of data, shown in the following diagrams. Below is image information with the associated metadata and file information. Lightroom stores images in the Adobe_images table; IPTC and EXIF metadata link to images – their “image” field joins to the “id_local” primary key in images. Images have a “root file” (in the AgLibraryFile table) which links to a library folder (AgLibraryFolder), which is expressed as a path from a root folder (AgLibraryRootFolder table). The link always goes to the “id_local” field. I could get information about the folders imported into the catalog just by querying these last two tables (outlined in red).


The SQL to fetch this data looks like this for just the folders:
SELECT RootFolder.absolutePath || Folder.pathFromRoot AS FullName
FROM   AgLibraryFolder     Folder
JOIN   AgLibraryRootFolder RootFolder ON RootFolder.id_local = Folder.rootFolder
ORDER BY FullName

I have left AS out of the table aliases in the FROM part of the SELECT statement, since not every dialect of SQL accepts it there. Since I run this in PowerShell I also put a WHERE clause in, which inserts a parameter. To get all the metadata the query looks like this:
SELECT    rootFolder.absolutePath || folder.pathFromRoot || rootfile.baseName || '.' || rootfile.extension AS fullName, 
          LensRef.value AS Lens,     image.id_global,       colorLabels,                Camera.Value       AS cameraModel,
          fileFormat,                fileHeight,            fileWidth,                  orientation ,
          captureTime,               dateDay,               dateMonth,                  dateYear,
          hasGPS ,                   gpsLatitude,           gpsLongitude,               flashFired,
          focalLength,               isoSpeedRating ,       caption,                    copyright
FROM      AgLibraryIPTC              IPTC
JOIN      Adobe_images               image      ON      image.id_local = IPTC.image
JOIN      AgLibraryFile              rootFile   ON   rootfile.id_local = image.rootFile
JOIN      AgLibraryFolder            folder     ON     folder.id_local = rootfile.folder
JOIN      AgLibraryRootFolder        rootFolder ON rootFolder.id_local = folder.rootFolder
JOIN      AgharvestedExifMetadata    metadata   ON      image.id_local = metadata.image
LEFT JOIN AgInternedExifLens         LensRef    ON    LensRef.id_Local = metadata.lensRef
LEFT JOIN AgInternedExifCameraModel  Camera     ON     Camera.id_local = metadata.cameraModelRef

Note that since some images don’t have a camera or lens logged, the joins to those tables need to be LEFT joins, not inner joins. Again the version I use in PowerShell has a WHERE clause which inserts a parameter.
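The WHERE clause itself is not shown here, so the following is only a sketch of how such a parameter might be inserted and how the result might be run through ODBC. The function names New-LightroomFolderQuery and Invoke-OdbcQuery, and the data source name "LightRoom", are assumptions for illustration and are not taken from the shared code.

```powershell
# Hypothetical helper: builds the folder query, optionally restricted by a search term.
function New-LightroomFolderQuery {
    param ([string]$Include)
    $sql = "SELECT RootFolder.absolutePath || Folder.pathFromRoot AS FullName " +
           "FROM   AgLibraryFolder     Folder " +
           "JOIN   AgLibraryRootFolder RootFolder ON RootFolder.id_local = Folder.rootFolder "
    if ($Include) {
        # Double any single quotes so the term cannot break out of the SQL string literal
        $term = $Include -replace "'", "''"
        $sql += "WHERE (RootFolder.absolutePath || Folder.pathFromRoot) LIKE '%$term%' "
    }
    $sql + "ORDER BY FullName"
}

# Hypothetical helper: runs a query against an ODBC data source and returns the rows.
# Assumes a DSN (here called "LightRoom") has been created and points at a catalog
# which Lightroom does not currently have open.
function Invoke-OdbcQuery {
    param ([string]$Sql, [string]$Dsn = "LightRoom")
    $adapter = New-Object System.Data.Odbc.OdbcDataAdapter -ArgumentList $Sql, "DSN=$Dsn;"
    $table   = New-Object System.Data.DataTable
    [void]$adapter.Fill($table)
    $table.Rows
}

# Building the query needs no database, so it can be inspected safely
New-LightroomFolderQuery -Include "dive"
```

With a data source in place, something like Invoke-OdbcQuery -Sql (New-LightroomFolderQuery -Include "dive") would return one row per matching folder. Working from a copy of the catalog keeps the experiment away from the live data.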

OK, so much for file data – the other data I wanted was about collections. The list of collections is in just one table (AgLibraryCollection) so it is very easy to query, but I also wanted to know the images in each collection.


Since one image can be in many collections, and each collection holds many images, AgLibraryCollectionImage is a table which provides a many-to-many relationship. Different tables might be attached to Adobe_images depending on what information one wants about the images in a collection. I’m interested only in mapping files on disk to collections in Lightroom, so I have linked to the file information and I have a query like this.

SELECT   Collection.name AS CollectionName ,
         RootFolder.absolutePath || Folder.pathFromRoot || RootFile.baseName || '.' || RootFile.extension AS FullName
FROM     AgLibraryCollection Collection
JOIN     AgLibraryCollectionimage cimage     ON collection.id_local = cimage.Collection
JOIN     Adobe_images             Image      ON      Image.id_local = cimage.image
JOIN     AgLibraryFile            RootFile   ON   Rootfile.id_local = image.rootFile
JOIN     AgLibraryFolder          Folder     ON     folder.id_local = RootFile.folder
JOIN     AgLibraryRootFolder      RootFolder ON RootFolder.id_local = Folder.rootFolder
ORDER BY CollectionName, FullName

Once I have an ODBC driver (or an OLE DB driver) I have a ready-made PowerShell template for getting data from the data source. So I wrote functions to let me do :
Get-LightRoomItem -ListFolders -include $pwd
To List folders, below the current one, which are in the LightRoom Library
Get-LightRoomItem  -include "dive"
To list files in LightRoom Library where the path contains  "dive" in the folder or filename
Get-LightRoomItem | Group-Object -no -Property "Lens" | sort count | ft -a count,name
To produce a summary of lightroom items by lens used. And
$paths = (Get-LightRoomItem -include "$pwd%dng" | select -ExpandProperty path)  ;   dir *.dng |
           where {$paths -notcontains $_.FullName} | move -Destination scrap -whatif

Stores paths of Lightroom items in the current folder ending in .DNG in $paths; then gets the .DNG files in the current folder and moves those which are not in $paths (i.e. not in Lightroom). Specifying -WhatIf allows the files to be confirmed before being moved.

Get-LightRoomCollection
To list all collections
Get-LightRoomCollectionItem -include musicians | copy -Destination e:\raw\musicians
To copy the original files in the “musicians” collection to another disk

I’ve shared the PowerShell code on Skydrive

August 7, 2012

The cloud, passwords, and problems of trust and reliance

Filed under: Privacy,Security and Malware — jamesone111 @ 9:02 pm

In recent days a story has been emerging of a guy called Mat Honan. Mat got hacked; the hackers wanted his Twitter account simply because he had a three letter Twitter name. Along the way they wiped his Google mail account and (via Apple’s iCloud) his iPhone, iPad and his MacBook. Since he relied on stuff being backed up in the cloud he lost irreplaceable family photos, and lord only knows what else. There are two possible reactions: Schadenfreude – “Ha, ha I don’t rely on Google or Apple, look what happens to people who do” or “What an idiot, not having a backup” – or “There but for the grace of God goes any of us”.

Only people who’ve never lost data can feel unsympathetic to Mat, and I’ve lost data. I’ve known tapes which couldn’t be read on a new unit after the old one was destroyed in a fire. I’ve learnt by way of a disk crash that a server wasn’t running its backups correctly. I’ve gone back to optical media which couldn’t be read. My backup drive failed a while back – though fortunately everything on it existed somewhere else; making a new backup showed me in just how many places. I’ve had memory cards fail in the camera before I had copied the data off them, and I had some photos which existed only on a laptop and a memory card which were in the same bag that got stolen (the laptop had been backed up the day before the photos were taken). The spare memory card I carry on my key-ring failed recently, and I carry that because I’ve turned up to shoot photos with no memory card in the camera – never close the door on the camera with the battery or memory card out. I treat memory cards like film and just buy more and keep the old cards as a backstop copy. So my data practices look like a mixture of paranoia and superstition and I know, deep down, that nothing is infallible.

For many of us everything we have in the cloud comes down to one password. I don’t mean that we log on everywhere with “Secret1066!” (no, not my password). But most of us have one or perhaps two email addresses which we use when we register. I have one password which I use on many, many sites which require me to create an identity, but that identity doesn’t secure anything meaningful to me. It doesn’t meet the rules of some sites (and I get increasingly cross with sites which define their own standards for passwords) and on those sites I will set a one-off password, like “2dayisTuesday!”; when I come to use the site again I’ll just ask them to reset my password. Anything I have in the cloud is only as secure as my email password.
There are some hints here. First: any site which can mail you your current password doesn’t encrypt it properly; the proper way to store passwords is as something computed from the password, so it is only possible to tell if the right password was entered, not what the password is. And second, these computations are case sensitive and set no maximum password length, so any site which is case insensitive or limits password length probably doesn’t have your details properly secured. Such sites are out there – Tesco for example – and if we want to use them we have to put up with their security. However if they get hacked (and you do have to ask: if they can’t keep passwords securely, what other weaknesses are there?) your user name, email and password are in the hands of the hackers, so you had better use different credentials anywhere security matters – which of course means on your mailbox.
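As an illustration of storing "something computed from the password", here is a minimal sketch using the PBKDF2 key derivation which .NET exposes as Rfc2898DeriveBytes. The function names, the 16-byte salt and the iteration count are illustrative assumptions, not a description of how any particular site works.

```powershell
# Hypothetical helper: derives a salted hash. Only the salt, the iteration count and
# the hash would be stored; the password itself cannot be read back from them.
function New-PasswordHash {
    param ([string]$Password, [byte[]]$Salt, [int]$Iterations = 100000)
    if (-not $Salt) {
        # No salt supplied, so create a random one for a new password
        $Salt = New-Object byte[] 16
        $rng  = [System.Security.Cryptography.RandomNumberGenerator]::Create()
        $rng.GetBytes($Salt)
    }
    $kdf = New-Object System.Security.Cryptography.Rfc2898DeriveBytes -ArgumentList $Password, $Salt, $Iterations
    New-Object psobject -Property @{
        Salt       = [Convert]::ToBase64String($Salt)
        Iterations = $Iterations
        Hash       = [Convert]::ToBase64String($kdf.GetBytes(32))
    }
}

# Hypothetical helper: repeats the computation with the stored salt and compares results.
function Test-PasswordHash {
    param ([string]$Password, $Stored)
    $salt  = [Convert]::FromBase64String($Stored.Salt)
    $check = New-PasswordHash -Password $Password -Salt $salt -Iterations $Stored.Iterations
    # -ceq is the case-sensitive comparison; Base64 text differs by case
    $check.Hash -ceq $Stored.Hash
}

$stored = New-PasswordHash -Password "2dayisTuesday!"
Test-PasswordHash -Password "2dayisTuesday!" -Stored $stored
Test-PasswordHash -Password "2DAYISTUESDAY!" -Stored $stored
```

Because the whole password is fed through the computation, its length and case both affect the result, which is why a length limit or case insensitivity hints that something else is being stored.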

So your email password is the one password to rule them all and obviously needs to be secure. But there is a weak link, and that seems to be where the people who hacked Mat found a scary loophole. The easiest way into someone’s mailbox might be to get an administrator to reset the password over the phone – not to guess or brute force it. The only time I had my password reset at Microsoft the new one was left on my voicemail – so I had to be able to log in to that. If the provider texts the password to a mobile phone or resets it (say) to the town where you were born (without saying what it is) that offers a level of protection; but – be honest – do you know what it takes to get someone at your provider to reset your password, or what the protocol is? In Mat’s case the provider was Apple – for whom the hacker knew an exploitable weakness – but it would be naive to think that Apple was uniquely vulnerable.

Mat’s pain may show the risk in having only a mailbox provider’s password reset policy to keep a hacker out of your computer and/or your (only) backup. One can build up a fear of other things that stop you having access to either computer or backup without knowing how realistic they are. I like knowing that my last few phones could be wiped easily, but would I want remote wipe of a laptop? When my laptop was stolen there wasn’t any need to wipe it remotely as it had full volume encryption with Microsoft’s BitLocker (saving me a difficult conversation with corporate security) and after this story I’ll stick to that. Cloud storage does give me off-site backup and that’s valuable – it won’t be affected if I have a fire or flood at home – but I will continue to put my faith in traditional off-line backup and I’ve just ordered more disk capacity for that.

July 31, 2012

Rotating pictures from my portfolio on the Windows 7 Logon screen

Filed under: Photography,Powershell — jamesone111 @ 12:15 pm

In the last 3 posts I outlined my Get-IndexedItem function for accessing Windows Search. The more stuff I have on my computers the harder it is to find a way of classifying it so it fits into hierarchical folders: the internet would be unusable without search, and above a certain number of items local stuff is too. Once I got search I started leaving most mail in my Inbox and Outlook uses search to find what I want; I have one “book” in OneNote with a handful of sections and if I can’t remember where I put something, search comes to the rescue. I take the time to tag photos so that I don’t have to worry too much about finding a folder structure to put them in. So I’ll tag geographically (I only have a few pictures from India – one three week trip, so India gets one tag but UK pictures get divided by county, and in counties with many pictures I put something like Berkshire/Reading. Various tools will make a hierarchy with Berkshire then Newbury, Reading etc). People get tagged by name – Friends and Family being a prefix to group those – and so on. I use Windows’ star ratings to tag pictures I like – whether I took them or not – and Windows “use top rated pictures” for the Desktop background picks those up. I also have a tag of “Portfolio”.

Ages ago I wrote about Customizing the Windows 7 logon screen. So I had the idea “Why not find pictures with the Portfolio tag, and make them logon backgrounds?” Another old post covers PowerShell tools for manipulating images, so I could write a script to do it, and use Windows scheduled tasks to run that script each time I unlocked the computer so that the next time I went to the logon screen I would have a different picture. That was the genesis of Get-IndexedItem. And I’ve added it, together with the New-LogonBackground function, to the image module download on the Technet Script Center.

If you read that old post you’ll see one of the things we depend on is setting a registry key so the function checks that registry key is set and writes a warning if it isn’t:

$key = "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Authentication\LogonUI\Background"
if ( (Get-ItemProperty -Path $key).oembackground -ne 1) {
        Write-Warning "Registry Key OEMBackground under $key needs to be set to 1"
        Write-Warning "Run AS ADMINISTRATOR with -SetRegistry to set the key and try again."
}

So if the registry value isn’t set to 1, the function prints a warning which tells the user to run with –SetRegistry. After testing this multiple times – I found changing the Windows theme resets the value – and forgetting to run PowerShell with elevated permissions, I put in a try / catch to pick this up and say “Run Elevated”. Just as a side note here, I always find when I write try/catch it doesn’t work, and it takes me a moment to remember that catch works on terminating errors and the command you want to catch usually needs –ErrorAction Stop.

if ($SetRegistry ) {
  try  { Set-ItemProperty -Name oembackground -Value 1 -ErrorAction Stop `
               -PATH "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Authentication\LogonUI\Background"
       }
  catch [System.Security.SecurityException]{
     Write-Warning "Permission Denied - you need to run as administrator"
       }
}
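The point about terminating errors can be seen in isolation with a file which does not exist. This is a small sketch rather than part of the module; Test-CatchRuns is a hypothetical name, and Get-Item merely stands in for any cmdlet whose errors are normally non-terminating.

```powershell
# Hypothetical demonstration: the same failing command, run with two error actions.
function Test-CatchRuns {
    param ([string]$Action)
    # A path in the temp folder which should not exist
    $missing = Join-Path ([System.IO.Path]::GetTempPath()) "no-such-file-$(Get-Random).txt"
    try   { Get-Item -Path $missing -ErrorAction $Action | Out-Null
            "not caught" }      # reached only if the error did not terminate the try block
    catch { "caught" }
}

Test-CatchRuns -Action SilentlyContinue   # the error is non-terminating, so catch is skipped
Test-CatchRuns -Action Stop               # the error becomes terminating and catch runs
```

The first call returns "not caught" and the second returns "caught", which is the difference that –ErrorAction Stop makes to the Set-ItemProperty call above.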

The function also tests that it can write to the directory where the images are stored, since this doesn’t normally have user access: if it can’t write a file, it tells the user to set the permissions. Instead of using try/catch here I use $? to see if the previous command was successful:
Set-content -ErrorAction "Silentlycontinue" -Path "$env:windir\System32\oobe\Info\Backgrounds\testFile.txt" `
              -Value "This file was created to test for write access. It is safe to remove"
if (-not $?) {write-warning "Can't create files in $env:windir\System32\oobe\Info\Backgrounds please set permissions and try again"}
else         {Remove-Item -Path "$env:windir\System32\oobe\Info\Backgrounds\testFile.txt"}

The next step is to find the size of the monitor. Fortunately, there is a WMI object for that, but not all monitor sizes are supported as bitmap sizes, so the function takes –Width and –Height parameters. If these aren’t specified it gets the values from WMI and allows for a couple of special cases – my testing has not been exhaustive, so other resolutions may need special handling. The width and height determine the filename for the bitmap, and later the function checks the aspect ratio – so it doesn’t try to crop a portrait image to fit a landscape monitor.

if (-not($width -and $height)) {
    $mymonitor = Get-WmiObject Win32_DesktopMonitor -Filter "availability = '3'" | select -First 1
    $width, $height = $mymonitor.ScreenWidth, $mymonitor.ScreenHeight
    if ($width -eq 1366) {$width = 1360}
    if (($width -eq 1920) -and ($height -eq 1080)) {$width,$height = 1360,768}
}
if (@("768x1280" ,"900x1440" ,"960x1280" ,"1024x1280" ,"1280x1024" ,"1024x768" , "1280x960" ,"1600x1200",
      "1440x900" ,"1920x1200" ,"1280x768" ,"1360x768") -notcontains "$($width)x$($height)" ) {
    write-warning "Screen resolution is not one of the defaults. You may need to specify width and height"
}
$MonitorAspect = $Width / $height
$SaveName = "$env:windir\System32\oobe\Info\Backgrounds\Background$($width)x$height.jpg"
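The mapping from resolution to file name can be tried out without touching WMI or the Windows folder. The following is a sketch only: Get-BackgroundName is a hypothetical helper which repeats the special cases and the list of sizes from the code above and reports what it would do.

```powershell
# Hypothetical helper: applies the same special cases as above and returns the file
# name, whether the size is in the list of defaults, and the aspect ratio.
function Get-BackgroundName {
    param ([int]$Width, [int]$Height,
           [string]$Folder = "$env:windir\System32\oobe\Info\Backgrounds")
    if ($Width -eq 1366) { $Width = 1360 }
    if (($Width -eq 1920) -and ($Height -eq 1080)) { $Width, $Height = 1360, 768 }
    $defaults = "768x1280", "900x1440", "960x1280", "1024x1280", "1280x1024", "1024x768",
                "1280x960", "1600x1200", "1440x900", "1920x1200", "1280x768", "1360x768"
    $size = "$($Width)x$($Height)"
    New-Object psobject -Property @{
        Name      = Join-Path $Folder "Background$size.jpg"
        Supported = ($defaults -contains $size)
        Aspect    = $Width / $Height
    }
}

Get-BackgroundName -Width 1920 -Height 1080    # mapped to the 1360x768 file name
Get-BackgroundName -Width 800  -Height 600     # Supported is False: not one of the defaults
```

Keeping the mapping in one small function like this makes it easy to add another special case when a new resolution turns out to need one.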

The next step is to get the image – Get-Image is part of the PowerShell tools for manipulating images.

$myimage = Get-IndexedItem -path $path -recurse -Filter "Kind=Picture","keywords='$keyword'",
                            "store=File","width >= $width ","height >= $height " |
                      where-object {($_.width -gt $_.height) -eq ($width -gt $height)} |
                      get-random | get-image

Get-IndexedItem looks for files in the folder specified by the –Path parameter, which defaults to [system.environment]::GetFolderPath( [system.environment+specialFolder]::MyPictures ) – the approved way to find the "My Pictures" folder. -Recurse tells it to look in sub-folders, and it looks for a file with keywords which match the –Keyword parameter (which defaults to “Portfolio”). It filters out pictures which are smaller than the screen and then Where-Object filters the list down to those which have the right aspect ratio. Finally one image is selected at random and piped into Get-Image.
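The Where-Object test is compact, so it may help to see it working on made-up data. In this sketch the three objects are hypothetical stand-ins for the items which Get-IndexedItem would return; only the filter expression is taken from the code above.

```powershell
$width, $height = 1360, 768      # a landscape screen

# Made-up candidates standing in for search results
$candidates = @(
    New-Object psobject -Property @{Name = "landscape.jpg"; Width = 3000; Height = 2000}
    New-Object psobject -Property @{Name = "portrait.jpg";  Width = 2000; Height = 3000}
    New-Object psobject -Property @{Name = "panorama.jpg";  Width = 6000; Height = 1500}
)

# "Is the picture wider than it is tall?" must give the same answer as
# "is the screen wider than it is tall?" for the picture to be kept
$matching = $candidates |
    Where-Object {($_.Width -gt $_.Height) -eq ($width -gt $height)}

$matching | ForEach-Object { $_.Name }
```

Comparing two Boolean results with -eq means the same line works unchanged for a portrait screen, where only portrait.jpg would pass.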

If this is successful, the function logs what it is doing to the event log. I set up a new log source “PSLogonBackground” in the application log by running PowerShell as administrator and using the command:
New-EventLog -Source PSLogonBackground -LogName application
Then my script can use that as a source – since I don’t want to bother the user if the log isn’t configured I use -ErrorAction silentlycontinue here
write-eventlog -logname Application -source PSLogonBackground -eventID 31365 -ErrorAction silentlycontinue `
                -message "Loaded $($myImage.FullName) [ $($myImage.Width) x $($myImage.Height) ]"


The next thing the function does is to apply cropping and scaling image filters from the original image module as required to get the image to the right size. When it has done that it tries to save the file, by applying a conversion filter and saving the result. The initial JPEG quality is passed as a parameter; if the file is too big, the function loops round reducing the JPEG quality until the file fits into the 250KB limit, and logs the result to the event log.

Set-ImageFilter -filter (Add-ConversionFilter -typeName "JPG" -quality $JPEGQuality -passThru) -image $myimage -Save $saveName
$item = get-item $saveName
while ($item.length -ge 250kb -and ($JPEGQuality -ge 15) ) {
      $JPEGQuality -= 5
      Write-warning "File too big - Setting Quality to $Jpegquality and trying again"
      Set-ImageFilter -filter (Add-ConversionFilter -typeName "JPG" -quality $JPEGQuality -passThru) -image $myimage -Save $saveName
      $item = get-item $saveName
}
if ($item.length -le 250KB) {
      write-eventlog -logname Application -source PSLogonBackground -ErrorAction silentlycontinue `
           -eventID 31366 -message "Saved $($Item.FullName) : size $($Item.length)"
}
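The shape of that loop is easier to follow without the image handling. In the sketch below Get-SimulatedSize is a made-up stand-in for "save the file and measure it" (it simply pretends the size is proportional to the quality), and Find-JpegQuality is a hypothetical name for the loop itself.

```powershell
# Made-up stand-in for saving the JPEG: pretends each quality point costs 4KB
function Get-SimulatedSize { param ([int]$Quality) $Quality * 4kb }

# Hypothetical helper: steps the quality down by 5 until the size fits, or until the
# quality has dropped below 15, which is where the loop gives up
function Find-JpegQuality {
    param ([int]$JPEGQuality = 90, [int]$Limit = 250kb)
    $size = Get-SimulatedSize -Quality $JPEGQuality
    while ($size -ge $Limit -and $JPEGQuality -ge 15) {
        $JPEGQuality -= 5
        $size = Get-SimulatedSize -Quality $JPEGQuality
    }
    New-Object psobject -Property @{Quality = $JPEGQuality; Size = $size; Fits = ($size -lt $Limit)}
}

Find-JpegQuality                  # stops once the simulated file is under the limit
Find-JpegQuality -Limit 1kb       # can never fit, so it gives up with Fits = False
```

The second condition in the while statement matters: without a floor on the quality, an image which can never be squeezed under the limit would keep the loop running with ever lower, eventually meaningless, quality values.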


That’s it. Download the module, remove the “Internet block” on the zip file, expand the files into \users\YourUserName\windowsPowerShell\modules, and try running New-LogonBackground (with –Verbose to get extra information if you wish).
If the permissions on the folder have been set and the registry key is set, pressing [Ctrl]+[Alt]+[Del] should reveal a new image. You might want to use a different keyword or a different path, or start by trying to use a higher JPEG quality, in which case you can run it with parameters as needed.

Then it is a matter of setting up the scheduled task: here are the settings from my scheduler




The program here is the full path to Powershell.exe and the parameters box contains
-noprofile -windowstyle hidden -command "Import-Module Image; New-logonBackground"

Lock, unlock and my background changes. Perfect. It’s a nice talking point and a puzzle – sometimes people like the pictures (although someone said one of a graveyard was morbid) – and sometimes they wonder how the background they can see is not only not the standard one but not the one they saw previously.

June 30, 2012

Using the Windows index to search from PowerShell: Part three. Better function output

Filed under: Powershell — jamesone111 @ 10:53 am

Note: this was originally written for the Hey, Scripting Guy! blog where it appeared as the 27 June 2012 episode. The code is available for download. I have some more index related posts coming up so I wanted to make sure everything was in one place.

In part one, I introduced a function which queries the Windows Index using filter parameters like

  • "Contains(*,'Stingray')"
  • "System.Keywords = 'Portfolio' "
  • "System.Photo.CameraManufacturer LIKE 'CAN%' "
  • "System.image.horizontalSize > 1024"

In part two, I showed how these parameters could be simplified to

  • Stingray (A word on its own becomes a contains term)
  • Keyword=Portfolio (Keyword, without the S, is an alias for System.Keywords and quotes will be added automatically)
  • CameraManufacturer=CAN* (* will become %, and = will become LIKE, quotes will be added and CameraManufacturer will be prefixed with System.Photo)
  • Width > 1024 (Width is an alias for System.image.horizontalsize, and quotes are not added around numbers).

There is one remaining issue. PowerShell is designed so that one command’s output becomes another’s input. This function isn’t going to do much with piped input: I can’t see another command spitting out search terms for this one, nor can I see multiple paths being piped in. But the majority of items found by a search will be files, and so it should be possible to treat them like files, piping them into Copy-Item or whatever.
The following was my first attempt at transforming the data rows into something more helpful:

$Provider= "Provider=Search.CollatorDSO; Extended Properties='Application=Windows';"
$adapter = new-object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$ds      = new-object system.data.dataset
if ($adapter.Fill($ds))
    { foreach ($row in $ds.Tables[0])
            {if ($row."System.ItemUrl" -match "^file:")
                  {$obj = New-Object psobject -Property @{
                                Path = (($row."System.ItemUrl" -replace "^file:","") -replace "\/","\")}}
             Else {$obj = New-Object psobject -Property @{Path = $row."System.ItemUrl"}}
             Add-Member -force -Input $obj -Name "ToString" -MemberType "scriptmethod" -Value {$this.path}
             # Add one note property per non-empty column, named after the last part of the column name
             foreach ($prop in (Get-Member -InputObject $row -MemberType property |
                                    where-object {$row."$($_.name)" -isnot [system.dbnull] }))
                  { Add-Member -ErrorAction "SilentlyContinue" -InputObject $obj -MemberType NoteProperty `
                               -Name (($prop.name -split "\." )[-1]) -Value $row."$($prop.name)"
                  }
             # Add the short aliases for any of those properties which are present
             foreach ($prop in ($PropertyAliases.Keys |
                                    Where-Object {$row."$($propertyAliases.$_)" -isnot [system.dbnull] }))
                  { Add-Member -ErrorAction "SilentlyContinue" -InputObject $obj -MemberType AliasProperty `
                               -Name $prop -Value (($propertyAliases.$prop -split "\." )[-1])
                  }
             $obj
            }
    }
This is where the function spends most of its time, looping through the data creating a custom object for each row; non-file items are given a path property which holds the System.ItemURL property; for files the ItemUrl is processed into normal format (rather than file:c/users/james format) – in many cases the item can be piped into another command successfully if it just has a Path property.

Then, for each property (database column) in the row a member is added to the custom object with a shortened version of the property name and the value (assuming the column isn’t empty).
Next, alias properties are added using the definitions in $PropertyAliases.
Finally some standard members get added. In this version I’ve pared it down to a single method, because several things expect to be able to get the path for a file by calling its tostring() method.
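The three steps described above can be tried on a single made-up row without running a search. In this sketch a hashtable stands in for the data row, and the two-entry $propertyAliases table is a hypothetical subset of the real one; the Add-Member calls follow the same pattern as the function.

```powershell
# Hypothetical subset of the alias table, and a made-up row of search results
$propertyAliases = @{Width = "System.Image.HorizontalSize"; Keyword = "System.Keywords"}
$row = @{
    "System.ItemUrl"              = "file:C:/Users/james/Pictures/stingray.jpg"
    "System.Image.HorizontalSize" = 4000
    "System.Keywords"             = "Portfolio"
}

# 1. Turn the item URL into an ordinary path and give the object a ToString() method
$obj = New-Object psobject -Property @{
    Path = (($row."System.ItemUrl" -replace "^file:","") -replace "\/","\")}
Add-Member -Force -InputObject $obj -Name "ToString" -MemberType ScriptMethod -Value {$this.Path}

# 2. One note property per column, named after the last part of the column name
foreach ($name in $row.Keys) {
    Add-Member -InputObject $obj -MemberType NoteProperty `
               -Name (($name -split "\.")[-1]) -Value $row[$name]
}

# 3. Alias properties which point at the shortened names
foreach ($alias in $propertyAliases.Keys) {
    Add-Member -InputObject $obj -MemberType AliasProperty `
               -Name $alias -Value (($propertyAliases[$alias] -split "\.")[-1])
}

$obj | Format-List Path, HorizontalSize, Width, Keywords, Keyword
```

Width and HorizontalSize show the same value because the alias is only another name for the note property, so either can be used in a later Where-Object or Sort-Object.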

When I had all of this working I tried to get clever. I added aliases for all the properties which normally appear on a System.IO.FileInfo object and even tried fooling PowerShell’s formatting system into treating my file items as a file object, something that only needs one extra line of code
$Obj.psobject.typenames.insert(0, "SYSTEM.IO.FILEINFO")
Pretending a custom object is actually another type seems dangerous, but everything I tried seemed happy provided the right properties were present. The formatting worked except for the "Mode" column. I found the method which calculates .Mode for FILEINFO objects, but it needs a real FILEINFO object. It was easy enough to get one – I had the path and it only needs a call to Get-Item – but I realized that if I was getting a FILEINFO object anywhere in the process, then it made more sense to add extra properties to that object and dispense with the custom object. I added an extra switch, -NoFiles, to suppress this behaviour.
So the code then became:
$Provider ="Provider=Search.CollatorDSO; Extended Properties='Application=Windows';"
$adapter  = new-object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$ds       = new-object system.data.dataset
if ($adapter.Fill($ds))
     { foreach ($row in $ds.Tables[0])
            { if (($row."System.ItemUrl" -match "^file:") -and (-not $NoFiles))
                 {$obj = Get-item -Path (($row."System.ItemUrl" -replace "^file:","") -replace "\/","\")}
            Else {$obj = New-Object psobject -Property @{Path = $row."System.ItemUrl"}
                  Add-Member -force -Input $obj -Name "ToString" -MemberType "scriptmethod" -Value {$this.path}
                 }
              # the loops which add properties and aliases to $obj follow here, as in the first version
            }
     }
The initial code was 36 lines, making the user input more friendly took it to 60 lines, and the output added about another 35 lines, bringing the total to 95.
There were 4 other kinds of output I wanted to produce:

  • Help. I added comment-based-help with plenty of examples. It runs to 75 lines making it the biggest constituent in the finished product.
    In addition I have 50 lines that are comments or blank for readability as insurance against trying to understand what those regular expressions do in a few months’ time – but there are only 100 lines of actual code.
  • A –list switch which lists the long and short names for the fields (including aliases)
  • Support for the –Debug switch – because so many things might go wrong, I have Write-Debug $SQL immediately before I carry out the query, and to enable it I have
    [CmdletBinding()] before I declare the parameters.
  • A –Value switch which uses the GROUP ON… OVER… search syntax so I can see what the possible values are in a column.
    GROUP ON queries are unusual because they fill the dataset with TWO tables.
    GROUP ON System.kind OVER ( SELECT STATEMENT) will produce something like this as the first table.

SYSTEM.KIND   Chapter
-----------   -------
communication 0
document      1
email         2
folder        3
link          4
music         5
picture       6
program       7
recordedtv    8

The second table is the normal data suitably sorted. In this case it has all the requested fields grouped by kind plus one named "Chapter", which ties into the first table. I’m not really interested in the second table but the first helps me know if I should enter "Kind=image", "Kind=Photo" or "Kind=Picture"

I have a Select-List function which I use in my configurator and Hyper-V library, and with this I can choose which recorded TV program to watch, first selecting by title, and then if there is more than one episode, by episode.
$t=(Get-IndexedItem -Value "title" -filter "kind=recordedtv" -recurse |
            Select-List -Property title).title
start (Get-IndexedItem -filter "kind=recordedtv","title='$t'" -path |

In a couple of follow up posts I’ll show some of the places I use Get-IndexedItem. But for now feel free to download the code and experiment with it.

Using the Windows index to search from PowerShell:Part Two – Helping with user input

Filed under: Uncategorized — jamesone111 @ 10:46 am

Note: this was originally written for the Hey, Scripting Guy! blog where it appeared as the 26 June 2012 episode. The code is available for download. I have some more index related posts coming up so I wanted to make sure everything was in one place.

In part one I developed a working PowerShell function to query the Windows index. It outputs data rows, which isn’t the ideal behaviour, and I’ll address that in part three; in this part I’ll address another drawback: search terms passed as parameters to the function must be "SQL-Ready". I think that makes for a bad user experience, so I’m going to look at the half dozen bits of logic I added to allow my function to process input which is a little more human. Regular expressions are the way to recognize text which must be changed, and I’ll pay particular attention to those as I know a lot of people find them daunting.

Replace * with %

SQL statements use % for wildcard, but selecting files at the command prompt traditionally uses *. It’s a simple matter to replace – but for the need to "escape" the * character, replacing * with % would be as simple as a –replace statement gets:
$Filter = $Filter -replace "\*","%"
For some reason I’m never sure if the camera maker is Canon or Cannon so I’d rather search for Can*… or rather Can%, and that replace operation will turn "CameraManufacturer=Can*" into "CameraManufacturer=Can%". It’s worth noting that –replace is just as happy to process an array of strings in $filter as it is to process one.
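As a minimal sketch of that last point, the snippet below runs the same replace over an array. The two filter terms are made-up inputs, chosen only for illustration.

```powershell
# Hypothetical input: two filter terms as a user might type them.
$Filter = "CameraManufacturer=Can*", "Title=Dive*"

# -replace is applied to every element of the array in turn,
# so the result is still an array of two strings.
$Filter = $Filter -replace "\*", "%"

$Filter
```

Each element comes back with its own * turned into %, so nothing extra is needed when -Filter receives several terms.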

Searching for a term across all fields uses "CONTAINS (*,'Stingray')", and if the -replace operation changes * to % inside a CONTAINS() the result is no longer a valid SQL statement. So the regular expression needs to be a little more sophisticated, using a "negative look behind":
$Filter = $Filter -replace "(?<!\(\s*)\*","%"

In order to filter out cases like CONTAINS(*… , the new regular expression qualifies "Match on *" with a look behind – "(?<!\(\s*)" – which says "if it isn’t immediately preceded by an opening bracket and any spaces". In regular expression syntax (?=x) says "look ahead for x" and (?<=x) says "look behind for x"; (?!x) is "look ahead for anything EXCEPT x" and (?<!x) is "look behind for anything EXCEPT x". These will see a lot of use in this function. Here (?<! ) is being used; the open bracket needs to be escaped so is written as \( and \s* means 0 or more spaces.
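A short sketch of the negative look behind in action. The three sample terms are invented for the illustration; the pattern is the one described above.

```powershell
# Sample terms (hypothetical): one plain wildcard, two CONTAINS() searches.
$terms = "CameraManufacturer=Can*",
         "CONTAINS(*,'Stingray')",
         "CONTAINS( *,'Sting*')"

# Replace * with % unless the * comes straight after "(" and optional spaces.
$converted = $terms -replace "(?<!\(\s*)\*", "%"

$converted
```

The * that follows the opening bracket survives in both CONTAINS() terms, while the wildcard inside 'Sting*' is still converted because it is not preceded by a bracket.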

Convert "orphan" search terms into ‘contains’ conditions.

A term that needs to be wrapped as a "CONTAINS" search can be identified by the absence of quote marks, = , < or > signs or the LIKE, CONTAINS or FREETEXT search predicates. When these are present the search term is left alone, otherwise it goes into CONTAINS, like this.
$filter = ($filter | ForEach-Object {
    if  ($_ -match "'|=|<|>|like|contains|freetext") {$_}
    else   {"Contains(*,'$_')"}
})
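To see the effect on real input, here is a small sketch with three made-up terms: one bare word and two that already contain an operator or predicate.

```powershell
# Hypothetical filter terms: only the first is an "orphan".
$Filter = "stingray", "CameraManufacturer LIKE 'Can%'", "Width > 1024"

$Filter = $Filter | ForEach-Object {
    # Terms with a quote, an operator or a search predicate pass through unchanged.
    if ($_ -match "'|=|<|>|like|contains|freetext") { $_ }
    # Anything else is treated as free text to search for in every field.
    else { "Contains(*,'$_')" }
}

$Filter
```

Because -match is a plain substring test here, a bare word that happens to contain "like" (for example "unlike") also passes the check and is left unwrapped.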

Put quotes in if the user omits them.

The next thing I check for is omitted quote marks. I said I wanted to be able to use Can*, and we’ve seen it changed to Can% but the search term needs to be transformed into "CameraManufacturer=’Can%’ ". Here is a –replace operation to do that.
$Filter = $Filter -replace "\s*(=|<|>|like)\s*([^'\d][^\s']*)$",' $1 ''$2'' '
This is a more complex regular expression which takes a few moments to understand

Regular expression: \s*(=|<|>|like)\s*([^'\d][^\s']*)$

  \s*             Any spaces (or none)
  (=|<|>|like)    = or < or > or "Like"
  [^'\d]          Anything which is NOT a ' character or a digit
  [^\s']*         Any number of non-quote, non-space characters (or none)
  $               End of line
  ( )             Capture the enclosed sections as matches:
                  $Matches[0] = "=Can%"
                  $Matches[1] = "="
                  $Matches[2] = "Can%"

Replacement: ' $1 ''$2'' '

  Replace $Matches[0] ("=Can%") with an expression which uses the two submatches "=" and "Can%", giving:  = 'Can%'

Note that the expression which is being inserted uses $1 and $2 to mean matches[1] and [2] – if this is wrapped in double quote marks PowerShell will try to evaluate these terms before they get to the regex handler, so the replacement string must be wrapped in single quotes. But the desired replacement text contains single quote marks, so they need to be doubled up.
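The sketch below checks the captures and the replacement on the running example. The sample strings are illustrative only.

```powershell
$pattern = "\s*(=|<|>|like)\s*([^'\d][^\s']*)$"

# -match fills the automatic $Matches table with the whole match and the two groups.
$found    = "CameraManufacturer=Can%" -match $pattern
$whole    = $Matches[0]
$operator = $Matches[1]
$value    = $Matches[2]

# Single-quoted replacement: $1 and $2 reach the regex engine untouched,
# and each doubled '' becomes one literal quote mark.
$quoted = "CameraManufacturer=Can%" -replace $pattern, ' $1 ''$2'' '

# A value that is already quoted, or that starts with a digit directly
# after the operator, does not match and is left alone.
$alreadyQuoted = "CameraManufacturer='Can%'" -replace $pattern, ' $1 ''$2'' '
$numeric       = "Width>1024" -replace $pattern, ' $1 ''$2'' '

$quoted
```

The replacement adds a space on either side of the operator and a trailing space, which is harmless in the finished WHERE clause.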

Replace ‘=’ with ‘like’ for Wildcards

So far, =Can* has become =’Can%’, which is good, but SQL needs "LIKE" instead of "=" to evaluate a wildcard. So the next operation converts "CameraManufacturer = ‘Can%’ " into "CameraManufacturer LIKE ‘Can%’ ":
$Filter = $Filter -replace "\s*=\s*(?='.+%'\s*$)" ," LIKE "

Regular expression: \s*=\s*(?='.+%'\s*$)     Example: CameraManufacturer = 'Can%'

  \s*=\s*         = sign surrounded by any spaces
  '               A quote character
  .+              Any characters (at least one)
  %'              % character followed by '
  \s*$            Any spaces (or none) followed by end of line
  (?= )           Look ahead for the enclosed expression but don’t include it in the match:
                  $Matches[0] = " = " (the equals sign and the spaces around it, but only if 'Can%' follows)
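A quick sketch of the look ahead at work, using invented sample conditions:

```powershell
$pattern = "\s*=\s*(?='.+%'\s*$)"

# The quoted value ends in %, so the = (and the spaces round it) becomes LIKE.
$wild  = "CameraManufacturer = 'Can%' " -replace $pattern, " LIKE "

# No % before the closing quote: the look ahead fails and = is kept.
$exact = "CameraManufacturer = 'Canon' " -replace $pattern, " LIKE "

$wild
$exact
```

Only the operator is consumed by the match, so the quoted value passes through exactly as it was written.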

Provide Aliases

The steps above reconstruct "WHERE" terms to build syntactically correct SQL, but what if I get confused and enter “CameraMaker” instead of “CameraManufacturer” or “Keyword” instead of “Keywords” ? I need Aliases – and they should work anywhere in the SQL statement – not just in the "WHERE" clause but in "ORDER BY" as well.
I defined a hash table (a.k.a. a "dictionary", or an "associative array") near the top of the script to act as a single place to store the aliases with their associated full canonical names, like this:
$PropertyAliases = @{
    Width       = "System.Image.HorizontalSize";
    Height      = "System.Image.VerticalSize";
    Name        = "System.FileName";
    Extension   = "System.FileExtension";
    Keyword     = "System.Keywords";
    CameraMaker = "System.Photo.CameraManufacturer"
}
Later in the script, once the SQL statement is built, a loop runs through the aliases replacing each with its canonical name:
$PropertyAliases.Keys | ForEach-Object {
    $SQL= $SQL -replace "(?<=\s)$($_)(?=\s*(=|>|<|,|Like))",$PropertyAliases[$_]
}
A hash table has .Keys and .Values properties which return what is on the left and right of the equals signs respectively. $hashTable.keyName or $hashtable[keyName] will return the value, so on the pass where $_ is "width" its replacement is $PropertyAliases["width"], which is "System.Image.HorizontalSize"; on another pass "height" is replaced, and so on (a hash table does not guarantee the order of its keys, but the order doesn’t matter here). To ensure it matches on a field name and not text being searched for, the regular expression stipulates the name must be preceded by a space and followed by "=" or "like" and so on.

Regular expression: (?<=\s)Width(?=\s*(=|>|<|,|Like))     Example: Width > 1024

  Width           The literal text "Width"
  \s              A space
  (?<= )          Look behind for the enclosed expression but don’t include it in the match:
                  $Matches[0] = "Width" (but only if a leading space is present)
  \s*             Any spaces (or none)
  (=|>|<|,|Like)  The literal text "Like", or any of the characters comma, equals, greater than or less than
  (?= )           Look ahead for the enclosed expression but don’t include it in the match:
                  $Matches[0] = "Width" (but only if " >" is present)
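Here is a cut-down sketch of the alias loop. The two-entry table and the sample WHERE text are assumptions made for the example, not the full list from the script.

```powershell
# A shortened alias table for the sketch.
$PropertyAliases = @{
    Width  = "System.Image.HorizontalSize"
    Height = "System.Image.VerticalSize"
}

# Sample text: two aliases used as field names, one used as search text.
$SQL = " WHERE Width > 1024 AND Height > 768 AND Contains(*,'Width') "

foreach ($alias in $PropertyAliases.Keys) {
    # Match the alias only after a space and before an operator, comma or LIKE.
    $SQL = $SQL -replace "(?<=\s)$($alias)(?=\s*(=|>|<|,|Like))", $PropertyAliases[$alias]
}

$SQL
```

The 'Width' inside Contains() follows a quote mark rather than a space, so it is treated as search text and left alone.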

If the prefix is omitted put the correct one in.

This builds on the ideas we’ve seen already. I want the list of fields and prefixes to be easy to maintain, so just after I define my aliases I define a list of field types
$FieldTypes = "System","Photo","Image","Music","Media","RecordedTv","Search"
For each type I define two variables, a prefix and a fieldslist : the names must be FieldtypePREFIX and FieldTypeFIELDS – the reason for this will become clear shortly but here is what they look like
$SystemPrefix = "System."
$SystemFields = "ItemName|ItemUrl"
$PhotoPrefix  = "System.Photo."
$PhotoFields  = "cameramodel|cameramanufacturer|orientation"
In practice the field lists are much longer – system contains 25 fieldnames not just the two shown here. The lists are written with "|" between the names so they become a regular expression meaning "ItemName or ItemUrl Or …". The following code runs after aliases have been processed
foreach ($type in $FieldTypes) {
   $fields = (get-variable "$($type)Fields").value
   $prefix = (get-variable "$($type)Prefix").value 
   $sql    = $sql -replace "(?<=\s)(?=($Fields)\s*(=|>|<|,|Like))" , $Prefix
}
I can save repeating code by using Get-Variable in a loop to get $systemFields, $photoFields and so on, and if I want to add one more field, or a whole type I only need to change the variable declarations at the start of the script. The regular expression in the replace works like this:

Regular expression: (?<=\s)(?=($Fields)\s*(=|>|<|,|Like))     Example: CameraManufacturer LIKE 'Can%'

  (?<=\s)         Look behind for a space but don’t include it in the match
  ($Fields)       After $Fields is expanded, the literal text "orientation" or "cameramanufacturer" and so on
  \s*             Any spaces (or none)
  (=|>|<|,|Like)  The literal text "Like", or any of the characters comma, equals, greater than or less than
  (?= )           Look ahead for the enclosed expression but don’t include it in the match

$Matches[0] is the point between the leading space and "CameraManufacturer LIKE" but doesn’t include either.

We get the effect of an "insert" operator by using ‑replace with a regular expression that finds a place in the text but doesn’t select any of it.
This part of the function allows "CameraManufacturer LIKE ‘Can%’" to become "System.Photo.CameraManufacturer LIKE ‘Can%’ " in a WHERE clause.
I also wanted "CameraManufacturer" in an ORDER BY clause to become "System.Photo.CameraManufacturer". Very sharp-eyed readers may have noticed that I look for a comma after the fieldname as well as <, >, = and LIKE. I modified the code which appeared in part one so that when an ORDER BY clause is inserted it is followed by a trailing comma like this:
if ($orderby) { $sql += " ORDER BY " + ($OrderBy -join " , " ) + ","}

The new version will work with this regular expression but the extra comma will cause a SQL error and so it must be removed later.
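The "insert" trick and the trailing comma can be seen together in a small sketch. The sample SQL text is made up; the prefix and field list are the Photo ones described earlier.

```powershell
$PhotoPrefix = "System.Photo."
$PhotoFields = "cameramodel|cameramanufacturer|orientation"

# Sample text with a field in the WHERE clause and another in ORDER BY,
# the latter followed by the deliberate trailing comma.
$SQL = " WHERE CameraManufacturer LIKE 'Can%' ORDER BY CameraModel,"

# The pattern matches a zero-width position, so -replace inserts the prefix there.
$SQL = $SQL -replace "(?<=\s)(?=($PhotoFields)\s*(=|>|<|,|Like))", $PhotoPrefix

# Remove the trailing comma once the prefixes are in place.
$SQL = $SQL -replace "\s*,\s*$", ""

$SQL
```

Without the trailing comma the ORDER BY field would have nothing after it for the look ahead to find, and it would keep its short name.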
When I introduced the SQL I said the SELECT statement looks like this:

SELECT System.ItemName, System.ItemUrl,      System.FileExtension, System.FileName, System.FileAttributes, System.FileOwner, 
       System.ItemType, System.ItemTypeText , System.KindText,     System.Kind,     System.MIMEType,       System.Size

Building this clause from the field lists simplifies code maintenance, and as a bonus anything declared in the field lists will be retrieved by the query as well as accepted as input by its short name. The SELECT clause is prepared like this:
if ($First) 
     {$SQL = "SELECT TOP $First "}
else {$SQL = "SELECT "}
foreach ($type in $FieldTypes)
     {$SQL +=((get-variable "$($type)Fields").value -replace "\|",", " ) + ", "}

This replaces the "|" with a comma and puts a comma after each set of fields. This means there is a comma between the last field and the FROM – which allows the regular expression to recognise field names, but it will break the SQL , so it is removed after the prefixes have been inserted (just like the one for ORDER BY).
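As a rough sketch of that build-then-tidy sequence, the snippet below uses two shortened field lists (assumed here for brevity) and leaves out the prefix step.

```powershell
$FieldTypes   = "System", "Photo"
$SystemFields = "ItemName|ItemUrl"
$PhotoFields  = "cameramodel|orientation"

$SQL = "SELECT "
foreach ($type in $FieldTypes) {
    # Look up $SystemFields, $PhotoFields and so on by name, turn each "|" into ", "
    # and finish every group with a comma.
    $SQL += ((Get-Variable "$($type)Fields").Value -replace "\|", ", ") + ", "
}
$SQL += " FROM SYSTEMINDEX WHERE "
$withComma = $SQL

# The comma before FROM is only there to help the prefix regular expression; drop it.
$SQL = $SQL -replace "\s*,\s*FROM\s+", " FROM "

$SQL
```

In the full function the prefixes are inserted between these two stages, while the comma is still present.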
This might seem inefficient, but when I checked the time it took to run the function and get the results but not output them it was typically about 0.05 seconds (50ms) on my laptop – it takes more time to output the results.
Combining all the bits in this part with the bits in part one turns my 36 line function into about a 60 line one as follows

Function Get-IndexedItem{
Param ( [Alias("Where","Include")][String[]]$Filter ,
        [String]$Path,
        [String[]]$OrderBy,
        [Alias("Top")][Int]$First,
        [Switch]$Recurse
)
# The declarations after $Filter are reconstructed from the way the body uses them
$PropertyAliases = @{Width ="System.Image.HorizontalSize"; 
                    Height = "System.Image.VerticalSize"}
$FieldTypes      = "System","Photo"
$PhotoPrefix     = "System.Photo."
$PhotoFields     = "cameramodel|cameramanufacturer|orientation"
$SystemPrefix    = "System."
$SystemFields    = "ItemName|ItemUrl|FileExtension|FileName"
if ($First) 
     {$SQL = "SELECT TOP $First "}
else {$SQL = "SELECT "}
foreach ($type in $FieldTypes)
     {$SQL +=((get-variable "$($type)Fields").value -replace "\|",", ")+", " }
if ($Path -match "\\\\([^\\]+)\\.")
     {$SQL += " FROM $($matches[1]).SYSTEMINDEX WHERE "}
else {$SQL += " FROM SYSTEMINDEX WHERE "}
if ($Filter)
     {# Use the look-behind version so the * in CONTAINS(*, ...) is left alone
      $Filter = $Filter -replace "(?<!\(\s*)\*","%"
      $Filter = $Filter -replace "\s*(=|<|>|like)\s*([^'\d][^\s']*)$",' $1 ''$2'' '
      $Filter = $Filter -replace "\s*=\s*(?='.+%'\s*$)" ," LIKE "
      $Filter = ($Filter | ForEach-Object {
          if ($_ -match "'|=|<|>|like|contains|freetext") {$_}
          else {"Contains(*,'$_')"}
      })
      $SQL += $Filter -join " AND "
     }
if ($Path)
    {if ($Path -notmatch "\w{4}:") {$Path = "file:" + $Path}
     $Path = $Path -replace "\\","/"
     if ($SQL -notmatch "WHERE\s*$") {$SQL += " AND " }
     if ($Recurse) 
          {$SQL += " SCOPE = '$Path' "}
     else {$SQL += " DIRECTORY = '$Path' "}
    }
if ($SQL -match "WHERE\s*$")
     { Write-warning "You need to specify either a path , or a filter." ; return }
if ($OrderBy) { $SQL += " ORDER BY " + ($OrderBy -join " , " ) + ","}
$PropertyAliases.Keys | ForEach-Object {
     $SQL= $SQL -replace "(?<=\s)$($_)(?=\s*(=|>|<|,|Like))", $PropertyAliases.$_ }
foreach ($type in $FieldTypes)
    {$fields = (get-variable "$($type)Fields").value
     $prefix = (get-variable "$($type)Prefix").value
     $SQL    = $SQL -replace "(?<=\s)(?=($Fields)\s*(=|>|<|,|Like))" , $Prefix
    }
$SQL = $SQL -replace "\s*,\s*FROM\s+" , " FROM "
$SQL = $SQL -replace "\s*,\s*$" , ""
$Provider="Provider=Search.CollatorDSO;"+ "Extended Properties='Application=Windows';"
$Adapter = new-object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$DS     = new-object system.data.dataset
if ($Adapter.Fill($DS)) { $DS.Tables[0] }
}

In part 3 I’ll finish the function by turning my attention to output

Using the Windows index to search from PowerShell: Part one: Building a query from user input.

Filed under: Uncategorized — jamesone111 @ 10:43 am

Note: this was originally written for the Hey, Scripting Guy! blog where it appeared as the 25 June 2012 episode. The code is available for download. I have some more index-related posts coming up so I wanted to make sure everything was in one place.

I’ve spent some time developing and honing a PowerShell function that gets information from the Windows Index – the technology behind the search that is integrated into explorer in Windows 7 and Vista. The Index can be queried using SQL and my function builds the SQL query from user input, executes it and receives rows of data for all the matching items. In Part three, I’ll look at why rows of data aren’t the best thing for the function to return and what the alternatives might be. Part two will look at making user input easier – I don’t want to make an understanding of SQL a prerequisite for using the function. In this part I’m going to explore the query process.

We’ll look at how the query is built in a moment; for now please accept that a ready-to-run query is stored in the variable $SQL. Then it only takes a few lines of PowerShell to prepare and run the query:

$Provider="Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
$adapter = new-object system.data.oledb.oleDBDataadapter -argument $sql, $Provider
$ds      = new-object system.data.dataset
if ($adapter.Fill($ds)) { $ds.Tables[0] }

The data is fetched using oleDBDataAdapter and DataSet objects; the adapter is created specifying a "provider" which says where the data will come from and a SQL statement which says what is being requested. The query is run when the adapter is told to fill the dataset. The .fill() method returns a number, indicating how many data rows were returned by the query – if this is non-zero, my function returns the first table in the dataset. PowerShell sees each data row in the table as a separate object; and these objects have a property for each of the table’s columns, so a search might return something like this:

SYSTEM.ITEMURL : file:C:/Users/James/pictures/DIVE_1771+.JPG
SYSTEM.KIND : {picture}
SYSTEM.MIMETYPE : image/jpeg
SYSTEM.SIZE : 971413
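To show how PowerShell exposes columns as properties without needing the Windows index, this sketch builds a small DataTable by hand. The table is a stand-in made up for the illustration, not the result of a real query; the values are borrowed from the sample output above.

```powershell
# A hand-built stand-in for the table that Fill() would normally populate.
$table = New-Object System.Data.DataTable
[void]$table.Columns.Add("SYSTEM.ITEMURL")
[void]$table.Columns.Add("SYSTEM.MIMETYPE")
[void]$table.Columns.Add("SYSTEM.SIZE", [int])
[void]$table.Rows.Add("file:C:/Users/James/pictures/DIVE_1771+.JPG", "image/jpeg", 971413)

foreach ($row in $table.Rows) {
    # Each column is reachable as a property; names containing dots need quoting.
    "{0} is {1} bytes" -f $row.'SYSTEM.ITEMURL', $row.'SYSTEM.SIZE'
}
```

Having to quote property names such as 'SYSTEM.SIZE' is one of the small inconveniences of working with the raw rows.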

There are lots of fields to choose from, so the list might be longer. The SQL query to produce it looks something like this.

SELECT System.ItemName, System.ItemUrl,        System.FileExtension,
       System.FileName, System.FileAttributes, System.FileOwner, 
       System.ItemType, System.ItemTypeText ,  System.KindText, 
       System.Kind,     System.MIMEType,       System.Size
FROM   SYSTEMINDEX
WHERE  System.Keywords = 'portfolio' AND Contains(*,'stingray')

In the finished version of the function, the SELECT clause has 60 or so fields; the FROM and WHERE clauses might be more complicated than in the example and an ORDER BY clause might be used to sort the data.
The clauses are built using parameters which are declared in my function like this:

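Param ( [Alias("Where","Include")][String[]]$Filter ,
        [String]$Path,
        [String[]]$OrderBy,
        [Alias("Top")][Int]$First,
        [Switch]$Recurse
)
# The declarations after $Filter are reconstructed from the way the function uses them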

In my functions I try to use names already used in PowerShell, so here I use -Filter and -First but I also define aliases for SQL terms like WHERE and TOP. These parameters build into the complete SQL statement, starting with the SELECT clause which uses -First

if ($First) {$SQL = "SELECT TOP $First "}
else        {$SQL = "SELECT "}
$SQL += " System.ItemName, System.ItemUrl " # and the other 58 fields

If the user specifies –First 1 then $SQL will be "SELECT TOP 1 fields"; otherwise it’s just "SELECT fields". After the fields are added to $SQL, the function adds a FROM clause. Windows Search can interrogate remote computers, so if the -path parameter is a UNC name in the form \\computerName\shareName the SQL FROM clause becomes FROM computerName.SYSTEMINDEX otherwise it is FROM SYSTEMINDEX to search the local machine.
A regular expression can recognise a UNC name and pick out the computer name, like this:

if ($Path -match "\\\\([^\\]+)\\.") {
     $sql += " FROM $($matches[1]).SYSTEMINDEX WHERE "
}
else {$sql += " FROM SYSTEMINDEX WHERE "}

The regular expression in the first line of the example breaks down like this

Regular expression: \\\\([^\\]+)\\.

  \\\\            2 \ characters: "\" is the escape character, so each one needs to be written as \\
  [^\\]+          Any non-\ character, repeated at least once
  \\.             A \, followed by any character
  ( )             Capture the section which is enclosed by the brackets as a match:
                  $matches[0] = \\computerName\s
                  $matches[1] = computerName
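The same test can be wrapped in a throwaway helper to try different paths. Get-FromClause is a hypothetical name used only for this sketch; it is not part of the function.

```powershell
# Hypothetical helper: returns the FROM clause the function would build for a path.
function Get-FromClause {
    param([string]$Path)
    if ($Path -match "\\\\([^\\]+)\\.") {
        # $Matches[1] holds the computer name captured by the brackets.
        " FROM $($Matches[1]).SYSTEMINDEX WHERE "
    }
    else {
        " FROM SYSTEMINDEX WHERE "
    }
}

Get-FromClause "\\computerName\shareName"
Get-FromClause "C:\Users\James\Pictures"
```

A drive-letter path never contains two backslashes in a row, so it falls through to the local index.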

I allow the function to take different parts of the WHERE clause as a comma separated list, so that
-filter "System.Keywords = 'portfolio'","Contains(*,'stingray')"
is equivalent to
-filter "System.Keywords = 'portfolio' AND Contains(*,'stingray')"

Adding the filter just needs this:

if ($Filter) { $SQL += $Filter -join " AND "}
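A two-line sketch confirms that the list form and the single-string form end up as the same text:

```powershell
# The same two conditions, passed as a list and as one ready-made string.
$asList   = "System.Keywords = 'portfolio'", "Contains(*,'stingray')"
$asString = "System.Keywords = 'portfolio' AND Contains(*,'stingray')"

# -join only inserts the separator between elements, so a single string is unchanged.
$fromList   = $asList -join " AND "
$fromString = @($asString) -join " AND "

$fromList
```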

The folders searched can be restricted. A "SCOPE" term limits the query to a folder and all of its subfolders, and a "DIRECTORY" term limits it to the folder without subfolders. If the request is going to a remote server the index is smart enough to recognise a UNC path and return just the files which are accessible via that path. If a -Path parameter is specified, the function extends the WHERE clause, and the –Recurse switch determines whether to use SCOPE or DIRECTORY, like this:

if ($Path){
     if ($Path -notmatch "\w{4}:") {
           $Path = "file:" + (resolve-path -path $Path).providerPath
     }
     if ($sql -notmatch "WHERE\s*$") {$sql += " AND " }
     if ($Recurse)                   {$sql += " SCOPE = '$Path' " }
      else                           {$sql += " DIRECTORY = '$Path' "}
}

In these SQL statements, paths are specified in the form file:c:/users/james which isn’t how we normally write them (and the way I recognise UNC names won’t work if they are written as file://ComputerName/shareName). This is rectified by the first line inside the If ($Path) {} block, which checks for 4 "word" characters followed by a colon. Doing this will prevent ‘File:’ being inserted if any protocol has been specified –the same search syntax works against HTTP:// (though not usually when searching on your workstation), MAPI:// (for Outlook items) and OneIndex14:// (for OneNote items). If a file path has been given I ensure it is an absolute one – the need to support UNC paths forces the use of .ProviderPath here. It turns out there is no need to convert \ characters in the path to /, provided the file: is included.
After taking care of that, the operation -notmatch "WHERE\s*$" sees to it that an "AND" is added if there is anything other than spaces between WHERE and the end of the line (in other words if any conditions specified by –filter have been inserted). If neither -Path nor -filter was specified there will be a dangling WHERE at the end of the SQL statement. Initially I removed this with a ‑Replace but then I decided that I didn’t want the function to respond to a lack of input by returning the whole index so I changed it to write a warning and exit. With the WHERE clause completed, the final clause in the SQL statement is ORDER BY, which – like WHERE – joins up a multi-part condition.

if ($sql -match "WHERE\s*$") {
     Write-warning "You need to specify either a path, or a filter."
     return
}
if ($orderby) { $sql += " ORDER BY " + ($OrderBy -join " , ") }
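The two regular expression checks used in the path handling can be tried in isolation. ConvertTo-IndexPath and Add-Condition are hypothetical helper names for this sketch, and Resolve-Path is left out so the example does not depend on any folder existing.

```powershell
# Hypothetical helper: add "file:" unless some protocol is already present.
function ConvertTo-IndexPath {
    param([string]$Path)
    if ($Path -notmatch "\w{4}:") { $Path = "file:" + $Path }
    $Path
}

# Hypothetical helper: add AND only if a condition already follows WHERE.
function Add-Condition {
    param([string]$Sql, [string]$Condition)
    if ($Sql -notmatch "WHERE\s*$") { $Sql += " AND " }
    $Sql + $Condition
}

ConvertTo-IndexPath "C:\Users\James\Pictures"
ConvertTo-IndexPath "mapi://mailbox/inbox"
Add-Condition " FROM SYSTEMINDEX WHERE " "SCOPE = 'file:C:/Users/James'"
Add-Condition " FROM SYSTEMINDEX WHERE Contains(*,'stingray')" "SCOPE = 'file:C:/Users/James'"
```

A drive letter such as C: has only one character before the colon, so it is not mistaken for a protocol.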

When the whole function is put together it takes 3 dozen lines of PowerShell to handle the parameters, build and run the query and return the result. Put together they look like this:

Function Get-IndexedItem{
Param ( [Alias("Where","Include")][String[]]$Filter ,
        [String]$Path,
        [String[]]$OrderBy,
        [Alias("Top")][Int]$First,
        [Switch]$Recurse
)
# The declarations after $Filter are reconstructed from the way the body uses them
if ($First) {$SQL = "SELECT TOP $First "}
else        {$SQL = "SELECT "}
$SQL += " System.ItemName, System.ItemUrl " # and the other 58 fields
if ($Path -match "\\\\([^\\]+)\\.") {
              $SQL += " FROM $($matches[1]).SYSTEMINDEX WHERE "
}
else         {$SQL += " FROM SYSTEMINDEX WHERE "}
if ($Filter) {$SQL += $Filter -join " AND "}
if ($Path) {
    if ($Path -notmatch "\w{4}:") {$Path = "file:" + $Path}
    $Path = $Path -replace "\\","/"
    if ($SQL -notmatch "WHERE\s*$") {$SQL += " AND " }
    if ($Recurse)                   {$SQL += " SCOPE = '$Path' " }
    else                            {$SQL += " DIRECTORY = '$Path' "}
}
if ($SQL -match "WHERE\s*$") {
    Write-Warning "You need to specify either a path or a filter."
    return
}
if ($OrderBy) { $SQL += " ORDER BY " + ($OrderBy -join " , " ) }
$Provider = "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
$Adapter  = New-Object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$DS       = New-Object system.data.dataset
if ($Adapter.Fill($DS)) { $DS.Tables[0] }
}

The -Path parameter is more user-friendly as a result of the way I handle it, but I’ve made it a general rule that you shouldn’t expect the user to know too much of the underlying syntax, and at the moment the function requires too much knowledge of SQL: I don’t want to type

Get-IndexedItem –Filter "Contains(*,'Stingray')", "System.Photo.CameraManufacturer Like 'Can%'"

and it seems unreasonable to expect anyone else to do so. I came up with this list of things the function should do for me.

  • Don’t require the user to know whether a search term is prefixed with SYSTEM., SYSTEM.DOCUMENT., SYSTEM.IMAGE. or SYSTEM.PHOTO. If the prefix is omitted put the correct one in.
  • Even without the prefixes some fieldnames are awkward, for example "HorizontalSize" and "VerticalSize" instead of width and height. Provide aliases.
  • Literal text in searches needs to be enclosed in single quotes; insert quotes if the user omits them.
  • A free text search over all fields is written as Contains(*,'searchTerm'); convert "orphan" search terms into contains conditions.
  • SQL uses % (not *) for a wild card – replace * with % in filters to cope with users putting the familiar *.
  • SQL requires the LIKE predicate (not =) for wildcards: replace = with like for wildcards.

In Part two, I’ll look at how I do those things.

June 25, 2012

Lonesome George

Filed under: Uncategorized — jamesone111 @ 3:15 pm

At Easter I was in the Galapagos Islands; work had taken me to Ecuador and diving in the Galapagos was too good an opportunity to miss. Mainland Ecuador was a country I knew little about and two weeks working in the capital (Quito, just South of the Equator, and 10,000 feet up in the Andes) doesn’t qualify me as an expert. The client there was a good one to work with, and what I saw of the city (a bunch of Taxi rides and a bus tour on the one day we weren’t working) means I’d go back if asked. Travel wasn’t good and the return flights so bad that I’ve vowed never to fly with Iberia again. Flying to the islands the plane had a problem which meant if it landed it couldn’t take off again so having got within sight of the islands we had to go all the way back to Quito and get another plane. Down at sea level the heat was ferocious, the transportation scary and the insect bites the worst I’ve had. But the diving… different kinds of Sharks (including a close encounter with a group of Hammerheads), Sea Lions, Turtles, Rays (including a Manta encounter on the very first dive which set the tone) – I’d put up with a lot for that. And if some search engine has steered you here, I dived with Scuba Iguana, and if I manage to go back I’ll dive with them again.

The Scuba place was pretty much next door to the Darwin station: home of giant tortoises and a tortoise breeding programme. Galapagos comes from the Spanish word for saddle because the shape of the giant tortoise’s shell looked like a traditional saddle. I also learnt that some languages – including Spanish – don’t have distinct words for Tortoise and (Marine) Turtles. The sex of tortoises is determined by the temperature at which the eggs incubate and the breeding programme gathers eggs, incubates them to get extra females, and looks after the baby tortoises, keeping them safe from human-introduced species (like rats) which feed on eggs and baby tortoises. Each island’s tortoises are different so the eggs and hatchlings are marked up so they go back to the right island. But there is no breeding programme for Pinta island (also named Abingdon island by the Royal Navy. According to a story told by Stephen Fry on QI, sailors ate giant tortoises and found them very good.) A giant Tortoise was found on Pinta; but a search over several years failed to find a second. So he – Lonesome George – was brought to the Darwin station in the 1970s. No one knows for sure how old he was then. All efforts to find a mate for him failed: so George lived out his final decades as the only known example of Geochelone nigra abingdoni, the Pinta Galapagos tortoise.
On Easter Sunday I walked up to see George and the giants from other islands who live at the station. George was keeping out of the sun; he shared an enclosure and I wondered what he made of the other species – if somewhere in that ancient reptile brain lurked a memory of others just like him, a template into which the other tortoises didn’t quite fit.

Later in the trip I was asked to help with some work on a survey being prepared about Hammerhead sharks. I was told they are estimated as having a 20% chance of becoming extinct in the next 100 years. This statistic is quite hard to digest: my chances of being extinct in 100 years are close to 100%. So my contribution to the survey was to suggest telling people that if things continue as they are, the chances of seeing Hammerheads on a dive in 5 years will be X amount less than today. It’s not fair, but we care more about some species than others and I hope there will still be Hammerheads for my children to see in a few years. Sadly they won’t get the chance to see the Pinta Tortoise.

June 22, 2012

Windows Phone 8. Some Known Knowns, and Known Unknowns

Filed under: Uncategorized — jamesone111 @ 10:23 am

Earlier this week Microsoft held its Windows Phone Summit where it made a set of announcements about the next generation of Windows Phone – Windows Phone 8. In summary these were:

  • Hardware: Support for Multi-core processors, Additional Screen resolutions, Removable memory cards, NFC
  • Software: The Windows 8 core is now the base OS, Support for native code written in C or C++ (meaning better games), IE 10, new mapping (from Nokia), a speech API which apps can use. Better business-oriented features, VOIP support, a new “Wallet” experience and in-app payments, and a new start screen.

This summit came hot on the heels of the Surface tablet launch, which seemed to be a decision by Microsoft that making hardware for Windows 8 was too important to be left to the hardware makers. The first thing I noted about the phone announcement was the lack of a Microsoft branded phone. I’ve said that Microsoft should make phones itself since before the first Windows Mobile devices appeared – when Microsoft was talking about “Stinger” internally; I never really bought any of the reasons given for not doing so. But I’d be astounded if Nokia didn’t demand that Microsoft promise not to make a phone (whether there’s a binding agreement or just an understanding between Messrs Elop and Ballmer we don’t know). Besides Nokia, Microsoft has 3 other device makers on-board: HTC have had devices for every Microsoft mobile OS since 2000, but also have a slew of devices for Android; Samsung were a launch partner for Phone 7 but since then have done more with Android; LG were in the line up for the Windows Phone 7 launch and have been replaced by Huawei. What these 3 feel about Nokia mapping technology is a matter of guesswork but depends on branding and licensing terms.

There are some things we think we know, but actually they are things we know that we don’t know.

  • Existing 7.x phones will not run Windows Phone 8 but will get an upgrade to Windows Phone 7.8. I have an HTC Trophy which I bought in November 2010 and it has gone from 7.0 to 7.5 and I’ll be quite happy to get 7.8 on a 2 year old phone. Someone who just bought a Nokia Lumia might not feel quite so pleased. What will be in 7.8? The new start screen has been shown. But will it have IE10? Will it have the new mapping and Speech capabilities? The Wallet, In-app-payments? This matters because….
  • Programs specifically targeting Windows Phone 8 won’t run on 7. Well doh! Programs targeting Windows 7 don’t run on XP. But what programs will need to target the new OS? Phone 7 programs are .NET programs and since .NET compiles to an intermediate language not to CPU instructions, a program which runs on Windows 8-RT (previously called Windows on ARM) should go straight onto a Windows 8-intel machine (but not vice versa), and Phone 7 programs will run on Phone 8. An intriguing comment at the launch says the Phone 8 emulator runs on Hyper-V; unless Hyper-V has started translating between different CPU instruction sets this means the Emulated phone has an Intel CPU but it doesn’t matter because it is running .NET intermediate language not binary machine code. So how many new programs will be 8-specific? Say Skype uses the VOIP support and in-app payments for calling credit. Will users with old phones be stuck with the current version of Skype? Do they get a new version where those features don’t light up? Or do they get the same experience as someone with a new phone? If the only things which are Phone 8 specific are apps which need Multiple cores (or other newly supported hardware) there would never have been any benefit upgrading a single core phone from 7 to 8.
  • Business support. This covers many possibilities, but what is in and what is out ? Will the encryption send keys back to base as bit-locker does ? Will there be support for Direct-Access ? Corporate wireless security ? Will adding Exchange server details auto-configure the phone to corporate settings (like the corporate app-store) . Will it be possible to block updates ? This list goes on and on.

It’s going to be interesting to see what this does for Microsoft and for Nokia’s turn-round.

June 19, 2012

Microsoft’s new surface. All it needs is love.

Filed under: General musings — jamesone111 @ 4:44 pm

People who buy Apple stuff worry me. Their attachment to one brand of equipment is somewhere between addiction and religious fervour. It is Love.

For over 20 years I’ve used Microsoft stuff, because it is simply a better way to get a job done. Some of the people in my office use Macs and can’t do their job without creating a virtual PC running Windows. It doesn’t reduce their love for their Macs. As the saying goes: The opposite of love is not hate, it’s indifference. And it’s hard to feel anything but indifferent to the products of any of the major PC makers. The Dell in front of me is well made, well specified and does everything I ask and more. But Love it? Most PC users will tell you loving a computer is crazy (which is why Apple folk are so disturbing).

I think Microsoft’s new surface tablets are trying to create a Windows machine which people – if they can’t actually Love – feel more than indifferent about.  Surface is two Machines: one uses an ARM processor and runs the new Windows RT, so won’t run all your existing software. The other uses an Intel chip – and is thicker and heavier to give it 1/3 more battery life (roughly 130cc and 225g more to get 40 Watt Hours instead of 30.), but that means it should work with existing software and USB devices. I can plug in a mouse and keyboard, and attach two monitors via display port and have a system just like the one I have today. Unplug it and I can use it iPad style or take the “touch Cover” keyboard * and write documents or use the Pen to annotate documents if that’s what I choose. The ARM version has office built in, but no pen and a different video connector (so probably only 1 screen). Even with its smaller battery it will probably run for longer (though like the shipment date and price, battery life is yet to be confirmed).

Mary-Jo Foley wrote of the launch of Surface “It’s the end of an era. Or maybe the start of a new one.” Indeed. Microsoft began by providing OEMs (including Apple) with BASIC, then with Operating Systems. OEMs were very much the Geese that laid golden eggs for Microsoft – during the 10 years I worked there (not in the OEM part of the business) there were times when I felt that the company forgot the problem with Geese is they produce a lot of … guano. The OEMs have been poor on design and innovation for a long time: Bloomberg Businessweek, no less, talks about how the PC industry should be shamed by Surface, and talks about recent years of PC development as “the great stagnation”. The Bloomberg piece puts some blame on Microsoft and Intel for taking too much from the OEMs; I doubt that if the chips and OS had cost less the difference would have gone into innovations that add value. That lack of added value means margins on PCs aren’t great and that’s led manufacturers to take money to install all manner of junk on the machines they ship. The whole DoJ saga – which grew out of Microsoft trying to prevent OEMs installing software it didn’t like – left a situation where the company was required to sell Windows to anyone who wanted it and could not do anything to prevent an explosion of crapware. Lots of people are asking WHY has Microsoft chosen to get into making computers? The answer is either (a) it can make much more profit by selling computers and operating systems together, or (b) it has an idea of what a PC should be in the second decade of the 21st century and it doesn’t trust PC makers to deliver that by themselves.

If I were fielding calls from angry OEMs upset at Microsoft arriving in their market I’d make the case that no OEM would have made a product like this: their lack of a similar product both lost them the right to complain and forced Microsoft to do something. If they do have something, Microsoft is saying they won’t undercut on price, something we won’t know for sure until the units go on sale. Some people wonder if Microsoft will aim to make the same from selling the hardware as anyone else and make the price of Windows on top of that; or if they will think $x of margin per unit sold is the same whether they sell a computer/OS combination or they sell a licence to an OEM. The latter would make it very hard for OEMs to compete; but Microsoft trying to make desirable hardware profitably? That’s a lot less of a threat to OEMs. Apple doesn’t sell as many units as Samsung, but the profit to Apple per unit is more than the retail price of the Samsung. When the iPhone was launched I questioned whether there was sufficient market for a phone at that price point: it has actually sold more units than Apple envisaged at the start, which proves one thing – people will pay handsomely for something they love.


* There are two keyboards: the thin pressure-sensitive “Touch cover” and a thicker moving-key “Type cover”. You can see the difference in this picture (click for a bigger version).

May 11, 2012

Lies, damn lies and public sector pensions

Filed under: Uncategorized — jamesone111 @ 9:08 am

Every time I hear that “public sector workers” are protesting about government plans for their pensions – as happened yesterday – I think of two points I made to fellow school governors when the teachers took a day off (with the knock-on cost to parents and the economy). These were:

  • Do classroom teachers understand their union wants them to subsidize the head’s pension from theirs? (Have the Unions explained to the classroom teachers that is what they are doing?)
  • Any teacher who is part of this protest has demonstrated they have insufficient grasp of maths to teach the subject (except, perhaps to the reception class.)

This second point is the easier of the two to explain: the pension paid for over your working life is a function of:

  • The salary you earned (which in turn depends on the rate at which your salary grew)
  • What fraction of your salary was paid in to your pension fund. It might be you gave up X% or your employer put in Y% that you never saw or a combination.   
  • How many years you paid in for (when you started paying in, and when you retire)
  • How well your pension fund grew before it paid out
  • How long the pension is expected to pay out for (how long you live after retirement)
  • Annuity rates – the interest earned on the pension fund as it pays out.

In addition, some people receive some pension which wasn’t paid for in this way; some employers (public or private sector) make guarantees to top up either the pension fund so it will buy a given level of annuity or to top-up  pension payments bought with the fund. The total you receive is the combination of what you have paid for directly and the top-up.
Change any factor – for example how long you expect to live – and either what you have paid for changes or the other factors have to change to compensate. Since earnings and rates of return aren’t something we control, living longer means either we have paid for a smaller pension, or we must pay in more, or retire later, or some combination of all three. Demanding that the same amount will come out of a pension for longer, without paying more in, is a demand for a guaranteed top-up in future – in the case of public sector employees that future top-up comes from future taxes, for the private sector it comes from future profits.

It’s easy for those in Government to make pension promises because those promises don’t need to be met for 30 years or more. Teachers whose retirement is imminent would have come into the profession when Wilson, Heath or Callaghan was in Downing Street, and all 3 are dead and buried; so, with the exception of Denis Healey, are all the chancellors who set budgets while they were in office. It’s tempting for governments to save on spending by under-contributing to pensions today, and leave future governments with the shortfall: taken to the extreme governments can turn pensions into a Ponzi scheme – this year’s “Teachers’/Police/whatever pay” bill covers what all current teachers / police officers / whoever are paid for doing the job and all pensions paid to retired ones for doing the job cheaply in the past. Since I am, more-or-less, accusing governments of all colours of committing fraud, I might as well add false accounting to the charge sheet. Let’s say the Government wants to buy a squadron of new aircraft but doesn’t want to raise taxes to pay for them all this year; it borrows money and the future liability that creates is accounted for. If the deal it makes with public sector workers is for a given amount to spend today, and a promise of a given standard of living in retirement, does it record that promise – that future liability – as part of pay today? Take a wild guess.
This wouldn’t matter – outside accounting circles – if everything was constant. But the length of time between retirement and death has increased and keeps on increasing. For the sake of a simple example: let’s assume someone born in 1920 joined the Police after World War II, served for 30 years and retired in 1975 at age 55 expecting to die at 70. Their retirement was half their length of service. Now consider someone born in 1955, who also joined the police at age 25, served for 30 years and retired in 2010. Is anyone making plans for their pension to stop in 2025? We might reasonably expect this person to live well into their eighties – so we’ve moved from 1 retired officer for every 2 serving, to a 1:1 ratio. I’m not saying that in 1975 the ratio was 1:2 and in 2012 it is 1:1 but that’s the direction of travel.

I’ve yet to hear a single teacher say their protests about pensions amount to a demand that they should under-fund their retirement as a matter of public policy and their pupils – who will then be tax payers – should make up the difference. As one of those whose work generates the funds to pay for the public sector I must choose a combination of lower pension, later retirement, and higher contributions than I was led to expect when I started work 25 years or so ago. And there are people demanding my taxes insulate them from having to do the same; or (if you prefer) demanding a pay rise to cover the gap between what past governments have promised them and what they are actually paying for; or (and this becomes a bit uncomfortable) that government starts telling us what it really costs to have the teachers, nurses, police officers and so on we want.

But what of my claim that Unions get low paid staff to subsidize the pensions of higher paid colleagues? Let’s take two teachers; I’ll call them Alice and Bob, and since this is a simplified example they’ll fit two stereotypes: Alice sticks to working in the class room, and gets a 2% cost of living rise every year. Bob competes for every possible promotion, and gets promoted in alternate years, so he gets a 2% cost of living rise alternating with a 10% rise. Although they both started on the same salary, after 9 end-of-year rises Alice’s pay has gone up by 19.5% and Bob – who must be a head by now – has seen his rise by 74%.
Throughout the 10 years they pay 10% of their salary into their pension fund – to make the sums easy we’ll assume they pay the whole 10% on the last day of the year, and each year their pension fund manager earns them 10% of what was in their pension pot at the end of the previous year. After 10 years Alice has £17,184 in her pension pot, and Bob has £20,390 in his.

Alice (and her fellow class room teachers) are told by the Union Rep that any attempt to change from final salary as the calculation mechanism is an attack on their pension; for her, this is factually wrong. If you are ever told this you need to ask if you are a high flier like Bob or if your career is more like Alice’s. To see why it is wrong (and let’s put it down to the Union rep being innumerate, rather than dishonest), let’s pretend the pension scheme only has Alice and Bob in it. So the total pot is £37,574 – Alice put in 46% of that money, but if it is shared in the ratio of the final salaries, 11,950 : 17,432, Alice gets 41% of the pay out.
You can argue it doesn’t work like that because Alice’s pot (at 1.44 times her final salary) might just cover the percentage of her final salary she has been promised: Bob’s pension pot is only 1.17 times his final salary, which will give him a smaller percentage, so the government steps in and boosts his pot to be 1.44 times his final salary just like Alice’s. So Bob gets a golden handshake of nearly £4,700 and Alice gets nothing.
Suppose 1.44 years is nowhere near enough and Alice and Bob need 3 years’ salary to buy a large enough annuity; the government needs to find £18,668 for Alice (108% of her pot), and £31,907 for Bob (156% of his pot). Whichever way you cut and slice it, if your salary grows quicker than your colleagues’ you will do better out of final salary than they do. If it grows more slowly you will fare worse.

            Alice                                                Bob
            Salary  Increase  Pension Payment  Pension Pot       Salary  Increase  Pension Payment  Pension Pot
Year 1   10,000.00        2%         1,000.00     1,000.00    10,000.00       10%         1,000.00     1,000.00
Year 2   10,200.00        2%         1,020.00     2,120.00    11,000.00        2%         1,100.00     2,200.00
Year 3   10,404.00        2%         1,040.40     3,372.40    11,220.00       10%         1,122.00     3,542.00
Year 4   10,612.08        2%         1,061.21     4,770.85    12,342.00        2%         1,234.20     5,130.40
Year 5   10,824.32        2%         1,082.43     6,330.36    12,588.84       10%         1,258.88     6,902.32
Year 6   11,040.81        2%         1,104.08     8,067.48    13,847.72        2%         1,384.77     8,977.33
Year 7   11,261.62        2%         1,126.16    10,000.39    14,124.68       10%         1,412.47    11,287.53
Year 8   11,486.86        2%         1,148.69    12,149.12    15,537.15        2%         1,553.71    13,970.00
Year 9   11,716.59        2%         1,171.66    14,535.69    15,847.89       10%         1,584.79    16,951.79
Year 10  11,950.93                   1,195.09    17,184.35    17,432.68                   1,743.27    20,390.23

Average Salary: Alice 10,949.72, Bob 13,394.10

Split by final salary (Combined Final Salary 29,383.60, Total Pot 37,574.58)
  Alice’s share   15,282.37
  Bob’s share     22,292.21

Split by average salary (Combined Average Salary 24,343.82, Total Pot 37,574.58)
  Alice’s share   16,900.85
  Bob’s share     20,673.73
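
For anyone who wants to check the arithmetic, here is a minimal PowerShell sketch of the same sums. Get-PensionPot and its parameter names are hypothetical helpers made up for this illustration; the only assumptions are the ones stated in the example (10% of salary paid in on the last day of each year, 10% growth on the previous year-end pot, rises applied at the end of each year).

```powershell
# Sketch only: Get-PensionPot is a hypothetical helper, not part of any real tool.
function Get-PensionPot {
    param(
        [double]   $StartSalary  = 10000,
        [double[]] $Rises,                 # rise applied at the end of each year, e.g. 0.02
        [double]   $Contribution = 0.10,   # fraction of salary paid in on the last day of the year
        [double]   $Growth       = 0.10    # fund growth on the previous year-end pot
    )
    $salary = $StartSalary
    $pot    = 0.0
    $total  = 0.0
    for ($year = 0; $year -le $Rises.Count; $year++) {
        $pot    = $pot * (1 + $Growth) + $salary * $Contribution
        $total += $salary
        $final  = $salary
        if ($year -lt $Rises.Count) { $salary *= (1 + $Rises[$year]) }
    }
    New-Object PSObject -Property @{
        FinalSalary   = [math]::Round($final, 2)
        AverageSalary = [math]::Round($total / ($Rises.Count + 1), 2)
        Pot           = [math]::Round($pot, 2)
    }
}

$alice    = Get-PensionPot -Rises (@(0.02) * 9)
$bob      = Get-PensionPot -Rises (0.10, 0.02, 0.10, 0.02, 0.10, 0.02, 0.10, 0.02, 0.10)
$totalPot = $alice.Pot + $bob.Pot

# Alice's share of the combined pot under each rule
$aliceFinalShare   = $totalPot * $alice.FinalSalary   / ($alice.FinalSalary   + $bob.FinalSalary)
$aliceAverageShare = $totalPot * $alice.AverageSalary / ($alice.AverageSalary + $bob.AverageSalary)
```

Run as written, the pots should come out at the table’s £17,184.35 and £20,390.23, and Alice’s two shares land within a few pence of the table (the sketch rounds the salaries before splitting).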

What if the mechanism for calculating were average salary, not final salary? It doesn’t quite remove the gap but gets very close. Instead of nearly £2,000 of Alice’s money going to Bob it’s less than £300.
A better way to look at this is to say if the amount of money in the combined Pension pot pays £5000 a year in Pensions, do we split it as roughly £2000 to Alice and £3000 to Bob (the rough ratio of their final salaries – each gets about 1/6th of their final salary) or £2250 to Alice and £2750 to Bob (the ratio of their average salaries and each gets about 1/5th of their average).
Whenever average salary is suggested as a basis, union leaders will say that pensions are calculated from a smaller number as if it reduces the amount paid. If the government wanted to take money that way it would be simpler to say “a full pension will in future be a smaller percentage of final salary”. Changing to average-based implies an increase in the percentage paid. 

That perhaps is the final irony. Rank and file Police officers – whose career pay is like Alice’s in the example – marched through London yesterday demanding that their pensions be left alone; you do not need to spend long reading “Inspector Gadget” to realise when you remove the problems created for the Police by politicians most of the problems that are left are created by senior officers whose career pay follows the “Bob” path. Yet the “many” marching were demanding that they continue to subsidize these “few”. As Gadget himself likes to say : you couldn’t make it up.

April 22, 2012

Don’t Swallow the cat. Doing the right thing with software development and other engineering projects.

Filed under: Uncategorized — jamesone111 @ 8:30 pm

In my time at Microsoft I became aware of the saying “communication occurs only between equals”, usually couched in the terms “People would rather lie than deliver bad news to Steve Ballmer”. Replacing unwelcome truths with agreeable dishonesty wasn’t confined to the CEO’s direct reports, and certainly isn’t a disease confined to Microsoft. I came across ‘The Hierarchy of Power Semantics’ more than 30 years ago when I didn’t understand what was meant by the title; it was written in the 1960s and if you don’t recognise “In the beginning was the plan and the specification, and the plan was without form and the specification was void and there was darkness on the face of the implementation team” see here – language warning for the easily offended.
Wikipedia says the original form of “communication occurs only between equals” is “Accurate communication is possible only in a non-punishing situation”. There are those who (consciously or not) use the impossibility of saying “No” to extract more from staff and suppliers; it can produce extraordinary results, but sooner or later it goes horribly wrong. For example the Challenger disaster was caused by the failure of an ‘O’ ring in a solid rocket booster made by Morton Thiokol. The engineers responsible for the booster were quite clear that in cold weather the ‘O’ rings were likely to fail with catastrophic results. NASA asked if a launch was OK after a freezing night and, fearing the consequences of saying “No”, managers at Morton Thiokol over-ruled the engineers and allowed the disastrous launch to go ahead. Most people can think of some case where someone made an impossible promise to a customer, because they were afraid to say no.

Several times recently I have heard people say something to the effect that “We’re so committed to doing this the wrong way that we can’t change to the right way.” Once the person saying it was me, which was the genesis of this post. Sometimes, in a software project, saying to someone – even to ourselves – “We’re doing this wrong” is difficult, so we create work-rounds. The odd title of this post comes from a song which was played on the radio a lot when I was a kid.

There was an old lady, who swallowed a fly, I don’t know why she swallowed a fly. I guess she’ll die.
There was an old lady, who swallowed a spider that wriggled and jiggled and tickled inside her. She swallowed the spider to catch the fly … I guess she’ll die
There was an old lady, who swallowed a bird. How absurd to swallow a bird. She swallowed the bird to catch the spider … I guess she’ll die
There was an old lady, who swallowed a cat. Fancy that to swallow a cat. She swallowed the cat to catch the bird …  I guess she’ll die
There was an old lady, who swallowed a dog. What a hog to swallow a dog. She swallowed the dog to catch the cat … I guess she’ll die
There was an old lady, who swallowed a horse. She’s dead, of course

In other words each cure needs a further, more extreme cure.  In my case the “fly” was a simple problem I’d inherited. It would take a couple of pages to explain the context, so for simplicity it concerns database tables and the “spider” was to store data de-normalized. If you don’t spend your days working with databases, imagine you have a list of suppliers, and a list of invoices from those suppliers. Normally you would store an ID for the supplier in the invoice table, and look up the name from the supplier table using the ID. For what I was doing it was better to put the supplier name in the invoices table, and ignore the ID. All the invoices for the supplier can be looked up by querying for the name. The same technique applied to products supplied by that supplier: store the supplier name in the product table, look up products by supplier name. This is not because I didn’t know any better, I had database normal forms drummed into me two decades ago. To stick with the metaphor: I know that, under normal circumstances, swallowing spiders is bad, but faced with this specific fly it was demonstrably the best course of action.
At this point someone who could have saved me from my folly pointed out that supplier names had to be editable. I protested that the names don’t change, but Amalgamated Widgets did, in fact, become Universal Widgets. This is an issue because Amalgamated, not Universal, raised the invoices in the filing cabinet, so matching them to invoices in the system requires preserving the name as it was when the invoice was generated. “See, I was right: the name should be stored” – actually this exception doesn’t show I was right at all, but on I went. On the other hand all of Amalgamated’s products belong to Universal now. Changing names means doing a cascaded update (update any product with the old company name to the new name when a name changes), and the real case has more than just products. If you’re filling in the metaphor you’ve guessed I’d reached the point of figuring out how to swallow a bird. Worse, I could see another problem looming (anyone for Cat?): changes to products had to be exported to another system, and the list of changes had their own table requiring cascaded updates from the cascaded updates.

One of the great quotes in Macbeth says “I am in blood stepped in so far that should I wade no more, Returning were as tedious as go o’er.” He knows what he’s doing is wrong, but it is as hard to go back (and do right) as it is to go on. Except it isn’t: the solution is not to swallow another couple more spiders and a fly, the solution is to swallow a bird, then a cat and so on. The dilemma is that the effort for an additional work-round is smaller than the effort to go back, fix the initial problem and unpick all the work-rounds to date – either needs to be done now, and the easy solution is to choose the one which needs the least effort now. The sum of effort required for future work-rounds is greater but we can discount that effort because it isn’t needed now. Only in a non-punishing situation can we tell people that progress must be halted for a time to fix a problem which has been mitigated up to now. Persuading people that such a problem needs to be fixed at all isn’t trivial. I heard this quote in a radio programme a while back:

“Each uneventful day that passes reinforces a steadily growing false sense of confidence that everything is alright:
that I, we, my group must be OK because the way we did things today resulted in no adverse consequences.”

In my case the problem is being fixed at the moment, but in how many organisations is it a career-limiting move to tell people that something which has had no adverse consequences to date must be fixed?

February 4, 2012

Customizing PowerShell, Proxy functions and a better Select-String

Filed under: Uncategorized — jamesone111 @ 9:24 pm

I suspect that even regular PowerShell users don’t customize their environment much. By co-incidence, in the last few weeks I’ve made multiple customizations to my systems (my scripts are sync’d over 3 machines: customize one, customize all), which has given me multiple things to talk about. My last post was about adding persistent history; this time I want to look at Proxy Functions …

Select-String is, for my money, one of the best things in PowerShell. It looks through piped text or through files for anything which matches a regular expression (or simple text) and reports back what matched and where with all the detail you could ever need. BUT it has a couple of things wrong with it: it won’t do a recursive search for files, and sometimes the information which comes back is too detailed. I solved both problems with a function I named “WhatHas” which has been part of my profile for ages. I have been using this to search scripts, XML files and saved SQL whenever I need a snippet of code that I can’t remember or because something needs to be changed and I can’t be sure I’ve remembered which files contain it. I use WhatHas dozens (possibly hundreds) of times a week. Because it was a quick hack I didn’t support every option that Select-String has, so if a code snippet spans lines I have to go back to the proper Select-String cmdlet and use its -Context option to get the lines either side of the match: more often than not I find myself typing dir -recurse {something} | select-String {options}

A while back I saw a couple of presentations on Proxy functions (there’s a good post about them here by Jeffrey Snover): I thought when I saw them that I would need to implement one for real before I could claim to understand them, and after growing tired of jumping back and forth between Select-String and WhatHas, I decided it was time to do the job properly, creating a proxy function for Select-String and keeping WhatHas as an alias.

There are 3 bits of background knowledge you need for proxy functions.

  1. Precedence. Aliases beat Functions, Functions beat Cmdlets. Cmdlets beat external scripts and programs. A function named Select-String will be called instead of a cmdlet named Select-String – meaning a function can replace a cmdlet simply by giving it the same name. That is the starting point for a Proxy function.
  2. A command can be invoked as moduleName\CommandName. If I load a function named “Get-Stuff” from my profile.ps1 file, for example, it won’t have an associated module name, but if I load it as part of a module, or if “Get-Stuff” is a cmdlet, it will have a module name.
    Get-Command get-stuff | format-list name, modulename
    will show this information. You can try
    > Microsoft.PowerShell.Management\get-childitem
    for yourself. It looks like an invalid file-system path, but remember PowerShell looks for a matching Alias, then a matching Function and then a matching cmdlet before looking for a file.
  3. Functions can have a process block (which runs for each item passed via the pipeline), a begin block (which runs before the first pass through process), and an end block (which runs after the last item has passed through process). Cmdlets follow the same structure, although it’s harder to see.
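
The three points above can be tried out in a few lines. This is only a sketch: Measure-PipedItem is a made-up name for illustration, and Get-Location is shadowed purely because it is harmless to do so.

```powershell
# Point 1: a function with the same name as a cmdlet is found first.
function Get-Location { 'shadowed by a function' }
$viaFunction = Get-Location

# Point 2: the module-qualified name still reaches the real cmdlet.
$viaCmdlet = Microsoft.PowerShell.Management\Get-Location

# Point 3: begin runs once, process runs once per piped item, end runs once.
function Measure-PipedItem {          # hypothetical name, for illustration only
    begin   { $count = 0 }
    process { $count++ }
    end     { "Saw $count items" }
}
$summary = 1..3 | Measure-PipedItem

Remove-Item function:Get-Location     # tidy up so the cmdlet is visible again
```

$viaFunction holds the string returned by the function, $viaCmdlet holds the PathInfo object from the real cmdlet, and $summary reports three items.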

Putting these together: a function named Select-String can call the Select-String cmdlet, but it must call it as Microsoft.PowerShell.Utility\Select-String or it will just go round in a loop. In some cases, calling it isn’t quite enough and PowerShell V2 delivered the steppable pipeline which can take a PowerShell command (or set of commands piped together) and allow us to run its begin block, process block, and end block, under the control of a function. So a Proxy function looks like this:
Function Select-String {
  Param  ( Same Parameters as the real Select-String
           Less any I want to prevent people using
           Plus any I want to add
         )
   Begin { My preliminaries
           Get $steppablePipeline
           $steppablePipeline.Begin
         }
 Process { My Per-item code against current item ($_ )
           $steppablePipeline.Process($_)
         }
     end { $steppablePipeline.End
           My Clean up code
         }
}

What would really help would be something to produce a function like this template, and fortunately it is built into PowerShell: it does the whole thing in 3 steps: get the command to be proxied, get the detailed metadata for the command and build a Proxy function with the metadata, like this:
  $cmd=Get-command select-string -CommandType cmdlet
  $MetaData = New-Object System.Management.Automation.CommandMetaData ($cmd)
  [System.Management.Automation.ProxyCommand]::Create($MetaData)

The last command will output the Proxy function body to the console, I piped the result into Clip.exe and pasted the result into a new definition
Function Select-String { }
And I had a proxy function.
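
If you would rather not go via the clipboard, the generated text can be turned straight into a function. This is a sketch under assumptions, not what I actually did: the name Select-StringProxy is hypothetical and is chosen so the real Select-String isn’t shadowed while you experiment.

```powershell
# Build the proxy text, then define it as a function under a hypothetical name.
$cmd      = Get-Command Select-String -CommandType Cmdlet
$metaData = New-Object System.Management.Automation.CommandMetaData ($cmd)
$body     = [System.Management.Automation.ProxyCommand]::Create($metaData)

Set-Item -Path function:Select-StringProxy -Value ([scriptblock]::Create($body))

# The unmodified proxy should behave like the cmdlet it wraps.
$hit = 'one', 'two' | Select-StringProxy -Pattern 'tw'
$hit.Line
```

Once that behaves like the original, the same text can be edited and saved under the name Select-String.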

At this point it didn’t do anything that the original cmdlet doesn’t do but that was a starting point for customizing.
The auto-generated parameters are formatted like this
  [Parameter(ParameterSetName='Object', Mandatory=$true, ValueFromPipeline=$true)]

And I removed some of the line breaks to reduce the screen space they use from 53 lines to about half that.
The ProxyCommand creator wraps parameter names in braces just in case something has a space or other breaking character in the name, and I took those out.
Then I added two new switch parameters -Recurse and -BareMatches.

Each of the Begin, Process and End blocks in the function contains a try...catch statement, and in the try part of the begin block the creator puts code to check if the -OutBuffer common parameter is set and if it is, over-rides it (why I’m not sure) – followed by code to create a steppable pipeline, like this:
  $wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('Select-String',
                    [System.Management.Automation.CommandTypes]::Cmdlet)
  $scriptCmd = {& $wrappedCmd @PSBoundParameters }
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

I decided it would be easiest to build up a string and make that into the steppable pipeline. In simplified form:
  $wrappedCmd        = "Microsoft.PowerShell.Utility\Select-String "
  $scriptText        = "$wrappedCmd @PSBoundParameters"
  if ($Recurse)      { $scriptText = "Get-ChildItem @GCI_Params | " + $scriptText }
  if ($BareMatches)  { $scriptText += " | Select-Object -ExpandProperty 'matches' " +
                                      " | Select-Object -ExpandProperty 'value'   " }
  $scriptCmd         = [scriptblock]::Create($scriptText)
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

& commandObject works in a scriptblock: the “&” sign says “run this” and if this is a command object that’s just fine, so the generated code has $scriptCmd = {& $wrappedCmd @PSBoundParameters } where $wrappedCmd is a command object.
But when I first changed the code from using a script block to using a string I put the original object $wrappedCmd inside a string. When the object is inserted into a string, the conversion renders it as the unqualified name of the command – the information about the module is lost, so I produced a script block which would call the function, which would create a script block which would call the function which… is an effective way to cause a crash.
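
A quick way to see the difference, as a sketch (the variable names are only for illustration): expanding the command object in a string gives the bare name, while the qualified name has to be built from its ModuleName and Name properties.

```powershell
$cmd = Get-Command Select-String -CommandType Cmdlet

# Expanding the object in a string keeps only the command name ...
$unqualified = "$cmd"

# ... so the module-qualified form has to be assembled explicitly.
$qualified = "$($cmd.ModuleName)\$($cmd.Name)"
```

Only the second form is safe to put in the text of a proxy that has the same name as the cmdlet.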

The script above won’t quite work on its own because
(a) I haven’t built up the parameters for Get-ChildItem. So if -Recurse, -Include or -Exclude is specified I build up a hash table to hold them, taking the necessary parameters from whatever was passed, and making sure they aren’t passed on to the Select-String cmdlet when it is called. I also make sure that if a file specification is passed for a recursive search it is moved from the path parameter to the include parameter.
(b) If -recurse or -barematches get passed to the “real” Select-String cmdlet it will throw a “parameter cannot be found” error, so they need to be removed from $psboundParameters.

This means the first part of the block above turns into
  if ($recurse -or $include -or $exclude) {
     $GCI_Params = @{}
     foreach ($key in @("Include","Exclude","Recurse","Path","LiteralPath")) {
          if ($psboundparameters[$key]) {
              $GCI_Params[$key] = $psboundparameters[$key]
          }
     }
     # if Path doesn't seem to be a folder it is probably a file spec
     # So if recurse is set, set Include to hold file spec and path to hold current directory
     if ($Recurse -and -not $include -and ((Test-Path -PathType Container $path) -notcontains $true) ) {
        $GCI_Params["Include"] = $path
        $GCI_Params["Path"] = $pwd
     }
     $scriptText = "Get-ChildItem @GCI_Params | "
  }
  else { $scriptText = ""}

And the last part is
  if ($BareMatches) {
     $scriptText += " | Select-object -expandproperty 'matches' | Select-Object -ExpandProperty 'value' "
  }
  $scriptCmd = [scriptblock]::Create($scriptText)
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

There’s no need for me to add anything to the process or end blocks, so that’s it – everything Select-String originally did, plus recursion and returning bare matches.
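
To see the moving parts without the full parameter list, here is a cut-down sketch of the same idea. It is not the real proxy: Select-BareMatch is a hypothetical name, it takes only a pattern and pipeline input, and it always returns bare matches. It does show a script block built from a string, turned into a steppable pipeline and driven from begin, process and end.

```powershell
function Select-BareMatch {            # hypothetical name, for illustration only
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true, Position=0)] [string]$Pattern,
        [Parameter(ValueFromPipeline=$true)]     $InputObject
    )
    begin {
        # Module-qualified name, so a proxy called Select-String would not call itself
        $scriptText = "Microsoft.PowerShell.Utility\Select-String -Pattern `$Pattern" +
                      " | Select-Object -ExpandProperty Matches" +
                      " | Select-Object -ExpandProperty Value"
        $scriptCmd         = [scriptblock]::Create($scriptText)
        $steppablePipeline = $scriptCmd.GetSteppablePipeline($MyInvocation.CommandOrigin)
        $steppablePipeline.Begin($PSCmdlet)
    }
    process { $steppablePipeline.Process($_) }
    end     { $steppablePipeline.End() }
}

$numbers = 'alpha', 'beta 42', 'gamma 7' | Select-BareMatch '\d+'
```

Here $numbers ends up holding just the matched text from the two lines that contain digits, rather than full MatchInfo objects.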

I’ve put the whole file on skydrive here

January 28, 2012

Adding Persistent history to PowerShell

Filed under: Uncategorized — jamesone111 @ 7:19 pm

The Doctor: “You’re not are you? Tell me you’re not Archaeologists”
River Song: “Got a problem with Archaeologists ?”
The Doctor: “I’m a time traveller. I point and laugh at Archaeologists”

I’ve said to several people that my opinion of Linux has changed since I left Microsoft: after all I took a job with Microsoft because I rated their products and stayed there 10 years; I didn’t have much knowledge of Linux at the start of that stay and had no call to develop any during it, so the views I had were based on ignorance and supposition. After months of dealing with it on a daily basis I know how wrong I was. In reality Linux is uglier, more dysfunctional and, frankly, retarded than I ever imagined it could be. Neither of our Linux advocates suggests it should be used anywhere but servers (just as well: they both have iPhones and Xboxes, which are about as far from the Open source ideal as it’s possible to go). But compare piping or tab expansion in PowerShell and Bash and you’re left in no doubt which one was designed 20 years ago. “You can only pipe text ? Really ? How … quaint”.

One of the guys was trying to do something with Exchange from the command-line and threw down a gauntlet.
“If PowerShell’s so bloody good why hasn’t it got persistent history”
OK. This is something which Bash has got going for it. How much work would it take to fix this ? Being PowerShell the answer is “a few minutes”. Actually the answer is “a lot less time than it takes to write a blog post about it”

First a little side track: various people I know have a PowerShell prompt which looks like
[123]  PS C:\users\Fred>
Where 123 was the history ID. Type H (or history, or Get-History) and PowerShell shows you the previous commands, with their history ID, the command Invoke-History <id> (or ihy for short) runs the command.
I’d used PowerShell for ages before I discovered typing #<id>[Tab] inserts the history item into the command line. I kept saying “I’ll do that one day”, and like so many things I didn’t get round to it.
I already use the history ID: I have this function in my profile
Function HowLong {
   <# .Synopsis Returns the time taken to run a command
      .Description By default returns the time taken to run the last command
      .Parameter ID The history ID of an earlier item.
   #>
   param  ( [Parameter(ValueFromPipeLine=$true)]
            $id = ($MyInvocation.HistoryId -1)
          )
  process {  foreach ($i in $id) {
                 # Elapsed seconds = end time minus start time of the history item
                 (get-history $i).endexecutiontime.subtract(
                                  (get-history ($i)).startexecutiontime).totalseconds
             }
          }
}
Once you know $MyInvocation.HistoryID gives the ID of the current item, it is easy to change the Prompt function to return something which contains it.

At the moment I find I’m jumping back and forth between PowerShell V2, and the CTP of V3 on my laptop
(and I can run PowerShell –Version 2 to launch a V2 version if I see something which I want to check between versions).
So I finally decided I would change the prompt function. This happened about the time I got the “why doesn’t the history get saved” question. Hmmm. Working with history in the Prompt function. Tick, tick, tick.  [Side track 2 In PowerShell the prompt isn’t a constant, it is the result of a function.  To see the function use the command type function:prompt]
So here is the prompt function I now have in my profile.
Function prompt {
  $hid = $myinvocation.historyID
  # Append the previous history item to the log as one CSV row (header line discarded)
  if ($hid -gt 1) {get-history ($myinvocation.historyID -1 ) |
                      convertto-csv | Select -last 1 >> $logfile
  }
  $(if (test-path variable:/PSDebugContext) { '[DBG]: ' } else { '' }) + 
    "#$([math]::abs($hid)) PS$($PSVersionTable.psversion.major) " + $(Get-Location) + 
    $(if ($nestedpromptlevel -ge 1) { '>>' }) + '> '
}

The first lines are new: get the history ID and, if it is greater than 1, get the previous history item, convert it from an object to CSV format, discard the CSV header and append it to the file named in $logFile (I know I haven’t set it yet)

The second part is lifted from the prompt function found in the default profile, that reads
"PS $($executionContext.SessionState.Path.CurrentLocation)$('>' * ($nestedPromptLevel + 1)) "
It’s actually one line but I’ve split it at the + signs for ease of reading.
I put a # sign and the history ID before “PS” – when PowerShell starts the ID is –1 so I make sure it is the absolute value.
After “PS” I put the major version of PowerShell.
I’m particularly pleased with the #ID part: in the non-ISE version of PowerShell double clicking on #ID selects it. My mouse is usually close enough to my keyboard that the keypad [Enter] key is within reach of my thumb so if I scroll up to look at something I did earlier, one flickity gesture (double-click, thumb enter, right click [tab]) has the command in the current command line.

So now I’m keeping a log, and all I need to do is to load that log from my profile. PowerShell has an Add-History command and the on-line help talks about reading in the history from a CSV file so that was easy – I decided I would truncate the log when PowerShell started and also ensure that the file had the CSV header, so here’s the reading friendly version of what’s in my profile.

$MaximumHistoryCount = 2048
$Global:logfile = "$env:USERPROFILE\Documents\windowsPowershell\log.csv"
$truncateLogLines = 100
$History = @()
$History += '#TYPE Microsoft.PowerShell.Commands.HistoryInfo'
$History += '"Id","CommandLine","ExecutionStatus","StartExecutionTime","EndExecutionTime"'
if (Test-Path $logfile) {$history += (get-content $LogFile)[-$truncateLogLines..-1] | where {$_ -match '^"\d+"'} }
$history > $logfile
$History | select -Unique  |
 Convertfrom-csv -errorAction SilentlyContinue |
 Add-History -errorAction SilentlyContinue

UPDATE: Copying this code into the blog page and splitting the last $history line to fit, something went wrong and the
select -unique went astray. Oops.
It’s there because hitting enter doesn’t advance the history count, or run anything, but does cause the prompt function to re-run. Now I’ve had to look at it again it occurs to me it would be better to have select –unique in the (get-content $logfile) part rather than in the Add-History section, as this would remove duplicates before truncating.
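
As a rough sketch of that reordering: the helper name Get-TrimmedHistoryLines and the sample rows are made up for illustration and are not part of the profile above. It filters out anything that isn't a data row, removes duplicates and only then truncates.

```powershell
# Hypothetical helper: de-duplicate logged history lines BEFORE truncating,
# so repeated rows don't use up the truncation allowance.
function Get-TrimmedHistoryLines {
    param (
        [string[]]$Lines,     # raw lines read from the CSV log
        [int]$Keep = 100      # how many rows to keep
    )
    # Keep only data rows (they start with a quoted numeric ID), then drop duplicates
    $rows = @($Lines | Where-Object { $_ -match '^"\d+"' } | Select-Object -Unique)
    # Return the last $Keep rows (or all of them if there are fewer)
    $rows | Select-Object -Last $Keep
}

# Made-up sample: pressing Enter at an empty prompt logged row 2 twice
$sample = '"1","dir","Completed","x","y"',
          '"2","cls","Completed","x","y"',
          '"2","cls","Completed","x","y"',
          '"3","h","Completed","x","y"'
$trimmed = Get-TrimmedHistoryLines -Lines $sample -Keep 2
$trimmed
```

In the profile this would sit where (get-content $LogFile) is read, with the two header lines added in front of the result as before.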

So … increase the history count from the default of 64 (writing this I found that in V3 CTP 2 the default is 4096). Set a global variable to be the path to the log file, and make it obvious how many lines I will truncate the log to.
Then build an array of strings named $history. Put the CSV header information into $history, and if the log file exists put up to the truncate limit of lines into $history as well. Write $history back to the log file and pipe it into Add-History, hiding any lines which won’t parse correctly. Incidentally those who like really long lines of PowerShell could recode all the lines with $history in them into one line. So: a couple of lines in the prompt function and between 3 and 9 lines in the profile depending on how you write them; all in, it’s less than a dozen lines. This blog post has taken a good couple of hours, and I did the code in 10 to 15 minutes.


Oh , and one thing I really like – when I launch PowerShell –Version 2 inside Version 3, it imports the history giving easy access to the commands I just used without needing to cut and paste.

If you’re a Bash user and didn’t storm off in a huff after my initial rudeness I’d like to set a good natured challenge: a non-compiled enhancement to bash I can load automatically which gives it tab expansion on par with PowerShell’s (OK, PowerShell has an unfair advantage completing parameters, so just command names and file names). And in case you wondered about the quote at the top of the post, it is from one of Steven Moffat’s Dr Who episodes. You see, “I Know PowerShell. I point and laugh at Bash users.”

December 20, 2011

Free NetCmdlets

Filed under: Uncategorized — jamesone111 @ 9:48 pm
I’ve mentioned the NetCmdlets before. Although not perfect, they are a big help if you spend a lot of your life using PowerShell and various network tools: they’ve made a bunch of things which would have been longwinded and painful relatively easy. So here is a mail I have just had from PowerShell Inside (aka /n software). I don’t normally hold with pasting mails straight into a blog post, but you’ll see why if you read on: if you click the link it asks you to fill in some details and “A member of our sales team will contact you with your FREE NetCmdlets Workstation License. (Limit one per customer.)”

A Gift For The Holidays: FREE NetCmdlets Workstation License – This Week Only, Tell a Friend!

Help us spread some PowerShell cheer! Tweet, blog, post, email, or just tell a friend and you can both receive a completely free workstation license of NetCmdlets!

NetCmdlets includes powerful cmdlets offering easy access to every major Internet technology, including: SNMP, LDAP, DNS, Syslog, HTTP, WebDav, FTP, SMTP, POP, IMAP, Rexec/RShell, Telnet, SSH, Remoting, and more. Hurry, this offer ends on Christmas day – Happy Holidays!

NetCmdlets Workstation License: FREE (normally $99.00)

NetCmdlets Server License (* Special Limited Time Offer *): $199.00 (normally $349.00)

Hurry, this offer ends on Christmas day – Get your free license now!

Happy Holidays!

Or as we say this side of the Pond Merry Christmas !

December 10, 2011

PowerShell HashTables: Splatting, nesting, driving selections and generally simplifying life

Filed under: Powershell — jamesone111 @ 11:43 pm

Having spent so much of my life working in Microsoft environments where you typed a password to get into your computer and never had to type it again until the next time you had to login / unlock it (however many servers you worked with in between)  I find working in an office with a bunch of Linux servers but no shared set of accounts to be alternately comical and stupid. 
Some of our services use Active Directory to provide LDAP authentication so at least I retype the same user name and password over and over: but back as far as the LAN Manager client for DOS, Microsoft network clients tried to transparently negotiate connections using credentials they already had. Internet explorer does the same when connecting to intranet servers. Even in a workgroup if you manually duplicate accounts between machines you’ll get connected automatically. I stopped retyping passwords about the same time as I stopped saving work to floppy disk; I’m baffled that people in the Unix/Linux world tolerate it.
Some of our services use a separate LDAP service instead of AD (just to keep me on my toes I’m JamesOne on one, and Joneill on the other) and others again use their own internal accounts database. I might log on to a given box as root, and sign into MySQL as root but the passwords are different. If I go to another box the root password is different again. And so on.
Recently we hit a roadblock because one of the team had set up a server with a root password he couldn’t share (because of the other places he’d used it) and he gave me a key file as a workaround. I’ve talked before about using the NetCmdlets to connect to and run commands on the Linux boxes, and this sent me back to look at how I was using them. I ended up driving them from a hash table of hash tables – something Ed Wilson has recently written about on the “Hey Scripting Guy” blog.

I use the hash tables for splatting – the name, if not the concept, is peculiar to PowerShell. If it’s not familiar, let’s say I want to invoke the command

connect-ssh -Server "" -Force -user "root" -AuthMode "publickey" -CertStoreType "pemkeyfile" `
            -CertSubject "*" -certStore "$env:USERPROFILE\documents\secret.priv"

I can create a hash table with members Server, Force, AuthMode and so on.  Normally when we refer to PowerShell variables we want to work with their values so we prefix the name with the $ sign. Prefixing the name with the @ sign instead turns the members of a hash table into parameters for a command, like this

$master =@{Server=""; Force=$True;
           user="root"; AuthMode="publickey"; CertStoreType="pemkeyfile"; 
           CertSubject="*"; certStore="$env:USERPROFILE\documents\secret.priv"}
connect-ssh @master

For another server I might logon with conventional user name and password so I might have:

$Spartacus =@{Server=""; Force=$True; user="root";}

and use that hash table as the basis of logging on.  If I want to use one of the copy commands I might add two hash tables together – one containing logon information and the other containing the parameters relating to what should be copied – or I might specify these in the normal way in addition to my splatted variable.
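
A minimal sketch of both options follows. Copy-Demo is a made-up stand-in for one of the copy commands (it only reports what it was given), and the server name and file paths are placeholders.

```powershell
# Hypothetical stand-in for a copy command: it just reports its parameters
function Copy-Demo {
    param ($Server, $User, [switch]$Force, $RemoteFile, $LocalFile)
    "Copy $RemoteFile from ${User}@${Server} to $LocalFile (Force=$Force)"
}

$logon = @{ Server = "server1.example"; User = "root"; Force = $true }   # placeholder values
$what  = @{ RemoteFile = "/var/log/messages"; LocalFile = "messages.log" }

# Adding two hash tables gives a new one holding the keys of both
# (the addition fails if the same key appears in both)
$both = $logon + $what
Copy-Demo @both

# Or mix a splatted hash table with parameters given in the normal way
Copy-Demo @logon -RemoteFile "/etc/hosts" -LocalFile "hosts.txt"
```

Keeping the logon details in their own hash table means they can be reused unchanged with every command that talks to that server.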
In my original piece about using the NetCmdlets I showed Invoke-RemoteCommand, Get-RemoteItem, Copy-Remote-Item, Copy-LocalItem and so on: I have modified all of these to take a hash table as a parameter and use splatting. I also showed a set-up function named connect-remote: the hash tables are set up when I load the module, but they don’t contain credentials. Connect-remote now looks at the hash table for the connection I want to make and asks “Is this fully specified to use a certificate? If not, does it have credentials?” If the answer is no to both parts it prompts for credentials and adds them to the hash table. In the snippet below $Server contains the hash table, and $user can be a parameter passed to connect-remote or an entry in the hash table; if it can’t be found in either place it is set to the current user name

if (-not ($server.AuthMode -or $server.credential)) {
    $server["credential"] = $Host.ui.PromptForCredential("Login","Enter Your Details for $ServerName","$user","")
}

A global variable set in connect-remote keeps track of which of the hash tables with SSH settings in should be used as the default for Invoke-RemoteCommand, Get-RemoteItem, Copy-Remote-Item, Copy-LocalItem and so on. But it makes sense to have all the hash tables listed somewhere where they can be accessed by name so I have

$hosts = @{ master =@{Server=""; Force=$True; user="root";
                    AuthMode="publickey"; CertStoreType="pemkeyfile";
                    CertSubject="*"; certStore="$env:USERPROFILE\documents\secret.priv"}
         Spartacus =@{Server=""; Force=$True; user="root";}
           Maximus =@{Server=""; Force=$True; user="joneill";}
          }

In connect-remote, the –server parameter is declared like this 

  $server = $hosts[(select-item $hosts -message "Which server do you want to connect to ? " -returnkey)]

Select-item is a function I wrote ages ago which takes a hash table and offers the user a choice based on the keys in the hash table and returns either the number of the choice (not very helpful for hash tables) or its name.
Function Select-Item {
# The $HashChoices, $default and $returnkey declarations are reconstructed from how the body uses them
param ([parameter(ParameterSetName="p1",Position=0)][String[]]$TextChoices,
       [parameter(ParameterSetName="p2",Position=0)][HashTable]$HashChoices,
       [String]$Caption="Please make a selection",
       [String]$Message="Choices are presented below",
       [int]$default=0,
       [Switch]$returnkey
      )
$choicedesc = New-Object System.Collections.ObjectModel.Collection[System.Management.Automation.Host.ChoiceDescription]
# Build one choice per string, or one per hash table key (using the value as the help text)
switch ($PsCmdlet.ParameterSetName) { 
       "p1" {$TextChoices | ForEach-Object { $choicedesc.Add((
                 New-Object "System.Management.Automation.Host.ChoiceDescription" -ArgumentList $_ )) } }
       "p2" {foreach ($key in $HashChoices.Keys) { $choicedesc.Add((
                 New-Object "System.Management.Automation.Host.ChoiceDescription" -ArgumentList $key,$HashChoices[$key] )) } }
}
If ($returnkey) { $choicedesc[$Host.ui.PromptForChoice($caption, $message, $choicedesc, $default)].label }
else            {             $Host.ui.PromptForChoice($caption, $message, $choicedesc, $default) }
}

December 3, 2011

PowerShell–full of stringy goodness.

Filed under: Powershell — jamesone111 @ 2:28 pm

I think almost everyone who works with PowerShell learns two things in their first few minutes.
(a) Assigning some text to a Variable looks like this $Name =  "James"
(b) When you wrap a string in double quotes it expands variables inside it, for example "Hello $Name" will evaluate to Hello James.

A little later we tend to learn that if we want to put a property of a variable in string things get marginally more complex.
"The name $Name is $name.length characters long" gives the text the name James is James.length characters long.

To get the length of what is stored in the variable named “name” we need to use "The name $Name is $($name.length) characters long". This gives the name James is 5 characters long.

And most of us also learn that PowerShell can have multi-line strings, sometimes called here-strings, which look like this

@"
Here is a string
The string $name made
It hasn’t finished yet
There’s more
OK we’re done
"@

Here-strings are really useful because they don’t close until they reach "@ at the start of a line, they can contain quotes, newlines and if they use double quotes they too will expand variables.

I should say here that I don’t have any detailed knowledge of how PowerShell’s parser actually deals with strings, but it seems pretty clear that when it hits a $ sign it says “OK I need to work out a value here”. I think using $ to mean “value of” was something PowerShell picked up from another language but I can’t be sure which one it was. My inner pedant wants to correct people when they say “PowerShell variable names begin with $”: $foo means the value that is in foo; the name is foo. The way I understand it, when the parser sees $ inside a string it looks to see what comes next: if it is a sequence of letters it assumes they are the name of a variable, whose value it should insert into the string. If it is a “(“ character then it looks for the matching “)” and evaluates the expression inside, like $name.length. Easy. Only this week I found myself saying … these are nothing special … well, we can get much better things than that.

I had to create over a dozen XML files: all identical except each one had a different database table name and a subset of the fields in the table – some had just 2 fields others had more than a dozen, the XML went like this.
   <kind1 name=TableName>
      <field name=field1>
      <field name=field2>
      <field name=field3>
   <xmlgumf />
   <kind2 name=TableName>
      <field name=field1>
      <field name=field2>
      <field name=field3>
      <field name=field3>
   <xmlgumf />

Except the real XML was much more complicated. Using the tool which builds the XML file from scratch takes about an hour per file. The XML files then go into another tool which is where the real magic happens. Building a dozen files would be a couple of days’ work, or with the project I’m working on, a very late night.
I had a document with the table names and the field lists in them, so could I automate the creation of these XML files ?
Outside of a string I’d write something like this
$fields | foreach -begin {$a=""} -process {$a += [environment]::newline + "<Field name=$_ >"} -end {$a}

Would that second line work in a here-string based on an existing XML file? How well would this work ?

$Table = "TableName"
@"
   <kind1 name=$Table>$($fields | foreach -begin {$a=""} -process {$a += [environment]::newline + "      <Field name=$_ >"} -end {$a})
   <xmlgumf />
   <kind2 name=$Table>$($fields | foreach -begin {$a=""} -process {$a += [environment]::newline + "      <Field name=$_ >"} -end {$a})
   <xmlgumf />
"@ | out-file "$table.xml" -encoding ascii

The simple answer is it works like a charm.
Building the first XML file took an hour using the normal tool to do it from scratch, it then took 5 minutes to convert that file into a template – which to my immense delight worked first time. Producing each additional XML file took 2 minutes. Half an hour in all. That’s 10.5 hours I’ve got to do something else with. And showing it to colleagues their reaction was I had performed serious magic. The kind of magic which lets me disappear early.
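
The same idea can be wrapped in a function so each file becomes a one-liner. This is only a sketch: New-TableXml and the sample table and field names are invented for illustration, and the template is the cut-down XML shown earlier rather than the real thing.

```powershell
# Hypothetical wrapper around the here-string template
function New-TableXml {
    param ([string]$Table, [string[]]$Fields)
    # Build the repeated <field> lines once, joined by new lines
    $fieldLines = ($Fields | ForEach-Object { "      <field name=$_>" }) -join [Environment]::NewLine
@"
   <kind1 name=$Table>
$fieldLines
   <xmlgumf />
   <kind2 name=$Table>
$fieldLines
   <xmlgumf />
"@
}

# Made-up table and field names
$xml = New-TableXml -Table 'Customers' -Fields 'Id', 'Name'
$xml
# To save it:  $xml | Out-File "Customers.xml" -Encoding ascii
```

Using -join avoids the -begin/-end accumulator, and returning the string (rather than writing the file inside the function) leaves the caller free to pipe it wherever it is needed.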

October 24, 2011

Maximize the reuse of your PowerShell

Filed under: Powershell — jamesone111 @ 3:18 pm
Last week I was at The Experts Conference in Frankfurt presenting at the PowerShell Deep Dive. My presentation was entitled “Maximize the reuse of your PowerShell”.
My PowerShell library for managing Hyper-V has now gone through a total of 100,000 downloads over all its different versions but whether it’s got wide use because of the way I wrote it or in spite of it, I can’t say.  Some people whose views I value seem to agree with the ideas I put forward in the talk, so I’m setting them out in this (rather long) post.

I have never lost my love of Lego. Like Lego, PowerShell succeeds by being a kit of small, general purpose blocks to link together.  Not everything we do can extend that kit, but we should aim to do that where possible. I rediscovered the Monad Manifesto recently via a piece by Jeffrey Snover about how PowerShell has remained true to the original vision, named Monad. It talks of a model where “Every executable should do a narrow set of functions and complex functions should be composed by pipelining or sequencing executables together”.  Your work is easier to reuse if it becomes a building block that can be incorporated into more complex solutions; and this certainly isn’t unique to PowerShell.

Functions for re-use, scripts for a single task. If you use a .PS1 script to carry out a task it is effectively a batch file: it is automating stuff so that’s good, but it isn’t a building block for other work. Once loaded, functions behave like compiled cmdlets. If you separate parts of a script into functions it should be done with the aim of making those parts easy to reuse and recycle, not simply to reformat a script into subroutines.
If we’re going to write functions how should they look?

Choose names wisely, One task : One Function, there is no “DoStuffWith” verb for a reason.
PowerShell cmdlets use names in the form Verb-Noun and it is best to stick to this model; the Get-Verb command will give you a list of the approved verbs. PowerShell’s enforcement of these standards is limited to raising a warning if you use non-standard names in a module. If your action really isn’t covered by a standard verb, then log the need for another one on Connect; but if you use “Duplicate” when the standard says “Copy”, or “Delete” when the standard says “Remove” it simply makes things more difficult for people who know PowerShell but don’t know your stuff. The same applies to parameter names: try to use the ones people already know. Getting the function name right sets the scope for your work. For my talk I used an example of generating MD5 hashes for files. My assertion was that the MD5 hash belonged to the file, and so the command should return a file with an added MD5 hash – meaning my command should be ADD-MD5.

Output: Use the right Out- and Write- cmdlets. Other speakers at the Deep Dive made the point that output created with Write-Host isn’t truly returned (in the sense that it can be piped or redirected): it is written on the console screen, so only use Write-Host if you want to prevent output going anywhere else. There are other “write on the screen without returning” cmdlets which might be better: for example when debugging we often want messages that say “starting stage 1”, “starting stage 2” and so on. One way to do this would be to add Write-Host statements and remove them after the code is debugged. A better way is to use Write-Verbose, which outputs if you specify a –verbose switch and doesn’t if you don’t. As a side effect you have a quasi-comment in your code which tells you what is happening. The same is true when you use Write-Progress to indicate something is happening in a long running function (remember to run it with the -completed switch when you have finished, otherwise your progress box can remain on screen while something else runs). Write-Debug, Write-Error and Write-Warning are valuable too: I try to prevent errors appearing where the code can recover, and write a warning when something didn’t go to plan in a non-fatal way.
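
A small sketch of the Write-Verbose approach; Get-Square is an invented example function, not something from the talk.

```powershell
function Get-Square {
    [CmdletBinding()]      # this is what enables the common -Verbose switch
    param ([int]$Number)
    Write-Verbose "Starting stage 1: squaring $Number"
    $Number * $Number      # the only thing returned to the pipeline
    Write-Verbose "Finished"
}

$quiet = Get-Square 4            # no messages on screen
$noisy = Get-Square 4 -Verbose   # messages on screen, but the same value is returned
```

Because the verbose messages never enter the pipeline, both variables hold just the number, which is exactly what a Write-Host version could not guarantee if someone later swapped it for plain output.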

Formatting results can be a thorny issue: you can redirect the results of Format-Table or Format-List to a file, but the output is useless if you want to pipe results into another command – which needs the original object.
Some objects can have ugly output: it’s not a crime to have an –asTable switch so output goes through Format-Table at the end of the function or even to produce pretty output by default provided you ensure that it is possible to get the raw object into the pipe with a –noFormat or –Raw switch. But it’s best to create formatting XML and a lot easier with tools like James Brundage’s E-Z-out.

Output with the pipeline in mind. It’s not enough to just avoid write-host and return some sort of object. I argued that a function should return the richest object that is practical – my example used Add-Member to take a file object and give it an additional MD5Hash property. I can do anything with the result that I can do with a file and default formatting rules for a file are used (which is usually good)  but being the right type need not matter if the object has the right property/properties. For example, this line of PowerShell :
  Select-String -Path *.ps1 -Pattern "Select-String"
looks at all the .PS1 files in the current folder and where it finds the text  “Select-String”  it outputs a MatchInfo object with the properties: .Context, .Filename, .IgnoreCase, .Line, .LineNumber, .Matches, .Path and .Pattern. A cmdlet like Copy-Item has no idea about MatchInfo objects, but if an object piped in has a .Path property, it will know what it is being asked to work on. If MatchInfo objects named this property “NameOfMatchingFile” it just would not work.
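
The Add-MD5 function from the talk isn't reproduced in this post, so what follows is only a guess at its general shape: a sketch that takes file objects or path strings, and returns the file objects with an extra MD5Hash note property. The parameter name $File is my assumption.

```powershell
function Add-MD5 {
    param (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        $File      # file objects or path strings (wildcards allowed)
    )
    begin   { $md5 = [System.Security.Cryptography.MD5]::Create() }
    process {
        foreach ($f in $File) {
            # Turn strings into file objects; pass file objects straight through
            $items = if ($f -is [System.IO.FileInfo]) { $f } else { Get-Item -Path $f }
            foreach ($item in $items) {
                if ($item.PSIsContainer) { continue }     # skip folders
                # Reads the whole file into memory: fine for a sketch, not for huge files
                $bytes = [System.IO.File]::ReadAllBytes($item.FullName)
                $hash  = ($md5.ComputeHash($bytes) | ForEach-Object { $_.ToString('x2') }) -join ''
                # Return the file object itself, with one extra property
                $item | Add-Member -MemberType NoteProperty -Name MD5Hash -Value $hash -PassThru -Force
            }
        }
    }
    end     { $md5.Clear() }
}

# Usage: the result is still a file, so anything that works on files still works
# Get-ChildItem *.ps1 | Add-MD5 | Select-Object Name, Length, MD5Hash
```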

Think about pipelined input. All it takes to tell PowerShell that a parameter should come from the pipeline is prefixing its declaration with
  [parameter(ValueFromPipeLine= $true)]
If you find that you are using your new function like this
  Get-thing | ForEach {Verb-Thing –thing $_}
It’s telling you that -thing should take pipeline values.

The pipeline can supply multiple items, so a function may need to be split into Begin {}, Process {} and End {} blocks. (If you don’t specify these the whole function body is treated as an end block and only the last item passed is processed). Eventually the realization dawns that if the example above works, it should be possible to have two lines:
  $t = Get-thing
  Verb-Thing –thing $t
So parameters need to be able to handle arrays – something I’ll come back to further down.
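
A minimal sketch of a function that copes with both styles of call; Get-Doubled is an invented example.

```powershell
function Get-Doubled {
    param (
        [Parameter(ValueFromPipeline = $true)]
        [int[]]$Number
    )
    process {
        # The process block runs once per pipeline item;
        # the foreach copes with an array passed as a parameter
        foreach ($n in $Number) { $n * 2 }
    }
}

$fromPipe  = 1, 2, 3 | Get-Doubled
$fromParam = Get-Doubled -Number 1, 2, 3
```

Both calls return the same three values. Without the process block the piped version would only see the last item; without the foreach the parameter version would need the caller to loop.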

You can do the same thing that Copy-Item was shown doing above: I have another function which is a wrapper for Select-String, its parameters include:
  [parameter(ValueFromPipelineByPropertyName = $true)][Alias('Fullname','Path')]

If the function gets passed a string via the pipeline it is the value for the -pattern parameter. If it gets an object containing a property named “Include”, “Fullname” or “path” that property becomes the value for the –include parameter.
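
Binding by property name can be tried out with a cut-down, made-up function. Get-NameLength below only has the -Include parameter (the real wrapper has more) and the piped object is built by hand to stand in for a MatchInfo or file object.

```powershell
function Get-NameLength {
    param (
        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [Alias('FullName', 'Path')]
        [string]$Include
    )
    process { "$Include is $($Include.Length) characters" }
}

# Any object with a property called Include, FullName or Path will do
$item = New-Object psobject -Property @{ Path = 'C:\temp\a.txt' }
$item | Get-NameLength
```

The function never needs to know what type the object is; it only cares that one of the expected property names is present.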

Sometimes a function needs to output to a destination based on input: so you can check to see if a parameter is a script block and evaluate it if it is. 
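
One way this might look, as a sketch: Get-Destination and its parameter names are invented, and the function only returns the destination it worked out rather than doing anything with it.

```powershell
function Get-Destination {
    param (
        [Parameter(ValueFromPipeline = $true)]$InputObject,
        $Destination      # either a fixed value or a script block which uses $_
    )
    process {
        if ($Destination -is [scriptblock]) {
            # Evaluate the block with $_ set to the current item
            $InputObject | ForEach-Object -Process $Destination
        }
        else { $Destination }
    }
}

'a.txt', 'b.txt' | Get-Destination -Destination { "backup\$_" }   # one destination per item
'a.txt', 'b.txt' | Get-Destination -Destination 'backup'          # same destination for all
```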

Don’t require users to know syntax for things inside your function. If you are going to write code to do one job and never reuse it then you don’t need to be flexible. If the code is to be reused, you need to do a little extra work so users don’t have to. For example: if something you use needs a fully qualified path to a file, then the function should use Resolve-Path to avoid an error when the user supplies the name of a file in the current directory. Resolve-Path is quite content to resolve multiple items so replacing   
  Do_stuff_with $path
with
  Resolve-Path $path | ForEach-Object {Do_stuff_with $_ }
delivers, at a stroke, support for wildcards, multiple items passed in $path and relative names.
Another example is with WMI, where syntax is based on SQL so the wildcard is “%”, not “*”. Users will assume the wildcard is *. In this case do you say:
(a) Users should learn to use “%”   
(b) My  function should include   $parameter = $parameter -Replace "\*","%"   (-Replace takes a regular expression, so the * needs escaping)
For my demo I showed a function I have named “Get-IndexedItem” which finds things using the Windows index. I wanted to find my best underwater photos- they are tagged “portfolio” and are the only photos I shoot with a Canon camera. My function lets me type
  Get-IndexedItem cameramaker=can*,tag=portfolio
Inside the function the search system needs a where condition of
  "System.Photo.Cameramanufacturer LIKE 'can%'  AND System.Keywords = 'portfolio'"
Some tools would require me to type the filtering condition in this form, but I don’t want to remember what prefixes are needed and whether the names are “camera Maker” or “camera Manufacturer” and “keyword”, “keywords” or “tag”. Half the time I’ll forget to wrap things in single quotes, or use “*” as a wildcard because I forgot this was SQL. And if I have multiple search terms why shouldn’t they be a list, not “A and b and C” (there is a write-up coming for how I did this processing).
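
Until that write-up appears, here is a rough sketch of the kind of translation involved. It is not the code inside Get-IndexedItem: ConvertTo-WhereClause and its two-entry alias table are invented for illustration.

```powershell
function ConvertTo-WhereClause {
    param ([string[]]$Term)
    # Hypothetical alias table: friendly name -> full property name
    $alias = @{ cameramaker = 'System.Photo.CameraManufacturer'
                tag         = 'System.Keywords' }
    $conditions = foreach ($t in $Term) {
        $name, $value = $t -split '=', 2                  # "tag=portfolio" -> tag, portfolio
        if ($alias.ContainsKey($name)) { $name = $alias[$name] }
        # A * in the value means a wildcard search, which SQL writes as LIKE with %
        if ($value -match '\*') { "$name LIKE '" + ($value -replace '\*', '%') + "'" }
        else                   { "$name = '$value'" }
    }
    # A list of terms becomes conditions joined with AND
    $conditions -join ' AND '
}

ConvertTo-WhereClause 'cameramaker=can*', 'tag=portfolio'
```

A real version would also need to escape single quotes inside values (an O’Neill would break this one).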

Set sensible defaults.  The talk before mine highlighted some examples of “bad” coding, and showed a function which accepted a computer name. Every time the user runs the command they must specify the computer. Is it valid to assume that most of the time the command will run against the current computer? If so the parameter should default to that. If the tool deals with files is it valid to assume “all files in the current directory” – if the command is delete, probably not, if it displays some aspect of the files, it probably can.
Constants could be parameter defaults. With the computer name example you might write a function which only ran against the current computer. Obviously it is more useful if the computer name is a parameter (with a default) not a constant. In a surprising number of cases, something you create to do a specific task can carry out a generic task if you change a fixed value in your code into a parameter which defaults to the former fixed value.
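
As a sketch (Get-TargetName is a made-up function which only reports what it would do):

```powershell
function Get-TargetName {
    param (
        # Defaults to the local machine, so the common case needs no typing
        [string[]]$ComputerName = ([System.Environment]::MachineName)
    )
    foreach ($c in $ComputerName) { "Would query $c" }
}

Get-TargetName                              # runs against the local machine
Get-TargetName -ComputerName srv1, srv2     # placeholder names; an explicit list still works
```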

Be flexible about parameter types and other validation. This is a variation on not making the user understand the internals of your function and I have talked about it before, in particular the dangers of people who are trained systems programmers applying the techniques in PowerShell they would use in something like C#: in those languages a function declaration might look like:
  single Circumference(single radius) {}
which says Circumference takes a parameter which must have been declared to be a single precision real number and returns a single precision real number. Writing
   c = Circumference("Hello");
or  c = Circumference("42");
will cause the compiler to give a “type mismatch” error – “Hello” and “42” are strings, not single precision real numbers . Similarly
System.io.fileinfo f = Circumference(42);
Is a type mismatch: f is a fileInfo object and we can’t assign a number to it. The compiler picks these things up before the program is ever run, so  users don’t see run-time errors.
PowerShell isn’t compiled and its function declarations don’t include a return type: Get-Item, for example, deals with the file system, certificate stores, the registry etc. so it can return more than a dozen different types: there is no way to know in advance what type of item Get-Item $y will return. If the result is stored in a variable (with  $x = Get-item $y )  the type of the variable isn’t specified, but defined at runtime.
Trying to translate that declaration into PowerShell gives something like this.
  Function Get-Circumference {
     param ([single]$radius)
     $radius * 2 * [system.math]::PI
  }

  Get-Circumference "Hello"
produces an error, but a closer inspection shows it is not a type mismatch: it says
Cannot process argument transformation on parameter 'radius'. Cannot convert value "hello" to type "System.Single".
The [single] says “always try to cast $radius as a single”.  Which means
  Get-Circumference "42"
Will be cast from a string to a single and the function returns 263.89

So you might expect 
  [System.IO.FileInfo]$f = Get-Circumference 42
To throw an error, but it doesn’t, PowerShell casts 263.893782901543 to an object representing a file with that name in \windows\system32. The file doesn’t exist and is read-only!  So it can be better to resolve types in code.
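
One way of resolving the type in code, as a sketch. Get-CircumferenceSafe is an invented variant of the function above, and using -as with a warning is just one possible policy.

```powershell
function Get-CircumferenceSafe {
    [CmdletBinding()]
    param ($Radius)     # deliberately untyped: the body decides what is acceptable
    # -as returns $null instead of throwing when the conversion fails
    $r = $Radius -as [double]
    if ($null -eq $r) {
        Write-Warning "'$Radius' could not be converted to a number"
        return
    }
    $r * 2 * [math]::PI
}

Get-CircumferenceSafe 42
Get-CircumferenceSafe "42"       # strings which look like numbers are still accepted
Get-CircumferenceSafe "Hello"    # warns and returns nothing
```

The user gets a message written in the function’s own terms rather than an argument transformation error, and the function stays in control of what happens next.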

I’d go further than that. Some validation is redundant because the parameter will be passed to a cmdlet which is more flexible than the function writer thought, in other cases parameter validation is there to cover up a programmer’s laziness . When I see a web site which demands that I don’t enter spaces in my credit card number I think “Lazy. It’s easy to strip spaces and dashes”. Having “O’Neill” for a surname means that the work of slovenly developers who don’t check their SQL inputs or demand only letters gets drawn to my attention too. If the user is forced to use the equivalent of
  Stop-Process (get-process Calc)
they will think “Lazy. you  couldn’t be bothered even to provide a –name parameter”.  Stop-process does just that to cope with process objects, names and IDs (and notice it allows you to pass more than one ID)  for example:
  stop-process [-id ] 4472,5200,5224
  $p = get-process calc ; Stop-Process -InputObject $p 
  stop-process -name calc

Other cmdlets are able to resolve types without using different parameters for example
  ren '.\100 Meter Event.txt' "100 Metre Event.txt"
  $f = get-item '.\100 Metre Event.txt' ; ren $f '100 Meter Event.txt'
In one case the first item is a string and the other is a file object: from a user’s point of view this is better than stop-process which in turn is much better than having to get an object representing the process you want to stop. In my talk I used Set-VMMemory from my Hyper-V module on Codeplex, which has:
        [parameter(ValueFromPipeLine = $true)]
        $VM ,
        [long]$Memory = 0 ,
        $Server = "."

There isn’t a way for me to work out what a user means if they specify a memory size which isn’t a number (and can’t be cast to one). If the user specifies many Virtual Machines, catching something during parameter validation will produce one error instead of one per VM (so this would be the place to trap negative numbers too).
I have a –server parameter and it makes sense to assume the local machine – but I can’t make an assumption about which VM(s) the user might want to modify. The VM can come from the pipeline, and it might be an object which represents a VM, or a string with a VM name (possibly a wildcard) or an array of VM objects and/or Names. If the user says
  Set-VMMemory –memory 1GB –vm london* –server h1,h2
the function, not the user, translates that into the necessary objects. If this doesn’t match any VMs I don’t want things to break. (Though I might produce a warning). It takes 4 lines to support all of this
  if ($VM -is [String]) { $VM = Get-VM -Name $VM -Server $Server}
  if ($VM.count -gt 1 )  {[Void]$PSBoundParameters.Remove("VM")
                           $VM | ForEach-object {Set-VMMemory -VM $_ @PSBoundParameters}}
  if ($vm.__CLASS -eq 'Msvm_ComputerSystem') {
      # do the main part of the code
  }
Incidentally I use the __CLASS property because it works with remoting when other methods for checking a WMI type failed.
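The same shape can be tried without Hyper-V. What follows is only a sketch under stated assumptions: Get-ItemLength is a hypothetical function (it is not in any module), and it uses files in place of VMs. It accepts a path string (wildcards allowed), a file object, or an array mixing the two, and warns rather than breaks when nothing matches.

```powershell
# Hypothetical illustration: Get-ItemLength is not a real cmdlet.
function Get-ItemLength {
    param (
        [parameter(ValueFromPipeline = $true, Mandatory = $true)]
        $Item
    )
    process {
        # A string is treated as a path (wildcards allowed) and resolved to file objects.
        if ($Item -is [String]) { $Item = Get-Item -Path $Item -ErrorAction SilentlyContinue }

        if (@($Item).Count -gt 1) {
            # Several items: call ourselves once per item, passing on any other bound parameters.
            [Void]$PSBoundParameters.Remove("Item")
            $Item | ForEach-Object { Get-ItemLength -Item $_ @PSBoundParameters }
        }
        elseif ($Item -is [System.IO.FileInfo]) {
            # Exactly one file object: this is where the main part of the code would go.
            $Item.Length
        }
        elseif ($null -eq $Item) {
            # Nothing matched: warn, but don't break.
            Write-Warning "Nothing matched."
        }
    }
}
```

With this in place `Get-ItemLength *.txt`, `Get-Item .\a.txt | Get-ItemLength` and `Get-ItemLength (Get-Item .\a.txt), '.\b.txt'` all go through the same few lines.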

Allow parts to be switched off. In the talk I explained I have a script which applies new builds of software to a remote host. It does a backup and creates a roll back script. Sometimes I have to roll back, produce another build and then apply that. Since I have an up to date backup I don’t need to run the backup a second time, so that part of the code runs unless I specify a –nobackup switch.
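A switch like that costs almost nothing to add. A minimal sketch, assuming a hypothetical Install-Build function which only emits strings where the real script would do the work:

```powershell
# Hypothetical illustration: Install-Build and its messages stand in for the real deployment script.
function Install-Build {
    param (
        [parameter(Mandatory = $true)]
        [string]$BuildName,
        # When present, skip the backup step (for example because a backup was just taken).
        [switch]$NoBackup
    )
    if (-not $NoBackup) { "Backing up before applying $BuildName" }
    "Applying $BuildName"
}
```

`Install-Build -BuildName Build42` runs both steps; `Install-Build -BuildName Build43 -NoBackup` runs only the second.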

Support –whatif or restore data. You choose. A function can use $pscmdlet.shouldProcess – which returns true if the function should proceed into a danger zone and false if it shouldn’t. It only takes one line before declaring the parameters at the start of a function to enable this
  [CmdletBinding(SupportsShouldProcess=$true, ConfirmImpact='High' )]
There are 4 confirmation levels, “None”, “Low”, “Medium” and “High”, for both the ConfirmImpact value and the $ConfirmPreference preference variable. If either is set to “None” no confirmation appears. If the impact level is higher than or equal to the preference setting, this line
  if ($pscmdlet.shouldProcess($file.name,"Delete file")) {Remove-item $file}
will delete the file or not depending on the user’s response to a prompt – in this case “Performing operation "Delete file" on Target "filename"”. Confirmation can be forced on by specifying –confirm. Specifying –whatif will echo the message but return false. If no confirmation is needed, –verbose will echo the message but return true.
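Put together, a complete function looks something like this. It is a sketch only: Remove-OldLog is a made-up name, and I’ve assumed an impact of 'Medium' so that it does not prompt under the default preference setting of “High”.

```powershell
# Hypothetical illustration: Remove-OldLog is not a real cmdlet.
function Remove-OldLog {
    [CmdletBinding(SupportsShouldProcess = $true, ConfirmImpact = 'Medium')]
    param (
        [parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$Path
    )
    process {
        # ShouldProcess returns $false under -WhatIf, or when the user answers "No" to a prompt.
        if ($PSCmdlet.ShouldProcess($Path, "Delete file")) {
            Remove-Item -Path $Path
        }
    }
}
```

`Remove-OldLog .\old.log -WhatIf` reports what would happen and leaves the file alone, `Remove-OldLog .\old.log -Confirm` always asks first, and `Remove-OldLog .\old.log -Verbose` describes the operation as it carries it out.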

Prepare your code for other people, including you. I’m not the person I was a year ago. You’re not either, and we’ll be different people in 3 or 6 months. So even if you are not publishing your work you are writing for someone else. You, in 3 months’ time. You, under pressure at 3 in the morning. And that you will curse the you of today if you leave your code in a mess.

There are many ways to format your code: find a style of indenting that works for you; if it does, the chances that anyone else will like it are about the same as the chances they will hate it. Some people like to play “PowerShell Golf” where fewest [key]strokes wins – this is fine for the command prompt. Expanding things to make them easier to read is generally good. That’s not an absolute ban on aliases – using Sort and Where instead of Sort-Object and Where-Object may help readability – the key is to break up very long lines, but without creating so many short lines that you are constantly scrolling up and down.

Everyone says “Put in brief comments”, but I’d refine that: explain WHY not WHAT. For example, I only decided to use Set-VMMemory as an example in my talk at the last moment. It has a line
  if (-not ($memory % 2mb))  {$memory /= 1mb}

I can’t have looked at this code in 8 months. Why would I do that? Fortunately I’ve put in a WHY comment – the WHAT is self explanatory
# The API takes the amount of memory in MB, in multiples of 2MB.
# Assume that anything less than 2097152 is already in MB (we aren't assigning 2TB to the VM). If user enters 1024MB or 0.5GB divide by 1MB 

Instead of putting comments next to each parameter saying what it does, move the comments into comment based help, take a couple of places where you have used / tested the command and put them into the comment based help as examples. Your future self will thank you for it.
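As an illustration of both points, here is that memory line wrapped in a small function with comment based help. The function name ConvertTo-MemoryMB is invented for this sketch (the real code sits inside Set-VMMemory); the help keywords are the standard ones.

```powershell
# Hypothetical illustration: ConvertTo-MemoryMB is an invented wrapper around the line quoted above.
function ConvertTo-MemoryMB {
<#
.SYNOPSIS
    Converts a memory size given in bytes or in MB to a value in MB.
.PARAMETER Memory
    The amount of memory. Multiples of 2MB (such as 1GB or 512MB) are treated as bytes;
    anything else (such as 1024) is assumed to be in MB already.
.EXAMPLE
    ConvertTo-MemoryMB -Memory 1GB
    Returns 1024.
.EXAMPLE
    ConvertTo-MemoryMB -Memory 1024
    Returns 1024, because the value is taken to be in MB already.
#>
    param ([long]$Memory = 0)
    # WHY: the API takes MB, in multiples of 2MB. A byte count is a multiple of 2097152;
    # a small number such as 1024 is not, so it must already be in MB.
    if (-not ($Memory % 2mb)) { $Memory /= 1mb }
    $Memory
}
```

After that `Get-Help ConvertTo-MemoryMB -Examples` shows the tested cases, which is more than a comment beside the parameter would ever do.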

October 17, 2011

How to be Creative with QR Codes

Filed under: Uncategorized — jamesone111 @ 1:15 pm

I’ve been playing with QR codes recently, and have started to use them. HELLO if you’ve been at The Experts Conference in Frankfurt and scanned one from one of my sessions: the files from these are on My Skydrive.

I looked at a number of different code libraries which would build a QR code for me but the easiest way turned out to be to use a web service, provided by Google.  And I wrapped this up in a little PowerShell

[System.Reflection.Assembly]::LoadWithPartialName("System.Web") | Out-Null
Function get-qrcode {
param ([parameter(ValueFromPipeLine= $true, mandatory=$true)]
       $Text ,
       $path = (join-path $pwd "QRCode.PNG")
      )
  # Keep one web client in a global variable and reuse it on later calls
  if ($WebClient -eq $null) {$Global:WebClient=new-object System.Net.WebClient }
  $WebClient.DownloadFile(("http://chart.apis.google.com/chart?cht=qr&chs=547x547&chld=H|0&chl=" + 
                           [System.Web.HttpUtility]::UrlEncode($Text)) , $path)
  Start-Process $path
}

So it takes two parameters, a block of text and a path where the file should be saved. The text must be specified, but there is a default for the path.

To save having to create lots of web client objects, I keep a web client object in a global variable; then one line of PowerShell gets a PNG file from the Google service and saves it to the path. Finally I launch the file in the default viewer.
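The only fiddly part is building the request URL, because the text has to be URL-encoded before it goes into the query string. A small sketch which does just that and makes no web request: Get-QRCodeUri is a made-up helper, and it swaps in [uri]::EscapeDataString for the System.Web call so that no extra assembly needs loading. Whether the chart service still answers is not something this checks.

```powershell
# Hypothetical illustration: Get-QRCodeUri only builds the URL string, it does not download anything.
function Get-QRCodeUri {
    param (
        [parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$Text,
        # Width and height of the image in pixels (547 is the value used in the function above).
        [int]$Size = 547
    )
    process {
        # chld=H|0 asks for the highest error-correction level and no margin, as in the call above.
        "http://chart.apis.google.com/chart?cht=qr&chs=${Size}x${Size}&chld=H|0&chl=" +
            [uri]::EscapeDataString($Text)
    }
}
```

For example `Get-QRCodeUri 'a b&c'` gives a URL ending in `chl=a%20b%26c`, so the ampersand in the text can’t be mistaken for a separator between query parameters.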

The next step is to use some image tools to pretty up the QR code. I still use an ancient version of PaintShop pro and that does the job here nicely.


On the left is the original QR code, in the middle I have applied some gaussian blur to the image and on the right I have reduced this to two colours, 100% black and 100% white. This smoothes off the edges. Then I take the image back to 16 million colours and add some colour.



One of the really nice things about QR codes is that they have error correction built in (and in my call to the web service I specify the maximum amount). This means we can put something which isn’t part of the code into the picture. I’ve used the Frankfurt skyline from the slide template here, but there is scope for creativity.

Of course there is nothing which says the data in the code must be a URL. Here’s a message for anyone who has got a reader. The file name and the Ferrari might be a clue to what it says.


September 11, 2011

Adding $Clipboard automatic variable support to PowerShell

Filed under: Powershell — jamesone111 @ 5:08 pm

Some of the bits and pieces I have been working on recently have needed PowerShell to take input from the windows clipboard, and as with so many things in PowerShell there is more than one way this can be done.

  • If it is something short paste into a command line –wrapping in quotes or @” “@ as required
  • Paste into a new edit window in the PowerShell ISE and wrap it with $variableName = “” (or @” “@ as required) , then run the contents of the Window
  • Use  [windows.clipboard]::GetText() in a command line
  • Create a Get-Clipboard Function which wraps  [windows.clipboard]::GetText()
  • Create a $Clipboard “automatic variable”
  • Use the Paste.Exe which Joel has on his site.

Note (a) that if you want to use  [windows.clipboard]::GetText() in the “Shell” version of PowerShell you need to start PowerShell.exe with the –STA switch and run the command   Add-Type -AssemblyName (“Presentationcore”) – I’m actually only interested in doing this in the ISE.
and (b) if you want to send PowerShell’s output to the clipboard the command line Clip.exe has been in Windows for several versions. It’s getting text from the clipboard which needs more work.

There’s very little to choose between having Get-Clipboard and $clipboard but having just read Robert’s piece on Tied Variables I thought I would go down that path.

Robert explained that you can set a breakpoint on a PowerShell variable to run a scriptblock when the variable is read, so he got to thinking “I can run something which changes the value of the variable, when it is read”.

In my profile I have

$Global:MyBreakPoint = $Global:Clipboard = Set-PSBreakpoint -Variable Clipboard -Mode Read -Action {
Set-Variable clipboard ([Windows.clipboard]::GetText()) -Option ReadOnly, AllScope -Scope Global -Force}

So $clipboard is set to the breakpoint which is created by monitoring read action on itself. At each subsequent read the script-block is called and uses set-variable to make $clipboard a ReadOnly variable holding the value of the clipboard at that moment.

I modified my Get-SQL command to take advantage of this. I added a –Paste switch (I chose –paste over –clipboard simply because I already had –connection, so –c would be ambiguous; –paste allows me to run sql –p, taking advantage of SQL being an automatic alias for Get-SQL) then

if ($Paste -and (test-path 'Variable:\Clipboard'))
    {$sql = $Clipboard -replace ".*(?=select|update|delete)","" -replace "[\r\n]+","" }

This trims out anything before “Select”, “Update” or “Delete” – as I am usually copying a statement from a log with a time stamp at the start – and it also removes any carriage return/line feed characters.
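The clean-up can be tried on its own, away from the clipboard. This is a variation rather than the code from Get-SQL, and ConvertTo-SqlStatement is an invented name. It differs in two deliberate ways: the first pattern is anchored and lazy (`^.*?`) so that only the text before the first keyword goes, which matters if the statement contains a sub-select, and line breaks become a single space so that words on adjacent lines don’t run together.

```powershell
# Hypothetical illustration: ConvertTo-SqlStatement is an invented helper, not part of Get-SQL.
function ConvertTo-SqlStatement {
    param (
        [parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$Text
    )
    process {
        # -replace is case-insensitive, so SELECT, Select and select are all found.
        # 1st replace: drop everything in front of the first select/update/delete.
        # 2nd replace: turn each run of CR/LF characters into one space.
        $Text -replace "^.*?(?=select|update|delete)", "" -replace "[\r\n]+", " "
    }
}
```

Given the two-line text `2011-09-11 17:08:01 DEBUG select a` / `from t where b in (select c from u)` it returns `select a from t where b in (select c from u)`.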

WHAT ? A bug in the PowerShell ISE

This was all working brilliantly until I went to use a function which uses the PowerShell ISE object model to manipulate text in an edit window: it fails – the error implies that PowerShell’s object model thinks a debugger is running if a break point exists, even though trying to edit the script files interactively is not a problem.

To get around this I set TWO variables to point to the break point, one was $clipboard, which will be overwritten as soon as it is used, and the other is $myBreakPoint. In my edit function I now start with

if ($MyBreakPoint) {Remove-PSBreakpoint $MyBreakPoint }

Removing the breakpoint doesn’t remove the variable $myBreakpoint so on the way out of the function I use it to decide if I need to reinstate it.

September 2, 2011

Round-tripping files for editing

Filed under: Powershell — jamesone111 @ 9:20 pm

Watching a quiz the other day the host said of a book “I plan to start [it] the moment my Doctor tells me I have 2000 years to live”. It was a great spin on “life’s too short to…”. In my case life is too short to re-learn how to use vi. I learned how to use it on Wyse terminals at university in the mid 1980s, and I didn’t rate it then. If Wikipedia has its facts right vi dates back to 1976; Steve Jobs and Steve Wozniak had yet to launch the Apple II, IBM hadn’t even thought of making a PC, and the supercomputers in the best equipped research facilities had less power than today’s smartphones. In computer terms dinosaurs were still roaming the earth, which must make vi the coelacanth of software: ugly as hell and somehow immune to the forces which should have rendered it extinct.

I’m working with a Linux server which I can only access over an SSH session. Since configuring the server means editing text files some thought has gone into how to avoid vi. We’ve standardized on SSH explorer in the office – it can round-trip files to the PC for editing. I’ve recently been using the netcmdlets to move files around and I’ve also been looking at trapping file-changed events, so I found myself asking “How difficult would it be to transfer a file to a temporary folder, load it into the editor of my choice and push it back to where it came from when it was saved?”. The answer turns out to be “about this difficult …”

Function Edit-RemoteItem {
param ([Parameter(ValueFromPipeLine=$true, ValueFromPipelineByPropertyName=$true, mandatory=$true)]
       $path ,
       $Connection = $Global:ssh
      )
       # Work in a temporary local copy named after the last part of the remote path
       $localpath     =  ( join-path $env:temp ($path -split "/")[-1])
       $MessageDataHT = @{   Server=$connection.Server;        RemoteFile=$path; 
                         Credential=$connection.Credential;      LocalFile=$localpath; 
                              force=$true;                       Overwrite=$true}
       Get-SFTP @MessageDataHT | out-null
       if ( $Host.Name -match "\sISE\s" )
            { $null = $psISE.CurrentPowerShellTab.Files.Add($localpath) }
       else { notepad.exe $localpath }
       $Global:watcher = Register-FileSystemWatcher -MessageData $MessageDataHT -Path $localpath `
                       -On "Changed" -Process {
                          # Don't raise new events while the upload is in progress
                          $event.sender.enableRaisingEvents = $false 
                          $oldpp = $ProgressPreference
                          $ProgressPreference = "SilentlyContinue"   # hide the progress bar during the upload
                          if ((Send-SFTP -Server     $event.MessageData.server `
                                         -LocalFile  $event.MessageData.LocalFile `
                                         -RemoteFile $event.MessageData.RemoteFile `
                                         -Credential $event.MessageData.Credential `
                                         -Overwrite -Force ).size) {
                               Write-host "Updated $($event.MessageData.RemoteFile) on $($event.MessageData.server)"
                          }
                          $ProgressPreference = $oldpp 
                          $event.sender.enableRaisingEvents = $True
                       }
}
To run through the function quickly; it takes a remote path and a netcmdlets SSH connection object.  It builds a local path and a hash table with parameters in and calls Get-SFTP by splatting – turning each key/value pair in the hash table into a parameter/value pair.
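For anyone who hasn’t met splatting, here it is in isolation. Get-StubFile is a stand-in I’ve made up so that the example runs without the netcmdlets; it only reports the parameters it was given. The keys of the hash table are matched to parameter names, and a value of $true switches a switch parameter on.

```powershell
# Hypothetical illustration: Get-StubFile stands in for a real transfer command such as Get-SFTP.
function Get-StubFile {
    param ($Server, $RemoteFile, $LocalFile, [switch]$Force, [switch]$Overwrite)
    "{0}:{1} -> {2} (Force={3}, Overwrite={4})" -f $Server, $RemoteFile, $LocalFile,
                                                   $Force.IsPresent, $Overwrite.IsPresent
}

# Each key/value pair becomes a parameter/value pair when the table is splatted with @ instead of $.
$transfer = @{ Server    = "linux01";   RemoteFile = "/etc/hosts";
               LocalFile = "hosts.tmp"; Force      = $true;
               Overwrite = $true }

Get-StubFile @transfer
# → linux01:/etc/hosts -> hosts.tmp (Force=True, Overwrite=True)
```

The call is equivalent to typing `Get-StubFile -Server linux01 -RemoteFile /etc/hosts -LocalFile hosts.tmp -Force -Overwrite`.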

If it is running in the PowerShell ISE it loads the transferred file into PowerShell’s editor, otherwise it loads it into notepad.

Then it sets up a FileSystem watcher using the same code I talked about here. The watcher is passed the same hash table of parameters – unfortunately you can’t use splatting in a process block and there is no option to hide the progress bar, so I temporarily change the $ProgressPreference variable and change it back when I’ve finished. I also make sure that while the upload is happening no new events are raised. In between I call Send-SFTP, inserting parameters from the hash table: simples.
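Register-FileSystemWatcher is my own function, so as a rough sketch of the same idea with only built-in pieces, here is a System.IO.FileSystemWatcher hooked up through Register-ObjectEvent. The file name, the remote path and the source identifier are placeholders I’ve assumed for the example, and Write-Host stands in for the upload.

```powershell
# Watch one file in the temp folder; "example.txt" is a placeholder name.
$folder  = [System.IO.Path]::GetTempPath()
$watcher = New-Object System.IO.FileSystemWatcher $folder, "example.txt"

# Anything the handler needs travels in -MessageData and comes back as $Event.MessageData.
$data = @{ LocalFile  = (Join-Path $folder "example.txt")
           RemoteFile = "/etc/example.conf" }          # placeholder remote path

$job = Register-ObjectEvent -InputObject $watcher -EventName Changed `
                            -SourceIdentifier "ExampleFileChanged" -MessageData $data -Action {
    # Stop further events while this one is being handled, as in the function above.
    $Event.Sender.EnableRaisingEvents = $false
    try     { Write-Host "Would upload $($Event.MessageData.LocalFile) to $($Event.MessageData.RemoteFile)" }
    finally { $Event.Sender.EnableRaisingEvents = $true }
}

# Nothing is reported until the watcher is switched on.
$watcher.EnableRaisingEvents = $true
```

When the editing session is over, `Unregister-Event -SourceIdentifier ExampleFileChanged` followed by `$watcher.Dispose()` tidies up.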
