James O'Neill's Blog

August 9, 2012

Getting to the data in Adobe Lightroom–with or without PowerShell

Filed under: Databases / SQL,Photography,Powershell — jamesone111 @ 7:01 am

Some Adobe software infuriates me (Flash), I don’t like their PDF reader and use Foxit instead, and apps which use Adobe AIR always seem to leak memory. But I love Lightroom. It does things right – like installations – which other Adobe products get wrong. It maintains a “library” of pictures and creates virtual folders of images (“collections”), but it keeps metadata in the image files so data stays with pictures when they are copied somewhere else – something some other programs still get badly wrong. My workflow with Lightroom goes something like this.

  1. If I expect to manipulate the image at all I set the cameras to save in RAW, DNG format not JPG (with my scuba diving camera I use CHDK to get the ability to save in DNG)
  2. Shoot pictures – delete any where the camera was pointing at the floor, lens cap was on, studio flash didn’t fire etc. But otherwise don’t edit in the camera.
  3. Copy everything to the computer – usually I create a folder for a set of pictures and put DNG files into a “RAW” subfolder. I keep full memory cards in filing sleeves meant for 35mm slides.
  4. Using PowerShell I replace the IMG prefix with something which tells me what the pictures are but keeps the camera assigned image number. 
  5. Import Pictures into Lightroom – manipulate them and export to the parent folder of the “RAW” one. Make any prints from inside Lightroom. Delete “dud” images from the Lightroom catalog.
  6. Move dud images out of the RAW folder to their own folder. Backup everything. Twice. [I’ve only recently learnt to export the Lightroom catalog information to keep the manipulations with the files]
  7. Remove RAW images from my hard disk

There is one major pain: how do I know which files I have deleted in Lightroom? I don’t want to delete them from the hard disk; I want to move them later. It turns out Lightroom uses a SQLite database and there is a free Windows ODBC driver for SQLite available for download. With this in place one can create an ODBC data source, point it at a Lightroom catalog and poke about with the data. Want a complete listing of your Lightroom data in Excel? ODBC is the answer. But let me issue these warnings:

  • Lightroom locks the database files exclusively – you can’t use the ODBC driver and Lightroom at the same time. If something else is holding the files open, Lightroom won’t start.
  • The ODBC driver can run UPDATE queries to change the data: do I need to say that is dangerous ? Good.
  • There’s no support for this. If it goes wrong, expect Adobe support to say “You did WHAT ?” and start asking about your backups. Don’t come to me either. You can work from a copy of the data if you don’t want to risk having to fall back to one of the backups Lightroom makes automatically

I was interested in 4 sets of data, shown in the following diagrams. Below is image information with the associated metadata and file information. Lightroom stores images in the Adobe_Images table; IPTC and EXIF metadata link to images – their “image” field joins to the “id_local” primary key in images. Images have a “root file” (in the AgLibraryFile table) which links to a library folder (AgLibraryFolder), which is expressed as a path from a root folder (AgLibraryRootFolder table). The link always goes to the “id_local” field. I could get information about the folders imported into the catalog just by querying these last two tables (outlined in red).

image

The SQL to fetch this data looks like this for just the folders
SELECT RootFolder.absolutePath || Folder.pathFromRoot as FullName
FROM   AgLibraryFolder     Folder
JOIN   AgLibraryRootFolder RootFolder ON RootFolder.id_local = Folder.rootFolder
ORDER BY FullName 

I have written the table aliases without AS, because some dialects of SQL don’t accept AS in the FROM part of a SELECT statement. Since I run this in PowerShell I also put in a WHERE clause which inserts a parameter. To get all the metadata the query looks like this
SELECT    rootFolder.absolutePath || folder.pathFromRoot || rootfile.baseName || '.' || rootfile.extension AS fullName, 
          LensRef.value AS Lens,     image.id_global,       colorLabels,                Camera.Value       AS cameraModel,
          fileFormat,                fileHeight,            fileWidth,                  orientation ,
          captureTime,               dateDay,               dateMonth,                  dateYear,
          hasGPS ,                   gpsLatitude,           gpsLongitude,               flashFired,
          focalLength,               isoSpeedRating ,       caption,                    copyright
FROM      AgLibraryIPTC              IPTC
JOIN      Adobe_images               image      ON      image.id_local = IPTC.image
JOIN      AgLibraryFile              rootFile   ON   rootfile.id_local = image.rootFile
JOIN      AgLibraryFolder            folder     ON     folder.id_local = rootfile.folder
JOIN      AgLibraryRootFolder        rootFolder ON rootFolder.id_local = folder.rootFolder
JOIN      AgharvestedExifMetadata    metadata   ON      image.id_local = metadata.image
LEFT JOIN AgInternedExifLens         LensRef    ON    LensRef.id_Local = metadata.lensRef
LEFT JOIN AgInternedExifCameraModel  Camera     ON     Camera.id_local = metadata.cameraModelRef
ORDER BY FullName

Note that since some images don’t have a camera or lens logged, the joins to those tables need to be LEFT joins, not inner joins. Again the version I use in PowerShell has a WHERE clause which inserts a parameter.
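The WHERE clause mentioned above isn't shown, so here is a minimal sketch of how a parameter might be inserted into the folder query. The function name New-LightroomFolderQuery and its -Include parameter are hypothetical (they are not taken from the shared code), and doubling single quotes is only a basic guard, not a full defence against awkward input.

```powershell
# Hypothetical helper: builds the folder query text, optionally filtered by path.
function New-LightroomFolderQuery {
    param([string]$Include)

    $sql = "SELECT RootFolder.absolutePath || Folder.pathFromRoot AS FullName " +
           "FROM   AgLibraryFolder     Folder " +
           "JOIN   AgLibraryRootFolder RootFolder ON RootFolder.id_local = Folder.rootFolder "
    if ($Include) {
        # Double any single quotes so the value cannot end the SQL string literal early.
        $pattern = $Include -replace "'", "''"
        $sql += "WHERE  RootFolder.absolutePath || Folder.pathFromRoot LIKE '%$pattern%' "
    }
    $sql + "ORDER BY FullName"
}

# Show the query text that would be sent for folders containing "dive"
New-LightroomFolderQuery -Include "dive"
```

Building only the text keeps the sketch independent of any ODBC driver; the resulting string would be handed to whatever data adapter is in use.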

OK, so much for file data – the other data I wanted was about collections. The list of collections is in just one table (AgLibraryCollection), so it is very easy to query, but I also wanted to know the images in each collection.

 image

Since one image can be in many collections, and each collection holds many images, AgLibraryCollectionImage is a table which provides a many-to-many relationship. Different tables might be attached to Adobe_images depending on what information one wants about the images in a collection. I’m interested only in mapping files on disk to collections in Lightroom, so I have linked to the file information and I have a query like this.

SELECT   Collection.name AS CollectionName ,
         RootFolder.absolutePath || Folder.pathFromRoot || RootFile.baseName || '.' || RootFile.extension AS FullName
FROM     AgLibraryCollection Collection
JOIN     AgLibraryCollectionimage cimage     ON collection.id_local = cimage.Collection
JOIN     Adobe_images             Image      ON      Image.id_local = cimage.image
JOIN     AgLibraryFile            RootFile   ON   Rootfile.id_local = image.rootFile
JOIN     AgLibraryFolder          Folder     ON     folder.id_local = RootFile.folder
JOIN     AgLibraryRootFolder      RootFolder ON RootFolder.id_local = Folder.rootFolder
ORDER BY CollectionName, FullName

Once I have an ODBC driver (or an OLE DB driver) I have a ready-made PowerShell template for getting data from the data source. So I wrote functions to let me do :
Get-LightRoomItem -ListFolders -include $pwd
To List folders, below the current one, which are in the LightRoom Library
Get-LightRoomItem  -include "dive"
To list files in LightRoom Library where the path contains  "dive" in the folder or filename
Get-LightRoomItem | Group-Object -no -Property "Lens" | sort count | ft -a count,name
To produce a summary of lightroom items by lens used. And
$paths = (Get-LightRoomItem -include "$pwd%dng" | select -ExpandProperty path)  ;   dir *.dng |
           where {$paths -notcontains $_.FullName} | move -Destination scrap -whatif

This stores the paths of Lightroom items in the current folder ending in .DNG in $paths, then gets the files in the current folder and moves those which are not in $paths (i.e. not in Lightroom). Specifying -WhatIf allows the files to be confirmed before being moved.

Get-LightRoomCollection
To list all collections
Get-LightRoomCollectionItem -include musicians | copy -Destination e:\raw\musicians
To copy the original files in the “musicians” collection to another disk
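The "move what is no longer in Lightroom" step can be tried without a catalog at all. This is a rough sketch in which a hand-made list stands in for the output of Get-LightRoomItem; the file names and the scratch folder are made up for the demonstration.

```powershell
# Build a scratch folder with three pretend RAW files (names are made up).
$folder = Join-Path ([System.IO.Path]::GetTempPath()) ("lrdemo_" + [guid]::NewGuid().ToString("N"))
New-Item -ItemType Directory -Path $folder | Out-Null
"IMG_0001.dng", "IMG_0002.dng", "IMG_0003.dng" |
    ForEach-Object { Set-Content -Path (Join-Path $folder $_) -Value "placeholder" }

# Stand-in for the paths the catalog query would return: two files are still in Lightroom.
$paths = @((Join-Path $folder "IMG_0001.dng"), (Join-Path $folder "IMG_0003.dng"))

# Files on disk whose full path is not in the catalog list are the duds.
$duds = @(Get-ChildItem -Path $folder -Filter *.dng |
          Where-Object { $paths -notcontains $_.FullName })

$duds | ForEach-Object { $_.Name }

# Tidy up the scratch folder
Remove-Item -Path $folder -Recurse
```

-notcontains compares whole strings (ignoring case), so this only works when the catalog and the file system spell a path the same way; if they differ, the paths would need normalising first.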

I’ve shared the PowerShell code on Skydrive

July 31, 2012

Rotating pictures from my portfolio on the Windows 7 Logon screen

Filed under: Photography,Powershell — jamesone111 @ 12:15 pm

In the last 3 posts I outlined my Get-IndexedItem function for accessing Windows Search. The more stuff I have on my computers the harder it is to find a way of classifying it so it fits into hierarchical folders: the internet would be unusable without search, and above a certain number of items local stuff is too. Once I got search I started leaving most mail in my Inbox, and Outlook uses search to find what I want; I have one “book” in OneNote with a handful of sections and if I can’t remember where I put something, search comes to the rescue. I take the time to tag photos so that I don’t have to worry too much about finding a folder structure to put them in. So I’ll tag geographically (I only have a few pictures from India – one three-week trip – so India gets one tag, but UK pictures get divided by county, and in counties with many pictures I put something like Berkshire/Reading; various tools will make a hierarchy with Berkshire, then Newbury, Reading etc.). People get tagged by name – “Friends” and “Family” being a prefix to group those – and so on. I use Windows’ star ratings to tag pictures I like – whether I took them or not – and Windows’ “use top rated pictures” option for the desktop background picks those up. I also have a tag of “Portfolio”.

Ages ago I wrote about Customizing the Windows 7 logon screen. So I had the idea “Why not find pictures with the Portfolio tag, and make them logon backgrounds?” Another old post covers PowerShell tools for manipulating images, so I could write a script to do it, and use Windows scheduled tasks to run that script each time I unlocked the computer, so that the next time I went to the logon screen I would have a different picture. That was the genesis of Get-IndexedItem. And I’ve added it, together with New-LogonBackground, to the image module download on the TechNet Script Center.

If you read that old post you’ll see one of the things we depend on is setting a registry key so the function checks that registry key is set and writes a warning if it isn’t:

if ( (Get-ItemProperty HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Authentication\LogonUI\Background).oembackground -ne 1) {
        Write-Warning ("Registry Key OEMBackground under " +
          "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Authentication\LogonUI\Background needs to be set to 1")
        Write-Warning "Run AS ADMINISTRATOR with -SetRegistry to set the key and try again."
}

So if the registry value isn’t set to 1, the function prints a warning which tells the user to run with –SetRegistry. After testing this multiple times – I found changing the Windows theme resets the value – and forgetting to run PowerShell with elevated permissions, I put in a try / catch to pick this up and say “Run Elevated”. Just as a side note here: I always find that when I write try/catch it doesn’t work, and it takes me a moment to remember that catch only works on terminating errors, so the command you want to catch usually needs –ErrorAction Stop.

if ($SetRegistry ) {
  try{ Set-ItemProperty -Name oembackground -Value 1 -ErrorAction Stop `
               -PATH "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Authentication\LogonUI\Background" 

   }
  catch [System.Security.SecurityException]{
     Write-Warning "Permission Denied - you need to run as administrator"
  }
   return
}
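The point about catch and terminating errors is easy to check in isolation. A small sketch, using Get-Item on a path that should not exist rather than the registry, so it needs no elevation; Test-Catch is a made-up helper name.

```powershell
# Hypothetical helper: runs Get-Item with the given error action inside try/catch
# and reports whether the catch block ran.
function Test-Catch {
    param([string]$Path, [string]$Action)
    try {
        Get-Item -Path $Path -ErrorAction $Action 2>$null | Out-Null
        "catch skipped"
    }
    catch { "caught" }
}

# A randomly named path that should not exist, so Get-Item reports an error.
$missing = Join-Path ([System.IO.Path]::GetTempPath()) ("missing_" + [guid]::NewGuid().ToString("N"))

"Continue: " + (Test-Catch -Path $missing -Action Continue)   # non-terminating error
"Stop:     " + (Test-Catch -Path $missing -Action Stop)       # promoted to terminating
```

With Continue the error is written to the error stream (discarded here) and the try block carries on; only with Stop does control jump to catch.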

The function also tests that it can write to the directory where the images are stored, since this doesn’t normally have user access: if it can’t write a file, it tells the user to set the permissions. Instead of using try/catch here I use $? to see if the previous command was successful
Set-content -ErrorAction "Silentlycontinue" -Path "$env:windir\System32\oobe\Info\Backgrounds\testFile.txt" `
              -Value "This file was created to test for write access. It is safe to remove"
if (-not $?) {write-warning "Can't create files in $env:windir\System32\oobe\Info\Backgrounds please set permissions and try again"
              return
}
else         {Remove-Item -Path "$env:windir\System32\oobe\Info\Backgrounds\testFile.txt"}
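The same $? technique can be wrapped up and tried against any folder. This is only a sketch: Test-WriteAccess is a hypothetical name, and it probes with a randomly named file so it won't clash with anything already there.

```powershell
# Hypothetical helper: returns $true if a file can be created in the directory.
function Test-WriteAccess {
    param([string]$Directory)
    $probe = Join-Path $Directory ("writeTest_" + [guid]::NewGuid().ToString("N") + ".txt")
    Set-Content -Path $probe -Value "probe" -ErrorAction SilentlyContinue
    # $? must be read straight away: the next command to run overwrites it.
    if (-not $?) { return $false }
    Remove-Item -Path $probe
    return $true
}

$temp    = [System.IO.Path]::GetTempPath()
$nowhere = Join-Path $temp ("nowhere_" + [guid]::NewGuid().ToString("N"))

Test-WriteAccess -Directory $temp       # a folder the current user can normally write to
Test-WriteAccess -Directory $nowhere    # a folder that does not exist
```

Compared with try/catch this doesn't need the error to be terminating, which is why SilentlyContinue is enough here.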

The next step is to find the size of the monitor. Fortunately, there is a WMI object for that, but not all monitor sizes are supported as bitmap sizes, so the function takes –Width and –Height parameters. If these aren’t specified it gets the values from WMI and allows for a couple of special cases – my testing has not been exhaustive, so other resolutions may need special handling. The width and height determine the filename for the bitmap, and later the function checks the aspect ratio, so it doesn’t try to crop a portrait image to fit a landscape monitor.

if (-not($width -and $height)) {
    $mymonitor = Get-WmiObject Win32_DesktopMonitor -Filter "availability = '3'" | select -First 1
    $width, $height = $mymonitor.ScreenWidth, $mymonitor.ScreenHeight
    if ($width -eq 1366) {$width = 1360}
    if (($width -eq 1920) -and ($height -eq 1080)) {$width,$height = 1360,768}
}
if (@("768x1280" ,"900x1440" ,"960x1280" ,"1024x1280" ,"1280x1024" ,"1024x768" , "1280x960" ,"1600x1200",
      "1440x900" ,"1920x1200" ,"1280x768" ,"1360x768") -notcontains "$($width)x$($height)" )
{
    write-warning "Screen resolution is not one of the defaults. You may need to specify width and height"
}
$MonitorAspect = $Width / $height
$SaveName = "$env:windir\System32\oobe\Info\Backgrounds\Background$($width)x$height.jpg"

The next step is to get the image – Get-Image is part of the PowerShell tools for manipulating images.

$myimage = Get-IndexedItem -path $path -recurse -Filter "Kind=Picture","keywords='$keyword'",
                            "store=File","width >= $width ","height >= $height " |
                      where-object {($_.width -gt $_.height) -eq ($width -gt $height)} |
                      get-random | get-image

Get-IndexedItem looks for files in the folder specified by the –Path parameter, which defaults to [system.environment]::GetFolderPath([system.environment+specialFolder]::MyPictures), the approved way to find the “My Pictures” folder. -Recurse tells it to look in sub-folders, and it looks for files with keywords which match the –Keyword parameter (which defaults to “Portfolio”). It filters out pictures which are smaller than the screen, and then Where-Object filters the list down to those which have the right orientation (landscape or portrait). Finally one image is selected at random and piped into Get-Image.
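The size and orientation tests can be seen working on a few made-up objects in place of real search results. The names and dimensions below are invented for illustration.

```powershell
$width, $height = 1360, 768          # a landscape screen

# Stand-ins for the items the index search would return
$candidates = @(
    [pscustomobject]@{ Name = "wide.jpg";  Width = 4000; Height = 3000 }
    [pscustomobject]@{ Name = "tall.jpg";  Width = 3000; Height = 4000 }
    [pscustomobject]@{ Name = "small.jpg"; Width = 800;  Height = 600  }
)

$usable = @($candidates |
    # big enough to fill the screen without scaling up
    Where-Object { $_.Width -ge $width -and $_.Height -ge $height } |
    # same orientation as the screen: both landscape, or both not
    Where-Object { ($_.Width -gt $_.Height) -eq ($width -gt $height) })

$usable | ForEach-Object { $_.Name }
```

Comparing the two -gt results with -eq means a square image counts as "not landscape", so it would only be offered to a portrait or square screen.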

If this is successful, the function logs what it is doing to the event log. I set up a new log source “PSLogonBackground” in the application log by running PowerShell as administrator and using the command
New-EventLog -Source PSLogonBackground -LogName application
Then my script can use that as a source – since I don’t want to bother the user if the log isn’t configured I use -ErrorAction silentlycontinue here
write-eventlog -logname Application -source PSLogonBackground -eventID 31365 -ErrorAction silentlycontinue `
                -message "Loaded $($myImage.FullName) [ $($myImage.Width) x $($myImage.Height) ]"

image

The next thing the function does is to apply cropping and scaling image filters from the original image module, as required, to get the image to the right size. When it has done that it tries to save the file, by applying a conversion filter and saving the result. The initial JPEG quality is passed as a parameter; if the file is too big, the function loops round reducing the JPEG quality until the file fits into the 250KB limit, and logs the result to the event log.

Set-ImageFilter -filter (Add-ConversionFilter -typeName "JPG" -quality $JPEGQuality -passThru) -image $myimage -Save $saveName
$item = get-item $saveName
while ($item.length -ge 250kb -and ($JPEGQuality -ge 15) ) {
      $JPEGQuality -= 5
      Write-warning "File too big - Setting Quality to $Jpegquality and trying again"
      Set-ImageFilter -filter (Add-ConversionFilter -typeName "JPG" -quality $JPEGQuality -passThru) -image $myimage -Save $saveName
      $item = get-item $saveName
}
if ($item.length -le 250KB) {
         write-eventlog -logname Application -source PSLogonBackground -ErrorAction silentlycontinue `
           -eventID 31366 -message "Saved $($Item.FullName) : size $($Item.length)"
 }
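The shape of that retry loop is easier to see with the image work taken out. In this sketch Get-SimulatedSize is a made-up stand-in for "encode and save, then read the file length"; it simply pretends each quality point costs 4KB.

```powershell
# Hypothetical stand-in for saving the JPEG and measuring the file.
function Get-SimulatedSize { param([int]$Quality) $Quality * 4KB }

$JPEGQuality = 80
$size = Get-SimulatedSize -Quality $JPEGQuality

# Step the quality down by 5 until the result fits, or quality gets too low to bother.
while ($size -ge 250KB -and $JPEGQuality -ge 15) {
    $JPEGQuality -= 5
    $size = Get-SimulatedSize -Quality $JPEGQuality
}

"Quality $JPEGQuality gives $size bytes"
```

Because the lower bound is tested before the quality is reduced, the loop can finish with the quality at 10 and a file that is still too big, which is why the real function checks the size again before logging success.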

image

That’s it. If you download the module, remove the “Internet block” on the zip file, expand the files into \users\YourUserName\Documents\windowsPowerShell\modules, and try running New-LogonBackground (with –Verbose to get extra information if you wish).
If the permissions on the folder have been set and the registry key is set, pressing [Ctrl]+[Alt]+[Del] should reveal a new image. You might want to use a different keyword or a different path, or start by trying a higher JPEG quality, in which case you can run it with parameters as needed.

Then it is a matter of setting up the scheduled task: here are the settings from my scheduler

image

image

image

The program here is the full path to Powershell.exe and the parameters box contains
-noprofile -windowstyle hidden -command "Import-Module Image; New-logonBackground"

Lock, unlock and my background changes. Perfect. It’s a nice talking point and a puzzle – sometimes people like the pictures (although someone said one of a graveyard was morbid) – and sometimes they wonder how the background they can see is not only not the standard one but not the one they saw previously.

June 30, 2012

Using the Windows index to search from PowerShell: Part three. Better function output

Filed under: Powershell — jamesone111 @ 10:53 am

Note: this was originally written for the Hey, Scripting Guy! blog, where it appeared as the 27 June 2012 episode. The code is available for download. I have some more index-related posts coming up, so I wanted to make sure everything was in one place.


In part one, I introduced a function which queries the Windows Index using filter parameters like
  • "Contains(*,’Stingray’)"
  • "System.Keywords = ‘Portfolio’ "
  • "System.Photo.CameraManufacturer LIKE ‘CAN%’ "
  • "System.image.horizontalSize > 1024"

In part two, I showed how these parameters could be simplified to

  • Stingray (A word on its own becomes a contains term)
  • Keyword=Portfolio (Keyword, without the S, is an alias for System.Keywords, and quotes will be added automatically)
  • CameraManufacturer=CAN* (* will become %, and = will become LIKE, quotes will be added and CameraManufacturer will be prefixed with System.Photo)
  • Width > 1024 (Width is an alias for System.image.horizontalsize, and quotes are not added around numbers).

There is one remaining issue. PowerShell is designed so that one command’s output becomes another’s input. This function isn’t going to do much with piped input: I can’t see another command spitting out search terms for this one, nor can I see multiple paths being piped in. But the majority of items found by a search will be files, and so it should be possible to treat them like files, piping them into Copy-Item or whatever.
The following was my first attempt at transforming the data rows into something more helpful

$Provider= "Provider=Search.CollatorDSO; Extended Properties='Application=Windows';"
$adapter = new-object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$ds      = new-object system.data.dataset
if ($adapter.Fill($ds))
    { foreach ($row in $ds.Tables[0])
            {if ($row."System.ItemUrl" -match "^file:")
                  {$obj = New-Object psobject -Property @{
                                Path = (($row."System.ItemUrl" -replace "^file:","") -replace "\/","\")}}
             Else {$obj = New-Object psobject -Property @{Path = $row."System.ItemUrl"}}
             Add-Member -force -Input $obj -Name "ToString" -MemberType "scriptmethod" -Value {$this.path}
             foreach ($prop in (Get-Member -InputObject $row -MemberType property |
                                    where-object {$row."$($_.name)" -isnot [system.dbnull] }))
                  { Add-Member -ErrorAction "SilentlyContinue" -InputObject $obj -MemberType NoteProperty `
                               -Name (($prop.name -split "\." )[-1]) -Value $row."$($prop.name)"
                  }
             foreach ($prop in ($PropertyAliases.Keys |
                                    Where-Object {$row."$($propertyAliases.$_)" -isnot [system.dbnull] }))
                  { Add-Member -ErrorAction "SilentlyContinue" -InputObject $obj -MemberType AliasProperty `
                               -Name $prop -Value (($propertyAliases.$prop -split "\." )[-1])
                  }
             $obj
            }
      }
This is where the function spends most of its time, looping through the data creating a custom object for each row; non-file items are given a path property which holds the System.ItemURL property; for files the ItemUrl is processed into normal format (rather than file:c/users/james format) – in many cases the item can be piped into another command successfully if it just has a Path property.

Then, for each property (database column) in the row a member is added to the custom object with a shortened version of the property name and the value (assuming the column isn’t empty).
Next, alias properties are added using the definitions in $PropertyAliases.
Finally some standard members get added. In this version I’ve pared it down to a single method, because several things expect to be able to get the path for a file by calling its tostring() method.
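The two string manipulations described above (turning an ItemUrl into a path, and shortening a property name) can be pulled out and tried on their own. The function names here are made up for the sketch; the -replace and -split expressions are the ones used in the loop.

```powershell
# Hypothetical wrapper around the ItemUrl handling: file: URLs become Windows paths,
# anything else is passed through untouched.
function Convert-ItemUrlToPath {
    param([string]$ItemUrl)
    if ($ItemUrl -match "^file:") { ($ItemUrl -replace "^file:", "") -replace "\/", "\" }
    else                          { $ItemUrl }
}

# Hypothetical wrapper around the name shortening: keep only the part after the last dot.
function Get-ShortPropertyName {
    param([string]$Name)
    ($Name -split "\.")[-1]
}

Convert-ItemUrlToPath "file:C:/Users/james/Pictures/dive.jpg"   # C:\Users\james\Pictures\dive.jpg
Get-ShortPropertyName "System.Photo.CameraManufacturer"         # CameraManufacturer
```

Shortening names this way can produce the same short name from two different columns, which is presumably why the Add-Member calls use -ErrorAction SilentlyContinue.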

When I had all of this working I tried to get clever. I added aliases for all the properties which normally appear on a System.IO.FileInfo object and even tried fooling PowerShell’s formatting system into treating my file items as a file object, something that only needs one extra line of code
$Obj.psobject.typenames.insert(0, "SYSTEM.IO.FILEINFO")
Pretending a custom object is actually another type seems dangerous, but everything I tried seemed happy provided the right properties were present. The formatting worked except for the "Mode" column. I found the method which calculates .Mode for FILEINFO objects, but it needs a real FILEINFO object. It was easy enough to get one – I had the path and it only needs a call to Get‑Item – but I realized that if I was getting a FILEINFO object anywhere in the process, then it made more sense to add extra properties to that object and dispense with the custom object. I added an extra switch, -NoFiles, to suppress this behaviour.
So the code then became
$Provider ="Provider=Search.CollatorDSO; Extended Properties='Application=Windows';"
$adapter  = new-object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$ds       = new-object system.data.dataset
if ($adapter.Fill($ds))
     { foreach ($row in $ds.Tables[0])
         { if (($row."System.ItemUrl" -match "^file:") -and (-not $NoFiles))
                 {$obj = Get-item -Path (($row."System.ItemUrl" -replace "^file:","") -replace "\/","\")}
            Else {$obj = New-Object psobject -Property @{Path = $row."System.ItemUrl"}
                  Add-Member -force -Input $obj -Name "ToString" -MemberType "scriptmethod" -Value {$this.path}
          }
         ForEach...
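The type-name trick that was tried and then dropped is worth seeing in isolation, because it shows why it felt dangerous: only the list of names changes, not the object. A minimal sketch with a made-up path:

```powershell
# A plain custom object with a Path property (the path is only an example)
$obj = New-Object psobject -Property @{ Path = "C:\temp\example.txt" }

# Put a borrowed type name at the front of the list the formatting system consults
$obj.psobject.TypeNames.Insert(0, "System.IO.FileInfo")

$obj.psobject.TypeNames[0]           # System.IO.FileInfo
$obj -is [System.IO.FileInfo]        # False: the underlying .NET type is unchanged
```

Anything that looks at the type names is fooled, but anything that needs a genuine FileInfo object is not.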
 
The initial code was 36 lines, making the user input more friendly took it to 60 lines, and the output added about another 35 lines, bringing the total to 95.
There were 4 other kinds of output I wanted to produce:

  • Help. I added comment-based-help with plenty of examples. It runs to 75 lines making it the biggest constituent in the finished product.
    In addition I have 50 lines that are comments or blank for readability as insurance against trying to understand what those regular expressions do in a few months’ time – but there are only 100 lines of actual code.
  • A –list switch which lists the long and short names for the fields (including aliases)
  • Support for the –Debug switch – because so many things might go wrong, I have write‑debug $SQL immediately before I carry out the query, and to enable that I have
    [CmdletBinding()] before I declare the parameters.
  • A –Value switch which uses the GROUP ON… OVER… search syntax so I can see what the possible values are in a column.
    GROUP ON queries are unusual because they fill the dataset with TWO tables.
    GROUP ON System.kind OVER ( SELECT STATEMENT) will produce something like this as the first table.

SYSTEM.KIND   Chapter
-----------   -------
communication 0
document      1
email         2
folder        3
link          4
music         5
picture       6
program       7
recordedtv    8

The second table is the normal data suitably sorted. In this case it has all the requested fields grouped by kind plus one named "Chapter", which ties into the first table. I’m not really interested in the second table but the first helps me know if I should enter "Kind=image", "Kind=Photo" or "Kind=Picture"

I have a Select-List function which I use in my configurator and Hyper-V library, and with this I can choose which recorded TV program to watch, first selecting by title, and then if there is more than one episode, by episode.
$t=(Get-IndexedItem -Value "title" -filter "kind=recordedtv" -recurse |
            Select-List -Property title).title
start (Get-IndexedItem -filter "kind=recordedtv","title='$t'" -path |
            Select-List -Property ORIGINALBROADCASTDATE,PROGRAMDESCRIPTION)


In a couple of follow up posts I’ll show some of the places I use Get-IndexedItem. But for now feel free to download the code and experiment with it.

December 10, 2011

PowerShell HashTables: Splatting, nesting, driving selections and generally simplifying life

Filed under: Powershell — jamesone111 @ 11:43 pm

Having spent so much of my life working in Microsoft environments where you typed a password to get into your computer and never had to type it again until the next time you had to login / unlock it (however many servers you worked with in between)  I find working in an office with a bunch of Linux servers but no shared set of accounts to be alternately comical and stupid. 
Some of our services use Active Directory to provide LDAP authentication so at least I retype the same user name and password over and over: but back as far as the LAN Manager client for DOS, Microsoft network clients tried to transparently negotiate connections using credentials they already had. Internet explorer does the same when connecting to intranet servers. Even in a workgroup if you manually duplicate accounts between machines you’ll get connected automatically. I stopped retyping passwords about the same time as I stopped saving work to floppy disk; I’m baffled that people in the Unix/Linux world tolerate it.
Some of our services use a separate LDAP service instead of AD (just to keep me on my toes I’m JamesOne on one, and Joneill on the other) and others again use their own internal accounts database. I might log on to a given box as root, and sign into MySQL as root but the passwords are different. If I go to another box the root password is different again. And so on.
Recently we hit a roadblock because one of the team had set up a server with a root password he couldn’t share (because of the other places he’d used it) and he gave me a key file as a workaround. I’ve talked before about using the NetCmdlets to connect to and run commands on the Linux boxes, and this sent me back to look at how I was using them. I ended up driving them from a hash table of hash tables – something Ed Wilson has recently written about on the “Hey, Scripting Guy!” blog.

I use the hash tables for splatting – the name, if not the concept, is peculiar to PowerShell. If it’s not familiar, let’s say I want to invoke the command

connect-ssh -Server "192.168.42.42" -Force -user "root" -AuthMode "publickey" -CertStoreType "pemkeyfile" `
            -CertSubject "*" -certStore "$env:USERPROFILE\documents\secret.priv"

I can create a hash table with members Server, Force, AuthMode and so on. Normally when we refer to PowerShell variables we want to work with their values, so we prefix the name with the $ sign. Prefixing the name with the @ sign instead turns the members of a hash table into parameters for a command, like this:

$master =@{Server="192.168.42.42"; Force=$True;
           user="root"; AuthMode="publickey"; CertStoreType="pemkeyfile"; 
           CertSubject="*"; certStore="$env:USERPROFILE\documents\secret.priv"}
connect-ssh @master
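Splatting can be tried without the NetCmdlets installed. In this sketch Connect-Demo is a made-up stand-in for connect-ssh which only reports the parameters it was given.

```powershell
# Hypothetical stand-in for connect-ssh: it just echoes what it received.
function Connect-Demo {
    param([string]$Server, [string]$User, [string]$AuthMode = "password", [switch]$Force)
    "Server=$Server User=$User AuthMode=$AuthMode Force=$Force"
}

$settings = @{ Server = "192.168.42.42"; User = "root"; Force = $true }

Connect-Demo @settings                          # each key is matched to a parameter name
Connect-Demo @settings -AuthMode "publickey"    # splatted and ordinary parameters can be mixed
```

A switch parameter such as -Force is splatted by giving its key a value of $true.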

For another server I might logon with conventional user name and password so I might have:

$Spartacus =@{Server="192.168.109.71"; Force=$True; user="root";}

and use that hash table as the basis of logging on.  If I want to use one of the copy commands I might add two hash tables together – one containing logon information and the other containing the parameters relating to what should be copied – or I might specify these in the normal way in addition to my splatted variable.
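Adding two hash tables together before splatting looks roughly like this. Copy-Demo and its parameter names are hypothetical, standing in for one of the copy commands.

```powershell
# Hypothetical stand-in for a copy command: it just describes what it would do.
function Copy-Demo {
    param([string]$Server, [string]$User, [string]$RemoteFile, [string]$LocalFile)
    "Copy $RemoteFile from $Server as $User to $LocalFile"
}

$logon = @{ Server = "192.168.109.71"; User = "root" }
$copy  = @{ RemoteFile = "/var/log/messages"; LocalFile = "messages.log" }

$both = $logon + $copy      # a new hash table holding the keys of both
Copy-Demo @both
```

The + operator raises an error if the same key appears in both tables, so the logon table and the per-command table need to keep to their own parameter names.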
In my original piece about using the NetCmdlets I showed Invoke-RemoteCommand, Get-RemoteItem, Copy-Remote-Item, Copy-LocalItem and so on: I have modified all of these to take a hash table as a parameter and use splatting. I also showed a set-up function named connect-remote. The hash tables are set up when I load the module, but they don’t contain credentials: connect-remote now looks at the hash table for the connection I want to make and says “Is this fully specified to use a certificate? If not, does it have credentials?” If the answer is no to both parts it prompts for credentials and adds them to the hash table. In the snippet below $Server contains the hash table, and $user can be a parameter passed to connect-remote or in the hash table; if it can’t be found in either place it is set to the current user name.

if (-not ($server.AuthMode -or $server.credential)) {
    $server["credential"] = $Host.ui.PromptForCredential("Login","Enter Your Details for $ServerName","$user","")
}

A global variable set in connect-remote keeps track of which of the hash tables with SSH settings in should be used as the default for Invoke-RemoteCommand, Get-RemoteItem, Copy-Remote-Item, Copy-LocalItem and so on. But it makes sense to have all the hash tables listed somewhere where they can be accessed by name so I have

$hosts = @{ master =@{Server="192.168.42.42"; Force=$True; user="root";
                    AuthMode="publickey"; CertStoreType="pemkeyfile";
                    CertSubject="*"; certStore="$env:USERPROFILE\documents\secret.priv"}
         Spartacus =@{Server="192.168.109.71"; Force=$True; user="root";}
           Maximus =@{Server="192.168.10.10"; Force=$True; user="joneill";}
}
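Getting at the inner tables by name works like any other hash table lookup. A cut-down sketch of the same structure, with only two members per server:

```powershell
$hosts = @{
    master    = @{ Server = "192.168.42.42";  User = "root" }
    Spartacus = @{ Server = "192.168.109.71"; User = "root" }
    Maximus   = @{ Server = "192.168.10.10";  User = "joneill" }
}

$hosts["Maximus"].User          # index by name, then read a member
$hosts.Spartacus.Server         # dot notation works as well
$hosts.Keys | Sort-Object       # the names which could be offered as choices

# The inner table is an ordinary hash table, so it can be picked out ready for splatting
$server = $hosts["master"]
$server.Server
```

An ordinary hash table doesn't promise to keep its keys in the order they were written, hence the Sort-Object when listing them.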

In connect-remote, the -server parameter is declared like this 

  $server = $hosts[(select-item $hosts -message "Which server do you want to connect to ? " -returnkey)]

Select-item is a function I wrote ages ago which takes a hash table and offers the user a choice based on the keys in the hash table, and returns either the number of the choice (not very helpful for hash tables) or its name.
Function Select-Item
{[CmdletBinding()]
param ([parameter(ParameterSetName="p1",Position=0)][String[]]$TextChoices,
       [Parameter(ParameterSetName="p2",Position=0)][hashTable]$HashChoices,
       [String]$Caption="Please make a selection",
       [String]$Message="Choices are presented below",
       [int]$default=0,
       [Switch]$returnKey
      )
$choicedesc = New-Object System.Collections.ObjectModel.Collection[System.Management.Automation.Host.ChoiceDescription]
switch ($PsCmdlet.ParameterSetName) {
       "p1" {$TextChoices | ForEach-Object { $choicedesc.Add((
              New-Object "System.Management.Automation.Host.ChoiceDescription" -ArgumentList $_ )) } }
       "p2" {foreach ($key in $HashChoices.Keys) { $choicedesc.Add((
              New-Object "System.Management.Automation.Host.ChoiceDescription" -ArgumentList $key,$HashChoices[$key] )) } }
}
If ($returnkey) { $choicedesc[$Host.ui.PromptForChoice($caption, $message, $choicedesc, $default)].label }
else            {             $Host.ui.PromptForChoice($caption, $message, $choicedesc, $default) }
}

December 3, 2011

PowerShell–full of stringy goodness.

Filed under: Powershell — jamesone111 @ 2:28 pm

I think almost everyone who works with PowerShell learns two things in their first few minutes.
(a) Assigning some text to a Variable looks like this $Name =  "James"
(b) When you wrap a string in double quotes it expands variables inside it, for example "Hello $Name" will evaluate to Hello James.

A little later we tend to learn that if we want to put a property of a variable in a string, things get marginally more complex.
"The name $Name is $name.length characters long" gives the text The name James is James.length characters long.

To get the length of what is stored in the variable named “name” we need to use "The name $Name is $($name.length) characters long". This gives The name James is 5 characters long.

And most of us also learn that PowerShell can have multi-line strings, sometimes called Here-Strings, which look like this

$HereString=@"
Here is a string
The string $name made
It hasn’t finished yet
There’s more
OK we’re done
"@

Here-strings are really useful because they don’t close until they reach "@ at the start of a line, so they can contain quotes and newlines, and if they use double quotes they too will expand variables.

I should say here that I don’t have any detailed knowledge of how PowerShell’s parser actually deals with strings, but it seems pretty clear that when it hits a $ sign it says “OK I need to work out a value here”. I think using $ to mean “value of” was something PowerShell picked up from another language but I can’t be sure which one it was. My inner Pedant wants to correct people when they say “PowerShell variable names begin with $”: $foo means the value that is in foo; the name is foo. The way I understand it, when the parser sees $ inside a string it looks to see what comes next: if it is a sequence of letters it assumes they are the name of a variable, whose value it should insert into the string. If it is a “(” character then it looks for the matching “)” and evaluates the expression inside, like $name.length. Easy. Only this week I found myself saying … these are nothing special … well, we can get much better things than that.

I had to create over a dozen XML files: all identical except each one had a different database table name and a subset of the fields in the table – some had just 2 fields others had more than a dozen, the XML went like this.
<stuff>
   <kind1 name=TableName>
      <field name=field1>
      <field name=field2>
      <field name=field3>
   </kind1>
   <xmlgumf />
   <kind2 name=TableName>
      <field name=field1>
      <field name=field2>
      <field name=field3>
      <field name=field3>
   </kind2>
   <xmlgumf />
</stuff>

Except the real XML was much more complicated. Using the tool which builds the XML file from scratch takes about an hour per file. The XML files then go into another tool which is where the real magic happens. Building a dozen files would be a couple of days’ work or, with the project I’m working on, a very late night.
I had a document with the table names and the field lists in them, so could I automate the creation of these XML files ?
Outside of a string I’d write something like this
$fields="Field1","Field2","Field3"
$fields | foreach -begin {$a=""} -process {$a += [environment]::newline + "<Field name=$_ >"} -end {$a}

Would that second line work in a here-string based on an existing XML file? How well would this work ?

$Table = "TableName"
$fields="Field1","Field2","Field3"
@"
<stuff>
   <kind1 name=$Table>$($fields | foreach -begin {$a=""} -process {$a += [environment]::newline + "      <Field name=$_ >"} -end {$a})
   </kind1>
   <xmlgumf />
   <kind2 name=$Table>$($fields | foreach -begin {$a=""} -process {$a += [environment]::newline + "      <Field name=$_ >"} -end {$a})
   </kind2>
   <xmlgumf />
</stuff>
"@ | out-file "$table.xml" -encoding ascii

The simple answer is it works like a charm.
Building the first XML file took an hour using the normal tool to do it from scratch; it then took 5 minutes to convert that file into a template, which to my immense delight worked first time. Producing each additional XML file took 2 minutes: half an hour in all. That’s 10.5 hours I’ve got to do something else with. And when I showed it to colleagues, their reaction was that I had performed serious magic. The kind of magic which lets me disappear early.
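
A variation on the template above, offered as a sketch and not as what was actually run: the -join operator can build the block of field lines without the $a accumulator. The table and field names are the made-up ones from the example, and the result is kept in a variable here instead of being written to a file.

```powershell
$Table  = "TableName"
$fields = "Field1","Field2","Field3"

# Build one line per field, then glue the lines together with newlines.
$fieldLines = ($fields | ForEach-Object { "      <Field name=$_ >" }) -join [Environment]::NewLine

# The here-string expands $Table and $fieldLines just as it would any other variable.
$xml = @"
<stuff>
   <kind1 name=$Table>
$fieldLines
   </kind1>
</stuff>
"@
$xml
```

Piping $xml to Out-File, as in the original, would turn this back into a file generator.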

October 24, 2011

Maximize the reuse of your PowerShell

Filed under: Powershell — jamesone111 @ 3:18 pm
Last week I was at The Experts Conference in Frankfurt presenting at the PowerShell Deep Dive. My presentation was entitled “Maximize the reuse of your PowerShell”.
My PowerShell library for managing Hyper-V has now gone through a total of 100,000 downloads over all its different versions but whether it’s got wide use because of the way I wrote it or in spite of it, I can’t say.  Some people whose views I value seem to agree with the ideas I put forward in the talk, so I’m setting them out in this (rather long) post.

I have never lost my love of Lego. Like Lego, PowerShell succeeds by being a kit of small, general purpose blocks to link together.  Not everything we do can extend that kit, but we should aim to do that where possible. I rediscovered the Monad Manifesto recently via a piece by Jeffrey Snover about how PowerShell has remained true to the original vision, named Monad. It talks of a model where “Every executable should do a narrow set of functions and complex functions should be composed by pipelining or sequencing executables together”.  Your work is easier to reuse if it becomes a building block that can be incorporated into more complex solutions; and this certainly isn’t unique to PowerShell.

Functions for re-use, scripts for a single task. If you use a .PS1 script to carry out a task it is effectively a batch file:  it is automating stuff so that’s good, but it isn’t a building block for other work. Once loaded, functions behave like compiled cmdlets. If you separate parts of a script into functions it should be done with the aim of making those parts easy to reuse and recycle, not simply to reformat a script into subroutines.
If we’re going to write functions how should they look?

Choose names wisely, One task : One Function, there is no “DoStuffWith” verb for a reason.
PowerShell cmdlets use names in the form Verb-Noun and it is best to stick to this model; the Get-Verb command will give you a list of the Approved verbs. PowerShell’s enforcement of these standards is limited to raising a warning if you use non-standard names in a module. If your action really isn’t covered by a standard verb, then log the need for another one on Connect; but if you use “Duplicate” when the standard says “Copy”, or “Delete” when the standard says “Remove” it simply makes things more difficult for people who know PowerShell but don’t know your stuff. The same applies to parameter names: try to use the ones people already know. Getting the function name right sets the scope for your work. For my talk I used an example of generating MD5 hashes for files. My assertion was that the MD5 hash belonged to the file, and so the command should return a file with an added MD5 hash – meaning my command should be ADD-MD5.

Output: Use the right Out- and Write- cmdlets. Other speakers at the Deep Dive made the point that output created with Write-Host isn’t truly returned (in the sense that it can be piped or redirected): it is written on the console screen, so only use Write-Host if you want to prevent output going anywhere else. There are other “write on the screen without returning” cmdlets which might be better: for example when debugging we often want messages that say “starting stage 1”, “starting stage 2” and so on. One way to do this would be to add Write-Host statements and remove them after the code is debugged. A better way is to use Write-Verbose, which outputs if you specify a –verbose switch and doesn’t if you don’t. As a side effect you have a quasi-comment in your code which tells you what is happening. The same is true when you use Write-Progress to indicate something is happening in a long running function (remember to run it with the -completed switch when you have finished, otherwise your progress box can remain on screen while something else runs). Write-Debug, Write-Error and Write-Warning are valuable too: I try to prevent errors appearing where the code can recover, and write a warning when something didn’t go to plan in a non-fatal way.

Formatting results can be a thorny issue: you can redirect the results of Format-Table or Format-List to a file, but the output is useless if you want to pipe results into another command – which needs the original object.
Some objects can have ugly output: it’s not a crime to have an –asTable switch so output goes through Format-Table at the end of the function, or even to produce pretty output by default, provided you ensure that it is possible to get the raw object into the pipe with a –noFormat or –Raw switch. But it’s best to create formatting XML, which is a lot easier with tools like James Brundage’s E-Z-out.

Output with the pipeline in mind. It’s not enough to just avoid write-host and return some sort of object. I argued that a function should return the richest object that is practical – my example used Add-Member to take a file object and give it an additional MD5Hash property. I can do anything with the result that I can do with a file and default formatting rules for a file are used (which is usually good)  but being the right type need not matter if the object has the right property/properties. For example, this line of PowerShell :
  Select-String -Path *.ps1 -Pattern "Select-String"
looks at all the .PS1 files in the current folder and where it finds the text  “Select-String”  it outputs a MatchInfo object with the properties: .Context, .Filename, .IgnoreCase, .Line, .LineNumber, .Matches, .Path and .Pattern. A Cmdlet like Copy-Item has no idea about MatchInfo objects, but if an object piped in has a .path property, it will know what it is being asked to work on. If MatchInfo objects named this property “NameOfMatchingFile” it just would not work.

Think about pipelined input. All it takes to tell PowerShell that a parameter should come from the pipeline is prefixing its declaration with
  [parameter(ValueFromPipeLine= $true)]
If you find that you are using your new function like this
  Get-thing | ForEach {Verb-Thing –thing $_}
It’s telling you that -thing should take pipeline values.

The pipeline can supply multiple items, so a function may need to be split into Begin {}, Process {} and End {} blocks. (If you don’t specify these the whole function body is treated as an end block and only the last item passed is processed). Eventually the realization dawns that if the example above works, it should be possible to have two lines:
  $t = Get-thing
Verb-Thing –thing $t
 
So parameters need to be able to handle arrays – something I’ll come back to further down.
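
A minimal sketch of how the three blocks fit together, and of one way to cope with an array arriving as a parameter instead of through the pipeline. Get-Double is a hypothetical function made up for this illustration.

```powershell
# Hypothetical function: doubles each number it is given.
function Get-Double {
    param ([Parameter(ValueFromPipeline = $true)][int[]]$Number)
    begin   { $count = 0 }                 # runs once, before any input arrives
    process {                              # runs once per pipeline item
        foreach ($n in $Number) {          # copes with an array passed as -Number
            $count++
            $n * 2
        }
    }
    end     { Write-Verbose "Processed $count item(s)" }   # runs once, after the last item
}

1,2,3 | Get-Double           # piped: the process block runs three times
Get-Double -Number 1,2,3     # as a parameter: it runs once and the foreach handles the array
```

Both calls return the same three values; adding -Verbose shows the count written by the end block.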

You can do the same thing that Copy-Item was shown doing above: I have another function which is a wrapper for Select-String, its parameters include:
  [Parameter(ValueFromPipeLine=$true,Mandatory=$true)]
  $Pattern,
  [parameter(ValueFromPipelineByPropertyName = $true)][Alias('Fullname','Path')]
  $Include=@("*.ps1","*.js","*.sql")

If the function gets passed a string via the pipeline it is the value for the -pattern parameter. If it gets an object containing a property named “Include”, “Fullname” or “path” that property becomes the value for the –include parameter.
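
The property-name binding can be tried out in isolation. This is a sketch under stated assumptions: Show-Target is a hypothetical function, and the objects piped in are plain objects built with New-Object, not real file or MatchInfo objects.

```powershell
# Hypothetical function: reports which path it was asked to work on.
function Show-Target {
    param (
        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [Alias('FullName','Path')]
        [string]$Include
    )
    process { "Would search $Include" }
}

# Any object with a property named Include, FullName or Path will bind.
New-Object PSObject -Property @{ Path     = 'one.ps1' } | Show-Target
New-Object PSObject -Property @{ FullName = 'two.ps1' } | Show-Target
```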

Sometimes a function needs to output to a destination based on input: so you can check to see if a parameter is a script block and evaluate it if it is. 
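
As a sketch of that last idea, assuming a hypothetical Get-Destination function with a -Destination parameter which may be either a fixed value or a script block evaluated for each item:

```powershell
# Hypothetical function: -Destination is either a fixed value or a script block worked out per item.
function Get-Destination {
    param (
        [Parameter(ValueFromPipeline = $true)]$InputObject,
        $Destination
    )
    process {
        if ($Destination -is [scriptblock]) {
            # Evaluate the block with $_ set to the current item
            $InputObject | ForEach-Object $Destination
        }
        else { $Destination }
    }
}

"a.txt","b.log" | Get-Destination -Destination { "backup_" + $_ }   # a different result per item
"a.txt","b.log" | Get-Destination -Destination "archive"            # the same result every time
```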

Don’t require users to know syntax for things inside your function. If you are going to write code to do one job and never reuse it then you don’t need to be flexible. If the code is to be reused, you need to do a little extra work so users don’t have to. For example: if something you use needs a fully qualified path to a file, then the function should use Resolve-Path to avoid an error when the user supplies the name of a file in the current directory. Resolve-Path is quite content to resolve multiple items so replacing   
  Do_stuff_with $path
with
  Resolve-Path $path | ForEach-Object {Do_stuff_with $_ }
Delivers, at a stroke, support for Wildcards, multiple items passed in $path and relative names .
Another example is with WMI, where syntax is based on SQL so the wildcard is “%”, not “*”. Users will assume the wildcard is *. In this case do you say:
(a) Users should learn to use “%”   
(b) My  function should include   $parameter = $parameter -Replace "\*","%" 
For my demo I showed a function I have named “Get-IndexedItem” which finds things using the Windows index. I wanted to find my best underwater photos- they are tagged “portfolio” and are the only photos I shoot with a Canon camera. My function lets me type
  Get-IndexedItem cameramaker=can*,tag=portfolio
Inside the function the search system needs a where condition of
  "System.Photo.Cameramanufacturer LIKE 'can%'  AND System.Keywords = 'portfolio'"
Some tools would require me to type the filtering condition in this form, but I don’t want to remember what prefixes are needed and whether the names are “camera Maker” or “camera Manufacturer” and “keyword”, “keywords” or “tag”. Half the time I’ll forget to wrap things in single quotes, or use “*” as a wildcard because I forgot this was SQL. And if I have multiple search terms why shouldn’t they be a list, not “A and b and C”? (There is a write-up coming for how I did this processing.)
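
This is not the processing from Get-IndexedItem, just a small sketch of the kind of translation involved. ConvertTo-WhereClause is a hypothetical helper; it swaps * for %, adds the single quotes and joins the terms with AND, but it leaves out the step which maps friendly names such as “tag” onto the System.* property names.

```powershell
# Hypothetical helper: turns user-friendly terms such as tag=portfolio into SQL-style conditions.
function ConvertTo-WhereClause {
    param ([string[]]$Filter)
    $conditions = foreach ($term in $Filter) {
        # Split each term at the first = sign only
        $name, $value = $term -split "=", 2
        # -replace takes a regular expression, so the * has to be escaped
        if ($value -match "\*") { "$name LIKE '" + ($value -replace "\*","%") + "'" }
        else                    { "$name = '$value'" }
    }
    $conditions -join " AND "
}

ConvertTo-WhereClause "cameramaker=can*","tag=portfolio"
```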

Set sensible defaults.  The talk before mine highlighted some examples of “bad” coding, and showed a function which accepted a computer name. Every time the user runs the command they must specify the computer. Is it valid to assume that most of the time the command will run against the current computer? If so the parameter should default to that. If the tool deals with files is it valid to assume “all files in the current directory” – if the command is delete, probably not, if it displays some aspect of the files, it probably can.
Constants could be parameter defaults. With the computer name example you might write a function which only ran against the current computer. Obviously it is more useful if the computer name is a parameter (with a default) not a constant. In a surprising number of cases, something you create to do a specific task can carry out a generic task if you change a fixed value in your code into a parameter, which defaults to the former fixed value.

Be flexible about parameter types and other validation. This is a variation on not making the user understand the internals of your function and I have talked about it before, in particular the dangers of people who are trained systems programmers applying in PowerShell the techniques they would use in something like C#: in those languages a function declaration might look like:
  single Circumference(single radius) {}
which says Circumference takes a parameter which must have been declared to be a single precision real number and returns a single precision real number. Writing
   c = Circumference("Hello");
or  c = Circumference("42");
will cause the compiler to give a “type mismatch” error – “Hello” and “42” are strings, not single precision real numbers. Similarly
System.io.fileinfo f = Circumference(42);
Is a type mismatch: f is a fileInfo object and we can’t assign a number to it. The compiler picks these things up before the program is ever run, so  users don’t see run-time errors.
PowerShell isn’t compiled and its Function declarations don’t include a return type: Get-Item, for example, deals with the file system, certificate stores, the registry etc. so it can return more than a dozen different types: there is no way to know in advance what type of item Get-Item $y will return. If the result is stored in a variable (with  $x = Get-item $y )  the type of the variable isn’t specified, but defined at runtime.
Trying to translate that declaration into PowerShell gives something like this.
  Function Get-Circumference{
  Param([single]$Radius)
  $radius * 2 * [system.math]::PI
  }

Calling
  Get-Circumference "Hello"
Produces an error, but a closer inspection shows it is not a type mismatch: it says
Cannot process argument transformation on parameter 'radius'. Cannot convert value "hello" to type "System.Single".
The [Single] says “always try to cast $Radius as a single”.  Which means
  Get-Circumference "42"
Will be cast from a string to a single and the function returns 263.89

So you might expect 
  [System.IO.FileInfo]$f = Get-Circumference 42
To throw an error, but it doesn’t, PowerShell casts 263.893782901543 to an object representing a file with that name in \windows\system32. The file doesn’t exist and is read-only!  So it can be better to resolve types in code.
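
To make “resolve types in code” concrete, here is a sketch which puts the typed version from above next to an untyped one. Get-Circumference2 is a hypothetical name, and using the -as operator is just one way of doing the check.

```powershell
# The typed version from above: PowerShell casts the argument before the body runs.
function Get-Circumference {
    param ([single]$Radius)
    $Radius * 2 * [System.Math]::PI
}
Get-Circumference "42"        # the string "42" is cast to a single

# Hypothetical alternative: leave the parameter untyped and resolve the type in code.
function Get-Circumference2 {
    param ($Radius)
    $r = $Radius -as [double]           # -as gives $null instead of an error when the cast fails
    if ($null -eq $r) { Write-Warning "'$Radius' is not a number" ; return }
    $r * 2 * [System.Math]::PI
}
Get-Circumference2 "Hello"    # writes a warning and returns nothing
```

The second form lets the function decide what happens to bad input, instead of leaving it to the argument transformation error.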

I’d go further than that. Some validation is redundant because the parameter will be passed to a cmdlet which is more flexible than the function writer thought, in other cases parameter validation is there to cover up a programmer’s laziness . When I see a web site which demands that I don’t enter spaces in my credit card number I think “Lazy. It’s easy to strip spaces and dashes”. Having “O’Neill” for a surname means that the work of slovenly developers who don’t check their SQL inputs or demand only letters gets drawn to my attention too. If the user is forced to use the equivalent of
  Stop-Process (get-process Calc)
they will think “Lazy. you  couldn’t be bothered even to provide a –name parameter”.  Stop-process does just that to cope with process objects, names and IDs (and notice it allows you to pass more than one ID)  for example:
  stop-process [-id ] 4472,5200,5224
  $p = get-process calc ; Stop-Process -InputObject $p 
  stop-process -name calc

Other cmdlets are able to resolve types without using different parameters for example
  ren '.\100 Meter Event.txt' "100 Metre Event.txt"
  $f = get-item '.\100 Metre Event.txt' ; ren $f '100 Meter Event.txt'
In one case the first item is a string and in the other it is a file object: from a user’s point of view this is better than stop-process which in turn is much better than having to get an object representing the process you want to stop. In my talk I used Set-VMMemory from my Hyper-V module on Codeplex, which has:
  param(
        [parameter(ValueFromPipeLine = $true)]
        $VM,        
        [Alias("MemoryInBytes")]
        [long]$Memory = 0 ,
        $Server = "."
  )

There isn’t a way for me to work out what a user means if they specify a memory size which isn’t a number (and can’t be cast to one). If the user specifies many Virtual Machines, catching something during parameter validation will produce one error instead of one per VM (so this would be the place to trap negative numbers too).
I have a -server parameter and it makes sense to assume the local machine – but I can’t make an assumption about which VM(s) the user might want to modify. The VM can come from the pipeline, and it might be an object which represents a VM, or a string with a VM name (possibly a wildcard) or an array of VM objects and/or Names. If the user says
  Set-VMMemory –memory 1GB –vm london* –server h1,h2
the function, not the user, translates that into the necessary objects. If this doesn’t match any VMs I don’t want things to break. (Though I might produce a warning). It takes 4 lines to support all of this
  if ($VM -is [String]) { $VM = Get-VM -Name $VM -Server $Server}
  if ($VM.count -gt 1 )  {[Void]$PSBoundParameters.Remove("VM")
                           $VM | ForEach-object {Set-VMMemory -VM $_ @PSBoundParameters}}
  if ($vm.__CLASS -eq 'Msvm_ComputerSystem') {
      # do the main part of the code 
  }
 
Incidentally I use the __CLASS property because it works with remoting when other methods for checking a WMI type failed.
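
The same pattern can be tried without Hyper-V. In this sketch everything is a hypothetical stand-in: the “VMs” are plain objects with a Name property, Get-DemoVM plays the part of Get-VM, and a check for a Name property replaces the __CLASS test.

```powershell
# Hypothetical stand-ins: "VM objects" are plain objects with a Name property.
$AllVMs = "London-1","London-2","Paris-1" |
              ForEach-Object { New-Object PSObject -Property @{ Name = $_ } }
function Get-DemoVM {
    param ($Name)
    $AllVMs | Where-Object { $_.Name -like $Name }
}

function Set-DemoVMMemory {
    [CmdletBinding()]
    param ([Parameter(ValueFromPipeline = $true)]$VM, [long]$Memory = 0)
    process {
        # A name (possibly a wildcard) becomes zero, one or many objects
        if ($VM -is [string]) { $VM = Get-DemoVM -Name $VM }
        # Many objects: call ourselves once per object, passing on the other parameters
        if (@($VM).Count -gt 1) {
            [void]$PSBoundParameters.Remove("VM")
            $VM | ForEach-Object { Set-DemoVMMemory -VM $_ @PSBoundParameters }
        }
        # Exactly one object of the right kind: do the real work
        elseif ($VM.Name) { "Setting $($VM.Name) to $Memory bytes" }
    }
}

Set-DemoVMMemory -VM "London*" -Memory 1GB     # two matches, so two results
Set-DemoVMMemory -VM "Berlin*" -Memory 1GB     # no match: nothing happens and nothing breaks
```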

Allow parts to be switched off. In the talk I explained I have a script which applies new builds of software to a remote host. It does a backup and creates a roll back script. Sometimes I have to roll back, produce another build and then apply that. Since I have an up to date backup I don’t need to run the backup a second time, so that part of the code runs unless I specify a –nobackup switch.

Support –whatif  or restore data. You choose. A function can use $pscmdlet.shouldProcess – which returns true if the function should proceed into a danger zone and false if it shouldn’t. It only takes one line before declaring the parameters at the start of a function to enable this
  [CmdletBinding(SupportsShouldProcess=$true, ConfirmImpact='High' )]
There are 4 confirmation levels, “None”, “Low”, “Medium” and “High”, for both the ConfirmImpact value and the $ConfirmPreference preference variable. If either is set to “None” no confirmation appears. If the impact level is higher than or equal to the preference setting this line
  if ($pscmdlet.shouldProcess($file.name,"Delete file")) {Remove-item $file}
will delete the file or not depending on the user’s response to a prompt – in this case “Performing operation “Delete file” on Target “filename””. Confirmation can be forced on by specifying –confirm. Specifying –whatif will echo the message but return false. If no confirmation is needed, –verbose will echo the message but return true.
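
Putting the pieces together in a form which is safe to experiment with: Remove-DemoItem is a hypothetical function which only pretends to delete, and its ConfirmImpact is set to 'Medium' (an assumption made for the demo) so that it does not prompt with the default preference setting.

```powershell
# Hypothetical function for illustration: it only pretends to delete.
function Remove-DemoItem {
    [CmdletBinding(SupportsShouldProcess = $true, ConfirmImpact = 'Medium')]
    param ([Parameter(ValueFromPipeline = $true)][string]$Name)
    process {
        if ($PSCmdlet.ShouldProcess($Name, "Delete file")) {
            "Deleted $Name"      # the dangerous work would go here
        }
    }
}

"a.txt" | Remove-DemoItem -WhatIf            # echoes the message, ShouldProcess returns false
"a.txt" | Remove-DemoItem -Confirm:$false    # no prompt, ShouldProcess returns true
```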

Prepare your code for other people, including you.  I’m not the person I was a year ago. You’re not either, and we’ll be different people in 3 or 6 months. So even if you are not publishing your work you are writing for someone else. You, in 3 months’ time. You, under pressure at 3 in the morning. And that you will curse the you of today if you leave your code in a mess.

There are many ways to format your code: find a style of indenting that works for you, if it does, then the chances that anyone else will like it are about the same as the chance they will hate it.  Some people like to play “PowerShell Golf” where fewest [key]strokes wins – this is fine for the command prompt. Expanding things to make them easier to read is generally good. That’s not an absolute ban on aliases – using Sort and Where instead of Sort-Object and Where-Object may help readability – the key is to break up very long lines, but without creating so many short lines that you are constantly scrolling up and down.

Everyone says “Put in brief comments”, but I’d refine that: explain WHY not WHAT, for example  I only decided to use Set-VMMemory as an example in my talk at the last moment.  It has a line
  if (-not ($memory % 2mb))  {$memory /= 1mb}

I can’t have looked at that code in 8 months. Why would I do that ? Fortunately I’ve put in a WHY comment – the WHAT is self explanatory
# The API takes the amount of memory in MB, in multiples of 2MB.
# Assume that anything less than 2097152 is already in MB (we aren't assigning 2TB to the VM). If user enters 1024MB or 0.5GB divide by 1MB 

Instead of putting comments next to each parameter saying what it does, move the comments into comment based help, take a couple of places where you have used / tested the command and put them into the comment based help as examples. Your future self will thank you for it.
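
For anyone who hasn't used it, comment based help looks something like this sketch, which reuses the Get-Circumference example from earlier; the help text is made up for illustration.

```powershell
function Get-Circumference {
    <#
      .SYNOPSIS
        Returns the circumference of a circle
      .PARAMETER Radius
        The radius of the circle; anything which can be cast to a number is accepted
      .EXAMPLE
        Get-Circumference 42
        Returns the circumference of a circle of radius 42
    #>
    param ([single]$Radius)
    $Radius * 2 * [System.Math]::PI
}

# The comment block is what Get-Help displays for the function
Get-Help Get-Circumference -Examples
```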

September 11, 2011

Adding $Clipboard automatic variable support to PowerShell

Filed under: Powershell — jamesone111 @ 5:08 pm

Some of the bits and pieces I have been working on recently have needed PowerShell to take input from the windows clipboard, and as with so many things in PowerShell there is more than one way this can be done.

  • If it is something short paste into a command line –wrapping in quotes or @” “@ as required
  • Paste into a new edit window in the PowerShell ISE and wrap it with $variableName = “” (or @” “@ as required) , then run the contents of the Window
  • Use  [windows.clipboard]::GetText() in a command line
  • Create a Get-Clipboard Function which wraps  [windows.clipboard]::GetText()
  • Create a $Clipboard “automatic variable”
  • Use the Paste.Exe which Joel has on his site.

Note (a) that if you want to use  [windows.clipboard]::GetText() in the “Shell” version of PowerShell you need to start PowerShell.exe with the –STA switch and run the command   Add-Type -AssemblyName "PresentationCore" – I’m actually only interested in doing this in the ISE.
and (b) If you want to send PowerShell’s output to the clipboard, the command line tool Clip.exe has been in Windows for several versions. It’s getting text from the clipboard which needs more work.

There’s very little to choose between having Get-Clipboard and $clipboard but having just read Robert’s piece on Tied Variables I thought I would go down that path.

Robert explained that you can set a breakpoint on a PowerShell variable to run a scriptblock when the variable is read, so he got to thinking “I can run something which changes the value of the variable when it is read”.

In my profile I have

$Global:MyBreakPoint = $Global:Clipboard = Set-PSBreakpoint -Variable Clipboard -Mode Read -Action {
Set-Variable clipboard ([Windows.clipboard]::GetText()) -Option ReadOnly, AllScope -Scope Global -Force}

So $clipboard is set to the breakpoint which is created by monitoring read action on itself. At each subsequent read the script-block is called and uses set-variable to make $clipboard a ReadOnly variable holding the value of the clipboard at that moment.

I then modified my Get-SQL command to take advantage of this. I added a –Paste switch (I chose –paste over –clipboard simply because I already had –connection so –c would be ambiguous; –paste allows me to run sql –p , taking advantage of SQL being an automatic alias for Get-SQL) then

if ($Paste -and (test-path 'Variable:\Clipboard'))
    {$sql = $Clipboard -replace ".*(?=select|update|delete)","" -replace "[\r\n]+","" }

This trims out anything before “Select”, “update” or “Delete” –as I am usually copying a statement from a log with a time stamp at the start, and it also replaces any carriage return/line feed characters.

WHAT ? A bug in the PowerShell ISE

This was all working brilliantly until I went to use a function which uses the PowerShell ISE object model to manipulate text in an edit window: it fails. The error implies that PowerShell’s object model thinks a debugger is running if a break point exists, even though trying to edit the script files interactively is not a problem.

To get around this I set TWO variables to point to the break point, one was $clipboard, which will be overwritten as soon as it is used, and the other is $myBreakPoint. In my edit function I now start with

if ($MyBreakPoint) {Remove-PSBreakpoint $MyBreakPoint }

Removing the breakpoint doesn’t remove the variable $myBreakpoint so on the way out of the function I use it to decide if I need to reinstate it.

September 2, 2011

Round-tripping files for editing

Filed under: Powershell — jamesone111 @ 9:20 pm

Watching a quiz the other day the host said of a book “I plan to start [it] the moment my Doctor tells me I have 2000 years to live”. It was a great spin on “life’s too short to…”. In my case life is too short to re-learn how to use vi.  I learned how to use it on Wyse terminals at university in the mid 1980s, and I didn’t rate it then. If Wikipedia has its facts right vi dates back to 1976; Steve Jobs and Steve Wozniak had yet to launch the Apple II, IBM hadn’t even thought of making a PC, and the supercomputers in the best equipped research facilities had less power than today’s smartphones. In computer terms dinosaurs were still roaming the earth, which must make vi the coelacanth of software: ugly as hell and somehow immune to the forces which should have rendered it extinct.

I’m working with a Linux server which I can only access over an SSH session. Since configuring the server means editing text files some thought has gone into how to avoid vi.  We’ve standardized on SSH explorer in the office – it can round-trip files to the PC for editing. I’ve recently been using the netcmdlets to move files around and I’ve also been looking at trapping file-changed events, so I found myself asking “How difficult would it be to transfer a file to a temporary folder, load it into the editor of my choice and push it back to where it came from when it was saved?”. The answer turns out to be “about this difficult …”

Function Edit-RemoteItem {
param ([Parameter(ValueFromPipeLine=$true, ValueFromPipelineByPropertyName=$true, mandatory=$true)]
       [String]$path,
       $Connection = $Global:ssh
       )
       # The local copy goes in the temp folder, named after the last part of the remote path
       $localpath     =  ( join-path $env:temp ($path -split "/")[-1])
       $MessageDataHT = @{   Server=$connection.Server;        RemoteFile=$path; 
                         Credential=$connection.Credential;      LocalFile=$localpath; 
                              force=$true;                       Overwrite=$true}
       Get-SFTP @MessageDataHT | out-null
       if ( $Host.Name -match "\sISE\s" )
            { $null = $psISE.CurrentPowerShellTab.Files.Add($localpath) }
       else { notepad.exe $localpath }
       # When the local copy changes, push it back to where it came from
       $Global:watcher = Register-FileSystemWatcher -MessageData $MessageDataHT -Path $localpath `
                       -On "Changed" -Process {
                          $event.sender.enableRaisingEvents = $false 
                          $oldpp =$ProgressPreference
                          $ProgressPreference="silentlycontinue"
                          if ((Send-SFTP -Server     $event.MessageData.server `
                                         -LocalFile  $event.MessageData.LocalFile `
                                         -RemoteFile $event.MessageData.RemoteFile `
                                         -Credential $event.MessageData.Credential `
                                         -Overwrite -Force ).size) {
                              Write-host "Updated $($event.MessageData.RemoteFile) on $($event.MessageData.server)"
                          }
                          $ProgressPreference = $oldpp 
                          $event.sender.enableRaisingEvents = $True
                       }
}
To run through the function quickly; it takes a remote path and a netcmdlets SSH connection object.  It builds a local path and a hash table with parameters in and calls Get-SFTP by splatting – turning each key/value pair in the hash table into a parameter/value pair.

If it is running in the PowerShell ISE it loads the transferred file into PowerShell’s editor, otherwise it loads it into notepad.

Then it sets up a FileSystem watcher using the same code I talked about here. The watcher is passed the same hash table of parameters – unfortunately you can’t use splatting in a process block and there is no option to hide the progress bar, so I temporarily change $ProgressPreference and change it back when I’ve finished. I also make sure that while the upload is happening no new events are raised. In between I call send-Sftp inserting parameters from the hash table: simples.

August 24, 2011

Improving my score at PowerShell Golf

Filed under: Powershell — jamesone111 @ 9:58 am

There is a mindset found among scripting and programming people known as “Golf”. The objective is to get to the end using as few [key]strokes as possible. It’s not something I like to see in scripts and functions, but most of us don’t like to type any more than we have to at the command line. Tab expansion in PowerShell is a great feature for cutting down keystrokes (not beyond improvement, but truly great).

I’ve mentioned a couple of times recently that PowerShell’s parser gives GET- commands a sort of “automatic alias”: if you type “Service” where a command is expected, tab expansion won’t work but if you carry on and run the command, then  before it gives up and reports “The term 'service' is not recognized as the name of a cmdlet, function, script file, or operable program.” PowerShell looks for “Get-Service” finds it and calls that instead.  This works for user defined functions as well , so the Get-SQL command I wrote to send SQL commands via ODBC can be shortened to “SQL”, which is very fine and good and only has two things wrong with it.

  1. Commands need to be wrapped in quotes. It doesn’t matter if they are single or double quotes, but if you are pasting in something which was written for MySQL you need to have your wits about you because, like PowerShell, it allows you to use single or double quotes in the command.
  2. It’s still more keystrokes than using the MySQL Monitor in an SSH session.

What I wanted to do was use a symbol to mean “Send this to SQL”. PowerShell already uses most of the symbols on the keyboard, although its parser is flexible enough to allow some to be used as aliases – for example when it sees “40 % 13” the % sign is interpreted as the “modulo” operator but when it sees “dir | % {…}” the % is an alias for “ForEach-Object”. Possible though it is, making the $ or > signs into aliases feels like the first dancing step down the path to madness, so for SQL I used ¬ (I’ve used the same technique for the Invoke-RemoteCommand wrapper I wrote for the netCmdlets Invoke-SSH command, for which I use ~ )

I want to be able to type
¬ select * from mytable where name="james"
and have my command go into the default ODBC connection (or the default SSH connection or whatever) .
But it won’t work if I simply use Set-Alias -Name "¬" -Value "Get-SQL" , because “Select”, ”*”, ”from”, “mytable” and so on each get treated as distinct parameters and the quote marks will be lost around james. PowerShell has a “ValueFromRemainingArguments” option for a parameter but that won’t deal with the quotes issue (and there will be problems deciding what belongs to other parameters like the ODBC connection string).
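To see the quotes problem for yourself, here is a small sketch. Show-Remaining is a made-up function (it is not part of Get-SQL) which just collects whatever arguments are left over and joins them back together.

```powershell
# Hypothetical helper: gathers all the unbound arguments into $Rest.
function Show-Remaining {
    param([Parameter(ValueFromRemainingArguments=$true)]$Rest)
    # Each word arrives as a separate argument; join them back into one string.
    $Rest -join ' '
}

# The parser has already consumed the double quotes by the time the function runs,
# so they are missing from the rebuilt string.
$rebuilt = Show-Remaining select * from mytable where name="james"
$rebuilt
```

The words come back in the right order, but the quote marks around james have gone, so the rebuilt string is no longer the SQL that was typed.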

My solution was to alias ¬ to an intermediate function named Hide-GetSql which contains 1 line
Function Hide-GetSQL {
  Get-Sql -sql ($MyInvocation.line.substring($MyInvocation.OffsetInLine))
}
Set-Alias -Name ¬ -Value Hide-GetSQL

Normally I hate seeing references to $Args or $MyInvocation.line but here I have to make an exception, this gets the line which was used as a single string and truncates it after the function / alias name.  PowerShell’s parser will try to split this line into sections if it contains semi colons – the function still works, but any attempt to process a second command will almost certainly cause a spurious error. For my use it hasn’t been a problem – your mileage may vary.

August 23, 2011

Enhancing the NetCmdlets

Filed under: Powershell — jamesone111 @ 1:11 pm

At the PowerShell Deep Dive conference in April /n software gave away USB keys with a full set of “PowerShell inside” software including their NetCmdlets. It turns out that this is 30 day evaluation software, and initially this put me off using them; Joel wrote about an alternative saying “It’s also possible to do this using the NetCmdlets from /n and I’m constantly surprised at how unwilling people are to pay for them.”  Part of the problem is you get nearly 100 cmdlets, and if you want only half a dozen it feels like most of the cost is wasted. After checking a free library and finding it flakey I dug out the USB key, figuring if they worked, or even mostly worked, they’d be worth the cost.

I only need to do a couple of things – use SSH to send commands to the Linux server which hosts my application (for which there are 3 commands: Connect-SSH, Disconnect-SSH and Invoke-SSH) and move files both ways with sftp (which has Connect-SFTP, Disconnect-SFTP, Get-SFTP, Remove-SFTP, Rename-SFTP and Send-SFTP).  I wanted to be able to invoke the commands against my server without re-making connections each time, and that meant some simple “wrapper” commands to get things how I wanted them.

Connect-Remote         Sets up a remote session (saves the session object as a global variable)
All the following functions can be passed a remote session but use the global variable as a default:
Get-RemoteItem         Gets properties of one or more remote items (like Get-Item / Get-ChildItem) – alias rDir
Copy-RemoteItem        Copies a remote item to the local computer (like Copy-Item) – alias rCopy
Copy-LocalItem         Copies a local item to the remote computer
Remove-RemoteItem      Deletes an item on the remote computer (like Remove-Item)
Invoke-RemoteCommand   Runs a command on the remote computer (like Invoke-Command)

The two connect- netcmdlets return different types of object which contain the server, credential and various other connection parameters. I can set them up like this

$credential = $Host.ui.PromptForCredential("Login",
   "Enter Your Details for $server","$env:userName","")
$Global:ssh = Connect-SSH -Server $server -Credential $credential -Force

I found downloading causes the sftpConnection object to “stick” to the path of downloaded file. Get-, Remove-, Rename- and Send- all allow the server and credential to be passed, so I can pass the SSHConnection object into my SFTP wrappers and use its server and credential properties – even though it is the “wrong” connection type – being for SSH command-lines rather than SFTP file access. I wrote a Connect-Remote function which also stores the connections so I can switch between them more easily. It looks like this:

$Global:AllSSHConnections = @{}
Function Connect-Remote {
Param ([String]$server = $global:server,
       [String]$UserName = $env:userName,
       [Switch]$force
      )
      if   ($Global:AllSSHConnections[$server] -and -not $force) {
             $Global:ssh = $Global:AllSSHConnections[$server]}
      else { $credential = $Host.ui.PromptForCredential("Login","Enter Your Details for $server",$username,"")
             # Connect-ssh takes errorAction parameter but it doesn't work. So use try-catch
             try {$Global:ssh = Connect-SSH -Server $server -Credential $credential -Force -ErrorVariable SSHErr }
             catch {continue}
             If ($?) {$remotehost = (invoke-RemoteCommand -Connection $ssh -CmdLine "hostname")[0] -replace "\W*$",""
                      Add-Member -InputObject $ssh -MemberType "NoteProperty" -Name "HostName" -Value $remoteHost
                      $Global:AllSSHConnections[$server] = $ssh
                      $ssh | out-default
             }
      }
}

Storing the connection simplifies using the other cmdlets. I have a little more error checking in the real version of the script so I can tell the user if there was simply a password error or something more fundamental at fault.

The first task I wanted to perform was to get a file listing via sftp. I have a couple of easily-solved gripes with the netcmdlets: I’ve already mentioned the sticking sftp connection, the other is that the EntryInfo object which is returned when getting a file listing doesn’t contain the fully qualified path, just the file name – so I sort the entries into order, and add properties for the directory and fully qualified path and used New-TableFormat  to give me the start of a formatting XML file to display the files nicely . The function also takes the path, and a connection object – which defaults to the connection I set up before.

Function Get-RemoteItem {
    param( [Parameter(ValueFromPipeLine=$true)]
           [String]$path = "/*",
           $Connection  = $Global:ssh
         )
process {
           if ($Connection.server -and $Connection.Credential ) {
                  $Directory = $path -replace "/[^/]*$",""
                  Get-sftp -List $path -Credential $Connection.Credential `
                           -Server $Connection.Server -force |
                           Sort-Object -property @{e={-not $_.isdir}}, filename |
                           Add-Member -MemberType NoteProperty   -Name "directory" -Value $directory -PassThru |
                           Add-Member -MemberType ScriptProperty -Name "Path" `
                                      -Value {$this.directory + "/" + $this.Filename} -PassThru
           }
           else {write-warning "We don't appear to have a valid connection."}
         }
}

The next things to add were copy-RemoteItem and Copy-LocalItem – which just have to call Get-SFTP and Send-SFTP with some pre-set parameters.

Function Copy-RemoteItem {
[CmdletBinding(SupportsShouldProcess=$true)]
param ([Parameter(ValueFromPipeLine=$true, ValueFromPipelineByPropertyName=$true, mandatory=$true)]
       [String]$path,
       [String]$destination = $PWD,
       $Connection = $Global:ssh,
       [Switch]$force
       )
process { if ($Connection.server -and $Connection.Credential -and $path ) {
              #Because we re-make the connection in the Get-Sftp command we can pass an SSH object.
              write-verbose "Copying $path from $($Connection.Server) to $destination"
              Get-SFTP -RemoteFile $path -LocalFile $destination -Credential $Connection.Credential `
                       -Server $Connection.Server -Force -Overwrite:$force |
                       Add-Member -PassThru -MemberType Noteproperty -Name "Destination" -Value $destination
          }
          Else {Write-warning "We don't seem to have a valid destination and path" }
        }
}
Set-Alias -Name rCopy -Value Copy-RemoteItem

Function Copy-LocalItem {
[CmdletBinding(SupportsShouldProcess=$true)]
param ([Parameter(ValueFromPipeLine=$true, ValueFromPipelineByPropertyName=$true, mandatory=$true)]
       [String]$path,
       [Parameter(Mandatory=$true)][String]$destination ,
       $Connection = $Global:ssh,
       [Switch]$force
      )
process {  if ($Connection.server -and $Connection.Credential -and $path ) {
               write-verbose "Copying $path to $destination on $($Connection.Server)"
               Send-SFTP -LocalFile $path -RemoteFile $destination -Credential $Connection.Credential `
                          -Server $Connection.Server -Force -Overwrite:$force
           }
           Else {Write-warning "We don't seem to have a valid destination and path" }
        }
}

The last wrapper I needed was one to go round Invoke-SSH – primarily to set default parameters, and return only the response text.

 Function Invoke-RemoteCommand {
[CmdletBinding(SupportsShouldProcess=$true)]
param   ([parameter(Mandatory=$true , ValueFromPipeLine=$true)][String]$CmdLine,
         $Connection = $Global:ssh
        )
process { if ($Connection -is [PowerShellInside.NetCmdlets.Commands.SSHConnection]) {
             write-verbose "# $cmdline"
             Invoke-SSH -Connection $Connection -Command $CmdLine |
                 Select-Object -ExpandProperty text
          }
        }
}

Job done … well almost. I’ll talk about a couple of tricks I’ve used in combination with these in future posts.

August 22, 2011

Easy-formating for PowerShell with New-TableFormat

Filed under: Powershell — jamesone111 @ 10:09 am

Most of the PowerShell I write ends up as functions rather than scripts, and it’s worth differentiating between the two.

  • A function is something which can be called from somewhere else.
  • A script carries out a particular task (which might be to define some function(s), or might be done with functions)

Several of my 10 tips for better functions are about being smart with inputs and outputs (which is less important for one-off scripts). I should have included an extra point “Ask if constants should really be defaults for parameters”. If I notice “I am solving one specific case of X, with a value set to Y, so I can solve other X-problems if I allow the value to be changed.” it makes something reusable out of a one-off. For example, I put a wrapper round PowerShell’s great “Select-String” cmdlet to search PS1 files, then I realised that I might want to search .SQL or .XML files, but usually it would be PS1. So “*.PS1” became the default value for a -include parameter.
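As a rough illustration of turning a constant into a default, a wrapper along these lines would do. This is a sketch rather than my actual function: the name Find-InScript and the -Path parameter are invented for the example.

```powershell
# Hypothetical wrapper round Select-String; names are illustrative only.
function Find-InScript {
    param([Parameter(Mandatory=$true)][string]$Pattern,
          [string]$Path    = '.',
          [string]$Include = '*.ps1')   # once a hard-coded constant, now just the default
    Get-ChildItem -Path $Path -Recurse -Include $Include |
        Select-String -Pattern $Pattern
}
```

Called as Find-InScript 'Get-SQL' it searches PS1 files under the current folder; adding -Include '*.sql' re-uses the same code for SQL files without touching the function.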

Recently I put together a simple function named Get-SQL – it uses ODBC to send SQL commands – in my case to MySQL on a Linux box but it could be any ODBC source. To avoid constantly creating and tearing down sessions, it saves an ODBCconnection object in a global variable between invocations. Once it has been run for the first time in a session I can type Get-SQL “command” and use the existing connection; and I’ve made sure SQL statements can be piped in (that’s what I mean by being smart about inputs).

In its basic form Get-SQL takes 3 parameters: a connection string ($connection), a SQL statement ($SQL) and a -ForceNew switch, which remakes the connection if it already exists
if ($forceNew -or (-not $global:ODBCconn)) {
    $global:ODBCconn = New-Object System.Data.Odbc.OdbcConnection($connection)
    $global:ODBCconn.open()
}
if ($sql) {
    $adapter = New-Object system.Data.Odbc.OdbcDataAdapter(
    New-Object system.Data.Odbc.OdbcCommand($sql,$ODBCconn))
    $table = New-Object system.Data.datatable
    [void]$adapter.fill($table)
    $table
}
else      {$ODBCconn }

So, if a SQL statement is passed, any data rows it generates will be returned; but if I am just setting up a connection for the rest of my session the ODBC connection object is returned.

PS C:\Users\james\Documents\windowsPowershell> Get-SQL -connection "DSN=foo-Customer5"
ConnectionString  : DSN=foo-Customer5
ConnectionTimeout : 15
Database          : foo
DataSource        : 192.168.1.234 via TCP/IP
ServerVersion     : 5.0.77
Driver            : myodbc5.dll
State             : Open
Site              :
Container         :

I want to show something shorter and simpler: I could finish with
$ODBCConn | format-table DataSource , Database, State
that might be acceptable now, but I don’t know what problems I’m storing up for the future by not returning the full ODBC object.
The better option is to write a Formatting XML file – so the object remains the same but PowerShell displays it the way I want. But writing and debugging the XML longhand is tedious, so the temptation is to go with the format-table method and fix things up later. There is a short-cut to writing the XML which avoids parking the problem for later: New-TableFormat. Richard mentions that it came up during the PowerShell Deep Dive in Vegas, and I downloaded it from here.  (Slightly oddly you need to put Function New-TableFormat { } into the editor, paste the code block in, then run it.)

To use it you get the output the way you want with Format-Table and then replace “Format-Table” in the command line with New-TableFormat (with a couple of extra bits of finesse if needed). So it took about a minute to get the ODBC connection  to display like this

Data Source                Database State
-----------                -------- -----
192.168.1.234 via TCP/IP   foo      Open
  

I’ve also created wrappers around /n software’s Netcmdlets to make it easier to work with the linux server, and in one of those  I call
Connect-SSH -Server $server -Credential $credential -Force
which would normally output this :

AuthMode         : Password
CertPassword     :
CertStore        : MY
CertStoreType    : User
CertSubject      :
Config           :
Credential       : System.Management.Automation.PSCredential
FirewallHost     :
FirewallPassword :
FirewallPort     : 0
FirewallType     : None
FirewallUser     :
Force            : True
LocalIP          :
Password         :
Port             : 22
Server           : 192.168.1.234
ShellPrompt      :
SSHAccept        :
Timeout          : 10
User             :

It took about two minutes to get Add-Member to show what the host thinks its name is, and create a formatting XML file to get the result to look like this:

Connected to   Port  Host name        User name
------------   ----  ---------        ---------
192.168.4.71   22    foo.contoso.com  \root

It was all of another minute to create a manifest file to load my scripts & formatting XML as a PowerShell module and have everything neat and tidy. Cracking stuff

August 17, 2011

Adding “Out-Edit” functionality to PowerShell

Filed under: Powershell — jamesone111 @ 8:41 pm

Back in March I wrote about useful bits in my PowerShell profile. Recently I’ve wanted to take the results of what I’m doing into an editor. It’s easy enough to pipe a result into CLIP.EXE which puts it in the clipboard, but it’s not too hard to use the object model of PowerShell ISE to do it directly. 

Function Out-Edit {
  Param   ( [parameter(ValueFromPipeLine= $true)][Alias('Text')]$inputObject, 
            [String]$Before,
            [String]$After,
            [Switch]$New
           )
   begin   { if ($new)    {$Editor = $psise.CurrentPowerShellTab.Files.Add().editor }
             else         {$Editor = $psise.CurrentFile.Editor }
            if ($before) {$Editor.InsertText($before) } 
           }
   process { $Editor.InsertText($inputObject ) }
   end     { if ($after) {$Editor.InsertText($after) } }
}

I gave the function a –New switch to decide if the input goes into the current file or a new one, and –Before and –After parameters to allow the input to be topped and tailed – typically with something like $Foo=@" and "@ to make my output into a here-string. After that it’s just a case of getting a new or current file’s editor and calling its .InsertText() method.

Note: In that March piece I showed my replacement for the default PSEDIT function, in the end I merged it with the function above and I’ve put my profile.ps1 on skydrive. In the end I changed the NEW switch, to a “UseExisting” switch, because I found that I almost always needed the new option.

August 7, 2011

Prayer-based parsing: PowerShell regular expressions again.

Filed under: Powershell — jamesone111 @ 9:52 am

 

Recently I saw a quote on Scott Hanselman’s blog

“You’ve got a problem, and you’ve decided to use regular expressions to solve it.
Ok, now you’ve got two problems…”

A telling of the history of this quote provides some other good quotes; some restate “Using X as a solution adds to the problems”, another expanded a point I heard Thomas Lee make at one of his PowerShell camps. In PowerShell you can pipe a folder object, or a process object, between steps and get its name from a .name property. Without this model, as one of the quotes puts it:

The decades-old Unix “pipe” model is just plain dumb, because it forces you to think of everything as serializable text, even things that are fundamentally not text, or that are not sensibly serializable

(DOS copied the Unix pipe model, and it’s still in Windows – away from PowerShell).
Thomas had a great term for de-serializing text: “Prayer based parsing”. Every time you extract the bits you want from the text, you need to pray that it (still) works with your parsing rules. Some of the more arcane switches for command-line tools (on whatever OS) are to control their output – with one eye on simplifying the job of parsing it. 

Some tasks – like “screen scraping” – leave no alternative but to parse text. (“A text only “pipe” is just automated screen scraping” is another way to describe the problem PowerShell’s pipeline solves.) Regular expressions are the “big hammer” for advanced parsing (some of the “quotes” article gets into whether in Perl they are the most suitable hammer, or the most obvious hammer), and they’re something which PowerShell handles deftly with the –match and –replace operators.

I recently wanted to take a list of companies offering training which I had found on the web, and transform it into a more database-like format. In my screen-scraped list each company had an entry which looked like this

Contoso Ltd S-1234
123 Station Road
Aberdeen
AB12 3DE
info@Contoso.com
http://www.contoso.com
Ph: (44) 123-456789
Fax: (44) 123-456790

Sign Up for a Online Course

approx: 0.8km / 0.5mi from London, GB, United Kingdom

I copied the list, opened a new tab in the PowerShell ISE and created a small snippet of PowerShell
$list = @"

"@

The @” … “@  defines a multi-line “here-string” – so I can paste the data in between the quotes, hit the Run Script button and ta-da! my list is in a PowerShell variable. Studying the format, all the data I want is on consecutive lines – with a double line space after them. The –replace operator can remove single line breaks, leaving additional ones, to group the text I want on one line.
$list -replace "\r\n(?=\w)", ", "
\r and \n indicate the return and newline characters. So -replace "\r\n", ", " would replace all line breaks with a comma and a space.
The (?=  ) construction specifies a “look ahead”, which says “match ONLY if what you have found is followed by…” and I want the replacement to happen only if the return/newline is followed by a ‘word’ character (\w) – so a line break followed by another line break won’t be replaced.
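Here is the look-ahead at work on a cut-down sample. The two entries are invented for the illustration and I've built the string with explicit `r`n pairs so the line breaks are definitely return/newline.

```powershell
# Two made-up entries, with CRLF line breaks and a blank line between them.
$sample = "Contoso Ltd S-1234`r`n123 Station Road`r`nAberdeen`r`n`r`nFabrikam Inc S-5678`r`n1 High Street"

# Replace a line break only when a word character follows it;
# the look-ahead checks the next character without consuming it.
$joined = $sample -replace "\r\n(?=\w)", ", "
$joined
```

The break after Aberdeen is followed by another return character rather than a word character, so it survives; the one after it is followed by the F of Fabrikam and becomes a comma and a space. That leaves a line break, comma and space between the two entries and nowhere else.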

Now I can split my list – which is still a single giant string – into multiple strings. Where I had two line breaks I now have one line break, a comma and a space, so I can use that as the expression for the –split operator.
($list -replace "\r\n(?=\w)", ", ") -split "\r\n, "

So far so good, but this will give me the distances and the call to take a course. I want to discard any lines which don’t contain a company’s ID number

($list -replace "\r\n(?=\w)", ", ") -split "\r\n, " |
    foreach { if ($_ -match "\ss-\d+\s*,") {'"' + $_ + '"'}}

This looks at each line and if it contains a space, ‘S’, ‘-’, at least one digit, any number of spaces (including zero) and then a comma, it returns the line wrapped in double quotes. So now my text is one line per company looking like this.

“Contoso Ltd S-1234, 123 Station Road, Aberdeen, AB12 3DE, info@Contoso.com, http://www.contoso.com, Ph: (44) 123-456789 , Fax: (44) 123-456790”

My next step is to replace the space before the ID number with ‘ “,” ’ 

(($list -replace "`r`n(?=\w)", ", ") -split "`r`n, " |
      foreach { if ($_ -match "\ss-\d+\s*,") {'"' + $_ + '"'}}
) -replace "\s+(?=S-\d+\s*,)", '","'

"\s+(?=S-\d+\s*,)" says “match on one or more spaces ONLY IF followed by S, -, at least one digit, any spaces (or none) and then a comma.”  Wherever that occurs I insert “,” breaking my line into two quoted strings separated by a comma – just like a CSV file.  The next step is to make it three quoted strings by replacing the comma and spaces after the ID number with the same ‘ “,” ’ combination

(($list -replace "`r`n(?=\w)", ", ") -split "`r`n, " |
      foreach { if ($_ -match "\ss-\d+\s*,") {'"' + $_ + '"'}}
) -replace "\s+(?=S-\d+\s*,)", '","'  -replace "(?<=S-\d+),\s*",'","'

"(?<=S-\d+),\s*" uses  (?<=   ) which is the look-behind construction, saying  “match on comma and any spaces (or none) ONLY IF it is preceded by ‘S’, ‘-’ and at least one digit”.  Now my line looks like this

“Contoso Ltd”,”S-1234”,”123 Station Road, Aberdeen, AB12 3DE, info@Contoso.com, http://www.contoso.com, Ph: (44) 123-456789 , Fax: (44) 123-456790”
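The look-behind step can be tried on its own with a shortened line (the sample text is made up for the purpose). It is worth knowing that the .NET regex engine which PowerShell uses accepts a variable-length pattern such as \d+ inside a look-behind, which many other regex engines do not.

```powershell
# A shortened, made-up line as it stands after the first -replace.
$line = '"Contoso Ltd","S-1234, 123 Station Road, Aberdeen"'

# Replace a comma (and any spaces after it) only when it comes straight after the ID.
$fixed = $line -replace "(?<=S-\d+),\s*", '","'
$fixed
```

Only the comma directly after S-1234 is touched; the commas inside the address are left alone because they are not preceded by an ID.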

The next task is to put the ” , ” combination before an email address:

(($list -replace "`r`n(?=\w)", ", ") -split "`r`n, " |
      foreach { if ($_ -match "\ss-\d+\s*,") {'"' + $_ + '"'}}
) -replace "\s+(?=S-\d+\s*,)", '","' -replace "(?<=S-\d+),\s*",'","' `
  -replace ",\s* (?=\w+@\w+\.\w+)" ,'","'

",\s* (?=\w+@\w+\.\w+)" says “match on comma and any spaces ONLY IF followed by at least one character , an @ sign, at least one character , a dot, and at least one character.”
Then I can use ",\s*(?=http)" which matches on comma and any spaces ONLY IF followed by http, and finally "\s*,\s*Ph:\s*" and "\s*,\s*Fax:\s*" match on the Ph and Fax tags and surrounding spaces, to replace each of those with “,”.

Instead of using a different –replace operation for each step the terms  can be condensed into a single expression with | for “or” between each part.  You can see that building the expression up bit by bit is a lot easier than writing it in one go.

(($list -replace "`r`n(?=\w)", ", ") -split "`r`n, " |
      foreach { if ($_ -match "\ss-\d+\s*,") {'"' + $_ + '"'}}
) -replace "\s+(?=S-\d+\s*,)|(?<=S-\d+),\s*|, (?=\w+@\w+\.\w+)|,\s* (?=http)|\s*,\s*Ph:\s*|\s*,\s*Fax:\s*" , '","'

This turns my text into

“Contoso Ltd”,”S-1234”,”123 Station Road, Aberdeen, AB12 3DE”,”info@Contoso.com”,”http://www.contoso.com”,”(44) 123-456789”,”(44) 123-456790”

There is potentially more cleaning up I could do – for example identifying lines without email or URL sections and inserting a blank “”, in their place. But this gives me all I need; by writing a header row to a file and adding my text to it I would get a ready made CSV file for Excel. I could have turned it into a bulk database import or used PowerShell’s ConvertFrom-Csv cmdlet to turn each row into an object or any number of other things. I don’t think I would have tried typing that line in one go: but the PowerShell-induced habit of building up these long lines a little at a time meant it only took a couple of minutes to do.
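For the ConvertFrom-Csv route, something like the following sketch would turn each line into an object. The column names are my own invention for the example, and the single row is the sample company from above typed with plain straight quotes.

```powershell
# One cleaned-up row; in practice this would be the output of the -replace pipeline.
$rows = @(
  '"Contoso Ltd","S-1234","123 Station Road, Aberdeen, AB12 3DE","info@Contoso.com","http://www.contoso.com","(44) 123-456789","(44) 123-456790"'
)

# Hypothetical column names, supplied here instead of writing a header row to a file.
$header = 'Name','ID','Address','Email','Web','Phone','Fax'

$companies = $rows | ConvertFrom-Csv -Header $header
$companies | Select-Object Name, ID, Email
```

Because the address is inside quotes, the commas within it stay in one field rather than spilling into the next columns.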

July 27, 2011

A “Tail” command in PowerShell

Filed under: Powershell — jamesone111 @ 3:46 pm

I mentioned that I have been working for a small software company and in this new role I’m having to work with Linux servers and MySQL. MySQL has proved rather better than I expected and Linux itself worse – at least from the perspective of being a place for me to get work done (I’ll leave the reasons why for another time).  Our app generates quite a lot of log files, and so do some of the services it uses, and so most of the time I’ll have at least one window open running the Unix tail –f command (tail outputs the last few lines from a file, and –f tells it to follow the file – that is, to keep watching it and output anything added to it).  There should be one for Windows but a quick search didn’t turn one up – so I put one together with PowerShell which pulls some interesting techniques from more than one place.

Watching a file

The first thing that’s needed to give the follow functionality is the ability to know when the log file has changed. I found that James Brundage’s PowerShell Pack gave me ALMOST what I needed. Here’s my modified version

Function Register-FileSystemWatcher {
param(  [Parameter(ValueFromPipelineByPropertyName=$true,Position=0,Mandatory=$true)]
        [Alias('FullName')] [string]$Path,
                            [string]$Filter = "*",
                            [switch]$Recurse,
        [Alias('Do')][ScriptBlock[]]$Process,
                                    $MessageData,
                         [string[]]$On = @("Created", "Deleted", "Changed", "Renamed"))
begin   { $ValidEvents = [IO.FileSystemWatcher].GetEvents() |
                            select-object -ExpandProperty name
        }
process {
    $realItem = Get-Item -path $path -ErrorAction SilentlyContinue
    if (-not $realItem) { return }
    if ($realItem -is [system.io.fileinfo]) {
           $Path = Split-Path $realItem.Fullname
         $Filter = Split-Path $realItem.Fullname -leaf }
    else { $Path = $realItem.Fullname}
    $watcher = New-Object IO.FileSystemWatcher -Property @{
                        Path=$path;
                      Filter=$filter;
       IncludeSubdirectories=$Recurse}
    foreach ($o in $on) {
      if ($validEvents -Ccontains $o) { #Note CASE SENSITIVE CONTAINS...
        foreach ($p in $process) {
           # Register each script block in turn as the action for this event
           if ($p){Register-ObjectEvent $watcher $o -Action $p -MessageData $MessageData}
        }
      }
      Else {Write-warning ("$o is an invalid event name and was ignored." +
                 [Environment]::NewLine + "Names are case sensitive and valid ones are" +
                 [Environment]::NewLine + ($ValidEvents -join ", ") + ".")
      }
    }
}
}

The Function takes a path (usually a folder but possibly a file which can be passed via the pipeline) a filter, and a –recurse switch which determine what to watch. If the path isn’t valid the function drops out, and if the path is a file it is split into name and folder parts – the name becomes the filter and the folder becomes the path.
The path, filter and recurse are used to create a FileSystemWatcher object. A FileSystemWatcher raises events, and Register-ObjectEvent hooks PowerShell up to these events: the cmdlet says “when this event happens on that object, run this code block”. Usefully – as I learnt from a post on Ravikanth Chaganti’s blog – you can pass something to the registration for use later on, which I’ll come back to when we see the function in use. I do a quick check to see that the event name passed is valid before trying to hook up the code to it, generating a warning if it isn’t.

The tail command I created is named “Get-Tail” for two reasons,
(a) What gets returned is the “tail of the file” so GET- is the right verb to use out of the standard ones
(b) If PowerShell can’t find a command name it tries Get-Name as an alias, in other words:
 Get-Tail can be invoked simply as tail. It looks like this

function Get-tail {
param ( $path,
      [int]$Last = 20,
      [int]$CharsPerLine = 500,
      [Switch]$follow
)
$item = (Get-item $path)
if (-not $item) {return}
$Stream = $item.Open([System.IO.FileMode]::Open,
                   [System.IO.FileAccess]::Read, 
                    [System.IO.FileShare]::ReadWrite)
$reader = New-Object System.IO.StreamReader($Stream)
if ($charsPerLine * $last -lt $item.length) {
       $reader.BaseStream.seek((-1 * $last * $charsPerLine) ,[system.io.seekorigin]::End)
}
$reader.readtoend() -split "`n" -replace "\s+$","" | Select-Object -last $Last | write-host
if ($follow) {
      $Global:watcher = Register-FileSystemWatcher -Path $path -MessageData $reader `
                            -On "Changed" -Process {
               $event.MessageData.readtoend() -split "`n" -replace "\s+$","" | write-host }
      $oldConsoleSetting = [console]::TreatControlCAsInput
      [console]::TreatControlCAsInput = $true
      while ($true) {
         if ([console]::KeyAvailable) {
                       $key = [system.console]::readkey($true)
                       if (($key.modifiers -band [consolemodifiers]"control") -and
                           ($key.key -eq "C")) {
                            write-host -ForegroundColor red "Terminating..." ; break }
                       else { if ([int]$key.keyCHAR -eq 13) { [console]::WriteLine() }
                              else { [console]::Write($key.keyCHAR) }} }
         else {Start-Sleep -Milliseconds 250} }
      [console]::TreatControlCAsInput = $oldConsoleSetting
      Unregister-Event $watcher.name
   }
$Stream.Close() 
$reader.Close()
}

Opening a sharable file in PowerShell

The function takes 4 parameters: –Path, –Last, a –Follow switch and –CharsPerLine, which was a bit of an afterthought. The .Open() method is used to open the file as a read-only FileStream allowing writes by others, and a StreamReader object is created to read from this stream.

By using the Reader’s .ReadToEnd() method I could be ready to read anything which is added to the end of the file, and output the result, splitting it on new lines and removing end-of-line spaces – all of which is not much more than I could have done with Get-Content.
I added a refinement after realizing that I occasionally deal with log files which are over a Gigabyte in length. Not only will they take ages to read, but using .ReadToEnd() will try to read the whole file into memory which is just horrible. So I added a –CharsPerLine parameter – I multiply this by the number of lines I want to read and if the file is bigger than that I seek forward to that many bytes from the end of the file, before calling .ReadToEnd(). The default is a generous 500, so if I request 2000 lines I’ll read 1MB of data which isn’t too terrible even if the average line length is only a few characters. If I’m reading 20,000 lines I might set the parameter lower, or if I know the lines are very long I might set it higher. Then everything is set up for the optional Follow part.
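The seek-then-read idea can be tried in isolation with a throw-away file. This is a cut-down sketch rather than the function itself: the temporary file, its contents and the small numbers are only there to make the effect easy to see.

```powershell
# Write a throw-away file of 1000 short lines (path and contents are illustrative only).
$path = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($path, (1..1000 -join "`n"))

$last = 5; $charsPerLine = 10
# Open read-only, but let other processes keep writing to the file.
$stream = [System.IO.File]::Open($path, [System.IO.FileMode]::Open,
                                 [System.IO.FileAccess]::Read,
                                 [System.IO.FileShare]::ReadWrite)
$reader = New-Object System.IO.StreamReader($stream)

# Only seek if the file is bigger than the chunk we intend to read.
if ($last * $charsPerLine -lt $stream.Length) {
    [void]$stream.Seek(-1 * $last * $charsPerLine, [System.IO.SeekOrigin]::End)
}

# The chunk may start part-way through a line, but the last few lines are complete.
$tail = $reader.ReadToEnd() -split "`n" | Select-Object -Last $last
$reader.Close()          # closing the reader also closes the underlying stream
Remove-Item $path
$tail
```

Only the final 50 bytes are read here, however long the file is. The catch is that if the real lines are longer than the CharsPerLine guess, fewer than the requested number of lines come back.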

Who reads for the Watchers ?

I want to call the .ReadToEnd() method when the file changes and output everything up to the end of the file. The question is, how to have access to the StreamReader inside the script which runs when FileSystemWatcher fires its changed event ? This is where Ravikanth Chaganti’s tip comes in; by making the Reader the “Message Object” for the event, it can be referenced in the script block.  By the way because I don’t know what else might end up happening I force the output to the console throughout – though my normal custom is to avoid using write-host
{$event.MessageData.readtoend() -split "\s*`n" | write-host }
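
A sketch of how that script block might be wired up, assuming a throwaway file and a made-up source identifier ('TailSketch'); the real function differs in detail. Register-ObjectEvent's -MessageData parameter is what makes the reader visible inside the action as $Event.MessageData.

```powershell
# Hypothetical sample file for the sketch.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'watch-sketch.log'
Set-Content -Path $path -Value 'first line'

$stream = [System.IO.File]::Open($path, 'Open', 'Read', 'ReadWrite')
$reader = New-Object System.IO.StreamReader -ArgumentList $stream
[void]$reader.ReadToEnd()                     # move to the current end of the file

# Watch just this one file in its folder.
$watcher = New-Object System.IO.FileSystemWatcher -ArgumentList (Split-Path $path -Parent), (Split-Path $path -Leaf)
$watcher.EnableRaisingEvents = $true

# The reader travels with the subscription as its message data.
$subscription = Register-ObjectEvent -InputObject $watcher -EventName Changed -SourceIdentifier 'TailSketch' -MessageData $reader -Action {
    $Event.MessageData.ReadToEnd() -split "\s*`n" | Write-Host
}

[System.IO.File]::AppendAllText($path, "second line`n")   # another writer adds to the file
Start-Sleep -Seconds 1                                     # give the event a chance to fire

# Clean up.
Unregister-Event -SourceIdentifier 'TailSketch'
$watcher.Dispose()
$reader.Close()
$stream.Close()
```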

Taking Control of Control+C

Finally – to mimic the behaviour of tail on Unix I trap keyboard input and pass it through to the console until the user presses [Ctrl][c].  This works much better in the “Shell” form of PowerShell than the ISE. When [Ctrl][c] is pressed the function cleans up and exits.

Job done. 

Update: I’ve put the script here for download

July 20, 2011

More on regular expressions–(another reason why PowerShell beats VBscript)

Filed under: Powershell — jamesone111 @ 8:26 pm

For the last few weeks I’ve been settling into a new job in a small company which writes software; and that has meant getting used to some new tools. One of these is an issue tracking system named JIRA. This post isn’t about JIRA, except to say it generates a lot of e-mails – what this post is about is parsing standardized blocks of text and JIRA’s subject lines provide a good example.  If we create an ‘issue’ to look at how some data gets processed we might end up with a pile of mails with subject lines like this.

[JIRA] Created: (Foo-164) Test Run – Data type # 2
[JIRA] Commented: (Foo-164) Test Run – Data type # 2
[JIRA] Updated: (Foo-164) Test Run – Data type # 2
[JIRA] Assigned: (Foo-164) Test Run – Data type # 2
[JIRA] Resolved: (Foo-164) Test Run – Data type # 2
[JIRA] Closed: (Foo-164) Test Run – Data type # 2

JIRA accounts for more than half of my messages, so I set up an Outlook rule to move them to their own folder. The format of the message is standardized.

“[JIRA]” Event type “(“ project-ID “-” counter “)” issue-description

So it is pretty easy to have a rule which looks for “[JIRA]” and moves messages to a new folder, but the format doesn’t help me sort and group messages – Outlook and Exchange can’t tell that the 6 messages above should be a single conversation. Sorting by date muddles all the issues and doesn’t group by project (“Foo” in my example). Sorting by subject groups all the created together, all the closed together and so on, which is no better.  So I came up with the idea of rewriting the subject line – which naturally I did from PowerShell to begin with. First I had to get the JIRA folder in my inbox. The method for this hasn’t changed since the first version of Outlook, even though the languages come and go.

$ns = (New-Object -ComObject outlook.application).getNameSpace("MAPI")
$ExchStore = $ns.stores | where-object {$_.ExchangeStoreType -eq 0}
$jiraFolder = $ExchStore.folders.item('Inbox').folders.item('JIRA')

The next bit does the work. I wrote it as ONE line of PowerShell but I’ve spaced it out here for easy reading

foreach ($item in $jiraFolder.Items) {
    $item.Subject = $item.Subject -replace "^\[JIRA\](.*?)\((\w*-\d*)\)",'$2 $1'
    $item.Save() }

The replace operation needs what a friend of mine calls a “Paddington hard stare” to understand it. 

  • “Look for the start of a line followed by “[JIRA]” ” is coded as   ^\[JIRA\]  
    The  [] characters have special meaning in a regex so need to be escaped with a \ sign.
  • “Any sequence of characters followed by “(” ” is coded .*\( 
    Like their square cousins, the () characters also have special meaning – which I’ll come to – and so they, too, need to be escaped with a \ sign.
  • Regular expressions are naturally “greedy” so if my subject line had been
    “[JIRA] Closed: (Foo-164) Test Run ( Data type # 2)”
    the term .*\( would match all the way up to the second ( character.
    Putting a ? symbol after the * tells it to match using the shortest sequence of characters it can, so using  .*?\(  reduces the
    match to “ Closed: (”
  • Wrapping part of the expression in () saves it for later, so (.*?)\( will capture “ Closed: ” (spaces included) in this example
  • \w*-\d*\) means any number of “word” characters, a “-” character, any number of “digit” characters and a “)”  – which in my example will match on “Foo-164)”
  • I can capture “Foo-164” by inserting () giving (\w*-\d*)\)

Putting the pieces together I get ^\[JIRA\](.*?)\((\w*-\d*)\) which matches on “[JIRA] Closed: (Foo-164)” and makes “ Closed: ” and “Foo-164” available in the replacement text as $1 and $2 (note that PowerShell will process these as variables if the replacement text is wrapped in double quotes, so to ensure they reach the regular expression parser single quotes are needed).  So the –replace operation replaces “[JIRA] Closed: (Foo-164)” with “Foo-164  Closed: ” which is much better for sorting.
In practice I developed this a bit further by putting information into the message’s userProperties  but my initial prototype shows the idea.
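
The pieces can be tried at the prompt without Outlook. This is a small sketch using plain strings in place of mail items; the third subject is an invented non-JIRA message, and the sample subjects use an ordinary hyphen.

```powershell
$pattern  = '^\[JIRA\](.*?)\((\w*-\d*)\)'
$subjects = '[JIRA] Created: (Foo-164) Test Run - Data type # 2',
            '[JIRA] Closed: (Foo-164) Test Run ( Data type # 2)',
            'Lunch on Friday?'

# Rewrite each subject; anything which does not match comes back unchanged.
$rewritten = $subjects | ForEach-Object { $_ -replace $pattern, '$2 $1' }
$rewritten

# Greedy versus lazy matching against the subject with two "(" characters.
$null = $subjects[1] -match '^\[JIRA\].*\('  ; $greedy = $matches[0]
$null = $subjects[1] -match '^\[JIRA\].*?\(' ; $lazy   = $matches[0]
"Greedy: $greedy"
"Lazy:   $lazy"
```

The captured spaces around the action are carried into the result, so the first subject becomes “Foo-164  Created:  Test Run - Data type # 2” with doubled spaces; adding \s* either side of the first capture group would tidy that up.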

I ran this against the messages I had already received and it worked very nicely. But it would only work as an on-going solution if I was happy to run my PowerShell script for every new mail. I’m not. So I need something that Outlook can run automatically – Macros (Functions written in Outlook’s VBA environment) can be invoked by telling the rule to Run a Script. I found that the Move-to-folder rule-operation needs to be moved into the script (if the rules engine moves the message, the script doesn’t work).  The line of PowerShell that was in the body of the ForEach loop in my example  turns into this:

strSubject = objItem.Subject  
If Left(strSubject, 6) = "[JIRA]" Then
    openParen = InStr(strSubject, "(")
    closeParen = InStr(openParen, strSubject, ")")
    StrJIRASubject = Mid(strSubject, (closeParen + 2))
    strJIRAAction = Mid(strSubject, 8, (openParen - 9))
    strJIRAID = Mid(strSubject, (openParen + 1), (closeParen - openParen - 1))
    objItem.Subject = strJIRAID + " " + strJIRAAction + " " + StrJIRASubject
    objItem.Save
End If

You can see that here I have to check that the line begins with [JIRA]: using –replace in PowerShell makes no changes if the match isn’t found, so doesn’t need to check – but VB script might scramble a non-JIRA subject line.
Then the script finds the positions of the Parentheses.
Then it has to isolate the pieces of text after the closing one, before the opening one but after “[JIRA] ” and between the two.
Then it assembles these parts to make the new subject line and saves the message.

I know people find regular expressions tough to follow but I’d say it was pretty hard to tell what is going on in the script. For example, take the line
strJIRAAction = Mid(strSubject, 8, (openParen - 9))
Why select from character 8 for openParen –9 characters ?  

[Answer: “[JIRA]” is 6 characters, character number 7 is a space so the word “Closed:” begins at character 8.  From there up to the “(” is openParen – 8 characters; we want to stop one before that to drop the space, so we read openParen – 9 characters and store “Closed:” in strJIRAAction].
It takes time to work this out.  If you know a little of the language of regex it takes less time to see that ^\[JIRA\](.*?)\( will put “ Closed: ” into $1

But there’s more.  I alluded to the need to check for [JIRA] at the start of the subject line in VB script.  The –Replace operator in PowerShell “fails safe”. Suppose a future version of JIRA changes from “[JIRA] Closed: (Foo-164)” to “[JIRA] Foo-164: Closed”: the regular expression no longer matches, so nothing is changed. The VB script continues to run and … well you can try to work out what my Macro will do. The iterative development we tend to do in PowerShell lets you enter
"[JIRA] Closed: (Foo-164) Test (type 2)" -match "^\[JIRA\].*\((.*)\)" ; $matches
as a command line and see and fix the problem.  Something you just can’t do so well with VBscript.
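
A sketch of both points at the prompt. The unparenthesised subject is a hypothetical future format, not something JIRA is known to produce.

```powershell
$pattern = '^\[JIRA\](.*?)\((\w*-\d*)\)'

# Fail-safe: with no parenthesised issue ID there is no match, so -replace returns the input.
$odd       = '[JIRA] Foo-164: Closed'
$unchanged = $odd -replace $pattern, '$2 $1'
$unchanged -eq $odd

# Iterating on a pattern: the greedy version captures the wrong pair of parentheses...
$subject = '[JIRA] Closed: (Foo-164) Test (type 2)'
if ($subject -match '^\[JIRA\].*\((.*)\)')   { $greedyCapture = $matches[1] }
# ...and making both quantifiers lazy picks up the issue ID instead.
if ($subject -match '^\[JIRA\].*?\((.*?)\)') { $lazyCapture = $matches[1] }
"Greedy captured '$greedyCapture', lazy captured '$lazyCapture'"
```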

March 25, 2011

PowerShell parameters revisited.

Filed under: Powershell — jamesone111 @ 8:49 pm

A little while back, as a follow up to a talk I’d given, I wrote a post entitled why it is better not to use PowerShell Parameter validation. I repeated the talk recently and met up with Thomas who’d organized both.  His initial instinct had been that my “best practice” of NOT declaring parameter types was just wrong – it’s not a surprising view given what he has been exposed to….

Over a quarter of a century ago, when I was studying computer science at University, one of the lecturers wrote the following quote from Dijkstra on the board: “It is practically impossible to teach good programming to students that have had a prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration.”
I might be pushing a heavy stone up a steep hill here, trying to overcome the “mental mutilation” of too much “good programming”. But to see why “good programmers” can get the declaration of parameter types wrong in PowerShell, consider this example:
Function Use-File { Param ([System.io.FileInfo]$theFile )
     Format-list -InputObject $thefile -Property *
}

[string]$f=".\Event.txt"
Use-File $f

Simple stuff: I declare a function with one parameter – a FileInfo object – which outputs a list showing all the properties of that object.
Then I pass a string to the function – and it contains a relative path to a file held in the current folder. So … What comes out of the function ?

  1. An error, or
  2. The properties for the file in the current folder , or
  3. Something else ?

If you have a “good programming” background I’d expect you to say “An error”. Experience tells you passing a type other than the specified one does that. Not here.
In a shell you don’t want to worry about the type of object which comes out of one command or goes into the next, so PowerShell smooths out differences between types. Ask it to evaluate “Fire ” + 1 and it does an implicit conversion – usually called a type cast – of the second argument to the type of the first and returns “Fire 1”.
(1 + “Fire” fails because when it applies the rule of converting to match the first, “fire” can’t be turned into a number).
After spending a while doing only PowerShell, a compiled language like C# seems to need a huge number of “cast to String”, “Isn’t null”, “Isn’t empty”, “Is non-zero” operations, because it requires cast to string or Boolean types to be explicit.  It’s not that one is good and the other bad; the different environments impose different requirements.

In PowerShell putting [Type] is how you explicitly cast something to a different type. [FileInfo] does not say “Reject anything which isn’t a file” as it would in C#,
it says to PowerShell “Convert anything which isn’t a file…”.
So, what conversion does it attempt? The underlying FileInfo .net object has a constructor which accepts a string and returns a file, so PowerShell creates a new FileInfo object using the constructor and passing in the string.  Unfortunately the constructor doesn’t resolve the relative path against PowerShell’s current location; it uses the working directory of the PowerShell process, so the object which comes back represents (in my case) a non-existent read-only file in the windows\system32 folder. Yes you did read correctly, it doesn’t exist and you can’t write to it.
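
The effect is easy to reproduce. This sketch creates a scratch folder (the folder name is an assumption), moves PowerShell's location there and compares the cast with a lookup through Get-Item, which does honour the current location.

```powershell
# Scratch folder containing a real Event.txt.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ('cast-sketch-' + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $dir
Push-Location $dir
Set-Content -Path 'Event.txt' -Value 'hello'

$relative = Join-Path '.' 'Event.txt'           # .\Event.txt on Windows

[System.IO.FileInfo]$viaCast = $relative        # the cast goes through the .NET constructor
$viaItem = Get-Item -Path $relative             # the provider resolves against the current location

"Cast    : $($viaCast.FullName)  (exists: $($viaCast.Exists))"
"Get-Item: $($viaItem.FullName)  (exists: $($viaItem.Exists))"
"Process working directory: $([Environment]::CurrentDirectory)"

Pop-Location
```

The cast's FullName sits under the process working directory, which Set-Location and Push-Location do not change; Get-Item finds the file that was actually meant.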

I was taught to be as clear as the tools allowed about data types. It’s my contention that “good programmers” learn that it is dangerous to assume parameters will be an acceptable type; so when it is optional to specify types they will still do so out of habit. In PowerShell that leads to a more dangerous (and hidden) assumption- that what is passed will be the correct type or will fail if it can’t be cast correctly.  But what if a casting operation yields a nonsense result as it does here?

“Good programmers” accept writing something for type changes as a necessary safety mechanism; at a command line it’s unnecessary pedantry. But PowerShell is both command line AND programming environment; and when we program in PowerShell we want to do it well, don’t we ? 
In that previous post I argued functions should cope with being passed names (paths in the example) and resolve them to the desired object (a file).  If a user expects to supply a file to your function by typing part of its name and hitting the [TAB] key then you have to write something after the parameter declaration (like a resolve-path statement) to cope with the change of type; it’s slightly different code to that which C# programmers need to write, but we don’t escape the task completely.

You might have thought ahead and asked “What would happen if I wrote”
Param ([String]$theFile) instead ? If the function is passed a file won’t casting it to string give me the full path ? That WILL work, but here’s an example to show why you should not use [String]:

Function Use-word {           
    Param ([string]$theword )           
    write-host "**$theWord**"           
}           
           
$w = "Hello","world"           
Use-word $w

This  function takes a single string  – and writes that string to the console: but I haven’t passed it a single string but an array containing  “Hello” and “world”.  I’ll pose the same question as before: what comes out ?

  1. An error.
  2. **Hello** (the first item in the array “wins”)
  3. **World** (the last item in the array “wins”)
  4. **Hello World**  
  5. Something else.

If your instincts said “An array of X into something which takes a single X won’t go so this should give an error” but you now doubt them, I’ve achieved something.
In fact, when PowerShell casts an array to a single string it converts the members to strings and concatenates them with spaces between each. [string]("hello","world") returns “hello world”.  With paths, two valid paths will become a string which is not a valid path.
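
A quick sketch to confirm this at the prompt. It assumes $OFS has not been changed from its default, and the function returns its string rather than using write-host so the result can be inspected; the two paths are invented.

```powershell
Function Use-Word {
    Param ([string]$theWord)
    "**$theWord**"                      # emit the string instead of writing to the host
}

$fromArray = Use-Word ("Hello","world")                        # the array is cast to one string
$joined    = [string]("hello","world")
$twoPaths  = [string]('C:\temp\a.txt','C:\temp\b.txt')         # hypothetical paths

$fromArray
$joined
$twoPaths
```

The last line shows the problem for paths: the result is one string containing both names separated by a space, which names neither file.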

A lot of PowerShell cmdlets will accept arrays instead of a single parameter.  Operators also work with arrays  "hello","world" -split "l" gives the same result as
"hello" -split "l" ; "world" -split "l".
For a cmdlet example, you can get a list of QFEs (Hotfixes) installed on a computer with  Get-WmiObject -class win32_QuickFixEngineering
You can find the results on multiple machines (if they are set up for remote WMI) with
Get-WmiObject -class win32_QuickFixEngineering –computer Server1,Server2,serverN
because the Get-WmiObject cmdlet accepts an array in the –computername parameter.

Any function which takes a –computername parameter does not even need to look at it before passing to Get-WmiObject cmdlet.  But this seems wrong or at least lazy: surely we should catch an error as soon as possible ? Here’s one last example to show what difference it makes:

Function Use-integer {
   Param ($theNumber )
  start-sleep $theNumber
}

$n = "hello"
Use-integer $n

So here I don’t validate that $theNumber is actually a number, and I pass in “hello”, and Start-Sleep will generate an error which looks like this

Start-Sleep : Cannot bind parameter 'Seconds'.
Cannot convert value "hello" to type "System.Int32".
Error: "Input string was not in a correct format."

If I DO validate that $theNumber is a number (by declaring it as [int]), I get this

Use-integer : Cannot process argument transformation on parameter 'theNumber'.
Cannot convert value "hello" to type "System.Int32".
Error: "Input string was not in a correct format."

Early validation didn’t gain anything in this case. It might have saved me from using a lot of time/system resources before the error. If $ErrorActionPreference is set to “Continue”, which it is by default, my function might continue on to something stupid, and early validation would stop that.  So it is not universally wrong to use [type] to validate parameters, you just need to ask two questions: “Does what I am doing catch what I need to ?” and “Does what I am doing handle that in the best way ?”. Best way includes handling names and paths for objects, handling arrays and even (if you’re really bold) handling script block parameters. This post has gone quite long enough so I’ll talk about those in another post.

March 21, 2011

What’s in my PowerShell profile (2) edit

Filed under: How to,Powershell — jamesone111 @ 11:59 am

Carrying on with the theme of Useful Stuff For A PowerShell Profile, which I started with WhatHas, I want to show the edit command I added.
To begin with – in the beta releases – I was a little bit annoyed and puzzled that there was no easy way to tell the PowerShell ISE to edit a file, but when I found there is a route to do so through the object-model I added an edit function to my profile. A very similar function is in the release version as PSEdit, but I’ve kept mine, which looks like this:

function edit {
   param ([parameter(ValueFromPipelineByPropertyName=$true)] 
          [Alias("FullName","FileName")]$Path
   )
   process {
       if (test-path $path) {
           Resolve-path $path -ErrorAction "silentlycontinue" | foreach-Object {
               if ($host.name -match "\sISE\s") {
                   $psise.CurrentPowerShellTab.Files.add($_.path) | out-null }
               else {notepad.exe $_.path} }
       }
       else {write-warning "$path -- not found "}
   }
}

It only takes one parameter, which can come via the pipeline – enabling piping is a habit I’ve developed. I use a couple of PowerShell’s clever tricks with parameters – first, instead of using the whole piped object it can use a single named property of the object, and second, aliases let me invoke edit with –path or –fullname; that’s not a lot of use here – I can miss the name out completely and PowerShell knows which parameter I mean because there is only one. But the result of putting the two together is clever-squared: it says to PowerShell “if you find the object has a Path property, use it, if it doesn’t but has a FullName property use that, if it doesn’t have one of those but has FileName, use that” and so on.
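
The property binding can be seen in isolation with a sketch like this one. Show-BoundPath is a made-up stand-in for edit which just reports what was bound, and the three objects are invented so each has only one of the candidate properties.

```powershell
function Show-BoundPath {
    param ([Parameter(ValueFromPipelineByPropertyName=$true)]
           [Alias("FullName","FileName")]$Path
    )
    process { "Bound: $Path" }
}

# Three objects of no particular type, each carrying the name under a different property.
$a = New-Object PSObject -Property @{ Path     = 'C:\scripts\one.ps1' }
$b = New-Object PSObject -Property @{ FullName = 'C:\scripts\two.ps1' }
$c = New-Object PSObject -Property @{ FileName = 'C:\scripts\three.ps1' }

$bound = $a, $b, $c | Show-BoundPath
$bound
```

This is why the same function can sit at the end of a pipeline from Get-ChildItem (FullName) or from Select-String (Path) without any extra code.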

A real-world use for this is I had some scripts which fetched a folder name from WMI and used PowerShell’s built-in Join-Path cmdlet to add a file name to it to create a parameter for another call to WMI. This worked against remote servers but I found that if the folder starts with a drive letter which exists on the remote server but not on the local machine , Join-path will produce an error, so I had to do the operation without Join-path.  I can use WhatHas Join-path to get  MatchInfo objects for each occurrence of Join-path. These objects have a Path property so loading all the affected scripts into the ISE editor is as simple as:
whatHas join-path | edit

Piping is the main place where my version scores over PSEdit  (the other being that it works outside the ISE). The body of the function just has to open the requested file(s). Test-Path and Resolve-Path accept arrays, so I don’t need to do any work to allow the function to be called as edit .\file1.ps1, .\file2.ps1.
If Test-Path says no valid name was passed, a warning is printed, but if it is valid  Resolve-Path is used to turn a partial name or wildcarded name into one or more paths.  If multiple names are passed and any IS valid, execution will reach Resolve-Path which will produce an error if any IS NOT valid – hence the use of –ErrorAction.
Armed with one or more valid, fully qualified paths, it’s a case of using the ISE’s object model to open the file: of course that only works in the ISE so the command falls back to notepad if the host name isn’t “Windows PowerShell ISE Host”

What’s in my PowerShell profile (1) WhatHas

Filed under: How to,Powershell — jamesone111 @ 10:01 am

This shows simple use of Select-String – which is such a great tool it deserves a long piece – but  I’ve been saying for a while I would write up the functions I have in my profile: so for now I’m just going to show how I use Select-String  there

I keep everything in the Profile.PS1 under Documents\WindowsPowerShell. PowerShell  reads additional user and system-wide profiles depending on whether you start the console version or the ISE version but I ignore these.

In modules or anything that might get used in another script, I follow the proper verb-noun naming conventions;  but for utility functions in the profile I just use short names.  WhatHas is one of 3 short name / profile functions I have, its  job is to find which of my scripts contain(s) something or some things. I use it when I need either to fix something found in multiple files, or to remind myself how I used something in the past in order to use it in something new: I can type
WhatHas reflection
And get a listing of all the places I’ve used [System.Reflection.Assembly]::LoadWithPartialName
showing the file, line number and line itself:  Here’s the code

Function whatHas {
    param ( [Parameter(ValueFromPipeLine=$true,Mandatory=$true)]$pattern, 
            $fileSpec="*.ps1",
            [Switch]$recurse
    )
    Process { $( if ($recurse) { dir -recurse -include $fileSpec} 
                 else          { dir $fileSpec} ) |
              select-string -Pattern $pattern
    }
}

Originally I only had two parameters: a –recurse switch chooses which form of DIR gets used (I really should use Get-ChildItem rather than DIR or LS, but somehow I’m happier with DIR); the files it returns and whatever was passed in –pattern are fed into Select-String – which accepts arrays for the pattern so I don’t have to do anything  to support searching for multiple patterns with
WhatHas reflection,assembly
The fileSpec was originally hard coded to "*.ps1" but I realized I might want to search .TXT or .XML files so I made that the default for a parameter instead.
I’m getting in the habit of allowing the “main” parameter of a function to be piped in – making the function body a Process {} block ensures each item piped is processed. 

The output looks like this:

PS C:\Users\James\Documents\windowsPowershell> whathas test-path

hyperv.ps1:236:   If (test-path $VHDPath) {
hypervR2.ps1:236:   If (test-path $VHDPath) {
profile.ps1:6:        if (test-path $path) {

So for the “how to use” case I can often copy what I want from the output and paste it into what I’m working on. For the “need to fix” case I might want to pipe the output into something else, and I’ll look at what that might be in the next post
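
What comes out of Select-String is not text but MatchInfo objects, which is what makes the onward piping possible. A sketch against two throwaway script files (their names and contents are invented):

```powershell
# Two small sample scripts in a scratch folder.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ('whathas-sketch-' + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $dir
Set-Content -Path (Join-Path $dir 'a.ps1') -Value 'if (test-path $p) { "yes" }', '# nothing here'
Set-Content -Path (Join-Path $dir 'b.ps1') -Value '# nothing here', 'Test-Path .\x', 'test-path .\y'

$hits = Get-ChildItem -Path (Join-Path $dir '*.ps1') | Select-String -Pattern 'test-path'

# Each hit carries the file, the line number and the line itself as properties.
$hits | Select-Object -Property Filename, LineNumber, Line

# Grouping on Path gives one entry per affected file.
$files = $hits | Group-Object -Property Path
$files | Select-Object -Property Count, Name
```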

January 11, 2011

Why it is better not to use PowerShell Parameter validation

Filed under: Powershell — jamesone111 @ 12:36 pm

I was giving a talk shortly before Christmas and I was giving some advice based on what I had learned writing my PowerShell library for Hyper-V. I  said

  • Don’t force users to use an object as a parameter – convert names to objects in your code
  • Don’t force users to expand arrays – expand in your code
  • Don’t automatically punish users if a parameter is empty

The corollary from this is: don’t be over-prescriptive with parameter checking, especially when it comes to types – which kicked off an interesting debate. This is best explained with real world examples, so let’s take a simple case from my Hyper-V world, and all the background you need to know is

  • A server contains zero or more Virtual Machines.
  • Virtual Machines  can be “Running” or “Stopped” (and in other states)
  • Virtual Machines are represented by VM objects, which have a state property to indicate whether they are running or stopped.

With that in mind I want to look at 3 commands:
Get-VM which returns VM Objects and must, as a minimum, accept  parameters of
-server to specify where to look for VMs and –VMName to filter the selection by name 
(in case you don’t know, if there is no other parameter that starts –VM PowerShell will let you abbreviate this as –VM)
Start-VM will change the state to running
Stop-VM will change the state to stopped. 
Before implementing the commands one must decide (among other things):

  • What are valid inputs for the -Server and -VMName parameters in Get-VM ?
  • What inputs should start-VM and Stop-VM take.
  • The Output of Get-VM can become the input of Start-VM and Stop-VM. What should happen if no VMs are found on a server ?

It would be a good idea for you to think about how you’d answer these questions before reading on because I’m going to set out my view here. My view is right, of course, but other views are not necessarily wrong.

To me, flexibility is key. Get-VM , in my view, must allow the person typing the command to specify multiple servers easily.  The most obvious example is
Get-VM  -Server ClusterNode1, ClusterNode2
If parameter validation says the server name must be a single string then you force the user to do something like this
"ClusterNode1", "ClusterNode2" | foreach-Object {Get-VM –Server $_}

Not only is the first way shorter but it can be done by a user who has no PowerShell background.  In the same way it should be possible to get those VMs whose names indicate they are located in particular cities
Get-VM  -VM "London*" ,"Paris*"
Yes, I have just sneaked in support for Wildcards. Not allowing this means forcing the user into something like 
Get-VM | where-Object {($_.name –like "London*") –or ($_.name –like "Paris*") }

This may mean more work when we implement the Command (which we do once) to save work when it is run (which happens many times). 
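
As a rough sketch of what that extra work looks like, here is a stand-in which filters an in-memory list instead of querying Hyper-V. Get-DemoVM, the inventory and its property names are all invented for illustration; the real Get-VM is more involved.

```powershell
# A pretend inventory standing in for the VMs on two cluster nodes.
$demoInventory = @(
    New-Object PSObject -Property @{ Name = 'London-DC01'; Server = 'ClusterNode2' }
    New-Object PSObject -Property @{ Name = 'London-FS01'; Server = 'ClusterNode1' }
    New-Object PSObject -Property @{ Name = 'Paris-DC01';  Server = 'ClusterNode1' }
    New-Object PSObject -Property @{ Name = 'Berlin-DC01'; Server = 'ClusterNode2' }
)

function Get-DemoVM {
    # No types declared: a single name, an array of names or wildcards are all accepted.
    param ($VMName = '*', $Server = '*')
    foreach ($s in @($Server)) {
        foreach ($n in @($VMName)) {
            $demoInventory | Where-Object { $_.Server -like $s -and $_.Name -like $n }
        }
    }
}

Get-DemoVM -Server ClusterNode1, ClusterNode2        # everything on both nodes
Get-DemoVM -VM "London*", "Paris*"                   # -VM is accepted as short for -VMName
Get-DemoVM -VM "London-DC01" -Server ClusterNode1    # no match: returns nothing, no error
```

Overlapping wildcards would return the same VM twice in this sketch; a fuller version would need to remove duplicates.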

What about the case where we run
Get-VM  -VM "London-DC01" -Server ClusterNode1
but London-DC01 is running on ClusterNode2 : Should this command return an error?
My (limited) background in databases says that if the query runs successfully and finds no matching data, “Nothing” is a perfectly valid output, and more desirable than an exception stopping a script. This begins to answer the question of what should the input to Start-VM and Stop-VM be.   

  1. It would be illogical if they did not accept the output of Get-VM, so the following should  be possible
    $myVMs = GET-VM ; Start-VM –VM $MyVMs
    Start-VM  -VM  (Get-VM –VM "London-DC01")
    Get-VM  | Start-VM
    And should not produce an error if the GET-VM command returns no VMs.
  2. Some might think it acceptable to say the -VM parameter of Start-VM and Stop-VM must contain VM objects. But if it is possible to Get VMs by passing VM name(s) and/or server name(s) then many administrators would say that
    Start-VM  -VM  (Get-VM –VM "London-DC01")
    is too like coding, and not enough like the shell command line they would expect which would be
    Start-VM  -VM "London-DC01"
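
A sketch of a command which meets both points, again using invented objects rather than real VMs. Start-DemoVM and the $demoVMs list are assumptions for illustration only.

```powershell
$demoVMs = @(
    New-Object PSObject -Property @{ Name = 'London-DC01'; State = 'Stopped' }
    New-Object PSObject -Property @{ Name = 'Paris-DC01';  State = 'Stopped' }
)

function Start-DemoVM {
    param ([Parameter(ValueFromPipeline=$true)]$VM)
    process {
        foreach ($item in @($VM)) {
            # A name (wildcards allowed) is turned into objects; objects are used as they are.
            if ($item -is [string]) { $targets = $demoVMs | Where-Object { $_.Name -like $item } }
            else                    { $targets = $item }
            foreach ($t in @($targets)) {
                if ($t) { $t.State = 'Running' ; $t }
            }
        }
    }
}

Start-DemoVM -VM "London-DC01"                                        # by name
$nothing = $demoVMs | Where-Object { $_.Name -like 'Tokyo*' } | Start-DemoVM   # empty input: no error
$started = Start-DemoVM -VM $demoVMs                                  # by object, as an array
```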

PowerShell parameter declarations can specify how their type and content should be validated.  “Real” programmers who are used to always specifying the type of everything, tend to grasp this and say “We WILL specify a type (and other validation) in every parameter declaration”. In C#, for example, if someone tries to pass your code something of the wrong type, Visual Studio will stop them and tell them not to be so silly – their code won’t compile so they never see an ugly red runtime error. Making parameter types agree makes a little more work, but their code will be run many times (hopefully) so that’s tolerable.  But a PowerShell user might type a command in the shell once and then it’s gone; that extra work is less tolerable, and if input which seems logical to them violates rules you have set, the first they will know is an ugly red runtime error:  any programmer should worry when normal user behaviour produces runtime errors (though a lot will just code to avoid the runtime error, not to adapt their rules to the way users expect to work). 

In PowerShell, in practice I’ve found I can only get this flexibility by allowing anything to be passed in and doing the validation, longhand, in the body of the code. In the VM example that means code which says “Is this an array ? I’ll deal with each item”; “Is this a string ? I’ll treat it as a name which I can turn into an object”; “Is it an Object of the Class I want ? Yippee! I can process it !”; “Is it an object of some other class from which I can get an object of the class I want? Turn it into the right object.”; “Was it anything else? If so do I need to stop execution or can I return nothing ?”  Allowing anything into the function body feels wrong, but I’d ask the question “If the language did not allow you to specify the parameter type, would you expressly write code to throw a runtime error if the parameter passed wasn’t of the expected type ? If so, might it say ‘If you want to use this as an input, then do X’ ?”.  If the answer is yes to both then your code should do more to cope with normal user behaviour, but if it is yes to the first and no to the second then validating type might be the right way to go.

By way of a second example I came across some code to create a hash from the content of files, and because PowerShell lets you add properties to objects, the code returned file objects with an added hash , so you do

Get_Some_Files | add-hash | something_to_find_Duplicates_using_hashes

But the person who wrote add-hash refused to allow anything but a file object; I couldn’t do $myFile = add-hash "C:\user\James\myFile.stuff" , but worse  dir –recurse  | add-hash produces an error when it hits a directory object.  
I could insert a where-object command before the add-hash to filter down to the files, but if that is how the command is going to be used on many occasions, wouldn’t it be simpler for it to do that itself ?  If skipping directories silently bothers you, then catch directories, and use write-verbose to say “Ignoring Directory Xyz”, and if  someone is trying to add a hash to something which makes no sense – like a VM object – really bothers you then catch anything that isn’t a filename, file object or directory object and throw a runtime error further down the script.
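
The front end of such a command might look something like this sketch. ConvertTo-FileObject is a hypothetical name, and the sketch only does the sorting out of inputs, not the hashing.

```powershell
function ConvertTo-FileObject {
    [CmdletBinding()]
    param ([Parameter(ValueFromPipeline=$true)]$InputObject)
    process {
        foreach ($item in @($InputObject)) {
            if     ($item -is [System.IO.FileInfo])      { $item }
            elseif ($item -is [System.IO.DirectoryInfo]) { Write-Verbose "Ignoring directory $($item.FullName)" }
            # A string is treated as a path (wildcards allowed) and fed back through the same checks.
            elseif ($item -is [string])                  { Get-Item -Path $item -ErrorAction SilentlyContinue | ConvertTo-FileObject }
            elseif ($null -ne $item)                     { Write-Error "Cannot turn a $($item.GetType().Name) into a file" }
        }
    }
}

# Scratch folder with one file and one sub-folder to try it on.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ('file-sketch-' + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $dir
$null = New-Item -ItemType Directory -Path (Join-Path $dir 'sub')
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'hello'

$fromPipe = Get-ChildItem -Path $dir | ConvertTo-FileObject            # the directory is skipped quietly
$fromName = ConvertTo-FileObject (Join-Path $dir '*.txt')              # a name works too
```

Anything downstream of this can then assume it is only ever handed file objects.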

As I was writing this Shay Levy retweeted a link to the Windows Scripting Guys’ post on validating parameters. What’s interesting is they show a function which checks phone number formats. So let’s put in my phone number formatted as the ITU says it should be

test-parameters "+44 (7801) 8 8 10 10"
Test-Parameters : Cannot validate argument on parameter 'phoneNumber'. The argument "+44 (7801) 8 8 10 10" does not match the "\d{3}-\d{3}-\d{4}" pattern. Supply an argument
that matches "\d{3}-\d{3}-\d{4}" and try the command again.
At line:1 char:16
+ test-parameters <<<<  "+44 (7801) 8 8 10 10"
    + CategoryInfo          : InvalidData: (:) [Test-Parameters], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationError,Test-Parameters

What kind of user understands “Supply an argument that matches “\d{3}-\d{3}-\d{4}” and try the command again.”  ?
Even if we know that the number is ALWAYS American, if the ITU says we can put brackets, dashes and spaces into the number to aid readability shouldn’t we allow (425) 555 1234 or 4255551234 and then clean up the number in the function ?
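
Cleaning up inside the function takes very little code. This is a sketch under the assumption that only ten-digit US numbers are wanted; ConvertTo-UsPhoneNumber is an invented name and is not the function from the Scripting Guys' post.

```powershell
function ConvertTo-UsPhoneNumber {
    param ([Parameter(ValueFromPipeline=$true)]$PhoneNumber)
    process {
        foreach ($p in @($PhoneNumber)) {
            $digits = "$p" -replace '\D', ''          # throw away brackets, dashes and spaces
            if ($digits.Length -eq 10) {
                # Re-assemble in the nnn-nnn-nnnn form the rest of the code expects.
                '{0}-{1}-{2}' -f $digits.Substring(0,3), $digits.Substring(3,3), $digits.Substring(6)
            }
            else { Write-Warning "'$p' does not look like a ten digit number; try something like (425) 555 1234" }
        }
    }
}

$cleaned = '(425) 555 1234', '4255551234', '425-555-1234' | ConvertTo-UsPhoneNumber
$cleaned
```

All three inputs come out in the same form, and the message for anything else tells the user what to type rather than quoting a regular expression at them.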

Over-prescriptive (and often plain wrong) validation comes up in plenty of places: I’ve lost count of web sites which tell me “Credit card numbers must be entered without spaces.” (with all that computing power you’d think they could strip out the spaces, and maybe even identify Visa and Mastercard automatically). And there are the ones who say names can only contain A-Z and a-z, tough luck if yours has a hyphen, apostrophe or accented character. (Being an O’Neill this one drives me nuts. So does not checking for apostrophes and throwing a SQL error).  Realistically we’re not going to get rid of it all. Just don’t add to it, OK ?

July 7, 2010

Working with the image module for PowerShell; part 3, GPS and other data

Filed under: Photography,Powershell — jamesone111 @ 7:59 am

In Part one I showed how my downloadable PowerShell module can tag photos using related data – like GPS position – which was logged as they were being taken, and in part two I showed how I’d extended the module in James Brundage’s  PowerPack for Windows 7. Now I want to explain the extensions which automate the processes of:

  • Getting the data logged by GPS units and similar devices
  • Reading each image file from the memory card and matching it to an entry in the log made at around the same time
  • Building up the set of EXIF filters based on the log entry.

The data and pictures are connected by the time stamp on each, but to connect properly the scripts must cope with any time difference between the camera’s clock and time on the logging device – whether that’s a GPS unit or my wrist mounted scuba computer. A few seconds won’t introduce much error, but the devices might be in different time zones –for example GPS works on Universal time (GMT) – so the offset is often hours, not seconds. My quick and dirty way of making a note of the difference is to photograph whatever is doing the logging (assuming it can display its time). The camera will record the time its own clock was set to in the EXIF “Date and time taken” field and subtracting that from the time displayed on the logger in the picture gives an offset to apply to all data points. The following is the core of a function named Set-Offset which could be seen in part one;

$RefDate = ([datetime]( Read-Host ("Please enter the Date & time " +
                                   "in the reference picture, formatted as" + [char]13 +
                        [Char]10 + "Either MM/DD/yyyy HH:MM:SS ±Z or " +
                                   "dd MMMM yyyy HH:mm:ss ±Z"))  
            ).touniversalTime()

$ReferenceImagePath = Read-Host "Please enter the path to the picture"
if ($ReferenceImagePath -and (test-path $ReferenceImagePath) -and $RefDate) {
     $picTime = (get-exif -image $ReferenceImagePath).dateTaken 
     $Global:offset = ($picTime - $refdate).totalSeconds
}

The real Set-Offset can take -refDate and -ReferenceImagePath parameters so the user doesn’t need to be prompted for them. Most of the code you can see is concerned with getting the user to enter the time (in a format that PowerShell can use) and the path to the file. The only part which uses the image module is
(get-exif -image $ReferenceImagePath).dateTaken
Get-Exif is a command I added, and it returns an object which contains all the interesting EXIF data from the image file. Only the value in the DateTaken property is of interest here; it is used to calculate the number of seconds between the camera time and logger time and the result is stored in a global variable named $offset.
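
To make the arithmetic concrete, here is a minimal sketch with made-up times. Both values are invented for illustration; in the real function one comes from Get-Exif and the other is typed in by the user. The last step assumes the offset is added to each logged time, which is what "camera time minus logger time" implies.

```powershell
# Hypothetical values, chosen only to show the subtraction
$picTime = [datetime]'2010-04-25 14:05:10'   # camera clock, from EXIF "date taken"
$refDate = [datetime]'2010-04-25 13:02:40'   # time shown on the logger in the photo

# Same arithmetic as the core of Set-Offset: camera time minus logger time
$offset = ($picTime - $refDate).TotalSeconds
$offset                                      # 3750 (1 hour, 2 minutes, 30 seconds)

# Adding the offset to a logged time gives the time as the camera saw it
$logged    = [datetime]'2010-04-25 13:30:00'
$asCamera  = $logged.AddSeconds($offset)     # 14:32:30 on the same day
$asCamera
```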

The next step is to read the data and apply the offset to it; depending on how it was logged, the command will be one of:
  $Points = Get-NMEAData   -Path $Logpath -offset $offset
Or
  $Points = Get-GPXData    -Path $Logpath -offset $offset
Or 
  $Points = Get-CSVGPSData -Path $Logpath -offset $offset
Or
  $Points = Get-SuuntoData -Path $Logpath -offset $offset

The last one handles the comma separated data exported from the Suunto Dive Manager program which downloads the data from my dive watch. The other 3 deal with different formats of GPS data: it may be in the form of NMEA sentences (comma separated again), the CSV format used by Efficasoft GPS utilities on my phone, or the XML-based GPX format. (GPS data formats are worth another post of their own.) You may need to make slight alterations to these functions to work with your own logger, but they are easy to change. All of them except Get-GPXData import from a CSV file – and use a feature which is new in PowerShell V2 to specify the CSV column headings when using the Import-Csv command. Get-GPXData reads XML documents, looking for a hierarchy which goes <gpx> <trk> <trkseg> <trkpt> <trkpt> <trkpt> <trkpt>… All the functions use Select-Object to remove fields which aren’t needed and insert calculated data (for example converting the native speed in knots from GPS to MPH and KM/H).
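
The pattern described here, Import-Csv with supplied headings followed by Select-Object with calculated properties, can be sketched as follows. This is not the code of any of the Get-…Data functions: the file layout, the column names and the sample values are all assumptions made up for the example.

```powershell
# Made-up log with no header row: time, latitude, longitude, speed in knots
$logPath = [System.IO.Path]::GetTempFileName()
Set-Content -Path $logPath -Value @(
    '2010-04-25 11:55:03,51.7520,-1.2577,2.0'
    '2010-04-25 11:55:13,51.7521,-1.2575,3.5'
)

# -Header supplies the column names which the file itself lacks;
# Select-Object keeps the wanted fields and adds calculated ones.
$points = Import-Csv -Path $logPath -Header 'Time','Lat','Lon','Knots' |
    Select-Object -Property Lat, Lon,
        @{Name = 'DateTime'; Expression = { [datetime]$_.Time }},
        @{Name = 'SpeedKMH'; Expression = { [double]$_.Knots * 1.852 }},
        @{Name = 'SpeedMPH'; Expression = { [double]$_.Knots * 1.852 / 1.609344 }}

Remove-Item -Path $logPath
$points | Format-Table -AutoSize
```

The conversion factors are the defined ones: a knot is 1.852 km/h and a mile is 1.609344 km.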

After running one of these commands there will be a collection of data points stored in the variable $points. Each data point has a time – adjusted by the offset value, so it is the time as the camera would have seen it. The Suunto dive computer points have a Description (the name of the dive site and water temperature) and depth, while the GPS points have speed (GPS works in knots and the script calculates miles per hour and kilometres per hour); bearing; latitude as degrees, minutes, seconds, North or South; longitude as degrees, minutes, seconds, East or West; latitude and longitude in their original form from the logger; and altitude in both meters and feet. (NMEA data needs extra processing to get the altitude data, and Get-NMEAData has a -NoAltitude switch to allow processing to be speeded up if only latitude and longitude are needed.)

Armed with a collection of points the next step is to find the one nearest to the time the picture was taken; a function named Get-NearestPoint does this. Given the time stamped on the photo the function returns the data point logged closest to that time. It isn’t very sophisticated: it takes 3 parameters (a time, a time-sorted array of points and the name of the field on the data points to check for the time) and works through the points until the point being looked at is further away from the target time than the previous point. The core of the function looks like this:

   $variance = [math]::Abs(($dataPoints[0].$columnName - $MatchingTime).totalseconds)
   $i = 1
   do {
        $v = [math]::Abs(($dataPoints[$i].$columnName - $MatchingTime).totalseconds)
        if ($v -le $variance) {$i ++ ; $variance = $v }
      } while (($v -eq $variance) -and ($i -lt $datapoints.count))
   $datapoints[($i -1)]

In use it looks something like this.

$image = Get-Image        -Path "MyPicture.Jpg"
$dt    = Get-ExifItem     -image $image  -ExifID $ExifIDDateTimeTaken
$point = Get-nearestPoint -Data  $points -Column "DateTime" -MatchingTime $dt
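
For checking results without the module loaded, a brute-force version is easy to write. The sketch below is not Get-NearestPoint: it swaps the early-exit loop for a sort on the time difference, so it looks at every point and does not need the array to be time-sorted. The sample points and the photo time are made up.

```powershell
# Made-up points, one per minute; real ones would come from Get-GPXData and friends
$points = 0..4 | ForEach-Object {
    New-Object PSObject -Property @{
        DateTime = ([datetime]'2010-04-25 12:00:00').AddMinutes($_)
        Name     = "Point $_"
    }
}
$dt = [datetime]'2010-04-25 12:02:20'    # stand-in for the photo's time taken

# Measure every point's distance in time from the photo and keep the smallest
$nearest = $points |
    Sort-Object { [math]::Abs(($_.DateTime - $dt).TotalSeconds) } |
    Select-Object -First 1
$nearest.Name                            # Point 2 (20 seconds away; Point 3 is 40)
```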

$point contains the data used to set the EXIF properties of the picture, a process which requires a series of EXIF filters to be created – and I explained EXIF filters in Part 2. As well as data retrieved from a log, there are times when I want to tag a picture manually. For example, I took some photos in London’s Trafalgar Square without a GPS logger, and I want to tag them with 51°30’30” N, 0°7’40” W. To make this easier I created a function named Convert-GPStoEXIFFilter which can be invoked like this:

$filter = Convert-GPStoEXIFFilter 51,30,30 "N" 0,7,40 "W"

If you’re not used to PowerShell I should say that in some languages 51,30,30 would be the way to write 3 parameters. In PowerShell it is one array parameter with 3 members. (Even old hands at PowerShell occasionally get confused and put in a comma which turns two parameters into a single array parameter.) I could have explicitly named the parameters and made it clear that these 3 were an array by writing
 -LatDMS @(51,30,30) -NS "N"
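
The difference one comma makes is easy to see with a throwaway function. Show-Binding below is hypothetical and exists only for this demonstration; it borrows the parameter names -LatDMS and -NS but does nothing except report what was bound to each.

```powershell
# Hypothetical function, only here to show how the arguments bind
function Show-Binding {
    param($LatDMS, $NS)
    "LatDMS = [$($LatDMS -join ' ')]  NS = [$NS]"
}

Show-Binding 51,30,30 "N"    # LatDMS = [51 30 30]  NS = [N]
Show-Binding 51,30,30,"N"    # LatDMS = [51 30 30 N]  NS = []
```

In the second call the stray comma pulls "N" into the array, and the second parameter is left empty.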

Convert-GPStoEXIFFilter returns a chain of up to 7 EXIF filters: GPS version, latitude reference (North or South), longitude reference (East or West), altitude reference (above or below sea level), the latitude and longitude (as degrees, minutes, seconds and decimals) and altitude in meters (altitude is optional). If $point holds the data logged at the time the picture was taken Convert-GPStoEXIFFilter can be invoked like this:

$filter = Convert-GPStoEXIFFilter -LatDMS $point.Latdms -NS $point.NS `
                   -LONDMS $point.londms -EW $point.ew -AltM $point.altM
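
If a position is only available as signed decimal degrees, it has to be turned into degrees, minutes and seconds plus a hemisphere letter before it can be passed in this form. The helper below is a minimal sketch of that conversion; ConvertTo-Dms is a hypothetical name and is not part of the module, and the coordinates are the Trafalgar Square ones from earlier written as decimals.

```powershell
# Hypothetical helper, not part of the module: signed decimal degrees to
# degrees, minutes, seconds plus a hemisphere letter.
function ConvertTo-Dms {
    param([double]$Degrees, [string]$Positive, [string]$Negative)
    $abs     = [math]::Abs($Degrees)
    $d       = [math]::Floor($abs)
    $minutes = ($abs - $d) * 60
    $m       = [math]::Floor($minutes)
    # Rounding can produce 60 seconds in edge cases; a fuller version would carry it over
    $s       = [math]::Round(($minutes - $m) * 60, 2)
    $ref     = if ($Degrees -lt 0) { $Negative } else { $Positive }
    New-Object PSObject -Property @{ DMS = @($d, $m, $s); Ref = $ref }
}

$lat = ConvertTo-Dms -Degrees 51.508333   -Positive 'N' -Negative 'S'
$lon = ConvertTo-Dms -Degrees (-0.127778) -Positive 'E' -Negative 'W'
"{0} {1}, {2} {3}" -f ($lat.DMS -join ','), $lat.Ref, ($lon.DMS -join ','), $lon.Ref
# 51,30,30 N, 0,7,40 W
```

The results could then be passed as -LatDMS $lat.DMS -NS $lat.Ref -LONDMS $lon.DMS -EW $lon.Ref.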

At the end of part 2 I showed the Copy-Image command that handles renaming, rotating, and setting keywords & title EXIF fields and mentioned it could be handed a set of filters. All the parameters that Copy-Image uses are available to Copy-GPSImage, which takes the set of points as well. Internally it performs the $image= , $dt= , $point= and $filter= commands seen above before calling Copy-Image with the image, the filter chain and the other parameters it was passed. The full set of parameters for Copy-GPSImage is as follows:

Image        The image to work on – this can be an image object, a file object or a file name, and it can come from the pipeline.
Points       The array of GPS data points from Get-NMEAData, Get-GPXData or Get-CSVGPSData
Keywords     Keywords to go into the EXIF Keyword Tags field
Title        Text to go into the EXIF Title field
Rotate       If specified, adds whatever rotate filter is indicated by the EXIF Orientation field
NoClobber    The conventional PowerShell switch to say “Don’t overwrite the file if it already exists”
Destination  The FOLDER to which the file should be saved.
Replace      Two values separated by a comma specifying a replacement in the file NAME
ReturnInfo   If specified returns the point(s) matched with the pictures

So now it is possible to use three commands to geotag the images. The first two get the time offset and get the data points, applying that offset in the process.

set-offset "D:\dcim\100Pentx\IMG43272.JPG" -Verbose
$points= Get-CSVGPSData 'F:\My Documents\My GPS\Track Log\20100425115503.log' -offset $offset

and the third gets the files on a memory card and pushes them into Copy-GPSImage

$photoPoints = Dir E:\dcim -include *.jpg -recurse |  Copy-GpsImage -Points $Points `
          -verbose  -DestPath "C:\users\jamesone\pictures\oxford"   `
          -Keywords "Oxfordshire"  -replace "IMG","OX-"  -returnInfo

This is much as it appeared in Part 1, although the third command has changed slightly. Copy-GPSImage now has a -returnInfo switch which returns the points where a photo was taken; to link the point to the image file(s) which matched it, an extra property, Paths, is added to the points.

I mention this because I wanted to show the functions I added almost for fun at the end. Out-MapPoint and ConvertTo-GPX got brief mentions in part 1: with the data in $photopoints I can push camera symbols through to a map like this (note the sort -unique to remove duplicate points; 79 is the camera symbol):
  $photopoints | sort dateTime -Unique | Out-MapPoint -symbol {79}

Alternatively I can create a GPX file which can be imported into MapPoint, Google Earth and lots of other tools. GPX files need to be UTF8 text, but PowerShell wants to write output files as Unicode – thwarting it isn’t hard but is ugly.
  $photopoints | sort dateTime -Unique | convertto-gpx | out-file photoPoints.gpx -Encoding utf8

With the photo points logged it would be nice to show the path I walked, but that will have too many points, so I wrote Merge-GPSPoints which combines all the points for each minute so I can do
  Merge-GPSPoints $points | Out-MapPoint
or 
  Merge-GPSPoints $points | convertto-gpx | out-file WalkPoints.gpx -Encoding utf8
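
The code of Merge-GPSPoints isn’t shown here, so what follows is only a sketch of one way to combine the points for each minute: group on the time truncated to the minute and average the positions in each group. The averaging rule, the property names Latitude and Longitude (as decimal degrees) and the sample track are all assumptions.

```powershell
# Made-up track: one fix every 20 seconds, drifting slowly north-east
$start  = [datetime]'2010-04-25 12:00:00'
$points = 0..8 | ForEach-Object {
    New-Object PSObject -Property @{
        DateTime  = $start.AddSeconds(20 * $_)
        Latitude  = 51.75 + 0.0001 * $_
        Longitude = -1.25 + 0.0001 * $_
    }
}

# Group on the time truncated to the minute, then average each group
$merged = $points |
    Group-Object { $_.DateTime.ToString('yyyyMMddHHmm') } |
    ForEach-Object {
        $lat = ($_.Group | Measure-Object -Property Latitude  -Average).Average
        $lon = ($_.Group | Measure-Object -Property Longitude -Average).Average
        New-Object PSObject -Property @{
            DateTime  = $_.Group[0].DateTime   # first fix in the minute
            Latitude  = $lat
            Longitude = $lon
            Fixes     = $_.Count               # how many fixes were merged
        }
    }
$merged.Count    # 3 merged points from 9 fixes
```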

One thing I should point out here is that the GPX format which I convert to is a series of waypoints (i.e. places that will be navigated to in future), not track points (places which have been visited in the past). The import routine processes the latter.
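
The two kinds of point sit in different parts of a GPX document, which this minimal sketch shows. The document is hand-written for the example (it is not the output of ConvertTo-GPX) and the coordinates and names in it are made up.

```powershell
# Minimal GPX 1.1 document holding one waypoint and a two-point track
$gpx = [xml]@'
<gpx version="1.1" creator="sketch" xmlns="http://www.topografix.com/GPX/1/1">
  <wpt lat="51.7520" lon="-1.2577"><name>Photo 1</name></wpt>
  <trk><trkseg>
    <trkpt lat="51.7520" lon="-1.2577"><time>2010-04-25T11:55:03Z</time></trkpt>
    <trkpt lat="51.7521" lon="-1.2575"><time>2010-04-25T11:55:13Z</time></trkpt>
  </trkseg></trk>
</gpx>
'@

$gpx.gpx.wpt.name                     # Photo 1  (a waypoint, directly under <gpx>)
@($gpx.gpx.trk.trkseg.trkpt).Count    # 2        (track points, under <trk><trkseg>)
```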

The last detail of the module for now is that I also gave it a function to find out where an image was taken, like this

PS  > resolve-imageplace 'C:\users\Jamesone\Pictures\Oxford\OX-43624.JPG'
Summertown, Oxford, Oxford, Oxfordshire, England, United Kingdom, Europe, Earth

That’s not a data error when it says Oxford, Oxford. The Geoplaces web service I use returns

ToponymName                                           Name            fcode  Description for fcode
Earth                                                 Earth           AREA   A tract of land without homogeneous character or boundaries
Europe                                                Europe          CONT   Continent
United Kingdom of Great Britain and Northern Ireland  United Kingdom  PCLI   Independent political entity
England                                               England         ADM1   First-order administrative division (US States, England, Scotland etc)
County of Oxfordshire                                 Oxfordshire     ADM2   A sub-division of an ADM1 (Counties in the UK)
Oxford District                                       Oxford          ADM3   A sub-division of an ADM2 (District level councils in the UK)
Oxford                                                Oxford          PPL    Populated Place (Cities, Towns, Villages)
Summertown                                            Summertown      PPLX   Section of populated place
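
Turning that list into the one-line answer is just a reverse and a join, which is where the repeated Oxford comes from. The sketch below uses the names from the list above typed in by hand; it is not the code of resolve-imageplace. The last step, dropping a name which repeats its neighbour, is one possible tidy-up and not something the function is described as doing.

```powershell
# The names from the list above, in the order the service returned them
$names = 'Earth', 'Europe', 'United Kingdom', 'England',
         'Oxfordshire', 'Oxford', 'Oxford', 'Summertown'

# Most specific place first
[array]::Reverse($names)
$place = $names -join ', '
$place   # Summertown, Oxford, Oxford, Oxfordshire, England, United Kingdom, Europe, Earth

# Get-Unique only compares neighbours, so it removes the adjacent repeat
$tidy = ($names | Get-Unique) -join ', '
$tidy    # Summertown, Oxford, Oxfordshire, England, United Kingdom, Europe, Earth
```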

I haven’t done much to introduce intelligence into processing this. I used Trafalgar Square in part 2 and this returns Charing Cross, London, City of Westminster, Greater London, England, United Kingdom, Europe, Earth, which is correct but difficult to allow for. To make matters worse, all sorts of strange geo-political questions come up if you say the UK is the country and England is the topmost administrative division: English people might well think counties are the tier below parliament administratively, but since the Scottish parliament and Welsh assembly opened you might find a different view if you step over the border. Software which works to the American model of displaying the populated place and first administrative division – for example Seattle, Washington – is easily thrown: given Reading, Berkshire it displays Reading, England. Those are questions to look at another time.

This post originally appeared on my technet blog.
