James O'Neill's Blog

June 30, 2012

Using the Windows index to search from PowerShell: Part Two – Helping with user input

Filed under: Uncategorized — jamesone111 @ 10:46 am

Note: this was originally written for the Hey, Scripting Guy! blog, where it appeared as the 26 June 2012 episode. The code is available for download. I have some more index-related posts coming up, so I wanted to make sure everything was in one place.


In part one I developed a working PowerShell function to query the Windows index. It outputs data rows, which isn’t the ideal behaviour, and I’ll address that in part three; in this part I’ll address another drawback: search terms passed as parameters to the function must be "SQL-ready". I think that makes for a bad user experience, so I’m going to look at the half dozen bits of logic I added to allow my function to process input which is a little more human. Regular expressions are the way to recognize text which must be changed, and I’ll pay particular attention to those as I know a lot of people find them daunting.

Replace * with %

SQL statements use % for a wildcard, but selecting files at the command prompt traditionally uses *. Swapping one for the other is simple: but for the need to "escape" the * character, replacing * with % would be as simple as a -replace statement gets:
$Filter = $Filter -replace "\*","%"
For some reason I’m never sure if the camera maker is Canon or Cannon so I’d rather search for Can*… or rather Can%, and that replace operation will turn "CameraManufacturer=Can*" into "CameraManufacturer=Can%". It’s worth noting that –replace is just as happy to process an array of strings in $filter as it is to process one.

Searching for a term across all fields uses "CONTAINS (*,'Stingray')", and if the -replace operation changes * to % inside a CONTAINS() the result is no longer a valid SQL statement. So the regular expression needs to be a little more sophisticated, using a "negative look behind":
$Filter = $Filter -replace "(?<!\(\s*)\*","%"

In order to filter out cases like CONTAINS(*… , the new regular expression qualifies "Match on *" with a look behind – "(?<!\(\s*)" – which says "if it isn’t immediately preceded by an opening bracket and any spaces". In regular expression syntax (?=x) says "look ahead for x" and (?<=x) says "look behind for x"; (?!x) is "look ahead for anything EXCEPT x" and (?<!x) is "look behind for anything EXCEPT x". These will see a lot of use in this function. Here (?<! ) is being used; the open bracket needs to be escaped so it is written as \( and \s* means 0 or more spaces.
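As a quick check of the look behind, the sketch below runs the same -replace over an array of sample filters. The sample strings are invented for illustration. It relies on the .NET regex engine, which PowerShell uses, accepting a variable-length pattern such as \s* inside a look behind.

```powershell
# Sample filters, invented for illustration
$Filter = "CameraManufacturer=Can*", "CONTAINS (*,'Stingray')", "Contains( *,'Manta')"

# Replace * with % unless it directly follows "(" plus optional spaces
$Filter = $Filter -replace "(?<!\(\s*)\*", "%"

$Filter
# CameraManufacturer=Can%
# CONTAINS (*,'Stingray')
# Contains( *,'Manta')
```

Only the first term changes; the two CONTAINS terms keep their * because an opening bracket (with or without spaces) comes immediately before it.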

Convert "orphan" search terms into ‘contains’ conditions.

A term that needs to be wrapped as a "CONTAINS" search can be identified by the absence of quote marks, = , < or > signs or the LIKE, CONTAINS or FREETEXT search predicates. When these are present the search term is left alone, otherwise it goes into CONTAINS, like this.
$filter = ($filter | ForEach-Object {
    if   ($_ -match "'|=|<|>|like|contains|freetext") {$_}
    else {"Contains(*,'$_')"}
})

Put quotes in if the user omits them.

The next thing I check for is omitted quote marks. I said I wanted to be able to use Can*, and we’ve seen it changed to Can% but the search term needs to be transformed into "CameraManufacturer=’Can%’ ". Here is a –replace operation to do that.
$Filter = $Filter -replace "\s*(=|<|>|like)\s*([^'\d][^\s']*)$",' $1 ''$2'' '
This is a more complex regular expression which takes a few moments to understand:

Regular expression: \s*(=|<|>|like)\s*([^'\d][^\s']*)$

Part             Meaning                                    Application
\s* (twice)      Any spaces (or none)
(=|<|>|like)     = or < or > or "Like"                      CameraManufacturer=Can%   (the =)
[^'\d]           Anything which is NOT a ' character        CameraManufacturer=Can%   (the C)
                 or a digit
[^\s']*          Any number of non-quote,                   CameraManufacturer=Can%   (the an%)
                 non-space characters (or none)
$                End of line
( ) and ( )      Capture the enclosed sections              $Matches[0] = "=Can%"
                 as matches                                 $Matches[1] = "="
                                                            $Matches[2] = "Can%"
' $1 ''$2'' '    Replace $Matches[0] ("=Can%") with an      = 'Can%'
                 expression which uses the two
                 submatches "=" and "Can%"

Note that the expression which is being inserted uses $1 and $2 to mean $Matches[1] and $Matches[2]. If this is wrapped in double quote marks PowerShell will try to evaluate these terms before they get to the regex handler, so the replacement string must be wrapped in single quotes. But the desired replacement text contains single quote marks, so they need to be doubled up.
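To see the quoting rules at work, here is a small sketch that applies the same -replace to three sample terms (the terms themselves are made up for the example). The output is wrapped in square brackets only to make the leading and trailing spaces visible.

```powershell
# Sample terms, invented for illustration
$Filter = "CameraManufacturer=Can%", "System.Keywords = 'portfolio'", "Width>1024"

# Single-quoted replacement: $1 and $2 reach the regex engine untouched,
# and each '' inside the single quotes yields one literal quote mark
$Filter = $Filter -replace "\s*(=|<|>|like)\s*([^'\d][^\s']*)$", ' $1 ''$2'' '

$Filter | ForEach-Object { "[$_]" }
# [CameraManufacturer = 'Can%' ]
# [System.Keywords = 'portfolio']
# [Width>1024]
```

The term which already has quotes and the numeric comparison are left alone, because the value after the operator must not start with a quote mark or a digit.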

Replace ‘=’ with ‘like’ for Wildcards

So far, =Can* has become ='Can%', which is good, but SQL needs "LIKE" instead of "=" to evaluate a wildcard. So the next operation converts "CameraManufacturer = 'Can%' " into "CameraManufacturer LIKE 'Can%' ":
$Filter = $Filter -replace "\s*=\s*(?='.+%'\s*$)" ," LIKE "

Regular expression: \s*=\s*(?='.+%'\s*$)

Part          Meaning                                      Application
\s*=\s*       = sign surrounded by any spaces              CameraManufacturer = 'Can%'
'             A quote character                            CameraManufacturer = 'Can%'
.+            Any characters (at least one)                CameraManufacturer = 'Can%'
%'            % character followed by '                    CameraManufacturer = 'Can%'
\s*$          Any spaces (or none)
              followed by end of line
(?= )         Look ahead for the enclosed expression       $Matches[0] = "="
              but don’t include it in the match            (but only if 'Can%' is present)
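A short sketch of the look ahead in action, using two sample terms chosen for illustration: one quoted value ending in % and one without a wildcard. The square brackets in the output are only there to show the spaces.

```powershell
# Sample terms, invented for illustration
$Filter = "CameraManufacturer = 'Can%' ", "System.Keywords = 'portfolio'"

# Swap = for LIKE only when a quoted value ending in % follows
$Filter = $Filter -replace "\s*=\s*(?='.+%'\s*$)", " LIKE "

$Filter | ForEach-Object { "[$_]" }
# [CameraManufacturer LIKE 'Can%' ]
# [System.Keywords = 'portfolio']
```

Because the quoted value sits inside the look ahead it is tested but not consumed, so only the = sign and the spaces around it are replaced.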

Provide Aliases

The steps above reconstruct "WHERE" terms to build syntactically correct SQL, but what if I get confused and enter “CameraMaker” instead of “CameraManufacturer” or “Keyword” instead of “Keywords” ? I need Aliases – and they should work anywhere in the SQL statement – not just in the "WHERE" clause but in "ORDER BY" as well.
I defined a hash table (a.k.a. a "dictionary", or an "associative array") near the top of the script to act as a single place to store the aliases with their associated full canonical names, like this:
$PropertyAliases = @{
    Width       = "System.Image.HorizontalSize";
    Height      = "System.Image.VerticalSize";
    Name        = "System.FileName";
    Extension   = "System.FileExtension";
    Keyword     = "System.Keywords";
    CameraMaker = "System.Photo.CameraManufacturer"
}
Later in the script, once the SQL statement is built, a loop runs through the aliases replacing each with its canonical name:
$PropertyAliases.Keys | ForEach-Object {
    $SQL= $SQL -replace "(?<=\s)$($_)(?=\s*(=|>|<|,|Like))",$PropertyAliases[$_]
}
A hash table has .Keys and .Values properties which return what is on the left and right of the equals signs respectively. $hashTable.keyName or $hashTable["keyName"] will return the value, so if $_ starts by taking the value "Width", its replacement will be $PropertyAliases["Width"], which is "System.Image.HorizontalSize"; on the next pass through the loop "Height" is replaced, and so on. (A hash table doesn’t guarantee the order of its keys, but the order doesn’t matter here.) To ensure it matches on a field name and not text being searched for, the regular expression stipulates the name must be preceded by a space and followed by "=" or "like" and so on.

Regular expression: (?<=\s)Width(?=\s*(=|>|<|,|Like))

Part               Meaning                                    Application
Width              The literal text "Width"                   Width > 1024
\s                 A space
(?<= )             Look behind for the enclosed expression    $Matches[0] = "Width"
                   but don’t include it in the match          (but only if a leading space is present)
\s*                Any spaces (or none)
(=|>|<|,|Like)     The literal text "Like", or any of the     Width > 1024
                   characters comma, equals, greater than
                   or less than
(?= )              Look ahead for the enclosed expression     $Matches[0] = "Width"
                   but don’t include it in the match          (but only if " >" is present)
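The alias loop can be tried on its own. In this sketch the SQL statement is hand-written to stand in for the one the function builds, and a foreach statement is used in place of ForEach-Object purely for readability; both are choices made for the illustration.

```powershell
$PropertyAliases = @{
    Width  = "System.Image.HorizontalSize"
    Height = "System.Image.VerticalSize"
}

# Hand-written statement standing in for the one the function builds
$SQL = "SELECT System.ItemName FROM SYSTEMINDEX WHERE Width > 1024 ORDER BY Height,"

foreach ($alias in $PropertyAliases.Keys) {
    # Match the alias only after a space and before an operator or comma
    $SQL = $SQL -replace "(?<=\s)$($alias)(?=\s*(=|>|<|,|Like))", $PropertyAliases[$alias]
}

$SQL
# SELECT System.ItemName FROM SYSTEMINDEX WHERE System.Image.HorizontalSize > 1024 ORDER BY System.Image.VerticalSize,
```

The trailing comma after Height is what lets the look ahead recognise an alias in the ORDER BY clause.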

If the prefix is omitted put the correct one in.

This builds on the ideas we’ve seen already. I want the list of fields and prefixes to be easy to maintain, so just after I define my aliases I define a list of field types
$FieldTypes = "System","Photo","Image","Music","Media","RecordedTv","Search"
For each type I define two variables, a prefix and a fields list: the names must be FieldTypePREFIX and FieldTypeFIELDS. The reason for this will become clear shortly, but here is what they look like:
$SystemPrefix = "System."
$SystemFields = "ItemName|ItemUrl"
$PhotoPrefix  = "System.Photo."
$PhotoFields  = "cameramodel|cameramanufacturer|orientation"
In practice the field lists are much longer – system contains 25 fieldnames not just the two shown here. The lists are written with "|" between the names so they become a regular expression meaning "ItemName or ItemUrl Or …". The following code runs after aliases have been processed
foreach ($type in $FieldTypes) {
   $fields = (get-variable "$($type)Fields").value
   $prefix = (get-variable "$($type)Prefix").value 
   $sql    = $sql -replace "(?<=\s)(?=($Fields)\s*(=|>|<|,|Like))" , $Prefix
}
I can save repeating code by using Get-Variable in a loop to get $systemFields, $photoFields and so on, and if I want to add one more field, or a whole type I only need to change the variable declarations at the start of the script. The regular expression in the replace works like this:

Regular expression: (?<=\s)(?=(cameramanufacturer|orientation)\s*(=|>|<|,|Like))

Part                               Meaning                                   Application
(?<=\s)                            Look behind for a space
                                   but don’t include it in the match
(cameramanufacturer|orientation)   The literal text "orientation"            CameraManufacturer LIKE 'Can%'
                                   or "cameramanufacturer"
\s*                                Any spaces (or none)
(=|>|<|,|Like)                     The literal text "Like", or any of the    CameraManufacturer LIKE 'Can%'
                                   characters comma, equals, greater than
                                   or less than
(?= )                              Look ahead for the enclosed expression    $Matches[0] is the point between the
                                   but don’t include it in the match         leading space and "CameraManufacturer
                                                                             LIKE" but doesn’t include either.

We get the effect of an "insert" operator by using ‑replace with a regular expression that finds a place in the text but doesn’t select any of it.
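A minimal sketch of that "insert" effect, using the photo prefix and field list shown earlier and a hand-written fragment of SQL (the fragment is an assumption made for the example):

```powershell
$PhotoPrefix = "System.Photo."
$PhotoFields = "cameramodel|cameramanufacturer|orientation"

# Hand-written fragment standing in for part of the generated statement
$SQL = " WHERE CameraManufacturer LIKE 'Can%' ORDER BY CameraModel,"

# The pattern is made only of look-arounds, so it matches a position, not text
$SQL = $SQL -replace "(?<=\s)(?=($PhotoFields)\s*(=|>|<|,|Like))", $PhotoPrefix

$SQL
#  WHERE System.Photo.CameraManufacturer LIKE 'Can%' ORDER BY System.Photo.CameraModel,
```

Nothing is removed from the original text: the prefix is dropped in at each position where a space comes before and a known field name plus an operator or comma comes after.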
This part of the function allows "CameraManufacturer LIKE 'Can%'" to become "System.Photo.CameraManufacturer LIKE 'Can%' " in a WHERE clause.
I also wanted "CameraManufacturer" in an ORDER BY clause to become "System.Photo.CameraManufacturer". Very sharp-eyed readers may have noticed that I look for a comma after the field name as well as <, >, = and LIKE. I modified the code which appeared in part one so that when an ORDER BY clause is inserted it is followed by a trailing comma, like this:
if ($orderby) { $sql += " ORDER BY " + ($OrderBy -join " , " ) + ","}

The new version will work with this regular expression, but the extra comma will cause a SQL error and so it must be removed later.
When I introduced the SQL I said the SELECT statement looks like this:

SELECT System.ItemName, System.ItemUrl,      System.FileExtension, System.FileName, System.FileAttributes, System.FileOwner, 
       System.ItemType, System.ItemTypeText , System.KindText,     System.Kind,     System.MIMEType,       System.Size

Building this clause from the field lists simplifies code maintenance, and as a bonus anything declared in the field lists will be retrieved by the query as well as accepted as input by its short name. The SELECT clause is prepared like this:
if ($First) 
     {$SQL = "SELECT TOP $First "}
else {$SQL = "SELECT "}
foreach ($type in $FieldTypes)
     {$SQL +=((get-variable "$($type)Fields").value -replace "\|",", " ) + ", "}

This replaces the "|" with a comma and puts a comma after each set of fields. This means there is a comma between the last field and the FROM, which allows the regular expression to recognise field names but would break the SQL, so it is removed after the prefixes have been inserted (just like the one for ORDER BY).
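Putting the last few steps together, the sketch below builds a SELECT clause from two shortened field lists, inserts the prefixes and then strips the comma before FROM. The field lists are cut down for the example; the rest follows the code shown above.

```powershell
$FieldTypes   = "System", "Photo"
$SystemPrefix = "System."
$SystemFields = "ItemName|ItemUrl"
$PhotoPrefix  = "System.Photo."
$PhotoFields  = "cameramodel|orientation"

# Build the field list, leaving a comma after every field
$SQL = "SELECT "
foreach ($type in $FieldTypes) {
    $SQL += ((Get-Variable "$($type)Fields").Value -replace "\|", ", ") + ", "
}
$SQL += " FROM SYSTEMINDEX WHERE "

# Insert the right prefix in front of each short field name
foreach ($type in $FieldTypes) {
    $fields = (Get-Variable "$($type)Fields").Value
    $prefix = (Get-Variable "$($type)Prefix").Value
    $SQL    = $SQL -replace "(?<=\s)(?=($fields)\s*(=|>|<|,|Like))", $prefix
}

# Remove the comma which now sits between the last field and FROM
$SQL = $SQL -replace "\s*,\s*FROM\s+", " FROM "

$SQL
# SELECT System.ItemName, System.ItemUrl, System.Photo.cameramodel, System.Photo.orientation FROM SYSTEMINDEX WHERE 
```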
This might seem inefficient, but when I checked the time it took to run the function and get the results but not output them it was typically about 0.05 seconds (50ms) on my laptop – it takes more time to output the results.
Combining all the bits in this part with the bits in part one turns my 36 line function into about a 60 line one as follows

Function Get-IndexedItem{
Param ( [Alias("Where","Include")][String[]]$Filter ,
        [Alias("Sort")][String[]]$OrderBy,
        [Alias("Top")][String[]]$First,
        [String]$Path,
        [Switch]$Recurse 
      )
$PropertyAliases = @{Width  = "System.Image.HorizontalSize"; 
                     Height = "System.Image.VerticalSize"}
$FieldTypes      = "System","Photo"
$PhotoPrefix     = "System.Photo."
$PhotoFields     = "cameramodel|cameramanufacturer|orientation"
$SystemPrefix    = "System."
$SystemFields    = "ItemName|ItemUrl|FileExtension|FileName"
# SELECT clause: built from the field lists, with a trailing comma for now
if ($First) 
     {$SQL = "SELECT TOP $First "}
else {$SQL = "SELECT "}
foreach ($type in $FieldTypes)
     {$SQL +=((get-variable "$($type)Fields").value -replace "\|",", ")+", " }
# FROM clause: a UNC path selects the index on the remote computer
if ($Path -match "\\\\([^\\]+)\\.")
     {$SQL += " FROM $($matches[1]).SYSTEMINDEX WHERE "}
else {$SQL += " FROM SYSTEMINDEX WHERE "}
# WHERE clause: wildcards, quotes, LIKE and orphan terms
if ($Filter)
     {$Filter = $Filter -replace "(?<!\(\s*)\*","%"
      $Filter = $Filter -replace "\s*(=|<|>|like)\s*([^'\d][^\s']*)$",' $1 ''$2'' '
      $Filter = $Filter -replace "\s*=\s*(?='.+%'\s*$)" ," LIKE "
      $Filter = ($Filter | ForEach-Object {
          if ($_ -match "'|=|<|>|like|contains|freetext")
               {$_}
          else {"Contains(*,'$_')"}
      })
      $SQL += $Filter -join " AND "
    }
if ($Path)
    {if ($Path -notmatch "\w{4}:") {$Path = "file:" + $Path}
     $Path = $Path -replace "\\","/"
     if ($SQL -notmatch "WHERE\s*$") {$SQL += " AND " }
     if ($Recurse) 
          {$SQL += " SCOPE = '$Path' "}
     else {$SQL += " DIRECTORY = '$Path' "}
}
if ($SQL -match "WHERE\s*$")
     { Write-warning "You need to specify either a path , or a filter." ; return }
if ($OrderBy) { $SQL += " ORDER BY " + ($OrderBy -join " , " ) + ","}
# Expand aliases, insert prefixes, then remove the helper commas
$PropertyAliases.Keys | ForEach-Object {
     $SQL = $SQL -replace "(?<=\s)$($_)(?=\s*(=|>|<|,|Like))", $PropertyAliases.$_ }
foreach ($type in $FieldTypes)
    {$fields = (get-variable "$($type)Fields").value
     $prefix = (get-variable "$($type)Prefix").value
     $SQL    = $SQL -replace "(?<=\s)(?=($Fields)\s*(=|>|<|,|Like))" , $Prefix
    }
$SQL = $SQL -replace "\s*,\s*FROM\s+" , " FROM "
$SQL = $SQL -replace "\s*,\s*$" , ""
$Provider = "Provider=Search.CollatorDSO;" + "Extended Properties='Application=Windows';"
$Adapter  = new-object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$DS       = new-object system.data.dataset
if ($Adapter.Fill($DS)) { $DS.Tables[0] }
}

In part three I’ll finish the function by turning my attention to output.

Using the Windows index to search from PowerShell: Part one: Building a query from user input.

Filed under: Uncategorized — jamesone111 @ 10:43 am

Note: this was originally written for the Hey, Scripting Guy! blog, where it appeared as the 25 June 2012 episode. The code is available for download. I have some more index-related posts coming up, so I wanted to make sure everything was in one place.


I’ve spent some time developing and honing a PowerShell function that gets information from the Windows Index – the technology behind the search that is integrated into Explorer in Windows 7 and Vista. The Index can be queried using SQL, and my function builds the SQL query from user input, executes it and receives rows of data for all the matching items. In Part three, I’ll look at why rows of data aren’t the best thing for the function to return and what the alternatives might be. Part two will look at making user input easier – I don’t want to make an understanding of SQL a prerequisite for using the function. In this part I’m going to explore the query process.

We’ll look at how the query is built in a moment; for now please accept that a ready-to-run query is stored in the variable $SQL. Then it only takes a few lines of PowerShell to prepare and run the query:

$Provider="Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
$adapter = new-object system.data.oledb.oleDBDataadapter -argument $sql, $Provider
$ds      = new-object system.data.dataset
if ($adapter.Fill($ds)) { $ds.Tables[0] }

The data is fetched using oleDBDataAdapter and DataSet objects; the adapter is created specifying a "provider" which says where the data will come from and a SQL statement which says what is being requested. The query is run when the adapter is told to fill the dataset. The .fill() method returns a number, indicating how many data rows were returned by the query – if this is non-zero, my function returns the first table in the dataset. PowerShell sees each data row in the table as a separate object; and these objects have a property for each of the table’s columns, so a search might return something like this:

SYSTEM.ITEMNAME : DIVE_1771+.JPG
SYSTEM.ITEMURL : file:C:/Users/James/pictures/DIVE_1771+.JPG
SYSTEM.FILEEXTENSION : .JPG
SYSTEM.FILENAME : DIVE_1771+.JPG
SYSTEM.FILEATTRIBUTES : 32
SYSTEM.FILEOWNER : Inspiron\James
SYSTEM.ITEMTYPE : .JPG
SYSTEM.ITEMTYPETEXT : JPEG Image
SYSTEM.KINDTEXT : Picture
SYSTEM.KIND : {picture}
SYSTEM.MIMETYPE : image/jpeg
SYSTEM.SIZE : 971413

There are lots of fields to choose from, so the list might be longer. The SQL query to produce it looks something like this.

SELECT System.ItemName, System.ItemUrl,        System.FileExtension,
       System.FileName, System.FileAttributes, System.FileOwner, 
       System.ItemType, System.ItemTypeText ,  System.KindText, 
       System.Kind,     System.MIMEType,       System.Size
FROM   SYSTEMINDEX
WHERE  System.Keywords = 'portfolio' AND Contains(*,'stingray')

In the finished version of the function, the SELECT clause has 60 or so fields; the FROM and WHERE clauses might be more complicated than in the example and an ORDER BY clause might be used to sort the data.
The clauses are built using parameters which are declared in my function like this:

Param ( [Alias("Where","Include")][String[]]$Filter ,
        [Alias("Sort")][String[]]$orderby,
        [Alias("Top")][String[]]$First,
        [String]$Path,
        [Switch]$Recurse
)

In my functions I try to use names already used in PowerShell, so here I use -Filter and -First but I also define aliases for SQL terms like WHERE and TOP. These parameters build into the complete SQL statement, starting with the SELECT clause which uses -First

if ($First) {$SQL = "SELECT TOP $First "}
else        {$SQL = "SELECT "}
$SQL += " System.ItemName, System.ItemUrl " # and the other 58 fields

If the user specifies –First 1 then $SQL will be "SELECT TOP 1 fields"; otherwise it’s just "SELECT fields". After the fields are added to $SQL, the function adds a FROM clause. Windows Search can interrogate remote computers, so if the -path parameter is a UNC name in the form \\computerName\shareName the SQL FROM clause becomes FROM computerName.SYSTEMINDEX otherwise it is FROM SYSTEMINDEX to search the local machine.
A regular expression can recognise a UNC name and pick out the computer name, like this:

if ($Path -match "\\\\([^\\]+)\\.") {
$sql += " FROM $($matches[1]).SYSTEMINDEX WHERE "
}
else {$sql += " FROM SYSTEMINDEX WHERE "}

The regular expression in the first line of the example breaks down like this

Regular expression: \\\\([^\\]+)\\.

Part        Meaning                                       Application
\\\\        2 \ characters: "\" is the escape character,  \\computerName\shareName
            so each one needs to be written as \\
[^\\]+      Any non-\ character, repeated at least once   \\computerName\shareName
\\.         A \, followed by any character                \\computerName\shareName
( )         Capture the section which is enclosed by      $matches[0] = \\computerName\s
            the brackets as a match                       $matches[1] = computerName
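To try the UNC test without touching the index, the sketch below feeds one UNC path and one local path through the same -match and collects the FROM clause each would produce. The computer name server01 and both paths are hypothetical.

```powershell
# Hypothetical paths: one UNC name, one local folder
$Paths = "\\server01\photos\2012", "C:\Users\James\Pictures"

$From = foreach ($Path in $Paths) {
    # $Matches[1] holds the computer name captured by the brackets
    if ($Path -match "\\\\([^\\]+)\\.") { "FROM $($Matches[1]).SYSTEMINDEX" }
    else                                { "FROM SYSTEMINDEX" }
}

$From
# FROM server01.SYSTEMINDEX
# FROM SYSTEMINDEX
```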

I allow the function to take different parts of the WHERE clause as a comma separated list, so that
-filter "System.Keywords = 'portfolio'","Contains(*,'stingray')"
is equivalent to
-filter "System.Keywords = 'portfolio' AND Contains(*,'stingray')"

Adding the filter just needs this:

if ($Filter) { $SQL += $Filter -join " AND "}

The folders searched can be restricted. A "SCOPE" term limits the query to a folder and all of its subfolders, and a "DIRECTORY" term limits it to the folder without subfolders. If the request is going to a remote server the index is smart enough to recognise a UNC path and return just the files which are accessible via that path. If a -Path parameter is specified, the function extends the WHERE clause, and the –Recurse switch determines whether to use SCOPE or DIRECTORY, like this:

if ($Path){
     if ($Path -notmatch "\w{4}:") {
           $Path = "file:" + (resolve-path -path $Path).providerPath
     }
     if ($sql -notmatch "WHERE\s*$") {$sql += " AND " }
     if ($Recurse)                   {$sql += " SCOPE = '$Path' " }
      else                           {$sql += " DIRECTORY = '$Path' "}
}

In these SQL statements, paths are specified in the form file:c:/users/james which isn’t how we normally write them (and the way I recognise UNC names won’t work if they are written as file://ComputerName/shareName). This is rectified by the first line inside the If ($Path) {} block, which checks for 4 "word" characters followed by a colon. Doing this will prevent ‘File:’ being inserted if any protocol has been specified –the same search syntax works against HTTP:// (though not usually when searching on your workstation), MAPI:// (for Outlook items) and OneIndex14:// (for OneNote items). If a file path has been given I ensure it is an absolute one – the need to support UNC paths forces the use of .ProviderPath here. It turns out there is no need to convert \ characters in the path to /, provided the file: is included.
After taking care of that, the operation -notmatch "WHERE\s*$" sees to it that an "AND" is added if there is anything other than spaces between WHERE and the end of the line (in other words, if any conditions specified by -Filter have been inserted). If neither -Path nor -Filter was specified there will be a dangling WHERE at the end of the SQL statement. Initially I removed this with a -replace, but then I decided that I didn’t want the function to respond to a lack of input by returning the whole index, so I changed it to write a warning and exit. With the WHERE clause completed, the final clause in the SQL statement is ORDER BY, which – like WHERE – joins up a multi-part condition.

if ($sql -match "WHERE\s*$") {
     Write-warning "You need to specify either a path, or a filter."
     Return
}
if ($orderby) { $sql += " ORDER BY " + ($OrderBy -join " , ") }
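The path handling described above can be exercised in isolation. Add-PathCondition below is a hypothetical helper written for this illustration, not part of the original function, and it leaves out the Resolve-Path step so that it does not depend on the folders actually existing.

```powershell
# Hypothetical helper for illustration; not part of Get-IndexedItem
function Add-PathCondition {
    param ([string]$SQL, [string]$Path, [switch]$Recurse)
    # Prefix file: unless a protocol (4 word characters then a colon) is present
    if ($Path -notmatch "\w{4}:")   { $Path = "file:" + $Path }
    # Add AND only when a condition already follows WHERE
    if ($SQL  -notmatch "WHERE\s*$") { $SQL += " AND " }
    if ($Recurse) { $SQL + " SCOPE = '$Path' " }
    else          { $SQL + " DIRECTORY = '$Path' " }
}

$a = Add-PathCondition -SQL "SELECT System.ItemUrl FROM SYSTEMINDEX WHERE " -Path "C:\Users\James" -Recurse
$b = Add-PathCondition -SQL "SELECT System.ItemUrl FROM SYSTEMINDEX WHERE Contains(*,'stingray')" -Path "C:\Users\James"
"[$a]"
"[$b]"
# [SELECT System.ItemUrl FROM SYSTEMINDEX WHERE  SCOPE = 'file:C:\Users\James' ]
# [SELECT System.ItemUrl FROM SYSTEMINDEX WHERE Contains(*,'stingray') AND  DIRECTORY = 'file:C:\Users\James' ]
```

The first call has nothing after WHERE, so no AND is added; the second already has a filter condition, so the path condition is joined to it with AND.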

When the whole function is put together it takes 3 dozen lines of PowerShell to handle the parameters, build and run the query and return the result. Put together they look like this:

Function Get-IndexedItem{
Param ( [Alias("Where","Include")][String[]]$Filter ,
         [Alias("Sort")][String[]]$OrderBy,
         [Alias("Top")][String[]]$First,
         [String]$Path,
         [Switch]$Recurse
)
if ($First) {$SQL = "SELECT TOP $First "}
else        {$SQL = "SELECT "}
$SQL += " System.ItemName, System.ItemUrl " # and the other 58 fields
if ($Path -match "\\\\([^\\]+)\\.") {
              $SQL += "FROM $($matches[1]).SYSTEMINDEX WHERE "
}
else         {$SQL += " FROM SYSTEMINDEX WHERE "}
if ($Filter) {$SQL += $Filter -join " AND "}
if ($Path) {
    if ($Path -notmatch "\w{4}:") {$Path = "file:" + $Path}
    $Path = $Path -replace "\\","/"
    if ($SQL -notmatch "WHERE\s*$") {$SQL += " AND " }
    if ($Recurse)                   {$SQL += " SCOPE = '$Path' " }
    else                            {$SQL += " DIRECTORY = '$Path' "}
}
if ($SQL -match "WHERE\s*$") {
    Write-Warning "You need to specify either a path or a filter."
    Return
}
if ($OrderBy) { $SQL += " ORDER BY " + ($OrderBy -join " , " ) }
$Provider = "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
$Adapter  = New-Object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$DS       = New-Object system.data.dataset
if ($Adapter.Fill($DS)) { $DS.Tables[0] }
}

The -Path parameter is more user-friendly as a result of the way I handle it, but I’ve made it a general rule that you shouldn’t expect the user to know too much of the underlying syntax, and at the moment the function requires too much knowledge of SQL: I don’t want to type

Get-IndexedItem -Filter "Contains(*,'Stingray')", "System.Photo.CameraManufacturer Like 'Can%'"

and it seems unreasonable to expect anyone else to do so. I came up with this list of things the function should do for me.

  • Don’t require the user to know whether a search term is prefixed with SYSTEM., SYSTEM.DOCUMENT., SYSTEM.IMAGE. or SYSTEM.PHOTO. If the prefix is omitted put the correct one in.
  • Even without the prefixes some field names are awkward, for example "HorizontalSize" and "VerticalSize" instead of width and height. Provide aliases.
  • Literal text in searches needs to be enclosed in single quotes; insert quotes if the user omits them.
  • A free text search over all fields is written as Contains(*,'searchTerm'); convert "orphan" search terms into contains conditions.
  • SQL uses % (not *) for a wildcard: replace * with % in filters to cope with users putting the familiar *.
  • SQL requires the LIKE predicate (not =) for wildcards: replace = with like for wildcards.

In Part two, I’ll look at how I do those things.

June 25, 2012

Lonesome George

Filed under: Uncategorized — jamesone111 @ 3:15 pm

At Easter I was in the Galapagos Islands; work had taken me to Ecuador, and diving in the Galapagos was too good an opportunity to miss. Mainland Ecuador was a country I knew little about, and two weeks working in the capital (Quito, just south of the Equator and 10,000 feet up in the Andes) doesn’t qualify me as an expert. The client there was a good one to work with, and what I saw of the city (a bunch of taxi rides and a bus tour on the one day we weren’t working) means I’d go back if asked. Travel wasn’t good, and the return flights were so bad that I’ve vowed never to fly with Iberia again. Flying to the islands, the plane had a problem which meant that if it landed it couldn’t take off again, so having got within sight of the islands we had to go all the way back to Quito and get another plane. Down at sea level the heat was ferocious, the transportation scary and the insect bites the worst I’ve had. But the diving… different kinds of sharks (including a close encounter with a group of Hammerheads), sea lions, turtles, rays (including a Manta encounter on the very first dive which set the tone) – I’d put up with a lot for that. And if some search engine has steered you here, I dived with Scuba Iguana, and if I manage to go back I’ll dive with them again.

The Scuba place was pretty much next door to the Darwin station: home of giant tortoises and a tortoise breeding programme. Galapagos comes from the Spanish word for saddle, because the shape of the giant tortoise’s shell looked like a traditional saddle. I also learnt that some languages – including Spanish – don’t have distinct words for tortoises and (marine) turtles. The sex of tortoises is determined by the temperature at which the eggs incubate, and the breeding programme gathers eggs, incubates them to get extra females, and looks after the baby tortoises, keeping them safe from human-introduced species (like rats) which feed on eggs and baby tortoises. Each island’s tortoises are different, so the eggs and hatchlings are marked up so they go back to the right island. But there is no breeding programme for Pinta island (also named Abingdon island by the Royal Navy. According to a story told by Stephen Fry on QI, sailors ate giant tortoises and found them very good.) A giant tortoise was found on Pinta, but a search over several years failed to find a second. So he – Lonesome George – was brought to the Darwin station in the 1970s. No one knows for sure how old he was then. All efforts to find a mate for him failed, so George lived out his final decades as the only known example of Geochelone nigra abingdoni, the Pinta Galapagos tortoise.
On Easter Sunday I walked up to see George and the giants from other islands who live at the station. George was keeping out of the sun; he shared an enclosure and I wondered what he made of the other species – if somewhere in that ancient reptile brain lurked a memory of others just like him, a template into which the other tortoises didn’t quite fit.

Later in the trip I was asked to help with some work on a survey being prepared about Hammerhead sharks. I was told they were estimated to have a 20% chance of becoming extinct in the next 100 years. This statistic is quite hard to digest: my chances of being extinct in 100 years are close to 100%. So my contribution to the survey was to suggest telling people that, if things continue as they are, the chances of seeing Hammerheads on a dive in 5 years will be X amount less than today. It’s not fair, but we care more about some species than others, and I hope there will still be Hammerheads for my children to see in a few years. Sadly they won’t get the chance to see the Pinta Tortoise.

June 22, 2012

Windows Phone 8. Some Known Knowns, and Known Unknowns

Filed under: Uncategorized — jamesone111 @ 10:23 am

Earlier this week Microsoft held its Windows Phone Summit, where it made a set of announcements about the next generation of Windows Phone – Windows Phone 8. In summary these were:

  • Hardware Support for Multi-core processors , Additional Screen resolutions, Removable memory cards, NFC
  • Software The Windows 8 core is now the base OS, Support for native code written in C or C++ (meaning better games), IE 10, new mapping (from Nokia), a speech API which apps can use. Better business-oriented features, VOIP support, a new “Wallet” experience and in-app payments, and a new start screen.

This summit came hot on the heels of the Surface tablet launch, which seemed to be a decision by Microsoft that making hardware for Windows 8 was too important to be left to the hardware makers. The first thing I noted about the phone announcement was the lack of a Microsoft-branded phone. I’ve said that Microsoft should make phones itself since before the first Windows Mobile devices appeared – when Microsoft was talking about “Stinger” internally; I never really bought any of the reasons given for not doing so. But I’d be astounded if Nokia didn’t demand that Microsoft promise not to make a phone (whether there’s a binding agreement or just an understanding between Messrs Elop and Ballmer we don’t know). Besides Nokia, Microsoft has 3 other device makers on board: HTC have made devices for every Microsoft mobile OS since 2000, but also have a slew of devices for Android; Samsung were a launch partner for Phone 7 but since then have done more with Android; LG were in the line-up for the Windows Phone 7 launch and are replaced by Huawei. What these 3 feel about Nokia mapping technology is a matter of guesswork, but depends on branding and licensing terms.

There are some things we think we know, but actually they are things we know that we don’t know.

  • Existing 7.x phones will not run Windows Phone 8 but will get an upgrade to Windows Phone 7.8. I have an HTC Trophy which I bought in November 2010; it has gone from 7.0 to 7.5 and I’ll be quite happy to get 7.8 on a 2 year old phone. Someone who just bought a Nokia Lumia might not feel quite so pleased. What will be in 7.8? The new start screen has been shown. But will it have IE10? Will it have the new mapping and speech capabilities? The Wallet? In-app payments? This matters because…
  • Programs specifically targeting Windows Phone 8 won’t run on 7. Well doh! Programs targeting Windows 7 don’t run on XP. But what programs will need to target the new OS? Phone 7 programs are .NET programs, and since .NET compiles to an intermediate language, not to CPU instructions, a program which runs on Windows 8-RT (previously called Windows on ARM) should go straight onto a Windows 8-Intel machine (but not vice versa), and Phone 7 programs will run on Phone 8. An intriguing comment at the launch says the Phone 8 emulator runs on Hyper-V; unless Hyper-V has started translating between different CPU instruction sets this means the emulated phone has an Intel CPU, but it doesn’t matter because it is running .NET intermediate language, not binary machine code. So how many new programs will be 8-specific? Say Skype uses the VOIP support and in-app payments for calling credit. Will users with old phones be stuck with the current version of Skype? Do they get a new version where those features don’t light up? Or do they get the same experience as someone with a new phone? If the only things which are Phone 8 specific are apps which need multiple cores (or other newly supported hardware) there would never have been any benefit in upgrading a single core phone from 7 to 8.
  • Business support. This covers many possibilities, but what is in and what is out? Will the encryption send keys back to base as BitLocker does? Will there be support for DirectAccess? Corporate wireless security? Will adding Exchange server details auto-configure the phone to corporate settings (like the corporate app-store)? Will it be possible to block updates? This list goes on and on.

It’s going to be interesting to see what this does for Microsoft and for Nokia’s turn-round.

May 11, 2012

Lies, damn lies and public sector pensions

Filed under: Uncategorized — jamesone111 @ 9:08 am

Every time I hear that “public sector workers” are protesting about government plans for their pensions – as happened yesterday – I think of two points I made to fellow school governors when the teachers took a day off (with the knock-on cost to parents and the economy). These were:

  • Do classroom teachers understand their union wants them to subsidize the head’s pension from theirs? (Have the Unions explained to the classroom teachers that is what they are doing?)
  • Any teacher who is part of this protest has demonstrated they have insufficient grasp of maths to teach the subject (except, perhaps to the reception class.)

This second point is the easier of the two to explain: the pension paid for over your working life is a function of:

  • The salary you earned (which in turn depends on the rate at which your salary grew)
  • What fraction of your salary was paid in to your pension fund. It might be you gave up X% or your employer put in Y% that you never saw or a combination.   
  • How many years you paid in for (when you started paying in, and when you retire)
  • How well your pension fund grew before it paid out
  • How long the pension is expected to pay out for (how long you live after retirement)
  • Annuity rates – the interest earned on the pension fund as it pays out.

In addition, some people receive some pension which wasn’t paid for in this way; some employers (public or private sector) make guarantees to top up either the pension fund so it will buy a given level of annuity or to top-up  pension payments bought with the fund. The total you receive is the combination of what you have paid for directly and the top-up.
Change any factor – for example how long you expect to live – and either what you have paid for changes or the other factors have to change to compensate. Since earnings and rates of return aren’t something we control, living longer means either we have paid for a smaller pension, or we must pay in more, or retire later, or some combination of all three. Demanding that the same amount will come out of a pension for longer, without paying more in, is a demand for a guaranteed top-up in future – in the case of public sector employees that future top-up comes from future taxes, for private sector it comes from future profits.

It’s easy for those in Government to make pension promises because those promises don’t need to be met for 30 years or more. Teachers whose retirement is imminent would have come into the profession when Wilson, Heath or Callaghan was in Downing Street, and all 3 are dead and buried; so, with the exception of Denis Healey, are all the chancellors who set budgets while they were in office. It’s tempting for governments to save on spending by under-contributing to pensions today, and leave future governments with the shortfall: taken to the extreme governments can turn pensions into a Ponzi scheme – this year’s “Teachers’/Police/whatever pay” bill covers what all current teachers / police officers / whoever are paid for doing the job and all pensions paid to retired ones for doing the job cheaply in the past. Since I am, more-or-less, accusing governments of all colours of committing fraud, I might as well add false accounting to the charge sheet. Let’s say the Government wants to buy a squadron of new aircraft but doesn’t want to raise taxes to pay for them all this year; it borrows money and the future liability that creates is accounted for. If the deal it makes with public sector workers is for a given amount to spend today, and a promise of a given standard of living in retirement, does it record that promise – that future liability – as part of pay today? Take a wild guess.
This wouldn’t matter – outside accounting circles – if everything was constant. But the length of time between retirement and death has increased and keeps on increasing. For the sake of a simple example: let’s assume someone born in 1920 joined the Police after World War II, served for 30 years and retired in 1975 at age 55 expecting to die at 70. Their retirement was half their length of service. Now consider someone born in 1955, who also joined the police at age 25, served for 30 years and retired in 2010. Is anyone making plans for their pension to stop in 2025? We might reasonably expect this person to live well into their eighties – so we’ve moved from 1 retired officer for every 2 serving, to a 1:1 ratio. I’m not saying that in 1975 the ratio was 1:2 and in 2012 it is 1:1 but that’s the direction of travel.

I’ve yet to hear a single teacher say their protests about pensions amount to a demand that they should under-fund their retirement as a matter of public policy and their pupils – who will then be tax payers – should make up the difference. As one of those whose work generates the funds to pay for the public sector I must choose a combination of lower pension, later retirement, and higher contributions than I was led to expect when I started work 25 years or so ago. And there are people demanding my taxes insulate them from having to do the same; or (if you prefer) demanding a pay rise to cover the gap between what past governments have promised them and what they are actually paying for, or (and this becomes a bit uncomfortable) that government starts telling us what it really costs to have the teachers, nurses, police officers and so on we want.

But what of my claim that Unions get low paid staff to subsidize the pensions of higher paid colleagues? Let’s take two teachers; I’ll call them Alice and Bob, and since this is a simplified example they’ll fit two stereotypes: Alice sticks to working in the classroom, and gets a 2% cost of living rise every year. Bob competes for every possible promotion, and gets promoted in alternate years, so he gets a 2% cost of living rise alternating with a 10% rise. Although they both started on the same salary, after 9 end-of-year rises Alice’s pay has gone up by 19.5% and Bob – who must be a head by now – has seen his rise by 74%.
Throughout the 10 years they pay 10% of their salary into their pension fund – to make the sums easy we’ll assume they pay the whole 10% on the last day of the year, and each year their pension fund manager earns them 10% of what was in their pension pot at the end of the previous year. After 10 years Alice has £17,184 in her pension pot, and Bob has £20,390 in his.

Alice (and her fellow classroom teachers) are told by the Union Rep that any attempt to change from final salary as the calculation mechanism is an attack on their pension; for her, this is factually wrong. If you are ever told this you need to ask if you are a high flier like Bob or if your career is more like Alice’s. To see why it is wrong (and let’s put it down to the Union rep being innumerate, rather than dishonest), let’s pretend the pension scheme only has Alice and Bob in it. So the total pot is £37,574 – Alice put in 46% of that money, but if it is shared in the ratio of the final salaries, 11,950 : 17,432, Alice gets 41% of the pay-out.
You can argue it doesn’t work like that because Alice’s pot (at 1.44 times her final salary) might just cover the percentage of her final salary she has been promised: Bob’s pension pot is only 1.17 times his final salary which will give him a smaller percentage, so the government steps in and boosts his pot to be 1.44 times his final salary just like Alice’s. So Bob gets a golden handshake of nearly £4,700 and Alice gets nothing.
Suppose 1.44 years is nowhere near enough and Alice and Bob need 3 years’ salary to buy a large enough annuity; the government needs to find £18,668 for Alice (108% of her pot), and £31,907 for Bob (156% of his pot). Whichever way you slice it, if your salary grows quicker than your colleagues’ you will do better out of final salary than they do. If it grows more slowly you will fare worse.

         Alice                                                Bob
             Salary  Increase  Pension Payment  Pension Pot       Salary  Increase  Pension Payment  Pension Pot
Year 1    10,000.00        2%         1,000.00     1,000.00    10,000.00       10%         1,000.00     1,000.00
Year 2    10,200.00        2%         1,020.00     2,120.00    11,000.00        2%         1,100.00     2,200.00
Year 3    10,404.00        2%         1,040.40     3,372.40    11,220.00       10%         1,122.00     3,542.00
Year 4    10,612.08        2%         1,061.21     4,770.85    12,342.00        2%         1,234.20     5,130.40
Year 5    10,824.32        2%         1,082.43     6,330.36    12,588.84       10%         1,258.88     6,902.32
Year 6    11,040.81        2%         1,104.08     8,067.48    13,847.72        2%         1,384.77     8,977.33
Year 7    11,261.62        2%         1,126.16    10,000.39    14,124.68       10%         1,412.47    11,287.53
Year 8    11,486.86        2%         1,148.69    12,149.12    15,537.15        2%         1,553.71    13,970.00
Year 9    11,716.59        2%         1,171.66    14,535.69    15,847.89       10%         1,584.79    16,951.79
Year 10   11,950.93                   1,195.09    17,184.35    17,432.68                   1,743.27    20,390.23

Average salary:   Alice 10,949.72   Bob 13,394.10

Basis of split    Combined salary   Total pot   Alice’s share   Bob’s share
Final salary            29,383.60   37,574.58       15,282.37     22,292.21
Average salary          24,343.82   37,574.58       16,900.85     20,673.73
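
For anyone who wants to check the sums, here is a rough PowerShell sketch of the same model. It assumes exactly what the example states (10% of salary paid in at the end of each year, 10% growth on the previous year's pot, rises applied at year end); the function name Get-PensionPot and its parameters are made up for this illustration.

```powershell
# Sketch only: Get-PensionPot is a hypothetical helper, not part of any real scheme's rules
function Get-PensionPot {
    param (
        [double]   $StartSalary      = 10000,
        [double[]] $Rises,                      # end-of-year rises, e.g. 0.02 for 2%
        [double]   $ContributionRate = 0.10,
        [double]   $GrowthRate       = 0.10
    )
    $salary = $StartSalary
    $pot    = 0.0
    $paid   = 0.0
    for ($year = 0; $year -le $Rises.Count; $year++) {
        # Last year's pot grows, then this year's contribution is added on the last day
        $pot   = $pot * (1 + $GrowthRate) + $salary * $ContributionRate
        $paid += $salary
        # Apply the rise for next year (there is no rise after the final year)
        if ($year -lt $Rises.Count) { $salary = $salary * (1 + $Rises[$year]) }
    }
    New-Object PSObject -Property @{
        FinalSalary   = [math]::Round($salary, 2)
        AverageSalary = [math]::Round($paid / ($Rises.Count + 1), 2)
        Pot           = [math]::Round($pot, 2)
    }
}

$alice = Get-PensionPot -Rises (@(0.02) * 9)               # 2% every year
$bob   = Get-PensionPot -Rises (@(0.10, 0.02) * 4 + 0.10)  # 10% and 2% alternating

# Split the combined pot by final salary and by average salary
$totalPot       = $alice.Pot + $bob.Pot
$aliceByFinal   = $totalPot * $alice.FinalSalary   / ($alice.FinalSalary   + $bob.FinalSalary)
$aliceByAverage = $totalPot * $alice.AverageSalary / ($alice.AverageSalary + $bob.AverageSalary)
```

Comparing $alice.Pot with $aliceByFinal and $aliceByAverage gives the two gaps discussed here: how much of Alice's own money moves to Bob under each way of sharing the pot.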

What if the mechanism for calculating were average salary, not final salary? It doesn’t quite remove the gap but gets very close. Instead of nearly £2,000 of Alice’s money going to Bob it’s less than £300.
A better way to look at this is to say if the amount of money in the combined Pension pot pays £5000 a year in Pensions, do we split it as roughly £2000 to Alice and £3000 to Bob (the rough ratio of their final salaries – each gets about 1/6th of their final salary) or £2250 to Alice and £2750 to Bob (the ratio of their average salaries and each gets about 1/5th of their average).
Whenever average salary is suggested as a basis, union leaders will say that pensions are calculated from a smaller number as if it reduces the amount paid. If the government wanted to take money that way it would be simpler to say “a full pension will in future be a smaller percentage of final salary”. Changing to average-based implies an increase in the percentage paid. 

That perhaps is the final irony. Rank and file Police officers – whose career pay is like Alice’s in the example – marched through London yesterday demanding that their pensions be left alone; you do not need to spend long reading “Inspector Gadget” to realise when you remove the problems created for the Police by politicians most of the problems that are left are created by senior officers whose career pay follows the “Bob” path. Yet the “many” marching were demanding that they continue to subsidize these “few”. As Gadget himself likes to say : you couldn’t make it up.

April 22, 2012

Don’t Swallow the cat. Doing the right thing with software development and other engineering projects.

Filed under: Uncategorized — jamesone111 @ 8:30 pm

In my time at Microsoft I became aware of the saying “communication occurs only between equals”, usually couched in the terms “People would rather lie than deliver bad news to Steve Ballmer”. Replacing unwelcome truths with agreeable dishonesty wasn’t confined to the CEO’s direct reports, and certainly isn’t a disease confined to Microsoft. I came across ‘The Hierarchy of Power Semantics’ more than 30 years ago when I didn’t understand what was meant by the title; it was written in the 1960s and if you don’t recognise “In the beginning was the plan and the specification, and the plan was without form and the specification was void and there was darkness on the face of the implementation team” see here – language warning for the easily offended.
Wikipedia says the original form of “communication occurs only between equals” is “Accurate communication is possible only in a non-punishing situation”. There are those who (consciously or not) use the impossibility of saying “No” to extract more from staff and suppliers; it can produce extraordinary results, but sooner or later it goes horribly wrong. For example the Challenger disaster was caused by the failure of an ‘O’ ring in a solid rocket booster made by Morton Thiokol. The engineers responsible for the booster were quite clear that in cold weather the ‘O’ rings were likely to fail with catastrophic results. NASA asked if a launch was OK after a freezing night and, fearing the consequences of saying “No”, managers at Morton Thiokol over-ruled the engineers and allowed the disastrous launch to go ahead. Most people can think of some case where someone made an impossible promise to a customer, because they were afraid to say no.

Several times recently I have heard people say something to the effect that “We’re so committed to doing this the wrong way that we can’t change to the right way.” Once the person saying it was me, which was the genesis of this post. Sometimes in a software project, because saying to someone – even to ourselves – “We’re doing this wrong” is difficult, we create work-rounds instead. The odd title of this post comes from a song which was played on the radio a lot when I was a kid.

There was an old lady, who swallowed a fly, I don’t know why she swallowed a fly. I guess she’ll die.
There was an old lady, who swallowed a spider that wriggled and jiggled and tickled inside her. She swallowed the spider to catch the fly … I guess she’ll die
There was an old lady, who swallowed a bird. How absurd to swallow a bird. She swallowed the bird to catch the spider … I guess she’ll die
There was an old lady, who swallowed a cat. Fancy that to swallow a cat. She swallowed the cat to catch the bird …  I guess she’ll die
There was an old lady, who swallowed a dog. What a hog to swallow a dog. She swallowed the dog to catch the cat … I guess she’ll die
There was an old lady, who swallowed a horse. She’s dead, of course

In other words each cure needs a further, more extreme cure.  In my case the “fly” was a simple problem I’d inherited. It would take a couple of pages to explain the context, so for simplicity it concerns database tables and the “spider” was to store data de-normalized. If you don’t spend your days working with databases, imagine you have a list of suppliers, and a list of invoices from those suppliers. Normally you would store an ID for the supplier in the invoice table, and look up the name from the supplier table using the ID. For what I was doing it was better to put the supplier name in the invoices table, and ignore the ID. All the invoices for the supplier can be looked up by querying for the name. The same technique applied to products supplied by that supplier: store the supplier name in the product table, look up products by supplier name. This is not because I didn’t know any better, I had database normal forms drummed into me two decades ago. To stick with the metaphor: I know that, under normal circumstances, swallowing spiders is bad, but faced with this specific fly it was demonstrably the best course of action.
At this point someone who could have saved me from my folly pointed out that supplier names had to be editable. I protested that the names don’t change, but Amalgamated Widgets did, in fact, become Universal Widgets. This is an issue because Amalgamated, not Universal, raised the invoices in the filing cabinet, so matching them to invoices in the system requires preserving the name as it was when the invoice was generated. “See, I was right: the name should be stored” – actually this exception doesn’t show I was right at all, but on I went. On the other hand all of Amalgamated’s products belong to Universal now. Changing names means doing a cascaded update (update any product with the old company name to the new name when a name changes), and the real case has more than just products. If you’re filling in the metaphor you’ve guessed I’d reached the point of figuring out how to swallow a bird. Worse, I could see another problem looming (anyone for cat?): changes to products had to be exported to another system, and the list of changes had their own table requiring cascaded updates from the cascaded updates.

One of the great quotes in Macbeth says “I am in blood stepped in so far that should I wade no more, Returning were as tedious as go o’er.” He knows what he’s doing is wrong, but it is as hard to go back (and do right) as it is to go on. Except it isn’t: the solution is not to swallow another couple more spiders and a fly, the solution is to swallow a bird, then a cat and so on. The dilemma is that the effort for an additional work-round is smaller than the effort to go back, fix the initial problem and unpick all the work-rounds to date – one or the other needs to be done now, and the easy solution is to choose the one which needs the least effort now. The sum of effort required for future work-rounds is greater but we can discount that effort because it isn’t needed now. Only in a non-punishing situation can we tell people that progress must be halted for a time to fix a problem which has been mitigated up to now. Persuading people that such a problem needs to be fixed at all isn’t trivial. I heard this quote in a radio programme a while back:

“Each uneventful day that passes reinforces a steadily growing false sense of confidence that everything is alright:
that I, we, my group must be OK because the way we did things today resulted in no adverse consequences.”

In my case the problem is being fixed at the moment, but in how many organisations is it a career-limiting move to tell people that something which has had no adverse consequences to date must be fixed?

February 4, 2012

Customizing PowerShell, Proxy functions and a better Select-String

Filed under: Uncategorized — jamesone111 @ 9:24 pm

I suspect that even regular PowerShell users don’t customize their environment much. By coincidence, in the last few weeks I’ve made multiple customizations to my systems (my scripts are sync’d over 3 machines: customize one, customize all), which has given me multiple things to talk about. My last post was about adding persistent history; this time I want to look at Proxy Functions …

Select-String is, for my money, one of the best things in PowerShell. It looks through piped text or through files for anything which matches a regular expression (or simple text) and reports back what matched and where, with all the detail you could ever need. BUT it has a couple of things wrong with it: it won’t do a recursive search for files, and sometimes the information which comes back is too detailed. I solved both problems with a function I named “WhatHas” which has been part of my profile for ages. I have been using this to search scripts, XML files and saved SQL whenever I need a snippet of code that I can’t remember, or because something needs to be changed and I can’t be sure I’ve remembered which files contain it. I use WhatHas dozens (possibly hundreds) of times a week. Because it was a quick hack I didn’t support every option that Select-String has, so if a code snippet spans lines I have to go back to the proper Select-String cmdlet and use its -context option to get the lines either side of the match: more often than not I find myself typing dir -recurse {something} | select-String {options}

A while back I saw a couple of presentations on Proxy functions (there’s a good post about them here by Jeffrey Snover): I thought when I saw them that I would need to implement one for real before I could claim to understand them, and after growing tired of jumping back and forth between Select-String and WhatHas, I decided it was time to do the job properly, creating a proxy function for Select-String and keeping WhatHas as an alias.

There are 3 bits of background knowledge you need for proxy functions.

  1. Precedence. Aliases beat Functions, Functions beat Cmdlets. Cmdlets beat external scripts and programs. A function named Select-String will be called instead of a cmdlet named Select-String – meaning a function can replace a cmdlet simply by giving it the same name. That is the starting point for a Proxy function.
  2. A command can be invoked as moduleName\CommandName. If I load a function named “Get-Stuff” from my profile.ps1 file for example, it won’t have an associated module name but if I load it as part of a module, or if “Get-Stuff” is a cmdlet it will have a module name.
    Get-Command get-stuff | format-list name, modulename
    will show this information. You can try
    > Microsoft.PowerShell.Management\get-childitem
    for yourself. It looks like an invalid file-system path, but remember PowerShell looks for a matching Alias, then a matching Function and then a matching cmdlet before looking for a file.
  3. Functions can have a process block (which runs for each item passed via the pipeline), a begin block (which runs before the first pass through process), and an end block (which runs after the last item has passed through process). Cmdlets follow the same structure, although it’s harder to see.
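
A quick way to convince yourself of these three points is a throwaway sketch like the one below. The Get-Date shadowing is only there for the demonstration and is removed again straight away; Measure-Length is a made-up name.

```powershell
# 1. A function beats a cmdlet of the same name
function Get-Date { 'the function ran, not the cmdlet' }
$fromFunction = Get-Date

# 2. The module-qualified name still reaches the real cmdlet
$fromCmdlet = Microsoft.PowerShell.Utility\Get-Date

# Tidy up so Get-Date behaves normally again
Remove-Item function:\Get-Date

# 3. begin runs once, process runs once per piped item, end runs once
function Measure-Length {        # hypothetical example function
    begin   { $total = 0 }
    process { $total += $_.Length }
    end     { $total }
}
$lengthTotal = 'one', 'three' | Measure-Length   # 3 + 5 characters
```

$fromFunction holds the string returned by the function, while $fromCmdlet holds a real DateTime, which is the whole trick a proxy function relies on.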

Putting these together: a function named Select-String can call the Select-String cmdlet, but it must call it as Microsoft.PowerShell.Utility\Select-String or it will just go round in a loop. In some cases, calling it isn’t quite enough, and PowerShell V2 delivered the steppable pipeline which can take a PowerShell command (or set of commands piped together) and allow us to run its begin block, process block, and end block under the control of a function. So a Proxy function looks like this:
Function Select-String {
  [CmdletBinding()]
  Param  ( # Same parameters as the real Select-String
           # Less any I want to prevent people using
           # Plus any I want to add
         )
   Begin { # My preliminaries
           # Get $steppablePipeline
           $steppablePipeline.Begin($PSCmdlet)
         }
 Process { # My per-item code against current item ($_)
           $steppablePipeline.Process($_)
         }
     End { $steppablePipeline.End()
           # My clean-up code
         }
}

What would really help would be something to produce a function like this template, and fortunately it is built into PowerShell. It does the whole thing in 3 steps: get the command to be proxied, get the detailed metadata for the command, and build a Proxy function with the metadata, like this:
  $cmd=Get-command select-string -CommandType cmdlet
  $MetaData = New-Object System.Management.Automation.CommandMetaData ($cmd)
  [System.Management.Automation.ProxyCommand]::create($MetaData)

The last command will output the Proxy function body to the console, I piped the result into Clip.exe and pasted the result into a new definition
Function Select-String { }
And I had a proxy function.

At this point it didn’t do anything that the original cmdlet doesn’t do but that was a starting point for customizing.
The auto-generated parameters are formatted like this
  [Parameter(ParameterSetName='Object', Mandatory=$true, ValueFromPipeline=$true)]
  [AllowNull()]
  [AllowEmptyString()]
  [System.Management.Automation.PSObject]
  ${InputObject},

And I removed some of the line breaks to reduce the screen space they use from 53 lines to about half that.
The ProxyCommand creator wraps parameter names in braces just in case something has a space or other breaking character in the name, and I took those out.
Then I added two new switch parameters -Recurse and -BareMatches.

Each of the Begin, Process and End blocks in the function contains a try...catch statement, and in the try part of the begin block the creator puts code to check if the -OutBuffer common parameter is set and if it is, over-rides it (why I’m not sure) – followed by code to create a steppable pipeline, like this:
  $wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('Select-String',
                                                           [System.Management.Automation.CommandTypes]::Cmdlet)
  $scriptCmd = {& $wrappedCmd @PSBoundParameters }
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

I decided it would be easiest to build up a string and make that into the steppable pipeline . In simplified form
  $wrappedCmd        = "Microsoft.PowerShell.Utility\Select-String "
  $scriptText        = "$wrappedCmd @PSBoundParameters"
  if ($Recurse)      { $scriptText = "Get-ChildItem @GCI_Params | " + $scriptText }
  if ($BareMatches)  { $scriptText += " | Select-Object -ExpandProperty 'matches' " +
                                      " | Select-Object -ExpandProperty 'value'   " }
  $scriptCmd         = [scriptblock]::Create($scriptText)
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

& commandObject works in a scriptblock: the “&” sign says “run this” and if this is a command object that’s just fine: so the generated code has $scriptCmd = {& $wrappedCmd @PSBoundParameters } where $wrappedCmd is a command object.
But when I first changed the code from using a script block to using a string I put the original object $wrappedCmd inside a string. When the object is inserted into a string, the conversion renders it as the unqualified name of the command – the information about the module is lost, so I produced a script block which would call the function, which would create a script block which would call the function which… is an effective way to cause a crash.
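
If you want to see the difference for yourself, something like this sketch shows the two strings side by side. The variable names are mine, and it assumes it is run in a session where Select-String has not already been replaced by a proxy.

```powershell
# Get the command object for the real cmdlet
$cmd = Get-Command Select-String -CommandType Cmdlet

# Dropping the object into a string keeps only the name, as described above
$unqualified = "$cmd"

# Building the name from the object's properties keeps the module prefix,
# which is what stops a proxy function from calling itself
$qualified = "$($cmd.ModuleName)\$($cmd.Name)"
```

Using $qualified (or simply hard-coding the module-qualified name, as I did) when building the script text avoids the endless loop.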

The script above won’t quite work on its own because
(a) I haven’t built up the parameters for Get-ChildItem. So if -recurse, -include or -exclude is specified I build up a hash table to hold them, taking the necessary parameters from whatever was passed, and making sure they aren’t passed on to the Select-String cmdlet when it is called. I also make sure that if a file specification is passed for a recursive search it is moved from the path parameter to the include parameter.
(b) If -recurse or -barematches get passed to the “real” Select-String cmdlet it will throw a “parameter cannot be found” error, so they need to be removed from $psboundParameters.

This means the first part of the block above turns into
  if ($recurse -or $include -or $exclude) {
     $GCI_Params = @{}
     foreach ($key in @("Include","Exclude","Recurse","Path","LiteralPath")) {
          if ($psboundparameters[$key]) {
              $GCI_Params[$key] = $psboundparameters[$key]
              [void]$psboundparameters.Remove($key)
          }
     }
     # if Path doesn't seem to be a folder it is probably a file spec
     # So if recurse is set, set Include to hold file spec and path to hold current directory
     if ($Recurse -and -not $include -and ((Test-Path -PathType Container $path) -notcontains $true) ) {
        $GCI_Params["Include"] = $path
        $GCI_Params["Path"] = $pwd
     }
     $scriptText = "Get-ChildItem @GCI_Params | "
  }
  else { $scriptText = ""}

And the last part is
  if ($BareMatches) {
    # [void] stops the True/False returned by Remove() leaking into the output
    [void]$psboundparameters.Remove("BareMatches")
    $scriptText += " | Select-object -expandproperty 'matches' | Select-Object -ExpandProperty 'value' "
  }
  $scriptCmd = [scriptblock]::Create($scriptText)
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

There’s no need for me to add anything to the process or end blocks, so that’s it – everything Select-String originally did, plus recursion and returning bare matches.
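
To show the string-built steppable pipeline in isolation, here is a cut-down sketch that does only the bare-matches part. It is not the function from my profile: Get-BareMatch and its two parameters are invented for the example, and it assumes the script text resolves @PSBoundParameters in the function's scope in the same way the generated proxy code does.

```powershell
function Get-BareMatch {     # hypothetical name, a much smaller cousin of the real proxy
    [CmdletBinding()]
    param (
        [Parameter(Mandatory=$true, Position=0)] [string[]] $Pattern,
        [Parameter(ValueFromPipeline=$true)]     [psobject] $InputObject
    )
    begin {
        # Build the pipeline as text, using the module-qualified cmdlet name
        $scriptText = 'Microsoft.PowerShell.Utility\Select-String @PSBoundParameters' +
                      ' | Select-Object -ExpandProperty Matches' +
                      ' | Select-Object -ExpandProperty Value'
        $scriptCmd         = [scriptblock]::Create($scriptText)
        $steppablePipeline = $scriptCmd.GetSteppablePipeline($MyInvocation.CommandOrigin)
        $steppablePipeline.Begin($PSCmdlet)
    }
    process { $steppablePipeline.Process($_) }   # hand each piped item to the inner pipeline
    end     { $steppablePipeline.End() }
}

$bare = @('abc123', 'no digits', 'x45' | Get-BareMatch '\d+')
```

$bare ends up holding just the text that matched, rather than the full MatchInfo objects Select-String normally returns.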

I’ve put the whole file on skydrive here

January 28, 2012

Adding Persistent history to PowerShell

Filed under: Uncategorized — jamesone111 @ 7:19 pm

The Doctor: “You’re not are you? Tell me you’re not Archaeologists”
River Song: “Got a problem with Archaeologists ?”
The Doctor: “I’m a time traveller. I point and laugh at Archaeologists”

I’ve said to several people that my opinion of Linux has changed since I left Microsoft: after all I took a job with Microsoft because I rated their products and stayed there 10 years. I didn’t have much knowledge of Linux at the start of that stay and had no call to develop any during it, so the views I had were based on ignorance and supposition. After months of dealing with it on a daily basis I know how wrong I was. In reality Linux is uglier, more dysfunctional and, frankly, retarded than I ever imagined it could be. Neither of our Linux advocates suggests it should be used anywhere but servers (just as well: they both have iPhones and Xboxes, which are about as far from the Open source ideal as it’s possible to go). But compare piping or tab expansion in PowerShell and Bash and you’re left in no doubt which one was designed 20 years ago. “You can only pipe text? Really? How … quaint”.

One of the guys was trying to do something with Exchange from the command-line and threw down a gauntlet.
“If PowerShell’s so bloody good why hasn’t it got persistent history?”
OK. This is something which Bash has got going for it. How much work would it take to fix this ? Being PowerShell the answer is “a few minutes”. Actually the answer is “a lot less time than it takes to write a blog post about it”

First a little side track: various people I know have a PowerShell prompt which looks like
[123]  PS C:\users\Fred>
Where 123 was the history ID. Type H (or history, or Get-History) and PowerShell shows you the previous commands, with their history ID, the command Invoke-History <id> (or ihy for short) runs the command.
I’d used PowerShell for ages before I discovered typing #<id>[Tab] inserts the history item into the command line. I kept saying “I’ll do that one day”, and like so many things I didn’t get round to it.
I already use the history ID: I have this function in my profile
Function HowLong {
   <#
     .Synopsis
        Returns the time taken to run a command
     .Description
        By default returns the time taken to run the last command
     .Parameter ID
        The history ID of an earlier item.
   #>
   param  ( [Parameter(ValueFromPipeLine=$true)]
            $id = ($MyInvocation.HistoryId -1)
          )
   process { foreach ($i in $id) {
                 (get-history $i).endexecutiontime.subtract(
                                 (get-history $i).startexecutiontime).totalseconds
             }
           }
}
Once you know $MyInvocation.HistoryID gives the ID of the current item, it is easy to change the Prompt function to return something which contains it.

At the moment I find I’m jumping back and forth between PowerShell V2, and the CTP of V3 on my laptop
(and I can run PowerShell –Version 2 to launch a V2 version if I see something which I want to check between versions).
So I finally decided I would change the prompt function. This happened about the time I got the “why doesn’t the history get saved” question. Hmmm. Working with history in the Prompt function. Tick, tick, tick. [Side track 2: in PowerShell the prompt isn’t a constant, it is the result of a function. To see the function use the command type function:prompt]
So here is the prompt function I now have in my profile.
Function prompt {
  $hid = $myinvocation.historyID
  if ($hid -gt 1) {get-history ($myinvocation.historyID -1 ) |
                      convertto-csv | Select -last 1 >> $logfile
  }
  $(if (test-path variable:/PSDebugContext) { '[DBG]: ' } else { '' }) +
    "#$([math]::abs($hid)) PS$($PSVersionTable.psversion.major) " + $(Get-Location) +
    $(if ($nestedpromptlevel -ge 1) { '>>' }) + '> '
}

The first few lines are new: get the history ID and, if it is greater than 1, get the previous history item, convert it from an object to CSV format, discard the CSV header and append it to the file named in $logFile (I know I haven’t set it yet).

The second part is lifted from the prompt function found in the default profile, that reads
"PS $($executionContext.SessionState.Path.CurrentLocation)$('>' * ($nestedPromptLevel + 1)) "
It’s actually one line but I’ve split it at the + signs for ease of reading.
I put a # sign and the history ID before “PS” – when PowerShell starts the ID is –1 so I make sure it is the absolute value.
After “PS” I put the major version of PowerShell.
I’m particularly pleased with the #ID part: in the non-ISE version of PowerShell, double-clicking on #ID selects it. My mouse is usually close enough to my keyboard that the keypad [Enter] key is within reach of my thumb, so if I scroll up to look at something I did earlier, one flickity gesture (double-click, thumb enter, right click [tab]) has the command in the current command line.

So now I’m keeping a log, and all I need to do is to load that log from my profile. PowerShell has an Add-History command and the on-line help talks about reading in the history from a CSV file so that was easy – I decided I would truncate the log when PowerShell started and also ensure that the file had the CSV header, so here’s the reading-friendly version of what’s in my profile.

$MaximumHistoryCount = 2048
$Global:logfile = "$env:USERPROFILE\Documents\windowsPowershell\log.csv"
$truncateLogLines = 100
$History = @()
$History += '#TYPE Microsoft.PowerShell.Commands.HistoryInfo'
$History += '"Id","CommandLine","ExecutionStatus","StartExecutionTime","EndExecutionTime"'
if (Test-Path $logfile) {$history += (get-content $LogFile)[-$truncateLogLines..-1] | where {$_ -match '^"\d+"'} }
$history > $logfile
$History | select -Unique |
           Convertfrom-csv -errorAction SilentlyContinue |
           Add-History -errorAction SilentlyContinue

UPDATE: Copying this code into the blog page and splitting the last $history line to fit, something went wrong and the select -unique went astray. Oops.
It’s there because hitting enter doesn’t advance the History count, or run anything, but does cause the prompt function to re-run. Now I’ve had to look at it again it occurs to me it would be better to have select –unique in the (get-content $logfile) part rather than in the Add-history section, as this would remove duplicates before truncating.
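A minimal sketch of that rearrangement might look like the following. It has not been run against a real log. The scratch file in the temp folder is a stand-in so it can be tried safely, and Select-Object -Last is an assumption swapped in for the [-$truncateLogLines..-1] indexing, so treat it as an illustration rather than the profile code itself.

```powershell
# Sketch only: a scratch file stands in for the real log path.
$logfile          = Join-Path ([IO.Path]::GetTempPath()) 'history-sketch.csv'
$truncateLogLines = 100
$header = '#TYPE Microsoft.PowerShell.Commands.HistoryInfo',
          '"Id","CommandLine","ExecutionStatus","StartExecutionTime","EndExecutionTime"'

$history = @($header)
if (Test-Path $logfile) {
    # Keep only data rows, drop repeats FIRST, then keep the newest $truncateLogLines rows.
    $history += @(Get-Content $logfile |
                      Where-Object  { $_ -match '^"\d+"' } |
                      Select-Object -Unique |
                      Select-Object -Last $truncateLogLines)
}
$history > $logfile
$history | ConvertFrom-Csv -ErrorAction SilentlyContinue |
           Add-History     -ErrorAction SilentlyContinue
```

Removing duplicates before the truncation means the repeated lines written by pressing enter on an empty prompt no longer use up part of the 100-line allowance.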

So … increase the history count from the default of 64 (writing this I found that in V3 CTP 2 the default is 4096). Set a global variable to be the path to the log file, and make it obvious what length I will truncate the log to.
Then build an array of strings named history. Put the CSV header information into $history, and if the log file exists put up to the truncate limit of lines into $history as well. Write $history back to the log file and pipe it into Add-History, hiding any lines which won’t parse correctly. Incidentally those who like really long lines of PowerShell could recode all lines with $history in them into one line. So a couple of lines in the prompt function and between 3 and 9 lines in the profile depending on how you write them: all in, it’s less than a dozen lines. This blog post has taken a good couple of hours, and I did the code in 10 to 15 minutes.


Oh, and one thing I really like – when I launch PowerShell –Version 2 inside Version 3, it imports the history, giving easy access to the commands I just used without needing to cut and paste.

If you’re a Bash user and didn’t storm off in a huff after my initial rudeness I’d like to set a good natured challenge: a non-compiled enhancement to bash I can load automatically which gives it tab expansion on par with PowerShell’s (OK, PowerShell has an unfair advantage completing parameters, so just command names and file names). And in case you wondered about the quote at the top of the post, it is from one of Steven Moffat’s Dr Who episodes. You see, “I Know PowerShell. I point and laugh at Bash users.”

December 20, 2011

Free NetCmdlets

Filed under: Uncategorized — jamesone111 @ 9:48 pm
I’ve mentioned the NetCmdlets before. Although not perfect if you spend a lot of your life using PowerShell and various network tools they are a big help. They’ve made a bunch of things which would have been longwinded and painful relatively easy. So here is a mail I have just had from PowerShell inside (aka /n software) . I don’t normally hold with pasting mails straight into a blog post, but you’ll see why if you read on: if you click the link it asks you to fill in some details and “A member of our sales team will contact you with your FREE NetCmdlets Workstation License. (Limit one per customer.)”

A Gift For The Holidays: FREE NetCmdlets Workstation License – This Week Only, Tell a Friend!

Help us spread some PowerShell cheer! Tweet, blog, post, email, or just tell a friend and you can both receive a completely free workstation license of NetCmdlets!

NetCmdlets includes powerful cmdlets offering easy access to every major Internet technology, including: SNMP, LDAP, DNS, Syslog, HTTP, WebDav, FTP, SMTP, POP, IMAP, Rexec/RShell, Telnet, SSH, Remoting, and more. Hurry, this offer ends on Christmas day – Happy Holidays!

NetCmdlets Workstation License:

$99.00 FREE

NetCmdlets Server License:
* Special Limited Time Offer *

$349.00 $199.00

Hurry, this offer ends on Christmas day – Get your free license now!

Happy Holidays!

Or as we say this side of the Pond Merry Christmas !

October 17, 2011

How to be Creative with QR Codes

Filed under: Uncategorized — jamesone111 @ 1:15 pm

I’ve been playing with QR codes recently, and have started to use them. HELLO if you’ve been at The Experts Conference in Frankfurt and scanned one from one of my sessions: the files from these are on my SkyDrive.

I looked at a number of different code libraries which would build a QR code for me but the easiest way turned out to be to use a web service, provided by Google.  And I wrapped this up in a little PowerShell

[System.Reflection.Assembly]::LoadWithPartialName("System.Web") | Out-Null
Function get-qrcode {
param ([parameter(ValueFromPipeLine= $true, mandatory=$true)]
       $Text,
       $path = (join-path $pwd "QRCode.PNG")
)
  if ($WebClient -eq $null) {$Global:WebClient=new-object System.Net.WebClient }
  $WebClient.DownloadFile(("http://chart.apis.google.com/chart?cht=qr&chs=547x547&chld=H|0&chl=" +
                         [System.Web.HttpUtility]::UrlEncode($Text)) , $path)
   Start-Process $path
}

So it takes two parameters, a block of text and a path where the file should be saved. The text must be specified, but there is a default for the path.

To save having to create lots of web client objects, I keep a web client object in a global variable; then one line of PowerShell gets a PNG file from the Google service and saves it to the path.  Finally I launch the file in the default viewer.

The next step is to use some image tools to pretty up the QR code. I still use an ancient version of PaintShop pro and that does the job here nicely.

qr-1

On the left is the original QR code, in the middle I have applied some gaussian blur to the image and on the right I have reduced this to two colours, 100% black and 100% white. This smoothes off the edges. Then I take the image back to 16 million colours and add some colour.

 

qr-2

One of the really nice things about QR codes is that they have error correction built in (and in my call to the web service I specify the maximum amount); this means we can put something which isn’t part of the code into the picture. I’ve used the Frankfurt skyline from the slide template here, but there is scope for creativity.

Of course there is nothing which says the data in the code must be a URL. Here’s a message for anyone who has got a reader. The file name and the Ferrari might be a clue to what it says.

QR-Ferris

April 10, 2011

Ten tips for better PowerShell functions

Filed under: Uncategorized — jamesone111 @ 11:02 pm

Explaining PowerShell often involves telling people it is both an interactive shell – a replacement for the venerable CMD.EXE – and a scripting language used for single task scripts and libraries of re-useable functions. There are some good practices which are common to both kinds of writing  – including comments, being explicit with parameters, using full names instead of aliases and so-on but having written hundreds of “script cmdlets” I have developed some views on what makes a good function which I wanted to share…

1. Name your function properly
It’s not actually compulsory to use Verb-SingularNoun names with the standard verbs listed by Get-Verb. “Helpers” which you might pop in a profile can be better with a short name. But if your function ends up in a module Import-module grumbles when it sees non-standard verbs. Getting the right name can clarify your thinking about what a command should or should not do. I cite IPConfig.exe as an example of a command line tool which didn’t know when to stop – what it does changes dramatically with different switches.  PowerShell tends towards multiple smaller functions whose names tell you what they will do – which is a Good Thing

2. Use standard, consistent and user-friendly parameters.
(a) PowerShell cmdlets give you –WhatIf and –Confirm switches to use before you do something irreversible; you can get these in your own functions. Put this line of code before any others in the function
[CmdletBinding(SupportsShouldProcess=$True)]
and then where you do something which is hard to undo  
If ($psCmdlet.shouldProcess("Target" , "Action")) {
    # dangerous actions go here
}
(b) Look at the names PowerShell uses: “path”, not “filename” , “ComputerName” not “Host”, “Force” “NoClobber” and so on – copy what has been done before unless you have a good reason to do something different; I don’t use “ComputerName” when working with Virtual Machines because it is not clear if it means a Virtual Machine or the Physical Machine which hosts them.
(c) If you are torn between two names: remember that “Computer” is a valid shortening of “ComputerName”, and for alternatives which are not simple shortenings you can define aliases, like this:
[Alias("Where","Include")]$Filter
TIP 1. You can discover all the parameter names used by cmdlets, and how popular they are, like this
get-command -c cmdlet | get-help -full| foreach {$_.parameters.parameter} |
   forEach{$_.name} | group -NoElement | sort count

TIP 2. If you think “Filter” is the right name to re-use you can see how other cmdlets use it like this:
Get-Command -C cmdlet | where { $_.definition -match "filter"} | get-help  -Par "filter" 
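Pulling points (a) and (c) together, here is a minimal sketch. Remove-LogFile is a hypothetical function made up for illustration, as is the “FileName” alias; only the CmdletBinding line, the ShouldProcess test and the alias syntax come from the points above.

```powershell
# Hypothetical function, for illustration only.
Function Remove-LogFile {
    [CmdletBinding(SupportsShouldProcess=$True)]
    param (
        [Alias("FileName")]               # "Path" is the standard name; "FileName" still works
        [String[]]$Path = @(),
        [Alias("Where","Include")]
        [String]$Filter = "*.log"
    )
    foreach ($p in $Path) {
        # Files only; @() keeps the loop safe when nothing matches
        $files = @(Get-ChildItem -Path $p -Filter $Filter | Where-Object { -not $_.PSIsContainer })
        foreach ($file in $files) {
            # -WhatIf and -Confirm are honoured here; nothing is deleted unless this returns $true
            if ($psCmdlet.ShouldProcess($file.FullName, "Delete file")) {
                Remove-Item -LiteralPath $file.FullName
            }
        }
    }
}
```

Calling it as Remove-LogFile -Path $env:TEMP -Where "*.tmp" -WhatIf should list what would be deleted without deleting anything.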

3. Support Piping into your functions.
V2 of PowerShell greatly simplified Piping. The more you use PowerShell the stronger sense you get that the output of one command should become the input for another. If you are writing functions, aim for the ability to pipe into them and pipe their output into other things.  Piped input becomes a parameter, all you need to do is

  • Make sure the parts of the function which run for each piped object are in a
    process {} block
  • Prefix the parameter declaration with [parameter(ValueFromPipeline=$true)].
  • If you want a property of a piped object instead of the whole object, use ValueFromPipelineByPropertyName
  • If different types of objects get piped, and they use different property names for what you want, give your parameter aliases: it will look for the “true” name and, if it doesn’t find it, try each alias in turn.

If you find code that looks like this
something | foreach {myFunction $_ }
It is a sign that you probably need to look at piping.
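The checklist above can be sketched as follows. Test-FilePresent is a hypothetical function invented for the purpose, and the choice of “FullName” and “FileName” as aliases is an assumption about which property names the piped objects might carry.

```powershell
# Hypothetical function, for illustration only.
Function Test-FilePresent {
    param (
        [Parameter(ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)]
        [Alias("FullName","FileName")]    # property names other objects might use for the path
        [String[]]$Path
    )
    process {                             # runs once for each object that arrives
        foreach ($p in $Path) {           # also copes with an array passed as -Path
            New-Object PSObject -Property @{ Path = $p ; Exists = (Test-Path -Path $p) }
        }
    }
}
```

Strings can be piped in directly, for example "C:\Windows","C:\NoSuchFolder" | Test-FilePresent, and so can the output of dir, because the file objects it returns have a FullName property which matches one of the aliases.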

4. Be flexible about arrays and types of parameters
Piping is one way to feed many objects into one command. In addition, many built-in cmdlets and operators will accept arrays as parameters just as happily as they would accept a single object; previously I gave the example  of Get-WmiObject whose –computername parameter can specify a list of machines – it makes for simpler code.
It is easier to use the functions which catch being passed arrays and process them sensibly (and see that previous post for why simply putting [String] or [FileInfo] in front of a parameter doesn’t work).  Actually I see it as good manners – “I handle the loop so you don’t have to do it at the command line”
Accepting arrays is one case of not being over-prescriptive about types: but it isn’t the only one. If I write something which deals with, say, a Virtual Machine, I ensure that VM names are just as valid as objects which represent VMs. For functions which work with files, it has to be just as acceptable to pass System.IO.FileInfo and System.Management.Automation.PathInfo, objects or strings containing the path (unique or wild card, relative path or absolute). 
TIP:  resolve-path will accept any of these and convert them into objects with fully-qualified paths.
It seems rude to make the user use Get-whatever to fetch the object if I can do it for them.
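As a rough sketch of that flexibility, the hypothetical Get-FullPath below leaves its parameter untyped and works out what it has been given. The function name and the three-way test are assumptions made for illustration; only Resolve-Path and the object types come from the text above.

```powershell
# Hypothetical function, for illustration only.
Function Get-FullPath {
    param ( [Parameter(ValueFromPipeline=$true)]
            $Path = $pwd                  # untyped, so strings, objects and arrays are all welcome
    )
    process {
        foreach ($p in $Path) {           # handle the loop so the caller doesn't have to
            if     ($p -is [System.IO.FileSystemInfo])              { $p.FullName }
            elseif ($p -is [System.Management.Automation.PathInfo]) { $p.Path }
            else   { Resolve-Path -Path $p | ForEach-Object { $_.Path } }   # relative or wildcard strings
        }
    }
}
```

With this in place Get-FullPath "*.jpg", Get-FullPath (dir) and dir | Get-FullPath should all give a list of fully-qualified paths.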

5. Support ScriptBlock parameters.
If one parameter can be calculated from another it is good to let the user say how to do the calculation.  Consider this example with Rename-Item. I have photos named IMG_4000.JPG, IMG_4001.JPG, IMG_4002.JPG, up to IMG_4500.JPG. They were taken underwater, so I want them to be named DIVE4000.JPG etc. I can use:
dir IMG_*.JPG | rename-item -newname {$_.name -replace "IMG_","DIVE"}
In English “Get the files named IMG_*.JPG and rename them. The new name for each one is the result of replacing IMG_ with DIVE in that one’s current name.” Again you can write a loop to do it but a script block saves you the trouble.

  • The main candidates for this are functions where one parameter is piped and a second parameter is connected to a property of the piped one.
  • When you are dealing with multiple items arriving from the pipeline, be careful what variables you set in the process{} block of the function: you can introduce some great bugs by overwriting non-piped parameters. For example if you had to implement Rename-Item, it would be valid to handle a string that had been piped in as the –Path parameter by converting it into a FileInfo object – doing so has no effect on the next object to come down the pipe; but if you convert a script block which is passed as -NewName to a String, when the next object arrives it will get that string – I’ve had great fun with the bugs which result from this.
  • All you need to do to provide this functionality is
    If ($newname –is [ScriptBlock]) { $TheNewName = $newname.invoke() }
    else                            { $TheNewName = $newname}

6. Don’t make input mandatory if you can set a sensible default.
Perhaps obvious, but… If I write a function named “Get-VM” which finds virtual machines with a given name, what should I do if the user doesn’t give me a VM name ? Return nothing ? Throw an error ? Or assume they want all possible VMs ?
What would you mean if you typed Get-VM on its own ?

7. Don’t require the user to know too much underlying syntax.
Many of my functions query WMI; WMI uses SQL syntax; SQL Syntax uses “%” as a wildcard, not “*”.  Logical conclusion: if a user wants to specify a wildcarded filter to my functions they should learn to use % instead of *.  That just seems wrong to me: so my code replaces any instance of * with %.  If the user is specifying filtering or search terms, a few lines to change from the things they will instinctively do, or wish they could do, to what is required for SQL, LDAP or any other syntax can make a huge difference in usability.
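As a minimal sketch of the idea, assuming a hypothetical $Name value supplied by the user and a WMI-style LIKE clause being built from it:

```powershell
# Hypothetical values, for illustration only.
$Name   = "Win*"                                    # what the user instinctively types
# "*" is special in a regular expression, so it is escaped as "\*" for -replace
$Filter = "Name LIKE '" + ($Name -replace "\*","%") + "'"
$Filter                                             # Name LIKE 'Win%'
```

The resulting string is the kind of thing that could be handed to the –Filter parameter of Get-WmiObject.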

8. Provide information with Write-Verbose , Write-debug and Write-warning
When you are trying to debug the natural reaction is to put in Write-Host commands, fix the problem and take them out again.  Instead of doing that change $DebugPreference and/or $VerbosePreference and use write-debug / write-verbose to output information. You can leave them in and stop the output by changing the preference variables back. If your function already has
[CmdletBinding(SupportsShouldProcess=$True)]
at the start then you get –debug and –verbose switches for free.
Write-Error is ugly and if you are able to continue, it’s often better to use Write-warning.
And learn to use write-progress when you expect something to spend a long time between screen updates.
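A small sketch of the three kinds of message working together; Get-Total is a hypothetical function made up for illustration, not one of mine.

```powershell
# Hypothetical function, for illustration only.
Function Get-Total {
    [CmdletBinding()]                     # this attribute is what provides -Verbose and -Debug
    param ( [int[]]$Number = @() )
    Write-Verbose "Adding up $($Number.Count) numbers"
    $total = 0
    foreach ($n in $Number) {
        if ($n -lt 0) { Write-Warning "Ignoring negative value $n" ; continue }
        Write-Debug "Adding $n to $total"
        $total += $n
    }
    $total                                # the only thing that goes down the pipeline
}
```

Get-Total 1,2,-3 returns 3 and shows the warning; adding -Verbose shows the extra message without any change to the code, and none of the messages end up in the output that is piped onwards.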

9. Remember: your output is someone else’s input.
(a) Point 8 didn’t talk about using Write-Host – only use it to display something you want to prevent going into something else.
(b) Avoid formatting output in the function, try to output objects which can be consumed by something else.  If you must format turn it on or off with a –formatted or -raw switch.
(c) Think about the properties of the objects you emit. Many commands will understand that something is a file if it has a .Path property, so add one to the objects coming out of your function and they can be piped into copy, invoke-item, resolve-path and so on. Usually that is good – and if it might be dangerous look at what you can do to change it.  Another example: when I get objects that represent components of a virtual machine their properties don’t include the VM name. So I go to a little extra trouble to add it.
Add-Member can add properties or aliases for properties to an object for example
$obj | Add-member -MemberType AliasProperty -Name "Height" -Value "VerticalSize"

10. Provide help
In-line help is easy – it is just a carefully formatted comment before any of the code in your function. It isn’t just there for some far-off day when you share the function with the wider world. It’s for you when you are trying to figure out what you did months previously – and Murphy’s law says you’ll be trying to do it at 3AM when everything else is against you.
Describe what the Parameters expect and what they can and can’t accept. 
Give examples (plural) to show different ways that the function can be called. And when you change the function in the future, check the examples still work.
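For illustration, here is what that comment might look like on a hypothetical Get-Greeting function. The function is invented for the example; the keywords are the standard comment-based help ones.

```powershell
# Hypothetical function, for illustration only.
Function Get-Greeting {
    <#
      .Synopsis
        Returns a greeting for one or more names
      .Description
        Builds the string "Hello, <name>" for each name it is given.
      .Parameter Name
        One or more names. Accepts strings only, either directly or from the pipeline.
      .Example
        Get-Greeting -Name "James"
        Returns "Hello, James"
      .Example
        "Ann","Bob" | Get-Greeting
        Returns a greeting for each piped name
    #>
    param ( [Parameter(ValueFromPipeline=$true)]
            [String[]]$Name = "World"
    )
    process { foreach ($n in $Name) { "Hello, $n" } }
}
```

Get-Help Get-Greeting -Examples should then show both examples, which makes them easy to re-run after the function changes.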

April 9, 2011

Pattern recognition–the human and PowerShell kinds

Filed under: Uncategorized — jamesone111 @ 8:40 pm

Recently BBC’s Top Gear has been promoting the idea that a particular type of obnoxious driver has been replacing the BMWs they traditionally bought with Audis. Chatting to a friend who is a long term Audi customer, and whose household features ‘his’ and ‘hers’ Audis, we came to the conclusion that once you think there is a pattern, you recognise it and your awareness increases – even if in reality it is no more prevalent. I think the same thing happens in IT in general and scripting in particular – it has happened to me recently… when my understanding of regular expressions in PowerShell took a big step forward, and now I’m finding all manner of places where it helps.

I use a handful of basic regular expressions  for things like removing a trailing \ character from the end of a string with something like:
$Path = $Path –replace "\\$" , ""
Many people use –replace to swap text without realising it handles regular expressions – in this case “\” is the escape character in regular expressions, so to match “\” itself it has to be escaped as “\\”. The $ character means “end-of-line” so this fragment just says ‘Replace “\” at the end of $Path – if you find one – with nothing, and store the result back in $Path’.  PowerShell’s –Split operator also uses regular expressions. This can be a trap – if you try to split using “.” it means “any character” and you get a result you didn’t expect:
"This.that" –split "." returns 10 empty strings (the –split operator discards the delimiter); to match a “.” it must be escaped as “\.”. But it’s also a benefit: if you want to split sentences apart you can make “.” and any spaces round it the delimiter, which saves the need to trim afterwards. The –Match operator uses regular expressions too – I worry when I see it used in examples for new users who may use something which parses unexpectedly as a regular expression.
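A short sketch of those points, using made-up sample strings. The [regex]::Escape() call is an addition of mine for the –Match worry rather than something covered above.

```powershell
# Escaping the dot splits on a literal "." and gives the two words
$words = "This.that" -split "\."

# Treating "." plus any spaces round it as the delimiter leaves nothing to trim
$sentences = "First one. Second one .Third" -split "\s*\.\s*"

# [regex]::Escape() makes user-supplied text safe to hand to -match
$pattern   = [regex]::Escape("a.b")      # gives a\.b
$isLiteral = "a.b" -match $pattern       # True
$isFooled  = "aXb" -match $pattern       # False, although "aXb" -match "a.b" is True
```

$words holds “This” and “that”, and $sentences holds “First one”, “Second one” and “Third” with no stray spaces.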

I thought that I knew regular expressions – until thanks to an article by Tome Tanasovski, I found I had missed a big bit of the picture, which meant my understanding was wrong.  I thought that a match meant the equivalent of running a highlighter pen over part of the text and –replace means “take something out and put something else back” – both are usually true but not always. Tome also did a presentation for the PowerShell user group – there’s a link to the recording on Richard’s blog – I’d recommend watching it and pausing every so often to try things out.
Tome showed look-aheads and look-behinds. These say “It’s a Match if it is followed by something”, or “preceded by something” (or not).  This adds a whole new dimension…

A couple of days later I hit a snag with PowerShell’s Split-Path cmdlet. If the path is on a remote machine it might use a drive letter which doesn’t exist on the local machine – and in that situation Split-Path throws an error. But I can use the –Split operator with a regular expression. I want to say “Find a \ followed by some characters that aren’t \ and the end of the string”. Let’s test this:
PS C:\Users\James\Documents\windowsPowershell> $pwd -split "\\[^\\]+$"
C:\Users\James\Documents

As in my first example ‘\\’ is an escaped ‘\’ character, and ‘$’ means “end of line”, ‘[^\\]’ says “Anything which is not the ‘\’ character” and ‘+’ means “at least once”. So this translates as “Look for a ‘\’ followed by at least 1 non-‘\’ followed by end of line”. It’s mostly right but it doesn’t work (yet).
I copied my command prompt so you can see that ‘WindowsPowerShell’ is part of my working directory – but that bit got lost; or to be more precise it was matched in the expression, so –split returned the text on either side of it.
I want to say “Find ONLY a ‘\’ . The one you want is followed by some characters that aren’t ‘\’ and the end of the string but they don’t form part of the delimiter.”  The syntax for Is followed by is “(?=   )” so I can wrap that around the [^\\]+$ part  and test that:
PS C:\Users\James\Documents\windowsPowershell> $pwd -split "\\(?=[^\\]+?$)"
C:\Users\James\Documents
windowsPowershell

Regular-Expressions can turn into a write-only language – easy to build up but pretty hard to pull apart.  At risk of making things worse, not everyone knows that PowerShell has a “multiple = operator”; if you write $a , $b  = 1,2  it will assign 1 to $a and 2 to $b. Since the output of the split operation is 2 items we can try this
PS C:\Users\James\Documents\windowsPowershell> $Parent,$leaf = $pwd -split "\\(?=[^\\]+?$)"
PS C:\Users\James\Documents\windowsPowershell> $Parent
C:\Users\James\Documents
PS C:\Users\James\Documents\windowsPowershell> $leaf
windowsPowershell

The “cost” of using regular expressions is that the term used to do the split is something akin to a magical incantation. The benefit is that the code is a lot more streamlined than using the string object’s .LastIndexOf() and .Substring() methods, its .Length property and some arithmetic to get to the same result. I’d contend that even allowing for the “incantation” the regex way makes it easier to see that $pwd is being split into 2 parts.
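To make the comparison concrete, here is a sketch of both routes side by side. The $path value is a sample standing in for $pwd, and the variable names are only for illustration.

```powershell
$path = 'C:\Users\James\Documents\windowsPowershell'    # sample value standing in for $pwd

# String-method version: find the last "\" and cut either side of it
$cut     = $path.LastIndexOf('\')
$parent1 = $path.Substring(0, $cut)
$leaf1   = $path.Substring($cut + 1)

# Regex version: one split, two results
$parent2, $leaf2 = $path -split '\\(?=[^\\]+?$)'
```

Both routes give C:\Users\James\Documents as the parent and windowsPowershell as the leaf; the pattern is in single quotes here only so that nothing in it can be mistaken for a variable.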
Good stuff so far, but Tome had another trick:  the match that selects nothing and the replace that removes nothing.  That made me stop and redefine my understanding.  Here’s the use case:

Ages ago I wrote about using PowerShell to query the Windows [Vista] Desktop Index – it works just as well with Windows 7.  The zillion or so fields used in these queries have names like “System.Title”, “System.Photo.Orientation” and “System.Image.Dimensions” – I’d type the bare field name like “title” by mistake or waste time discovering whether “HorizontalSize” belonged to System.Photo or System.Image.
It would be better to enable my Get-IndexedFile function to put in the right prefix: but could it be done reasonably efficiently and elegantly?
Here lookarounds come into their own. They let me write “If you can find a spot which is immediately after a space, and immediately before the word ‘Dimensions’ OR the word ‘HorizontalSize’ OR…” and so on for all the Image Fields “AND that word is followed by any spaces and a ‘=’ sign  THEN put ‘System.image.’ at the spot you found”.  With just the first two fieldnames the operation looks like this
-replace "(?<=\s) (?=(Dimensions|HorizontalSize)\s*=)" , "system.image."
                 ^
I have put an extra space in for the spot that will be matched – the ^ is pointing this out, it isn’t part of the code.
“(?<=  )” is the wrapper for the “look behind” operation  (replacing the ‘=’ with ‘!’ negates the expression) so “(?<=\s)”  says “behind this spot you find a space” and the second half is a “look ahead” which says “in front of this spot you find ‘Dimensions’ or ‘HorizontalSize’ then zero or more spaces (‘\s*’) followed by ‘=’ ”. A match with an expression like this is like an I-beam cursor between characters – rather than highlighting some: so the –replace operator has nothing to remove but it still inserts ‘system.image’ at that point. So lets put that to the test.

PS> " horizontalsize = 1024"  -replace "(?<=\s)(?=(Dimensions|HorizontalSize)\s*=)",
                                       "system.image."
 system.image.horizontalsize = 1024

It works !  This whole exercise of writing a Get-IndexedFiles function – which I will share in due course – ended up as a worked example in using regex to support good function design. I’ve got another post in draft at the moment about my ideas on good function design, so I’ll post that and then come back to looking at all the different ways I made use of regular expressions in this one function.

March 26, 2011

F1: The hidden effects of moving wings

Filed under: Uncategorized — jamesone111 @ 10:25 pm

There seem to be divided opinions about the effect of the “Drag reduction system” introduced in F1 this season. The rules are that

  • Drivers can operate a device to lower the effective part of the rear wing, cutting both lift and drag. The wing returns to its original position when the driver applies the brakes.
  • In wet conditions this will be disabled
  • In qualifying the drivers can use this at will
  • In the race it is armed remotely by a system in race control – if the car is close enough to the one in front (the margin will be 1 second to begin with – this may change over the season) at a specific point, the following driver can lower his wing for a specific section of the track – typically the longest straight.

“Push to pass” divides people: we had it in the days of turbo engines: in the 1980s we had qualifying engines which wouldn’t last a race distance; the boost button in a race gave a burst of similar power – for a sustained period it was a case of “The engines cannae’ take it”, nor could the tyres, and fuel would run out. We had it when KERS first appeared; “Kinetic Energy Recovery Systems” are currently a gimmick: energy stored, rate at which it is returned (Power) and time over which the return can take place are all constrained. F1 talks about being greener; removing the limits on KERS would be an obvious way, and I’d have it feeding extra power in when the driver applied full throttle. Now we have it with wings.

Predicted effect 1. Last use wins. IF it turns out to make passing easy then if two cars are evenly matched, drivers won’t want to be re-passed, so they will time their passing move to use the wing at the last moment

Predicted effect 2. More tyre stops. There was always a decision to make: sacrifice position on-track by making a stop for fresh tyres – or hold out ? The harder it is to overtake the bigger the advantage of fresh rubber needs to be before stopping becomes the preferred option – because as Murray Walker always used to say “Catching is one thing, passing is quite another”.  So picture the scene: with a dozen or so laps to go the first two cars have been on hard tyres for a good few laps and the leader is being caught; thanks to DRS the 2nd place car gets past. The former leader’s car is fractionally slower but on fresh soft tyres could go 2-3 seconds a lap quicker – enough to catch the 20-30 seconds a pit stop takes with a couple of laps to go. Most of the tyre advantage will have gone by the time he has caught up: previously it would have been easy for the new leader to defend for the last couple of laps. Now if the chasing car can get within DRS distance he should be able to make a last gasp pass.  In the wet, inspired changes of tyres win races – it didn’t really happen in the dry – until now.

Predicted effect 3. The return of slipstreaming. The FIA banned slipstreaming… OK, not as such. Imposing an 18,000 Rev limit banned it. How so ?  Without a limit on revs, in top gear, revs and speed increase until the acceleration force coming from the engine matches the retardation force from friction and aerodynamic drag.  Reduce drag by slipstreaming and top speed and engine revs will increase. But what if gear ratios are optimised to get the best lap time with no slipstream (in qualifying) – hitting the maximum Revs as the driver hits the brake at the fastest point ? If revs are limited the car won’t go any faster with the aid of slipstream.

With the ability to use DRS in qualifying, the optimum is to hit 18,000 in low-drag trim at the fastest point. The teams can’t change gear ratios after qualifying and in the race the cars will be in high-drag trim most of the time – so they won’t be reaching 18,000 revs and will have a margin for slipstreaming.

Predicted effect 4. Race pace trumps grid place. Grid penalties become less effective. The advantage of starting ahead of a car which is faster than yours, or the disadvantage of starting behind a slower car, varies with the difficulty in passing. Since the car can’t be reconfigured after qualifying, making overtaking easier might mean car set-up is tilted more towards race configuration than qualifying. It also means the cost of taking a penalty for a precautionary gearbox change (say) is smaller.

Whether or not any of these things happen remains to be seen. Still: fun season in prospect.

February 11, 2011

Elop and Ballmer : it’s the Ecosystem . Does everyone get it now ?

Filed under: Uncategorized — jamesone111 @ 6:07 pm

As the dust starts to settle after the announcement of the partnership between Microsoft and Nokia, the question has come from more than one quarter “what about the other handset vendors ?” 
There are 4 of them today. HTC has 4 phones out (the HD7, Mozart, Trophy and Surround) with a 5th, the “Pro” due imminently; it makes more than a dozen Android devices. LG has one phone, the Optimus 7, with a pro version in the pipeline, and according to Wikipedia makes 8 Android devices.  Samsung has the Omnia 7 and (if Wikipedia is to be believed) makes 9 Android devices. Dell is the last of the first wave Windows Phone 7 manufacturers and it too has an Android device. Nokia is the first phone maker to go with Windows Phone 7 who didn’t also go with Android. The initial “gang of four” can’t accuse Microsoft of infidelity when they had not been exclusive themselves; one must expect Microsoft to sell operating systems to all comers. The four also know that once they were eight: Garmin-Asus, HP, Toshiba and Sony-Ericsson were with them at launch. HP’s plans changed and the other three went quiet. Some drop out, some join, and so the world turns. I suspect their first thought is that if the decline in Nokia’s share comes to a halt, there will be less easy business to pick up.

You can watch Stephen Elop and Steve Ballmer’s press conference here. Elop’s opening remarks echoed something from his burning platform memo: “The game has changed from a battle of devices to a war of ecosystems.”  Ecosystem was the word of the day*, I stopped counting the number of times it was used. It’s not about Nokia’s Windows Phone 7 device against Dell’s, against HTC’s, against LG’s, against Samsung’s. It’s about Microsoft’s ecosystem – with phones by Dell, HTC, LG, Nokia, Samsung, and whoever else, against Google’s ecosystem with phones by Dell, HTC, LG, Samsung, Motorola, Sony-Ericsson, Old Uncle Tom Cobley and all, against the Apple and Blackberry ecosystems.

About 25:15 into the press conference someone asked Steve Ballmer about the other handset makers. He replies “The overall development of critical mass in Windows Phone 7, from other manufacturers, from the chipset community, is important to both of us – despite the fact that obviously Nokia wants to sell all the Windows phones it can”. Elop chimes in “This is an important thing for people to think about: Our number one priority is the success of the Windows Phone Ecosystem, in which Nokia is participating, so it is to our benefit to get that critical mass and virtuous cycle going which includes work done by some of our handset competitors. We will encourage that, that’s a good thing.” In other words: we can’t prosper unless the ecosystem prospers.

The next question Elop gets is “Why Microsoft not Android?” and his answer is interesting: “What we assessed was 3 options… internal: MeeGo, Symbian and so-forth … the concerns about whether we could quickly enough develop a third ecosystem without the help of a partner like Microsoft … made that option concerning, absolutely concerning.” (Lovely way of putting it.) “We explored the opportunity with the Google ecosystem … our fundamental belief is that we would have difficulties differentiating within that ecosystem – if we tipped over into the Android ecosystem, and there was a sense that was the dominant ecosystem at that point, the commoditisation risk was very high: prices, profits, everything being pushed down, value being moved out to Google essentially.”

That begs a bunch of follow-up questions. “You don’t think you can build an ecosystem with MeeGo but HP think they can with WebOS: do you think they should be more … concerned?” “Is the Android handset market in a dash to the bottom or are you saying Nokia joining that market would have made it so?” “How will Nokia differentiate when Microsoft has worked to give a consistent experience over all the Windows Phone 7 devices?” “Why won’t value get moved out to Microsoft? (Isn’t that the danger if you succeed and Microsoft’s becomes the dominant ecosystem?)”
The financial question did get asked, and half answered: Nokia will pay royalties on the OS, and Microsoft will buy services from Nokia to strengthen the ecosystem. Who knows if Google would have offered them the same? Connections have little way to add value, and the opportunities for the handset makers are reducing. Pushing stuff to the handsets is where the opportunity is today. I haven’t made a lot of purchases since I got my HTC Trophy, but Microsoft take 30% of software sales from Marketplace and if the margin on music is the same they’ve made £10 out of me in 3 months; that’s £80 over the two year life of a phone. I never had much idea what Microsoft charged to put its software on a phone (you can buy Windows 7 Home Premium from a PC builder for £62 + VAT, so Windows Mobile 6.x / Windows Phone 7 netting more than £80 doesn’t seem plausible). A year ago Microsoft made zero once a phone had been sold; now the revenue is reaching the point where a token license fee is feasible – I can’t see Microsoft letting go completely. And of course the biggest ecosystem by unit volume is an OS given away to sell advertising. You can see Nokia wanting to be in an ecosystem where it is not just a handset maker.

 

* I nearly used “It’s the ecosystem stupid”, or “Our priorities are ecosystem, ecosystem, ecosystem” for a title.

February 10, 2011

IE9 and getting to grips with privacy

Filed under: Uncategorized — jamesone111 @ 10:04 pm

I wrote a post a few weeks ago entitled “One small step for IE9, one giant leap for privacy”. With the arrival of the IE9 release candidate things have changed a little bit: the only words I can repeat from my initial reaction are STUPID and BROKEN, but whilst things are not quite as I would have liked they are nowhere near as bad as I first feared.

First the bad….

  1. You can no longer import/export block/allow lists. Previously you could export the list which IE built up, hand edit the XML and re-import it. This was tedious to do and only worked at very small scale.
  2. The XML format announced by Dean Hachamovitch, Corporate Vice President, Internet Explorer, in December – which was the self same format used in import and export – appears to have been abandoned. The new format is a text file, but doesn’t use the de-facto standard (adblocker for Firefox).
  3. Microsoft was previously accused – by the Wall Street Journal no less – of sabotaging the protection offered by InPrivate Filtering. When people take a good look at the lists Microsoft is promoting there may be fresh accusations that they are poor filleted things.
  4. The auto-blocking is a bit heavy handed – on my system it blocked one of WordPress’s stylesheets. I can find the offending block and make it an allow, but most users won’t be able to and will turn their own personal block lists off.

Let’s expand on that third point. I can’t see any sense in changing from the XML format to a text file but not allowing people to import the lists used by the ad-blocker add-on for Firefox. If you want to believe in a conspiracy, then Microsoft is too cosy with advertisers. Microsoft’s page actually has an entry from EasyList described thus: “EasyPrivacy Tracking Protection List is based on the popular EasyPrivacy subscription for Adblock Plus and is managed by the well-known EasyList project”. Quite true, and there are 2,581 lines of it. But the main EasyList, which blocks 10,000 plus sites, is not available (though it shouldn’t take long to convert it).

So the good.

  1. It only takes a couple of minutes with PowerShell to get the XML files into the new text format. Since my old list blocked or allowed whole domains, and used the same text as a description as it had in the “block regex” or “allow regex”, I just needed to see if a row was a block or an allow, and prefix the domain name with “-d” or “+d” respectively, converting “\.” to “.” in the description as I went.
    $x=[xml](get-content C:\users\Public\Documents\export.xml | %{$_ -replace "item","rssItem" })
    $x.rss.channel.rssItem | where {$_.blockregex}  | % {"-d " + $_.description -replace "\\\.","."} | clip

    $x.rss.channel.rssItem | where {$_.allowregex}  | % {"+d " + $_.description -replace "\\\.","."}  |clip
    The first line reads the XML; the second and third each place their text in the clipboard, so I can paste it into Notepad, in a file which starts
    msFilterList
    : Expires=5

    The first line is obvious and the second is the number of days to wait between updates. Simples. I could have done it all in PowerShell without bothering with Notepad.

  2. It’s pretty easy to put an “Add my TPL” link on a page; it goes
    <a href="javascript:window.external.msAddTrackingProtectionList('http://server.MyDomain.com/myFileText', 'Description')">Add TPL</a>
    I couldn’t get this to work with file:// URLs or SkyDrive, and WordPress filters out that kind of link. Grrr.
  3. ActiveX blocking. Read “Flash blocking”. This is new for RC. Regular readers will know I hate Flash, which wants to pull my attention away from the content I’m trying to read. No ActiveX means no Flash. But sometimes you need Flash, and turning it on and off is a pain. Now you get this:
    image
    I’ve circled the “content blocked” icon on the address bar. Click it and you can turn off what you want – and it seems to apply to that site only, but on all future visits – though you can turn it back on.
    image
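Going back to the first of the good points, the whole conversion, header lines included, can be sketched in PowerShell so that Notepad isn’t needed. This is only a sketch under assumptions: the function name ConvertTo-TplText, its parameters and the sample XML are invented for illustration, and the element names (description, blockregex, allowregex) are guessed from the property names used in the one-liners above, so a real export may need adjusting.

```powershell
# Sample data standing in for the exported file (hypothetical; element names are
# assumed from the property names used in the one-liners above)
$sample = @'
<rss><channel>
  <item><description>ads\.example\.com</description><blockregex>ads\.example\.com</blockregex></item>
  <item><description>cdn\.example\.org</description><allowregex>cdn\.example\.org</allowregex></item>
</channel></rss>
'@

function ConvertTo-TplText {
    param ([Parameter(Mandatory=$true)][string]$XmlText,
           [int]$ExpiresDays = 5)
    # Same crude rename of "item" to "rssItem" as in the one-liners above
    $x = [xml]($XmlText -replace "item","rssItem")
    # The two header lines every list file starts with
    "msFilterList"
    ": Expires=$ExpiresDays"
    foreach ($row in $x.rss.channel.rssItem) {
        # Turn the escaped "\." of the regex text back into a plain "."
        $domain = $row.description -replace "\\\.","."
        if     ($row.blockregex) { "-d $domain" }   # block the domain
        elseif ($row.allowregex) { "+d $domain" }   # allow the domain
    }
}

ConvertTo-TplText -XmlText $sample
```

To work on a real export, something like ConvertTo-TplText -XmlText ((Get-Content C:\users\Public\Documents\export.xml) -join "`n") | Set-Content MyList.txt should write the list file in one go (the output file name is, again, just an example).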

February 9, 2011

On outsiders and burning platforms.

Filed under: Uncategorized — jamesone111 @ 6:06 pm

 

The only impression Stephen Elop left on me in his time at Microsoft was that his speech at Tech-ed  in Berlin on the 20th Anniversary of the fall of the wall had been seen as boring. What is already being called  his “Burning platform” memo  pretty much kills off any reputation for dullness. Wake up calls to companies rarely get openings of this drama:
“There is a pertinent story about a man who was working on an oil platform in the North Sea. He woke up one night from a loud explosion … He could stand on the platform, and inevitably be consumed by the burning flames. Or, he could plunge 30 meters in to the freezing waters. The man was standing upon a “burning platform,” and he needed to make a choice. …. We too, are standing on a “burning platform””

Engadget says Elop is “neither Finnish, nor raised in the Nokia system” and  I’ve had a couple of conversations recently about incomers to companies.  You’d expect Microsoft to end up full of “Microsoft people”,  Ford to be full of “Ford people” and Nokia full of “Nokia people”. If you can get past people’s tendency to hire in their own image and companies being most attractive to people like the ones they already have – even then people adopt the ways of a community they belong to – which could be business, academic, government, even monastic. It was someone writing about the Atkins diet who introduced me to an idea from Thomas Kuhn’s “The structure of scientific revolutions”: new ideas must come from outside the community (Atkins was a cardiologist, not a dietician); precisely because being part of the community means buying into its ideas. People raised inside a system don’t change it: change comes from outside.

The people who appointed Elop would have known that.  It’s likely that Elop had reached the conclusion “If we jump we may drown, if we don’t we will burn” before he took the job, and would have shared it. Now he’s told a wider audience there is no other choice “we have multiple points of scorching heat that are fuelling a blazing fire around us.”

He says the iPhone has 61% of the $300+ smartphone market and Nokia “still don’t have a product that is close to their experience.”  He says  “Android came on the scene just over 2 years ago, and this week they took our leadership position in smartphone volumes. Unbelievable”.  And at the low end of the market “Chinese OEMs are cranking out a device much faster than … ‘the time that it takes us to polish a PowerPoint presentation’.”  
The problems aren’t just external: Nokia is “not even fighting with the right weapons.” He says: “We thought MeeGo would be a platform for winning high-end smartphones. However, at this rate, by the end of 2011, we might have only one MeeGo product in the market.” We thought, past tense? If you don’t think it will be that platform then what do you do with it? He says Symbian “has proven to be non-competitive … and also creating a disadvantage when we seek to take advantage of new hardware platforms.” Symbian, like Windows Mobile 6.x, was designed for the pre-iPhone market; they’re cold-war products in a post-Berlin-Wall world.

And it’s to metaphors of warfare that Elop turns to explain the jump Nokia is expected to announce on Feb 11th. “The battle of devices has now become a war of ecosystems, where ecosystems include not only the hardware and software of the device, but developers, applications, ecommerce, advertising, search, social applications, location-based services, unified communications and many other things. Our competitors aren’t taking our market share with devices; they are taking our market share with an entire ecosystem. This means we’re going to have to decide how we either build, catalyse or join an ecosystem.”

Putting the building option to one side, which ecosystem will they catalyse or join? The language seems to rule out selecting more than one. It’s probably safe to assume that Apple aren’t looking to license their OS, ditto RIM. Palm’s new WebOS doesn’t have much of an ecosystem. A Google VP appeared to rule out Android by tweeting ‘#feb11 “Two turkeys do not make an Eagle”.’ The date appears to suggest Nokia and someone else. And I seem to have crossed everyone but Microsoft off the list.

Microsoft published Steve Ballmer’s mail from September 9th last year announcing Elop was off. Maybe “I… look forward to continuing to work with him in his new role at Nokia.” is just what you say as CEO when one of your people becomes CEO somewhere else? But if on Friday, 155 days later, Steve and Stephen make a joint announcement someone is bound to ask if 155 days is long enough to do much more than polish a two-CEO PowerPoint presentation.

February 7, 2011

How to read Excel files in PowerShell.

Filed under: Uncategorized — jamesone111 @ 2:44 pm

I picked up a post of Lee Holmes’ entitled Does Your Hard Work Advance the Ecosystem; which  joins a discussion that started with what Lee terms a “fairly chewy piece” on the Lync PowerShell blog. That was about reading stuff from the cells of an Excel spreadsheet to set up users. Over here Jason basically says “very clever and all that, but wouldn’t you just convert to .CSV ?”  Lee’s piece is worth reading in full, but the key points are

  1. Some people will do things the hard way.  A few lines of PowerShell can replace hundreds in something like C#. [Side note: I’ve been working in C# recently and I know this only too well]. The experience of some hard-core programmers stops them imagining an easy way.
  2. There isn’t anything built into PowerShell to read XLS / XLSx files, so showing people how Excel can be used does move things forward. And PowerShell is about automating: starting a process by manually loading files into Excel and converting isn’t ideal.
  3. It is easy to write up and share what you did for a specific case. But a generic solution is better – providing an “Import-XLS” command is more useful than CreateLyncUsersFromXLS.PS1

I have to agree with both Jason and Lee – one probably would load the file in Excel, and that isn’t the ideal. I’ve been manipulating the Office object model on and off since the early 1990’s so I really get Jason’s point that there is more to learn there than makes sense for an admin to pick up to get round the problem of people not saving in .CSV format when asked.

So here, for any admin confronted with that problem, is ConvertFrom-XLx – it accepts a path to a file, or a file object from the pipeline (so dir *.xlsx | convertFrom-XLx will work). It checks that the file exists, and that its name ends with .XL<something> – that’s the little regular expression used with the -match and -replace operators. The -replace puts .CSV into the save name and the file is then opened, saved as CSV (that’s type 6 in a SaveAs operation), and closed (false says don’t stop to ask the user anything on closing).
Finally if a –PassThru parameter is specified the CSV file can be read in and passed to the next step in a pipeline.  

function ConvertFrom-XLx {
  param ([parameter(             Mandatory=$true,
                         ValueFromPipeline=$true,
           ValueFromPipelineByPropertyName=$true)]
         [string]$path ,
         [switch]$PassThru
        )

  begin { $objExcel = New-Object -ComObject Excel.Application }
Process { if ((test-path $path) -and ( $path -match "\.xl\w*$")) {
              $path        = (resolve-path -Path $path).path
              # "\." matches a literal dot, so only a real .xl* extension is swapped for .csv
              $savePath    = $path -replace "\.xl\w*$",".csv"
              $objworkbook = $objExcel.Workbooks.Open( $path)
              $objworkbook.SaveAs($savePath,6) # 6 is the code for .CSV
              $objworkbook.Close($false)
              if ($PassThru) {Import-Csv -Path $savePath }
          }
          else {Write-Host "$path : not found"}
        }
   end  { $objExcel.Quit() }
}
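To see what “the little regular expression” does without starting Excel, here is a small sketch which applies the same -match and -replace to a few made-up file names. The names are hypothetical, and the pattern is written with the dot escaped (\.) so that it only matches a real extension.

```powershell
# Hypothetical file names, used only to show what the pattern does
$names = "C:\data\users.xlsx", "C:\data\old.xls", "C:\data\notes.txt"

$results = foreach ($name in $names) {
    # Anything ending .xl<something> gets a .csv save name; the rest is skipped
    if ($name -match "\.xl\w*$") { $name -replace "\.xl\w*$", ".csv" }
    else                         { "$name : skipped" }
}
$results
```

The two Excel names come back with .csv on the end and the text file is reported as skipped, which is the same decision the function makes before it asks Excel to open anything.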

February 3, 2011

How to get files from a Windows Phone 7 App to a PC

Filed under: Uncategorized — jamesone111 @ 12:31 pm

In my review of Windows phone 7 I said that a lot of things I wanted to see changed in future could be summarized as “make it easier to get stuff on and off the phone”. If you are a user of phones I’m afraid this article won’t be of any more than academic interest, but if you write apps for the phone it is something you may want to know, especially if your background does not include Windows Communication Foundation (WCF).

An earlier post had said my new Windows 7 phone was Not a “Storage device” – that the Zune software syncs music, pictures and video [only] without exposing the file system. The memory card is non removable, and there is no API for developers to upload to Windows Live. If you start writing for the phone you quickly find there is an “E-Mail compose task”, but it does not let you add files from your app as attachments. If you have a small amount of text to send you can put it in the body of the message. I found that the following code works …

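EmailComposeTask ect = new EmailComposeTask();
ect.Subject = "Data from  Windows phone";
// appStorage is the app's isolated storage, obtained elsewhere in the app
using (StreamReader sr = new StreamReader(
               appStorage.OpenFile("myFile.Txt", System.IO.FileMode.Open)))
{
  string body = sr.ReadToEnd();
  // Stay under the 32 KB limit; Substring throws if the text is shorter than the length asked for
  ect.Body = (body.Length > 30000) ? body.Substring(0, 30000) : body;
}
ect.Show();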

But it’s unsuitable for anything but text and, worse, if you want to send more than 32 KB it will crash.

image

.NET experts might say “just look at System.Net.Mail”, but the compact framework on the phone doesn’t support the mail parts of System.Net, or the sockets parts (in case you harbour visions of connecting to Port 25 and hand cranking through SMTP – an old party piece of mine), it doesn’t even support ftp. If your app wants to get data in or out it must be over http… 

There are 3rd party storage services on the internet, but very few of them publish an API (Microsoft don’t publish the API for SkyDrive, and most of the other free storage providers follow suit). As far as I can find, no one yet has implemented something which can be simply added to an app to give “upload file” functionality. (No sooner will I push the publish button on this post than someone will have.) Coding for the APIs that I was able to find would add a lot of complexity, but it was also in the back of my mind that I don’t want people who use my app to be required to sign up with a new storage provider. More to the point I want the data on my PC, not in the cloud, so maybe if I could code something to do that … Everything I read said “use WCF” but assumed pre-existing knowledge beyond what I had. So this is a crash guide to WCF for one purpose – moving files – and if you are new to WCF you only need to know that:

  • It allows clients to call methods/functions/procedures – “operations” on a server (PC), your work goes into writing the operations, not hooking them up to the network.
  • It works over HTTP so the phone can be a client
  • Access to the server is via the SOAP protocol, which allows Visual Studio to add “proxy classes” which handle accessing the remote operations, so essentially you say “Make a connection”, “call this”, “call that” without worrying that what you are calling is remote.

I am going to implement a very simple service, in the simplest possible way. It is going to be a Windows Command Line program, modelled on an example found on MSDN. To set itself up to listen on a particular port it needs to be run as administrator (you can read more about why, and how to avoid this requirement but it is beyond the scope of this post). This means to start debugging it Visual Studio needs to be run as Administrator, and the finished program needs to be Run As Administrator.

The service takes ONE kind of request, which contains a file name and text to be written to the end of the file (if the file doesn’t already exist, it will be created). That’s it; the following are left as exercises for the reader:

  • Any kind of security or authentication. If you leave the server part running and someone accesses it, they can add or change files on your computer.
  • Any handling of “state”,  beyond rudimentary flow control.
  • Handling non-text files.

If you are trying to do this yourself, it can be done in VB (see the MSDN example) but here is a step by step tutorial in C#

  • Start by creating a new project in Visual Studio, use a Windows Console Application  project
  • Add a reference to System.ServiceModel.
  • You should have one file open (program.cs), add to the using statements,
      using System.ServiceModel;
      using System.ServiceModel.Description;
  • as the first lines in your namespace add
    [ServiceContract]
    public interface Itransfer {
             [OperationContract]
            string NewOrAppendString(string fileName, string content);     

    }
  • This defines the service interface clients will be able to connect to and the operations it can perform.
    You can see there is one service and one operation, named NewOrAppendString, which takes two arguments – strings named filename and content, and returns a string as a result.
  • Next add a class which will implement Itransfer. The method(s) just defined must have matching one(s) in this class, like so:
    public class Transfer : Itransfer
        { public string NewOrAppendString(string fileName, string content)
            { try
                { using (var file = System.IO.File.Open(
                      Environment.GetFolderPath(Environment.SpecialFolder.Desktop) +
                       @"\" + fileName, System.IO.FileMode.Append))
                    {using (var sw = new System.IO.StreamWriter(file))
                       {sw.WriteLine(content); }
                    }
                   return "OK";
                }
                catch { return "failed"; }
           }
    }
  • That’s the service defined; notice the file is being saved to the desktop – obviously the program could be more polished. The same applies to the following, which will start the service. Visual Studio should have given the Program class a main procedure, so inside static void Main(string[] args) we can insert:
    string endpoint      = "http://localhost:31365/Transfer/Service";
    Uri baseAddress      = new Uri(endpoint);
    ServiceHost Selfhost = new ServiceHost(typeof(Transfer), baseAddress);           
    try  {
               Console.WriteLine("Attempting to start listening on " + endpoint);
               Selfhost.AddServiceEndpoint( typeof(Itransfer),
                                            new BasicHttpBinding(), "Transfer");
               ServiceMetadataBehavior smb = new ServiceMetadataBehavior();
               smb.HttpGetEnabled          = true;
               Selfhost.Description.Behaviors.Add(smb);
               Selfhost.Open();
               Console.WriteLine("Press <ENTER> to terminate service.");
               Console.ReadLine();
               Selfhost.Close();
            }
    catch (CommunicationException ce) {
               Console.WriteLine("An Exception occurred : {0}", ce.Message);
               Selfhost.Abort();
    }

  • This is not hard to decipher: the MetaDataBehavior is set to tell the service to publish its description – without that Visual Studio can’t create the proxy class. I’m using a fixed port, and in a polished program we’d let the user specify it, we’d display the IP address to connect to and so on. A “production” version might be more sophisticated in a lot of ways, and obviously you can offer more operations: Want to remotely control a program? Send a query to a database? If you can write a method for it, it should be possible (though you should read about the need to serialize data types). Powerful stuff.
  • The next step is to start the program. Copy the end point to the clipboard and paste it into the address box in Internet Explorer. You should get the metadata description page, which tells you a little about the service. Try replacing localhost with the IP address of your computer and test that, then try that URL in IE on your phone / phone emulator.
  • You will probably find that Windows firewall blocks the connection from a phone, but allows it from the emulator (or anything else running on the same PC). If this happens start the Windows Firewall management console, click “Inbound Rules” (top left), and then “New Rule” (top right). Specify “port” in the wizard; on the next page select TCP and enter the port number (31365 in my example code); select “allow the connection” on the next page; select all the networks on the one after that; and on the last page give the rule a name like “phone transfer”. Unless you can get the page in IE the rest will not work, so don’t underestimate the importance of this step.
  • Now you can add a service reference to a project in Visual Studio; if you are trying to copy my steps here the simplest way is to create a new phone application. In Solution Explorer, right click and add a new Service Reference. Paste the URL into the address box and press GO. In a moment you should get a message “1 service(s) found”.

    The one service we have is named “transfer” and if you expand that you’ll see that it implements Itransfer which has an operation named NewOrAppendString. You can rename the namespace at the bottom; I changed mine to “Transfer”.

  • With the service reference in place it is possible to define a variable as a member of the phone app’s main page
    Transfer.ItransferClient client;
  • Then I use it in some code attached to a button in my phone app
    string addrText = "http://192.168.1.100:31365/Transfer/Service/Transfer";
    client = new Transfer.ItransferClient("BasicHttpBinding_Itransfer", addrText );
    client.NewOrAppendStringCompleted +=
                 new  EventHandler< Transfer.NewOrAppendStringCompletedEventArgs >
                        (client_NewOrAppendStringCompleted);
    client.CloseCompleted +=
                 new EventHandler<System.ComponentModel.AsyncCompletedEventArgs>
                        (client_CloseCompleted);
    String mytext = "here is some text" + Environment.NewLine ;
    client.NewOrAppendStringAsync("Test.txt", mytext );
    client.CloseAsync();
  • I’m changing the address of the service from “localhost” to my machine’s IP address, and normally we’d have a mechanism for this information to be put into the program. The address is a bit hard to discover: it is the address specified when the service reference was added ( http://192.168.1.100:31365/Transfer/Service/ ), with the service name (Transfer) appended to the end. If you look back at the console program you can see where that name is specified. BasicHttpBinding was specified for the end point in that program too, and the end point name used in the next line is the binding name joined to the interface name with an underscore.
  • Client has a method to call NewOrAppendString asynchronously. It will then raise an event to say the call has completed (the same applies to closing the client connection), so we add a couple of procedures to handle these events, for now they can use MessageBox.Show to say they have been called.
  • At this point it is necessary to break the app. Stop the service, then run the app and click the button and you will get an error in the code generated by adding the service reference.
    public string EndNewOrAppendString(System.IAsyncResult result) {   
        object[] _args = new object[0];
        string _result = ((string)(base.EndInvoke("NewOrAppendString", _args, result)));
        return _result;}
  • The simplest thing is to wrap a try , catch round this so that a failure does not cause the program to die (rather charmingly the phone tidies up after a crash so you don’t see the error, the program just goes away). We can look at the result in the event handler and see if we had an error. 
    public string EndNewOrAppendString(System.IAsyncResult result) {
        object[] _args = new object[0];
        string _result = "";
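        try { _result = ((string)(base.EndInvoke("NewOrAppendString", _args, result))); }
        catch { }   // a failed call leaves _result empty, which the caller treats as failure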
        return _result;
    }
  • Now if you run the app again the call will come back without crashing and you can look for success and assume failure if you don’t explicitly get success. So now you can start the command line program and try again. If all is well a file should appear on your desktop.
  • This is all grand, but you’ll find when you want to send files that there is a limit to how big the request can be: I send my data when it exceeds 4000 bytes to remain inside this limit. You can also find out for yourself (by modifying this code to send the numbers from 1 to 1000 as separate requests) that there is no guarantee that requests will be processed in the order they are sent. My solution is to only send the next block of a file when I have had the OK returned from the current one. I define a string, transferFileName, and a stream reader, transferFileReader, at the Page level in the phone app along with transferClient; then I have the following code for my button
    private void btnSend_Click(object sender, RoutedEventArgs e){
        transferFileName    = filesList.SelectedItem.ToString();
        transferFileReader  = new StreamReader(appStorage.OpenFile(transferFileName ,
                                                       System.IO.FileMode.Open));
        string myText        = transferFileReader.ReadLine();
        string addrText      ="http://192.168.1.100:31365/Transfer/Service/Transfer";
       transferClient          = new Transfer.ItransferClient("BasicHttpBinding_Itransfer",
                                                       addrText);
       // Hook the events on transferClient, the page-level client created just above
       transferClient.NewOrAppendStringCompleted +=
                 new  EventHandler< Transfer.NewOrAppendStringCompletedEventArgs >
                        (client_NewOrAppendStringCompleted);
       transferClient.CloseCompleted +=
                 new EventHandler<System.ComponentModel.AsyncCompletedEventArgs>
                        (client_CloseCompleted);
        transferClient.NewOrAppendStringAsync(transferFileName , myText);
    }
  • This is much the same as before except I’m reading from a file and the name for the destination and reader object are accessible to other methods. Astute readers will also have spotted that I only send one line of the file. That’s it. Nothing can happen until the completed event fires and its method gets called, which looks like this :
    void client_NewOrAppendStringCompleted(object sender,
                                         NewOrAppendStringCompletedEventArgs e)
    {
        if (e.Result.ToUpper() == "OK")
            {  string myText = "";
                while ((myText.Length < 4000) && (!transferFileReader.EndOfStream))
                    { myText += transferFileReader.ReadLine() + Environment.NewLine;}
                if (myText != "")
                   { transferClient.NewOrAppendStringAsync(transferFileName, myText); }
            }
        if ((transferFileReader.EndOfStream) | (e.Result.ToUpper() != "OK"))
            {  transferClient.CloseAsync(); }
        if (e.Result.ToUpper() != "OK")
            { var foo = MessageBox.Show("An error occurred in the transfer"); }
    }

    void client_CloseCompleted(object sender,
           System.ComponentModel.AsyncCompletedEventArgs e)
    {  transferFileReader.Close(); }

  • So if the result is “OK” we read from the file until we have 4K of data or hit the end of the file. We send the text if there is any (we’ll immediately hit the end of file if we got the OK back from the very last block in the file, and we want to avoid an infinite loop). We close the client if we got to the end, or hit an error. And if we hit an error we say so, and when the client closes, we close the file. The transfer itself comes out at only about 60 lines of C#. It takes me 50-something lines more to handle input of server name and port and persist them between sessions.
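The send-a-block, wait-for-OK, send-the-next pattern in the steps above can be tried out on the PC without a phone or a service. What follows is only a sketch, and a synchronous one: a script block stands in for the remote NewOrAppendString operation, so there are no events to wire up. The names Get-TextBlock and Send-InBlocks, and the test data, are invented for illustration and do not come from the phone app.

```powershell
function Get-TextBlock {
    # Reads whole lines until the block holds at least $MinLength characters or the input ends
    param ([System.IO.TextReader]$Reader, [int]$MinLength = 4000)
    $sb = New-Object System.Text.StringBuilder
    while (($sb.Length -lt $MinLength) -and ($Reader.Peek() -ge 0)) {
        [void]$sb.AppendLine($Reader.ReadLine())
    }
    $sb.ToString()
}

function Send-InBlocks {
    # Sends one block at a time; the next block is only read once the previous one got "OK"
    param ([System.IO.TextReader]$Reader, [scriptblock]$Send, [int]$MinLength = 4000)
    $sent = 0
    while ($Reader.Peek() -ge 0) {
        $block = Get-TextBlock -Reader $Reader -MinLength $MinLength
        $reply = & $Send $block
        if ($reply -ne "OK") { throw "Transfer failed after $sent block(s)" }
        $sent++
    }
    $sent
}

# Stand-in for the remote operation: remembers what it was given and answers "OK"
$received = New-Object 'System.Collections.Generic.List[string]'
$fakeSend = { param($text) $received.Add($text); "OK" }

# 100 lines of 99 characters each as test data
$text   = ("x" * 99 + "`n") * 100
$reader = New-Object System.IO.StringReader -ArgumentList $text
$count  = Send-InBlocks -Reader $reader -Send $fakeSend -MinLength 4000
"$count block(s) sent"
```

Because each block is requested only after the previous reply has come back, the blocks cannot arrive out of order, which is the point of the design; the price is one round trip per block.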

Eventually I’ll write a nicer and more versatile version, but as I’ve been at pains to point out, my purpose is to show how this stuff works so that anyone going through the same thing can learn from what I went through.

December 20, 2010

Why tablets shouldn’t take Windows Phone (in this release)

Filed under: Uncategorized — jamesone111 @ 11:41 am

The last few posts show how much Windows phone 7 has had my attention in the last few days and I’ve been thinking about versions.

Some people say products don’t come right until version 3. Version 1 is about getting a product out and the saying goes “to ship is to choose”. Planning for Vn+1 starts before you ship Vn and with little customer feedback to work with, V2 plans start as “put back what was cut to ship V1”. It’s V3 and beyond that incorporate lessons from the field. Although I’ve seen Windows Phone 7 called a V1 product, it incorporates a lot of lessons from past products.

Then there are point releases. Phone 7 is a .0 release. Point zeros are the big shifts: think of Windows 2.x to 3.0, or 3.x to 95, or 9x/ME and NT4 to Windows 2000, or 2000/XP to Vista. Point zeros are sometimes where the wheels fall off; nearly 20 years haven’t erased the memory of DOS 4.0; and Windows 6.0 (Vista) got a bad name which it never shed. Windows 6.1 (which is what Windows 7 is technically) has people talking about Microsoft getting its mojo back; Windows 2000 (NT 5) wasn’t widely adopted but 5.1 (Windows XP) became entrenched, and so on.

The rumour mill has turned to when Windows Phone 8 will come and what it might contain, and to point releases of 7: what might be in them, how many there will be, and when they might arrive. I don’t know the answers, but like a lot of people I have given some thought to what I’d like to see. I said in my review of my Windows Phone 7 device that the improvements I want fall into 3 groups:

  1. Improve support of cloud services (or the services themselves)
  2. Make it easier to get stuff on and off the phone
  3. Allow developers to do more.

A couple of friends of mine think it would be great to see Windows Phone 7 on slate devices. There are two blockers to that: the first is Microsoft’s view of the world, which I described in a previous post:

portable computing is carrying your computer (your office) with you – if you don’t want to lug a desktop replacement around to achieve that, then a tablet PC or a netbook is a better way. …  the iPad has proved the market exists for something that is neither personal computer (and a Mac is a personal computer in this context)  nor pocket sized.

It’s wrong to say Microsoft insists that anything bigger than a phone absolutely, non-negotiably must run full Windows, because there have long been Windows CE devices in the not-quite-a-PC part of the market – Microsoft seems to take the lack of success enjoyed by these devices as supporting its view that above a certain size full Windows is better. Engineers at HTC (or LG or Samsung) could easily mate the main board of a smart phone with an 8” display (instead of a 4” one). Doubling the diagonal quadruples the area, so battery space increases four-fold (without making it any thicker), allowing enough capacity to match the iPad. I haven’t investigated Samsung’s Android-based Galaxy Tab enough to know if Samsung have done much more than that; but if device makers told Microsoft they wanted to use Windows Phone in slates I don’t believe they’d be told “You can’t. You know where to get Android.”

Here’s a thought: the second blocker is that shortcomings in Windows Phone 7.0 – which might be fixed in 7.1 or 7.5 or whatever it ends up being called – are more painful on a tablet.

To see what I mean look at Office. On my HTC Trophy I can sync my OneNote notes via Windows Live SkyDrive so all my notes come out with me and new ones I jot down while I’m out and about get back to my PC. It’s what I want in my pocket. But what about PowerPoint – do we really want to take lots of slide decks out on our phones? Microsoft must hope not, because the Zune software which owns all transfers only syncs media files, not documents. I can open a slide deck from SkyDrive, but I can’t save it back there, much less sync it. The only sync option is to have internet-facing SharePoint (and unless it has changed very recently Microsoft itself doesn’t have that for its own staff). If SharePoint lives on the corporate network it’s accessed remotely from a PC with VPN or Windows 7’s fabulous DirectAccess, but there is no VPN or DirectAccess client in Windows Phone 7.0. Take the phone to the office and if the WiFi there uses certificate-based authentication you can’t connect (that was in Pocket PC and Windows Mobile but is gone from WP7). Plug in a memory card? Nope: there’s no socket for one.

The main way PPTX, XLSX and DOCX files will reach the phone, and the only way they will leave, is via email, but the limitations don’t matter much. The phone gives the functional equivalent of having a printed copy. It’s no accident that “Comment” is one of only 3 or 4 buttons on the App bar in Word and Excel; a major use is to receive documents by mail, comment on them and send a copy back. (Third-party apps can’t access mail to send their files, but Office ones can.) But using the 4” screen to show a slide deck to a client, or to put together a complex document or spreadsheet, isn’t something I expect to do often (or at all).

Limitations which are acceptable on a phone aren’t on a slate: lack of cut and paste changes from an occasional and minor inconvenience to a complete disaster. Anyone who wants a keyboard will soon discover the phone’s Bluetooth stack doesn’t support one.  I might not want to take my whole office with me, but I’d want more documents on the device and sync’d back to the cloud or the office than I would with a phone.

A handful of changes to WP7 which are desirable for phones are essential for slates:

  • Shared storage. Today developers can write files only to their app’s Isolated Storage area. Those files are invisible when composing a new mail message – and there is no “attach file to message” API. A parallel set of file calls to open / read / write files in shared storage would solve a lot of problems, especially if an additional storage device and Windows Live SkyDrive both appeared in “shared storage”. A Live Mesh client syncing all or part of shared storage would be useful too.
  • Corporate network support (VPN and certificate-based authentication) at least on a par with Windows Mobile 6.5.
  • Extensible sync. An option to enable sync of an app’s Isolated Storage at the phone, and a popup in the Zune software which says “Would you like to sync TypeX files to and from your phone? Select a (PC) folder”. Certificates for logging on to corporate networks (and the requests to generate them) could be sent this way, so could ring tones.

I think the tablet market is going to get more diverse, with different form factors, price points and capabilities. There are some cheap (and frankly quite nasty) Android tablets, the iPad as a premium device, and Windows 7 devices (this one from RM looks nice and Dell’s Inspiron Duo is getting a lot of attention). When Windows Phone 7.x is ready I won’t be surprised to see it on a tablet either.

December 16, 2010

How to get ISP mail over SMTP on Vodafone.

Filed under: Uncategorized — jamesone111 @ 11:15 pm

There are some days when I want to be really sarcastic to companies – and the reason I don’t is that when a company truly deserves it, it is only possible to vent on some poor customer service rep who isn’t to blame and can’t fix it. For example:

Dear Vodafone, I would like to nominate the person or team responsible for your email software for an award in the category of user-mendacious* software….

For background, I use 2 ISPs – when I got cable in 2000, I kept my mailbox with my dial-up ISP, for which they charge me a token fee; as soon as I hooked up the cable, they wouldn’t let me send outbound mail through their server. I understood why this was; servers accept a message on one of two rules: either it must be sent to a mailbox we control or it must come from a network address we control. Put the other way round, the rules say “A message sent through us but not from or to one of our people is probably spam”. So you need to send through the server of the connection provider, not the mailbox provider (or give up on mail clients which use the POP and SMTP protocols and access your mail through a web browser). ISPs understand this and usually have an outbound (SMTP) server named SMTP.ispName.com (or .net or whatever). Authentication is taken care of by network address: remember the thinking “our users are OK, unless we see them spamming”.
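
The two rules can be written down as a short check. This is only a sketch of the logic as described here, in Python, and not how any particular mail server is configured. The network range, the mail domain and the name `will_relay` are all made up for illustration.

```python
import ipaddress

# Hypothetical values for an imaginary ISP; none of these belong to a real provider.
OUR_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
OUR_MAIL_DOMAINS = {"isp.example"}


def will_relay(client_ip, recipient):
    """Accept a message if it is for one of our mailboxes or comes from one of our addresses."""
    address = ipaddress.ip_address(client_ip)
    from_our_network = any(address in network for network in OUR_NETWORKS)
    domain = recipient.rsplit("@", 1)[-1].lower()
    to_our_mailbox = domain in OUR_MAIL_DOMAINS
    return from_our_network or to_our_mailbox


# A customer on our network sending to anyone: accepted.
print(will_relay("203.0.113.7", "friend@elsewhere.example"))   # prints True
# An outsider sending to one of our mailboxes: accepted.
print(will_relay("198.51.100.9", "me@isp.example"))            # prints True
# An outsider sending to an outsider: probably spam, so refused.
print(will_relay("198.51.100.9", "friend@elsewhere.example"))  # prints False
```

The third case is the one that bites when the mailbox stays with one ISP and the connection comes from another.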

Until a couple of weeks ago every GPRS or 3G connection I had made had been made via Orange – the carrier provides the network so I need to use their server, i.e. SMTP.Orange.com. It was a piece of cake to have my Exchange AND ISP mailbox on my phones. Moving from Orange to Vodafone I thought I would just have to use SMTP.Vodafone.com. I was wrong.

Firstly my attempts to find a simple guide to setting up on Vodafone’s web site proved fruitless. Now that I know the server name I thought I’d do a search to see if it is there at all. You can see what I got.

So first, if you are trying to do this, you might be told to use Smtp.vodafone.com, Smtp.vodafone.co.uk, Smtp.vodafone.net or send.vodafone.net; but that is because a lot of the advice you will find is out of date – even if you do find it via Vodafone’s own site search. I’m sure the .net one worked for me only a matter of days ago, but as I write in December 2010 none of them do.

The mail server for Vodafone is SMTP.360.COM.
I had to find this out from a text chat with a Vodafone technician. 360.com? What’s that? Oh yes, I remember Vodafone stuck a link on the phone called “My vodafone” which goes to something at 360.com – here’s a screen shot.
Whatever… SMTP.360.COM needs to go in the “Outgoing (SMTP) server” box. Tap tap tap: “logon error”. The person in the text chat said I did not need to enter credentials. But… excuse me Vodafone, you can tell from my IP address that I am one of your customers, so why are things structured so I need to log on? And what credentials do I use?

Searches keep saying “Use your Vodafone mail account”. I didn’t think I had one, and I don’t want one. If I say I have over 20 years’ experience configuring mail systems, you’ll understand how humiliating the next bit was: I called the help desk. And this is the reason for not getting sarcastic with the poor customer service rep: she could fix my problem. It turns out you need to go to 360.com and create a user account. This has nothing whatsoever to do with any account you might already have at Vodafone. You then set the mail settings to “Outbound server requires authentication”, and put in the user name (just the name, no @360 or similar nonsense) and password. Then you can forget all about 360 (unless you want an example of how carriers think they’re adding value).
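
For anyone scripting the same thing rather than tapping it into a phone, “Outbound server requires authentication” amounts to logging on before handing over the message. The following is a rough sketch using Python’s standard `smtplib`. The post does not say which port the server uses or whether it expects TLS, so the caller supplies the port and no encryption step is shown; both are assumptions to check against the provider. The names `check_username` and `send_with_login` are made up, and `smtp_factory` is only there so the function can be tried without a real server.

```python
import smtplib
from email.message import EmailMessage


def check_username(user):
    """The account name goes in bare, with no @ suffix of any kind."""
    if "@" in user:
        raise ValueError("use the bare account name, without an @ suffix")
    return user


def send_with_login(message, host, port, user, password, smtp_factory=smtplib.SMTP):
    """Log on to the outgoing server, then hand it the message.

    `smtp_factory` is a hypothetical hook for testing; by default the real
    smtplib.SMTP class is used.
    """
    with smtp_factory(host, port) as server:
        server.login(check_username(user), password)
        server.send_message(message)


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text message; the addresses are whatever the mailbox provider gave you."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message
```

The From address stays that of the mailbox provider; only the logon belongs to the carrier’s outgoing server.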

Since I’m doing all of this on a Windows Phone 7 device I wish that Microsoft could impose something on carriers to get this right. Couldn’t Microsoft, Apple and the others team up to make this happen?


Footnote
* The description “user mendacious” was one that Douglas Adams applied to the computer game of The Hitchhiker’s Guide to the Galaxy.
