James O'Neill's Blog

July 24, 2017

An extra for my PowerShell profile–Elevate

Filed under: Uncategorized — jamesone111 @ 7:15 pm

More than usual, in the last few days I've found myself starting PowerShell or the ISE only to find I wanted a session as administrator: it's a common enough thing, but eventually I said ENOUGH! I'd seen "-verb runas" used to start an executable as administrator, so I added this to my profile.

Function Elevate        {
<#
.Synopsis
    Runs an instance of the current program As Administrator
#>

    Start-Process (Get-Process -id $PID).path -verb runas
}
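
One possible refinement – a sketch only, untested here, and assuming the console host rather than the ISE (powershell_ise.exe doesn't take -NoExit or -Command) – is to have the elevated session open in the same directory by passing arguments through Start-Process:

Function Elevate {
    # Sketch: re-launch the current host elevated and keep the current location.
    # Paths containing quote characters may need extra escaping.
    Start-Process -FilePath (Get-Process -Id $PID).Path -Verb RunAs `
                  -ArgumentList "-NoExit -Command Set-Location -LiteralPath '$PWD'"
}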


June 2, 2017

On writing understandable scripts.

Filed under: Uncategorized — jamesone111 @ 7:20 pm

 

At two conferences recently I gave a talk on "What makes a good PowerShell module" (revisiting an earlier talk). The psconf.eu guys have posted a video of it and I've made the slides available (the version in the US used the same slide deck with a different template).

One of my points was Prefer the familiar way to the clever way. A lot of us like the brilliant PowerShell one-liner (I belong to the "We don't need no stinking variables" school and will happily pipe huge chains of commands together). But sometimes breaking it into multiple commands means that when you return to it later, or someone new picks up what you have done, it is easier to understand what is happening. There are plenty of other examples, but generally clever tends to be opaque; opaque needs comments, and somewhere I picked up that what applies to jokes applies to programming: if you have to explain it, it isn't that good.

Sometimes someone doesn't know the way which is familiar to everyone else, and they throw in something like this example which I used in the talk:
Set-Variable -Scope 1 -Name "variableName" -Value $($variableName +1)
I can't recall ever using Set-Variable, and why would someone use it to set a variable to its current value + 1? The key must be in the -Scope parameter: -Scope 1 means "the parent scope", and most people would write $Script:VariableName++ or $Global:VariableName++. When we encounter something like this, unravelling what Set-Variable is doing interrupts the flow of understanding … we have to go back and say "so what was happening when that variable was set …"
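
For comparison, here is a minimal sketch of the familiar form next to the unfamiliar one ($Counter is just a placeholder name):

$Counter = 0
function Step-Counter {
    # The familiar way: qualify the variable with its scope and use ++
    $script:Counter++
    # The unfamiliar way doing (almost) the same job; note that -Scope 1 strictly
    # means "one level up from here" - the caller's scope - not necessarily the script scope
    # Set-Variable -Scope 1 -Name Counter -Value ($Counter + 1)
}
Step-Counter
$Counter    # 1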

There are lots of cases where there are multiple ways to do something; some are easier to understand but aren't automatically the one we pick. All of the following appear to do the same job:

"The value is " + $variable
"The value is " + $variable.ToString()
"The value is $variable"
"The value is {0}" -f $variable
"The value is " -replace "$",$variable

You might see .ToString() and say "that's thinking like a C# programmer" … but if $variable holds a date and the local culture isn't US, the first two examples will produce different results (ToString() uses the local culture's settings).
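
On a machine whose culture isn't US you can see the difference with something like this (a sketch – the exact strings depend on your regional settings):

$variable = Get-Date
"The value is " + $variable                                            # PowerShell's own string conversion
"The value is " + $variable.ToString()                                 # .NET ToString(), which uses the current culture
"The value is " + $variable.ToString([cultureinfo]::InvariantCulture)  # forces the culture-independent format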
If you work a lot with the –f operator , you might use {0:d} to say “insert the first item in ‘short date’ format for the local culture” and naturally write
“File {0} is {1} bytes in size and was changed on {2}” –f $variable.Name,$variable.Length,$variable.LastWriteTime
Because the eye has to jump back and forth along the line to figure out what goes into {0} and then into {1} and so on, this loses on readability compared with concatenating the parts with + signs; it also assumes the next person to look at the script has the same familiarity with the -f operator. I can hear old hands saying "Anyone competent with PowerShell should be familiar with -f", but who said the person trying to understand your script meets your definition of competence?
As someone who does a lot of stuff with regular expressions, I might be tempted by the last one … but replacing the "end of string" marker ($) as a way of appending excludes people who aren't happy with regex. I'm working on something which auto-generates code at the moment and it uses this, because the source that it reads doesn't provide a way of specifying "append" but does have "replace". I will let it slide in this case, but being bothered by it is a sign that I do ask myself "are you showing you're clever, or writing something that can be worked on later?" Sometimes the only practical way is hard, but if there is a way which takes an extra minute to write and pays back when looking at the code in a few months' time, take it.

November 30, 2016

Powershell Piped Parameter Peculiarities (and a Palliative pattern!)

Filed under: Uncategorized — jamesone111 @ 7:33 am

Writing some notes before sharing a PowerShell module, I did a quick fact check and rediscovered a hiccup with piped parameters and (eventually) remembered writing a simplified script to show the problem – 3 years ago, as it turns out. The script appears below: it has four parameter sets and all it does is tell us which parameter set was selected. There are four parameters: A is in all 4 sets, B is in sets 2, 3 and 4, C is only in set 3 and D is only in set 4. I'm not really a fan of parameter sets, but they help IntelliSense to remove choices which don't apply.

function test { 
[CmdletBinding(DefaultParameterSetName="PS1")]
param (  [parameter(Position=0, ValueFromPipeLine=$true)]
         $A,
         [parameter(ParameterSetName="PS2")]
         [parameter(ParameterSetName="PS3")]
         [parameter(ParameterSetName="PS4")]
         $B,
         [parameter(ParameterSetName="PS3", Mandatory)]
         $C,
         [parameter(ParameterSetName="PS4", Mandatory)]
         $D
)
$PSCmdlet.ParameterSetName
}

So let's check out what comes back for different parameter combinations
> test  1
PS1

No parameters or parameter A only gives the default parameter set. Without parameter C or D it can’t be set 3 or 4, and with no parameter B it isn’t set 2 either.

> test 1 -b 2
PS2
Parameters A & B, or parameter B only, gives parameter set 2 – having parameter B it must be set 2, 3 or 4, but 3 & 4 can be eliminated because C and D are missing.

> test 1 -b 2 –c 3 
PS3

Parameter C means it must be set 3 (and D means it must be set 4); so let's try piping the input for parameter A
> 1 | test 
PS1
> 1 | test  -b 2 -c 3
PS3

So far it’s as we’d expect.  But then something goes wrong.
> 1 | test  -b 2
Parameter set cannot be resolved using the specified named parameters

Eh? If data is being piped in, PowerShell no longer infers a parameter set from the absent mandatory parameters. Which seems like a bug. And I thought about it: why would piping something change what you can infer about a parameter not being on the command line? Could it be uncertainty over whether values could come from properties of the piped object? I thought I'd try this hunch
   [parameter(ParameterSetName="PS3", Mandatory,ValueFromPipelineByPropertyName=$true)]
  $C,
  [parameter(ParameterSetName="PS4", Mandatory,ValueFromPipelineByPropertyName=$true)]
  $D

This does the trick – though I don't have a convincing reason why two places not providing the values works better than one (in fact that initial hunch doesn't seem to stand up to logic). This (mostly) solves the problem – there could be some odd results if parameter D was named "Length" or "Path" or anything else commonly used as a property name. I also found in the "real" function that adding ValueFromPipelineByPropertyName to too many parameters – non-mandatory ones – caused PowerShell to think a set had been selected and then complain that one of the mandatory values was missing from the piped object. So just adding it to every parameter isn't the answer.
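
Pulling that together, the pattern that worked for me – a sketch based on the test function above – is to add ValueFromPipelineByPropertyName only to the mandatory, set-defining parameters:

function test {
    [CmdletBinding(DefaultParameterSetName="PS1")]
    param (
        [parameter(Position=0, ValueFromPipeLine=$true)]
        $A,
        [parameter(ParameterSetName="PS2")]
        [parameter(ParameterSetName="PS3")]
        [parameter(ParameterSetName="PS4")]
        $B,
        # the mandatory, set-defining parameters also accept their value by property
        # name, which lets PowerShell resolve the set when input is piped
        [parameter(ParameterSetName="PS3", Mandatory, ValueFromPipelineByPropertyName=$true)]
        $C,
        [parameter(ParameterSetName="PS4", Mandatory, ValueFromPipelineByPropertyName=$true)]
        $D
    )
    $PSCmdlet.ParameterSetName
}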

July 1, 2016

Just enough admin and constrained endpoints. Part 2: Startup scripts

Filed under: Uncategorized — jamesone111 @ 1:36 pm

In part 1 I looked at endpoints and their role in building your own JEA solution, and said that applying constraints to endpoints via a startup script does these things:

  • Loads modules
  • Hides cmdlets, aliases and functions from the user.
  • Defines which scripts and external executables may be run
  • Defines proxy functions to wrap commands and modify their functionality
  • Sets the PowerShell language mode, to further limit the commands which can be run in a session, and prevent new ones being defined.

The endpoint is a PowerShell RunSpace running under its own user account (ideally a dedicated account), and applying the constraints means a user connecting to the endpoint can do only a carefully controlled set of things. There are multiple ways to set up an endpoint; I prefer to do it with a startup script, and below is the script I used in a recent talk on JEA. It covers all the points above and works, but being an example its scope is extremely limited:

$Script:AssumedUser  = $PSSenderInfo.UserInfo.Identity.name
if ($Script:AssumedUser) {
    Write-EventLog -LogName Application -Source PSRemoteAdmin -EventId 1 -Message "$Script:AssumedUser, Started a remote Session"
}
# IMPORT THE COMMANDS WE NEED
Import-Module -Name PrintManagement -Function Get-Printer

#HIDE EVERYTHING. Then show the commands we need and add Minimum functions
if (-not $psise) { 
    Get-Command -CommandType Cmdlet,Filter,Function | ForEach-Object  {$_.Visibility = 'Private' }
    Get-Alias                                       | ForEach-Object  {$_.Visibility = 'Private' }
    #To show multiple commands put the name as a comma separated list 
    Get-Command -Name Get-Printer                   | ForEach-Object  {$_.Visibility = 'Public'  } 

    $ExecutionContext.SessionState.Applications.Clear()
    $ExecutionContext.SessionState.Scripts.Clear()

    $RemoteServer = [System.Management.Automation.Runspaces.InitialSessionState]::CreateRestricted(
                        [System.Management.Automation.SessionCapabilities]::RemoteServer)
    $RemoteServer.Commands.Where{($_.Visibility -eq 'public') -and ($_.CommandType -eq 'Function') } |
        ForEach-Object { Set-Item -Path "Function:\$($_.Name)" -Value $_.Definition }
}

#region Add our functions and business logic
function Restart-Spooler {
<#
.Synopsis
    Restarts the Print Spooler service on the current Computer
.Example
    Restart-Spooler
    Restarts the spooler service, and logs who did it  
#>

    Microsoft.PowerShell.Management\Restart-Service -Name "Spooler"
    Write-EventLog -LogName Application -Source PSRemoteAdmin -EventId 123 -Message "$Script:AssumedUser, restarted the spooler"
}
#endregion
#Set the language mode
if (-not $psise) {$ExecutionContext.SessionState.LanguageMode = [System.Management.Automation.PSLanguageMode]::NoLanguage}

Logging
Any action taken from the endpoint will appear to be carried out by the privileged Run As account, so the script needs to log the name of the user who connects and runs commands. So the first few lines of the script get the name of the connected user and log the connection. I set up PSRemoteAdmin as a source in the event log by running:
New-EventLog -Source PSRemoteAdmin -LogName application
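
If you would rather have the source created automatically, a guard like this (a sketch; it needs to run under an account allowed to create event sources) can be used instead of remembering to run the command once:

if (-not [System.Diagnostics.EventLog]::SourceExists("PSRemoteAdmin")) {
    New-EventLog -Source PSRemoteAdmin -LogName Application
}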

Then the script moves on to the first bullet point in the list at the start of this post: loading any modules required; for this example, I have loaded PrintManagement. To make doubly sure that I don’t give access to unintended commands, Import-Module is told to load only those that I know I need.

Private functions (and cmdlets and aliases)
The script hides the commands which we don't want the user to have access to (we'll assume that's everything). You can try the following in a fresh PowerShell session (don't use one with anything you want to keep!)

function jump {param ($path) Set-Location -Path $path }
(Get-Command set-location).Visibility = "Private"
cd \
This defines jump as a function which calls Set-Location – functionally it is the same as the alias cd. Next we hide Set-Location and try to use cd, which now returns an error:
cd : The term 'Set-Location' is not recognized
But jump \ works: making something private stops the user calling it from the command line, but still allows it to be called from inside a function. To stop the user creating their own functions, the script sets the language mode as its final step.

To allow me to test parts of the script, it doesn't hide anything if it is running in the PowerShell ISE, so the blocks which change the available commands are wrapped in if (-not $psise) {}. Away from the ISE, the script hides internal commands first. You might think that Get-Command could return aliases to be hidden, but in practice this causes an error. Once everything has been made private, the script takes a list of commands, separated with commas, and makes them public again (in my case there is only one command in the list). Note that the script can see private commands and make them public, but at the PowerShell prompt you can't see a private command, so you can't change it back to being public.

Hiding external commands comes next. If you examine $ExecutionContext.SessionState.Applications and $ExecutionContext.SessionState.Scripts you will see that they are both normally set to “*”, they can contain named scripts or applications or be empty. You can try the following in an expendable PowerShell session

$ExecutionContext.SessionState.Applications.Clear()
ping localhost
ping : The term 'PING.EXE' is not recognized as the name of a cmdlet function, script file, or operable program.
PowerShell found PING.EXE but decided it wasn’t an operable program.  $ExecutionContext.SessionState.Applications.Add("C:\Windows\System32\PING.EXE") will enable ping, but nothing else.

So now the endpoint is looking pretty bare: it only has one available command – Get-Printer. We can't get a list of commands, or exit the session, and in fact PowerShell looks for "Out-Default", which has also been hidden. This is a little too bare; we need to add constrained versions of some essential commands. While the steps to hide commands can be discovered inside PowerShell if you look hard enough, the steps to put back the essential commands need to come from documentation. In the script, $RemoteServer gets definitions and creates proxy functions for:

Clear-Host   
Exit-PSSession
Get-Command  
Get-FormatData
Get-Help     
Measure-Object
Out-Default  
Select-Object

I've got a longer explanation of proxy functions here; the key thing is that if PowerShell has two commands with the same name, aliases beat functions, functions beat cmdlets, and cmdlets beat external scripts and programs. "Full" proxy functions create a steppable pipeline to run a native cmdlet, and can add code at the begin stage, at each process stage for piped objects and at the end stage, but it's possible to create much simpler functions which wrap a cmdlet and change the parameters it takes: adding some which are used by logic inside the proxy function, removing some, or applying extra validation rules. The proxy function PowerShell provides for Select-Object only supports two parameters, Property and InputObject, and Property only allows 11 pre-named properties. If a user-callable function defined for the endpoint needs to use the "real" Select-Object, it must call it with a fully qualified name: Microsoft.PowerShell.Utility\Select-Object (I tend to forget this, and since I don't load these proxies when testing in the ISE, I get reminded with a "bad parameter" error the first time I use the command from the endpoint). In the same way, if the endpoint manages Active Directory and it creates a proxy function for Get-ADUser, anything which needs the Get-ADUser cmdlet should specify the ActiveDirectory module as part of the command name.

By the end of the first if … {} block the basic environment is created. The next region defines functions for additional commands; these fall mainly into two groups: proxy functions as I've just described, and functions which I group under the heading of business logic. The endpoint I was creating had "Initialize-User", which would add a user to AD from a template, give them a mailbox, set their manager and other fields which appear in the directory, give them a phone number, enable them for Skype for Business with Enterprise Voice and set up Exchange voice mail, all in one command. How many proxy and business-logic commands there will be, and how complex they are, both depend on the situation; and some commands – like Get-Printer in the example script – might not need to be wrapped in a proxy at all.
For the example I've created a Restart-Spooler command. I could have created a proxy to wrap Restart-Service and only allow a limited set of services to be restarted. Because I might still do that, the function uses the fully qualified name of the hidden Restart-Service cmdlet, and I have also made sure the function writes information to the event log saying what happened. For a larger system I use 3-digit event IDs where the first digit indicates the type of object impacted (1xx for users, 2xx for mailboxes and so on) and the next two what was done (x01 for added, x02 for changed a property).
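
Such a proxy doesn't need to be a full steppable-pipeline proxy; a minimal sketch which only allows a couple of named services (the service list and event ID here are just examples) might look like this:

function Restart-Service {
<#
.Synopsis
    Restarts one of a short list of approved services
#>
    param (
        [Parameter(Mandatory=$true)]
        [ValidateSet("Spooler","W32Time")]    # example list: only these names are accepted
        [string]$Name
    )
    # call the hidden cmdlet by its fully qualified name, and log who did what
    Microsoft.PowerShell.Management\Restart-Service -Name $Name
    Write-EventLog -LogName Application -Source PSRemoteAdmin -EventId 120 -Message "$Script:AssumedUser, restarted $Name"
}

Because functions beat cmdlets, this shadows the real Restart-Service for anyone connected to the endpoint.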

The final step in the script is to set the language mode. There are four possible language modes. Full Language is what we normally see. Constrained Language limits calling methods and changing properties to certain allowed .NET types; the Math type isn't specifically allowed, so [System.Math]::pi will return the value of pi, but [System.Math]::Pow(2,3) causes an error saying you can't invoke that method, and the SessionState type isn't on the allowed list either, so trying to change the language back will say "Property setting is only allowed on core types". Restricted Language doesn't allow variables to be set and doesn't allow access to members of an object (i.e. you can't look at individual properties, call methods, or access individual members of an array), and certain variables (like $pid) are not accessible. No Language stops us even reading variables.

Once the script is saved it is a question of connecting to the endpoint to test it. In part one I showed setting up the endpoint like this:
$cred = Get-Credential
Register-PSSessionConfiguration -Name "RemoteAdmin" -RunAsCredential $cred `
                                -ShowSecurityDescriptorUI `
                                -StartupScript 'C:\Program Files\WindowsPowerShell\EndPoint.ps1'
The start-up script will be read from the given path for each connection, so there is no need to do anything to the Session configuration when the script changes; as soon as the script is saved to the right place I can then get a new session connecting to the “RemoteAdmin” endpoint, and enter the session. Immediately the prompt suggests something isn’t normal:

$s = New-PSSession -ComputerName localhost -ConfigurationName RemoteAdmin
Enter-PSSession $s
[localhost]: PS>

PowerShell has a prompt function, which has been hidden. If I try some commands, I quickly see that the session has been constrained

[localhost]: PS> whoami
The term 'whoami.exe' is not recognized…

[localhost]: PS> $pid
The syntax is not supported by this runspace. This can occur if the runspace is in no-language mode...

[localhost]: PS> dir
The term 'dir' is not recognized ….

However the commands which should be present are present. Get-Command works and shows the others

[localhost]: PS> get-command
CommandType  Name                    Version    Source
-----------  ----                    -------    ------
Function     Exit-PSSession
Function     Get-Command
Function     Get-FormatData
Function     Get-Help
Function     Get-Printer                 1.1    PrintManagement                                                                                        
Function     Measure-Object
Function     Out-Default
Function     Restart-Spooler
Function     Select-Object

We can try the following to show how the Select-object cmdlet has been replaced with a proxy function with reduced functionality:
[localhost]: PS> get-printer | select-object -first 1
A parameter cannot be found that matches parameter name 'first'.

So it looks like all the things which need to be constrained are constrained, and if the functions I want to deliver – Get-Printer and Restart-Spooler – work properly, I can create a module using
Export-PSSession -Session $s -OutputModule 'C:\Program Files\WindowsPowerShell\Modules\remotePrinters' -AllowClobber -force
(I use -force and -allowClobber so that if the module files exist they are overwritten, and if the commands have already been imported they will be recreated.)  
Because PowerShell automatically loads modules (unless $PSModuleAutoloadingPreference tells it not to), saving the module to a folder listed in $psModulePath means a fresh PowerShell session can go straight to using a remote command;  the first command in a new session might look like this

C:\Users\James\Documents\windowsPowershell> restart-spooler
Creating a new session for implicit remoting of "Restart-Spooler" command...
WARNING: Waiting for service 'Print Spooler (Spooler)' to start...

The message about creating a new session comes from code generated by Export-PSSession which ensures there is always a session available to run the remote command. Get-PSSession will show the session and Remove-PSSession will close it. If a fix is made to the endpoint script which doesn’t change the functions which can be called or their parameters, then removing the session and running the command again will get a new session with the new script. The module is a set of proxies for calling the remote commands, so it only needs to change to support modifications to the commands and their parameters. You can edit the module to add enhancements of your own, and I’ve distributed an enhanced module to users rather than making them export their own. 

You might have noticed that the example script includes comment-based help – eventually there will be client-side tests for the script, written in Pester, and following the logic I set out in Help=Spec=Test, the tests will use any examples provided. When Export-PSSession creates the module, it includes help tags to redirect requests, so running Restart-Spooler -? locally requests help from the remote session; unfortunately requesting help relies on an existing session and won't create a new one.

June 27, 2016

Technical Debt and the four most dangerous words for any project.

Filed under: Uncategorized — jamesone111 @ 9:15 am

I've been thinking about technical debt. I might have been trying to avoid the term when I wrote Don't swallow the cat, or more likely I hadn't heard it, but I was certainly describing it – to adapt Wikipedia's definition, it is "the future work that arises when something that is easy to implement in the short run is used in preference to the best overall solution". However, it is not confined to software development as Wikipedia suggests.
“Future work” can come from bugs (either known, or yet to be uncovered because of inadequate testing), design kludges which are carried forward, dependencies on out of date software, documentation that was left unwritten… and much more besides.

The cause of technical debt is simple: People won’t say “I (or we) cannot deliver what you want, properly, when you expect it”.
“When you expect it” might be the end of a Scrum Sprint, a promised date or “right now”. We might be dealing with someone who asks so nicely that you can’t say “No” or the powerful ogre to whom you dare not say “No”. Or perhaps admitting “I thought I could deliver, but I was wrong” is too great a loss of face. There are many variations.

I've written before about "What you measure is what you get" (WYMIWIG), and it's also a factor here. In IT we measure success by what we can see working. Before you ask "How else do you judge success?", technical debt is a way to cheat the measurement – things are seen to be working before all the work is done. To stretch the financial parallel, if we collect full payment without delivering in full, our accounts must cover the undelivered part – it is a liability, like borrowing or unpaid invoices.

Imagine you have a deadline to deliver a feature. (Feature could be a piece of code, or an infrastructure service however small). Unforeseeable things have got in the way. You know the kind of things: the fires which apparently only you know how to extinguish, people who ask “Can I Borrow You”, but should know they are jeopardizing your ability to meet this deadline, and so on.
Then you find that doing your piece properly means fixing something that’s already in production. But doing that would make you miss the deadline (as it is you’re doing less testing than you’d like and documentation will have to be done after delivery). So you work around the unfixed problem and make the deadline. Well done!
Experience teaches us that making the deadline is rewarded, even if you leave a nasty surprise for whoever comes next – they must make the fix AND unpick your workaround. If they are up against a deadline they will be pushed to increase the debt. You can see how this ends up in a spiral: like all debt, unless it is paid down, it increases in future cycles.

The Quiet Crisis unfolding in Software Development has a warning to beware of high performers, they may excel at the measured things by cutting corners elsewhere. It also says watch out for misleading metrics – only counting “features delivered” means the highest performers may be leaving most problems in their wake. Not a good trait to favour when identifying prospective managers.

Sometimes we can say “We MUST fix this before doing anything else.”, but if that means the whole team (or worse its manager) can’t do the thing that gets rewarded then we learn that trying to complete the task properly can be unpopular, even career limiting. Which isn’t a call to do the wrong thing: some things can be delayed without a bigger cost in the future; and borrowing can open opportunities that refusing to ever take on any debt (technical or otherwise) would deny us. But when the culture doesn’t allow delivery plans to change, even in the face of excessive debt, it’s living beyond its means and debt will become a problem.

We praise delivering on-time and on-budget, but if capacity, deadline and deliverables are all fixed, only quality is variable. Project management methodologies are designed to make sure that all these factors can be varied, and give project teams a route to follow if they need to vary by too great a margin. But a lot of work is undertaken without this kind of governance. Capacity is what can be delivered properly in a given time by the combination of people, skills, equipment and so on, each of which has a cost. Increasing headcount is only one way to add capacity, but if you accept that adding people to a late project makes it later, then it needs to be done early. When we must demonstrate delivery beyond our capacity, it is technical debt that covers the gap.

Forecasting is imprecise, but it is rare to start with a plan we don't have the capacity to deliver. I think another factor causes deadlines which were reasonable to end up creating technical debt.

The book The Phoenix Project has gathered a lot of fans in the last couple of years, and one of its messages is that unplanned work is the enemy of planned work. This time-management piece separates deep work (which gives satisfaction and takes thought, energy, time and concentration) from shallow work (the little stuff). We can do more of value by eliminating shallow work, and the Quiet Crisis article urges managers to limit interruptions and give people private workspaces, but some of each day will always be lost to email, helping colleagues and so on.

But Unplanned work is more than workplace noise. Some comes from Scope Creep, which I usually associate with poor specification, but unearthing technical debt expands the scope, forcing us to choose between more debt and late delivery. But if debt is out in the open then the effort to clear it – even partially – can be in-scope from the start.
Major incidents can’t be planned and leave no choice but to stop work and attend to them. But some diversions are neither noise, nor emergency. “Can I Borrow You?” came top in a list of most annoying office phrases and “CIBY” serves as an acronym for a class of diversions which start innocuously. These are the four dangerous words in the title.

The Phoenix Project begins with the protagonist being made CIO and briefed "Anything which takes focus away from Phoenix is unacceptable – that applies to the whole company". For most of the rest of the book things are taking that focus. He gets to contrast IT with manufacturing, where a coordinator accepts or declines new work depending on whether it would jeopardize any existing commitments. Near the end he says to the CEO "Are we even allowed to say no? Every time I've asked you to prioritize or defer work on a project, you've bitten my head off. …[we have become] compliant order takers, blindly marching down a doomed path". And that resonates. Project steering boards (or similarly named committees) are there to assign capacity to some projects and disappoint others. Without one – or if it is easy to circumvent – we end up trying to deliver everything and please everyone; "No" and "What should I drop?" are answers we don't want to give, especially to those who've achieved their positions by appearing to deliver everything, thanks to technical debt.

Generally, strategic tasks don’t compete to consume all available resources. People recognise these should have documents covering

  • What is it meant to do, and for whom? (the specification / high level design)
  • How does it do it? (Low level design, implementation plan, user and admin guides)
  • How do we know it does what it is meant to? (test plan)

But “CIBY” tasks are smaller, tactical things; they often lack specifications: we steal time for them from planned work assuming we’ll get them right first time, but change requests are inevitable. Without a spec, there can be no test plan: yet we make no allowance for fixing bugs. And the work “isn’t worth documenting”, so questions have to come back to the person who worked on it.  These tasks are bound to create technical debt of their own and they jeopardize existing commitments pushing us into more debt.

Optimistic assumptions aren’t confined to CIBY tasks. We assume strategic tasks will stay within their scope: we set completion dates using assumptions about capacity (the progress for each hour worked) and about the number of hours focused on the project each day. Optimism about capacity isn’t a new idea, but I think planning doesn’t allow for shallow / unplanned work – we work to a formula like this:
TIME = SCOPE / CAPACITY
In project outcomes, debt is a fourth variable and time lost to distracting tasks a fifth. A better formula would look like this
DELIVERABLES = (TIME * CAPACITY) – DISTRACTIONS + DEBT  

Usually it is the successful projects which get a scope which properly reflects the work needed, stick to it, allocate enough time and capacity and hold on to it. It’s simple in theory, and projects which go off the rails don’t do it in practice, and fail to adjust. The Phoenix Project told how failing to deliver “Phoenix” put the company at risk. After the outburst I quoted above, the CIO proposes putting everything else on hold, and the CEO, who had demanded 100% focus on Phoenix, initially responds “You must be out of your right mind”. Eventually he agrees, Phoenix is saved and the company with it. The book is trying to illustrate many ideas, but one of them boils down to “the best way to get people to deliver what you want is to stop asking them to deliver other things”.

Businesses seem to struggle to set priorities for IT: I can’t claim to be an expert in solving this problem, but the following may be helpful

Understanding the nature of the work. Jeffrey Snover likes to say “To ship is to choose”. A late project must find an acceptable combination of additional cost, overall delay, feature cuts, and technical debt. If you build websites, technical debt is more acceptable than if you build aircraft. If your project is a New Year’s Eve firework display, delivering without some features is an option, delay is not. Some feature delays incur cost, but others don’t.

Tracking all work: Have a view of what is completed, what is in progress, what is "up next", and what is waiting to be assigned time. The next few points all relate to tracking.
Work in progress has already consumed effort, but we only get credit when it is complete. An increasing number of tasks in progress may mean people are passing work to other team members faster than their capacity to complete it, or that new tasks are interrupting existing ones.
All work should have a specification before it starts. Writing specifications takes time, and "Create specification for X" may be a task in itself.
And yes, I do know that technical people generally hate tracking work and writing specifications.
Make technical debt visible. It's OK to split an item and categorize part as completed and the rest as something else. Adding the undelivered part to the backlog keeps it as planned work, and also gives partial credit for partial delivery – rather than credit being all or nothing. It means some credit goes to the work of clearing debt.
And I also know technical folk see "fixing old stuff" as a chore, but not counting it just makes matters worse.
Don't just track planned work. Treat jobs which jumped the queue, that didn't have a spec, or that displaced others like defects in a manufacturing process – keep the score, and try to drive it down to zero. Incidents and "CIBY" jobs might only be recorded as an afterthought, but you want to see where they are coming from and try to eliminate them at source.

Look for process improvements. If a business is used to lax project management, it will resist attempts to channel all work through a project steering board. Getting stakeholders together in a regular "IT projects meeting" might be easier, but gets the key result (managing the flow of work).

And finally, having grown-up conversations with customers.
Businesses should understand the consequences of pushing for delivery to exceed capacity, which means IT (especially those in management) must be able to deliver messages like these:
"For this work to jump the queue, we must justify delaying something else"
"We are not going to be able to deliver [everything] on time", perhaps with a follow-up of "We could call it delivered when there is work remaining but … have you heard of technical debt?"

February 26, 2014

Depth of field

Filed under: Uncategorized — jamesone111 @ 7:51 pm

Over the years I have seen a lot written about Depth of Field and recently I’ve seen it explained wrongly but with great passion. So I thought I would post the basic formulae, show how they are derived and explain how they work in practical cases.

So first, a definition. When part of an image is out of focus, that's not an on/off state: there's massively blurred, a little blurred, and such a small amount of blur that it still looks sharply focused; if we magnify the "slightly out of focus" parts enough we can see that they are not sharply focused. Depth of field is a measure of how far either side of the point of focus appears to be properly focused (even though it is very slightly out of focus), given some assumptions about how much the image is magnified.

[diagram: lens focusing geometry – blue lines for a subject at distance D, red lines for a point at infinity]

When a lens focuses an image, the lens-to subject distance, D, and lens-to-image distance, d, are related to the focal length with the equation
1/D + 1/d = 1/f

We can rearrange this to derive the distance to the subject (D) in terms of focal length (f) and image distance (d)
D = df/(d-f)

Since d is always further than f, we can write the difference as Δ and replace d with f+Δ. Putting that into the previous equation makes it
D = (f² + Δf)/Δ, which rearranges to
D = (f²/Δ) + f

When Δ is zero the lens is focused at infinity, so if you think of that position as the start point for any lens, to focus nearer to the camera we move the lens away from the image by a distance of Δ.

The formula can be rearranged as the "Newtonian form" of the equation:
D − f = f²/Δ, therefore
Δ(D − f) = f², and since Δ = (d − f),
(d − f)(D − f) = f²

We can work out a focus scale for a lens using D = (f²/Δ) + f. Assume we have a 60mm lens, and it moves in or out 1/3mm for each 30 degrees of turn.
When the lens is 60mm from the image we can mark ∞ at the 12 o'clock position: Δ = 0 and D = ∞.
If we turn the ∞ mark to the 1 o'clock position (30 degrees), Δ = 1/3 and D = 3600/(1/3) + 60 = 10,860mm ≈ 10.9m, so we can write 10.9 at the new 12 o'clock position.
Turn the ∞ mark to the 2 o'clock position (60 degrees): Δ = 2/3 and D = 3600/(2/3) + 60 = 5,460mm ≈ 5.5m, so we write 5.5 at the latest 12 o'clock position.
Turn the ∞ mark to the 3 o'clock position (90 degrees): Δ = 1 and D = 3600 + 60 = 3,660mm ≈ 3.7m, so this time we write 3.7 at 12 o'clock.
Turn the ∞ mark to the 4 o'clock position (120 degrees): Δ = 4/3 and D = 3600/(4/3) + 60 = 2,760mm ≈ 2.8m, so 2.8 goes on the scale at 12 o'clock.
Turn the ∞ mark to the 5 o'clock position (150 degrees): Δ = 5/3 and D = 3600/(5/3) + 60 = 2,220mm ≈ 2.2m.
Turn the ∞ mark to the 6 o'clock position (180 degrees): Δ = 2 and D = 3600/2 + 60 = 1,860mm ≈ 1.9m, so we can add 2.2 and 1.9 to the scale to finish the job.

And so on. For simplicity of calculation we often consider the extra 60mm insignificant, and D ≈ f²/Δ is usually close enough. It's also worth noting that the roles of D as subject distance and d as image distance can be swapped – the whole arrangement is symmetrical.
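
Since this is mostly a PowerShell blog, here is a quick sketch which generates the same scale for a 60mm lens (distances in metres, matching the figures above):

$f = 60                                     # focal length in mm
foreach ($turn in 0..6) {                   # each 30 degrees of turn moves the lens 1/3 mm
    $delta = $turn / 3
    if ($delta -eq 0) { "{0,3} degrees : infinity" -f ($turn * 30) }
    else {
        $D = ($f * $f / $delta + $f) / 1000     # D = (f²/Δ) + f, converted to metres
        "{0,3} degrees : {1:n2} m" -f ($turn * 30), $D
    }
}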

In the diagram above, the blue lines show the lens focused at a distance D, which gives a lens-to-image distance (d) of (f+Δ). Something further away than D will not come to a sharp focus at the image plane, but at some distance in front of it (something nearer than D will come to a focus behind the image plane). Focused rays of light form a cone: if the point of the cone is not on the image plane, the rays form a disc which is called "a circle of confusion". The red lines in the diagram illustrate the case for something at infinity and show how a smaller aperture width (bigger f/ number) leads to a smaller circle of confusion.
The only factors which determine the size of the circle that is formed are focal length, aperture, and the distance between the lens and the image (i.e. the distance at which the lens is focused) Two set-ups using the same focal length, and same aperture, focused at the same distance will produce the same size circle regardless of the size of the recording medium which captures the image, the size of the image circle produced by the lens or any other factor.

A point at infinity will form an image at a distance f behind the lens (that's the definition of focal length), so we know it forms an image Δ in front of the film/sensor in the set-up in the diagram.
The red lines form two similar triangles between the lens and the image. The “base” of the large one is w (the aperture width) and its "height" is f.
We normally write aperture as a ratio between width and focal length, e.g. f/2 means the aperture’s width is half the focal length.
So f = a*w (where a is the f/ number), which means this triangle has a base of w and a height of w*a.

The base of smaller triangle is the circle of confusion from the mis-focused point at infinity.
This circle’s diameter is normally written as c, so using similar triangles the height of the smaller triangle must be its base * a, so:
Δ = c * a

As the lens moves further away from the image, the circle for the point at infinity gets bigger: a small enough circle looks like a point, but there comes a size where we can see it is a circle.
If we know that size, we can calculate the value of Δ as c*a, and since we know that D = (f²/Δ) + f, we can define the subject distance at which a point at infinity starts to look out of focus as
(f²/ca) + f.
This distance is known as the hyperfocal distance (H). Strictly, H = (f²/ca) + f, but it is usually accurate enough to write H ≈ f²/ca.
Later we'll use a rearrangement of this: since Δ = ca, the simplified form of the equation can be turned into Δ ≈ f²/H.

[diagram: circles of confusion formed either side of the plane of sharp focus]

We can see that we get the same size circle if the image plane is c*a in front of where the image would focus as well as c*a behind it, so we can say:
(1) for a subject at distance D, the lens-to-image distance is approximately (f²/D) + f (more accurately it is (f²/(D−f)) + f), and
(2) the zone of apparent sharp focus runs from anything which would be in focus at (f²/D) + f − ca to anything which would be in focus at (f²/D) + f + ca.

This formula is accurate enough for most purposes, but it would be more accurate to say the range runs from ((f²/(D−f)) + f)*(f+ca)/f to ((f²/(D−f)) + f)*(f−ca)/f, because this accounts for Δ getting slightly bigger as d increases for nearer and nearer subjects. The error is biggest at short distances with wide apertures.
A 35mm frame with c = 0.03 and an aperture of f/32 gives c*a ≈ 1. If we focus a 50mm lens at 1m, (f²/D) + f = 52.5mm,
so the simple form of the formula says an image formed 51.5–53.5mm behind the lens is in the "in focus" zone; the long form gives 51.45–53.55mm.
So instead of the depth of field extending from 1.716m to 0.764m, it actually goes from 1.774m to 0.754m.
Since we only measure distance to 1 or 2 significant figures, aperture to 1 or 2 significant figures (and f/22 is really f/23), and focal length to the nearest whole mm (and stated focal length can be inaccurate by 1 or 2mm), the simple formula gives the point where most people kind-of feel that the image isn't really properly focused to enough accuracy.

It's also worth noting that if we have a focus scale like the one outlined above, the same distance either side of a focus mark gives the same Δ, so we can calculate Δ for each aperture mark and put depth of field scale marks on a lens.

∞    11.    5.5    3.7    2.8    2.2    1.9M 
^ | ^

If we want to work out depth of field numbers (e.g. to make our own tables), we know that the lens-to-image distance for the far point (df) is (f²/Df) + f, and for the near point (dn) it is (f²/Dn) + f;

therefore, f²/Df + f = f²/D + f − Δ (or + Δ for the near point).

We can remove + f from each side and get f²/Df = f²/D − Δ;

since Δ = f²/H, we can rewrite this as f²/Df = f²/D − f²/H;

the f² terms cancel out, so we get 1/Df = 1/D − 1/H for the far point, and for the near point 1/Dn = 1/D + 1/H;

we can rewrite these as 1/Df = (H−D)/(H*D) for the far point, and for the near point 1/Dn = (H+D)/(H*D), so

Df = HD/(H−D) for the far point, and for the near point Dn = HD/(H+D)

These produce an interesting series

Focus Distance (D)    Near Point (Dn)    Far Point (Df)
H                     H/2                ∞
H/2                   H/3                H
H/3                   H/4                H/2
H/4                   H/5                H/3
H/5                   H/6                H/4

In other words, if the focus distance is H/x the near point is H/(x+1) and the far point is H/(x-1).

[These formulae, Dn = H/((H/D)+1) and Df = H/((H/D)−1), can be rearranged to Dn = H/((H+D)/D) and Df = H/((H−D)/D), and then to Dn = HD/(H+D) and Df = HD/(H−D) – the original formulae.]

This can be useful for doing a quick mental depth of field calculation. A 50mm lens @ f/8 on full frame has a hyperfocal distance of roughly 10m (50²/(0.03*8) + 50 ≈ 10.47m). If I focus at 1m (roughly H/10), the near point is H/11 = 0.909m and the far point is H/9 = 1.111m, so I have roughly 9cm in front and 11cm behind.
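
For the same sort of quick check, here is a small PowerShell sketch of the three formulae (everything in millimetres; c defaults to 0.03 for full frame):

function Get-DepthOfField {
    param (
        [double]$FocalLength,          # f, in mm
        [double]$Aperture,             # the f/ number (a)
        [double]$Distance,             # focus distance D, in mm
        [double]$Circle = 0.03         # allowable circle of confusion c, in mm
    )
    $H    = ($FocalLength * $FocalLength) / ($Circle * $Aperture) + $FocalLength   # hyperfocal distance
    $near = ($H * $Distance) / ($H + $Distance)                                    # Dn = HD/(H+D)
    $far  = if ($Distance -lt $H) { ($H * $Distance) / ($H - $Distance) } else { [double]::PositiveInfinity }   # Df = HD/(H-D)
    [pscustomobject]@{ Hyperfocal = $H; NearPoint = $near; FarPoint = $far }
}
Get-DepthOfField -FocalLength 50 -Aperture 8 -Distance 1000    # compare with the rough calculation above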

Earlier I said "As the lens moves further away from the image, the circle gets bigger: a small enough circle looks like a point but there comes a size where it starts looking like a circle. If we know that size…"

How much of the image the circle occupies, determines whether it is judged to be still in focus, or a long way out of focus. So value for c must be proportional to the size of the image, after any cropping has been done.

By convention 35mm film used c = 0.03mm and APS-C crop-sensor cameras use c = 0.02mm. Changing image (sensor) size changes the allowable circle size c, and so changes Δ, and so the depth of field scale on a lens designed for one size of image needs to be adjusted if it is used on a camera where the image is a different size (on an APS-C camera, reading the scale for 1 stop wider aperture than actually set will give roughly the right reading).

The size of the circle formed does not depend on image size, but the allowable circle size does, and hyperfocal distance and apparent depth of field change when c changes.

Changing sensor size (keeping same position with the same lens and accepting a change of framing).

If we use two different cameras – i.e. use a different circle size – at the same spot, focused on the same place and we use the same focal length and same aperture on both, then the one with the smaller image has less depth of field. It doesn’t matter how we get to the smaller image, whether it is by cropping a big one or starting with a smaller film/sensor size.

We get less depth of field because c has become smaller, so f²/ca – the hyperfocal distance – has moved further away. When you look at f²/ca, a smaller value of c needs a larger value of a to compensate.

Changing sensor size and focal length (getting the same framing from same position)

If we use two cameras – with different circle size – and use different focal lengths to give the same angle of view, but keep the same aperture then the larger image will have less depth of field because the f and c have gone up by the same factor , but f is squared in the equation. A larger value of a is needed to compensate for f being squared.

So: a 50mm @ f/8 on full frame has approximately the field of view and depth of field of a 35mm @ f/5.6 on APS-C. If that's what we want, the full frame camera needs to use a slower shutter speed or higher ISO to compensate, which have their own side effects.

If we want the depth that comes from the 35mm @ f/32 on APS-C , the 50 might not stop down to f/44 to give the same depth on Full Frame.

But if we use the 50 @ f/1.4 to isolate a subject from the background on full frame the 35 probably doesn’t open up to f/1

Changing focal length and camera position

People often think of perspective as a function of the angle of view of the lens. Strictly that isn't correct: perspective is a function of the ratios of subject-to-camera distances. If you have two items the same size with one a meter behind the other and you stand a meter from the nearer one, the far one is 2m away and will appear 1/2 the size. If you stand 10 meters from the first (and therefore 11 meters from the second), the far object will appear 10/11ths of the size. It doesn't matter what else is in the frame. But if you fit a wider lens the natural response is to move closer to the subject: it is that change of viewpoint which causes the change of perspective. Changing focal length and keeping position constant means the perspective is constant and the framing changes. Changing focal length and keeping framing constant means a change of position, and with it a change of perspective.

If you have two lenses for the same camera and a choice between standing close with a wide angle lens or further away with a telephoto (and accepting the change of perspective for the same framing) we can work out the distances.

Let’s say with the short lens, H is 10 and you stand 5 meters away.

The near point is (10 * 5) / (10+5) = 3.33 : 1.67 meters in front

The far point is (10 * 5) / (10-5) = 10 : 5 meters behind = 6.67 in total

If we double the focal length and stand twice as far away the hyperfocal distance increases 4 fold (if the circle size and aperture don’t change), so we get a d.o.f zone like this

(40*10) / (40+10) = 8 : 2 meters in front

(40*10) / (40-10) = 13.33: 3.33 meters behind =5.33 in total.

Notice the background is more out of focus with the long lens, but there is actually MORE in focus in front of the subject. The wider lens includes more "stuff" in the background and it is sharper – which is why long lenses are thought of as better at isolating a subject from the background.

Changing camera position and sensor size.

If you only have one lens and your choice is to move further away and crop the image (or use a smaller sensor), or come close and use a bigger image, we can calculate that too. Keeping the full image / close position as the first case from the previous example, we would keep the near point 1.67 meters in front and the far point 5 meters behind = 6.67 in total.

If we use half the sensor width, we halve c and double H, if we double the distances we have doubled every term in the equation.

(20*10) / (20+10) = 6.6667 : 3.33 meters (in front)

(20*10) / (20-10) = 20 : 10 meters (behind) – 13.33 meters in total, so you get twice as much in the zone either side of the subject by dropping back and cropping.

June 30, 2012

Using the Windows index to search from PowerShell:Part Two – Helping with user input

Filed under: Uncategorized — jamesone111 @ 10:46 am

Note: this was originally written for the Hey, Scripting Guy! blog, where it appeared as the 26 June 2012 episode. The code is available for download. I have some more index-related posts coming up, so I wanted to make sure everything was in one place.


In part one I developed a working PowerShell function to query the Windows index. It outputs data rows, which isn't the ideal behaviour, and I'll address that in part three; in this part I'll address another drawback: search terms passed as parameters to the function must be "SQL-ready". I think that makes for a bad user experience, so I'm going to look at the half dozen bits of logic I added to allow my function to process input which is a little more human. Regular expressions are the way to recognize text which must be changed, and I'll pay particular attention to those as I know a lot of people find them daunting.

Replace * with %

SQL statements use % for a wildcard, but selecting files at the command prompt traditionally uses *. It's a simple matter to replace – but for the need to "escape" the * character, replacing * with % would be as simple as a -replace statement gets:
$Filter = $Filter -replace "\*","%"
For some reason I'm never sure if the camera maker is Canon or Cannon, so I'd rather search for Can*… or rather Can%, and that replace operation will turn "CameraManufacturer=Can*" into "CameraManufacturer=Can%". It's worth noting that -replace is just as happy to process an array of strings in $Filter as it is to process one.

Searching for a term across all fields uses "CONTAINS (*,'Stingray')", and if the -replace operation changes * to % inside a CONTAINS() the result is no longer a valid SQL statement. So the regular expression needs to be a little more sophisticated, using a "negative look behind":
$Filter = $Filter -replace "(?<!\(\s*)\*","%"

In order to filter out cases like CONTAINS(*… , the new regular expression qualifies "match on *" with a look behind – "(?<!\(\s*)" – which says "only if it isn't immediately preceded by an opening bracket and any spaces". In regular expression syntax, (?=x) says "look ahead for x", (?<=x) says "look behind for x", (?!x) is "look ahead for anything EXCEPT x" and (?<!x) is "look behind for anything EXCEPT x"; these will see a lot of use in this function. Here (?<! ) is being used; the opening bracket needs to be escaped, so it is written as \( and \s* means 0 or more spaces.
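
A couple of quick tests show the effect – the second string is left alone because its * follows an opening bracket:

"CameraManufacturer=Can*"  -replace "(?<!\(\s*)\*","%"     # -> CameraManufacturer=Can%
"CONTAINS (*,'Stingray')"  -replace "(?<!\(\s*)\*","%"     # unchanged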

Convert "orphan" search terms into ‘contains’ conditions.

A term that needs to be wrapped as a "CONTAINS" search can be identified by the absence of quote marks, = , < or > signs or the LIKE, CONTAINS or FREETEXT search predicates. When these are present the search term is left alone, otherwise it goes into CONTAINS, like this.
$filter = ($filter | ForEach-Object {
    if   ($_ -match "'|=|<|>|like|contains|freetext") {$_}
    else {"Contains(*,'$_')"}
})
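
A quick check of the wrapping logic (reusing the block above) shows a bare term being wrapped while an existing condition passes straight through:

"stingray", "CameraManufacturer=Can%" | ForEach-Object {
    if   ($_ -match "'|=|<|>|like|contains|freetext") {$_}
    else {"Contains(*,'$_')"}
}
# gives:  Contains(*,'stingray')
#         CameraManufacturer=Can%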

Put quotes in if the user omits them.

The next thing I check for is omitted quote marks. I said I wanted to be able to use Can*, and we've seen it changed to Can%, but the search term needs to be transformed into "CameraManufacturer='Can%'". Here is a -replace operation to do that:
$Filter = $Filter -replace "\s*(=|<|>|like)\s*([^'\d][^\s']*)$",' $1 ''$2'' '
This is a more complex regular expression which takes a few moments to understand

Regular expression: \s*(=|<|>|like)\s*([^'\d][^\s']*)$   applied to "CameraManufacturer=Can%"

Part                 Meaning                                                      Application
\s* … \s*            Any spaces (or none) either side of the operator
(=|<|>|like)         = or < or > or "Like"                                        the "=" in CameraManufacturer=Can%
[^'\d]               Anything which is NOT a ' character or a digit               the "C" of Can%
[^\s']*              Any number of non-quote, non-space characters (or none)      the "an%" of Can%
$                    End of line
( … )( … )           Capture the enclosed sections as matches                     $Matches[0] = "=Can%"
                                                                                  $Matches[1] = "="
                                                                                  $Matches[2] = "Can%"

Replacement ' $1 ''$2'' '   Replaces $Matches[0] ("=Can%") with an expression     = 'Can%'
                            built from the two submatches "=" and "Can%"

Note that the expression which is being inserted uses $1 and $2 to mean $Matches[1] and $Matches[2] – if this is wrapped in double quote marks, PowerShell will try to evaluate these terms before they get to the regex handler, so the replacement string must be wrapped in single quotes. But the desired replacement text contains single quote marks, so they need to be doubled up.

Replace ‘=’ with ‘like’ for Wildcards

So far, =Can* has become =’Can%’, which is good, but SQL needs "LIKE" instead of "=" to evaluate a wildcard. So the next operation converts "CameraManufacturer = ‘Can%’ "into "CameraManufacturer LIKE ‘Can%’ ":
$Filter = $Filter -replace "\s*=\s*(?='.+%'\s*$)" ," LIKE "

Regular expression: \s*=\s*(?='.+%'\s*$)   applied to "CameraManufacturer = 'Can%'"

Part                 Meaning                                                      Application
\s*=\s*              An = sign surrounded by any spaces (or none)                 the " = "
'                    A quote character                                            the opening ' of 'Can%'
.+                   Any characters (at least one)                                Can
%'                   A % character followed by '                                  the end of 'Can%'
\s*$                 Any spaces (or none), followed by end of line
(?= … )              Look ahead for the enclosed expression,                      $Matches[0] = "="
                     but don't include it in the match                            (but only if 'Can%' is present)

Provide Aliases

The steps above reconstruct "WHERE" terms to build syntactically correct SQL, but what if I get confused and enter "CameraMaker" instead of "CameraManufacturer", or "Keyword" instead of "Keywords"? I need aliases – and they should work anywhere in the SQL statement, not just in the "WHERE" clause but in "ORDER BY" as well.
I defined a hash table (a.k.a. a "dictionary", or an "associative array") near the top of the script to act as a single place to store the aliases with their associated full canonical names, like this:
$PropertyAliases = @{
    Width       = "System.Image.HorizontalSize";
    Height      = "System.Image.VerticalSize";
    Name        = "System.FileName";
    Extension   = "System.FileExtension";
    Keyword     = "System.Keywords";
    CameraMaker = "System.Photo.CameraManufacturer"
}
Later in the script, once the SQL statement is built, a loop runs through the aliases replacing each with its canonical name:
$PropertyAliases.Keys | ForEach-Object {
    $SQL= $SQL -replace "(?<=\s)$($_)(?=\s*(=|>|<|,|Like))",$PropertyAliases[$_]
}
A hash table has .Keys and .Values properties, which return what is on the left and right of the equals signs respectively. $hashTable.keyName or $hashtable[keyName] will return the value, so $_ will start by taking the value "Width", and its replacement will be $PropertyAliases["Width"], which is "System.Image.HorizontalSize"; on the next pass through the loop "Height" is replaced, and so on. To ensure it matches on a field name and not on text being searched for, the regular expression stipulates the name must be preceded by a space and followed by "=" or "like" and so on.

Regular expression: (?<=\s)Width(?=\s*(=|>|<|,|Like))   applied to "WHERE Width > 1024"

Part                 Meaning                                                      Application
Width                The literal text "Width"                                     Width > 1024
\s (in look behind)  A space
(?<= … )             Look behind for the enclosed expression,                     $Matches[0] = "Width"
                     but don't include it in the match                            (but only if a leading space is present)
\s*                  Any spaces (or none)
(=|>|<|,|Like)       The literal text "Like", or any of the characters            Width > 1024
                     comma, equals, greater than or less than
(?= … )              Look ahead for the enclosed expression,                      $Matches[0] = "Width"
                     but don't include it in the match                            (but only if " >" is present)
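
For example, with the alias table defined earlier, a WHERE clause using the short names is rewritten like this (a sketch):

$SQL = "SELECT System.ItemUrl FROM SystemIndex WHERE Width > 1024 AND CameraMaker Like 'Can%'"
$PropertyAliases.Keys | ForEach-Object {
    $SQL = $SQL -replace "(?<=\s)$($_)(?=\s*(=|>|<|,|Like))",$PropertyAliases[$_]
}
$SQL
# SELECT System.ItemUrl FROM SystemIndex
#        WHERE System.Image.HorizontalSize > 1024 AND System.Photo.CameraManufacturer Like 'Can%'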

If the prefix is omitted put the correct one in.

This builds on the ideas we’ve seen already. I want the list of fields and prefixes to be easy to maintain, so just after I define my aliases I define a list of field types
$FieldTypes = "System","Photo","Image","Music","Media","RecordedTv","Search"
For each type I define two variables, a prefix and a fields list: the names must be FieldtypePREFIX and FieldtypeFIELDS – the reason for this will become clear shortly, but here is what they look like
$SystemPrefix = "System."
$SystemFields = "ItemName|ItemUrl"
$PhotoPrefix  = "System.Photo."
$PhotoFields  = "cameramodel|cameramanufacturer|orientation"
In practice the field lists are much longer – the System list contains 25 field names, not just the two shown here. The lists are written with "|" between the names so they become a regular expression meaning "ItemName or ItemUrl or …". The following code runs after the aliases have been processed:
foreach ($type in $FieldTypes) {
   $fields = (get-variable "$($type)Fields").value
   $prefix = (get-variable "$($type)Prefix").value 
   $sql    = $sql -replace "(?<=\s)(?=($Fields)\s*(=|>|<|,|Like))" , $Prefix
}
I can save repeating code by using Get-Variable in a loop to get $systemFields, $photoFields and so on, and if I want to add one more field, or a whole type I only need to change the variable declarations at the start of the script. The regular expression in the replace works like this:

The regular expression (?<=\s)(?=(cameramanufacturer|orientation)\s*(=|>|<|,|Like)) breaks down like this:

  • (?<=\s) : look behind for a space but don't include it in the match
  • (cameramanufacturer|orientation) : the literal text "cameramanufacturer" or "orientation" – as in CameraManufacturer LIKE 'Can%'
  • \s* : any spaces (or none)
  • (=|>|<|,|Like) : the literal text "Like", or any of the characters comma, equals, greater-than or less-than – the "LIKE" in CameraManufacturer LIKE 'Can%'
  • (?= ) : look ahead for the enclosed expression but don't include it in the match – $Matches[0] is the point between the leading space and "CameraManufacturer LIKE", and includes neither

We get the effect of an "insert" operator by using ‑replace with a regular expression that finds a place in the text but doesn’t select any of it.
This part of the function allows "CameraManufacturer LIKE 'Can%'" to become "System.Photo.CameraManufacturer LIKE 'Can%'" in a WHERE clause.
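For example, with sample text at the prompt (note the leading space, which the look-behind needs):

" CameraManufacturer LIKE 'Can%'" -replace "(?<=\s)(?=(cameramanufacturer|orientation)\s*(=|>|<|,|Like))", "System.Photo."
# returns:  System.Photo.CameraManufacturer LIKE 'Can%'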
I also wanted "CameraManufacturer" in an ORDER BY clause to become "System.Photo.CameraManufacturer". Very sharp-eyed readers may have noticed that I look for a comma after the field name as well as <, >, = and LIKE. I modified the code which appeared in part one so that when an ORDER BY clause is inserted it is followed by a trailing comma, like this:
if ($orderby) { $sql += " ORDER BY " + ($OrderBy -join " , " ) + ","}

The new version will work with this regular expression, but the extra comma will cause a SQL error, so it must be removed later.
When I introduced the SQL I said the SELECT statement looks like this:

SELECT System.ItemName, System.ItemUrl,      System.FileExtension, System.FileName, System.FileAttributes, System.FileOwner, 
       System.ItemType, System.ItemTypeText , System.KindText,     System.Kind,     System.MIMEType,       System.Size

Building this clause from the field lists simplifies code maintenance, and as a bonus anything declared in the field lists will be retrieved by the query as well as accepted as input by its short name. The SELECT clause is prepared like this:
if ($First) 
     {$SQL = "SELECT TOP $First "}
else {$SQL = "SELECT "}
foreach ($type in $FieldTypes)
     {$SQL +=((get-variable "$($type)Fields").value -replace "\|",", " ) + ", "}

This replaces the "|" with a comma and puts a comma after each set of fields. This means there is a comma between the last field and the FROM – which allows the regular expression to recognise field names, but it will break the SQL, so it is removed after the prefixes have been inserted (just like the one for ORDER BY).
This might seem inefficient, but when I checked the time it took to run the function and get the results but not output them it was typically about 0.05 seconds (50ms) on my laptop – it takes more time to output the results.
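Measure-Command gives a rough way to check the timing on your own machine (the filter here is only an example):

Measure-Command { Get-IndexedItem -Filter "CameraManufacturer = Can*" } | Select-Object TotalSeconds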
Combining all the bits in this part with the bits in part one turns my 36 line function into about a 60 line one as follows

Function Get-IndexedItem{
Param ( [Alias("Where","Include")][String[]]$Filter ,
        [Alias("Sort")][String[]]$OrderBy,
        [Alias("Top")][String[]]$First,
        [String]$Path,
        [Switch]$Recurse 
      )
$PropertyAliases = @{Width ="System.Image.HorizontalSize"; 
                    Height = "System.Image.VerticalSize"}
$FieldTypes      = "System","Photo"
$PhotoPrefix     = "System.Photo."
$PhotoFields     = "cameramodel|cameramanufacturer|orientation"
$SystemPrefix    = "System."
$SystemFields    = "ItemName|ItemUrl|FileExtension|FileName"
if ($First) 
     {$SQL = "SELECT TOP $First "}
else {$SQL = "SELECT "}
foreach ($type in $FieldTypes)
     {$SQL +=((get-variable "$($type)Fields").value -replace "\|",", ")+", " }
if ($Path -match "\\\\([^\\]+)\\.")
     {$SQL += " FROM $($matches[1]).SYSTEMINDEX WHERE "}
else {$SQL += " FROM SYSTEMINDEX WHERE "}
if ($Filter)
     {$Filter = $Filter -replace "\*","%"
      $Filter = $Filter -replace"\s*(=|<|>|like)\s*([^'\d][^\s']*)$",' $1 ''$2'' '
      $Filter = $Filter -replace "\s*=\s*(?='.+%'\s*$)" ," LIKE "
      $Filter = ($Filter | ForEach-Object {
          if ($_ -match "'|=|<|>|like|contains|freetext")
               {$_}
          else {"Contains(*,'$_')"}
      })
      $SQL += $Filter -join " AND "
    }
if ($Path)
    {if ($Path -notmatch "\w{4}:") {$Path = "file:" + $Path}
     $Path = $Path -replace "\\","/"
     if ($SQL -notmatch "WHERE\s$") {$SQL += " AND " }
     if ($Recurse) 
          {$SQL += " SCOPE = '$Path' "}
     else {$SQL += " DIRECTORY = '$Path' "}
}
if ($SQL -match "WHERE\s*$")
     { Write-warning "You need to specify either a path , or a filter." ; return }
if ($OrderBy) { $SQL += " ORDER BY " + ($OrderBy -join " , " ) + ","}
$PropertyAliases.Keys | ForEach-Object {
     $SQL = $SQL -replace "(?<=\s)$($_)(?=\s*(=|>|<|,|Like))", $PropertyAliases.$_ }
foreach ($type in $FieldTypes)
    {$fields = (get-variable "$($type)Fields").value
     $prefix = (get-variable "$($type)Prefix").value
     $SQL    = $SQL -replace "(?<=\s)(?=($Fields)\s*(=|>|<|,|Like))" , $Prefix
    }
$SQL = $SQL -replace "\s*,\s*FROM\s+" , " FROM "
$SQL = $SQL -replace "\s*,\s*$" , ""
$Provider="Provider=Search.CollatorDSO;"+ "Extended Properties=’Application=Windows’;"
$Adapter = new-object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$DS     = new-object system.data.dataset
if ($Adapter.Fill($DS)) { $DS.Tables[0] }
}

In part 3 I'll finish the function by turning my attention to output.

Using the Windows index to search from PowerShell: Part one: Building a query from user input.

Filed under: Uncategorized — jamesone111 @ 10:43 am

Note: this was originally written for the Hey, Scripting Guy blog where it appeared as the 25 June 2012 episode. The code is available for download. I have some more index-related posts coming up, so I wanted to make sure everything was in one place.


I've spent some time developing and honing a PowerShell function that gets information from the Windows Index – the technology behind the search that is integrated into Explorer in Windows 7 and Vista. The Index can be queried using SQL and my function builds the SQL query from user input, executes it and receives rows of data for all the matching items. In Part three, I'll look at why rows of data aren't the best thing for the function to return and what the alternatives might be. Part two will look at making user input easier – I don't want to make an understanding of SQL a prerequisite for using the function. In this part I'm going to explore the query process.

We'll look at how the query is built in a moment; for now please accept that a ready-to-run query is stored in the variable $SQL. Then it only takes a few lines of PowerShell to prepare and run the query:

$Provider="Provider=Search.CollatorDSO;Extended Properties=’Application=Windows’;"
$adapter = new-object system.data.oledb.oleDBDataadapter -argument $sql, $Provider
$ds      = new-object system.data.dataset
if ($adapter.Fill($ds)) { $ds.Tables[0] }

The data is fetched using oleDBDataAdapter and DataSet objects; the adapter is created specifying a "provider" which says where the data will come from and a SQL statement which says what is being requested. The query is run when the adapter is told to fill the dataset. The .fill() method returns a number, indicating how many data rows were returned by the query – if this is non-zero, my function returns the first table in the dataset. PowerShell sees each data row in the table as a separate object; and these objects have a property for each of the table’s columns, so a search might return something like this:

SYSTEM.ITEMNAME : DIVE_1771+.JPG
SYSTEM.ITEMURL : file:C:/Users/James/pictures/DIVE_1771+.JPG
SYSTEM.FILEEXTENSION : .JPG
SYSTEM.FILENAME : DIVE_1771+.JPG
SYSTEM.FILEATTRIBUTES : 32
SYSTEM.FILEOWNER : Inspiron\James
SYSTEM.ITEMTYPE : .JPG
SYSTEM.ITEMTYPETEXT : JPEG Image
SYSTEM.KINDTEXT : Picture
SYSTEM.KIND : {picture}
SYSTEM.MIMETYPE : image/jpeg
SYSTEM.SIZE : 971413

There are lots of fields to choose from, so the list might be longer. The SQL query to produce it looks something like this.

SELECT System.ItemName, System.ItemUrl,        System.FileExtension,
       System.FileName, System.FileAttributes, System.FileOwner, 
       System.ItemType, System.ItemTypeText ,  System.KindText, 
       System.Kind,     System.MIMEType,       System.Size
FROM   SYSTEMINDEX
WHERE  System.Keywords = 'portfolio' AND Contains(*,'stingray')

In the finished version of the function, the SELECT clause has 60 or so fields; the FROM and WHERE clauses might be more complicated than in the example and an ORDER BY clause might be used to sort the data.
The clauses are built using parameters which are declared in my function like this:

Param ( [Alias("Where","Include")][String[]]$Filter ,
        [Alias("Sort")][String[]]$orderby,
        [Alias("Top")][String[]]$First,
        [String]$Path,
        [Switch]$Recurse
)

In my functions I try to use names already used in PowerShell, so here I use -Filter and -First but I also define aliases for SQL terms like WHERE and TOP. These parameters build into the complete SQL statement, starting with the SELECT clause which uses -First

if ($First) {$SQL = "SELECT TOP $First "}
else        {$SQL = "SELECT "}
$SQL += " System.ItemName, System.ItemUrl " # and the other 58 fields

If the user specifies –First 1 then $SQL will be "SELECT TOP 1 fields"; otherwise it’s just "SELECT fields". After the fields are added to $SQL, the function adds a FROM clause. Windows Search can interrogate remote computers, so if the -path parameter is a UNC name in the form \\computerName\shareName the SQL FROM clause becomes FROM computerName.SYSTEMINDEX otherwise it is FROM SYSTEMINDEX to search the local machine.
A regular expression can recognise a UNC name and pick out the computer name, like this:

if ($Path -match "\\\\([^\\]+)\\.") {
$sql += " FROM $($matches[1]).SYSTEMINDEX WHERE "
}
else {$sql += " FROM SYSTEMINDEX WHERE "}

The regular expression in the first line of the example breaks down like this

  • \\\\ : two \ characters – "\" is the escape character, so each one needs to be written as \\ ; these match the leading \\ in \\computerName\shareName
  • [^\\]+ : any non-\ character, repeated at least once – the computerName part
  • \\. : a \, followed by any character – the "\s" at the start of shareName
  • ( ) : capture the section which is enclosed by the brackets as a match – so $matches[0] is "\\computerName\s" and $matches[1] is "computerName"
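A quick test at the prompt (with a made-up server name) shows the capture working:

"\\myServer\photos\2012" -match "\\\\([^\\]+)\\."     # returns True
$matches[1]                                           # returns myServer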

I allow the function to take different parts of the WHERE clause as a comma separated list, so that
-filter "System.Keywords = 'portfolio'","Contains(*,'stingray')"
is equivalent to
-filter "System.Keywords = 'portfolio' AND Contains(*,'stingray')"

Adding the filter just needs this:

if ($Filter) { $SQL += $Filter -join " AND "}

The folders searched can be restricted. A "SCOPE" term limits the query to a folder and all of its subfolders, and a "DIRECTORY" term limits it to the folder without subfolders. If the request is going to a remote server the index is smart enough to recognise a UNC path and return just the files which are accessible via that path. If a -Path parameter is specified, the function extends the WHERE clause, and the –Recurse switch determines whether to use SCOPE or DIRECTORY, like this:

if ($Path){
     if ($Path -notmatch "\w{4}:") {
           $Path = "file:" + (resolve-path -path $Path).providerPath
     }
     if ($sql -notmatch "WHERE\s*$") {$sql += " AND " }
     if ($Recurse)                   {$sql += " SCOPE = '$Path' " }
      else                           {$sql += " DIRECTORY = '$Path' "}
}

In these SQL statements, paths are specified in the form file:c:/users/james, which isn't how we normally write them (and the way I recognise UNC names won't work if they are written as file://ComputerName/shareName). This is rectified by the first line inside the if ($Path) {} block, which checks for 4 "word" characters followed by a colon. Doing this will prevent "file:" being inserted if any protocol has been specified – the same search syntax works against HTTP:// (though not usually when searching on your workstation), MAPI:// (for Outlook items) and OneIndex14:// (for OneNote items). If a file path has been given I ensure it is an absolute one – the need to support UNC paths forces the use of .ProviderPath here. It turns out there is no need to convert \ characters in the path to /, provided the file: is included.
After taking care of that, the operation -notmatch "WHERE\s*$" sees to it that an "AND" is added if there is anything other than spaces between WHERE and the end of the line (in other words, if any conditions specified by –Filter have been inserted). If neither -Path nor -Filter was specified there will be a dangling WHERE at the end of the SQL statement. Initially I removed this with a ‑Replace, but then I decided that I didn't want the function to respond to a lack of input by returning the whole index, so I changed it to write a warning and exit. With the WHERE clause completed, the final clause in the SQL statement is ORDER BY, which – like WHERE – joins up a multi-part condition.

if ($sql -match "WHERE\s*$") {
     Write-warning "You need to specify either a path, or a filter."
     Return
}
if ($orderby) { $sql += " ORDER BY " + ($OrderBy -join " , ") }

When the whole function is put together it takes 3 dozen lines of PowerShell to handle the parameters, build and run the query and return the result. Put together they look like this:

Function Get-IndexedItem{
Param ( [Alias("Where","Include")][String[]]$Filter ,
         [Alias("Sort")][String[]]$OrderBy,
         [Alias("Top")][String[]]$First,
         [String]$Path,
         [Switch]$Recurse
)
if ($First) {$SQL = "SELECT TOP $First "}
else        {$SQL = "SELECT "}
$SQL += " System.ItemName, System.ItemUrl " # and the other 58 fields
if ($Path -match "\\\\([^\\]+)\\.") {
              $SQL += "FROM $($matches[1]).SYSTEMINDEX WHERE "
}
else         {$SQL += " FROM SYSTEMINDEX WHERE "}
if ($Filter) {$SQL += $Filter -join " AND "}
if ($Path) {
    if ($Path -notmatch "\w{4}:") {$Path = "file:" + $Path}
    $Path = $Path -replace "\\","/"
    if ($SQL -notmatch "WHERE\s*$") {$SQL += " AND " }
    if ($Recurse)                   {$SQL += " SCOPE = '$Path' " }
    else                            {$SQL += " DIRECTORY = '$Path' "}
}
if ($SQL -match "WHERE\s*$") {
    Write-Warning "You need to specify either a path or a filter."
    Return
}
if ($OrderBy) { $SQL += " ORDER BY " + ($OrderBy -join " , " ) }
$Provider = "Provider=Search.CollatorDSO;Extended Properties=’Application=Windows’;"
$Adapter  = New-Object system.data.oledb.oleDBDataadapter -argument $SQL, $Provider
$DS       = New-Object system.data.dataset
if ($Adapter.Fill($DS)) { $DS.Tables[0] }
}

The -Path parameter is more user-friendly as a result of the way I handle it, but I've made it a general rule that you shouldn't expect the user to know too much of the underlying syntax, and at the moment the function requires too much knowledge of SQL: I don't want to type

Get-IndexedItem -Filter "Contains(*,'Stingray')", "System.Photo.CameraManufacturer Like 'Can%'"

and it seems unreasonable to expect anyone else to do so. I came up with this list of things the function should do for me.

  • Don't require the user to know whether a search term is prefixed with SYSTEM., SYSTEM.DOCUMENT., SYSTEM.IMAGE. or SYSTEM.PHOTO. If the prefix is omitted, put the correct one in.
  • Even without the prefixes, some field names are awkward – for example "HorizontalSize" and "VerticalSize" instead of width and height. Provide aliases.
  • Literal text in searches needs to be enclosed in single quotes; insert quotes if the user omits them.
  • A free text search over all fields is written as Contains(*,'searchTerm'); convert "orphan" search terms into Contains conditions.
  • SQL uses % (not *) for a wildcard – replace * with % in filters to cope with users putting in the familiar *.
  • SQL requires the LIKE predicate (not =) for wildcards: replace = with LIKE for wildcards.

In Part two, I’ll look at how I do those things.

June 25, 2012

Lonesome George

Filed under: Uncategorized — jamesone111 @ 3:15 pm

At Easter I was in the Galapagos Islands; work had taken me to Ecuador and diving in the Galapagos was too good an opportunity to miss. Mainland Ecuador was a country I knew little about, and two weeks working in the capital (Quito, just south of the Equator and 10,000 feet up in the Andes) doesn't qualify me as an expert. The client there was a good one to work with, and what I saw of the city (a bunch of taxi rides and a bus tour on the one day we weren't working) means I'd go back if asked. Travel wasn't good, and the return flights so bad that I've vowed never to fly with Iberia again. Flying to the islands the plane had a problem which meant if it landed it couldn't take off again, so having got within sight of the islands we had to go all the way back to Quito and get another plane. Down at sea level the heat was ferocious, the transportation scary and the insect bites the worst I've had. But the diving… different kinds of sharks (including a close encounter with a group of Hammerheads), sea lions, turtles, rays (including a Manta encounter on the very first dive which set the tone) – I'd put up with a lot for that. And if some search engine has steered you here, I dived with Scuba Iguana, and if I manage to go back I'll dive with them again.

The Scuba place was pretty much next door to the Darwin station: home of giant tortoises and a tortoise breeding programme. Galapagos comes from the Spanish word for saddle, because the shape of the giant tortoise's shell looked like a traditional saddle. I also learnt that some languages – including Spanish – don't have distinct words for Tortoise and (Marine) Turtle.  The sex of tortoises is determined by the temperature at which the eggs incubate, and the breeding programme gathers eggs, incubates them to get extra females, and looks after the baby tortoises, keeping them safe from human-introduced species (like rats) which feed on eggs and baby tortoises. Each island's tortoises are different, so the eggs and hatchlings are marked up so they go back to the right island. But there is no breeding programme for Pinta island (also named Abingdon island by the Royal Navy. According to a story told by Stephen Fry on QI, sailors ate giant tortoises and found them very good.)  A giant tortoise was found on Pinta, but a search over several years failed to find a second. So he – Lonesome George – was brought to the Darwin station in the 1970s. No one knows for sure how old he was then. All efforts to find a mate for him failed, so George lived out his final decades as the only known example of Geochelone nigra abingdoni, the Pinta Galapagos tortoise.
On Easter Sunday I walked up to see George and the giants from other islands who live at the station. George was keeping out of the sun; he shared an enclosure and I wondered what he made of the other species – if somewhere in that ancient reptile brain lurked a memory of others just like him, a template into which the other tortoises didn't quite fit.

Later in the trip I was asked to help with some work on a survey being prepared about Hammerhead sharks. I was told they were estimated as having a 20% chance of becoming extinct in the next 100 years. This statistic is quite hard to digest: my chances of being extinct in 100 years are close to 100%, so my contribution to the survey was to suggest telling people that, if things continue as they are, the chances of seeing Hammerheads on a dive in 5 years will be X amount less than today. It's not fair, but we care more about some species than others, and I hope there will still be Hammerheads for my children to see in a few years. Sadly they won't get the chance to see the Pinta tortoise.

June 22, 2012

Windows Phone 8. Some Known Knowns, and Known Unknowns

Filed under: Uncategorized — jamesone111 @ 10:23 am

Earlier this week Microsoft held its Windows Phone Summit where it made a set of announcements about the next generation of Windows Phone – Windows Phone 8. In summary these were:

  • Hardware: support for multi-core processors, additional screen resolutions, removable memory cards, and NFC.
  • Software: the Windows 8 core is now the base OS, support for native code written in C or C++ (meaning better games), IE 10, new mapping (from Nokia), a speech API which apps can use, better business-oriented features, VOIP support, a new "Wallet" experience and in-app payments, and a new start screen.

This summit came hot on the heels of the Surface tablet launch, which seemed to be a decision by Microsoft that making hardware for Windows 8 was too important to be left to the hardware makers. The first thing I noted about the phone announcement was the lack of a Microsoft-branded phone. I've said that Microsoft should make phones itself since before the first Windows Mobile devices appeared – when Microsoft was talking about "Stinger" internally; I never really bought any of the reasons given for not doing so. But I'd be astounded if Nokia didn't demand that Microsoft promise not to make a phone (whether there's a binding agreement or just an understanding between Messrs Elop and Ballmer we don't know). Besides Nokia, Microsoft has 3 other device makers on board: HTC have made devices for every Microsoft mobile OS since 2000, but also have a slew of devices for Android; Samsung were a launch partner for Phone 7 but since then have done more with Android; LG were in the line-up for the Windows Phone 7 launch and are replaced by Huawei. What these 3 feel about Nokia mapping technology is a matter of guesswork, but depends on branding and licensing terms.

There are some things we think we know, but actually they are things we know that we don’t know.

  • Existing 7.x phones will not run Windows Phone 8 but will get an upgrade to Windows Phone 7.8. I have an HTC Trophy which I bought in November 2010; it has gone from 7.0 to 7.5 and I'll be quite happy to get 7.8 on a 2 year old phone. Someone who just bought a Nokia Lumia might not feel quite so pleased. What will be in 7.8? The new start screen has been shown. But will it have IE10? Will it have the new mapping and speech capabilities? The Wallet, in-app payments? This matters because…
  • Programs specifically targeting Windows Phone 8 won't run on 7. Well, doh! Programs targeting Windows 7 don't run on XP. But what programs will need to target the new OS? Phone 7 programs are .NET programs, and since .NET compiles to an intermediate language not to CPU instructions, a program which runs on Windows 8-RT (previously called Windows on ARM) should go straight onto a Windows 8-Intel machine (but not vice versa), and Phone 7 programs will run on Phone 8. An intriguing comment at the launch says the Phone 8 emulator runs on Hyper-V; unless Hyper-V has started translating between different CPU instruction sets this means the emulated phone has an Intel CPU, but it doesn't matter because it is running .NET intermediate language not binary machine code. So how many new programs will be 8-specific? Say Skype uses the VOIP support and in-app payments for calling credit. Will users with old phones be stuck with the current version of Skype? Do they get a new version where those features don't light up? Or do they get the same experience as someone with a new phone? If the only things which are Phone 8 specific are apps which need multiple cores (or other newly supported hardware) there would never have been any benefit in upgrading a single-core phone from 7 to 8.
  • Business support. This covers many possibilities, but what is in and what is out? Will the encryption send keys back to base as BitLocker does? Will there be support for DirectAccess? Corporate wireless security? Will adding Exchange server details auto-configure the phone to corporate settings (like the corporate app store)? Will it be possible to block updates? This list goes on and on.

It's going to be interesting to see what this does for Microsoft and for Nokia's turn-round.

May 11, 2012

Lies, damn lies and public sector pensions

Filed under: Uncategorized — jamesone111 @ 9:08 am

Every time I hear that "public sector workers" are protesting about government plans for their pensions – as happened yesterday – I think of two points I made to fellow school governors when the teachers took a day off (with the knock-on cost to parents and the economy). These were:

  • Do classroom teachers understand that their union wants them to subsidize the head's pension from theirs? (Have the unions explained to the classroom teachers that this is what they are doing?)
  • Any teacher who is part of this protest has demonstrated they have insufficient grasp of maths to teach the subject (except, perhaps, to the reception class).

This second point is the easier of the two to explain: the pension paid for over your working life is a function of:

  • The salary you earned (which in turn depends on the rate at which your salary grew)
  • What fraction of your salary was paid in to your pension fund. It might be that you gave up X%, or your employer put in Y% that you never saw, or a combination.
  • How many years you paid in for (when you started paying in, and when you retire)
  • How well your pension fund grew before it paid out
  • How long the pension is expected to pay out for (how long you live after retirement)
  • Annuity rates – the interest earned on the pension fund as it pays out.

In addition, some people receive some pension which wasn't paid for in this way; some employers (public or private sector) make guarantees either to top up the pension fund so it will buy a given level of annuity, or to top up the pension payments bought with the fund. The total you receive is the combination of what you have paid for directly and the top-up.
Change any factor – for example how long you expect to live – and either what you have paid for changes or the other factors have to change to compensate. Since earnings and rates of return aren't something we control, living longer means either we have to pay for a smaller pension, or we must pay in more, or retire later, or some combination of all three. Demanding that the same amount will come out of a pension for longer, without paying more in, is a demand for a guaranteed top-up in future – in the case of public sector employees that future top-up comes from future taxes; for the private sector it comes from future profits.

It's easy for those in Government to make pension promises because those promises don't need to be met for 30 years or more. Teachers whose retirement is imminent would have come into the profession when Wilson, Heath or Callaghan was in Downing Street, and all 3 are dead and buried; so, with the exception of Denis Healey, are all the chancellors who set budgets while they were in office. It's tempting for governments to save on spending by under-contributing to pensions today, and leave future governments with the shortfall: taken to the extreme, governments can turn pensions into a Ponzi scheme – this year's "Teachers'/Police/whatever pay" bill covers what all current teachers/police officers/whoever are paid for doing the job, and all pensions paid to retired ones for doing the job cheaply in the past. Since I am, more-or-less, accusing governments of all colours of committing fraud, I might as well add false accounting to the charge sheet. Let's say the Government wants to buy a squadron of new aircraft but doesn't want to raise taxes to pay for them all this year; it borrows money and the future liability that creates is accounted for. If the deal it makes with public sector workers is for a given amount to spend today, and a promise of a given standard of living in retirement, does it record that promise – that future liability – as part of pay today? Take a wild guess.
This wouldn't matter – outside accounting circles – if everything was constant. But the length of time between retirement and death has increased and keeps on increasing. For the sake of a simple example: let's assume someone born in 1920 joined the Police after World War II, served for 30 years and retired in 1975 at age 55 expecting to die at 70. Their retirement was half their length of service. Now consider someone born in 1955, who also joined the police at age 25, served for 30 years and retired in 2010. Is anyone making plans for their pension to stop in 2025? We might reasonably expect this person to live well into their eighties – so we've moved from 1 retired officer for every 2 serving, to a 1:1 ratio. I'm not saying that in 1975 the ratio was 1:2 and in 2012 it is 1:1, but that's the direction of travel.

I've yet to hear a single teacher say their protests about pensions amount to a demand that they should under-fund their retirement as a matter of public policy and their pupils – who will then be tax payers – should make up the difference. As one of those whose work generates the funds to pay for the public sector, I must choose a combination of lower pension, later retirement, and higher contributions than I was led to expect when I started work 25 years or so ago. And there are people demanding my taxes insulate them from having to do the same; or (if you prefer) demanding a pay rise to cover the gap between what past governments have promised them and what they are actually paying for; or (and this becomes a bit uncomfortable) that government starts telling us what it really costs to have the teachers, nurses, police officers and so on we want.

But what of my claim that unions get low-paid staff to subsidize the pensions of higher-paid colleagues? Let's take two teachers; I'll call them Alice and Bob, and since this is a simplified example they'll fit two stereotypes: Alice sticks to working in the classroom and gets a 2% cost of living rise every year. Bob competes for every possible promotion, and gets promoted in alternate years, so he gets a 2% cost of living rise alternating with a 10% rise. Although they both started on the same salary, after 9 end-of-year rises Alice's pay has gone up by 19.5% and Bob – who must be a head by now – has seen his rise by 74%.
Throughout the 10 years they pay 10% of their salary into their pension fund – to make the sums easy we’ll assume they pay the whole 10% on the last day of the year, and each year their pension fund manager earns them 10% of what was in their pension pot at the end of the previous year. After 10 years Alice has £17,184 in her pension pot, and Bob has £20,390 in his.
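For anyone who wants to check the arithmetic, a minimal PowerShell sketch of the same sums (10% of salary paid in on the last day of each year, 10% growth on what was in the pot at the end of the previous year) reproduces the figures in the table below:

$AlicePot = 0 ; $AliceSalary = 10000
$BobPot   = 0 ; $BobSalary   = 10000
foreach ($year in 1..10) {
    if ($year -gt 1) {
        $AliceSalary = $AliceSalary * 1.02                                       # Alice: 2% rise every year
        $BobSalary   = $BobSalary * $(if ($year % 2 -eq 0) {1.10} else {1.02})   # Bob: 10% and 2% rises in alternate years
    }
    $AlicePot = $AlicePot * 1.1 + $AliceSalary * 0.1    # last year's pot grows by 10%, then 10% of this year's salary goes in
    $BobPot   = $BobPot   * 1.1 + $BobSalary   * 0.1
}
"Alice: final salary {0:N2}  pension pot {1:N2}" -f $AliceSalary, $AlicePot
"Bob:   final salary {0:N2}  pension pot {1:N2}" -f $BobSalary,   $BobPot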

Alice (and her fellow classroom teachers) are told by the union rep that any attempt to change from final salary as the calculation mechanism is "an attack on your pension"; for her, this is factually wrong. If you are ever told this you need to ask if you are a high flier like Bob or if your career is more like Alice's. To see why it is wrong (and let's put it down to the union rep being innumerate, rather than dishonest), let's pretend the pension scheme only has Alice and Bob in it. So the total pot is £37,574 – Alice put in 46% of that money, but if it is shared in the ratio of the final salaries (11,950 : 17,432), Alice gets 41% of the pay-out.
You can argue it doesn't work like that, because Alice's pot (at 1.44 times her final salary) might just cover the percentage of her final salary she has been promised: Bob's pension pot is only 1.17 times his final salary, which will give him a smaller percentage, so the government steps in and boosts his pot to be 1.44 times his final salary just like Alice's. So Bob gets a golden handshake of nearly £4,700 and Alice gets nothing.
Suppose 1.44 years' salary is nowhere near enough and Alice and Bob need 3 years' salary to buy a large enough annuity; the government needs to find £18,668 for Alice (108% of her pot), and £31,907 for Bob (156% of his pot). Whichever way you cut and slice it, if your salary grows quicker than your colleagues' you will do better out of final salary than they do. If it grows more slowly you will fare worse.

                     --------------- Alice ---------------      ---------------- Bob -----------------
           Salary      Increase   Pension     Pension           Salary      Increase   Pension     Pension
                                  Payment     Pot                                      Payment     Pot
Year 1     10,000.00   2%         1,000.00     1,000.00         10,000.00   10%        1,000.00     1,000.00
Year 2     10,200.00   2%         1,020.00     2,120.00         11,000.00   2%         1,100.00     2,200.00
Year 3     10,404.00   2%         1,040.40     3,372.40         11,220.00   10%        1,122.00     3,542.00
Year 4     10,612.08   2%         1,061.21     4,770.85         12,342.00   2%         1,234.20     5,130.40
Year 5     10,824.32   2%         1,082.43     6,330.36         12,588.84   10%        1,258.88     6,902.32
Year 6     11,040.81   2%         1,104.08     8,067.48         13,847.72   2%         1,384.77     8,977.33
Year 7     11,261.62   2%         1,126.16    10,000.39         14,124.68   10%        1,412.47    11,287.53
Year 8     11,486.86   2%         1,148.69    12,149.12         15,537.15   2%         1,553.71    13,970.00
Year 9     11,716.59   2%         1,171.66    14,535.69         15,847.89   10%        1,584.79    16,951.79
Year 10    11,950.93              1,195.09    17,184.35         17,432.68              1,743.27    20,390.23

Average Salary         10,949.72                                13,394.10

Combined Final Salary     29,383.60      Total Pot    37,574.58
  Alice's share           15,282.37
  Bob's share             22,292.21
Combined Average Salary   24,343.82      Total Pot    37,574.58
  Alice's share           16,900.85
  Bob's share             20,673.73

What if the mechanism for calculating were average salary, not final salary? It doesn't quite remove the gap but it gets very close. Instead of £2,000 of Alice's money going to Bob, it's less than £300.
A better way to look at this is to say if the amount of money in the combined Pension pot pays £5000 a year in Pensions, do we split it as roughly £2000 to Alice and £3000 to Bob (the rough ratio of their final salaries – each gets about 1/6th of their final salary) or £2250 to Alice and £2750 to Bob (the ratio of their average salaries and each gets about 1/5th of their average).
Whenever average salary is suggested as a basis, union leaders will say that pensions are calculated from a smaller number as if it reduces the amount paid. If the government wanted to take money that way it would be simpler to say “a full pension will in future be a smaller percentage of final salary”. Changing to average-based implies an increase in the percentage paid. 

That perhaps is the final irony. Rank and file Police officers – whose career pay is like Alice's in the example – marched through London yesterday demanding that their pensions be left alone; you do not need to spend long reading "Inspector Gadget" to realise that, when you remove the problems created for the Police by politicians, most of the problems that are left are created by senior officers whose career pay follows the "Bob" path. Yet the "many" marching were demanding that they continue to subsidize these "few". As Gadget himself likes to say: you couldn't make it up.

April 22, 2012

Don’t Swallow the cat. Doing the right thing with software development and other engineering projects.

Filed under: Uncategorized — jamesone111 @ 8:30 pm

In my time at Microsoft I became aware of the saying "communication occurs only between equals", usually couched in the terms "People would rather lie than deliver bad news to Steve Ballmer". Replacing unwelcome truths with agreeable dishonesty wasn't confined to the CEO's direct reports, and certainly isn't a disease confined to Microsoft. I came across 'The Hierarchy of Power Semantics' more than 30 years ago, when I didn't understand what was meant by the title; it was written in the 1960s, and if you don't recognise "In the beginning was the plan and the specification, and the plan was without form and the specification was void and there was darkness on the face of the implementation team", see here – language warning for the easily offended.
Wikipedia says the original form of "communication occurs only between equals" is "Accurate communication is possible only in a non-punishing situation". There are those who (consciously or not) use the impossibility of saying "No" to extract more from staff and suppliers; it can produce extraordinary results, but sooner or later it goes horribly wrong. For example, the Challenger disaster was caused by the failure of an 'O' ring in a solid rocket booster made by Morton Thiokol. The engineers responsible for the booster were quite clear that in cold weather the 'O' rings were likely to fail, with catastrophic results. NASA asked if a launch was OK after a freezing night and, fearing the consequences of saying "No", managers at Morton Thiokol over-ruled the engineers and allowed the disastrous launch to go ahead. Most people can think of some case where someone made an impossible promise to a customer, because they were afraid to say no.

Several times recently I have heard people say something to the effect that "We're so committed to doing this the wrong way that we can't change to the right way." Once the person saying it was me, which was the genesis of this post. Sometimes in a software project, saying to someone – even to ourselves – "We're doing this wrong" is difficult, so we create work-rounds. The odd title of this post comes from a song which was played on the radio a lot when I was a kid.

There was an old lady, who swallowed a fly, I don’t know why she swallowed a fly. I guess she’ll die.
There was an old lady, who swallowed a spider that wriggled and jiggled and tickled inside her. She swallowed the spider to catch the fly  … I guess she'll die
There was an old lady, who swallowed a bird. How absurd to swallow a bird. She swallowed the bird to catch the spider … I guess she’ll die
There was an old lady, who swallowed a cat. Fancy that to swallow a cat. She swallowed the cat to catch the bird …  I guess she’ll die
There was an old lady, who swallowed a dog. What a hog to swallow a dog. She swallowed the dog to catch the cat … I guess she’ll die
There was an old lady, who swallowed a horse. She’s dead, of course

In other words each cure needs a further, more extreme cure.  In my case the "fly" was a simple problem I'd inherited. It would take a couple of pages to explain the context, so for simplicity it concerns database tables, and the "spider" was to store data de-normalized. If you don't spend your days working with databases, imagine you have a list of suppliers, and a list of invoices from those suppliers. Normally you would store an ID for the supplier in the invoice table, and look up the name from the supplier table using the ID. For what I was doing it was better to put the supplier name in the invoices table, and ignore the ID. All the invoices for the supplier can be looked up by querying for the name. The same technique applied to products supplied by that supplier: store the supplier name in the product table, look up products by supplier name. This is not because I didn't know any better; I had database normal forms drummed into me two decades ago. To stick with the metaphor: I know that, under normal circumstances, swallowing spiders is bad, but faced with this specific fly it was demonstrably the best course of action.
At this point someone who could have saved me from my folly pointed out that supplier names had to be editable. I protested that the names don't change, but Amalgamated Widgets did, in fact, become Universal Widgets. This is an issue because Amalgamated, not Universal, raised the invoices in the filing cabinet, so matching them to invoices in the system requires preserving the name as it was when the invoice was generated. "See, I was right – the name should be stored" – actually this exception doesn't show I was right at all, but on I went. On the other hand, all of Amalgamated's products belong to Universal now. Changing names means doing a cascaded update (update any product with the old company name to the new name when a name changes); the real case has more than just products. If you're filling in the metaphor you've guessed I'd reached the point of figuring out how to swallow a bird. Worse, I could see another problem looming (anyone for cat?): changes to products had to be exported to another system, and the list of changes had their own table requiring cascaded updates from the cascaded updates.

One of the great quotes in Macbeth says "I am in blood stepped in so far that, should I wade no more, returning were as tedious as go o'er": he knows what he's doing is wrong, but it is as hard to go back (and do right) as it is to go on. Except it isn't: carrying on doesn't mean swallowing just another couple of spiders and a fly, it means swallowing a bird, then a cat and so on. The dilemma is that the effort for an additional work-round is smaller than the effort to go back, fix the initial problem and unpick all the work-rounds to date – one or the other needs to be done now, and the easy solution is to choose the one which needs the least effort now. The sum of effort required for future work-rounds is greater, but we can discount that effort because it isn't needed now. Only in a non-punishing situation can we tell people that progress must be halted for a time to fix a problem which has been mitigated up to now. Persuading people that such a problem needs to be fixed at all isn't trivial; I heard this quote in a radio programme a while back:

“Each uneventful day that passes reinforces a steadily growing false sense of confidence that everything is alright:
that I, we, my group must be OK because the way we did things today resulted in no adverse consequences.”

In my case the problem is being fixed at the moment, but in how many organisations is it a career-limiting move to tell people that something which has had no adverse consequences to date must be fixed?

February 4, 2012

Customizing PowerShell, Proxy functions and a better Select-String

Filed under: Uncategorized — jamesone111 @ 9:24 pm

I suspect that even regular PowerShell users don't customize their environment much. By coincidence, in the last few weeks I've made multiple customizations to my systems (my scripts are sync'd over 3 machines: customize one, customize all), which has given me multiple things to talk about. My last post was about adding persistent history; this time I want to look at Proxy functions…

Select-String is, for my money, one of the best things in PowerShell. It looks through piped text or through files for anything which matches a regular expression (or simple text) and reports back what matched and where, with all the detail you could ever need. BUT it has a couple of things wrong with it: it won't do a recursive search for files, and sometimes the information which comes back is too detailed. I solved both problems with a function I named "WhatHas" which has been part of my profile for ages. I have been using this to search scripts, XML files and saved SQL whenever I need a snippet of code that I can't remember, or because something needs to be changed and I can't be sure I've remembered which files contain it. I use WhatHas dozens (possibly hundreds) of times a week. Because it was a quick hack I didn't support every option that Select-String has, so if a code snippet spans lines I have to go back to the proper Select-String cmdlet and use its -Context option to get the lines either side of the match: more often than not I find myself typing dir -recurse {something} | select-String {options}

A while back I saw a couple of presentations on Proxy functions (there's a good post about them here by Jeffrey Snover): I thought when I saw them that I would need to implement one for real before I could claim to understand them, and after growing tired of jumping back and forth between Select-String and WhatHas, I decided it was time to do the job properly: create a proxy function for Select-String and keep WhatHas as an alias.

There are 3 bits of background knowledge you need for proxy functions.

  1. Precedence. Aliases beat Functions, Functions beat Cmdlets. Cmdlets beat external scripts and programs. A function named Select-String will be called instead of a cmdlet named Select-String – meaning a function can replace a cmdlet simply by giving it the same name. That is the starting point for a Proxy function.
  2. A command can be invoked as moduleName\CommandName. If I load a function named “Get-Stuff” from my profile.ps1 file for example, it won’t have an associated module name but if I load it as part of a module, or if “Get-Stuff” is a cmdlet it will have a module name.
    Get-Command get-stuff | format-list name, modulename
    will show this information. You can try
    > Microsoft.PowerShell.Management\get-childitem
    for yourself. It looks like an invalid file-system path, but remember PowerShell looks for a matching Alias, then a matching Function and then a matching cmdlet before looking for a file.
  3. Functions can have a process block (which runs for each item passed via the pipeline), a begin block (which runs before the first pass through process), and an end block (which runs after the last item has passed through process). Cmdlets follow the same structure, although it's harder to see.

Putting these together: a function named Select-String can call the Select-String cmdlet, but it must call it as Microsoft.PowerShell.Utility\Select-String or it will just go round in a loop. In some cases calling it isn't quite enough, and PowerShell V2 delivered the steppable pipeline, which can take a PowerShell command (or set of commands piped together) and allow us to run its begin block, process block, and end block under the control of a function. So a Proxy function looks like this:
Function Select-String {
  [CmdletBinding()]
  Param  ( Same Parameters as the real Select-String
           Less any I want to prevent people using
           Plus any I want to add
         )
   Begin { My preliminaries
           Get $steppablePipeline
           $steppablePipeline.begin()
         } 
Process { My Per-item code against current item ($_ )
          $steppablePipeline.Process($_)
         }

     end { $steppablePipeline.End()
           My Clean up code
         }
}

What would really help would be something to produce a function like this template, and fortunately it is built into PowerShell. It does the whole thing in 3 steps: get the command to be proxied, get the detailed metadata for the command, and build a proxy function from that metadata, like this:
  $cmd=Get-command select-string -CommandType cmdlet
  $MetaData = New-Object System.Management.Automation.CommandMetaData ($cmd)
  [System.Management.Automation.ProxyCommand]::create($MetaData)

The last command will output the proxy function body to the console; I piped the result into Clip.exe and pasted it into a new definition
Function Select-String { }
And I had a proxy function.

At this point it didn’t do anything that the original cmdlet doesn’t do but that was a starting point for customizing.
The auto-generated parameters are formatted like this:
  [Parameter(ParameterSetName='Object', Mandatory=$true, ValueFromPipeline=$true)]
  [AllowNull()]
  [AllowEmptyString()]
  [System.Management.Automation.PSObject]
  ${InputObject},

And I removed some of the line breaks to reduce the screen space they use from 53 lines to about half that.
The ProxyCommand creator wraps parameter names in braces just in case something has a space or other breaking character in the name, and I took those out.
Then I added two new switch parameters -Recurse and -BareMatches.
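The additions themselves are nothing more than two extra switch declarations at the end of the generated param block – something along these lines:

  [Parameter()]
  [Switch]$Recurse,

  [Parameter()]
  [Switch]$BareMatches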

Each of the Begin, Process and End blocks in the function contains a try...catch statement, and in the try part of the begin block the creator puts code to check if the -OutBuffer common parameter is set and if it is, over-rides it (why I’m not sure) – followed by code to create a steppable pipeline, like this:
  $wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('Select-String',
                                                           [System.Management.Automation.CommandTypes]::Cmdlet)
  $scriptCmd = {& $wrappedCmd @PSBoundParameters }
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

I decided it would be easiest to build up a string and make that into the steppable pipeline. In simplified form:
   $wrappedCmd        = "Microsoft.PowerShell.Utility\Select-String " 
  $scriptText        = "$wrappedCmd @PSBoundParameters"
  if ($Recurse)      { $scriptText = "Get-ChildItem @GCI_Params | " + $scriptText }
  if ($BareMatches)  { $scriptText += " | Select-Object -ExpandProperty 'matches' " +
                                      " | Select-Object -ExpandProperty 'value'   " }  
  $scriptCmd         = [scriptblock]::Create($scriptText)  
  $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

& commandObject works in a scriptblock: the “&” sign says  “run this” and if this is a command object that’s just fine: so the generated code has scriptCmd = {& $wrappedCmd @PSBoundParameters } where $wrappedCmd  is a command object.
but when I first changed the code from using a script block to using a string I put the original object $wrappedCmd inside a string. When the object is inserted into a string, the conversion renders it as the unqualified name of the command – the information about the module is lost, so I produced a script block which would call the function, which would create a script block which would call the function which… is an effective way to cause a crash.
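The difference is easy to see at the prompt:

$wrappedCmd = Get-Command Select-String -CommandType Cmdlet
"$wrappedCmd"              # renders as just "Select-String" – the module qualification is lost
$wrappedCmd.ModuleName     # Microsoft.PowerShell.Utility – the part the string conversion throws away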

The script above won't quite work on its own because
(a) I haven't built up the parameters for Get-ChildItem. So if -Recurse or -BareMatches are specified I build up a hash table to hold them, taking the necessary parameters from whatever was passed, and making sure they aren't passed on to the Select-String cmdlet when it is called. I also make sure that if a file specification is passed for a recursive search, it is moved from the Path parameter to the Include parameter.
(b) If -Recurse or -BareMatches get passed to the "real" Select-String cmdlet it will throw a "parameter cannot be found" error, so they need to be removed from $PSBoundParameters.

This means the first part of the block above turns into
  if ($recurse -or $include -or $exclude) {
     $GCI_Params = @{}
     foreach ($key in @("Include","Exclude","Recurse","Path","LiteralPath")) {
          if ($psboundparameters[$key]) {
              $GCI_Params[$key] = $psboundparameters[$key]
              [void]$psboundparameters.Remove($key)
          }
     }
     # if Path doesn't seem to be a folder it is probably a file spec
     # So if recurse is set, set Include to hold file spec and path to hold current directory
     if ($Recurse -and -not $include -and ((Test-Path -PathType Container $path) -notcontains $true) ) {
        $GCI_Params["Include"] = $path
        $GCI_Params["Path"] = $pwd
     }
   $scriptText = "Get-ChildItem @GCI_Params | "
}
else { $scriptText = ""}

And the last part is
if ($BareMatches) {
  $psboundparameters.Remove("BareMatches")
  $scriptText += " | Select-object -expandproperty 'matches' | Select-Object -ExpandProperty 'value' "
}
$scriptCmd = [scriptblock]::Create($scriptText)
$steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)

There’s no need for me to add anything to the process or end blocks, so that’s it – everything Select-String originally did, plus recursion and returning bare matches.
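With the proxy loaded, a search that used to need dir -recurse piped into Select-String becomes something like this (the pattern and file specification are just examples):

Select-String -Pattern "steppablePipeline" -Path *.ps1 -Recurse                # full match information
Select-String -Pattern "steppablePipeline" -Path *.ps1 -Recurse -BareMatches   # just the matched text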

I’ve put the whole file on skydrive here

January 28, 2012

Adding Persistent history to PowerShell

Filed under: Uncategorized — jamesone111 @ 7:19 pm

The Doctor: “You’re not are you? Tell me you’re not Archaeologists”
River Song: “Got a problem with Archaeologists ?”
The Doctor: “I’m a time traveller. I point and laugh at Archaeologists”

I've said to several people that my opinion of Linux has changed since I left Microsoft: after all, I took a job with Microsoft because I rated their products and stayed there 10 years; I didn't have much knowledge of Linux at the start of that stay and had no call to develop any during it, so the views I had were based on ignorance and supposition. After months of dealing with it on a daily basis I know how wrong I was. In reality Linux is uglier, more dysfunctional and, frankly, more backward than I ever imagined it could be. Neither of our Linux advocates suggests it should be used anywhere but servers (just as well they both have iPhones and Xboxes, which are about as far from the Open Source ideal as it's possible to go). But compare piping or tab expansion in PowerShell and Bash and you're left in no doubt which one was designed 20 years ago. "You can only pipe text? Really? How … quaint".

One of the guys was trying to do something with Exchange from the command-line and threw down a gauntlet:
"If PowerShell's so bloody good, why hasn't it got persistent history?"
OK. This is something which Bash has got going for it. How much work would it take to fix this? Being PowerShell, the answer is "a few minutes". Actually the answer is "a lot less time than it takes to write a blog post about it".

First, a little side track: various people I know have a PowerShell prompt which looks like
[123]  PS C:\users\Fred>
where 123 is the history ID. Type H (or history, or Get-History) and PowerShell shows you the previous commands with their history IDs; the command Invoke-History <id> (or ihy for short) runs the command.
I’d used PowerShell for ages before I discovered typing #<id>[Tab] inserts the history item into the command line. I kept saying “I’ll do that one day”, and like so many things I didn’t get round to it.
I already use the history ID: I have this function in my profile
Function HowLong {
   <# .Synopsis Returns the time taken to run a command
      .Description By default returns the time taken to run the last command
      .Parameter ID The history ID of an earlier item.
   #>
   param  ( [Parameter(ValueFromPipeLine=$true)]
            $id = ($MyInvocation.HistoryId - 1)
          )
  process {  foreach ($i in $id) {
                 (get-history $i).endexecutiontime.subtract(
                                  (get-history ($i)).startexecutiontime).totalseconds
            }
          }
}
Once you know $MyInvocation.HistoryID gives the ID of the current item, it is easy to change the Prompt function to return something which contains it.

At the moment I find I’m jumping back and forth between PowerShell V2, and the CTP of V3 on my laptop
(and I can run PowerShell –Version 2 to launch a V2 version if I see something which I want to check between versions).
So I finally decided I would change the prompt function. This happened about the time I got the "why doesn't the history get saved" question. Hmm. Working with history in the prompt function. Tick, tick, tick.  [Side track 2: in PowerShell the prompt isn't a constant, it is the result of a function. To see the function, use the command type function:prompt]
So here is the prompt function I now have in my profile.
Function prompt {
  $hid = $myinvocation.historyID
  if ($hid -gt 1) {get-history ($myinvocation.historyID -1 ) |
                      convertto-csv | Select -last 1 >> $logfile
  }
  $(if (test-path variable:/PSDebugContext) { '[DBG]: ' } else { '' }) + 
    "#$([math]::abs($hid)) PS$($PSVersionTable.psversion.major) " + $(Get-Location) + 
    $(if ($nestedpromptlevel -ge 1) { '>>' }) + '> '
}

The first part is new: get the history ID and, if it is greater than 1, get the previous history item, convert it from an object to CSV format, discard the CSV header and append it to the file named in $logfile (I know I haven't set that variable yet).

The second part is lifted from the prompt function found in the default profile, which reads
"PS $($executionContext.SessionState.Path.CurrentLocation)$('>' * ($nestedPromptLevel + 1)) "
It’s actually one line but I’ve split it at the + signs for ease of reading.
I put a # sign and the history ID before "PS" – when PowerShell starts the ID is –1, so I take the absolute value.
After “PS” I put the major version of PowerShell.
I'm particularly pleased with the #ID part: in the non-ISE version of PowerShell, double-clicking on #ID selects it. My mouse is usually close enough to my keyboard that the keypad [Enter] key is within reach of my thumb, so if I scroll up to look at something I did earlier, one flickity gesture (double-click, thumb on enter, right-click, [Tab]) puts the command in the current command line.

So now I'm keeping a log, and all I need to do is load that log from my profile. PowerShell has an Add-History command and the on-line help talks about reading history in from a CSV file, so that was easy. I decided I would truncate the log when PowerShell started and also ensure that the file had the CSV header, so here's the reading-friendly version of what's in my profile.

$MaximumHistoryCount = 2048
$Global:logfile = "$env:USERPROFILE\Documents\WindowsPowerShell\log.csv"
$truncateLogLines = 100
$History  = @()
$History += '#TYPE Microsoft.PowerShell.Commands.HistoryInfo'
$History += '"Id","CommandLine","ExecutionStatus","StartExecutionTime","EndExecutionTime"'
if (Test-Path $logfile) {$History += (get-content $LogFile)[-$truncateLogLines..-1] | where {$_ -match '^"\d+"'} }
$History > $logfile
$History | select -Unique |
    Convertfrom-csv -errorAction SilentlyContinue |
    Add-History     -errorAction SilentlyContinue

UPDATE: Copying this code into the blog page and splitting the last $History line to fit, something went wrong and the Select -Unique went astray. Oops. It's there because hitting enter doesn't advance the history count or run anything, but it does cause the prompt function to re-run. Now I've had to look at it again, it occurs to me it would be better to have Select -Unique on the (Get-Content $logfile) part rather than in the Add-History section, as this would remove duplicates before truncating.

So … increase the history count from the default of 64 (writing this, I found that in V3 CTP 2 the default is 4096). Set a global variable to be the path to the log file, and make it obvious what length I will truncate the log to.
Then build an array of strings named $History. Put the CSV header information into $History and, if the log file exists, put up to the truncate limit of lines into $History as well. Write $History back to the log file and pipe it into Add-History, hiding any lines which won't parse correctly. Incidentally, those who like really long lines of PowerShell could recode all the lines with $History in them into one. So: a couple of lines in the prompt function and between 3 and 9 lines in the profile depending on how you write them; all in, it's less than a dozen lines. This blog post has taken a good couple of hours, and I did the code in 10 to 15 minutes.


Oh , and one thing I really like – when I launch PowerShell –Version 2 inside Version 3, it imports the history giving easy access to the commands I just used without needing to cut and paste.

If you're a Bash user and didn't storm off in a huff after my initial rudeness, I'd like to set a good-natured challenge: a non-compiled enhancement to Bash which I can load automatically and which gives it tab expansion on a par with PowerShell's (OK, PowerShell has an unfair advantage completing parameters, so just command names and file names). And in case you wondered, the quote at the top of the post is adapted from one of Steven Moffat's Dr Who episodes. You see, "I know PowerShell. I point and laugh at Bash users."

December 20, 2011

Free NetCmdlets

Filed under: Uncategorized — jamesone111 @ 9:48 pm
I've mentioned the NetCmdlets before. Although not perfect, if you spend a lot of your life using PowerShell and various network tools they are a big help. They've made a bunch of things which would have been long-winded and painful relatively easy. So here is a mail I have just had from PowerShell Inside (aka /n software). I don't normally hold with pasting mails straight into a blog post, but you'll see why if you read on: if you click the link it asks you to fill in some details and "A member of our sales team will contact you with your FREE NetCmdlets Workstation License. (Limit one per customer.)"

A Gift For The Holidays: FREE NetCmdlets Workstation License – This Week Only, Tell a Friend!

Help us spread some PowerShell cheer! Tweet, blog, post, email, or just tell a friend and you can both receive a completely free workstation license of NetCmdlets!

NetCmdlets includes powerful cmdlets offering easy access to every major Internet technology, including: SNMP, LDAP, DNS, Syslog, HTTP, WebDav, FTP, SMTP, POP, IMAP, Rexec/RShell, Telnet, SSH, Remoting, and more. Hurry, this offer ends on Christmas day – Happy Holidays!

NetCmdlets Workstation License: $99.00 FREE
NetCmdlets Server License (special limited-time offer): $349.00 $199.00

Hurry, this offer ends on Christmas day – get your free license now!

Happy Holidays!

Or, as we say on this side of the Pond, Merry Christmas!

October 17, 2011

How to be Creative with QR Codes

Filed under: Uncategorized — jamesone111 @ 1:15 pm

I've been playing with QR codes recently, and have started to use them. HELLO if you've been at The Experts Conference in Frankfurt and scanned one from one of my sessions: the files for these are on my SkyDrive.

I looked at a number of different code libraries which would build a QR code for me but the easiest way turned out to be to use a web service, provided by Google.  And I wrapped this up in a little PowerShell

[System.Reflection.Assembly]::LoadWithPartialName("System.Web") | Out-Null
Function Get-QRCode {
  param ([parameter(ValueFromPipeLine=$true, mandatory=$true)]
         $Text,
         $path = (join-path $pwd "QRCode.PNG")
  )
  if ($WebClient -eq $null) {$Global:WebClient = new-object System.Net.WebClient }
  $WebClient.DownloadFile(("http://chart.apis.google.com/chart?cht=qr&chs=547x547&chld=H|0&chl=" +
                           [System.Web.HttpUtility]::UrlEncode($Text)), $path)
  Start-Process $path
}

So it takes two parameters: a block of text and a path where the file should be saved. The text must be specified, but there is a default for the path.

To save having to create lots of web client objects, I keep a single global web client object; then one line of PowerShell gets a PNG file from the Google service and saves it to the path. Finally I launch the file in the default viewer.
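Calling it looks something like this (the text and file name are just examples):

Get-QRCode -Text "http://example.com/mysession" -Path "$env:TEMP\session.png"
"Hello from Frankfurt" | Get-QRCode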

The next step is to use some image tools to pretty up the QR code. I still use an ancient version of PaintShop pro and that does the job here nicely.

[image qr-1: the QR code at three stages of editing]

On the left is the original QR code, in the middle I have applied some gaussian blur to the image and on the right I have reduced this to two colours, 100% black and 100% white. This smoothes off the edges. Then I take the image back to 16 million colours and add some colour.

 

[image qr-2: the coloured QR code with the Frankfurt skyline added]

One of the really nice things about QR codes is that they have error correction built in (and in my call to the web service I specify the maximum amount); this means we can put something which isn't part of the code into the picture. I've used the Frankfurt skyline from the slide template here, but there is scope for creativity.

Of course there is nothing which says the data in the code must be a URL. Here’s a message for anyone who has got a reader. The file name and the Ferrari might be a clue to what it says.

[image: QR-Ferris]

April 10, 2011

Ten tips for better PowerShell functions

Filed under: Uncategorized — jamesone111 @ 11:02 pm

Explaining PowerShell often involves telling people it is both an interactive shell – a replacement for the venerable CMD.EXE – and a scripting language used for single-task scripts and libraries of re-useable functions. There are some good practices which are common to both kinds of writing – including comments, being explicit with parameters, using full names instead of aliases and so on – but having written hundreds of "script cmdlets" I have developed some views on what makes a good function which I wanted to share…

1. Name your function properly
It’s not actually compulsory to use Verb-SingularNoun names with the standard verbs listed by Get-Verb. “Helpers” which you might pop in a profile can be better with a short name. But if your function ends up in a module Import-module grumbles when it sees non-standard verbs. Getting the right name can clarify your thinking about what a command should or should not do. I cite IPConfig.exe as an example of a command line tool which didn’t know when to stop – what it does changes dramatically with different switches.  PowerShell tends towards multiple smaller functions whose names tell you what they will do – which is a Good Thing

2. Use standard, consistent and user-friendly parameters.
(a) PowerShell cmdlets give you –WhatIf and –Confirm switches before they do something irreversible; you can get these in your own functions too. Put this line of code before any others in the function:
[CmdletBinding(SupportsShouldProcess=$True)]
and then where you do something which is hard to undo  
If ($psCmdlet.shouldProcess("Target" , "Action")) {
    dangerous actions
}
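Put together, a minimal sketch of the pattern looks like this (Remove-Widget and its parameter are made-up names, not a real cmdlet):

Function Remove-Widget {
    [CmdletBinding(SupportsShouldProcess=$True)]
    param ($Path)
    if ($psCmdlet.ShouldProcess($Path , "Delete widget file")) {
        Remove-Item -Path $Path    # only runs if -WhatIf wasn't given and any -Confirm prompt was accepted
    }
}
Remove-Widget -Path .\old-widget.txt -WhatIf    # reports what would happen without doing it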
(b) Look at the names PowerShell uses: “path”, not “filename” , “ComputerName” not “Host”, “Force” “NoClobber” and so on – copy what has been done before unless you have a good reason to do something different; I don’t use “ComputerName” when working with Virtual Machines because it is not clear if it means a Virtual Machine or the Physical Machine which hosts them.
(c) If you are torn between two names, remember that "Computer" is a valid shortening of "ComputerName", and for names which are shortenings of an alternative you can define aliases, like this:
[Alias("Where","Include")]$Filter
TIP 1. You can discover all the parameter names used by cmdlets, and how popular they are, like this:
get-command -c cmdlet | get-help -full | foreach {$_.parameters.parameter} |
   foreach {$_.name} | group -NoElement | sort count

TIP 2. If you think "Filter" is the right name to re-use, you can see how other cmdlets use it like this:
Get-Command -C cmdlet | where { $_.definition -match "filter"} | get-help  -Par "filter" 

3. Support Piping into your functions.
V2 of PowerShell greatly simplified piping. The more you use PowerShell, the stronger the sense you get that the output of one command should become the input for another. If you are writing functions, aim for the ability to pipe into them and to pipe their output into other things. Piped input becomes a parameter; all you need to do is the following (there is a short sketch further below):

  • Make sure the parts of the function which run for each piped object are in a
    process {} block
  • Prefix the parameter declaration with [parameter(ValueFromPipeline=$true)].
  • If you want a property of a piped object instead of the whole object, use ValueFromPipelineByPropertyName
  • If different types of objects get piped, and they use different property names for what you want, give your parameter aliases: PowerShell will look for the "true" name and, if it doesn't find it, try each alias in turn.

If you find code that looks like this
something | foreach {myFunction $_ }
It is a sign that you probably need to look at piping.
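As a minimal sketch (the function name and the numbers are purely illustrative), pipeline support boils down to a process block and a ValueFromPipeline parameter:

Function Add-Ten {
    param ( [parameter(ValueFromPipeline=$true)]
            $Number )
    process { foreach ($n in $Number) { $n + 10 } }   # runs once for each piped object
}
1..3 | Add-Ten      # piped input gives 11, 12, 13
Add-Ten 1,2,3       # the same values passed as an array parameter give the same result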

4. Be flexible about arrays and types of parameters
Piping is one way to feed many objects into one command. In addition, many built-in cmdlets and operators will accept arrays as parameters just as happily as they would accept a single object; previously I gave the example  of Get-WmiObject whose –computername parameter can specify a list of machines – it makes for simpler code.
It is easier to use functions which catch being passed arrays and process them sensibly (and see that previous post for why simply putting [String] or [FileInfo] in front of a parameter doesn't work). Actually I see it as good manners: "I handle the loop so you don't have to do it at the command line".
Accepting arrays is one case of not being over-prescriptive about types: but it isn’t the only one. If I write something which deals with, say, a Virtual Machine, I ensure that VM names are just as valid as objects which represent VMs. For functions which work with files, it has to be just as acceptable to pass System.IO.FileInfo and System.Management.Automation.PathInfo, objects or strings containing the path (unique or wild card, relative path or absolute). 
TIP:  resolve-path will accept any of these and convert them into objects with fully-qualified paths.
It seems rude to make the user use Get-whatever to fetch the object if I can do it for them.
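As a rough sketch of that idea (Get-FileLength and the file names are hypothetical), whatever form the path arrives in, Resolve-Path turns it into objects with a .Path property:

Function Get-FileLength {
    param ( [parameter(ValueFromPipeline=$true)]
            $Path )
    process {
        Resolve-Path -Path $Path | ForEach-Object {   # copes with strings, wildcards, relative or absolute paths
            (Get-Item $_.Path).Length
        }
    }
}
Get-FileLength *.log             # wildcard
Get-FileLength .\notes.txt       # relative path
".\notes.txt" | Get-FileLength   # piped string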

5. Support ScriptBlock parameters.
If one parameter can be calculated from another it is good to let the user say how to do the calculation.  Consider this example with Rename-Object. I have photos named IMG_4000.JPG, IMG_4001.JPG , IMG_4002.JPG, up to IMG_4500.JPG. They were taken underwater, so I want them to be named DIVE4000.JPG etc. I can use:
dir IMG_*.JPG | rename-object –newname {$_.name –replace "IMG_","DIVE"}
In English “Get the files named IMG_*.JPG and rename them. The new name for each one is the result of replacing IMG_ with DIVE in that one’s current name.” Again you can write a loop to do it but a script block saves you the trouble.

  • The main candidates for this are functions where one parameter is piped and a second parameter is connected to a property of the Piped one.
  • When you are dealing with multiple items arriving from the pipeline, be careful what variables you set in the process{} block of the function: you can introduce some great bugs by overwriting non-piped parameters. For example if you had to implement rename-object, it would be valid to handle a string that had been piped in as the –path parameter by converting it into a FileInfo object – doing so has no effect on the next object to come down the pipe; but if you convert a script block which is passed as -NewName to a String, when the next object arrives it will get that string – I’ve had great fun with the bugs which result from this
  • All you need to do to provide this functionality is something like the following (a fuller sketch follows below):
    If ($newname –is [ScriptBlock]) { $TheNewName = $newname.invoke() }
    else                            { $TheNewName = $newname}
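Here is a rough sketch of how it all fits together; Rename-Photo is a made-up wrapper around Rename-Item, and ForEach-Object evaluates the block with $_ set to the current item:

Function Rename-Photo {
    param ( [parameter(ValueFromPipeline=$true)]
            $Path,
            $NewName )
    process {
        if ($NewName -is [ScriptBlock]) {
             $TheNewName = $Path | ForEach-Object $NewName   # evaluate the block for this pipeline item
        }
        else {$TheNewName = $NewName}
        Rename-Item -Path $Path -NewName $TheNewName
    }
}
dir IMG_*.JPG | Rename-Photo -NewName {$_.Name -replace "IMG_","DIVE"}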

6. Don’t make input mandatory if you can set a sensible default.
Perhaps obvious, but… if I write a function named "Get-VM" which finds virtual machines with a given name, what should I do if the user doesn't give me a VM name? Return nothing? Throw an error? Or assume they want all possible VMs?
What would you mean if you typed Get-VM on its own ?

7. Don’t require the user to know too much underlying syntax.
Many of my functions query WMI; WMI uses SQL-like syntax, and that syntax uses "%" as a wildcard, not "*". Logical conclusion: if a user wants to specify a wildcarded filter to my functions they should learn to use % instead of *. That just seems wrong to me, so my code replaces any instance of * with %. If the user is specifying filtering or search terms, a few lines to translate from what they will instinctively do, or wish they could do, into what is required for SQL, LDAP or any other syntax can make a huge difference in usability.
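A minimal sketch of that translation (the query and the filter value are illustrative; WQL's LIKE uses % and _ where the shell uses * and ?):

$Filter = $Filter -replace "\*" , "%" -replace "\?" , "_"    # turn shell-style wildcards into WQL ones
Get-WmiObject -Query "SELECT * FROM Win32_Process WHERE Name LIKE '$Filter'"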

8. Provide information with Write-Verbose , Write-debug and Write-warning
When you are trying to debug, the natural reaction is to put in Write-Host commands, fix the problem and take them out again. Instead of doing that, change $DebugPreference and/or $VerbosePreference and use Write-Debug / Write-Verbose to output information. You can leave them in and stop the output by changing the preference variables back. If your function already has
[CmdletBinding(SupportsShouldProcess=$True)]
at the start then you get –debug and –verbose switches for free.
Write-Error is ugly and if you are able to continue, it’s often better to use Write-warning.
And learn to use write-progress when you expect something to spend a long time between screen updates.
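A small sketch of the idea (Update-Widget and its messages are made up):

Function Update-Widget {
    [CmdletBinding()]      # gives -Verbose and -Debug switches for free
    param ($Name)
    Write-Verbose "Looking for a widget named '$Name'"
    Write-Warning "No widget named '$Name' was found; nothing was changed"
}
Update-Widget -Name "Gadget" -Verbose    # the verbose message only appears because -Verbose was given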

9. Remember: your output is someone else’s input.
(a) Point 8 didn't talk about using Write-Host: only use it to display something you want to prevent going into anything else.
(b) Avoid formatting output in the function, try to output objects which can be consumed by something else.  If you must format turn it on or off with a –formatted or -raw switch.
(c) Think about the properties of the objects you emit. Many commands will understand that something is a file if it has a .Path property, so add one to the objects coming out of your function and they can be piped into copy, invoke-item, resolve-path and so on. Usually that is good – and if it might be dangerous look at what you can do to change it.  Another example: when I get objects that represent components of a virtual machine their properties don’t include the VM name. So I go to a little extra trouble to add it.
Add-Member can add properties, or aliases for existing properties, to an object. For example:
$obj | Add-Member -MemberType AliasProperty -Name "Height" -Value "VerticalSize"
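And a sketch of the virtual machine case mentioned above (the variable and property names are assumptions, not from any particular VM library):

$diskObject | Add-Member -MemberType NoteProperty -Name "VMName" -Value $vm.Name -PassThru   # emit the object with the VM's name attached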

10. Provide help
In-line help is easy – it is just a carefully formatted comment before any of the code in your function. It isn't just there for some far-off day when you share the function with the wider world. It's for you when you are trying to figure out what you did months previously – and Murphy's law says you'll be trying to do it at 3AM when everything else is against you.
Describe what the Parameters expect and what they can and can’t accept. 
Give examples (plural) to show different ways that the function can be called. And when you change the function in the future, check the examples still work.
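A skeleton of comment-based help (Get-Widget is a placeholder name) looks something like this, and Get-Help Get-Widget -Full will then display it:

Function Get-Widget {
<#
.Synopsis
    Gets widgets by name.
.Description
    Returns widget objects matching the name(s) supplied; wildcards are allowed.
.Parameter Name
    One or more widget names. Accepts wildcards; defaults to all widgets.
.Example
    Get-Widget -Name "Blue*"
    Returns every widget whose name begins with "Blue".
#>
    param ( [parameter(ValueFromPipeline=$true)]
            $Name = "*" )
    process { }   # body omitted in this sketch
}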

April 9, 2011

Pattern recognition–the human and PowerShell kinds

Filed under: Uncategorized — jamesone111 @ 8:40 pm

Recently BBC's Top Gear has been promoting the idea that a particular type of obnoxious driver has been replacing the BMWs they traditionally bought with Audis. Chatting to a friend who is a long-term Audi customer, and whose household features 'his' and 'hers' Audis, we came to the conclusion that once you think there is a pattern, you recognise it and your awareness increases – even if in reality it is no more prevalent. I think the same thing happens in IT in general and scripting in particular – it has happened to me recently, when my understanding of regular expressions in PowerShell took a big step forward, and now I'm finding all manner of places where it helps.

I use a handful of basic regular expressions  for things like removing a trailing \ character from the end of a string with something like:
$Path = $Path –replace "\\$" , ""
Many people use –replace to swap text without realising it handles regular expressions. In this case "\" is the escape character in regular expressions, so to match "\" itself it has to be escaped as "\\". The $ character means "end of line", so this fragment just says 'Replace "\" at the end of $Path – if you find one – with nothing, and store the result back in $Path'. PowerShell's –Split operator also uses regular expressions. This can be a trap – if you try to split using "." it means "any character" and you get a result you didn't expect:
"This.that" –split "." returns 10 empty strings (the –split operator discards the delimiter); to match a "." it must be escaped as "\.". But it's also a benefit: if you want to split sentences apart you can make "." and any spaces around it the delimiter, which saves the need to trim afterwards. The –Match operator uses regular expressions too – I worry when I see it used in examples for new users, who may write something which parses unexpectedly as a regular expression.
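To make the trap concrete, here is roughly what it looks like at the prompt:

PS> ("This.that" -split ".").Count    # "." means "any character", so every character is a delimiter
10
PS> "This.that" -split "\."           # escape the dot to split on a literal "."
This
that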

I thought that I knew regular expressions – until, thanks to an article by Tome Tanasovski, I found I had missed a big bit of the picture, which meant my understanding was wrong. I thought that a match meant the equivalent of running a highlighter pen over part of the text and that –replace means "take something out and put something else back" – both are usually true, but not always. Tome also did a presentation for the PowerShell user group – there's a link to the recording on Richard's blog – I'd recommend watching it and pausing every so often to try things out.
Tome showed look-aheads and look-behinds. These say “It’s a Match if it is followed by something”, or “preceded by something” (or not).  This adds a whole new dimension…
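A tiny sketch of a look-behind, before getting to the real use cases below:

PS> "price 42" -match "(?<=price )\d+"   # match digits, but only where they are preceded by "price "
True
PS> $Matches[0]
42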

A couple of days later I hit a snag with PowerShell's Split-Path cmdlet. If the path is on a remote machine it might use a drive letter which doesn't exist on the local machine – and in that situation Split-Path throws an error. But I can use the –Split operator with a regular expression. I want to say "Find a \ followed by some characters that aren't \ and the end of the string". Let's test this:
PS C:\Users\James\Documents\windowsPowershell> $pwd -split "\\[^\\]+$"
C:\Users\James\Documents

As in my first example, '\\' is an escaped '\' character and '$' means "end of line"; '[^\\]' says "anything which is not the '\' character" and '+' means "at least once". So this translates as "Look for a '\' followed by at least 1 non-'\' followed by end of line". It's mostly right, but it doesn't work (yet).
I copied my command prompt so you can see that 'WindowsPowerShell' is part of my working directory – but that bit got lost; or to be more precise, it was matched in the expression, so –split returned the text on either side of it.
I want to say "Find ONLY a '\'. The one you want is followed by some characters that aren't '\' and the end of the string, but they don't form part of the delimiter." The syntax for "is followed by" is "(?=   )", so I can wrap that around the [^\\]+$ part and test that:
PS C:\Users\James\Documents\windowsPowershell> $pwd -split "\\(?=[^\\]+?$)"
C:\Users\James\Documents
windowsPowershell

Regular expressions can turn into a write-only language – easy to build up but pretty hard to pull apart. At the risk of making things worse, not everyone knows that PowerShell has a "multiple = operator": if you write $a , $b  = 1,2 it will assign 1 to $a and 2 to $b. Since the output of the split operation is 2 items, we can try this
PS C:\Users\James\Documents\windowsPowershell> $Parent,$leaf = $pwd -split "\\(?=[^\\]+?$)"
PS C:\Users\James\Documents\windowsPowershell> $Parent
C:\Users\James\Documents
PS C:\Users\James\Documents\windowsPowershell> $leaf
windowsPowershell

The "cost" of using regular expressions is that the term used to do the split is something akin to a magical incantation. The benefit is that the code is a lot more streamlined than using the string object's .LastIndexOf() and .Substring() methods and its .Length property, plus some arithmetic, to get to the same result. I'd contend that even allowing for the "incantation", the regex way makes it easier to see that $pwd is being split into 2 parts.
Good stuff so far, but Tome had another trick:  the match that selects nothing and the replace that removes nothing.  That made me stop and redefine my understanding.  Here’s the use case:

Ages ago I wrote about using PowerShell to query the Windows [Vista] Desktop Index – it works just as well with Windows 7. The zillion or so fields used in these queries have names like "System.Title", "System.Photo.Orientation" and "System.Image.Dimensions" – I'd type a bare field name like "title" by mistake, or waste time discovering whether "HorizontalSize" belonged to System.Photo or System.Image.
It would be better to enable my Get-IndexedFile function to put in the right prefix: but could it be done reasonably efficiently and elegantly?
Here lookarounds come into their own. They let me write “If you can find a spot which is immediately after a space, and immediately before the word ‘Dimensions’ OR the word ‘HorizontalSize’ OR…” and so on for all the Image Fields “AND that word is followed by any spaces and a ‘=’ sign  THEN put ‘System.image.’ at the spot you found”.  With just the first two fieldnames the operation looks like this
-replace "(?<=\s) (?=(Dimensions|HorizontalSize)\s*=)" , "system.image."
                 ^
I have put an extra space in for the spot that will be matched – the ^ is pointing this out, it isn’t part of the code.
“(?<=  )” is the wrapper for the “look behind” operation  (replacing the ‘=’ with ‘!’ negates the expression) so “(?<=\s)”  says “behind this spot you find a space” and the second half is a “look ahead” which says “in front of this spot you find ‘Dimensions’ or ‘HorizontalSize’ then zero or more spaces (‘\s*’) followed by ‘=’ ”. A match with an expression like this is like an I-beam cursor between characters – rather than highlighting some: so the –replace operator has nothing to remove but it still inserts ‘system.image’ at that point. So lets put that to the test.

PS> "horizontalsize = 1024"  -replace "(?<=\s)(?=(Dimensions|HorizontalSize)\s*=)",
                                       "system.image."
system.image.horizontalsize = 1024

It works! This whole exercise of writing a Get-IndexedFiles function – which I will share in due course – ended up as a worked example in using regex to support good function design. I've got another post in draft at the moment about my ideas on good function design, so I'll post that and then come back to looking at all the different ways I made use of regular expressions in this one function.

March 26, 2011

F1: The hidden effects of moving wings

Filed under: Uncategorized — jamesone111 @ 10:25 pm

There seem to be divided opinions about the effect of the “Drag reduction system” introduced in F1 this season. The rules are that

  • Drivers can operate a device to lower the effective part of the rear wing, cutting both lift and drag. The wing returns to its original position when the driver applies the brakes.
  • In wet conditions this will be disabled
  • In qualifying the drivers can use this at will
  • In the race it is armed remotely by a system in race control – if the car is close enough to the one in front (the margin will be 1 second to begin with – this may change over the season) at a specific point, the following driver can lower his wing for a specific section of the track – typically the longest straight.

"Push to pass" divides people. We had it in the days of turbo engines: in the 1980s we had qualifying engines which wouldn't last a race distance; the boost button in a race gave a burst of similar power, but used for a sustained period it was a case of "The engines cannae' take it", nor could the tyres, and fuel would run out. We had it when KERS first appeared; "Kinetic Energy Recovery Systems" are currently a gimmick: the energy stored, the rate at which it is returned (power) and the time over which the return can take place are all constrained. F1 talks about being greener; removing the limits on KERS would be an obvious way, and I'd have it feeding extra power in when the driver applied full throttle. Now we have it with wings.

Predicted effect 1. Last use wins. If it turns out to make passing easy, then when two cars are evenly matched drivers won't want to be re-passed, so they will time their passing move to use the wing at the last possible moment.

Predicted effect 2. More tyre stops. There was always a decision to make: sacrifice position on track by making a stop for fresh tyres, or hold out? The harder it is to overtake, the bigger the advantage of fresh rubber needs to be before stopping becomes the preferred option – because, as Murray Walker always used to say, "Catching is one thing, passing is quite another". So picture the scene: with a dozen or so laps to go the first two cars have been on hard tyres for a good few laps and the leader is being caught; thanks to DRS the 2nd place car gets past. The former leader's car is fractionally slower, but on fresh soft tyres it could go 2-3 seconds a lap quicker – enough to make up the 20-30 seconds a pit stop takes with a couple of laps to go. Most of the tyre advantage will have gone by the time he has caught up: previously it would have been easy for the new leader to defend for the last couple of laps, but now, if the chasing car can get within DRS distance, he should be able to make a last-gasp pass. In the wet, inspired changes of tyres win races – it didn't really happen in the dry, until now.

Predicted effect 3. The return of slipstreaming. The FIA banned slipstreaming… OK, not as such. Imposing an 18,000 Rev limit banned it. How so ?  Without a limit on revs, in top gear, revs and speed increase until the acceleration force coming from the engine matches the retardation force from friction and aerodynamic drag.  Reduce drag by slipstreaming and top speed and engine revs will increase. But what if gear ratios are optimised to get the best lap time with no slipstream (in qualifying) – hitting the maximum Revs as the driver hits the brake at the fastest point ? If revs are limited the car won’t go any faster with the aid of slipstream.

With the ability to use DRS in qualifying, the optimum is to hit 18,000 in low-drag trim at the fastest point. The teams can't change gear ratios after qualifying, and in the race the cars will be in high-drag trim most of the time – so they won't be reaching 18,000 revs and will have a margin for slipstreaming.

Predicted effect 4. Race pace trumps grid place. Grid penalties become less effective. The advantage of starting ahead of a car which is faster than yours, or the disadvantage of starting behind a slower car, varies with the difficulty of passing. Since the car can't be reconfigured after qualifying, making overtaking easier might mean car set-up is tilted more towards race configuration than qualifying. It also means the cost of taking a penalty for a precautionary gearbox change (say) is smaller.

Whether or not any of these things happen remains to be seen. Still: fun season in prospect.

February 11, 2011

Elop and Ballmer : it’s the Ecosystem . Does everyone get it now ?

Filed under: Uncategorized — jamesone111 @ 6:07 pm

As the dust starts to settle after the announcement of the partnership between Microsoft and Nokia, the question has come from more than one quarter: "what about the other handset vendors?"
There are 4 of them today. HTC has 4 phones out (the HD7, Mozart, Trophy and Surround) with a 5th, the "Pro", due imminently; it makes more than a dozen Android devices. LG has one phone, the Optimus 7, with a Pro version in the pipeline, and according to Wikipedia makes 8 Android devices. Samsung has the Omnia 7 and (if Wikipedia is to be believed) makes 9 Android devices. Dell is the last of the first-wave Windows Phone 7 manufacturers and it too has an Android device. Nokia is the first phone maker to go with Windows Phone 7 which didn't also go with Android. The initial "gang of four" can't accuse Microsoft of infidelity when they had not been exclusive themselves, and one must expect Microsoft to sell operating systems to all comers. The four also know that once they were eight: Garmin-Asus, HP, Toshiba and Sony-Ericsson were with them at launch. HP's plans changed and the other three went quiet. Some drop out, some join, and so the world turns. I suspect their first thought is that if the decline in Nokia's share comes to a halt, there will be less easy business to pick up.

You can watch Stephen Elop and Steve Ballmer's press conference here. Elop's opening remarks echoed something from his burning platform memo: "The game has changed from a battle of devices to a war of ecosystems." Ecosystem was the word of the day*; I stopped counting the number of times it was used. It's not about Nokia's Windows Phone 7 device against Dell's, against HTC's, against LG's, against Samsung's. It's about Microsoft's ecosystem – with phones by Dell, HTC, LG, Nokia, Samsung, and whoever else – against Google's ecosystem with phones by Dell, HTC, LG, Samsung, Motorola, Sony-Ericsson, Old Uncle Tom Cobley and all, against the Apple and Blackberry ecosystems.

About 25:15 into the press conference someone asked Steve Ballmer about the other handset makers. He replies “The overall development of critical mass in Windows phone 7,  from other manufacturers, from the chipset community, is important to both of us – despite the fact that obviously Nokia wants to sell all the Windows phones it can”. Elop chimes in “This is an important thing for people to think about: Our number one priority is the success of the Windows Phone Ecosystem, in which Nokia is participating, so it is to our benefit to get that critical mass and virtuous cycle going which includes work done by  some of our handset competitors. We will encourage that, that’s a good thing. ”. We can’t prosper unless the ecosystem prospers. 

The next question Elop gets is “Why Microsoft not Android ?”  and his answer is interesting “What we assessed was 3 options…  internal: MeeGo Symbian and so-forth … the concerns about whether we could quickly enough develop a third ecosystem without the help of a partner like Microsoft … made that option concerning, absolutely concerning.” (Lovely way of putting it) “We explored the opportunity with the Google ecosystem … our fundamental belief is that we would have difficulties differentiating within that ecosystem – if we tipped over into the Android ecosystem, and there was a sense that was the dominant ecosystem at that point the commoditisation risk was very high, prices, profits everything being pushed down value being moved out to Google essentially. ”.

That begs a bunch of follow-up questions. "You don't think you can build an ecosystem with MeeGo, but HP think they can with WebOS – do you think they should be more … concerned?" "Is the Android handset market in a dash to the bottom, or are you saying Nokia joining that market would have made it so?" "How will Nokia differentiate when Microsoft has worked to give a consistent experience over all the Windows Phone 7 devices?" "Why won't value get moved out to Microsoft? (Isn't that the danger if you succeed and Microsoft's becomes the dominant ecosystem?)"
The financial question did get asked, and half answered: Nokia will pay royalties on the OS, and Microsoft will buy services from Nokia to strengthen the ecosystem. Who knows if Google would have offered them the same? Connections have little way to add value, and the opportunities for the handset makers are reducing; pushing stuff to the handsets is where the opportunity is today. I haven't made a lot of purchases since I got my HTC Trophy, but Microsoft take 30% of software sales from Marketplace, and if the margin on Music is the same they've made £10 out of me in 3 months; that's £80 over the two-year life of a phone. I never had much idea what Microsoft charged to put its software on a phone (you can buy Windows 7 Home Premium from a PC builder for £62 + VAT, so Windows Mobile 6.x / Windows Phone 7 netting more than £80 doesn't seem plausible). A year ago Microsoft made zero once a phone had been sold; now the revenue is reaching the point where a token license fee is feasible – I can't see Microsoft letting go completely. And of course the biggest ecosystem by unit volume is an OS given away to sell advertising. You can see Nokia wanting to be in an ecosystem where it is more than just a handset maker.

 

* I nearly used “It’s the ecosystem stupid”, or “Our priorities are ecosystem, ecosystem, ecosystem” for a title.
