January 19, 2010

How to Pretty Print XML from PowerShell, and output UTF, ANSI and other non-unicode formats

PowerShell has been taking more than its fair share of my time of late and I need to redress the balance a bit, just not quite yet.

PowerShell and redirection

I’ve been working on my Hyper-V library for CodePlex. It has a separate file for every command, and to keep the start-up process for the module manageable these are merged together; I have a very short script to do this. All the constants from the top of the files get grouped together at the top of the final file, but basically it is getting the files and outputting them to a destination using the > and >> operators. Then I got a mail from Ben, who wanted to sign the scripts but found a problem if they were saved as “Unicode BigEndian Text”. I hadn’t selected any encoding, but Unicode (UTF-16) is the default for text redirected with > and >> in PowerShell. One can use Out-File -Encoding ASCII, but that has another undesirable behaviour: it pads (or truncates) text to fit a given width. It turns out that, although the help files for Add-Content and Set-Content don’t mention it, both take -Encoding, so > can be replaced with | Set-Content -Encoding ASCII filename and >> can be replaced with | Add-Content -Encoding ASCII filename.
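
As a minimal sketch of that replacement, assuming a made-up folder of per-command files and a made-up merged file name (neither is the real layout of my library), the first write uses Set-Content and every later one uses Add-Content:

```powershell
# Hypothetical stand-ins for the real per-command files and the merged module
$workDir = Join-Path ([System.IO.Path]::GetTempPath()) 'merge-demo'
New-Item -ItemType Directory -Path $workDir -Force | Out-Null
'function Get-First { "one" }'  | Set-Content -Encoding ASCII (Join-Path $workDir 'Get-First.ps1')
'function Get-Second { "two" }' | Set-Content -Encoding ASCII (Join-Path $workDir 'Get-Second.ps1')

$merged = Join-Path $workDir 'Merged.psm1'

# Was:  '# Merged module' > $merged
'# Merged module' | Set-Content -Encoding ASCII $merged

# Was:  Get-Content $file >> $merged
Get-ChildItem -Path $workDir -Filter '*.ps1' | Sort-Object Name | ForEach-Object {
    Get-Content -Path $_.FullName | Add-Content -Encoding ASCII $merged
}

# ASCII output is one byte per character, so the file starts with '#' (0x23)
# rather than a byte-order mark
$bytes = [System.IO.File]::ReadAllBytes($merged)
$bytes[0].ToString('X2')
```

Because Set-Content and Add-Content write the strings they are given rather than formatted console output, nothing is padded or truncated to a width along the way.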

Pretty printing XML

Ben’s mail was rather timely because I had parked a problem with XML. I wrote about writing MAML help files some while back and I’m still using InfoPath to do the job: the formatting is wantonly nasty. So I wanted to reformat the files so that they were vaguely readable, and went and found various articles about how to do it (I think I ended up adapting this code of James’s, but I wish now I’d kept the link to be sure I’m assigning credit correctly).

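function Format-XML {Param ([string]$xmlfile)
  # Load the file; Resolve-Path gives .NET a full path instead of a relative one
  $doc = New-Object System.Xml.XmlDocument
  $doc.Load((Resolve-Path $xmlfile).ProviderPath)
  $sw = New-Object System.IO.StringWriter
  $writer = New-Object System.Xml.XmlTextWriter($sw)
  $writer.Formatting = [System.Xml.Formatting]::Indented
  # Assumed completion: push the document through the indenting writer and return the text
  $doc.WriteContentTo($writer)
  $writer.Flush()
  $sw.ToString()
}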

Of course I used > to redirect this to a file and it did not work, yet if I used | clip and pasted the result into Notepad all was well. Eventually it dawned on me that the first line of the file was
<?xml version="1.0" encoding="UTF-8"?>

And of course I was creating Unicode files, so the bytes on disk did not match the encoding the file declared. Swap the redirection for … | Set-Content -Encoding UTF8 and it all works. So I had my nicely formatted XML files providing help. And the next post will explain what it was all for.
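
To see the mismatch for yourself, here is a small sketch that writes the same text in both encodings and looks at the leading bytes. The sample XML, the file names and the Get-LeadingBytes helper are all made up for illustration, and -Encoding Unicode is used to reproduce the UTF-16 output explicitly rather than relying on the redirection default.

```powershell
# Stand-in for the formatted XML; the declaration promises UTF-8
$xmlText = @'
<?xml version="1.0" encoding="UTF-8"?>
<helpItems>
  <command>Example</command>
</helpItems>
'@

$dir       = [System.IO.Path]::GetTempPath()
$utf16File = Join-Path $dir 'help-utf16.xml'
$utf8File  = Join-Path $dir 'help-utf8.xml'

# The problem case: UTF-16 bytes behind a declaration that says UTF-8
$xmlText | Set-Content -Encoding Unicode $utf16File
# The fix: bytes that agree with the declaration
$xmlText | Set-Content -Encoding UTF8 $utf8File

# Hypothetical helper: show the first few bytes of a file as hex
function Get-LeadingBytes { Param ([string]$path, [int]$count = 4)
    $bytes = [System.IO.File]::ReadAllBytes($path)
    ($bytes[0..($count - 1)] | ForEach-Object { $_.ToString('X2') }) -join ' '
}

Get-LeadingBytes $utf16File   # FF FE 3C 00 : byte-order mark, then '<' as two bytes
Get-LeadingBytes $utf8File    # no 00 bytes; may begin with the UTF-8 mark EF BB BF
```

The pasted-into-Notepad copy worked because Notepad chose the encoding when it saved, which is why the problem only showed up with redirection.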

