James O'Neill's Blog

February 7, 2011

How to read Excel files in PowerShell.

Filed under: Uncategorized — jamesone111 @ 2:44 pm

I picked up a post of Lee Holmes’ entitled Does Your Hard Work Advance the Ecosystem; which  joins a discussion that started with what Lee terms a “fairly chewy piece” on the Lync PowerShell blog. That was about reading stuff from the cells of an Excel spreadsheet to set up users. Over here Jason basically says “very clever and all that, but wouldn’t you just convert to .CSV ?”  Lee’s piece is worth reading in full, but the key points are

  1. Some people will do things the hard way.  A few lines of PowerShell can replace hundreds in something like C#. [Side note: I’ve been working in C# recently and I know this only too well]. The experience of some hard-core programmers stops them imagining an easy way.
  2. There isn’t anything built into PowerShell to read XLS / XLSx files, so showing people how Excel can be used does move things forward. And PowerShell is about automating: starting a process by manually loading files into Excel and converting isn’t ideal.
  3. It is easy to write up and share what you did for a specific case. But a generic is better –providing an “Import-XLS” command is more useful than CreateLyncUsersFromXLS.PS1

I have to agree with both Jason and Lee – one probably would load the file in the Excel, and that isn’t the ideal. I’ve been manipulating the Office object model on and off since the early 1990’s so I really get Jason’s point that there is more to learn there than makes sense for an admin to pick up to get round the problem of people not saving in .CSV format when asked.

So here, for any admin confronted with that problem is ConvertFrom-XLx – it accepts a path to a file, or a file object from the pipeline ,  (so dir *.xlsx | convertFrom-XLx will work). It checks that the file exists, and ends with .XL<something> – that’s the little regular expression used with the -match and -replace operators. The -replace puts .CSV into the save name and the file is then opened, saved as CSV (that’s type 6 in a SaveAs operation), and closed (false says don’t stop to ask the user anything on closing)
Finally if a –PassThru parameter is specified the CSV file can be read in and passed to the next step in a pipeline.  

function ConvertFrom-XLx {
  param ([parameter(             Mandatory=$true,
                         ValueFromPipeline=$true,
           ValueFromPipelineByPropertyName=$true)]
         [string]$path ,
         [switch]$PassThru
        )

  begin { $objExcel = New-Object -ComObject Excel.Application }
Process { if ((test-path $path) -and ( $path -match ".xl\w*$")) {
                  
 $path = (resolve-path -Path $path).path
                $savePath = $path -replace ".xl\w*$",".csv"
              $objworkbook=$objExcel.Workbooks.Open( $path)
            
 $objworkbook.SaveAs($savePath,6) # 6 is the code for .CSV
              $objworkbook.Close($false)
              if ($PassThru) {Import-Csv -Path $savePath }
         
}
        
 else {Write-Host "$path : not found"}
        }
   end  { $objExcel.Quit() }
}

3 Comments

  1. I have always found it odd that people spend a lot of time working with CSV files and controlling Excel programatically when it has already been done with OLEDB. I have been using this for years with vbscript.

    The .Net version is slightly more fiddley but works a treat and lets you use standard SQL queries, gives you back column headings and a rich interface to tell you about the data retrurned. See below for an example. There are only small differences to the connection string for different versions of Excel and even to read CSV files (although that is not such an issue with Powershell).

    You don’t need Excel on the system where you are running the scripts, just the appropriate data access component. For OLEDB.NET and Office 2010, this is the Microsoft Access Database Engine 2010 Redistributable “

    Example
    $OleDbConn = New-Object “System.Data.OleDb.OleDbConnection”
    $OleDbCmd = New-Object “System.Data.OleDb.OleDbCommand”
    $OleDbAdapter = New-Object “System.Data.OleDb.OleDbDataAdapter”
    $DataTable = New-Object “System.Data.DataTable”

    $OleDbConn.ConnectionString = “Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Users\swhite26\Documents\asfcusers.xlsx;Extended Properties=””Excel 12.0 Xml;HDR=YES””;”
    $OleDbConn.Open()

    $OleDbCmd.Connection = $OleDbConn
    $OleDbCmd.commandtext = “Select * from [Sheet1$]”
    $OleDbAdapter.SelectCommand = $OleDbCmd

    $RowsReturned = $OleDbAdapter.Fill($DataTable)

    ForEach ($DataRec in $DataTable) {
    Write-host “sAMAccountName=$($DataRec.sAMAccountName)”
    }

    $OleDbConn.Close()

    Comment by Stuart White — February 7, 2011 @ 3:22 pm

  2. I don’t want to put people off from using OLEDB by making it appear overly complicated. The previous example stores each individual step. In fact you can reduce the same process down to just a few lines.

    Example
    $DataTable = New-Object “System.Data.DataTable”

    $OleDbAdapter = New-Object System.Data.OleDb.OleDbDataAdapter “Select * from [Sheet1$]”,”Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Users\swhite26\Documents\asfcusers.xlsx;Extended Properties=””Excel 12.0 Xml;HDR=YES””;”
    $RowsReturned = $OleDbAdapter.Fill($DataTable)

    ForEach ($DataRec in $DataTable) {
    Write-host “sAMAccountName=$($DataRec.sAMAccountName)”
    }

    Comment by Stuart White — February 7, 2011 @ 4:59 pm

  3. The .Net version is slightly more fiddley but works a treat and lets you use standard SQL queries, gives you back column headings and a rich interface to tell you about the data retrurned. See below for an example. There are only small differences to the connection string for different versions of Excel and even to read CSV files (although that is not such an issue with Powershell).
    +1

    Comment by Alejoa LeClaire — February 17, 2011 @ 6:40 pm


RSS feed for comments on this post.

Create a free website or blog at WordPress.com.