Databases / SQL | James O'Neill's Blog

January 29, 2017

Sharing GetSQL

Filed under: Databases / SQL,Powershell — jamesone111 @ 8:03 pm

I’ve uploaded a tool named “GetSQL” to the PowerShell Gallery.
Three of my last four posts (and one upcoming one) are about sharing stuff which I have been using for a long time and GetSQL goes back several years – every now and then I add something to it, so it seems like it might never be finished, but eventually I decided that it was fit to be shared.

When I talk about good PowerShell habits, one of the themes is avoiding massive “do-everything” functions; it’s also good minimize the number of dependencies on the wider system state – for example by avoiding global variables. Sometimes it is necessary to put these guidelines to one side – the module exposes a single command, Get-SQL which uses argument completers (extra script to fill in parameter values). There was a time when you needed to get Jason Shirks TabExpansion Plus Plus to do anything with argument completers but, if you’re running a current version of PowerShell, Register-ArgumentCompleter is now a built in cmdlet. PowerShell itself helps complete parameter names – and Get-SQL is the only place I’ve made heavy use of parameter sets to strip out parameters which don’t apply (and had to overcome some strange behaviour along the way); having completers to fill in the values for the names of connections, databases, tables and column names dramatically speeds up composing commands – not least by removing some silly typos.

One thing about Get- commands in PowerShell is that if the PowerShell parser sees a name which doesn’t match a command where a command-name should be it tries putting Get- in front of the name so History runs Get-History, and SQL runs Get-SQL. That’s one of the reasons why the command didn’t begin life as “Invoke-SQL” and instead became a single command. I also wanted to be able to run lots of commands against the same database without having to reconnect each time. So the connection objects that are used are global variables, which survive from one call to the next – I can run
SQL “select id, name from customers where name like ‘%smith’ ” SQL “Select * from orders where customerID = 1234”
without needing to make and break connections each time – note, these are shortened versions of the command which could be written out in full as:
Get-SQL –SQL “Select * from orders where customerID = 1234”
The first time Get-SQL runs it needs to make a connection, so as well as –SQL it has a –Connection parameter (and extra switches to simplify connections to SQL server, Access and Excel). Once I realized that I needed to talk to multiple databases, I started naming the sessions and creating aliases from which the session can be inferred (I described how here) so after running
Get-SQL -Session f1 -Excel -Connection C:\Users\James\OneDrive\Public\F1\f1Results.xlsx(which creates a session named F1 with an Excel file containing results of formula one races) I can dump the contents of a table with
f1 –Table “[races]”

Initially, I wanted to either run a chunk of SQL –like the first example (and I added a –Paste switch to bring in SQL from the Windows Clipboard), or to get the whole of a table (like the example above with a –Table parameter), or see what tables had be defined and their structure so I added –ShowTables and –Describe. The first argument completer I wrote used the “show tables” functionality to get a list of tables and fill in the name for the –Table or –Describe parameters. Sticking to the one function per command rule one would mean writing “Connect-SQL”, “Invoke-RawSQL” “Invoke-SQLSelect”, and “Get-SQLTable” as separate commands, but it just felt right to be able to allow
Get-SQL -Session f1 -Excel -Connection C:\Users\James\OneDrive\Public\F1\f1Results.xlsx –showtables

It also just felt right to allow the end of a SQL statement to be appended to what the command line had built giving a command like the following one – the output of a query can, obviously, be passed through a PowerShell Where-Object command but makes much more sense to filter before sending the data back.
f1 –Table “[races]” “where season = 1977”

“where season = 1977” is the –SQL parameter and –Table “[races]” builds “Select * from [Races]” and the two get concatenated to
“Select * from [Races] where season = 1977”.
With argument completers for the table name, it makes sense to fill column names in as well, so there is a –Where parameter with a completer which sees the table name and fills in the possible columns. So the same query results from this:
f1 –Table “[races]” -where “season” “= 1977”
I took it a stage further and made the comparison operators work like they do in where-object, allowing
f1 –Table “[races]” -where “season” -eq 1977

The script will sort out wrapping things in quotes, changing the normal * into SQL’s % for wildcards and so on. With select working nicely (I added –Select to choose columns, –OrderBy, –Distinct and –GroupBy ) I moved on to insert, update (set) and Delete functionality. You can see what I meant when I said I keep adding bits and it might never be “finished”.

Sometimes I just use the module as a quick way to add a SQL query to something I’m working on: need to query a Lync/Skype backend database ? There’s Get-SQL. Line-of-business database ? There’s Get-SQL. Need to poke round in Adobe Lightroom’s SQL-Lite database? I do that with with Get-SQL - though in both cases its simple query building is left behind, it just goes back to its roots of running a lump of SQL which was created in some other tool. The easiest way to get data into Excel is usually the EXPORT part of Doug Finke’s excellent import-excel and sometimes it is easier to do it with insert or update queries using Get-SQL. The list goes on … when your hammer seems like a good one, you find an awful lot of nails.

Comments Off

July 5, 2013

PowerShell TabCompletion++ or TabCompletion#

Filed under: Databases / SQL,Powershell — jamesone111 @ 9:23 pm

One of the tricks of PowerShell V3 is that it is even cleverer with Tab completion / intellisense than previous versions, though it is not immediately obvious how you can take control of it. When I realised what was possible I had to apply it to a Get-SQL command command I had written. I wanted the ISE to be able to give me something like this

I happened to have Jan Egil Ring ‘s article on the subject for PowerShell magazine (where he credits another article by Tobias Weltner ) open in my browser, when I watched Jason Shirk gave a talk covering the module he has published via GitHub named TabExpansion ++. This module includes the custom tab completers for some PowerShell which don’t have them and some for legacy programs. Think about that for a moment, if you use NET STOP in PowerShell instead of in CMD, tab completion fills in the names of the services. Yes, I know PowerShell has a Stop-Service cmdlet, but if you’ve been using the NET commands since the 1980s (yes, guilty) why stop using them ?

More importantly Jason has designed a framework where you can easily add your own tab completers – which are the basis for intellisense in the ISE. On loading, his module searches all ps1 and psm1 files in the paths $env:PSModulePath and $env:PSArgumentCompleterPath for functions with the ArgumentCompleter attribute – I’ll explain that shortly. When it finds one, it extracts the function body and and "imports" it into the TabExpansion++ module. If I write argument completer functions and save them with my modules (which are in the PSModulePath) then when I load TabExpansion++ …. whoosh! my functions get tab completion.

A lot of my work at the moment involves dealing with data in a MySQL database, and I have installed the MySQL ODBC driver, and I wrote a function Named Get-SQL (which I can just invoke as SQL)
When I first wrote it, it was simple enough: leave an ODBC connection open as a global variable and pump SQL queries into it. After a while I found I was sending a lot of “Select * From table_Name” queries, and so I gave it a –Table parameter which would be built into a select query and a –gridview parameter which would sent the data to the PowerShell grid viewer. Then I found that I was doing a lot of “Desc table_name” queries, so I added a -describe parameter. One way and another the databases have ended up with long table names which are prone to mistyping, this seemed like a prime candidate for an argument completer, so I set about extending TabExpansion++ (does that make it TabExpansion#? if you haven’t noticed with C# the # sign is ++ ++ one pair above the other).

It takes 4 things to make a tab completer function. First: one or more ArgumentCompleter attributes [ArgumentCompleter(Parameter = 'table', Command = ('SQL','Get-SQL'), Description = 'Complete Table names for Get-SQL , for example: Get-SQL -GridView -Table ')]

This defines the parameter that the completer works with – which must be a single string. If the completer supports multiple parameters, you must use multiple ArgumentCompleter attributes.
And it defines the command(s) that the completer works with. The definition can be a string, an array of strings, or even a ScriptBlock.
The Second thing needed is a param block that understands the parameters passed to a tab completer. param($commandName, $parameterName, $wordToComplete, $commandAst, $fakeBoundParameter)

The main one here is $wordToComplete – the partially typed word that tab completion is trying to fill in. However as you can see in the screen shot it is possible to look at the parameters already completed and use them to produce the list of possible values.
$wordToComplete -is used in the third part is the body that gets those possible parameter value. So in my function I have something a bit like this…
$parameters = Get-TableName | Where-Object { $_ -like "$wordToComplete*" } | Sort-Object

And the final part is to return the right kind of object to tab completion process, and Jason’s module has a helper function for this
$parameters | ForEach-Object {$tooltip = "$_" New-CompletionResult $_ $tooltip}

There is the option to have different text as the tool tip – in some places $tooltip – which is shown in intellisense – would be set to something other than the value being returned. Here I’ve kept it in place to remind me rather than a calling New-CompletionResult $_ $_

And that’s it. Unload and reload TabExpansion++ and my SQL function now knows how to expand -table. I added a second attribute to allow the same code to handle -describe and then wrote something to get field names so I could have a picklist for –orderby and –select as well. With -select intellisense doesn’t pop up a second time if you select a name and enter a comma to start a second; but tab completion works. Here’s the finished item

Function SQLFieldNameCompletion { [ArgumentCompleter(Parameter = ('where'), Command = ('SQL','Get-SQL'), Description = 'Complete field names for Get-SQL , for example: Get-SQL -GridView -Table ')] [ArgumentCompleter(Parameter = ('select'), Command = ('SQL','Get-SQL'), Description = 'Complete field names for Get-SQL , for example: Get-SQL -GridView -Table ')] [ArgumentCompleter(Parameter = ('orderBy'), Command = ('SQL','Get-SQL'), Description = 'Complete field names for Get-SQL , for example: Get-SQL -GridView -Table ')] param($commandName, $parameterName, $wordToComplete, $commandAst, $fakeBoundParameter) If ($DefaultODBCConnection) { $TableName = $fakeBoundParameter['Table'] Get-SQL -describe $TableName | Where-Object { $_.column_name -like "$wordToComplete*" } | Sort-Object -Property column_name | ForEach-Object {$tooltip = $_.COLUMN_NAME + " : " + $_.TYPE_NAME New-CompletionResult $_.COLUMN_NAME $tooltip } } }

Which all saves me a few seconds a few dozen times a day.

Comments Off

August 9, 2012

Getting to the data in Adobe Lightroom–with or without PowerShell

Filed under: Databases / SQL,Photography,Powershell — jamesone111 @ 7:01 am

Some Adobe software infuriates me (Flash), I don’t like their PDF reader and use Foxit instead, apps which use Adobe-Air always seem to leak memory. But I love Lightroom . It does things right – like installations – which other Adobe products get wrong. It maintains a “library” of pictures and creates virtual folders of images ( “collections” ) but it maintains metadata in the images files so data stays with pictures when they are copied somewhere else – something some other programs still get badly wrong. My workflow with Lightroom goes something like this.

If I expect to manipulate the image at all I set the cameras to save in RAW, DNG format not JPG (with my scuba diving camera I use CHDK to get the ability to save in DNG)
Shoot pictures – delete any where the camera was pointing at the floor, lens cap was on, studio flash didn’t fire etc. But otherwise don’t edit in the camera.
Copy everything to the computer – usually I create a folder for a set of pictures and put DNG files into a “RAW” subfolder. I keep full memory cards in filing sleeves meant for 35mm slides..
Using PowerShell I replace the IMG prefix with something which tells me what the pictures are but keeps the camera assigned image number.
Import Pictures into Lightroom – manipulate them and export to the parent folder of the “RAW” one. Make any prints from inside Lightroom. Delete “dud” images from the Lightroom catalog.
Move dud images out of the RAW folder to their own folder. Backup everything. Twice. [I’ve only recently learnt to export the Lightroom catalog information to keep the manipulations with the files]
Remove RAW images from my hard disk

There is one major pain. How do I know which files I have deleted in Lightroom ? I don’t want to delete them from the hard-disk I want to move them later. It turns out Lightroom uses a SQL Lite database and there is a free Windows ODBC driver for SQL Lite available for download. With this in place one can create a ODBC data source – point it at a Lightroom catalog and poke about with data. Want a complete listing of your Lightroom data in Excel? ODBC is the answer. But let me issue these warnings:

Lightroom locks the database files exclusively – you can’t use the ODBC driver and Lightroom at the same time. If something else is holding the files open, Lightroom won’t start.
The ODBC driver can run UPDATE queries to change the data: do I need to say that is dangerous ? Good.
There’s no support for this. If it goes wrong, expect Adobe support to say “You did WHAT ?” and start asking about your backups. Don’t come to me either. You can work from a copy of the data if you don’t want to risk having to fall back to one of the backups Lightroom makes automatically

I was interested in 4 sets of data shown in the following diagrams. Below is image information with the Associated metadata, and file information. Lightroom stores images (Adobe_Images table) IPTC and EXIF metadata link to images – their “image” field joins to the “id_local” primary key in images. Images have a “root file” (in the AgLibraryFile table) which links to a library folder (AgLibraryFolder) which is expressed as a path from a root folder (AgLibraryRootFolder table). The link always goes to the “id_local” field I could get information about the folders imported into the catalog just by querying these last two tables (Outlined in red)

The SQL to fetch this data looks like this for just the folders
SELECT RootFolder.absolutePath || Folder.pathFromRoot as FullName FROM AgLibraryFolder Folder JOIN AgLibraryRootFolder RootFolder ON RootFolder.id_local = Folder.rootFolder ORDER BY FullName

SQLlite is one of the dialects of SQL which doesn’t accept AS in the FROM part of a SELECT statement . Since I run this in PowerShell I also put a where clause in which inserts a parameter. To get all the metadata the query looks like this
SELECT rootFolder.absolutePath || folder.pathFromRoot || rootfile.baseName || '.' || rootfile.extension AS fullName, LensRef.value AS Lens, image.id_global, colorLabels, Camera.Value AS cameraModel, fileFormat, fileHeight, fileWidth, orientation , captureTime, dateDay, dateMonth, dateYear, hasGPS , gpsLatitude, gpsLongitude, flashFired, focalLength, isoSpeedRating , caption, copyright FROM AgLibraryIPTC IPTC JOIN Adobe_images image ON image.id_local = IPTC.image JOIN AgLibraryFile rootFile ON rootfile.id_local = image.rootFile JOIN AgLibraryFolder folder ON folder.id_local = rootfile.folderJOIN AgLibraryRootFolder rootFolder ON rootFolder.id_local = folder.rootFolder JOIN AgharvestedExifMetadata metadata ON image.id_local = metadata.image LEFT JOIN AgInternedExifLens LensRef ON LensRef.id_Local = metadata.lensRef LEFT JOIN AgInternedExifCameraModel Camera ON Camera.id_local = metadata.cameraModelRef ORDER BY FullName

Note that since some images don’t have a camera or lens logged the joins to those tables needs to be a LEFT join not an inner join. Again the version I use in PowerShell has a Where clause which inserts a parameter.

OK so much for file data – the other data I wanted was about collections. The list of collections is in just one table (AgLibraryCollection) so very easy to query, and but I also wanted to know the images in each collection.

Since one image can be in many collections,and each collection holds many images AgLibraryCollectionImage is a table to provide a many to relationship. Different tables might be attached to AdobeImages depending on what information one wants from about the images in a collection, I’m interested only in mapping files on disk to collections in Lightroom, so I have linked to the file information and I have a query like this.

SELECT Collection.name AS CollectionName , RootFolder.absolutePath || Folder.pathFromRoot || RootFile.baseName || '.' || RootFile.extension AS FullName FROM AgLibraryCollection Collection JOIN AgLibraryCollectionimage cimage ON collection.id_local = cimage.Collection JOIN Adobe_images Image ON Image.id_local = cimage.imageJOIN AgLibraryFile RootFile ON Rootfile.id_local = image.rootFile JOIN AgLibraryFolder Folder ON folder.id_local = RootFile.folder JOIN AgLibraryRootFolder RootFolder ON RootFolder.id_local = Folder.rootFolder ORDER BY CollectionName, FullName

Once I have an ODBC driver (or an OLE DB driver) I have a ready-made PowerShell template for getting data from the data source. So I wrote functions to let me do :
Get-LightRoomItem -ListFolders -include $pwd
To List folders, below the current one, which are in the LightRoom Library
Get-LightRoomItem -include "dive"
To list files in LightRoom Library where the path contains "dive" in the folder or filename
Get-LightRoomItem | Group-Object -no -Property "Lens" | sort count | ft -a count,name
To produce a summary of lightroom items by lens used. And
$paths = (Get-LightRoomItem -include "$pwd%dng" | select -ExpandProperty path) ; dir *.dng | where {$paths -notcontains $_.FullName} | move -Destination scrap -whatif
Stores paths of lightroom items in the current folder ending in .DNG in $paths; then gets files in the current folder and moves those which are not in $paths (i.e. in Lightroom.) specifying -Whatif allows the files to be confirmed before being moved.

Get-LightRoomCollection to list all collections
Get-LightRoomCollectionItem -include musicians | copy -Destination e:\raw\musicians to copies the original files in the “musicians” collection to another disk

I’ve shared the PowerShell code on Skydrive

Comments Off