A neat PowerShell Performance trick

Everyone knows that in PowerShell you work with objects and it is a best practice to always stick to objects, not strings or any custom format.

However, sometimes you just have to deviate from best practices to reach your own goal. That is probably also how you learn new things: first you learn the basic rules without understanding them, then you keep following the rules while understanding better and better why they look the way they do, and at some point you come up with your own ideas and solutions, which deviate from the common rules but let you reach your specific goal.

So, I recently had a performance issue in one of my scripts. For quite a long time I was not able to find the bottleneck, and the place where I finally found it was quite surprising to me. I timed every suspect block of my code with Measure-Command and analyzed how it could be improved. I did not find many places to improve, but as often happens with performance work, you only need to find the one real bottleneck and get rid of it, and that was the case here as well.

Let’s assume we work with Active Directory and need to use Get-ADComputer to get necessary information and store it in our custom data structure:

$searchBase = 'OU=test,DC=contoso,DC=com'
$properties = 'OperatingSystem', 'whenCreated', 'CN', 'DistinguishedName'
$computerObject = New-Object 'PSObject'
$computerObject | Add-Member -MemberType NoteProperty -Name 'whenCreated' -Value $null
$computerObject | Add-Member -MemberType NoteProperty -Name 'OperatingSystem' -Value $null
$computerObject | Add-Member -MemberType NoteProperty -Name 'DistinguishedName' -Value $null
$computerObject | Add-Member -MemberType NoteProperty -Name 'CN' -Value $null
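
For comparison, the same empty object can also be built in a single expression. This is only a sketch and assumes PowerShell 3.0 or later, where the [pscustomobject] type accelerator is available; it is an alternative spelling, not the code the timings in this post were taken with.

```powershell
# Alternative to New-Object + Add-Member (assumes PowerShell 3.0+).
# Property order follows the order of the keys in the hashtable literal.
$computerObject = [pscustomobject]@{
    whenCreated       = $null
    OperatingSystem   = $null
    DistinguishedName = $null
    CN                = $null
}

# List the property names to confirm the shape of the object.
$computerObject.PSObject.Properties.Name
```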

Then we just loop through all computers and store the information in our custom object. Here I have added Measure-Command to time the foreach loop:

$computers = Get-ADComputer -Filter * -SearchBase $searchBase -Properties $properties
Measure-Command -Expression {  
    foreach ($computer in $computers) {
        $computerObject.whenCreated = $computer.whenCreated
        $computerObject.OperatingSystem = $computer.OperatingSystem
        $computerObject.DistinguishedName = $computer.DistinguishedName
        $computerObject.CN = $computer.CN
    }
}

This way is straightforward and normal, and you should probably use it in most cases. However, as I will show below, it is not the best way from a performance perspective, and you may even be forced to avoid it when you work with thousands of objects.

Now, let's try the same thing but add the Select-Object cmdlet to receive only the necessary properties. I know this is not really recommended or considered best practice, but let's try it anyway:

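# Keep only the needed properties; the result is a set of plain custom objects
$selectedComputers = $computers | Select-Object -Property $properties
Measure-Command -Expression {
    foreach ($computer in $selectedComputers) {
        $computerObject.whenCreated = $computer.whenCreated
        $computerObject.OperatingSystem = $computer.OperatingSystem
        $computerObject.DistinguishedName = $computer.DistinguishedName
        $computerObject.CN = $computer.CN
    }
}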

And now compare the results: the latter way is almost 4x faster.
[Image: results]
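
If you want to repeat the comparison yourself, a small harness along the following lines may help. It is a sketch under stated assumptions: the mock objects, the variable names $mockComputers and $wantedProperties, and the helper Measure-PropertyCopy are all made up for illustration so that the code runs without Active Directory. Plain custom objects will not reproduce the figures above, so swap in real Get-ADComputer output to see the effect.

```powershell
# Hypothetical stand-in data so the sketch runs without Active Directory.
$wantedProperties = 'OperatingSystem', 'whenCreated', 'CN', 'DistinguishedName'
$mockComputers = foreach ($i in 1..1000) {
    [pscustomobject]@{
        OperatingSystem   = 'Windows Server'
        whenCreated       = (Get-Date).AddDays(-$i)
        CN                = "PC$i"
        DistinguishedName = "CN=PC$i,OU=test,DC=contoso,DC=com"
        Description       = 'extra property the loop never reads'
    }
}

# Hypothetical helper: times the property-copy loop for any input collection.
function Measure-PropertyCopy {
    param([object[]]$InputObjects)

    $target = [pscustomobject]@{
        whenCreated       = $null
        OperatingSystem   = $null
        DistinguishedName = $null
        CN                = $null
    }
    Measure-Command -Expression {
        foreach ($item in $InputObjects) {
            $target.whenCreated       = $item.whenCreated
            $target.OperatingSystem   = $item.OperatingSystem
            $target.DistinguishedName = $item.DistinguishedName
            $target.CN                = $item.CN
        }
    }
}

# Time the Select-Object step on its own so its cost is visible too.
$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
$selected = $mockComputers | Select-Object -Property $wantedProperties
$stopwatch.Stop()

$directLoop   = Measure-PropertyCopy -InputObjects $mockComputers
$selectedLoop = Measure-PropertyCopy -InputObjects $selected

# Report all three timings side by side.
[pscustomobject]@{
    DirectLoopMs   = $directLoop.TotalMilliseconds
    SelectObjectMs = $stopwatch.Elapsed.TotalMilliseconds
    SelectedLoopMs = $selectedLoop.TotalMilliseconds
}
```

Timing the Select-Object call separately is a deliberate choice here: it lets you check whether the loop gain still pays off once the cost of creating the trimmed objects is counted.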

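How is this possible? And, more importantly, why? To cut a long story short, every property read on the original object has a cost, so the more often you extract properties from such an object, the slower your script is. Select-Object copies the requested values into a lightweight custom object, and reading from that copy is cheap.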
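
The idea can be illustrated without Active Directory. The sketch below is an analogy, not a description of how AD objects work internally: it uses a made-up object whose CN property is a ScriptProperty that counts how often it is evaluated. Reading the original object runs the getter every time, while the copy produced by Select-Object holds a stored value.

```powershell
# Hypothetical stand-in for an object whose properties are computed on access.
$counter = @{ Calls = 0 }
$source = New-Object -TypeName PSObject
$source | Add-Member -MemberType NoteProperty -Name 'Counter' -Value $counter
$source | Add-Member -MemberType ScriptProperty -Name 'CN' -Value {
    $this.Counter.Calls++   # record every evaluation of the getter
    'PC001'
}

# Each read of the original object runs the getter again.
$null = $source.CN
$null = $source.CN
$callsAfterTwoReads = $counter.Calls

# Select-Object stores the value of CN on a new custom object.
$snapshot = $source | Select-Object -Property CN
$callsAfterSnapshot = $counter.Calls

# Reading the copy does not touch the original getter any more.
$null = $snapshot.CN
$null = $snapshot.CN
$callsAfterCopyReads = $counter.Calls

"Getter calls after two direct reads: $callsAfterTwoReads"
"Getter calls added by reading the copy twice: $($callsAfterCopyReads - $callsAfterSnapshot)"
```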

Therefore, if you deal with a huge number of objects, performance probably matters, and with the Select-Object trick described above you can extract and work with only the necessary properties, which should make your script considerably faster.

One obvious drawback of my approach is that once you call Select-Object you immediately lose the original type of the object and end up with a custom one, which may well bite you later:

[Image: log]
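
The type change is easy to see on any typed object. In this sketch Get-Date merely stands in for an AD object (an assumption made so the code runs anywhere); the same thing happens to whatever you pipe into Select-Object with a property list.

```powershell
# Get-Date stands in for an AD object here; it is just a convenient typed object.
$original = Get-Date
$selected = $original | Select-Object -Property Year, Month

$original.GetType().FullName        # System.DateTime
$selected.GetType().FullName        # System.Management.Automation.PSCustomObject
$selected.PSObject.TypeNames[0]     # Selected.System.DateTime

# Methods of the original type are not available on the selected copy.
$null -ne $original.PSObject.Methods['AddDays']   # True
$null -ne $selected.PSObject.Methods['AddDays']   # False
```

Any later code that checks the type or calls a method of the original class will therefore fail on the selected objects.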

So, be careful, don't just blindly follow this technique - always consider best practices and measure performance.
PS. This and other useful techniques are also described here.
