HOWTO: Automatically Trigger Debug Logging When CPU Under Load

I am troubleshooting an issue where a domain controller pins its CPU at 100% for extended periods. As a result, LDAP authentication requests fail, causing organization-wide application failures. I recently posted a HOWTO on the setup and configuration of the Server Performance Advisor (SPA) tool from Microsoft. This tool lets you determine which specific machines are querying a domain controller and even which specific LDAP queries they are running. You can find more on that HOWTO here.
The problem now is that the tool generates a tremendous amount of data, so it’s not feasible to leave it running all the time. I wanted to find a way to trigger the SPA only during periods of high CPU utilization. While I was unfortunately unable to find an off-the-shelf solution, thanks to PowerShell I was able to quickly develop one.

The script below is a rough-around-the-edges, purpose-built script designed to be run on an offending domain controller. It requires that the Server Performance Advisor has already been set up and configured. Assuming that’s done, you simply run this script on your domain controller for an extended period of time. The script checks CPU utilization (by default every 60 seconds) and, if high CPU usage is detected, automatically triggers the SPA.

Once you have let this run for a while, you can use the SPA Report Explorer to determine which specific clients and queries were the most expensive and take action to resolve them. An example of what the script can do for you is shown below. Each report shown was automatically triggered during a period of high CPU usage, and each captures 2 minutes’ worth of detailed logs.

reportexplorer

If you find this useful, let me know in the comments below!

# This script is designed to be run on a Domain Controller that is experiencing high CPU utilization
# This script assumes that an SPA Database has already been deployed and configured

$SPAPAth = "D:\SPA" # The location of the extracted files for the Server Performance Advisor
$AdminUser = "domain\username" # A user account that has "Logon as Batch Job" permission on the target server
$SecondsToWait = 60 # How long to wait between checks. Recommended: 60 (seconds)
$CPUThreshold = 95 # When CPU usage reaches this level as a percent, trigger the SPA. Recommended: 95 (%)
$LogDuration = 120 # The duration of time the SPA should run once triggered. Recommended: 120 (seconds)
$SQLServer = "SQLSERVER01" # The hostname of the server that is hosting SQL Server
$SPADatabase = "SPA1" # The name of the SPA Database that will be used. This will be created in advance when you first run the wizard from SPAConsole.exe 

# Import the SPA cmdlets, specifically Start-SPAAnalysis
Import-Module -Name $SPAPAth\SpaCmdlets.dll 

# Function to display a countdown progress bar. The script is designed to run for long periods of time, and this helps validate that it is still functioning
Function Wait-CountDown ($WaitTimeInSeconds) {
 # Count down on a local copy of the parameter instead of the global $SecondsToWait setting
 $Remaining = $WaitTimeInSeconds
 while($Remaining -gt 0) {
 Write-Progress -Activity "Waiting for next execution..." -Status "$Remaining seconds remaining..." -PercentComplete (100 * $Remaining / $WaitTimeInSeconds)
 Start-Sleep -Seconds 1
 $Remaining-- }
 # Clear the progress bar once the wait is over
 Write-Progress -Activity "Waiting for next execution..." -Completed
}

# Create prompt to type in password for the account with log on as batch job rights to server
$Cred = Get-Credential -UserName $AdminUser -Message "Account with 'log on as batch job right' on destination server:"

# Loop indefinitely or until the user stops the script
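While($true) {
 # Get the current CPU usage as a percent, measured over a 2 second sample
 $ProcCounter = Get-Counter -Counter "\Processor(_Total)\% Processor Time" -SampleInterval 2
 # Read the numeric value directly rather than parsing the formatted Readings text
 $CPUUsage = [math]::Round($ProcCounter.CounterSamples[0].CookedValue)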

 # If the threshold is met or exceeded, it means the domain controller is busy. Start the SPA logger to determine exactly why it is busy
 if($CPUUsage -ge $CPUThreshold)
 { Start-SpaAnalysis -ServerName "localhost" -AdvisorPackName Microsoft.ServerPerformanceAdvisor.AD.V1 -Duration $LogDuration -SqlInstanceName $SQLServer -SqlDatabaseName $SPADatabase -Credential $Cred }
 Wait-CountDown $SecondsToWait
}


HOWTO: Configure Server Performance Advisor to troubleshoot Domain Controller Performance

Have you ever found one of your domain controllers pinned at 100% CPU with lsass.exe using up 99% of it? Have you been baffled about what to do next and how to figure out who and what is causing it? Thankfully, Microsoft has created a tool to aid us in our troubleshooting. I’ve found, however, that there is remarkably little documentation online for this tool, so hopefully this document will help those who are trying to get it working.

This HOWTO describes how to use a downloadable tool called the Server Performance Advisor (SPA) to troubleshoot situations where a Windows domain controller is experiencing high CPU utilization.

The SPA tool is a free download from Microsoft and can be downloaded here:

http://download.microsoft.com/download/0/3/D/03D07D11-18D4-4160-B4AC-915061B85669/SPAPlus_amd64.cab
(At the time of this writing, the most recent version is v3.1)

Please note that this tool requires a SQL Server to function.  Fortunately, the free Express edition will work.  In my case I used SQL 2008 R2.  You can grab that here:

http://www.microsoft.com/en-ca/download/details.aspx?id=23650

SQL Server Installation

 

For this use case, I will install SQL Server on my laptop. You could also use an existing SQL Server if you have one, or build a new VM.

  • When you run the installer, select New installation or add features to an existing installation

clip_image001


HOWTO: Real World PowerShell Solved for Absolute Beginners

I had a conversation with a coworker today in which he expressed an interest in learning PowerShell. He knew it was critical to his future in IT and could help him solve many day-to-day challenges, but he felt that it was too intimidating and didn’t know where to begin. I decided I would do my best to give him a clear and concrete place to start. That’s where this blog post steps in. Over the remainder of this post, I will attempt to introduce the core concepts of PowerShell in such a way that you can be immediately productive with it and solve real-world problems without feeling burdened by learning the entire syntax and structure all at once.

Let’s start by outlining our scenario. We are working on a Windows 8 client machine that a user reports is “acting funny”: random pop-up advertisements are appearing on the screen. You immediately suspect some kind of malware, but you need to investigate to confirm your suspicions. How would you go about solving this problem?

The first thing you might do is open up Task Manager and select the Details tab to see if you can spot any unusual processes.

image

Unfortunately there are so many that it’s difficult to isolate and determine which ones are valid and which ones are not.  Now what we might do next if we are desperate is to go through every process one at a time and try to figure out what they do and if they are legitimate or not.  You could do that… or you could use PowerShell.  It might seem scary but let’s see if this PowerShell thing can do anything to help us here.

  • To launch PowerShell, simply click the Start button and type ‘powershell’ or launch it through the Start menu by clicking this icon

image

  • A window will appear where you can enter commands.  The question now is —  What do I type?

image


HOWTO: Monitor Concurrent Network Connections with PowerShell

This quick HOWTO is a PowerShell script I wrote to monitor concurrent connections to a server. In this case we have a domain controller that is not behaving properly, and I suspect it may be due to some kind of port exhaustion. The script is very quick and dirty, but since it works I figured I’d share it. Note that the heavy lifting is done by TCPVCON.EXE from Sysinternals (http://technet.microsoft.com/en-ca/sysinternals/bb897437.aspx). I also include the current CPU utilization so I can check whether periods of high CPU coincide with an unusually high connection count.
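As a rough illustration of the counting step (not the script from this post), the sketch below tallies connections per TCP state from CSV-style lines. The function name Get-ConnectionStateCount, the sample rows, and the assumed column layout (protocol, process, PID, state, local address, remote address) are all assumptions made for the example; check the layout against what tcpvcon actually prints on your system.

```powershell
# Hypothetical helper: counts TCP connections per state from tcpvcon-style CSV lines.
# Assumed column layout: Protocol,Process,PID,State,LocalAddress,RemoteAddress
function Get-ConnectionStateCount {
    param([string[]]$Lines)

    $counts = @{}
    foreach ($line in $Lines) {
        $fields = $line -split ','
        # Skip anything that is not a TCP row with a state column
        if ($fields.Count -lt 4 -or $fields[0] -notmatch '^TCP') { continue }
        $state = $fields[3].Trim()
        # A missing key reads as $null, which casts to 0
        $counts[$state] = [int]($counts[$state]) + 1
    }
    return $counts
}

# Illustrative sample rows (invented, not real capture data)
$sample = @(
    'TCP,lsass.exe,624,ESTABLISHED,10.0.0.5,10.0.0.21',
    'TCP,lsass.exe,624,ESTABLISHED,10.0.0.5,10.0.0.22',
    'TCP,svchost.exe,912,LISTENING,0.0.0.0,0.0.0.0'
)
$result = Get-ConnectionStateCount -Lines $sample
"{0} established connection(s)" -f $result['ESTABLISHED']
```

On a live server the lines could come from something like `& .\tcpvcon.exe -a -c -n -accepteula`, where -c requests CSV output and -n skips name resolution.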

HOWTO: Analyze Very Large Text Files with PowerShell and Python

There are countless situations where an IT professional needs to parse through a log file. In most instances, notepad/notepad2/notepad++ are enough to get in, find the information required and get out. But what if your log files are large? As in 20GB a day large? None of the typical editing tools will help you in this case, as the files are simply too large to open. You can switch to more specialized text viewers such as LogExpert, but what if you need to actually manipulate the data in these log files to perform some kind of analysis? In other words, what if you have to review every single line in that 20GB a day file and compare it to some other line? And before you can even do that, you have to reformat the data, because the original source includes a bunch of cruft and formatting that you simply don’t want. This is where things get interesting, and this is what this blog post will help you to solve.

In this specific scenario, I have a Windows DNS server that is very heavily used by tens of thousands of endpoints. The request from management was to identify the two-hour block of time over the course of a week in which the DNS servers are least utilized. In addition, they wanted to know the top 20 most requested DNS records during a given 24-hour period. At first blush, I thought this would be fairly simple:

 

  1. Enable DNS Debug Logging within the DNS Management Console
  2. Capture 24 hours worth of data
  3. Parse the resulting file to extract the date stamps and queries for that time period
  4. Group the results such that we can find the total number of queries as well as the most popular queries

I initially tried this approach, and while it would have worked with smaller files, it failed miserably in this case due to the size of the data involved. This DNS server was generating log files in the neighborhood of 240MB per minute. My initial parsing code to extract the query names was taking about 90 minutes to run on each file. As a result, a large number of queries were missed. I eventually realized that if I was going to solve this problem, I was going to have to get clever and optimize.
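To make the parse-and-group steps concrete, here is a minimal sketch (not the optimized code this post goes on to describe) that streams a file line by line and counts query names in a hashtable. It assumes query names appear in the length-prefixed form (3)www(7)example(3)com(0); the function name Get-TopQueryName and the sample lines are hypothetical.

```powershell
# Hypothetical helper: stream a large log and return the most requested names.
# Assumption: query names look like (3)www(7)example(3)com(0) somewhere in the line.
function Get-TopQueryName {
    param(
        [string]$Path,   # use a full path; .NET does not follow the PowerShell location
        [int]$Top = 20
    )

    $counts = @{}
    $pattern = [regex]'(\(\d+\)[^()\s]+)+\(0\)'

    # ReadLines enumerates lazily, so the whole file is never held in memory
    foreach ($line in [System.IO.File]::ReadLines($Path)) {
        $match = $pattern.Match($line)
        if (-not $match.Success) { continue }

        # Turn (3)www(7)example(3)com(0) into www.example.com
        $name = ($match.Value -replace '\(\d+\)', '.').Trim('.').ToLower()
        $counts[$name] = [int]($counts[$name]) + 1
    }

    $counts.GetEnumerator() |
        Sort-Object -Property Value -Descending |
        Select-Object -First $Top -Property @{Name='Query';Expression={$_.Key}}, @{Name='Count';Expression={$_.Value}}
}

# Illustrative run against a small temporary file (invented sample lines)
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value @(
    'Rcv 10.0.0.7 Q A (3)www(7)example(3)com(0)',
    'Rcv 10.0.0.8 Q A (3)WWW(7)example(3)com(0)',
    'Rcv 10.0.0.9 Q A (4)mail(7)example(3)com(0)',
    'a line with no query name'
)
$top = Get-TopQueryName -Path $tmp -Top 2
Remove-Item $tmp
$top
```

A hashtable keeps the grouping to a single pass over the data, which tends to matter more than anything else at this file size.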


HOWTO: Windows 10 Technical Preview Feedback

Microsoft has released a technical preview of its upcoming consumer desktop operating system, Windows 10.  Anyone can download it (amazingly, you don’t even need to sign in or sign up for anything to grab it!) from here: http://windows.microsoft.com/en-us/windows/preview-iso

I’ve been playing with it for a few minutes now and already I’ve noticed something fundamentally different compared to the Windows 8 preview.  Back then when I first loaded it up, I immediately found myself cursing at the new start menu.  Unfortunately, other than this blog and various forums, I had no venue to voice my frustrations and certainly no venue where my complaints wouldn’t disappear into the abyss.  Flash forward to this Windows 10 technical preview and I was shocked by a dialog box that appeared when I tried to open the control panel.  It asked me: "Do you prefer to use the control panel or the PC Settings panel by default?" This wasn’t an OS configuration setting though.  No, since this is a preview, this was a feedback prompt.  I was being asked not only what I wanted, but given an opportunity to say why!

I then started exploring this a little further.  I noticed that there is a search icon on the start menu that annoys me and that I wanted to remove.  Surprisingly (or perhaps not), I discovered there was no obvious means of removing it.  That’s when I ran the "Feedback" app right from within Windows 10.  After registering my (test) Windows Live account, I found this:

image


HOWTO: Install a 2 tier Windows 2012 R2 AD Integrated PKI Infrastructure

Earlier this year I was fortunate enough to spend the day with Mike MacGillivray, a Professional Field Engineer from Microsoft who specializes in Microsoft Public Key Infrastructure, or PKI.  During that meeting we ended up building a brand new Microsoft PKI platform in our development environment.  I took a ton of screenshots and documented the process as best I could.  Since I had this content anyway, I have opted to share what I have.  Note that I have obfuscated any details that would be deemed company specific.  Fortunately, since this was built in our development environment, there isn’t a lot of that.  Without further ado, I present a guide to building a Windows 2012 R2 Public Key Infrastructure.

Through this you will build and configure:

  • A Windows 2012 R2 PKI Root Server
  • A Windows 2012 R2 PKI Issuing/Enterprise Server
  • Certificate Revocation Lists (CRLs) published through IIS configured through (in our case) an Exchange Client Access Server (CAS)
  • Certificate Web Enrollment

A high level break down of the environment is shown below:

image

The remainder of this post describes the specific steps necessary to create and configure a PKI environment.  Before we begin you will need:

  • A Windows 2012 R2 server that is not joined to the domain and that will host the offline root certificate.  This must be protected at all costs (FSRVDEVRCA1)
  • A Windows 2012 R2 server that is joined to the domain.  This is the enterprise/subordinate/issuing CA and will be used to issue and revoke client certificates (FSRVDEVECA1)
  • A highly available IIS web server to host the CRL.  We have chosen to use our Exchange Client Access (CAS) servers for this purpose (FSRVDEVCAS1)


HOWTO: Verify free space on C: for all servers with PowerShell

We will be applying Windows updates to all of our production servers shortly, and I wanted to manually verify that we had at least 2GB free on all of our C: drives in preparation for the update.  We have automated monitoring systems, but I wanted to grab the information directly to ensure it was accurate.  I exported a list of all of our servers into a text file and ran the following PowerShell to produce a list of the free space on each C: drive.  I then copied and pasted this into Excel.  Quick and dirty, but it works great.

 

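# The host list is read as CSV, so the file needs a header row named ServerList
$computers = Import-Csv c:\temp\hostlist.txt

$computers | % {
# Keep the server name in its own variable; $_ means something else inside Select-Object
$Server = $_.ServerList
$Results = Get-WmiObject Win32_LogicalDisk -ComputerName $Server -Filter "DeviceID='C:'" -ErrorAction SilentlyContinue |
Select-Object Size,@{Name="FreeSize";Expression={"{0:N1}" -f ($_.FreeSpace/1GB) } }
# Only print servers that actually returned a result
if ($Results) { Write-Host $Server $Results.FreeSize }
}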

HOWTO: Parse HTML using PowerShell

Unimportant Backstory

Today I was unfortunate enough to discover that one of the drives in my FreeNAS box had failed.  I replaced the drive and wanted to watch the progress of the rebuild.  If you log into the FreeNAS web management console, there is a section that shows you the number of sectors synchronized and the percent complete.  But that’s only useful if you stare at it.  I want to know if it’s locked up, which would require grabbing this value and, if it doesn’t change after a certain period, sending an email alert.

But before I can do any of that, I need to start with the basics and figure out how to pull the actual HTML from the website so I can parse it and do interesting things like that from there.

Important Part

The code below has the following capabilities:

  • Is able to programmatically authenticate against any PHP-based (and possibly other) authentication mechanism
  • Connects to a specific URL and pulls down all of the raw HTML for that page into a variable for further manipulation

This is certainly a handy snippet to keep in your back pocket!

 

# This is the URL that when visited with a web browser contains the username and password fields to fill in
$LoginURL = "http://yourwebsite.com/login.php"

# This is the URL of the page you actually want to pull content from but if accessed directly will normally just redirect you to the login page above
$ContentURL = "http://yourwebsite.com/someothercontentthatfirstrequiresauthentication.php"

# The username and password used to authenticate with the site above
$Username = "hero"
$Password = "superman"

# Create a new object that pulls the HTML data from the login page including the username and password fields
$website = Invoke-WebRequest -Uri $LoginURL

# Note the "username" and "password" attributes specified here may have a different name.  
# Verify by checking the contents of $website.Forms[0].fields
$website.Forms[0].Fields.username = $Username
$website.Forms[0].Fields.password = $Password

# Connect to the login URL and send the login credentials you created as POST and save the resulting session
Invoke-WebRequest -Uri $LoginURL -SessionVariable WebSession -Body $website.Forms[0].Fields -Method Post | Out-Null

# Now that we're authenticated, connect to the actual URL you want and pass in the session object you created above
$data = Invoke-WebRequest -Uri $ContentURL -WebSession $WebSession

# There is a ton of other metadata that is returned that you most likely don't care about.  
# If you just want the raw HTML to pull some specified content, try using the "outerhtml" property as shown below
$HTMLOutput = $data | select -ExpandProperty Parsedhtml | select -ExpandProperty IHTMLDocument3_documentElement | select -expandproperty outerhtml 

# Display the results to the screen.  This will be the raw HTML returned by the site.  You can now do whatever you'd like with it.
$HTMLOutput
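
With the raw HTML in hand, pulling out a value such as the rebuild percentage from the backstory could look roughly like the sketch below. The function name Get-PercentFromHtml and the sample markup are hypothetical; the real FreeNAS page will be laid out differently, so treat the pattern as a starting point to adapt rather than something known to match it.

```powershell
# Hypothetical helper: return the first percentage found in a block of HTML, or $null
function Get-PercentFromHtml {
    param([string]$Html)

    # Replace tags with spaces so markup between the number and the % sign is ignored
    $text = $Html -replace '<[^>]+>', ' '
    if ($text -match '(\d+(?:\.\d+)?)\s*%') {
        return [double]$Matches[1]
    }
    return $null
}

# Invented sample markup, used only to show the call
$sampleHtml = '<tr><td>Resilver</td><td><b>42.5</b> % complete</td></tr>'
$percent = Get-PercentFromHtml -Html $sampleHtml
$percent
```

Calling this on a schedule and comparing the result with the previous value would be one way to notice a stalled rebuild.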

HOWTO: Restore AD Object from 2008 R2 Domain

I am in a situation where I need to delete a critical production Database server computer object in Active Directory for an upgrade but in the event that upgrade fails, I will need to restore the original computer object.

To that end I found an excellent Technet blog on the subject at http://blogs.technet.com/b/askds/archive/2009/08/27/the-ad-recycle-bin-understanding-implementing-best-practices-and-troubleshooting.aspx.

But for those of you that don’t want to read and just want the shortest possible answer, check out below:
Note: The recycle bin must be enabled in advance. If you’ve deleted something before enabling it and wish to restore, I’m afraid you’re not going to be happy
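
For reference, checking for and enabling the recycle bin can be sketched with the ActiveDirectory module as shown here. The forest name corp.example.com is a placeholder and not something from my environment. Enabling the feature cannot be undone, which is why the example keeps -WhatIf in place.

```powershell
Import-Module ActiveDirectory

# 'corp.example.com' is a placeholder forest name; substitute your own
$Forest = 'corp.example.com'

# Check whether the feature is already enabled (EnabledScopes is empty when it is not)
Get-ADOptionalFeature -Filter 'name -like "Recycle Bin Feature"' |
    Select-Object Name, EnabledScopes

# Enabling cannot be reversed, so -WhatIf only previews the change; remove it to apply
Enable-ADOptionalFeature -Identity 'Recycle Bin Feature' -Scope ForestOrConfigurationSet -Target $Forest -WhatIf
```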

Identify the object to restore

# Identify which objects are available in your recycle bin.
# Note in our case we have many Domain Controllers and so to speed up the process and because I know which DC the object was deleted on, we’re going to specify a specific DC
# This will produce a list of all objects where the most recently deleted object will be at the very end of the list

Get-ADObject -Server CORPDC1 -Filter 'isDeleted -eq $true -and name -ne "Deleted Objects"' -IncludeDeletedObjects -Property * |
Where {$_.samAccountName -ne $null} | select samAccountName, whenChanged | sort whenChanged

 

Restore the object

# Once you have confirmed the samAccountName of the object you wish to restore, specify it and pass it to the Restore-ADObject cmdlet

Get-ADObject -Server CORPDC1 -Filter 'isDeleted -eq $true -and name -ne "Deleted Objects"' -IncludeDeletedObjects -Property * |
Where {$_.samAccountName -eq 'john.smith'} | Restore-ADObject

 

Tada! The object is now restored.