HOWTO: Convert Complex Text Data into PowerShell Objects

PowerShell 5.0 introduces a wonderful new cmdlet called ConvertFrom-String.  Don’t let the simple name deceive you, though.  There is some exceptionally complex math running behind the scenes to do some truly wonderful things.
In fact, the code that runs inside this cmdlet is based on the "FlashExtract" project from Microsoft Research.  How much math? Well, here is a portion of the algorithm published in their whitepaper.

image
Source: http://research.microsoft.com/en-us/um/people/sumitg/pubs/pldi14-flashextract.pdf

The idea here is to give us as administrators the ability to take some existing complex text data, intelligently analyze it and convert it into native PowerShell objects.
Technically we can already do this today using regular expressions.  But coming up with the right combination of letters and characters to produce the intended results is no easy task.  At least until now.
Instead of micromanaging, character by character, how the data should be extracted, you simply ‘tell’ PowerShell what you want.
Specifically, you pass the ConvertFrom-String cmdlet a marked-up template of the data that indicates which parts are important.
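For comparison, here is roughly what the regular-expression route looks like.  This is only a sketch: the two log lines are made up, the tab-separated layout is an assumption, and the property names are ones I picked for illustration.

```powershell
# Made-up sample lines; the tab-separated layout is an assumption for illustration.
$lines = @(
    "2015-12-28`t10:15:32:123`t1008`t1a2c`tAgent`tWSUS server: http://wsus.example.local:8530"
    "2015-12-28`t10:15:33:456`t1008`t1a2c`tAU`tAU setting next detection timeout"
)

# One named capture group per column; every separator has to be spelled out by hand.
$pattern = '^(?<Date>\d{4}-\d{2}-\d{2})\t(?<Time>[\d:]+)\t(?<ProcessId>\d+)\t(?<ThreadId>[0-9a-f]+)\t(?<Component>\w+)\t(?<Details>.*)$'

# Turn each matching line into an object with one property per capture group.
$entries = foreach ($line in $lines) {
    if ($line -match $pattern) {
        [pscustomobject]@{
            Date      = $Matches['Date']
            Time      = $Matches['Time']
            ProcessId = $Matches['ProcessId']
            ThreadId  = $Matches['ThreadId']
            Component = $Matches['Component']
            Details   = $Matches['Details']
        }
    }
}

# Once the text is an object, filtering by column is easy.
$entries | Where-Object { $_.Component -eq 'Agent' -and $_.Details -like '*WSUS*' }
```

It works, but the pattern is the hard part to get right, and it breaks as soon as a line deviates from the layout you assumed.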

It’s hard to understand in writing but will make a lot more sense once you see it in action.  This is also when the "aha" moment comes and you realize the power this new cmdlet offers.

For our example, we’re going to take a look at C:\Windows\WindowsUpdate.log.  This is a plain text log file that contains several columns:

image

Let’s say we wanted only those entries that were marked as an "Agent" entry AND contained "WSUS" in the details section.  Trying to do that with Notepad searching alone would be a nightmare.

So instead let’s see how ConvertFrom-String can make our lives better:

image

The basic idea here is to first open the file you wish to analyze and copy and paste in some lines that represent the different types of data that the parsing engine is going to run into. From there:

1) Mark up the columns you want to be included in your object
2) Include examples of the data in the file so that the learning engine inside ConvertFrom-String can "learn" the dataset and parse it accordingly
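
Put together, those two steps can be sketched like this.  Treat it as an illustration under assumptions rather than the exact code from the screenshot: the log lines are made up, the columns are separated by single spaces to keep the sample readable, and the property names (Date, Time, ProcessId, ThreadId, Component, Details) are labels I chose.  The learning engine may need more or different sample lines for real data.

```powershell
# Written for Windows PowerShell 5.x, where ConvertFrom-String is available.

# Marked-up sample lines: the starred tag starts a new record, and the
# other {Name:sample} tags label the remaining columns.
$template = @'
{Date*:2015-12-28} {Time:10:15:32:123} {ProcessId:1008} {ThreadId:1a2c} {Component:Agent} {Details:WSUS server: http://wsus.example.local:8530}
{Date*:2015-12-28} {Time:10:15:33:456} {ProcessId:1008} {ThreadId:1a2c} {Component:AU} {Details:AU setting next detection timeout}
{Date*:2015-12-29} {Time:08:02:11:007} {ProcessId:964} {ThreadId:f3c} {Component:Report} {Details:Reporter successfully uploaded 2 events}
'@

# Made-up log text standing in for the real file.
$logText = @'
2015-12-28 10:15:32:123 1008 1a2c Agent WSUS server: http://wsus.example.local:8530
2015-12-28 10:15:33:456 1008 1a2c AU AU setting next detection timeout
2015-12-29 08:02:11:007 964 f3c Report Reporter successfully uploaded 2 events
2015-12-29 08:05:47:310 964 f3c Agent Checking for updates against WSUS
'@

# Let the engine learn from the template, then parse the text into objects.
$entries = $logText | ConvertFrom-String -TemplateContent $template

# Keep only "Agent" entries whose details mention WSUS.
$entries | Where-Object { $_.Component -eq 'Agent' -and $_.Details -like '*WSUS*' }
```

Against the real file you would build the template from lines copied out of the log itself (so the separators match) and feed it with something like Get-Content C:\Windows\WindowsUpdate.log -Raw | ConvertFrom-String -TemplateContent $template.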

That’s it.  If you run the code above, you’ll get a view that looks similar to what you saw in Notepad, except that the data is now separated into columns and is searchable by them!

image

A few things to note that I learned the hard way trying to figure out how to make this work:

1) Start your first line with propertyname*:{ Text goes here }.  The asterisk tells the engine that this property repeats once per record, and everything you then add will be listed as sub-properties under this name

2) In the "Text goes here" section, list the properties you want to extract in the form {columnname:sampledata}, being sure to include the squiggly brackets
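
To make the tag syntax concrete, here is a minimal sketch using the flat form of a template, where every property gets its own {name:sample} tag and the asterisk sits on the property that starts each record.  The names, phone numbers and property names are all made up for illustration.

```powershell
# Written for Windows PowerShell 5.x, where ConvertFrom-String is available.

# Two marked-up sample records; Name* marks the start of each record.
$template = @'
{Name*:Anna Smith}, {Phone:425-555-0101}, {Age:34}
{Name*:Bob Jones}, {Phone:(206) 555-0102}, {Age:7}
'@

# Made-up input in the same shape as the samples.
$text = @'
Anna Smith, 425-555-0101, 34
Bob Jones, (206) 555-0102, 7
Carla Diaz, 425-555-0103, 58
Dev Patel, (111) 555-0104, 2
'@

# Parse the text using the template and show the extracted columns.
$people = $text | ConvertFrom-String -TemplateContent $template
$people | Format-Table Name, Phone, Age
```

Giving it two differently formatted phone numbers in the template is deliberate: the more variety the samples cover, the more the engine has to learn from.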

Here is another example I was playing with where we use ConvertFrom-String to extract the IP, Subnet and Default Gateway from the text output of ipconfig:

image

image

Again, in #1 we define the template, which includes the marked-up data we want to extract along with the column names we want the data to land in.  In #2 we give it some sample data so the engine can "learn" how to read the data.
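
In text form, the same idea can be sketched roughly as follows.  This is an approximation, not the code from the screenshots: the adapter names, addresses and the property names (IP, Subnet, Gateway) are made up, and real ipconfig output varies between machines, so the template would need samples copied from your own output.

```powershell
# Written for Windows PowerShell 5.x, where ConvertFrom-String is available.

# Template: ipconfig-style text with the values we care about marked up.
$template = @'
Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : example.local
   IPv4 Address. . . . . . . . . . . : {IP*:192.168.1.25}
   Subnet Mask . . . . . . . . . . . : {Subnet:255.255.255.0}
   Default Gateway . . . . . . . . . : {Gateway:192.168.1.1}

Wireless LAN adapter Wi-Fi:

   Connection-specific DNS Suffix  . : example.local
   IPv4 Address. . . . . . . . . . . : {IP*:10.0.0.17}
   Subnet Mask . . . . . . . . . . . : {Subnet:255.255.0.0}
   Default Gateway . . . . . . . . . : {Gateway:10.0.0.1}
'@

# Made-up text standing in for live ipconfig output.
$sampleOutput = @'
Ethernet adapter Ethernet 2:

   Connection-specific DNS Suffix  . : example.local
   IPv4 Address. . . . . . . . . . . : 172.16.4.12
   Subnet Mask . . . . . . . . . . . : 255.255.252.0
   Default Gateway . . . . . . . . . : 172.16.4.1
'@

# Parse the text into objects with IP, Subnet and Gateway properties.
$adapters = $sampleOutput | ConvertFrom-String -TemplateContent $template
$adapters | Format-Table IP, Subnet, Gateway
```

To run it against the live command instead of the sample text, something like ipconfig | Out-String | ConvertFrom-String -TemplateContent $template should do, since Out-String joins the output lines into a single block of text first.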

This is an exceptionally powerful tool and I’m sure I will find countless uses for this going forward.  Thanks Microsoft!

Here are some more resources to help you learn this cmdlet if you’re interested:

A Christmas-themed story that introduces the idea in a fun way:
http://blogs.technet.com/b/heyscriptingguy/archive/2015/12/24/a-holiday-special-rusty-the-red-eyed-scripter-part-4.aspx

Some technical details, including debugging help:
http://blogs.msdn.com/b/powershell/archive/2014/10/31/convertfrom-string-example-based-text-parsing.aspx

A great writeup on ConvertFrom-String:
http://www.lazywinadmin.com/2014/09/powershell-convertfrom-string-and.html
