Tag Archives: envfile

Part IV: Splunk, HEC, Indexer Acknowledgement, and PowerShell

And now, Part IV! Like I’ve mentioned previously, please do yourself a favor and read the previous parts to this series of posts: Part I, Part II, and Part III. This won’t make sense without it. It may not make sense with it, and if that turns out to be true, then let me know by leaving a comment, or reaching me on Twitter. I know it’s possible that Splunk feels like a foreign language. What I know, I learned over the course of a few weeks, and it doesn’t feel like much. I probably shouldn’t be surprised that there’s so much to cover; so much was learned. Maybe you’ll have to do it too: I read the same articles over and over and over. I did have some help from a colleague, but if you don’t have that, then you can consider it me, up to a point,

The next thing we’ll look at is our first POST request to Splunk. This is us, sending in our JSON payload. Notice our first below Invoke-RestMethod command. First, it includes our $HttpRequestEventParams parameter hash table. This includes the Uri, Method, Headers, and the Body parameter and their corresponding parameter values. Here’s a reminder of what’s in that parameter hash table.

URI = @{$EventUri; Method = 'POST'; Headers = $Headers; Body = $Body}

As a part of this Invoke-RestMethod command, we also included the StatusCodeVariable parameter. When our command completes, it will have created the $StatusCode variable which should contain a 200 response, if our message was received by Splunk. Additionally, we have this command writing any output of the command (from Splunk) into the $Ack variable.

#region: Make requests to Splunk REST web services.
$Ack = Invoke-RestMethod @HttpRequestEventParams -StatusCodeVariable StatusCode -Verbose 
$StatusCode
$AckBody = "{`"acks`": [$($Ack.ackId)]}"
$HttpRequestAckParams = @{URI = $AckUri; Method = 'POST'; Headers = $Headers; Body = $AckBody}

Measure-Command -Expression {
    Do {
        $AckResponse = Invoke-RestMethod @HttpRequestAckParams -Verbose
        $AckResponse.acks.0
        Start-Sleep -Seconds 30
    } Until ($AckResponse.acks.0 -eq $true)
} # End Measure-Command
#endregion.

Keep in mind here, that if we weren’t using indexer acknowledgment, we’d probably be done. But now, we’ve got to create a second POST request to Splunk, so we can determine when the data that we know was received (200 response), is headed into the Splunk pipeline for processing. It’s a guarantee of data ingestion (not indigestion). But, since we’re going along down the indexer acknowledgment path (for now and only now), let’s walk through what else is happening here.

First, we create $AckBody. It uses the PowerShell object in $Ack and turns it back into JSON. Invoke-RestMethod has this helpful feature of turning JSON into a PowerShell object, so we’ve got to reverse that so we can send it back to Splunk. Once $AckBody is done, we’ll use it as a parameter value in $HttpRequestAckParams. About $AckBody, be sure to read this page (it’s been linked before), under the “Query for indexing status” section. Splunk sends us an ack ID and then we need to send it back with the same headers as in the first Invoke-RestMethod request. Remember, this includes the HEC token (pulled out of our .env file forever ago), and the channel identifier we created as a random GUID.

Following? Again, reach out to me if you need this information and I’m failing at getting it across to you. I’m not too big to help/rewrite/clarify. I get it, this has the potential to make someone launch their laptop out of a second-story window. Luckily, I work on the first floor of my house. Also, print out the Splunk articles I’ve linked and make them a home on the drink table next to your couch (like I did).

Okay, we’re on the last step. Remember, we’re still working as those indexer acknowledgement is enabled. This whole second POST request is irrelevant if you’re not going to use it. As is the channel identifier. Again, I’ll post modified code just as soon as my indexer acknowledgment is disabled.

Measure-Command -Expression {
    Do {
        $AckResponse = Invoke-RestMethod @HttpRequestAckParams -Verbose
        $AckResponse.acks.0
        Start-Sleep -Seconds 30
    } Until ($AckResponse.acks.0 -eq $true)
} # End Measure-Command

I mentioned that I’m not going to be using indexer acknowledgment because of the time it takes; I simply don’t have that. I’m in the business of automating. I record the duration of every function I invoke. It isn’t going to work for me. Anyway, I have my second Invoke-RestMethod request inside a Do-Until loop. This loop continues to run the command every 30 seconds until I get a $true response from Splunk (that means that Splunk is for sure going to index my data). For fun, that’s inside of a Measure-Command Expression parameter. This, is how I determined it took too much time for me. Below I’ll include the entire code as one block from all three posts. In the fifth post, yet to be written, I’ll include the entire code as one block, too. And remember, that post won’t have any requirements on an enabled indexer acknowledgment, the second POST request, the channel identifier, etc.

Thank you for hanging in for this one. Hopefully, it proves helpful for someone. Oh, that reminds me. There were two Splunk functions written to do all this when I started looking around. Maybe you found them too. They made little sense to me until I really took the time to learn what’s happening. Now that you’ve read this series, read over those functions; see what you understand about them now that you wouldn’t have been able to before. It might be much more than you thought it would be.

@torggler https://www.powershellgallery.com/packages/Send-SplunkEvent

@halr9000 https://gist.github.com/halr9000/d7bce26533db7bca1746

#region: Read .env file and create environment variables.
$FilterScript = {$_ -ne '' -and $_ -notmatch '^#'}
$Content = Get-Content -Path 'C:\Users\tommymaynard\Documents\_Repos\code\functiontemplate\random\telemetry\.env' | Where-Object -FilterScript $FilterScript
If ($Content) {
	Foreach ($Line in $Content) {
		$KVP = $Line -split '=',2; $Key = $KVP[0].Trim(); $Value = $KVP[1].Trim()
		Write-Verbose -Message "Adding an environment variable: `$env`:$Key."
		[Environment]::SetEnvironmentVariable($Key,$Value,'Process')
	} # End Foreach.
} Else {
	'$Content variable was not set. Do things without the file/the environment variables.'
}	# End If.
#endregion.

#region: Read clixml .xml file and obtain hashtable.
$HashTablePath = 'C:\Users\tommymaynard\Documents\_Repos\code\functiontemplate\random\telemetry\eventhashtable.xml'
$EventHashTable = Import-Clixml -Path $HashTablePath
#endregion.

#region: Create Splunk / Invoke-RestMethod variables.
$EventUri = $env:SplunkUrl + '/services/collector/event'
$AckUri = $env:SplunkUrl + '/services/collector/ack'
$ChannelIdentifier = (New-Guid).Guid
$Headers = @{Authorization = "Splunk $env:SplunkHECToken"; 'X-Splunk-Request-Channel' = $ChannelIdentifier}
$Body = ConvertTo-Json -InputObject $EventHashTable
$HttpRequestEventParams = @{URI = $EventUri; Method = 'POST'; Headers = $Headers; Body = $Body}
#endregion.

#region: Make requests to Splunk REST web services.
$Ack = Invoke-RestMethod @HttpRequestEventParams -StatusCodeVariable StatusCode -Verbose 
$StatusCode
$AckBody = "{`"acks`": [$($Ack.ackId)]}"
$HttpRequestAckParams = @{URI = $AckUri; Method = 'POST'; Headers = $Headers; Body = $AckBody}

Measure-Command -Expression {
	Do {
		$AckResponse = Invoke-RestMethod @HttpRequestAckParams -Verbose
		$AckResponse.acks.0
		Start-Sleep -Seconds 30
	} Until ($AckResponse.acks.0 -eq $true)
} # End Measure-Command
#endregion.

Part III: Splunk, HEC, Indexer Acknowledgement, and PowerShell

This is part three of a continuation of posts on Splunk, HEC, and sending data using PowerShell to Splunk for consumption. Make sure you’ve read the first two parts (Part I and Part II), as I’m going to assume you have!

Now that we have our data ready to be sent to Splunk in our $EventHashTable variable, we need to create some URLs using the environment variables we created in Part I. The first two lines create two URIs. In the first one, $EventUri, combines the value stored in $env:SplunkUrl with the string ‘/services/collector/event’. The second one, $AckUri, combines the same $env:SplunkUrl with the string ‘/services/collector/ack’. When completed, these two URIs are two different Splunk REST endpoints.

#region: Create Splunk / Invoke-RestMethod variables.
$EventUri = $env:SplunkUrl + '/services/collector/event'
$AckUri = $env:SplunkUrl + '/services/collector/ack'
$ChannelIdentifier = (New-Guid).Guid
$Headers = @{Authorization = "Splunk $env:SplunkHECToken"; 'X-Splunk-Request-Channel' = $ChannelIdentifier}
$Body = ConvertTo-Json -InputObject $EventHashTable
$HttpRequestEventParams = @{URI = $EventUri; Method = 'POST'; Headers = $Headers; Body = $Body}
#endregion.

We’re only going to need the second URI — $AckUri (see the above code) — if our HEC token requires indexer acknowledgment. For now, we’ll assume that it does. Beneath the two lines that create and assign the $EventUri and $AckUri variable is the creation of the $ChannelIdentifier variable. As the HEC has indexer acknowledgment enabled (a checkmark in the GUI), you’re required to make more than one POST request to Splunk. You send your first one, which we’ll see momentarily, and then you send a second one, as well. The first one gets your data to Splunk. The second (or third, or fourth, or fifth, etc.) lets you know if your data is being processed by Splunk. It may take more than just a second POST request to get back an acknowledgment. It’s a nice feature — who wouldn’t want to know!? Not only do you get your 200 status response letting you know Splunk received your data in the first request, but you can also know when Splunk is actually doing something with it!

I loved the idea at first; however, I was quick to discover it was taking anywhere from two to four, and sometimes five, minutes to return “true.” Once it returned true, I could rest assure that the data was being indexed by Splunk. Nice feature, but for me, that’s too much time to spend waiting. If I get a 200 back from my first POST, then I’d rather just carry on without concerning myself with whether or not my data actually hit the Splunk processing pipeline. From Part I through here, we’re going to continue as though you want to use indexer acknowledgment. Toward the end, we’ll see the code changes I implement to remove it.

I should say this, however, it’s in here, but the idea behind indexer acknowledgment is not about security. It’s about, and I quote, “…to prevent a fast client from impeding the performance of a slow client.” You see, index acknowledgment slows things down. So much so, I couldn’t use it. Lovely idea, but not for my purposes. It also uses channels, as you might’ve gathered from our $ChannelIdentifier variable. To create a channel, you create a random GUID and send that in with your initial POST request. Then you use the same channel identifier to obtain the acknowledgment. As I write this and read this, I realize that you really ought to read through the document I linked above. Here it is again: https://docs.splunk.com/Documentation/Splunk/8.1.0/Data/AboutHECIDXAck. It’s going to do this process more justice than I have. That said, I feel confident that what I’ve had to say might actually prove helpful when combined with the Splunk documentation.

After our $ChannelIdentifier variable is created and assigned, we’re going to create a $Headers (hash table) variable. It will contain two headers: one for authorization that contains our HEC token and one for our channel identifier used for indexer acknowledgment. The below code has been copied from the above code example.

$Headers = @{
    Authorization = "Splunk $env:SplunkHECToken"
    'X-Splunk-Request-Channel' = $ChannelIdentifier
}

Following that variable creation, we’re going to create two more. One will be our $Body variable and the other a hash table of parameters and their values that we’ll end up including with Invoke-RestMethod in Part IV. As a part of creating the $Body variable, we’re going to take our Hash table (with its nested hash table) and convert it all to JSON. This is what we want to send to Splunk. The below code has also been copied from the above code example.

$Body = ConvertTo-Json -InputObject $EventHashTable
$HttpRequestEventParams = @{
    URI = $EventUri
    Method = 'POST'
    Headers = $Headers
    Body = $Body
}

I think it’s important to see the JSON representation of our $EventHashTable variable because you’ll end up seeing this in the Splunk documentation. Notice the placement of the source, host, and time event metadata versus the (nested) event data.

$Body = Convertto-Json -InputObject $EventHashTable
$EventHashTable
{
  "source": "PowerShellFunctionTemplate",
  "host": "mainpc.tommymaynard.com\\TOMMYCOMPUTER",
  "event": {
    "OperatingSystemPlatform": "Win32NT",
    "PSHostProgram": "ConsoleHost",
    "PSVersionActive": "PowerShell 7.0.1"
  },
  "time": 1608148655,
  "sourcetype": "Telemetry"
}

The final line in our code example is the creation of a parameter hash table we’ll use — in a later part — to send our data to Splunk using Invoke-RestMethod.

$HttpRequestEventParams = @{
    URI = $EventUri
    Method = 'POST'
    Headers = $Headers
    Body = $Body
}

Until next time. … Part IV is now available.

Part II: Splunk, HEC, Indexer Acknowledgement, and PowerShell

In the first part of this series, we discussed using a .env file to obtain a Splunk HEC token and our Splunk URL. These two pieces of information are going to need to make their way into our POST request to Splunk. Be sure to read Part I if you haven’t already.

The next thing we need to do is pull in a hash table from the disk. The reason I’ve done it this way is to ensure I had a hash table to use without having to create it. It was a quicker way to work with my code, as I was still in the development and testing phase. In production, the hash table would be created not from the disk/stale data, but from real-time data collection.

#region: Read clixml .xml file and obtain hashtable.
$HashTablePath = 'C:\Users\tommymaynard\Documents\_Repos\code\functiontemplate\random\telemetry\eventhashtable.xml'
$EventHashTable = Import-Clixml -Path $HashTablePath
#endregion.

The above code does two things: One, it sets the $HashTablePath variable to an XML file. This file was created previously using Export-Clixml. The second thing it does is import the XML file using Import-Clixml. So, what’s in the file and what are we doing here? As we’ll see later, we’re going to use a hash table—an in-memory data structure to store key-value pairs—and get that converted (to something else) and sent to Splunk.

This isn’t just any hash table, so let’s take a look at what’s stored in the $EventHashTable variable once these two lines of code are complete.

$EventHashTable
Name                           Value
----                           -----
event                          {OperatingSystemPlatform, IPAddressPublic, Command:0, PSVersionOther…}
sourcetype                     Telemetry
host                           mainpc.tommymaynard.com\TOMMYCOMPUTER
time                           1608148655
source                         PowerShellFunctionTemplate

If you’ve worked with PowerShell hash tables before, then you’ll probably recognize that there’s a nested hash table inside this hash table. That, or at minimum, that one of the values doesn’t seem right. This nested hash table is stored as the value for the event key. The four other keys — host, time, source, and sourcetype — are standard key-value pairs that have values that are not nested hash tables. Let’s take a look at the key-value pairs inside the nested hash table. We can use dotted notation to view the values stored in each key.

$EventHashTable.host
mainpc.tommymaynard.com\TOMMYCOMPUTER
$EventHashTable.time; $EventHashTable.source
1608148655
PowerShellFunctionTemplate
$EventHashTable.sourcetype
Telemetry

Now, let’s view the key-value pairs stored inside the nested hash table. Then we’ll run through how we’re able to nest a hash table inside another hash table before it was saved to disk using Export-Clixml.

 $EventHashTable.event  
Name                           Value
----                           -----
OperatingSystemPlatform        Win32NT
IPAddress                      64.204.192.66
Command:0:CommandName          New-FunctionTemplate
PSVersionAdditional            Windows PowerShell 5.1.19041.610
DateTime                       Wed. 12/16/2020 12:57:35 PM     
Command:0:Parameter:1          PassThru:True
Template                       FunctionTemplate 3.4.7
Command:0:CommandType          Function
IPAddressAdditional            192.168.86.127
Command:0:Parameter:0          Log:ToScreen
Duration                       0:00:00:00.9080064
Domain\Username                mainpc.tommymaynard.com\tommymaynard
CommandName                    New-FunctionTemplate
ModuleName
Domain\Computer                mainpc.tommymaynard.com\TOMMYCOMPUTER
OperatingSystem                Microsoft Windows 10.0.19041
PSVersion                      PowerShell 7.1.0
PSHostProgram                  ConsoleHost

Look at all that data! It’s beautiful (although not completely accurate, so that it may be posted here). When it’s not pulled from the disk/stale data, such as in this example, it’s being gathered by a function during its invocation, and then it’s sent off to Splunk! Because Splunk is involved, it’s important to understand why we’ve set up the hash tables the way we have. The host, time, source, and sourcetype keys in the top-level hash table are what Splunk calls event metadata (scroll down to Event Metadata). It’s optional to send in this data, but for me, it made sense. I wanted to guarantee that I had control over these values versus what their default values might be. The nested hash table is the event data. It’s not data about data, such as metadata is; it’s the data we collected and want Splunk to retain.

We’re going to wrap this part up, but I do want to provide how I got a hash table to nest inside of another hash table. We won’t bother working with the complete information above, but I’ll explain what you need to know. Here’s our downsized event hash table.

$Event = @{
    OperatingSystemPlatform = 'Win32NT'
    PSVersion = 'PowerShell 7.0.1'
    PSHostProgram = 'ConsoleHost'
}

And here it is, stored in the $Event variable and being used as a value in the parent, or previously used term, top-level hash table.

$EventHashTable = @{
    event = $Event
    host = 'mainpc.tommymaynard.com\TOMMYCOMPUTER'
    time = [int](Get-Date -Date (Get-Date) -UFormat %s)
    source = 'PowerShellFunctionTemplate'
    sourcetype = 'Telemetry'
}
$EventHashTable
Name                           Value
----                           -----
source                         PowerShellFunctionTemplate
sourcetype                     Telemetry
host                           mainpc.tommymaynard.com\TOMMYCOMPUTER
event                          {OperatingSystemPlatform, PSHostProgram, PSVersion}
time                           1608148655

If you opt to include the time event metadata, do recognize that it uses, and expects, epoch time. This is the number of seconds after epoch, or  June 1, 1970, 00:00:00 UTC. The -UFormat parameter with the %s parameter value will return this value. I believe this was added in Windows PowerShell 3.0, so if you have a client with an earlier version, first, I’m sorry, and second, there’s another way to obtain this value without the UFormat parameter: think datetime subtraction.

Okay, check back for more; there’s still plenty left to cover. Part III will be up soon!

Edit: Part III is now available.