Extract Media Folder from PowerPoint Files

On Wednesday, I wrote a response to something on Stack Overflow. So I don’t have to chase it down one day, I though I’d briefly discuss it here and include the updated code I would be more likely to use, providing I ever need to use it.

If you open a PowerPoint .pptx file in something like 7-Zip, you’ll quickly realize that the .pptx is a compressed, or archived, file format. While this brings down the file size (1.65MB file vs. 3.27MB when expanded), this is more more to say that there are a few folders, and several files, that make up a single PowerPoint file.

A person on Stack Overflow wanted to automate the expansion of a series of .pptx files and extract the media folder. This folder holds all the images in a PowerPoint file. I’m not going to bother to explain the code, but let me set the stage. I created a folder on my desktop called pptx. Inside the folder were four PowerPoint files using the .pptx file extension. My code created a new folder for each file in the pptx folder, expanded each PowerPoint file in their own new folders, located the nested media folder, moved it to the top level of their new folder, and then deleted all the other folders and files inside the new folder. If there wasn’t a media folder, it would still create the new folder; however, it would create a single text file in there called “No media folder.txt.” This was to help ensure the user didn’t think the code didn’t work, when they didn’t find the media folder inside the new folder.

I’ll included the original code from the forum post further below, but for now here’s an updated and cleaner version.

$Path = 'C:\users\tommymaynard\Desktop\pptx'
$Files = Get-ChildItem -Path $Path

Foreach ($File in $Files) {
    New-Item -Path $File.DirectoryName -ItemType Directory -Name $File.BaseName | Out-Null
    $NewFolder = $File.FullName.Split('.')[0]

    Add-Type -AssemblyName System.IO.Compression.FileSystem
    [System.IO.Compression.ZipFile]::ExtractToDirectory($File.FullName,$NewFolder)
    
    If (Test-Path -Path "$($NewFolder)\ppt\media") {
        Move-Item -Path "$($NewFolder)\ppt\media" -Destination $NewFolder
        Get-ChildItem -Path $NewFolder | Where-Object {$_.Name -ne 'media'} | Remove-Item -Recurse

    } Else {
        Get-ChildItem -Path $NewFolder | Remove-Item -Recurse
        New-Item -Path $NewFolder -ItemType File -Name 'No media folder.txt' | Out-Null
    }
}

And, here’s the original code and a link to the post itself. This below version didn’t use the $NewFolder variable, so there’s overwhelming inclusion of this piece of code: $File.FullName.Split(‘.’)[0]. It also defaulted to use the Expand-Archive cmdlet, if that was available. As I mentioned in the Stack Overflow post itself, it’s so slow. I’ve removed that so that it’s much faster now, which reminded me, I’ve actually written about this time difference before. It was just last October.

$Path = 'C:\users\tommymaynard\Desktop\pptx'
$Files = Get-ChildItem -Path $Path

Foreach ($File in $Files) {
    New-Item -Path $File.DirectoryName -ItemType Directory -Name $File.BaseName | Out-Null

    If (Get-Command -Name Expand-Archive) {
        Expand-Archive -Path $File.FullName -OutputPath $File.FullName.Split('.')[0]

    } Else {
        Add-Type -AssemblyName System.IO.Compression.FileSystem
        [System.IO.Compression.ZipFile]::ExtractToDirectory($File.FullName,$File.FullName.Split('.')[0])
    } # End If-Else.

    If (Test-Path -Path "$($File.FullName.Split('.')[0])\ppt\media") {
        Move-Item -Path "$($File.FullName.Split('.')[0])\ppt\media" -Destination $File.FullName.Split('.')[0]
        Get-ChildItem -Path $File.FullName.Split('.')[0] | Where-Object {$_.Name -ne 'media'} | Remove-Item -Recurse

    } Else {
        Get-ChildItem -Path $File.FullName.Split('.')[0] | Remove-Item -Recurse
        New-Item -Path $File.FullName.Split('.')[0] -ItemType File -Name 'No media folder.txt' | Out-Null
    } # End If-Else.
} # End Foreach.

Leave a Reply

Your email address will not be published.