Tags: windows, powershell, directory-structure, robocopy

Windows PowerShell / robocopy: copy directory structure and full folder content, but only if the folder contains a PDF


I am trying to write a script that does the following:

  • find the folders under a source directory that contain PDF files
  • recreate the full original directory structure down to these folders in a destination directory
  • copy the full content of each folder that contains a PDF, regardless of the types of the other files in that folder
  • do not copy any of the files in the parent folders

[Screenshot: description of what I am trying to achieve]

I hope I have described it well enough for those of you who don't live in my head to understand what I mean. :D

This is the script I have so far:

# Define the source and destination paths
$source = "H:\MC4"
$destination = "H:\Mirror"

# Get all subdirectories that contain at least one PDF file
$dirsWithPDFs = Get-ChildItem -Path $source -Recurse -Directory | Where-Object {
    Get-ChildItem -Path $_.FullName -Filter *.pdf -File -Recurse | Where-Object { $_.Extension -eq ".pdf" }
}

# Copy each directory with PDF files using Robocopy
foreach ($dir in $dirsWithPDFs) {
    $sourceDir = $dir.FullName
    $destDir = $sourceDir.Replace($source, $destination)
    robocopy $sourceDir $destDir /MIR
}

It nearly does the job. The only problem: the files in the parent folders of my "PDF folder" are copied as well, instead of those parent folders being left empty of files.

Can you tell me how to do that? Sorry, I've found many similar questions and answers, but none that covers this exact situation.

Thanks so much!


Solution

  • The following code does not use robocopy, although the Copy-Item call could be replaced by it. It first builds a list of the directories that contain one or more .pdf files. For each of them it then derives the target path by stripping the leading source directory from the directory path and appending the remainder to the destination directory.

    [CmdletBinding()]
    param ()
    $Source = 'C:\src'
    $Destination = 'C:\mirror'
    
    $SourceDir = $Source
    $DestinationDir = $Destination
    
    if ($SourceDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $SourceDir += [IO.Path]::DirectorySeparatorChar }
    if ($DestinationDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $DestinationDir += [IO.Path]::DirectorySeparatorChar }
    Write-Verbose "SourceDir is $SourceDir"
    Write-Verbose "DestinationDir is $DestinationDir"
    
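    # Collect every directory that directly contains at least one .pdf file.
    # -File keeps directories whose names end in .pdf out of the result.
    $SourceDirectories = Get-ChildItem -Recurse -File -Path $SourceDir -Filter '*.pdf' |
        Select-Object -ExpandProperty DirectoryName |
        Sort-Object -Unique
    
    # Anchor the pattern so that only the leading source path is removed.
    $SourceRegex = '^' + [regex]::Escape($SourceDir.TrimEnd([IO.Path]::DirectorySeparatorChar))
    Write-Verbose "SourceRegex = $SourceRegex"
    
    foreach ($SourceDirectory in $SourceDirectories) {
        # Path relative to the source; empty when the PDFs sit in the source root itself.
        $RelativePath = ($SourceDirectory -replace $SourceRegex, '').TrimStart([IO.Path]::DirectorySeparatorChar)
        $TargetPath = $DestinationDir + $RelativePath
        Write-Verbose "TargetPath = $TargetPath"
    
        if (-not (Test-Path -Path $TargetPath)) { New-Item -ItemType Directory -Path $TargetPath }
        # Copy the items directly inside the folder; without -Recurse, subfolder contents are not copied.
        Copy-Item -Path (Join-Path -Path $SourceDirectory -ChildPath '*') -Destination $TargetPath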
    }
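
The answer notes that Copy-Item could be replaced by robocopy. A minimal sketch of that swap follows. It assumes a hypothetical helper named Copy-FolderWithRobocopy (not part of the original answer) and relies on two documented robocopy behaviours: without /S or /E only the files directly inside the source folder are copied, and exit codes of 8 or higher signal a failure.

```powershell
# Hypothetical helper, not part of the original answer: copies one folder with robocopy.
function Copy-FolderWithRobocopy {
    param(
        [Parameter(Mandatory)] [string] $SourceDirectory,
        [Parameter(Mandatory)] [string] $TargetPath,
        # Assumption: set this switch to copy subfolders as well (robocopy /E).
        [switch] $IncludeSubfolders
    )

    # Without /S or /E, robocopy copies only the files directly inside the
    # source folder. It creates the target folder if it does not exist yet.
    $robocopyArgs = @($SourceDirectory, $TargetPath)
    if ($IncludeSubfolders) { $robocopyArgs += '/E' }

    robocopy @robocopyArgs | Out-Null

    # robocopy exit codes of 8 or higher mean that at least one copy failed.
    if ($LASTEXITCODE -ge 8) {
        throw "robocopy failed for '$SourceDirectory' (exit code $LASTEXITCODE)"
    }
}
```

Inside the loop, a call such as `Copy-FolderWithRobocopy -SourceDirectory $SourceDirectory -TargetPath $TargetPath` could then stand in for the New-Item and Copy-Item lines. Unlike /MIR in the question's script, this neither descends into subfolders nor purges files from the destination.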
    

    Update:

    This version accepts a list of directories to exclude, given as paths relative to the source directory.

    [CmdletBinding()]
    param ()
    $Source = 'C:\src\'
    $Destination = 'C:\mirror'
    $Exclusions = @('t\t2')
    $ExclusionDirs = foreach ($Exclusion in $Exclusions) { Join-Path -Path $Source -ChildPath $Exclusion }
    
    $SourceDir = $Source
    $DestinationDir = $Destination
    
    if ($SourceDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $SourceDir += [IO.Path]::DirectorySeparatorChar }
    if ($DestinationDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $DestinationDir += [IO.Path]::DirectorySeparatorChar }
    Write-Verbose "SourceDir is $SourceDir"
    Write-Verbose "DestinationDir is $DestinationDir"
    
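    # Collect every directory that directly contains at least one .pdf file,
    # skipping the directories named in the exclusion list.
    $SourceDirectories = Get-ChildItem -Recurse -File -Path $SourceDir -Filter '*.pdf' |
        Where-Object { $_.DirectoryName -notin $ExclusionDirs } |
        Select-Object -ExpandProperty DirectoryName |
        Sort-Object -Unique
    
    # Anchor the pattern so that only the leading source path is removed.
    $SourceRegex = '^' + [regex]::Escape($SourceDir.TrimEnd([IO.Path]::DirectorySeparatorChar))
    Write-Verbose "SourceRegex = $SourceRegex"
    
    foreach ($SourceDirectory in $SourceDirectories) {
        # Path relative to the source; empty when the PDFs sit in the source root itself.
        $RelativePath = ($SourceDirectory -replace $SourceRegex, '').TrimStart([IO.Path]::DirectorySeparatorChar)
        $TargetPath = $DestinationDir + $RelativePath
        Write-Verbose "TargetPath = $TargetPath"
    
        if (-not (Test-Path -Path $TargetPath)) { New-Item -ItemType Directory -Path $TargetPath }
        # Copy the items directly inside the folder; without -Recurse, subfolder contents are not copied.
        Copy-Item -Path (Join-Path -Path $SourceDirectory -ChildPath '*') -Destination $TargetPath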
    }
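
The `-notin` filter above compares whole directory paths, so only a listed directory itself is skipped, while a PDF folder below it would still be copied. If subfolders should be skipped too, a prefix test is one option. The sketch below assumes a hypothetical helper named Test-IsExcluded, which is not part of the original answer.

```powershell
# Hypothetical helper: returns $true if $Directory is an excluded folder or lies below one.
function Test-IsExcluded {
    param(
        [Parameter(Mandatory)] [string] $Directory,
        [string[]] $ExclusionDirs = @()
    )

    # Normalise to exactly one trailing backslash before comparing.
    $candidate = $Directory.TrimEnd('\') + '\'
    foreach ($excluded in $ExclusionDirs) {
        # The trailing backslash keeps 'C:\src\t\t2' from matching 'C:\src\t\t20'.
        $prefix = $excluded.TrimEnd('\') + '\'
        if ($candidate.StartsWith($prefix, [StringComparison]::OrdinalIgnoreCase)) {
            return $true
        }
    }
    return $false
}

Test-IsExcluded -Directory 'C:\src\t\t2\deep' -ExclusionDirs 'C:\src\t\t2'   # True
Test-IsExcluded -Directory 'C:\src\t\t20' -ExclusionDirs 'C:\src\t\t2'       # False
```

The Where-Object stage could then read `Where-Object { -not (Test-IsExcluded -Directory $_.DirectoryName -ExclusionDirs $ExclusionDirs) }`.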