After setting up a Network Video Recorder recently (using Milestone XProtect Essentials software), I needed a way to add up and compare how much disk space is being used for each day of recordings. Without this information, there is no easy way to see what effect various settings have on disk space (e.g. frames per second, motion sensing, etc.). And because there are often tens of thousands of files per day, Windows Explorer balks at filtering the files by date and summing their space.
Here’s a custom PowerShell script that goes through all the files in a folder and sums their sizes by date. Only the first parameter, the folder to evaluate, is required. Other parameters let you specify the display units (default: GB), whether to include subfolders (default: True), and filename and date filters.
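If you just want the gist before the full listing, here’s a minimal sketch of the approach the script takes: group files by Date Modified and sum their lengths. This is a simplified illustration only (it uses the E:\MediaDatabase path from the example below and omits the script’s filters, units, and grand totals):

# Minimal sketch only: group files by Date Modified and sum their sizes.
# The full script below adds parameters, date filters, units, and totals.
Get-ChildItem "E:\MediaDatabase" -Recurse -Force |
    Where-Object { -not $_.PSIsContainer } |
    Group-Object { $_.LastWriteTime.ToString("yyyy/MM/dd") } |
    Sort-Object Name |
    ForEach-Object {
        $SizeGB = ($_.Group | Measure-Object Length -Sum).Sum / 1GB
        "{0}  {1,8:N1} GB  {2,8:N0} files" -f $_.Name, $SizeGB, $_.Count
    }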
Sample Output
Here’s an actual evaluation of a 4TB volume set to retain 15 days of video:
FolderToEvaluate:  E:\MediaDatabase
IncludeSubfolders: True
IncludePatterns:   *.*
MinDateTime:       01/01/0001 00:00:00
MaxDateTime:       12/31/9999 23:59:59

Date        Total Size    Count  Avg. Size
----------  ----------  -------  ---------
01/26/2017     13.3 GB    1,133      12 MB
01/27/2017    117.6 GB   24,423       5 MB
01/28/2017    186.4 GB   51,046       4 MB
01/29/2017    197.6 GB   56,639       4 MB
01/30/2017    179.1 GB   49,945       4 MB
01/31/2017    246.0 GB   60,129       4 MB
02/01/2017    287.3 GB   69,614       4 MB
02/02/2017    232.5 GB   61,035       4 MB
02/03/2017    171.9 GB   48,650       4 MB
02/04/2017    179.4 GB   51,118       4 MB
02/05/2017    189.6 GB   52,592       4 MB
02/06/2017    206.1 GB   77,955       3 MB
02/07/2017    264.9 GB  108,529       2 MB
02/08/2017    296.9 GB  121,659       2 MB
02/09/2017    329.6 GB  135,037       2 MB
02/10/2017    255.0 GB  104,497       2 MB
02/11/2017    116.7 GB   47,798       2 MB

Matching files: 1,121,799
Total size: 3,725,687,629,210 bytes ( 3,469.8 GB )
Average size: 3,321,172 bytes ( 3 MB )
That’s odd…I decreased frames per second on February 7, and both daily file count and disk space increased? I’ll have to look into that.
By the way, it took about 18.5 minutes for the script to tally those 1.12 million files.
The Script
Here’s the script. Copy and paste to a file, e.g. ListFileSpaceByDate.ps1:
<#
.Synopsis
    Given a folder name, add up disk space used by files, summarized by date.
    Optionally include subfolders. Optionally filter by file names and
    modified dates.

    Copyright (c) 2017 by MCB Systems. All rights reserved.
    Free for personal or commercial use. May not be sold. No warranties.
    Use at your own risk.

.Notes
    Name:      MCB.ListFileSpaceByDate.ps1
    Author:    Mark Berry, MCB Systems
    Created:   02/10/2017
    Last Edit: 02/11/2017

    Changes:
    02/11/2017 Add average file size column and grand total.

.Parameter FolderToEvaluate
    The folder containing files to check.
    Default: none

.Parameter DisplayUnits
    Specify KB, MB, or GB for display units, or empty for bytes.
    GB is displayed with one decimal; others have no decimals.
    Default: "GB"

.Parameter IncludeSubfolders
    Whether or not to include subfolders when checking file sizes.
    Default: $true

.Parameter IncludePatterns
    String array of pattern(s) to include when selecting files.
    Default: "*.*"

.Parameter MinDateTime
    Files with Date Modified before this time stamp will not be included.
    Default: none (no minimum date)

.Parameter MaxDateTime
    Files with Date Modified after this time stamp will not be included.
    Default: none (no maximum date)

.Parameter LogFile
    Path to a log file. Required by MaxRM script player. Not used here.
    Default: ""
#>
param(
    [Parameter(Mandatory=$true, Position=0, ValueFromPipelineByPropertyName=$true)]
    [String]$FolderToEvaluate,

    [Parameter(Mandatory=$false, Position=1, ValueFromPipelineByPropertyName=$true)]
    [String[]]$DisplayUnits="GB",

    [Parameter(Mandatory=$false, Position=2, ValueFromPipelineByPropertyName=$true)]
    [Boolean]$IncludeSubfolders=$true,

    [Parameter(Mandatory=$false, Position=3, ValueFromPipelineByPropertyName=$true)]
    [String[]]$IncludePatterns="*.*",

    [Parameter(Mandatory=$false, Position=4, ValueFromPipelineByPropertyName=$true)]
    [DateTime]$MinDateTime,

    [Parameter(Mandatory=$false, Position=5, ValueFromPipelineByPropertyName=$true)]
    [DateTime]$MaxDateTime,

    [Parameter(Mandatory=$false, Position=6, ValueFromPipelineByPropertyName=$true)]
    [String]$LogFile=""
)

$ErrFound = $false

################################################################################
# Set up parameters
################################################################################

# Set up and start stopwatch so we can print out how long it takes to run script
# http://stackoverflow.com/questions/3513650/timing-a-commands-execution-in-powershell
$StopWatch = [Diagnostics.Stopwatch]::StartNew()

# If $MinDateTime not specified, use Windows' minimum, i.e. don't filter by min date
if ([string]::IsNullOrEmpty($MinDateTime)) {
    $MinDateTime = [DateTime]::MinValue
}

# If $MaxDateTime not specified, use Windows' maximum, i.e. don't filter by max date
if ([string]::IsNullOrEmpty($MaxDateTime)) {
    $MaxDateTime = [DateTime]::MaxValue
}

# When user specifies max date without time, assume the user means for it to be
# INclusive, i.e. up to 23:59:59 on that date
if ($MaxDateTime.Hour -eq 0 -and $MaxDateTime.Minute -eq 0 -and $MaxDateTime.Second -eq 0) {
    $MaxDateTime = $MaxDateTime.AddDays(1).AddSeconds(-1) # plus 1 day minus 1 second: 23:59:59
}

"Select files matching the following parameters and sum their sizes by Date Modified:"
""
"FolderToEvaluate: $FolderToEvaluate"
"IncludeSubfolders: $IncludeSubfolders"
"IncludePatterns: $IncludePatterns"
"MinDateTime: $MinDateTime"
"MaxDateTime: $MaxDateTime"

################################################################################
# Process files
################################################################################

# Create a hashtable with LastWriteDate as the key (a string in yyyy/MM/dd format).
# The Value is an array containing two elements: total size, file count.
$hashByDate = @{}
$TotalSize = 0
$TotalFileCount = 0

# Notes:
# -Force to include ReadOnly, Hidden, and System files. http://stackoverflow.com/a/26425580/550712.
# { ! $_.PSIsContainer } excludes folders (just list files). http://superuser.com/a/150762/171670
# Honor $IncludeSubfolders: -Include only matches on a non-recursive search when
# the path ends in a wildcard, so append \* when subfolders are excluded.
if ($IncludeSubfolders) {
    $SearchPath = $FolderToEvaluate
} else {
    $SearchPath = Join-Path $FolderToEvaluate "*"
}
Get-ChildItem $SearchPath -Recurse:$IncludeSubfolders -Force -Include $IncludePatterns `
| Where-Object { ! $_.PSIsContainer -and $_.LastWriteTime -ge $MinDateTime -and $_.LastWriteTime -le $MaxDateTime } `
| ForEach-Object {
    $TotalFileCount++
    $TotalSize = $TotalSize + $_.Length
    [string]$LastWriteDate = $_.LastWriteTime.ToString("yyyy/MM/dd") # store string as yyyy/MM/dd for correct sorting
    if ( $hashByDate.ContainsKey($LastWriteDate) ) {
        # date already exists in hashtable: add current file size to sum and bump the count
        $FileSizeSum = $hashByDate.$LastWriteDate[0]
        $FileCount   = $hashByDate.$LastWriteDate[1]
        $FileSizeSum = $FileSizeSum + $_.Length
        $FileCount++
        $hashByDate.Set_Item($LastWriteDate, @($FileSizeSum, $FileCount))
    } else {
        # date doesn't already exist in hashtable: add new entry with current file size and count
        $hashByDate.Add($LastWriteDate, @($_.Length, 1))
    }
} # end of ForEach-Object

################################################################################
# Output results
################################################################################

# Print table sorted by date (the two columns in a hash table are always called Name and Value).
# Re-format the first column as MM/dd/yyyy by first converting to DateTime, then ToString.
# Format the second column with thousands separators.
# Format the average size column one "less" than the DisplayUnits, e.g. for GB totals, show average MB.
# Seems inelegant to use a Switch for DisplayUnits, but it would be messy to embed the logic in a single format-table.
switch ($DisplayUnits) {
    "KB" {
        $hashByDate.GetEnumerator() | Sort-Object Name `
        | format-table -autosize `
            @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
            @{Label="Total Size";Expression={"{0:N0} KB" -f ($_.Value[0]/1KB)};align="right"}, `
            @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
            @{Label="Avg. Size";Expression={"{0:N0}" -f ($_.Value[0]/$_.Value[1])};align="right"}
        "Matching files: " + "{0:N0}" -f $TotalFileCount
        "Total size: " + "{0:N0}" -f $TotalSize + " bytes ( " + "{0:N0} KB" -f ($TotalSize/1KB) + " )"
        "Average size: " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes"
    }
    "MB" {
        $hashByDate.GetEnumerator() | Sort-Object Name `
        | format-table -autosize `
            @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
            @{Label="Total Size";Expression={"{0:N0} MB" -f ($_.Value[0]/1MB)};align="right"}, `
            @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
            @{Label="Avg. Size";Expression={"{0:N0} KB" -f ($_.Value[0]/$_.Value[1]/1KB)};align="right"}
        "Matching files: " + "{0:N0}" -f $TotalFileCount
        "Total size: " + "{0:N0}" -f $TotalSize + " bytes ( " + "{0:N0} MB" -f ($TotalSize/1MB) + " )"
        "Average size: " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes ( " + "{0:N0} KB" -f ($TotalSize/$TotalFileCount/1KB) + " )"
    }
    "GB" {
        $hashByDate.GetEnumerator() | Sort-Object Name `
        | format-table -autosize `
            @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
            @{Label="Total Size";Expression={"{0:N1} GB" -f ($_.Value[0]/1GB)};align="right"}, `
            @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
            @{Label="Avg. Size";Expression={"{0:N0} MB" -f ($_.Value[0]/$_.Value[1]/1MB)};align="right"}
        "Matching files: " + "{0:N0}" -f $TotalFileCount
        "Total size: " + "{0:N0}" -f $TotalSize + " bytes ( " + "{0:N1} GB" -f ($TotalSize/1GB) + " )"
        "Average size: " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes ( " + "{0:N0} MB" -f ($TotalSize/$TotalFileCount/1MB) + " )"
    }
    default { # includes empty value: display exact bytes
        $hashByDate.GetEnumerator() | Sort-Object Name `
        | format-table -autosize `
            @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
            @{Label="Total Size";Expression={"{0:N0}" -f ($_.Value[0])};align="right"}, `
            @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
            @{Label="Avg. Size";Expression={"{0:N0}" -f ($_.Value[0]/$_.Value[1])};align="right"}
        "Matching files: " + "{0:N0}" -f $TotalFileCount
        "Total size: " + "{0:N0}" -f $TotalSize + " bytes"
        "Average size: " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes"
    }
}

################################################################################
# Wrap-up
################################################################################

# Conclude with local time
"`n======================================================"
'Local Machine Time: ' + (Get-Date -Format G)

# Stop the stopwatch and show the elapsed time
$StopWatch.Stop()
"Script execution took $($StopWatch.Elapsed)."
""

if ($ErrFound) {
    $ExitCode = 1001 # Cause script to report failure in MaxRM dashboard
} else {
    $ExitCode = 0
}

"Exit Code: " + $ExitCode
Exit $ExitCode
Sample Execution
Here’s an example of running the script with all parameters (except the unused -LogFile) specified, limiting the date range to one month. Put this on one line:
powershell.exe -NoLogo -NoProfile -NonInteractive ".\ListFileSpaceByDate.ps1" -FolderToEvaluate "E:\MediaDatabase" -IncludeSubfolders $true -IncludePatterns "*.*" -DisplayUnits "MB" -MinDateTime "01/01/2019" -MaxDateTime "01/31/2019"
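If you’re already at a PowerShell prompt, you can also invoke the script directly (assuming your execution policy allows running local scripts), for example:

.\ListFileSpaceByDate.ps1 -FolderToEvaluate "E:\MediaDatabase" -DisplayUnits "MB" -MinDateTime "01/01/2019" -MaxDateTime "01/31/2019"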
Known Issue
If no files are found, the script will abort with a divide-by-zero error when trying to compute the average file size.
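One possible guard (a sketch, not part of the script as published) is to check the count before dividing, e.g. in place of each "Average size" line in the switch branches:

# Possible guard (sketch): only divide when at least one file matched
if ($TotalFileCount -gt 0) {
    "Average size: " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes"
} else {
    "No matching files found; average size not computed."
}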
Thanks so much, Mark!
BTW, you actually mean “Copy and paste”, not “Cut and paste”, in the instruction:
“Cut and paste to a file, e.g. ListFileSpaceByDate.ps1:”
Also
Thanks again.
What, your browser doesn’t let you cut? Okay, fixed :).
Hi,
How do I change the min date and max date range?
@Brandon, you can use the -MinDateTime and -MaxDateTime parameters. I’ve added an example to the end of the post above.
Hey Mark,
I am running your script on an insanely large directory. The script just stopped running this morning, no errors or anything in the logs. Do you know if there are limitations on the size of the data set it collects?
Thanks in advance
@Steve, I’m not aware of any limitations. How are you running it? If in Task Scheduler, it may be hitting the time limit of the task. Try running it interactively from a command prompt. Out of curiosity, how much data are you processing (total number of files, total TB)?
Mark, really great, useful script. Nicely done! Thanks for sharing