I have the following setup:
- Windows 8.1 32-bit
- Drive 0: system drive, SSD, NTFS, mounted at
- Drive 1: data drive, magnetic HDD, NTFS, mounted at
In a sub-sub-directory of C:\Users\Database User\Documents I have about 50 000 files, averaging about 2 KB each, in about 10 subdirectories (a bcolz column database).
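For anyone who wants to reproduce this without bcolz, a tree of roughly the same shape can be generated with a few lines of Python. This is only a sketch: the function name make_small_files, the directory and file names, and the random payload are placeholders I made up, not what bcolz actually writes.

```python
import os


def make_small_files(root, n_files=50_000, n_dirs=10, size=2048):
    """Create n_files files of `size` bytes spread over n_dirs subdirectories of root."""
    payload = os.urandom(size)  # one random payload, reused for every file
    for d in range(n_dirs):
        os.makedirs(os.path.join(root, f"col{d:02d}"), exist_ok=True)
    for i in range(n_files):
        sub = f"col{i % n_dirs:02d}"  # distribute files round-robin over the subdirectories
        with open(os.path.join(root, sub, f"{i:06d}.bin"), "wb") as f:
            f.write(payload)
    return n_files
```

Calling make_small_files("mydata.bcolz") from the directory where the database normally lives should give a comparable test tree.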
With cross-drive NTFS junction points, I find huge performance discrepancies depending on whether a process's file IO targets its working directory (or a sub-directory thereof) or any other directory.
Below the NTFS junction, acceptable performance is achieved only when the target lies in the process's working directory or in a subdirectory of it:
- Working directory C:\Users\Database User\Documents\abc\def: executing rmdir /Q /S mydata.bcolz is an IO-bound (disk-bound) operation
- Working directory C:\Users\Database User\Documents\abc: executing rmdir /Q /S def\mydata.bcolz is an IO-bound (disk-bound) operation
- Working directory C:\Users\Database User\Documents\abc\def\xyz: executing rmdir /Q /S ..\mydata.bcolz is a CPU-bound operation
In the first two cases, the cmd.exe process consumes hardly any CPU time, while in the third it consumes 100% of one core. The operation is identical in all three cases; only the working directory differs.
- Working directory
rmdir /Q /S ..\mydata.bcolz is again an IO-bound operation!
This phenomenon occurs with any rapid file IO on a very large number of very small files. It is not limited to cmd.exe; the example above is only for illustration.
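To check the effect outside cmd.exe, something like the following Python sketch could be used. It changes into a given working directory, deletes a tree through a path relative to it, and returns the wall-clock time together with the CPU time consumed by the process. The helper name timed_rmtree is hypothetical, and I am assuming that CPU time close to wall time corresponds to the CPU-bound case and CPU time far below wall time to the IO-bound case.

```python
import os
import shutil
import time


def timed_rmtree(workdir, relative_target):
    """Delete relative_target (resolved against workdir); return (wall, cpu) in seconds."""
    previous = os.getcwd()
    os.chdir(workdir)
    try:
        wall_start = time.perf_counter()
        cpu_start = time.process_time()  # user + system CPU time of this process
        shutil.rmtree(relative_target)
        cpu = time.process_time() - cpu_start
        wall = time.perf_counter() - wall_start
    finally:
        os.chdir(previous)  # always restore the original working directory
    return wall, cpu
```

For example, timed_rmtree(r"C:\Users\Database User\Documents\abc\def\xyz", r"..\mydata.bcolz") would correspond to the third case above, and timed_rmtree(r"C:\Users\Database User\Documents\abc\def", "mydata.bcolz") to the first.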
Any idea what is going on and how to fix it?