To access files on ext3 (assuming dir_index is not being used), what is the optimal directory depth versus the number of files per directory? Does file size affect this? The total number of files might be a factor, but there should still be an equation, I think...

If you don't have benchmarks to back it up, I would still be interested in what you think might be optimal, and why. Maybe certain system calls take longer, or maybe your computer-science knowledge suggests an answer. Examples from other file systems could be very interesting too, but I want to know the answer without a separate indexing mechanism such as the dir_index tune2fs option.

I have seen this question danced around and wondered about the answer, but never found it. At this point a database may well be the practical answer. However, I still want to know what the answer would be for the file system.

To access files on ext3 (assuming dir_index is not being used), what is the optimal directory depth versus the number of files per directory?

You'll want to run your own benchmarks for this.
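
For what it's worth, below is the kind of harness I would start from: a Python sketch that fills a directory with increasing numbers of empty files and times random stat() calls against it. The TEST_ROOT path, the entry counts, and the number of lookups are placeholder assumptions; point it at a scratch ext3 mount with dir_index turned off (tune2fs -O ^dir_index) and drop the caches between runs before trusting any of the numbers.

```python
#!/usr/bin/env python3
# Rough benchmark sketch: how does per-lookup time grow with the number of
# entries in a single directory? TEST_ROOT, COUNTS and LOOKUPS are
# placeholders -- use a scratch ext3 mount, not a live filesystem.
import os
import random
import time

TEST_ROOT = "/mnt/ext3-scratch/bench"   # assumption: scratch ext3 mount
COUNTS = [100, 1_000, 5_000, 10_000, 20_000]
LOOKUPS = 1_000                         # random stat() calls per run

def populate(dirpath, n):
    os.makedirs(dirpath, exist_ok=True)
    for i in range(n):
        # empty files are enough; only the directory entries matter here
        open(os.path.join(dirpath, f"f{i:06d}"), "w").close()

def avg_lookup(dirpath, n):
    names = [f"f{random.randrange(n):06d}" for _ in range(LOOKUPS)]
    start = time.perf_counter()
    for name in names:
        os.stat(os.path.join(dirpath, name))
    return (time.perf_counter() - start) / LOOKUPS

for n in COUNTS:
    d = os.path.join(TEST_ROOT, f"dir_{n}")
    populate(d, n)
    # For honest numbers, drop the dentry/page caches between runs
    # (echo 3 > /proc/sys/vm/drop_caches as root) or remount the filesystem.
    print(f"{n:>6} entries: {avg_lookup(d, n) * 1e6:8.1f} us per stat()")
```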

Does file size affect this? The total number of files might be a factor, but there should still be an equation, I think...

File size does not affect this; it is a function of the number of directory entries the filesystem has to scan, whatever filesystem you're using.
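
If you want to convince yourself of that, one quick sanity check is to build two directories with the same number of entries, one of empty files and one of large sparse files, and compare lookup times; sparse files let the "large" case exist without actually consuming disk space. The path and counts in this sketch are assumptions for illustration only.

```python
#!/usr/bin/env python3
# Sanity-check sketch: lookup cost should track the number of directory
# entries, not the size of the files behind them. ROOT, N and LOOKUPS are
# placeholders -- run this on a scratch mount only.
import os
import random
import time

ROOT = "/mnt/ext3-scratch/size_test"    # assumption: scratch ext3 mount
N = 10_000
LOOKUPS = 1_000

def build(dirpath, size_bytes):
    os.makedirs(dirpath, exist_ok=True)
    for i in range(N):
        with open(os.path.join(dirpath, f"f{i:06d}"), "wb") as fh:
            if size_bytes:
                fh.truncate(size_bytes)  # sparse: large size, no data blocks

def avg_stat(dirpath):
    start = time.perf_counter()
    for _ in range(LOOKUPS):
        os.stat(os.path.join(dirpath, f"f{random.randrange(N):06d}"))
    return (time.perf_counter() - start) / LOOKUPS

build(os.path.join(ROOT, "empty"), 0)
build(os.path.join(ROOT, "large"), 1 << 30)   # 1 GiB sparse files

for name in ("empty", "large"):
    print(f"{name}: {avg_stat(os.path.join(ROOT, name)) * 1e6:.1f} us per stat()")
```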

If you don't have benchmarks to back it up, I would still be interested in what you think might be optimal, and why.

About 32,000 entries per directory is pretty much the upper limit (on ext3 that figure is a hard limit for subdirectories), but from my own empirical experience I suggest fewer than 10,000 files, unless you want to wait a minute or two. A few thousand can be handled in about 5-20 seconds, depending on I/O and server load; a few hundred, almost instantaneously.

Follow-up edit (to posted comment):

Having 8 directories of 2,500 files each is far better than having two directories of 10,000 files each. The secret is in reducing the search time in each directory.
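
As a rough illustration of why (an assumed model, not a measurement): without dir_index, ext3 resolves a name by scanning the directory's entries linearly, so an average lookup touches about half of the leaf directory's entries, plus a small scan of the parent level for the extra path component. Plugging a few layouts for 20,000 files into that model:

```python
# Back-of-the-envelope model (an assumption, not a measurement): an average
# successful lookup scans about half of the leaf directory's entries, plus
# about half of the parent directory's entries. Relative numbers only.
def relative_lookup_cost(total_files, leaf_dirs):
    files_per_leaf = total_files / leaf_dirs
    return files_per_leaf / 2 + leaf_dirs / 2

for dirs in (1, 2, 4, 8, 20):
    total = 20_000
    print(f"{dirs:>3} dirs x {total // dirs:>6,} files: "
          f"~{relative_lookup_cost(total, dirs):,.0f} entries scanned per lookup")
```

By this model, two directories of 10,000 still scan roughly 5,000 entries per lookup, four directories of 5,000 roughly 2,500, and eight directories of 2,500 only about 1,250, which is the whole point of splitting.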

Strangely enough, I just posted a similar answer to a similar question here.

  • Not sure that really gets to the heart of what I am asking. To go with what you said: for a total of 20k files, would 2 directories with 10,000 files each be better or worse than 4 directories with 5,000 each? Does that clarify what I am after?

  • Not really strange; that is what inspired me to ask this question :-) I might actually go and do the benchmarks, but because of the controls needed (a new filesystem each time, at minimum) that would be no small task. So maybe someone already knows.
