I am currently benchmarking a hard drive. I am using HW32 for the measurement.

The result has two parts:

random seek time: 20 ms

random read throughput: 30 Mbytes/s

I am not sure what methods HW32 uses for the benchmark.

But I find the random seek time result very strange.

From my understanding, random seek time is the time spent locating where a specific piece of data lives on the disk. So I presume a random read should involve many random seeks, right?

For example, suppose I try to read 100MB of data from the disk, and because of fragmentation the data is spread across 1000 random blocks, each holding 100KB. When I read it, the disk head will have to move 1000 times to find all the data blocks, right?

So if the random seek time is 20ms, does that mean we will have to spend 1000 * 20ms = 20,000ms = 20 sec just on seeking? I guess not, right?
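Here is the naive arithmetic I have in mind, as a quick Python sketch (the 1000 fragments and the 20 ms are just the hypothetical numbers from above):

    # Naive model: every fragment costs one full random seek.
    fragments = 1000        # hypothetical 100KB pieces of a 100MB file
    seek_time_ms = 20       # random seek time reported by the benchmark

    total_seek_ms = fragments * seek_time_ms
    print(total_seek_ms)    # 20000 ms, i.e. 20 seconds spent only on seeking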

Can anyone explain this to me? If a benchmark like HW32 tells me the random seek time is 20 ms, what does that mean? Is that the total seek time for a random read, or the average time per seek?

Thanks

Random seek time is the average time the disk needs to reach the position where the data is located and read a single block (or even a single sector). The actual per-seek values can be both much higher and much lower than that average.

For the random read rate, usually more than one block is read at each position to calculate the average, so this rate depends on both the seek time and the drive's linear read throughput.
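As a rough sketch of how the two numbers interact; the block size and linear rate below are made-up illustrative values, not anything HW32 actually reports:

    # Simple model: each random read = one seek + one linear transfer.
    seek_s = 0.020          # 20 ms average random seek
    linear_mb_s = 100.0     # assumed sustained linear read rate, MB/s
    block_mb = 0.6          # assumed amount of data read per seek, MB

    time_per_io = seek_s + block_mb / linear_mb_s   # seconds per random read
    random_rate = block_mb / time_per_io            # effective MB/s

    print(round(random_rate, 1))   # ~23 MB/s: seek time dominates for small blocks

The larger the chunk read per seek, the closer the random read rate gets to the linear rate.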

Modern disks even have ways to increase the transfer rate with a technique called Native Command Queuing, which allows them to reorder pending requests in order to minimize head movement.

    • If a random read involves 120 blocks, then there will be 120 random seeks, right? So a naive calculation is that a 120-block read will take 120 * 20 ms = 2.4 sec of random seek time, right?
    • You have to differentiate between 120 blocks randomly arranged on the disk and one random read that can suck in 120 blocks in one sweep. The first will indeed take about 2.4 sec on average (!!), while the second is done almost immediately. Of course, there are endless possibilities in between.
    • Also, modern filesystems work very hard to avoid fragmentation in the first place, and having a 100MB file in 1000 locations is highly unlikely.
    • If HW32 reports a random seek time of 20 ms, how should I understand it? Is that an average or a peak? For example, it could be that 120 blocks each needed a serious seek, so the average is 20ms. Or it could be that there was only one serious random seek and the 120 blocks were otherwise contiguous, and HW32 just recorded the peak; otherwise the value should be much lower in that second case, shouldn't it?
    • Also, in a sweep (contiguous blocks), say all 120 blocks are contiguous, can I say there are still 120 seeks involved in the read, just that each of those seeks costs almost nothing? (A rough comparison of the two extremes is sketched below.)
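To make the two extremes in these comments concrete, a minimal sketch with a made-up per-block transfer time:

    # Reading 120 blocks: fully scattered vs. fully contiguous.
    seek_s = 0.020      # 20 ms per random seek
    xfer_s = 0.0001     # assumed transfer time per block once the head is there

    scattered = 120 * (seek_s + xfer_s)     # every block needs its own seek
    contiguous = seek_s + 120 * xfer_s      # one seek, then a single sweep

    print(round(scattered, 3))    # 2.412 s
    print(round(contiguous, 3))   # 0.032 s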

You're assuming the system seeks the random blocks in random order. It would never do that.

If you have to buy something at a randomly-chosen store, it may take you on average two hours to drive to that store, pick something up, and then drive back. But if you had to drive to 1,000 randomly-chosen stores, it wouldn't take you two hours per store because you would pick the optimum order. Some stores would be right next to each other. And so on.
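To put a number on the analogy, here is a toy sketch that models head movement as distance on a one-dimensional "disk" (purely illustrative):

    import random

    random.seed(0)
    positions = [random.random() for _ in range(1000)]   # 1000 random requests

    def travel(order):
        """Total head travel when visiting positions in the given order."""
        return sum(abs(b - a) for a, b in zip(order, order[1:]))

    print(round(travel(positions), 1))          # random order: ~333 units on average
    print(round(travel(sorted(positions)), 3))  # sorted, elevator-style: under 1 unit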

    • @DavidSchwartz: I was mapping your analogy back to the real-world case of disk random access. The system can't afford to wait until it has a number of requests to sort before starting, but has to do very local optimizations as it goes along.
    • @acolyte: reading a single large, fragmented file will often issue a request for multiple disk sectors at once (up to the buffer's size); and all OSes nowadays are multitasking and run hundreds of processes, so they have no problem collecting lots of I/O requests from multiple running processes.
    • This works only for a short timeframe and a small number of requests. If the timeframe for the sorting is too long, you would need to wait for the optimization to finish, and if the number of requests is too large, you end up in travelling-salesman territory.
    • @SvenW: even the TSP can be solved quickly if you only need to get close enough to the optimal solution rather than the absolute optimum. The textbook example of disk-seek optimization is the elevator algorithm (look it up on Wikipedia; a minimal sketch appears after these comments).
    • @SvenW: If you think realistically about real people who actually have to pick things up at multiple stores, you'll quickly realize that it works perfectly fine regardless of the number of requests or the timeframe. For the vast majority of cases, it is extremely easy to dramatically outperform a random-order traversal, even without super-advanced mathematics.
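A minimal sketch of the elevator (SCAN) idea mentioned above; an illustrative simplification, not how any particular OS or drive firmware actually implements it:

    def elevator_order(requests, head):
        """Serve requests by sweeping up from the head position,
        then reversing and sweeping back down (SCAN / elevator)."""
        up = sorted(r for r in requests if r >= head)
        down = sorted((r for r in requests if r < head), reverse=True)
        return up + down

    # Pending requests as track numbers, head currently at track 50.
    print(elevator_order([95, 180, 34, 119, 11, 123, 62, 64], head=50))
    # -> [62, 64, 95, 119, 123, 180, 34, 11]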

Your numbers are a little strange. An average latency of 20ms means 50 IOs per second, and for those 50 IOs to add up to 30MB/s you'd have to be using an unusually large IO size. Is there any chance the tool is reporting peak values rather than averages?
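The arithmetic behind that objection, as a quick sketch:

    avg_latency_s = 0.020           # 20 ms average latency per IO
    iops = 1 / avg_latency_s        # 50 IOs per second

    throughput_kb_s = 30 * 1024     # 30 MB/s expressed in KB/s
    implied_io_kb = throughput_kb_s / iops
    print(iops, implied_io_kb)      # 50.0 IOPS, 614.4 KB per IO: unusually large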


20ms is not an unusual value for a consumer-grade drive at all.

    • That depends on the block size of the random read. And you will only pay the 20ms for something that has neither been cached because it was already requested, nor gratuitously cached because it happened to be near the disk head anyway.
