How can I identify the request queue for a Linux block device?

I am working on a driver that connects hard disks over the network. There is a bug: if I enable two or more hard disks on the computer, only the first one gets its partitions examined and recognized. The result is that if I have 1 partition on hda and 1 partition on hdb, then as soon as I connect hda there is a partition that can be mounted. So hda1 gets a blkid of xyz123 once mounted. But when I go on to mount hdb1, it comes up with the same blkid, and in fact the driver is reading it from hda, not hdb.

So I think I found the place where the driver is getting confused. Below is debugging output that includes a dump_stack() I placed at the first spot that appeared to be accessing the wrong device.

Here is the code section:

    /* Basically, this is just the request_queue processor. In the log
       output that follows, the second device (hdb) has just been
       connected, right after hda was connected and hda1 was mounted
       to the system. */
    void nblk_request_proc(struct request_queue *q)
    {
        struct request *req;
        ndas_error_t err = NDAS_OK;

        dump_stack();

        while ((req = NBLK_NEXT_REQUEST(q)) != NULL) {
            dbgl_blk(8, "processing queue request from slot %d", SLOT_R(req));

            if (test_bit(NDAS_FLAG_QUEUE_SUSPENDED,
                         &(NDAS_GET_SLOT_DEV(SLOT_R(req))->queue_flags))) {
                printk("ndas: Queue is suspended\n");
                /* Queue is suspended */
    #if ( LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,31) )
                blk_start_request(req);
    #else
                blkdev_dequeue_request(req);
    #endif
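The paste above is cut off mid-function. For context, here is my understanding of what a generic 2.6.32-era request-function loop looks like; this is a hedged sketch with made-up names, not the actual ndas driver code:

    #include <linux/blkdev.h>
    #include <linux/errno.h>

    /* Hedged sketch of a generic 2.6.32-era request-function loop,
     * for context only -- NOT the ndas driver's actual code. */
    static void sketch_request_proc(struct request_queue *q)
    {
        struct request *req;

        while ((req = blk_fetch_request(q)) != NULL) {
            if (!blk_fs_request(req)) {
                /* Not a filesystem request; fail it and move on. */
                __blk_end_request_all(req, -EIO);
                continue;
            }
            /* A real driver would hand the request to its transport
             * here and complete it from the I/O-done callback. */
            __blk_end_request_all(req, 0);
        }
    }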

Here is the log output. I have added some comments to help make sense of what is happening and where the bad call seems to come in.

    /* Just below here you can see "slot" mentioned many times. This is the
       identification for the network case in which the hd is connected to
       the network. So you will see slot 2 in this log, because the first
       device has already been connected and mounted. */
    kernel: [231644.155503] BL|4|slot_enable|/driver/block/ctrldev.c:281|adding disk: slot=2, first_minor=16, capacity=976769072|nd/dpcd1,64:15:44.38,3828:10
    kernel: [231644.155588] BL|3|ndop_open|/driver/block/ops.c:233|ing bdev=f6823400|nd/dpcd1,64:15:44.38,3720:10
    kernel: [231644.155598] BL|2|ndop_open|/driver/block/ops.c:247|slot =0x2|nd/dpcd1,64:15:44.38,3720:10
    kernel: [231644.155606] BL|2|ndop_open|/driver/block/ops.c:248|dev_t=0x3c00010|nd/dpcd1,64:15:44.38,3720:10
    kernel: [231644.155615] ND|3|ndas_query_slot|netdisk/nddev.c:791|slot=2 sdev=d33e2080|nd/dpcd1,64:15:44.38,3696:10
    kernel: [231644.155624] ND|3|ndas_query_slot|netdisk/nddev.c:817|ed|nd/dpcd1,64:15:44.38,3696:10
    kernel: [231644.155631] BL|3|ndop_open|/driver/block/ops.c:326|mode=1|nd/dpcd1,64:15:44.38,3720:10
    kernel: [231644.155640] BL|3|ndop_open|/driver/block/ops.c:365|ed open|nd/dpcd1,64:15:44.38,3724:10
    kernel: [231644.155653] BL|8|ndop_revalidate_disk|/driver/block/ops.c:2334|gendisk=c6afd800={major=60,first_minor=16,minors=0x10,disk_name=ndas-44700486-0,private_data=00000002,capacity=%lld}|nd/dpcd1,64:15:44.38,3660:10
    kernel: [231644.155668] BL|8|ndop_revalidate_disk|/driver/block/ops.c:2346|ed|nd/dpcd1,64:15:44.38,3652:10
    /* So at this point the hard disk is added (gendisk=c6...) and the
       identifications all match the network device. The driver is now
       about to begin scanning the hard drive for existing partitions.
       The little 'ed' at the end of the previous line indicates that
       revalidate_disk has finished its job. Also, I think the request
       queue is indicated by the output dpcd1 near the very end of the
       line. Now below we have entered the function that is pasted
       above. In the function you can see that the slot can be
       determined from the queue. And the log output after the stack
       dump shows it is from slot 1 (the first network drive, which was
       already mounted). */
    kernel: [231644.155677] ndas-44700486-0:Pid: 467, comm: nd/dpcd1 Tainted: P 2.6.32-5-686 #1
    kernel: [231644.155711] Call Trace:
    kernel: [231644.155723] [<fc5a7685>] ? nblk_request_proc+0x9/0x10c [ndas_block]
    kernel: [231644.155732] [<c11298db>] ? __generic_unplug_device+0x23/0x25
    kernel: [231644.155737] [<c1129afb>] ? generic_unplug_device+0x1e/0x2e
    kernel: [231644.155743] [<c1123090>] ? blk_unplug+0x2e/0x31
    kernel: [231644.155750] [<c10cceec>] ? block_sync_page+0x33/0x34
    kernel: [231644.155756] [<c108770c>] ? sync_page+0x35/0x3d
    kernel: [231644.155763] [<c126d568>] ? __wait_on_bit_lock+0x31/0x6a
    kernel: [231644.155768] [<c10876d7>] ? sync_page+0x0/0x3d
    kernel: [231644.155773] [<c10876aa>] ? __lock_page+0x76/0x7e
    kernel: [231644.155780] [<c1043f1f>] ? wake_bit_function+0x0/0x3c
    kernel: [231644.155785] [<c1087b76>] ? do_read_cache_page+0xdf/0xf8
    kernel: [231644.155791] [<c10d21b9>] ? blkdev_readpage+0x0/0xc
    kernel: [231644.155796] [<c1087bbc>] ? read_cache_page_async+0x14/0x18
    kernel: [231644.155801] [<c1087bc9>] ? read_cache_page+0x9/0xf
    kernel: [231644.155808] [<c10ed6fc>] ? read_dev_sector+0x26/0x60
    kernel: [231644.155813] [<c10ee368>] ? adfspart_check_ICS+0x20/0x14c
    kernel: [231644.155819] [<c10ee138>] ? rescan_partitions+0x17e/0x378
    kernel: [231644.155825] [<c10ee348>] ? adfspart_check_ICS+0x0/0x14c
    kernel: [231644.155830] [<c10d26a3>] ? __blkdev_get+0x225/0x2c7
    kernel: [231644.155836] [<c10ed7e6>] ? register_disk+0xb0/0xfd
    kernel: [231644.155843] [<c112e33b>] ? add_disk+0x9a/0xe8
    kernel: [231644.155848] [<c112dafd>] ? exact_match+0x0/0x4
    kernel: [231644.155853] [<c112deae>] ? exact_lock+0x0/0xd
    kernel: [231644.155861] [<fc5a8b80>] ? slot_enable+0x405/0x4a5 [ndas_block]
    kernel: [231644.155868] [<fc5a8c63>] ? ndcmd_enabled_handler+0x43/0x9e [ndas_block]
    kernel: [231644.155874] [<fc5a8c20>] ? ndcmd_enabled_handler+0x0/0x9e [ndas_block]
    kernel: [231644.155891] [<fc54b22b>] ? notify_func+0x38/0x4b [ndas_core]
    kernel: [231644.155906] [<fc561cba>] ? _dpc_cancel+0x17c/0x626 [ndas_core]
    kernel: [231644.155919] [<fc562005>] ? _dpc_cancel+0x4c7/0x626 [ndas_core]
    kernel: [231644.155933] [<fc561cba>] ? _dpc_cancel+0x17c/0x626 [ndas_core]
    kernel: [231644.155941] [<c1003d47>] ? kernel_thread_helper+0x7/0x10
    /* Here is the output of the driver's debug messages. It shows that
       this operation is being performed on the first device's request
       queue. */
    kernel: [231644.155948] BL|8|nblk_request_proc|/driver/block/block26.c:494|processing queue request from slot 1|nd/dpcd1,64:15:44.38,3408:10
    kernel: [231644.155959] BL|8|nblk_handle_io|/driver/block/block26.c:374|struct ndas_slot sd = NDAS GET SLOT DEV(slot 1)
    kernel: [231644.155966] |nd/dpcd1,64:15:44.38,3328:10
    kernel: [231644.155970] BL|8|nblk_handle_io|/driver/block/block26.c:458|case READA call ndas_read(slot=1, ndas_req)|nd/dpcd1,64:15:44.38,3328:10
    kernel: [231644.155979] ND|8|ndas_read|netdisk/nddev.c:824|read io: slot=1, cmd=0, req=x00|nd/dpcd1,64:15:44.38,3320:10

I hope that is enough background information. An obvious question at this point might be: "When and where do the request_queues get allocated?"

Well, that is handled a little before the add_disk function; adding the disk is the first line of the log output.

    slot->disk = NULL;
    spin_lock_init(&slot->lock);
    slot->queue = blk_init_queue(nblk_request_proc, &slot->lock);

As far as I know, this is standard stuff. So, back to my original question: where can I find the request queue, and how can I make sure that each new device gets its own unique request queue, or does the Linux kernel just use one queue per major number? I want to find out why this driver is loading the same queue onto two different block storage devices, and determine whether that is what causes the duplicate blkid during the initial registration process.
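For what it's worth, my understanding is that blk_init_queue() allocates and returns a brand-new struct request_queue on every call, so the kernel does not share one queue per major number; if two devices end up on the same queue, the driver put them there. Here is a debugging sketch I could drop into the enable path to confirm this (the helper name is made up; struct ndas_slot and slot->queue are from the driver code above):

    #include <linux/blkdev.h>
    #include <linux/kernel.h>

    /* Print each slot's queue pointer at enable time. If two slots
     * ever report the same address, the driver itself is reusing one
     * request_queue across devices -- the kernel didn't do it. */
    static void log_slot_queue(struct ndas_slot *slot, int s)
    {
        printk(KERN_DEBUG "ndas: slot %d -> request_queue %p\n",
               s, slot->queue);
    }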

Thanks for taking a look at this situation for me.

 Queue = blk_init_queue(sbd_request, &Device.lock); 

I am sharing the solution to the bug that led me to post this question, although it does not actually answer the question of how to identify a device's request queue.

In the code above is the following:

 if (test_bit(NDAS_FLAG_QUEUE_SUSPENDED, &(NDAS_GET_SLOT_DEV(SLOT_R(req))->queue_flags))) 

Well, that "SLOT_R(req)" was causing the trouble. Here is where it was defined, returning the gendisk device:

 #define SLOT_R(_request_) SLOT((_request_)->rq_disk) 

This returns the disk, but not a value appropriate for the various later operations. So as additional block devices were loaded, this macro basically kept returning 1 (I think it was being treated as a boolean). Therefore, all requests were piled onto the request queue of disk 1.

The fix was to access the correct disk identification value that had already been stored in the disk's private_data when the disk was added to the system.

The corrected identifier definition (I also cast through long here, so the pointer is not truncated straight to int on 64-bit builds):

    #define SLOT_R(_request_) ((int)(long)(_request_)->rq_disk->private_data)

And this is how the correct disk number was stored when the disk was set up:

    slot->disk->queue = slot->queue;
    slot->disk->private_data = (void *)(long) s;  /* 's' is the disk id */
    slot->queue_flags = 0;

Now the correct disk ID is returned from private_data, so all requests go to the proper queue.
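One side note on that fix: the store goes through (void *)(long), and to stay symmetric the read in SLOT_R should come back through (int)(long) as well; casting a pointer straight to int would truncate (or at least warn) on a 64-bit build. A tiny stand-alone illustration of the round trip, nothing driver-specific:

    #include <stdio.h>

    int main(void)
    {
        int s = 2;                   /* the disk/slot id to stash  */
        void *p = (void *)(long)s;   /* what private_data stores   */
        int back = (int)(long)p;     /* what SLOT_R() reads out    */

        printf("stored %d, read back %d\n", s, back);
        return 0;
    }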

As mentioned above, this does not show how to identify the queue itself. An uneducated guess might be:

  x = (int) _request_->rq_disk->queue->id; 

Reference: the request_queue definition in the Linux kernel headers: http://lxr.free-electrons.com/source/include/linux/blkdev.h#L270&321
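One more thought on identifying a queue: if I read blkdev.h correctly, the id field in my guess above only exists in newer kernels, but struct request_queue has long had a queuedata pointer meant exactly for mapping a queue back to the driver's per-device structure. A hedged sketch of how that could look in this driver (untested, function name made up):

    #include <linux/blkdev.h>
    #include <linux/kernel.h>

    /* At setup time, right after blk_init_queue() succeeds:
     *
     *     slot->queue->queuedata = slot;
     *
     * The request function can then identify its device straight from
     * the queue, with no reliance on req->rq_disk at all. */
    static void queuedata_request_proc(struct request_queue *q)
    {
        struct ndas_slot *slot = q->queuedata;

        printk(KERN_DEBUG "ndas: request on queue %p belongs to slot %p\n",
               q, slot);
        /* ...normal per-slot request processing would follow... */
    }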

Thanks everyone for the help!
