ColonINST-v1

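ColonINST is a large-scale instruction-tuning dataset designed for multimodal analysis of colonoscopy. It contains 303,001 colonoscopy images collected from 19 publicly available source datasets. Using a GPT-4V-driven semi-automated pipeline, we generated 128,620 detailed medical captions, which makes the dataset more useful for training AI models. Finally, we reorganized the data into 450,724 visual conversations that instruct AI models to perform four downstream tasks: image classification (CLS), referring expression generation (REG), referring expression comprehension (REC), and caption generation (CAP). These tasks are essential to multimodal medical AI applications.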

Visualization images

ColonINST-v1-example.png

ColonINST-v1-overview.png

Dataset metadata

Dimension	2D
Modality	other
Task type	other
Anatomical structure	Colon
Anatomical region	Abdomen
Number of classes	2
Data volume	450,724
File formats	.json, .jpg, .png

File structure

├──cache
    ├──ColonINST
        ├──Json-file
            ├──train
                ├──ColonINST-train.json
            ├──val
                ├──ColonINST-val-cls.json
                |...
            ├──test
                ├──ColonINST-test-cls.json
                |...

        ├──Positive-images
            ├──CPC-Paired
                ├──Train
                    ├──polyp
                        ├──image_name.jpg
                        |...
                ├──Val
                    ├──polyp
                        ├──image_name.jpg
                        |...
                ├──Test
                    ├──polyp
                        ├──image_name.jpg
                        |...
            |...
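
The layout above can be checked after download with a short script. The following is a minimal sketch, not an official loader: it assumes the dataset was extracted to a local directory that mirrors the tree above (`Json-file` and `Positive-images` under one root), and the function names `count_images` and `summarize_json` are hypothetical helpers introduced for illustration. The record schema of the JSON files is not documented on this page, so the sketch only reports each file's top-level container type and length.

```python
import json
from collections import Counter
from pathlib import Path

# Image formats listed in the metadata table above.
IMAGE_SUFFIXES = {".jpg", ".png"}


def count_images(root):
    """Count images per (sub-dataset, split, category) under Positive-images."""
    counts = Counter()
    image_root = Path(root) / "Positive-images"
    for path in image_root.rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            parts = path.relative_to(image_root).parts
            # Assumed layout: <sub-dataset>/<split>/<category>/<image file>
            if len(parts) == 4:
                counts[parts[:3]] += 1
    return counts


def summarize_json(root):
    """Report the top-level type and length of every JSON file under Json-file."""
    summary = {}
    json_root = Path(root) / "Json-file"
    for path in sorted(json_root.rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # The schema is an unknown here, so only the outer container is inspected.
        size = len(data) if isinstance(data, (list, dict)) else None
        summary[path.relative_to(json_root).as_posix()] = (type(data).__name__, size)
    return summary
```

Calling `count_images("cache/ColonINST")` and `summarize_json("cache/ColonINST")` (the path is an assumption based on the tree above) gives a quick sanity check that every split and category directory is populated before training.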

Image size statistics

Statistic	Spacing (mm)	Size
Minimum	`-`	`-`
Median	`-`	`-`
Maximum	`-`	`-`

Citation

@article{ji2024frontiers,
  author = {Ji, Ge-Peng and Liu, Jingyi and Xu, Peng and Barnes, Nick and Khan, Fahad Shahbaz and Khan, Salman and Fan, Deng-Ping},
  title = {Frontiers in Intelligent Colonoscopy},
  journal = {arXiv preprint arXiv:2410.17241},
  year = {2024}
}

Source information

Official website:
Visit the official website

Download link:

Download data

Publicly downloadable; no permission required

Related paper:
View paper

Release date: 2024-10

Statistics

Created: 2025-09-10 10:20

Updated: 2025-09-12 18:52