Quilt-1M

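Quilt-1M is the largest vision-language histopathology dataset to date, with 1M paired image-text samples. It is sourced from 4,504 narrated videos totaling 1,087 hours and covers more than 438K unique images with 768K associated text pairs. The text mentions 1.469M UMLS entities in total, of which 28.5K are unique. Images span three microscope magnification ranges (0-10x, 10-20x, 20-40x), contributing 280K, 75K, and 107K images respectively. Captions average 22.76 words, region-of-interest (ROI) text averages 8.68 words, and each image has on average 1.74 medical sentences (max 5.33, min 1.0).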
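To put the quoted figures in proportion, the short sketch below derives each magnification range's share and the mean video length. It uses only the numbers stated in the description; the function names are illustrative. The three per-range counts sum to 462K, so the shares are taken relative to that sum rather than to the 438K unique-image figure, which is an assumption of this sketch.

```python
# Figures quoted in the dataset description (K = thousand).
magnification_counts = {"0-10x": 280_000, "10-20x": 75_000, "20-40x": 107_000}
total_hours = 1087
total_videos = 4504


def shares(counts):
    """Return each bucket's fraction of the summed counts."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def mean_minutes_per_video(hours, videos):
    """Average video length in minutes."""
    return hours * 60 / videos


if __name__ == "__main__":
    for name, frac in shares(magnification_counts).items():
        print(f"{name}: {frac:.1%}")
    minutes = mean_minutes_per_video(total_hours, total_videos)
    print(f"mean video length: {minutes:.1f} min")
```

Under these assumptions, roughly three fifths of the images fall in the lowest magnification range, and the average source video runs about a quarter of an hour.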

Microscopy imaging

Sample images

Quilt-1M_1.webp

Quilt-1M_2.webp

Dataset metadata

Dimension	2D
Modality	other
Task type	other
Anatomical structure	Tissue
Anatomical region	Whole body
Data volume	1M
File formats	.csv, .jpg

File structure

.
├── quilt_1M_lookup.csv
└── quilt-1m
       ├── 00001000010913.jpg
       ├── 0-RuE0Ldx6U_image_ffbf02cb-a316-4b76-9810-5f7d41d73842.jpg
       └── ...
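
Given this layout, a lookup CSV next to an image folder, one way to pair CSV rows with image files is sketched below. The source does not list the CSV's columns, so `path_column="image_path"` is an assumed name and `iter_existing_images` is a hypothetical helper; inspect the header of `quilt_1M_lookup.csv` before relying on either.

```python
import csv
from pathlib import Path


def iter_existing_images(root, lookup_name="quilt_1M_lookup.csv",
                         image_dir="quilt-1m", path_column="image_path"):
    """Yield (row, image_file) for lookup rows whose image exists on disk.

    `path_column` is an ASSUMED column name; check the CSV header first.
    """
    root = Path(root)
    with open(root / lookup_name, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if path_column not in (reader.fieldnames or []):
            raise KeyError(f"{path_column!r} not in header: {reader.fieldnames}")
        for row in reader:
            image_file = root / image_dir / row[path_column]
            # Skip rows that point at files missing from the local copy.
            if image_file.is_file():
                yield row, image_file
```

The generator streams the CSV row by row, which avoids holding the full lookup table in memory; passing a different `path_column` adapts it if the actual header differs.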

Image size statistics

Statistic	Spacing (mm)	Size
Minimum	`-`	`-`
Median	`-`	`-`
Maximum	`-`	`-`
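
The size columns above are unfilled. A minimal sketch of how the pixel-size rows could be computed from a local copy of the images follows. It assumes Pillow is installed; `summarize_sizes` and `read_sizes` are hypothetical helper names. Physical spacing in mm is not derived here, since this sketch reads only pixel dimensions.

```python
from pathlib import Path
from statistics import median


def summarize_sizes(sizes):
    """Return per-axis min / median / max of (width, height) pairs."""
    if not sizes:
        raise ValueError("no sizes given")
    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    return {
        "min": (min(widths), min(heights)),
        "median": (median(widths), median(heights)),
        "max": (max(widths), max(heights)),
    }


def read_sizes(image_dir, pattern="*.jpg"):
    """Collect (width, height) for each matching image using Pillow."""
    # Third-party dependency, imported lazily so summarize_sizes works without it.
    from PIL import Image
    sizes = []
    for path in sorted(Path(image_dir).glob(pattern)):
        with Image.open(path) as img:
            sizes.append(img.size)  # Pillow reports (width, height)
    return sizes
```

For example, `summarize_sizes(read_sizes("quilt-1m"))` would return one (width, height) tuple per statistic. Each axis is summarized independently, so a reported tuple need not correspond to a single image.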

Citation

@article{ikezogwo2023quilt,
      title={Quilt-1M: One Million Image-Text Pairs for Histopathology},
      author={Ikezogwo, Wisdom Oluchi and Seyfioglu, Mehmet Saygin and Ghezloo, Fatemeh and Geva, Dylan Stefan Chan and Mohammed, Fatwir Sheikh and Anand, Pavan Kumar and Krishna, Ranjay and Shapiro, Linda},
      journal={arXiv preprint arXiv:2306.11207},
      year={2023}
}

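Source information

Official website: Visit website

Download link: Download data (public download, no permission required)

Related paper: View paper (arXiv:2306.11207)

Release date: June 2023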

Statistics

Created: 2025-09-10 10:21

Updated: 2025-09-13 06:29