Annotated datasets have become one of the most crucial preconditions for the development and evaluation of machine learning-based methods designed for the automated interpretation of remote sensing data. In this study, we have reviewed the historic development of such datasets, discussed their features based on a few selected examples, and addressed open issues for future developments.
To the best of our knowledge, as of May 2021 the following graph displays more than 90% of the benchmark datasets in remote sensing and photogrammetry. It includes all available datasets acquired with airborne or spaceborne imaging or radar sensors (it does NOT include point cloud datasets). The list will become more complete over time; we are still working on it.
This study is a collaboration between Michael Schmitt, Ronny Hänsch and me.
181 benchmark satellite datasets have been reviewed, and their statistics are provided in the following table. The table includes a link to each dataset's webpage, its volume, publication date, specified task, and some other statistics. The figures in the paper summarize these statistics.
(For the high resolution scatter image please refer to the paper.)
Also, the following drawing simplifies the concept behind our "Size Measure" and "Volume Measure".
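As a rough illustration of the "Size Measure": many rows of the table below are consistent with it being the number of images multiplied by the squared image side length (for example, AID: 10000 × 600² = 3600000000). This is an inference from the table values, not a definition taken from the paper, and it does not hold for every row (datasets with non-square images differ). A minimal sketch under that assumption, with `size_measure` being a hypothetical helper name:

```python
def size_measure(num_images: int, side_px: int) -> int:
    """Total pixel count, ASSUMING square images with side length `side_px`.

    Assumption: this formula is inferred from the table values; it is not
    quoted from the paper.
    """
    return num_images * side_px * side_px


# (number of images, image side) copied from the table rows
examples = {
    "AID": (10000, 600),
    "EuroSAT": (27000, 64),
    "DOTA v2.0": (11268, 4000),
}

for name, (n, side) in examples.items():
    print(name, size_measure(n, side))
```

The "Volume (MB)" column, by contrast, is simply the storage size of the dataset as listed in the table.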
UPDATE (09 August 2023): Our review paper, written with a team of scientists from around the world, has been published in IEEE GRSM. You can find it at this link. More information is available in our IGARSS 2021 paper.
⭐⭐⭐ Our new paper, an extended version of the above paper, has been published as a review paper in IEEE Geoscience and Remote Sensing Magazine (GRSM). You can find it at this link. ⭐⭐⭐
NOTICE: You can find each dataset's link in the rightmost column. Scroll the table to the right! NOTICE: Read to the end of the table. You'll find more AWESOME stuff!
Dataset Name | Year of publication | Number of images | Size of images (px) | Size Measure | Task | Number of classes | Volume (MB) | Link |
---|---|---|---|---|---|---|---|---|
AID | 2017 | 10000 | 600 | 3600000000 | Class | 30 | 2440 | https://captain-whu.github.io/AID/ | |
AID | 2018 | 400000 | Class | 46 | |||||
Oil Storage Tanks | 2019 | 10000 | 512 | 2621440000 | OD | 3000 | https://www.kaggle.com/towardsentropy/oil-storage-tanks | ||
BigEarthNet | 2019 | 590326 | 120 | 8500694400 | Class | 121000 | http://bigearth.net/ | ||
Brazilian Coffee Scene | 2015 | 2876 | 64 | 11780096 | Class | 2 | 4.50 | http://www.patreo.dcc.ufmg.br/2017/11/12/brazilian-coffee-scenes-dataset/ | |
BrazilDAM | 2020 | 769 | 384 | 113393664 | Class | 2 | 57000 | http://www.patreo.dcc.ufmg.br/2020/01/27/brazildam-dataset/ | |
Bridges Dataset | 2019 | 500 | 3822 | 68232000000 | OD | 1 | 1450 | http://www.patreo.dcc.ufmg.br/2019/07/10/bridge-dataset/ | |
Brazilian Cerrado-Savanna Scenes | 2016 | 1311 | 64 | 5369856 | Class | 4 | 11 | http://www.patreo.dcc.ufmg.br/2017/11/12/brazilian-cerrado-savanna-scenes-dataset/ | |
BH-Pools BH-WaterTanks | 2020 | 350 | 3000 | 3150000000 | SemSeg | 1 | 1900 | http://www.patreo.dcc.ufmg.br/2020/07/29/bh-pools-watertanks-datasets/ | |
AiRound | 2020 | 11753 | 300 | 1057770000 | Class | 11 | 33000 | http://www.patreo.dcc.ufmg.br/2020/07/22/multi-view-datasets/ | |
CV-BrCT (Cross-View Brazilian Construction Type) | 2020 | 24000 | 500 | 6000000000 | Class | 9 | 9200 | http://www.patreo.dcc.ufmg.br/2020/07/22/multi-view-datasets/ | |
EuroSAT | 2018 | 27000 | 64 | 110592000 | Class | 10 | 1920 | https://github.com/phelber/EuroSAT# | |
NWPU-RESISC45 | 2016 | 31500 | 256 | 2064384000 | Class | 45 | 404.7 | https://github.com/tensorflow/datasets/blob/master/docs/catalog/resisc45.md | |
NWPU-VHR10 | 2014 | 800 | 1000 | 800000000 | OD | 10 | 73 | https://github.com/chaozhong2010/VHR-10_dataset_coco | |
SSDD (RadarSat-2, TerraSAR-X, S-1) | 2017 | 1160 | 500 | 295000000 | OD | 1 | |||
Dataset for Ship Classification (DSCR) | 2019 | 1951 | Class | https://github.com/DYH666/DSCR | |||||
SAR Ship Detection (GF-2, S-1) | 2019 | 43819 | 256 | 2871721984 | OD | 1 | 407 | https://github.com/CAESAR-Radi/SAR-Ship-Dataset | |
AIR-SARShip-2.0 (GF-3) | 2020 | 300 | 1000 | 300000000 | OD | 1 | 224 | http://radars.ie.ac.cn/web/data/getData?dataType=SARDataset | |
LS-SSDD (Large Scale) | 2020 | 15 | 20000 | 5760000000 | OD | 1 | 7800 | https://github.com/TianwenZhang0825/LS-SSDD-v1.0-OPEN | |
HRSID (Ship Detection, S-1, TerraSAR-X) | 2020 | 5604 | 800 | 3586560000 | OD | 1 | 581 | https://github.com/chaozhong2010/HRSID | |
High Resolution Semantic Change Detection (HRSCD) | 2019 | 582 | 10000 | 58200000000 | CD | 5000 | https://ieee-dataport.org/open-access/hrscd-high-resolution-semantic-change-detection-dataset ; https://rcdaudt.github.io/hrscd/ | ||
HRSC2016 (Ship Detection) | 2017 | 1061 | 1100 | 816970000 | OD | 26 | http://www.escience.cn/people/liuzikun/DataSet.html | ||
Remote Sensing Object Detection (RSOD) | 2017 | 946 | 1000 | 946000000 | OD | 4 | 309 | https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset- | |
PatternNet | 2018 | 30400 | 256 | 1992294400 | Class | 38 | 1300 | https://sites.google.com/view/zhouwx/dataset | |
Bijie Landslide | 2020 | 2773 | 200 | 110920000 | Class | 1 | 513 | http://gpcv.whu.edu.cn/data/Bijie_pages.html | |
RSC11 | 2016 | 1232 | 512 | 322961408 | Class | 11 | |||
RSD46-WHU | 2017 | 117000 | 256 | 7667712000 | Class | 46 | https://pan.baidu.com/s/1mMDKUu02V0s8rXstewv26A | ||
RSI-CB256 | 2017 | 24000 | 256 | 1572864000 | Class | 35 | 2240 | https://1drv.ms/u/s!Am218i8VSQEBaTyXDc-zA56zPv4 | |
RSI-CB128 | 2017 | 36000 | 128 | 589824000 | Class | 45 | 879 | https://1drv.ms/u/s!Auv9HKTH1GC9jBbv-XzBFyMegqlL | |
RSSCN7 | 2015 | 2800 | 400 | 448000000 | Class | 7 | 86 | https://figshare.com/articles/dataset/RSSCN7_Image_dataset/7006946 | |
SATIN | 2023 | 775632 | 28-10494 | Class | >250 | 56600 | https://satinbenchmark.github.io/ | ||
SAT-4 | 2015 | 500000 | 28 | 392000000 | Class | 4 | 1150 | http://csc.lsu.edu/~saikat/deepsat/ | |
SAT-6 | 2015 | 405000 | 28 | 317520000 | Class | 6 | 1150 | http://csc.lsu.edu/~saikat/deepsat/ | |
SemCity Toulouse | 2020 | 16 | 3500 | 196000000 | SemSeg | 5200 | http://rs.ipb.uni-bonn.de/data/ | ||
SEN12MS | 2019 | 180662 | 256 | 11839864832 | Class/SemSeg | 510000 | https://mediatum.ub.tum.de/1474000 | ||
SEN12MS-CR | 2020 | 122218 | 256 | 8009678848 | CR | 272000 | https://mediatum.ub.tum.de/1554803 | ||
SIRI-WHU (Google USGS) | 2016 | 2400 | 200 | 96000000 | Class | 12 | 700 | http://www.lmars.whu.edu.cn/prof_web/zhongyanfei/e-code.html | |
SZTAKI AirChange | 2012 | 13 | 800 | 7920640 | OD | 2 | 42 | http://web.eee.sztaki.hu/remotesensing/airchange_benchmark.html | |
Things And Stuff (TAS) | 2008 | 30 | 792 | 18817920 | OD | 1 | 11 | http://ai.stanford.edu/~gaheitz/Research/TAS/ | |
UC Merced | 2010 | 2100 | 256 | 137625600 | Class | 21 | 317 | http://weegee.vision.ucmerced.edu/datasets/landuse.html | |
WHU-RS19 | 2012 | 1013 | 600 | 364680000 | Class | 19 | http://captain.whu.edu.cn/datasets/WHU-RS19.zip | ||
SpaceNet-1 (Building Detection v1) | 2016 | 9735 | 650 | 4113037500 | SemSeg | 1 | 31000 | https://spacenet.ai/spacenet-buildings-dataset-v1/ | |
SpaceNet-2 (Building Detection v2) | 2017 | 24586 | 650 | 10387585000 | SemSeg | 1 | 182200 | https://spacenet.ai/spacenet-buildings-dataset-v2/ | |
SpaceNet-3 (Road Network Detection) | 2017 | 3711 | 1300 | 6271595000 | SemSeg | 1 | 182200 | https://spacenet.ai/spacenet-roads-dataset/ | |
SpaceNet-4 (Multi-View Overhead Imagery) | 2018 | 60000 | 900 | 48600000000 | OD | 186000 | https://spacenet.ai/off-nadir-building-detection/ | ||
SpaceNet-5 (Road Network Extraction and Route Travel Time Est.) | 2019 | 2369 | 1300 | 4003610000 | SemSeg | 1 | 84103 | https://spacenet.ai/sn5-challenge/ | |
SpaceNet-6 (Multi-Sensor All Weather Mapping) | 2020 | 3401 | 900 | 2754810000 | SemSeg | 1 | 368 | https://spacenet.ai/sn6-challenge/ | |
SpaceNet-7 (Multi-Temporal Urban Development) | 2020 | 1525 | 1024 | 1599078400 | SemSeg | 1 | 20582 | https://spacenet.ai/sn7-challenge/ | |
Functional Map of the World | 2018 | 523846 | 1.084E+12 | Class | 63 | 3500000 | https://github.com/fMoW/dataset | ||
xView | 2018 | 1413 | 3000 | 56000000000 | Class | 60 | 20000 | https://challenge.xviewdataset.org/welcome | |
xView2 | 2018 | 22068 | 1024 | 23139975168 | CD | 4 | 51000 | https://xview2.org/ | |
LandCoverNet v1.0 | 2020 | 1980 | 256 | 129761280 | Class | 7 | 81900 | https://registry.mlhub.earth/10.34911/rdnt.d2ce8i/ | |
Agriculture-Vision | 2020 | 21061 | 512 | 5521014784 | OD | 6 | 4392.50 | https://www.agriculture-vision.com/dataset#h.p_C89EwHgTp3-L | |
INRIA Aerial Image Labeling | 2017 | 360 | 1500 | 810000000 | OD | 19510 | https://project.inria.fr/aerialimagelabeling/ | ||
DeepGlobe (Road Detection) | 2018 | 8570 | 1024 | 8986296320 | Class | 1 | 3840 | https://www.kaggle.com/balraj98/deepglobe-road-extraction-dataset | |
DeepGlobe (Building Detection) | 2018 | 24586 | 650 | 10387585000 | Class/SemSeg | 1 | https://competitions.codalab.org/competitions/18544 | ||
DeepGlobe (LandCover Classification) | 2018 | 1146 | 2448 | 6867638784 | SemSeg | 7 | 2750 | https://competitions.codalab.org/competitions/18544 | |
Slovenia Land Cover | 2019 | 940 | 500 | 235000000 | Class | 10 | 195000 | http://eo-learn.sentinel-hub.com/ ; http://eo-learn.sentinel-hub.com.s3.eu-central-1.amazonaws.com/eopatches_slovenia_2017_full.zip | |
AIST Building Change Detection | 2017 | 16950 | 160 | 356100096 | CD | 1 | 18200 | https://github.com/gistairc/ABCDdataset | |
Onera Satellite Change Detection | 2018 | 24 | 600 | 8640000 | CD | 2 | 489 | https://rcdaudt.github.io/oscd/ | |
DSTL Feature Detection (3Band) | 2016 | 450 | 3391 | 5174496450 | OD | 10 | 12870 | https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data?select=three_band.zip | |
DSTL Feature Detection (16Band) | 2016 | 1350 | 3391 | 15523489350 | OD | 10 | 7300 | https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data?select=three_band.zip | |
Kaggle Planet Forest | 2017 | 150000 | 256 | 9830400000 | Class | 12 | 33000 | https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data | |
DOTA v1.0 | 2018 | 2806 | 4000 | 44896000000 | OD | 15 | 18000 | https://captain-whu.github.io/DOTA/index.html | |
DOTA v1.5 | 2019 | 2806 | 4000 | 44896000000 | OD | 16 | 18000 | https://captain-whu.github.io/DOTA/index.html | |
DOTA v2.0 | 2020 | 11268 | 4000 | 1.80288E+11 | OD | 18 | 34280 | https://captain-whu.github.io/DOTA/index.html | |
iSAID | 2020 | 2806 | 4000 | 44896000000 | SemSeg | 15 | 6544 | https://captain-whu.github.io/iSAID/index.html | |
DLR-SkyScapes | 2019 | 16 | 4680 | 336420864 | SemSeg | 31 | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-58694 | ||
VIVID - PETS2005 | 9283 | 640 | 2851737600 | OD | 1 | ||||
RarePlanes | 2020 | 713348 | 512 | 1.87E+11 | OD | 110 | 318000 | https://www.cosmiqworks.org/rareplanes-public-user-guide/ | |
Kaggle Airbus Ship Detection | 2018 | 192556 | 768 | 1.13574E+11 | OD | 30000 | https://www.kaggle.com/c/airbus-ship-detection | ||
DLR-ACD | 2019 | 33 | 4458 | 636542478 | OD | 1 | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-58354 | ||
STGAN Cloud Removal | 2019 | 217190 | 256 | 14233763840 | SemSeg | 1 | 1500 | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BSETKZ | |
WHU Building Dataset | 2018 | 25577 | 512 | 6704857088 | OD | 1 | 25000 | http://gpcv.whu.edu.cn/data/building_dataset.html | |
MtS-WH | 2019 | 190 | 150 | 4275000 | CD | 1 | 459 | http://sigma.whu.edu.cn/newspage.php?q=2019_03_26_ENG | |
Synthetic & Real Dataset | 2018 | 16000 | 256 | 1048576000 | CD | 2700 | https://drive.google.com/file/d/1GX656JqqOyBi_Ef0w65kDGVto-nHrNs9/edit | ||
CloudCast | 2020 | 70080 | 1229 | 99502387200 | Other | 328000 | https://vision.eng.au.dk/cloudcast-dataset/ | ||
SECOND | 2020 | 4662 | 512 | 1222115328 | CD | 30 | 2200 | http://www.captain-whu.com/project/SCD/ | |
LEVIR-CD | 2020 | 637 | 1024 | 667942912 | CD | 1 | 2700 | https://justchenhao.github.io/LEVIR/ | |
COWC | 2016 | 388435 | 256 | 25456476160 | OD | 1 | 64000 | https://gdo152.llnl.gov/cowc/ | |
Hurricane Wind Speed | 2021 | 114634 | 366 | 15355912104 | Other | 1500 | https://www.drivendata.org/competitions/72/predict-wind-speeds/page/274/ | ||
Proba-V Super Resolution | 2018 | 1160 | 384 | 171048960 | SR | 692 | https://kelvins.esa.int/proba-v-super-resolution/data/ | ||
SEN1-2 | 2018 | 282384 | 256 | 18506317824 | SemSeg | 43700 | https://mediatum.ub.tum.de/1520883?show_id=1436631 | ||
So2Sat LCZ42 | 2019 | 400673 | 32 | 410289152 | Class | 51800 | https://mediatum.ub.tum.de/1454690 | ||
Indian Pines | 2015 | 1 | 145 | 21025 | Class | 16 | 6 | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | |
DFC21-MSD | 2021 | ||||||||
DFC21-DSE | 2021 | ||||||||
DFC20 | 2020 | ||||||||
DFC19 | 2019 | ||||||||
DFC18 (Multi-sensor land use land cover classification) | 2018 | 20 | |||||||
DFC17 (Local Climate Zones Classification) | 2017 | ||||||||
DFC 2007 | 2007 | 1 | 787 | 619369 | SemSeg | 19 | http://www.grss-ieee.org/community/technical-committees/data-fusion/ | ||
Salinas | 2015 | 1 | 365 | 111104 | Class | 16 | 27 | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | |
Botswana | 2015 | 1 | 875 | 375000 | Class | 14 | 79 | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | |
SanFrancisco | 2015 | 1 | Class | ||||||
Kennedy Space Center | 2015 | 1 | 550 | 311100 | Class | 13 | 57 | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | |
Pavia Center | 2015 | 1 | 1096 | 1201216 | Class | 9 | 124 | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | |
Pavia University | 2015 | 1 | 610 | 372100 | Class | 9 | 33 | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | |
ISPRS 2D - Potsdam | 2011 | 38 | 6000 | 1368000000 | SemSeg | 6 | 16000 | https://www2.isprs.org/commissions/comm2/wg4/benchmark/2d-sem-label-potsdam/ | |
ISPRS 2D - Vaihingen | 2011 | 33 | 2200 | 156750000 | SemSeg | 6 | 17000 | https://www2.isprs.org/commissions/comm2/wg4/benchmark/2d-sem-label-vaihingen/ | |
GTA Birds Eye View (Surrounding Vehicle Awareness) | 2017 | 1000000 | 1920 | 1920000000 | Other | 350000 | https://github.com/ndrplz/surround_vehicles_awareness | ||
Semantic Drone Dataset | 2019 | 400 | 5000 | 9600000000 | SemSeg | 20 | 4000 | https://www.tugraz.at/index.php?id=22387 | |
Kaggle Cloud Detection | 2019 | 9244 | 1750 | 27177360000 | Class | 4 | 6000 | https://www.kaggle.com/c/understanding_cloud_organization/data | |
Vehicle Detection in Aerial Imagery (VEDAI) | 2015 | 1250 | 1024 | 1310720000 | Class | 9 | 3990 | https://downloads.greyc.fr/vedai/ | |
WAMI DIRSIG | |||||||||
SEN-12-FLOOD | https://ieee-dataport.org/open-access/sen12-flood-sar-and-multispectral-dataset-flood-detection | ||||||||
WHU Cloud Dataset | 2020 | 859 | 512 | 225181696 | OD/SemSeg | 1 | 3650 | http://gpcv.whu.edu.cn/data/WHU_Cloud_Dataset.html | |
WHU MVS/Stereo Dataset | 2020 | 1776 | 5376 | 51328843776 | Other | 98000 | http://gpcv.whu.edu.cn/data/WHU_MVS_Stereo_dataset.html | ||
WHU Multi-View Dataset | 2020 | 28400 | 768 | 8375500800 | Other | 12600 | http://gpcv.whu.edu.cn/data/WHU_MVS_Stereo_dataset.html | ||
WHU Stereo Dataset | 2020 | 21868 | 768 | 6449135616 | Other | 8600 | http://gpcv.whu.edu.cn/data/WHU_MVS_Stereo_dataset.html | ||
Aerial Maritime Drone Dataset | 508 | https://public.roboflow.com/object-detection/aerial-maritime | |||||||
ERA (Event Recognition in Aerial videos) | 2020 | 343680 | 640 | 1.40771E+11 | Other | 6300 | https://lcmou.github.io/ERA_Dataset/ | ||
AU-AIR | 2020 | 32823 | 1920 | 68061772800 | OD | 8 | 2200 | https://bozcani.github.io/auairdataset | |
BIRDSAI | 2020 | 162000 | 640 | 49766400000 | OD | 43200 | https://sites.google.com/view/elizabethbondi/dataset | ||
38-Cloud | 2018 | 17601 | 384 | 2595373056 | OD/SemSeg | 16000 | https://www.kaggle.com/sorour/38cloud-cloud-segmentation-in-satellite-images | ||
95-Cloud | 2020 | 34701 | 384 | 5116870656 | OD/SemSeg | 18000 | https://www.kaggle.com/sorour/95cloud-cloud-segmentation-on-satellite-images/version/1 | ||
MTARSI (Muti-type Aircraft of Remote Sensing Images) | 2019 | 9385 | 256 | 615055360 | OD | 1 | 480 | https://zenodo.figshare.com/articles/dataset/Muti-type_Aircraft_of_Remote_Sensing_Images_MTARSI/11587569 | |
FGSD (Fine-grained Ship Detection Dataset) | 2020 | 4736 | 930 | 4096166400 | OD | 43 | [email protected] | ||
SIMD (Satellite Imagery Multi-vehicles Dataset) | 2020 | 5000 | 1024 | 3932160000 | OD | 15 | https://vision.seecs.edu.pk/simd-project/ | ||
FAIR1M (FinegrAined object recognItion in high-Resolution) | 2020 | 15000 | 5000 | OD | 37 | ||||
Humpback Whale Identification Challenge | 2018 | 25460 | 1050 | 16039800000 | OD/Class | 700 | https://www.kaggle.com/c/whale-categorization-playground/rules | ||
NOAA Fisheries Steller Sea Lion Population Count | 2017 | 950 | 4900 | 15361500000 | OD/Class | 4 | 96000 | https://www.kaggle.com/c/noaa-fisheries-steller-sea-lion-population-count/data | |
OVERHEAD MNIST | 2020 | 1000 | 28 | 784000 | Class | 9 | 17.50 | https://github.com/reveondivad/ov-mnist | |
RSOC (Remote Sensing Object Counting) | 2020 | 3057 | 2500 | 3621481392 | OD/Class | 4 | |||
Olive tree | 10 | 4000 | 120000000 | ||||||
PlanesNet | 2017 | 32000 | 20 | 12800000 | OD | 390 | https://www.kaggle.com/rhammell/planesnet | ||
Iceberg Detection (Statoil/C-CORE Iceberg Classifier Challenge) | 2018 | 10028 | 75 | 56407500 | OD | 290 | https://www.kaggle.com/c/statoil-iceberg-classifier-challenge/data?select=train.json.7z | ||
NaSC-TG2 | 2021 | 20000 | 256 | 1310720000 | Class | 10 | http://www.msadc.cn/jszc_xzq/ | ||
SynthAer | 2018 | 765 | 1280 | 705024000 | SemSeg | 8 | 1000 | https://figshare.com/articles/dataset/SynthAer_-_a_synthetic_dataset_of_semantically_annotated_aerial_images/7083242/1 | |
Synthinel-1 | 2020 | 2108 | 572 | 689703872 | SemSeg | 1 | 1000 | https://drive.google.com/open?id=1T2fO-VLfyQoQdy5C4at_uHkP0KBRZkit | |
VALID | 2020 | 6690 | 1024 | 7014973440 | SemSeg/OD | 30 | 15700 | https://sites.google.com/view/valid-dataset | |
Aerial Image Segmentation Dataset | 2013 | 80 | 512 | 20971520 | SemSeg | 7 | http://jiangyeyuan.com/ASD/Aerial Image Segmentation Dataset.html | ||
Aeroscapes | 2018 | 3269 | 1280 | 3012710400 | SemSeg | 11 | 752 | https://drive.google.com/file/d/1W7yQtrGUnPQ1fB2dPb5wPjrLrlQi395g/view?usp=sharing | |
MLRSNet | 2020 | 109161 | 256 | 7153975296 | Class | 46 | 1254 | https://md-datasets-cache-zipfiles-prod.s3.eu-west-1.amazonaws.com/7j9bv9vwsx-1.zip | |
Kaggle Satellite buildings semantic segmentation | 2020 | 6038 | 256 | 395706368 | SemSeg | 1 | 878 | https://www.kaggle.com/hyyyrwang/buildings-dataset | |
Kaggle Satellite Images of Water Bodies | 2020 | 2841 | 1000 | 2841000000 | SemSeg | 1 | 274 | https://www.kaggle.com/franciscoescobar/satellite-images-of-water-bodies | |
Kaggle Massachusetts Buildings Dataset | 2020 | 151 | 1500 | 339750000 | SemSeg | 1 | 3000 | https://www.kaggle.com/balraj98/massachusetts-buildings-dataset | |
Kaggle Semantic segmentation of aerial imagery | 2020 | 72 | 800 | 46080000 | SemSeg | 6 | 32 | https://www.kaggle.com/humansintheloop/semantic-segmentation-of-aerial-imagery?select=Semantic segmentation dataset | |
USTC_SmokeRS | 2019 | 6225 | 256 | 407961600 | Class | 6 | 795 | http://complex.ustc.edu.cn/2019/0802/c18202a389656/page.htm | |
ALSAT-2B | 2021 | 5518 | 256 | 192114688 | SR | 70 | https://github.com/achrafdjerida/Alsat-2B | ||
ITCVD (Vehicle Detection) | 2018 | 173 | 5616 | 3637550592 | OD | 1 | 1300 | https://eostore.itc.utwente.nl:5001/fsdownload/zZYfgbB2X/ITCVD | |
Zurich Summer Dataset | 2015 | 20 | 1000 | 20000000 | SemSeg | 132 | https://sites.google.com/site/michelevolpiresearch/data/zurich-dataset | ||
ISPRS Aerial Image Segmentation Dataset | 2017 | 21 | 2500 | 131250000 | SemSeg | 23800 | https://zenodo.org/record/1154821#.YIsmuY4zYdV | ||
EvLab-SS Dataset | 2017 | 60 | 4500 | 1215000000 | SemSeg | 11 | http://earthvisionlab.whu.edu.cn/zm/SemanticSegmentation | ||
Gaofen Image Dataset (GID) | 2018 | 150 | 7200 | 7344000000 | Class | 15 | 47000 | https://x-ytong.github.io/project/GID.html | |
CrowdAI Mapping Challenge | 2018 | 401755 | 300 | 36157950000 | SemSeg | 1 | 5335 | https://www.aicrowd.com/challenges/mapping-challenge#datasets | |
Aerial Imagery for Roof Segmentation (AIRS) | 2019 | 1047 | 10000 | 1.047E+11 | SemSeg | 1 | 17600 | https://www.airs-dataset.com/ | |
Massachusetts Roads Dataset | 2013 | 1171 | 1500 | 2634750000 | OD | 7552 | https://www.cs.toronto.edu/~vmnih/data/ | ||
built-structure-count dataset | 2019 | 5364 | 512 | 1406140416 | OD | 2000 | http://im.itu.edu.pk/deepcount/ | ||
OpenSARShip | 11346 | ||||||||
MAritime SATellite Imagery dataset (MASATI) | 2018 | 7389 | 512 | 1936982016 | Class | 7 | 2300 | https://www.iuii.ua.es/datasets/masati/ | |
VisDrone | 2020 | 275437 | 1400 | 4.24173E+11 | OD | 85000 | http://aiskyeye.com/ | ||
SatStereo | 2019 | 144 | 127000 | https://engineering.purdue.edu/RVL/Database/SatStereo/index.html | |||||
Satellite Pose Estimation | 2020 | 15300 | 1920 | 35251200000 | Other | 4600 | https://kelvins.esa.int/ | ||
Kaggle Satellite Images of Hurricane Damage | 2019 | 16000 | 128 | 262144000 | Class | 64 | https://www.kaggle.com/kmader/satellite-images-of-hurricane-damage | ||
MidAir | 2019 | 420000 | 1024 | 4.40402E+11 | Other | 1000000 | https://midair.ulg.ac.be/ | ||
QXS-SAROPT | 2021 | 40000 | 256 | 2621440000 | Other | 2700 | https://github.com/yaoxu008/QXS-SAROPT | ||
oriEnted object detection using Aerial imaGery in real-worLd scEnarios (EAGLE) | 2020 | 8820 | 936 | 7727166720 | OD | ||||
Parking Lot Database (PKLot) | 2015 | 12417 | 1280 | 11443507200 | OD | 1 | 4600 | http://web.inf.ufpr.br/vri/databases/parking-lot-database/ | |
Car Parking Lot Dataset (CARPK) | 2016 | 1448 | 1280 | 1334476800 | OD | 1 | 2000 | https://lafi.github.io/LPN/ | |
Kaggle Find a Car Park | 2019 | 3262 | 1296 | 4109180544 | Class | 2 | 2750 | https://www.kaggle.com/daggysheep/find-a-car-park | |
LEarning, VIsion and Remote (LEVIR) | 2016 | 21952 | 800 | 10536960000 | OD | 3 | |||
AeroRIT | 2019 | 1 | 3900 | 7842675 | SemSeg | 1800 | https://drive.google.com/drive/folders/1yCMqa9uDC_CEGtbnxeWEQCTb-odC2r4c | ||
RIT-18 | 2018 | 1 | 9300 | 52080000 | SemSeg | 1500 | https://github.com/rmkemker/RIT-18 | ||
UAVid | 2020 | 420 | 4000 | 3628800000 | SemSeg | 5880 | https://uavid.nl/ | ||
Aerial Change Detection in Video Games (AICD) | 2018 | 1000 | 800 | 480000000 | CD | 1700 | https://www.kaggle.com/kmader/aerial-change-detection-in-video-games | ||
Sentinel-2 Cloud Mask Catalogue | 2020 | 513 | 1022 | 535820292 | SemSeg/Class | 15380 | https://zenodo.org/record/4172871#.YIvbA44zbIV | ||
LandCoverAI | 2020 | 41 | 9500 | 2979420000 | SemSeg | 1400 | https://landcover.ai.linuxpolska.com/ | ||
Sentinel-2 Cloud Detection (ALCD) | 2019 | 38 | 1830 | 127258200 | OD | 234 | https://zenodo.org/record/1460961#.YIvbAI4zbIV | ||
SPARCS | 2016 | 80 | 1000 | 80000000 | Other | 1400 | https://www.usgs.gov/core-science-systems/nli/landsat/spatial-procedures-automated-removal-cloud-and-shadow-sparcs | ||
Sentinel-2 Multitemporal Cities Pairs | 2020 | 1520 | 600 | 547200000 | CD | 10600 | https://zenodo.org/record/4280482#.YIvbH44zbIV | ||
Hi-UCD | 2020 | 1293 | 1024 | 1355808768 | CD | ||||
GTA-V SID | 2020 | 121 | 500 | 30250000 | SemSeg | 100 | https://github.com/jiupinjia/gtav-sattellite-imagery-dataset | ||
transmission towers and power lines (TTPLA) | 2020 | 1100 | 3840 | 9123840000 | SemSeg | 4200 | https://github.com/r3ab/ttpla_dataset | ||
AerialLanes18 | 2018 | 20 | 5616 | 420526080 | SemSeg | ||||
WHU-Hi | 2020 | 3 | 1217 | 1035251 | SemSeg | 817 | http://rsidea.whu.edu.cn/resource_WHUHi_sharing.htm | ||
MOR-UAV | 2020 | 10948 | 1080 | 12769747200 | SemSeg | https://visionintelligence.github.io/Datasets.html | |||
Sentinel-2 Cloud Detection (WHUS2-CD) | 2021 | 36 | 10980 | 4340174000 | CD | 27800 | https://github.com/Neooolee/WHUS2-CD | ||
CDD (season-varying) | 2018 | 10000 | 256 | CD | 2700 | https://drive.google.com/file/d/1GX656JqqOyBi_Ef0w65kDGVto-nHrNs9 |
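If you want to work with the table programmatically, the rows can be read as pipe-separated text. The sketch below is only an illustration, not tooling that ships with this list: `COLUMNS` and `parse_row` are hypothetical names, and it assumes that a complete row carries exactly nine values, from dataset name to link. Rows with missing values are skipped rather than guessed, because an empty cell cannot be reliably assigned to a column.

```python
# Hypothetical column names for the nine values of a complete row
COLUMNS = [
    "name", "year", "num_images", "image_size", "size_measure",
    "task", "num_classes", "volume_mb", "link",
]


def parse_row(line: str):
    """Return a dict for a complete row, or None if any value is missing."""
    cells = [cell.strip() for cell in line.split("|")]
    values = [cell for cell in cells if cell]  # drop empty cells
    if len(values) != len(COLUMNS):
        return None  # incomplete row: column positions are ambiguous
    return dict(zip(COLUMNS, values))


# Two rows copied from the table above
rows = [
    "AID | 2017 | 10000 | 600 | 3600000000 | Class | 30 | 2440 | https://captain-whu.github.io/AID/ | |",
    "Oil Storage Tanks | 2019 | 10000 | 512 | 2621440000 | OD | 3000 | https://www.kaggle.com/towardsentropy/oil-storage-tanks | ||",
]

parsed = [parse_row(row) for row in rows]
complete = [p for p in parsed if p is not None]

for entry in complete:
    print(entry["name"], entry["task"], entry["volume_mb"])
```

In this example only the AID row is complete; the Oil Storage Tanks row has one value missing and is therefore left out instead of being assigned to the wrong columns.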
Any contributions to expanding this list are welcome. You can suggest your own benchmark datasets, or other published ones, to be added to this list.
If you use this information in your studies, please consider citing:
@article{schmitt2021there,
title={There is no data like more data--current status of machine learning datasets in remote sensing},
author={Schmitt, Michael and Ahmadi, Seyed Ali and H{\"a}nsch, Ronny},
journal={arXiv preprint arXiv:2105.11726},
year={2021}
}
Here I acknowledge some useful lists and pages that can enrich your knowledge of Earth Observation and bring us closer together.
🛰️ List of satellite image training datasets with annotations for computer vision and deep learning
🌟 WOW! Take a look at Robin's awesome page. Almost everything for deep learning in remote sensing.
A curated list of awesome tools, tutorials, code, helpful projects, links, stuff about Earth Observation and Geospatial stuff!
A curated list of awesome tools, tutorials and APIs related to data from the Copernicus Sentinel Satellites.
Remote Sensing is very exciting.
Long list of geospatial analysis tools.
List of 500 geospatial companies & interactive map.
List of datasets, codes, and contests related to remote sensing change detection.
The Top 112 Super Resolution Open Source Projects.