This repository is a fork of https://github.com/langnico/global-canopy-height-model/ and contains the code used to create the results presented in the paper: A high-resolution canopy height model of the Earth. The model estimates canopy top height for Sentinel-2 images

Scicrop/brazil-canopy-height-model

 
 


A high-resolution canopy height model of the Earth optimized for Brazil

This repository is a fork of https://github.com/langnico/global-canopy-height-model/ and contains the code used to create the results presented in the paper: A high-resolution canopy height model of the Earth. The paper develops a model to estimate canopy top height anywhere on Earth. The model estimates canopy top height for every Sentinel-2 image pixel and was trained using sparse GEDI LIDAR data as a reference. In this fork, we fixed some small bugs and added automation for canopy height estimation in Brazilian biomes: you can now choose the area of interest (AOI) for which to predict canopy height. We also added GPU parallelization support for inference.


Table of Contents

  1. Data availability
  2. Installation and credentials
  3. Loading the model
  4. Deploying
  5. Training
  6. ALS preprocessing for independent comparison
  7. Citation

Data availability

This is a summary of all the published data:

Installation and credentials

Please follow the instructions in INSTALL.md.

Loading the model

from gchm.models.xception_sentinel2 import xceptionS2_08blocks_256
# load the model with random initialization
model = xceptionS2_08blocks_256()

Please see the example notebook on how to load the model with the trained weights.

Deploying

This demo shows how to run the trained ensemble to compute the canopy height map for a Sentinel-2 tile (approx. 100 km x 100 km).

Running:

  1. With a Brazilian CAR (Cadastro Ambiental Rural) code:

    cd gchm
    python3 app.py --car {car-code}
    
  2. With a GeoJSON file:

    cd gchm
    python3 app.py --aoi {geojson path}
    
  3. Preparation only:

    cd gchm
    python3 app.py --prepare
    

    This creates the following directories:

    deploy_example/
    ├── ESAworldcover
    │   └── 2020
    │       └── sentinel2_tiles
    │           └── ESA_WorldCover_10m_2020_v100_32TMT.tif
    ├── image_paths
    │   └── 2020
    │       └── 32TMT.txt
    ├── image_paths_logs
    │   
    ├── predictions_provided
    │   
    ├── sentinel2
    │   
    └── sentinel2_aws
    
    trained_models/
    └── GLOBAL_GEDI_2019_2020
        ├── model_0
        │   ├── FT_Lm_SRCB
        │   │   ├── args.json
        │   │   ├── checkpoint.pt
        │   │   ├── train_input_mean.npy
        │   │   ├── train_input_std.npy
        │   │   ├── train_target_mean.npy
        │   │   └── train_target_std.npy
        │   ├── args.json
        │   ├── checkpoint.pt
        │   ├── train_input_mean.npy
        │   ├── train_input_std.npy
        │   ├── train_target_mean.npy
        │   └── train_target_std.npy
        ├── model_1
        │   ├── ...
        ├── model_2
        │   ├── ...
        ├── model_3
        │   ├── ...
        ├── model_4
        │   ├── ...
    

    The checkpoint.pt files contain the model weights. The subdirectories FT_Lm_SRCB contain the models finetuned with a re-weighted loss function.
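As a rough illustration of the layout above, the following sketch collects the checkpoint path of each of the five ensemble members. The helper name `ensemble_checkpoints` and its `finetuned` flag are hypothetical and not part of the repository; only the directory and file names are taken from the tree shown above.

```python
from pathlib import Path


def ensemble_checkpoints(root, num_models=5, finetuned=True):
    """Return the checkpoint.pt path for each ensemble member.

    Hypothetical helper: only the directory names come from the tree above.
    """
    base = Path(root) / "GLOBAL_GEDI_2019_2020"
    paths = []
    for i in range(num_models):
        model_dir = base / f"model_{i}"
        if finetuned:
            # FT_Lm_SRCB holds the model fine-tuned with the re-weighted loss
            model_dir = model_dir / "FT_Lm_SRCB"
        paths.append(model_dir / "checkpoint.pt")
    return paths


if __name__ == "__main__":
    # Report which of the expected checkpoint files are present locally
    for path in ensemble_checkpoints("trained_models"):
        print(path, "(found)" if path.exists() else "(missing)")
```

Running the file prints one line per ensemble member, which can serve as a quick check that the preparation step downloaded everything.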

Deploy and merge example for multiple images of a Sentinel-2 tile

This demo script processes 10 images (from the year 2020) for the tile "32TMT" in Switzerland and aggregates the individual per-image maps to a final annual map.
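To make the aggregation step concrete, here is a minimal sketch of one simple way to combine several per-image maps into a single map: a per-pixel median that ignores missing values. This is an illustrative assumption and not necessarily the merging rule used by the repository's scripts. The function name `merge_annual` and the use of NaN for pixels without a valid prediction are also assumptions.

```python
import numpy as np


def merge_annual(per_image_maps):
    """Per-pixel median over a stack of per-image height maps.

    NaN marks pixels without a valid prediction (e.g. clouds); this
    convention is an assumption made for the sketch.
    """
    stack = np.stack(per_image_maps, axis=0)  # shape: (n_images, H, W)
    merged = np.full(stack.shape[1:], np.nan)
    # Only reduce pixels that have at least one valid value, so that
    # all-NaN pixels stay NaN without triggering a NumPy warning.
    has_data = (~np.isnan(stack)).any(axis=0)
    merged[has_data] = np.nanmedian(stack[:, has_data], axis=0)
    return merged


if __name__ == "__main__":
    nan = np.nan
    maps = [
        np.array([[10.0, nan], [nan, nan]]),
        np.array([[20.0, 5.0], [nan, nan]]),
        np.array([[30.0, 7.0], [nan, nan]]),
    ]
    print(merge_annual(maps))
```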

For each tile, provide a text file listing the image filenames, saved as ${TILE_NAME}.txt. The demo data contains the following file:

cat ./deploy_example/image_paths/2020/32TMT.txt 
S2A_MSIL2A_20200623T103031_N0214_R108_T32TMT_20200623T142851.zip
S2A_MSIL2A_20200723T103031_N0214_R108_T32TMT_20200723T142801.zip
S2A_MSIL2A_20200812T103031_N0214_R108_T32TMT_20200812T131334.zip
...

The corresponding images are stored in ./deploy_example/sentinel2/2020/.
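The filenames in the list follow the standard Sentinel-2 product naming convention, so the tile and sensing time can be read directly from each name. The sketch below shows one way to parse them, for example to check that every entry in a ${TILE_NAME}.txt file belongs to the expected tile. The function `parse_s2_name` is hypothetical and not part of the repository.

```python
import re
from datetime import datetime

# Sentinel-2 product names:
# MMM_MSIXXX_YYYYMMDDTHHMMSS_Nxxyy_ROOO_Txxxxx_<discriminator>.<ext>
S2_NAME = re.compile(
    r"^(?P<mission>S2[A-C])_(?P<level>MSIL(?:1C|2A))_"
    r"(?P<sensing>\d{8}T\d{6})_N(?P<baseline>\d{4})_"
    r"R(?P<orbit>\d{3})_T(?P<tile>[0-9A-Z]{5})_"
    r"(?P<discriminator>\d{8}T\d{6})\.(?:zip|SAFE)$"
)


def parse_s2_name(filename):
    """Split a Sentinel-2 product filename into its named fields."""
    match = S2_NAME.match(filename.strip())
    if match is None:
        raise ValueError(f"not a Sentinel-2 product name: {filename!r}")
    fields = match.groupdict()
    # Convert the sensing start time into a datetime for easy sorting
    fields["sensing"] = datetime.strptime(fields["sensing"], "%Y%m%dT%H%M%S")
    return fields


if __name__ == "__main__":
    name = "S2A_MSIL2A_20200623T103031_N0214_R108_T32TMT_20200623T142851.zip"
    info = parse_s2_name(name)
    print(info["tile"], info["sensing"].date())
```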

  1. Set the paths in gchm/bash/config.sh
  2. Set the tile_name in gchm/bash/run_tile_deploy_merge.sh
  3. Run the script:
    bash gchm/bash/run_tile_deploy_merge.sh
    

Note on ESA World Cover post-processing:

The ESA WorldCover 10 m 2020 v100 reprojected to Sentinel-2 tiles is available on Zenodo. We apply minimal post-processing and mask out built-up areas, snow, ice and permanent water bodies, setting their canopy height to "no data" (value: 255). See the script here.
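The masking described above can be sketched as follows. The class codes are those of the ESA WorldCover legend (50 built-up, 70 snow and ice, 80 permanent water bodies). The function name `mask_with_worldcover` is hypothetical, and the sketch assumes both rasters are already aligned on the same Sentinel-2 tile grid; it is not the repository's script.

```python
import numpy as np

# ESA WorldCover legend codes for the classes that are masked out
BUILT_UP = 50
SNOW_AND_ICE = 70
PERMANENT_WATER = 80
NO_DATA = 255


def mask_with_worldcover(canopy_height, worldcover, no_data=NO_DATA):
    """Set canopy height to no_data where WorldCover has a masked class.

    Assumes both arrays share the same grid and shape.
    """
    canopy_height = np.asarray(canopy_height)
    worldcover = np.asarray(worldcover)
    if canopy_height.shape != worldcover.shape:
        raise ValueError("canopy height and WorldCover shapes differ")
    masked = canopy_height.copy()  # leave the input untouched
    excluded = np.isin(worldcover, [BUILT_UP, SNOW_AND_ICE, PERMANENT_WATER])
    masked[excluded] = no_data
    return masked


if __name__ == "__main__":
    height = np.array([[12, 30], [3, 0]], dtype=np.uint8)
    cover = np.array([[10, 50], [80, 30]], dtype=np.uint8)
    print(mask_with_worldcover(height, cover))
```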

Note on AWS:

Sentinel-2 images can be downloaded on the fly from AWS S3 by setting GCHM_DOWNLOAD_FROM_AWS="True" and providing the AWS credentials as described above. This was tested with 2020 data; the sentinelhub routine might need updating to handle newer versions.

Training

Data preparation

  1. Download the train-val h5 datasets from here.
  2. Merge the part files into a single train.h5 and val.h5 by running this script. Before running it, set the variables in_h5_dir_parts and out_h5_dir to your paths. Then run:
    bash gchm/preprocess/run_merge_h5_files_per_split.sh
    

Running the training script

A Slurm training script is provided and can be submitted as follows. Before submitting, set the variable CODE_PATH at the top of the script and set the paths in gchm/bash/config.sh. Then run:

sbatch < gchm/bash/run_training.sh

ALS preprocessing for independent comparison

Where rasterized high-resolution canopy height models are available for independent evaluation (e.g. from airborne LIDAR campaigns), some preprocessing is required to make them comparable to GEDI canopy top height estimates, which correspond to the canopy top within a 25 m footprint.

  1. Create a rasterized canopy height model with a 1 m GSD (e.g. using gdalwarp).
  2. Process the 1 m canopy height model with a circular max pooling operation to approximate "GEDI-like" canopy top heights. This step is provided as a PyTorch implementation.

Example: Download the example CHM at 1m GSD from here. Then run:

python3 gchm/preprocess/ALS_maxpool_GEDI_footprint.py "path/to/input/tif" "path/to/output/tif"
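For intuition, the circular max pooling step can be sketched in plain NumPy. This is not the repository's PyTorch implementation. The function name `circular_max_pool`, the way the disc is discretized, and the default of 25 pixels (a 25 m footprint at 1 m GSD) are assumptions made for illustration.

```python
import numpy as np


def circular_max_pool(chm, diameter_px=25):
    """Maximum over a disc-shaped neighbourhood around every pixel.

    Assumes the input contains no NaN values; pixels outside the raster
    are ignored by padding with -inf.
    """
    chm = np.asarray(chm, dtype=float)
    radius = diameter_px / 2.0
    r = int(np.floor(radius))
    height, width = chm.shape
    padded = np.pad(chm, r, mode="constant", constant_values=-np.inf)
    out = np.full(chm.shape, -np.inf)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy * dy + dx * dx > radius * radius:
                continue  # this offset lies outside the disc
            # View of the raster shifted by (dy, dx)
            window = padded[r + dy : r + dy + height, r + dx : r + dx + width]
            np.maximum(out, window, out=out)
    return out


if __name__ == "__main__":
    chm = np.zeros((7, 7))
    chm[3, 3] = 9.0  # a single tall tree in the centre
    print(circular_max_pool(chm, diameter_px=5))
```

The loop visits each offset inside the disc once, so it stays readable but is slow for large rasters; a convolution-style implementation on the GPU, as in the provided script, is the better choice for full tiles.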

Citation

Please cite the original authors' paper if you use this code or any of the provided data.

Lang, N., Jetz, W., Schindler, K., & Wegner, J. D. (2023). A high-resolution canopy height model of the Earth. Nature Ecology & Evolution, 1-12.

@article{lang2023high,
  title={A high-resolution canopy height model of the Earth},
  author={Lang, Nico and Jetz, Walter and Schindler, Konrad and Wegner, Jan Dirk},
  journal={Nature Ecology \& Evolution},
  pages={1--12},
  year={2023},
  publisher={Nature Publishing Group UK London}
}
