Command-Line Interface (CLI)

Basics

geoextent can be called on the command line as follows:

usage: geoextent [-h] [--formats] [--list-features] [--version] [--debug] [--details] [--output] [output file] [--join] [-b] [-t] [--convex-hull] [--no-download-data] [--no-metadata-fallback] [--no-progress] [--quiet] [--format {geojson,wkt,wkb}] [--no-subdirs] [--geojsonio] [--browse] [--placename] [--placename-service GAZETTEER] [--placename-escape] [--max-download-size SIZE] [--max-download-method {ordered,random,smallest,largest}] [--max-download-method-seed SEED] [--download-skip-nogeo] [--download-skip-nogeo-exts EXTS] [--max-download-workers WORKERS] [--keep-files] [--assume-wgs84] input1 [input2 ...]
files

input file, directory, DOI, or repository URL (supports multiple inputs including mixed types)

-h, --help

show help message and exit

--formats

show supported formats

--list-features

output machine-readable JSON with all supported file formats and content providers

--version

show installed version

--debug

turn on debug logging, alternatively set environment variable GEOEXTENT_DEBUG=1

--details

Returns details of folder/zipFiles geoextent extraction

--output <output>

Export results to a file. Format is auto-detected from extension: .gpkg (GeoPackage), .geojson/.json (GeoJSON), .csv (CSV). Works with single files, directories, and remote sources.

-b, --bounding-box

extract spatial extent (bounding box)

-t, --time-box

extract temporal extent (%Y-%m-%d)

--time-format <format>

output format for temporal extents. Presets: ‘date’ (%Y-%m-%d, default), ‘iso8601’ (%Y-%m-%dT%H:%M:%SZ). Also accepts strftime format strings (e.g. ‘%Y/%m/%d %H:%M’).

--convex-hull

extract convex hull instead of bounding box for vector geometries

--no-download-data

for repositories: disable downloading data files and use metadata only (not recommended for most providers)

--metadata-first

try metadata-only extraction first, fall back to data download if metadata yields no results (mutually exclusive with --no-download-data)

--no-metadata-fallback

disable automatic metadata fallback when data download yields no files (by default, geoextent falls back to metadata-only extraction if data files are unavailable and the provider supports metadata)

--no-follow

disable following external DOIs/URLs to other providers (e.g., DEIMS-SDR datasets referencing Zenodo). By default, geoextent follows these references to extract actual data extents.

--no-progress

disable progress bars during download and extraction

--quiet

suppress all console messages including warnings, progress bars, map preview messages, and terminal display (--map FILE still saves the image silently)

--format {geojson,wkt,wkb}

output format for spatial extents (default: geojson)

--no-subdirs

only process files in the top-level directory, ignore subdirectories

--geojsonio

generate and print a clickable geojson.io URL for the extracted spatial extent

--browse

open the geojson.io URL in the default web browser (use with --geojsonio to also print URL)

--map <file>

save a map preview image of the spatial extent as PNG. If FILE is given, saves to that path; otherwise saves to a temporary file. (requires: pip install geoextent[preview])

--preview

display a map preview of the spatial extent in the terminal (requires: pip install geoextent[preview])

--map-dim <wxh>

dimensions of the map preview image in pixels (default: 600x400)

--no-metadata

exclude extraction metadata and statistics from GeoJSON output

--max-download-size <max_download_size>

maximum download size limit (e.g. ‘100MB’, ‘2GB’). Uses filesizelib for parsing.

--max-download-method {ordered,random,smallest,largest}

method for selecting files when size limit is exceeded: ‘ordered’ (as returned by provider), ‘random’, ‘smallest’ (smallest files first), ‘largest’ (largest files first) (default: ordered)

--max-download-method-seed <max_download_method_seed>

seed for random file selection when using --max-download-method random (default: 42)

--placename

enable placename lookup using default gazetteer (geonames). Use --placename-service to specify a different gazetteer

--placename-service {geonames,nominatim,photon}

specify gazetteer service for placename lookup (requires --placename)

--placename-escape

escape Unicode characters in placename output (requires --placename)

--ext-metadata

retrieve external metadata for DOIs (title, authors, publisher, publication year, URL, license) from CrossRef and DataCite

--ext-metadata-method {auto,all,crossref,datacite}

method for retrieving external metadata: ‘auto’ (try CrossRef first, then DataCite), ‘all’ (query all sources), ‘crossref’ (CrossRef only), ‘datacite’ (DataCite only) (default: auto)

--download-skip-nogeo

skip downloading files that don’t appear to contain geospatial data (e.g., PDFs, images, plain text)

--download-skip-nogeo-exts <download_skip_nogeo_exts>

comma-separated list of additional file extensions to consider as geospatial (e.g., ‘.xyz,.las,.ply’)

--max-download-workers <max_download_workers>

maximum number of parallel downloads (default: 4, set to 1 to disable parallel downloads)

--keep-files

keep downloaded and extracted files instead of cleaning them up (for debugging purposes)

--legacy

use traditional GIS coordinate order (longitude, latitude) instead of EPSG:4326 native order (latitude, longitude)

--assume-wgs84

assume WGS84 (EPSG:4326) for raster files without projection information (e.g., world files without .prj). By default, ungeoreferenced rasters are skipped.

-p <workers>, --parallel <workers>

enable parallel file extraction within directories. Without a number, uses all available CPU cores. Specify a number (e.g., -p 4) to set worker count. Default: sequential processing.

--join

Join multiple exported files (from --output) into a single file. Requires --output to specify the destination.

Examples

Note

Depending on the local configuration, geoextent might need to be called with the python interpreter prepended:

python -m geoextent …

Show help message

geoextent -h

geoextent is a Python library for extracting geospatial and temporal extents of a file
 or a directory of multiple geospatial data formats.

usage: geoextent [-h] [--formats] [--list-features] [--version] [--debug] [--details] [--output] [output file] [--join] [-b] [-t] [--convex-hull] [--no-download-data] [--no-metadata-fallback] [--no-progress] [--quiet] [--format {geojson,wkt,wkb}] [--no-subdirs] [--geojsonio] [--browse] [--placename] [--placename-service GAZETTEER] [--placename-escape] [--max-download-size SIZE] [--max-download-method {ordered,random,smallest,largest}] [--max-download-method-seed SEED] [--download-skip-nogeo] [--download-skip-nogeo-exts EXTS] [--max-download-workers WORKERS] [--keep-files] [--assume-wgs84] input1 [input2 ...]

positional arguments:
  files                 input file, directory, DOI, or repository URL
                        (supports multiple inputs including mixed types)

options:
  -h, --help            show help message and exit
  --formats             show supported formats
  --list-features       output machine-readable JSON with all supported file
                        formats and content providers
  --version             show installed version
  --debug               turn on debug logging, alternatively set environment
                        variable GEOEXTENT_DEBUG=1
  --details             Returns details of folder/zipFiles geoextent
                        extraction
  --output OUTPUT       Export results to a file. Format is auto-detected from
                        extension: .gpkg (GeoPackage), .geojson/.json
                        (GeoJSON), .csv (CSV). Works with single files,
                        directories, and remote sources.
  -b, --bounding-box    extract spatial extent (bounding box)
  -t, --time-box        extract temporal extent (%Y-%m-%d)
  --time-format FORMAT  output format for temporal extents. Presets: 'date'
                        (%Y-%m-%d, default), 'iso8601' (%Y-%m-%dT%H:%M:%SZ).
                        Also accepts strftime format strings (e.g. '%Y/%m/%d
                        %H:%M').
  --convex-hull         extract convex hull instead of bounding box for vector
                        geometries
  --no-download-data    for repositories: disable downloading data files and
                        use metadata only (not recommended for most providers)
  --metadata-first      try metadata-only extraction first, fall back to data
                        download if metadata yields no results (mutually
                        exclusive with --no-download-data)
  --no-metadata-fallback
                        disable automatic metadata fallback when data download
                        yields no files (by default, geoextent falls back to
                        metadata-only extraction if data files are unavailable
                        and the provider supports metadata)
  --no-follow           disable following external DOIs/URLs to other
                        providers (e.g., DEIMS-SDR datasets referencing
                        Zenodo). By default, geoextent follows these
                        references to extract actual data extents.
  --no-progress         disable progress bars during download and extraction
  --quiet               suppress all console messages including warnings,
                        progress bars, map preview messages, and terminal
                        display (--map FILE still saves the image silently)
  --format {geojson,wkt,wkb}
                        output format for spatial extents (default: geojson)
  --no-subdirs          only process files in the top-level directory, ignore
                        subdirectories
  --geojsonio           generate and print a clickable geojson.io URL for the
                        extracted spatial extent
  --browse              open the geojson.io URL in the default web browser
                        (use with --geojsonio to also print URL)
  --map [FILE]          save a map preview image of the spatial extent as PNG.
                        If FILE is given, saves to that path; otherwise saves
                        to a temporary file. (requires: pip install
                        geoextent[preview])
  --preview             display a map preview of the spatial extent in the
                        terminal (requires: pip install geoextent[preview])
  --map-dim WxH         dimensions of the map preview image in pixels
                        (default: 600x400)
  --no-metadata         exclude extraction metadata and statistics from
                        GeoJSON output
  --max-download-size MAX_DOWNLOAD_SIZE
                        maximum download size limit (e.g. '100MB', '2GB').
                        Uses filesizelib for parsing.
  --max-download-method {ordered,random,smallest,largest}
                        method for selecting files when size limit is
                        exceeded: 'ordered' (as returned by provider),
                        'random', 'smallest' (smallest files first), 'largest'
                        (largest files first) (default: ordered)
  --max-download-method-seed MAX_DOWNLOAD_METHOD_SEED
                        seed for random file selection when using --max-
                        download-method random (default: 42)
  --placename           enable placename lookup using default gazetteer
                        (geonames). Use --placename-service to specify a
                        different gazetteer
  --placename-service GAZETTEER
                        specify gazetteer service for placename lookup
                        (requires --placename)
  --placename-escape    escape Unicode characters in placename output
                        (requires --placename)
  --ext-metadata        retrieve external metadata for DOIs (title, authors,
                        publisher, publication year, URL, license) from
                        CrossRef and DataCite
  --ext-metadata-method {auto,all,crossref,datacite}
                        method for retrieving external metadata: 'auto' (try
                        CrossRef first, then DataCite), 'all' (query all
                        sources), 'crossref' (CrossRef only), 'datacite'
                        (DataCite only) (default: auto)
  --download-skip-nogeo
                        skip downloading files that don't appear to contain
                        geospatial data (e.g., PDFs, images, plain text)
  --download-skip-nogeo-exts DOWNLOAD_SKIP_NOGEO_EXTS
                        comma-separated list of additional file extensions to
                        consider as geospatial (e.g., '.xyz,.las,.ply')
  --max-download-workers MAX_DOWNLOAD_WORKERS
                        maximum number of parallel downloads (default: 4, set
                        to 1 to disable parallel downloads)
  --keep-files          keep downloaded and extracted files instead of
                        cleaning them up (for debugging purposes)
  --legacy              use traditional GIS coordinate order (longitude,
                        latitude) instead of EPSG:4326 native order (latitude,
                        longitude)
  --assume-wgs84        assume WGS84 (EPSG:4326) for raster files without
                        projection information (e.g., world files without
                        .prj). By default, ungeoreferenced rasters are
                        skipped.
  -p [WORKERS], --parallel [WORKERS]
                        enable parallel file extraction within directories.
                        Without a number, uses all available CPU cores.
                        Specify a number (e.g., -p 4) to set worker count.
                        Default: sequential processing.
  --join                Join multiple exported files (from --output) into a
                        single file. Requires --output to specify the
                        destination.


Examples:

geoextent -b path/to/directory_with_geospatial_data
geoextent -t path/to/file_with_temporal_extent
geoextent -b -t path/to/geospatial_files
geoextent -b -t --details path/to/zipfile_with_geospatial_data
geoextent -b -t file1.shp file2.csv file3.geopkg
geoextent -b -t --geojsonio --no-download-data 10.25928/HK1000
geoextent -t *.geojson
geoextent -b -t https://doi.org/10.1594/PANGAEA.918707 https://doi.pangaea.de/10.1594/PANGAEA.858767
geoextent -b --convex-hull https://zenodo.org/record/4567890 10.1594/PANGAEA.123456
geoextent -b --placename file.geojson
geoextent -b --placename --placename-service nominatim https://zenodo.org/record/123456
geoextent -b --placename --placename-service photon --placename-escape https://doi.org/10.3897/BDJ.13.e159973


Supported formats:
- CSV (comma-separated values) (.csv, .txt)
- Vector data (.shp, .shx, .dbf, .prj, .geojson, .json, .gpkg, .gdb, .gpx, .kml, .kmz, .gml, .fgb)
- Raster data (.tif, .tiff, .geotiff, .nc, .netcdf, .asc, .wld, .jgw, .pgw, .pngw, .tfw, .tifw, .bpw, .gfw)
- Point cloud data (.las, .laz)

Supported data repositories:
- Wikidata (wikidata.org)
- Dryad (datadryad.org)
- 4TU.ResearchData (data.4tu.nl)
- Figshare (figshare.com)
- Zenodo (zenodo.org)
- InvenioRDM (inveniosoftware.org/products/rdm)
- Pangaea (pangaea.de)
- OSF (osf.io)
- Dataverse (dataverse.org)
- GFZ (dataservices.gfz-potsdam.de)
- RADAR (radar-service.eu)
- Arctic Data Center (arcticdata.io)
- DataONE (dataone.org)
- GBIF (gbif.org)
- Pensoft (pensoft.net)
- BGR (geoportal.bgr.de)
- BAW (datenrepository.baw.de)
- MDI-DE (mdi-de.org)
- GDI-DE (geoportal.de)
- Opara (opara.zih.tu-dresden.de)
- Senckenberg (dataportal.senckenberg.de)
- CKAN (ckan.org)
- Mendeley Data (data.mendeley.com)
- DEIMS-SDR (deims.org)
- NFDI4Earth (onestop4all.nfdi4earth.de)
- HALO DB (halo-db.pa.op.dlr.de)
- SEANOE (seanoe.org)
- GeoScienceWorld (pubs.geoscienceworld.org)
- UKCEH (catalogue.ceh.ac.uk)
- STAC (stacspec.org)
- GitHub (github.com)
- GitLab (gitlab.com)
- Forgejo (codeberg.org)
- Software Heritage (softwareheritage.org)
- Remote Raster (COG) (cogeo.org)

Extract bounding box from a single file

Note

The file used in the examples of this section is available at muenster_ring_zeit. To view a rendering of the file contents, see rendered blob.

geoextent -b muenster_ring_zeit.geojson

Output:

{'format': 'geojson',
 'geoextent_handler': 'handle_vector',
 'bbox': [51.94881477206191,
  7.6016807556152335,
  51.974624029877454,
  7.647256851196289],
 'crs': '4326',
 'file_size_bytes': 1695}
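
The bbox above follows the EPSG:4326 native axis order (latitude, longitude) described under --legacy, so it reads [min lat, min lon, max lat, max lon]. GeoJSON expects longitude first. The following Python sketch shows one way to turn such a list into a GeoJSON polygon. The function name bbox_to_geojson_polygon is hypothetical and not part of geoextent.

```python
import json

def bbox_to_geojson_polygon(bbox):
    """Convert [min_lat, min_lon, max_lat, max_lon] to a GeoJSON Polygon.

    Hypothetical helper for illustration; not a geoextent function.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    # GeoJSON positions are [longitude, latitude]; the ring must be closed.
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],
    ]
    return {"type": "Polygon", "coordinates": [ring]}

# bbox copied from the output above
bbox = [51.94881477206191, 7.6016807556152335, 51.974624029877454, 7.647256851196289]
polygon = bbox_to_geojson_polygon(bbox)
print(json.dumps(polygon))
```

The ring is listed counter-clockwise and repeats its first position at the end, as GeoJSON requires for exterior rings.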

Extract time interval from a single file

Note

The file used in the examples of this section is available at muenster_ring_zeit. To view a rendering of the file contents, see rendered blob.

geoextent -t muenster_ring_zeit.geojson

Output:

{'format': 'geojson',
 'geoextent_handler': 'handle_vector',
 'tbox': ['2018-11-14', '2018-11-14'],
 'file_size_bytes': 1695}
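
The tbox above uses the default 'date' preset of --time-format. As a rough illustration of how the two presets and a custom strftime string differ, the following Python sketch formats one timestamp with each pattern. The format strings are taken from the option description. The timestamp, including its time of day, is made up for illustration.

```python
from datetime import datetime, timezone

# Format strings as listed in the --time-format description
PRESETS = {
    "date": "%Y-%m-%d",
    "iso8601": "%Y-%m-%dT%H:%M:%SZ",
}

# Illustrative timestamp; only the date matches the example file
moment = datetime(2018, 11, 14, 9, 30, 0, tzinfo=timezone.utc)

formatted = {name: moment.strftime(pattern) for name, pattern in PRESETS.items()}
# Custom strftime string, as in the example given for --time-format
formatted["custom"] = moment.strftime("%Y/%m/%d %H:%M")

for name, text in formatted.items():
    print(f"{name}: {text}")
```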

Extract both bounding box and time interval from a single file

Note

The file used in the examples of this section is available at muenster_ring_zeit. To view a rendering of the file contents, see rendered blob.

geoextent -b -t muenster_ring_zeit.geojson

{'format': 'geojson',
 'geoextent_handler': 'handle_vector',
 'bbox': [51.94881477206191,
  7.6016807556152335,
  51.974624029877454,
  7.647256851196289],
 'crs': '4326',
 'tbox': ['2018-11-14', '2018-11-14'],
 'file_size_bytes': 1695}

Folders or ZIP file(s)

Geoextent also supports queries for multiple files inside folders or ZIP file(s).

Extract both bounding box and time interval from a folder or zipfile

geoextent -b -t folder_two_files

{'format': 'folder',
 'crs': '4326',
 'bbox': [41.31703852240476,
  2.052333387639205,
  51.974624029877454,
  7.647256851196289],
 'tbox': ['2018-11-14', '2019-09-11']}

The output is the combined bbox or tbox obtained by merging the results of all individual files (see: Supported file formats) inside the folder or ZIP file. The coordinate reference system (CRS) of the combined bbox is always EPSG:4326.
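
Conceptually, merging reduces to taking the minimum and maximum over all per-file extents once they share one CRS. The Python sketch below illustrates this idea only. It is not geoextent's implementation, the helper names are hypothetical, and the second file's upper corner and start date are placeholders invented for illustration.

```python
def merge_bboxes(bboxes):
    """Merge [min_lat, min_lon, max_lat, max_lon] boxes into one enclosing box."""
    return [
        min(b[0] for b in bboxes),
        min(b[1] for b in bboxes),
        max(b[2] for b in bboxes),
        max(b[3] for b in bboxes),
    ]

def merge_tboxes(tboxes):
    """Merge [start, end] pairs; YYYY-MM-DD strings sort chronologically."""
    return [min(t[0] for t in tboxes), max(t[1] for t in tboxes)]

# Values copied from the single-file example
muenster = {
    "bbox": [51.94881477206191, 7.6016807556152335, 51.974624029877454, 7.647256851196289],
    "tbox": ["2018-11-14", "2018-11-14"],
}
# Placeholder second file: the upper corner and the start date are invented
other = {
    "bbox": [41.31703852240476, 2.052333387639205, 41.47, 2.23],
    "tbox": ["2019-09-11", "2019-09-11"],
}

merged_bbox = merge_bboxes([muenster["bbox"], other["bbox"]])
merged_tbox = merge_tboxes([muenster["tbox"], other["tbox"]])
print(merged_bbox, merged_tbox)
```

This simple min/max approach ignores special cases such as extents that cross the antimeridian.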

Multiple Inputs

Geoextent supports processing multiple files and/or directories in a single command. Results are merged into a single spatial and temporal extent.

Extract merged bounding box from multiple files

geoextent -b file1.geojson file2.csv file3.gpkg

Extract merged extent from files and directories

geoextent -b -t tests/testdata/geojson/muenster_ring_zeit.geojson tests/testdata/folders/folder_two_files

Extract convex hull from multiple files

geoextent -b --convex-hull tests/testdata/geojson/muenster_ring_zeit.geojson tests/testdata/folders/folder_two_files/districtes.geojson tests/testdata/csv/cities_NL.csv

Use --details to see per-file results alongside the merged extent:

geoextent -b -t --details tests/testdata/geojson/muenster_ring_zeit.geojson tests/testdata/csv/cities_NL.csv

Remote Repositories

Geoextent supports extracting geospatial extent from multiple research data repositories including Zenodo, PANGAEA, OSF, Figshare, Dryad, GFZ Data Services, RADAR, Arctic Data Center, 4TU.ResearchData, B2SHARE, BAW, MDI-DE, GDI-DE, DEIMS-SDR, NFDI4Earth, GBIF, Dataverse, Pensoft, and GitHub repositories.

Extract from Zenodo

geoextent -b -t https://doi.org/10.5281/zenodo.4593540

Extract from PANGAEA

geoextent -b -t https://doi.org/10.1594/PANGAEA.734969

Extract from OSF

geoextent -b -t https://doi.org/10.17605/OSF.IO/4XE6Z
geoextent -b -t OSF.IO/4XE6Z

Extract from GFZ Data Services

geoextent -b -t 10.5880/GFZ.4.8.2023.004

Extract from RADAR

geoextent -b -t 10.35097/tvn5vujqfvf99f32

Extract from Arctic Data Center

geoextent -b -t 10.18739/A2Z892H2J

Extract from Arctic Data Center (metadata only)

geoextent -b --no-download-data 10.18739/A2Z892H2J

Extract from 4TU.ResearchData

geoextent -b -t https://data.4tu.nl/articles/_/12707150/1

Extract from 4TU.ResearchData (metadata only)

geoextent -b --no-download-data https://data.4tu.nl/articles/_/12707150/1

Extract from BAW-Datenrepository (landing page URL)

geoextent -b -t --no-download-data https://datenrepository.baw.de/trefferanzeige?docuuid=40936F66-3DD8-43D0-99AE-7CA5EF2E1287

Extract from BAW-Datenrepository (DOI, small measurement site)

geoextent -b -t --no-download-data 10.48437/02.2023.K.0601.0001

Extract from BAW-Datenrepository (DOI, sedimentology dataset)

geoextent -b -t --no-download-data 10.48437/929835b7fca4

Extract from B2SHARE (Place Names in Tainan, 647KB)

geoextent -b -t https://b2share.eudat.eu/records/a096d-k2g86

Extract from B2SHARE (Migda Soil Moisture, GeoPackage)

geoextent -b -t 10.23728/b2share.3d918bf3c1f94c3d8d8e29958ed763a9

Extract from B2SHARE (Hainich GPP, with 20MB size limit)

geoextent -b -t --max-download-size 20MB 10.23728/b2share.26jnj-a4x24

Extract from MDI-DE (metadata only)

geoextent -b -t --no-download-data https://nokis.mdi-de-dienste.org/trefferanzeige?docuuid=00100e9d-7838-4563-9dd7-2570b0d932cb

Extract from MDI-DE (direct download)

geoextent -b -t https://nokis.mdi-de-dienste.org/trefferanzeige?docuuid=00100e9d-7838-4563-9dd7-2570b0d932cb

Extract from MDI-DE (WFS download, bare UUID)

geoextent -b -t c7d748c9-e12f-4038-a556-b1698eb4033e

Extract from GDI-DE (metadata only, geoportal.de URL)

geoextent -b -t --no-download-data https://www.geoportal.de/Metadata/75987CE0-AA66-4445-AC44-068B98390E89

Extract from GDI-DE (metadata only, bare UUID)

geoextent -b -t --no-download-data cdb2c209-7e08-4f4c-b500-69de926e3023

Extract from DEIMS-SDR (dataset)

geoextent -b -t https://deims.org/dataset/3d87da8b-2b07-41c7-bf05-417832de4fa2

Extract from DEIMS-SDR (site)

geoextent -b https://deims.org/8eda49e9-1f4e-4f3e-b58e-e0bb25dc32a6

Extract from GBIF (metadata only, by DOI)

geoextent -b -t --no-download-data 10.15468/6bleia

Extract from GBIF (metadata only, by dataset URL)

geoextent -b --no-download-data https://www.gbif.org/dataset/378651d7-c235-4205-a617-2939d6faa434

Extract from GBIF (DwC-A data download)

geoextent -b -t 10.15468/6bleia

Extract from GBIF with geojson.io preview

geoextent -b --geojsonio --no-download-data 10.15472/lavgys

Extract from SEANOE (metadata only, French Mediterranean CTD)

geoextent -b -t --no-download-data 10.17882/105467

Extract from SEANOE (data download, Ireland coastline)

geoextent -b 10.17882/109463

Extract from SEANOE (whale biologging with geojson.io preview)

geoextent -b -t --geojsonio --no-download-data 10.17882/112127

Extract from DEIMS-SDR without following external references

By default, DEIMS-SDR datasets that reference external repositories (e.g., Zenodo, PANGAEA) are followed for actual data extent extraction. Use --no-follow to disable this and use DEIMS metadata only:

geoextent -b -t --no-follow https://deims.org/dataset/3d87da8b-2b07-41c7-bf05-417832de4fa2

Extract from NFDI4Earth Knowledge Hub (OneStop4All URL)

geoextent -b -t https://onestop4all.nfdi4earth.de/result/dthb-7b3bddd5af4945c2ac508a6d25537f0a/

Extract from NFDI4Earth Knowledge Hub (Cordra URL)

geoextent -b https://cordra.knowledgehub.nfdi4earth.de/objects/n4e/dthb-82b6552d-2b8e-4800-b955-ea495efc28af

Extract from NFDI4Earth without following landing page

By default, NFDI4Earth datasets with a landingPage URL are followed to other supported providers. Use --no-follow to disable this and use NFDI4Earth SPARQL metadata only:

geoextent -b -t --no-follow https://onestop4all.nfdi4earth.de/result/dthb-82b6552d-2b8e-4800-b955-ea495efc28af/

Extract from GitHub

geoextent -b https://github.com/fraxen/tectonicplates

Extract from a specific subdirectory:

geoextent -b https://github.com/Nowosad/spDataLarge/tree/master/inst/raster

Skip non-geospatial files (recommended for repos with many non-geo files):

geoextent -b --download-skip-nogeo https://github.com/fraxen/tectonicplates
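
One plausible way to picture this kind of filtering is a check on file extensions, extended by a comma-separated list like the one accepted by --download-skip-nogeo-exts. The Python sketch below works under that assumption. How geoextent actually decides which files to skip is not described here, and select_geo_files, the extension subset, and the file listing are all hypothetical.

```python
from pathlib import PurePosixPath

# Subset of the extensions listed under "Supported formats"
GEO_EXTENSIONS = {
    ".csv", ".shp", ".geojson", ".json", ".gpkg", ".kml",
    ".tif", ".tiff", ".nc", ".las", ".laz",
}

def select_geo_files(names, extra_extensions=""):
    """Keep names whose extension is in the known set or in the extras.

    Hypothetical helper; extra_extensions is a comma-separated string
    such as '.xyz,.ply'.
    """
    extras = {ext.strip().lower() for ext in extra_extensions.split(",") if ext.strip()}
    allowed = GEO_EXTENSIONS | extras
    return [name for name in names if PurePosixPath(name).suffix.lower() in allowed]

# Invented file listing
listing = ["README.md", "plates.geojson", "figure.png", "boundaries.shp", "survey.xyz"]
kept = select_geo_files(listing)
kept_with_extras = select_geo_files(listing, ".xyz,.ply")
print(kept)
print(kept_with_extras)
```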

Extract from Software Heritage

geoextent -b --download-skip-nogeo "https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://github.com/AWMC/geodata&path=Cultural-Data/political_shading/hasmonean"

Extract from a directory SWHID:

geoextent -b --download-skip-nogeo swh:1:dir:92890dbe77bbe36ccba724673bc62c2764df4f5a

Extract from a Remote GeoTIFF (COG)

Extract extent directly from a remote Cloud Optimized GeoTIFF (COG) URL — only the file header is downloaded:

geoextent -b https://raw.githubusercontent.com/GeoTIFF/test-data/main/files/gfw-azores.tif

Extract with temporal extent:

geoextent -b -t https://zenodo.org/records/14711942/files/FSM_1-km_MED-epsg.4326_v01.tif

Smart metadata-first extraction

Use --metadata-first to try metadata-only extraction first, falling back to data download if the provider has no metadata or the metadata didn’t yield results. This is useful for batch extractions across multiple providers:

geoextent -b --metadata-first 10.12761/sgn.2018.10225
geoextent -b --metadata-first Q64

Extract from GEO Knowledge Hub (automatic metadata fallback)

Some providers (e.g., GEO Knowledge Hub packages) have data files disabled. Geoextent automatically falls back to metadata-only extraction when this happens:

geoextent -b https://gkhub.earthobservations.org/packages/msaw9-hzd25

To disable the automatic fallback, use --no-metadata-fallback:

geoextent -b --no-metadata-fallback https://gkhub.earthobservations.org/packages/msaw9-hzd25

Extract from three German regional datasets with a convex hull — Wikidata (Berlin), 4TU (Dresden), and Senckenberg all use fast metadata extraction, producing a compact convex hull over central Germany:

geoextent -b --convex-hull --metadata-first Q64 https://data.4tu.nl/datasets/3035126d-ee51-4dbd-a187-5f6b0be85e9f/1 10.12761/sgn.2018.10225

Download size limits

Use --max-download-size to cap how much data geoextent will download from a repository. The value accepts human-friendly size strings (parsed by filesizelib):

Format     Meaning
100MB      100 megabytes (decimal)
2GB        2 gigabytes (decimal)
500KB      500 kilobytes (decimal)
10MiB      10 mebibytes (binary)
0.5GiB     0.5 gibibytes (binary)
1.5TB      1.5 terabytes (decimal)
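
The difference between decimal and binary units can be made concrete with a small Python sketch. It does not use filesizelib, whose API is not shown here. parse_size is a hypothetical helper that only illustrates the arithmetic behind the size strings.

```python
import re

# Decimal units are powers of 1000, binary units are powers of 1024
UNITS = {
    "B": 1,
    "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4,
    "KIB": 1024, "MIB": 1024**2, "GIB": 1024**3, "TIB": 1024**4,
}

def parse_size(text):
    """Return the number of bytes described by a string such as '100MB'.

    Hypothetical helper for illustration; not the filesizelib API.
    """
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    try:
        factor = UNITS[unit.upper()]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return int(float(number) * factor)

for example in ["100MB", "2GB", "500KB", "10MiB", "0.5GiB", "1.5TB"]:
    print(example, "=", parse_size(example), "bytes")
```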

When the total download exceeds the limit, the CLI prompts for confirmation instead of silently truncating the file list. This works for all providers whose APIs report file sizes before download:

Zenodo: the download is approximately 45.2 MB (limit is 20 MB).
Proceed with download? [y/N]

Answering y retries with the actual size as the new limit. In non-interactive contexts (scripts, CI pipelines), geoextent exits with an error. To avoid the prompt entirely, use --no-download-data for metadata-only extraction or set a sufficiently large --max-download-size.

Note

The interactive prompt relies on providers reporting file sizes in their API metadata before download. Metadata-only providers (DEIMS-SDR, NFDI4Earth, HALO DB, Wikidata, Pensoft) do not download data files, so the size limit does not apply to them.

# Download at most 20 MB of data
geoextent -b -t --max-download-size 20MB 10.23728/b2share.26jnj-a4x24

# Limit GBIF DwC-A download to 500 MB
geoextent -b -t --max-download-size 500MB 10.15468/6bleia

# Use binary units
geoextent -b --max-download-size 0.5GiB 10.5281/zenodo.4593540

For GBIF datasets, Darwin Core Archive (DwC-A) downloads have an additional built-in soft limit of 1 GB. When a DwC-A archive exceeds this limit (or the --max-download-size value, whichever is smaller), the CLI also prompts interactively.

You can trigger this prompt intentionally by setting a very small limit:

$ geoextent -b --max-download-size 1KB 10.5281/zenodo.820562

Zenodo: the download is approximately 2.3 MB (limit is 0 MB).
Proceed with download? [y/N] N

Answering N (or pressing Enter) cancels the download and produces no output.
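
In scripts and CI pipelines, where no one can answer the prompt, it can help to assemble the command so that the prompt is avoided up front. The Python sketch below builds such an argument list using only options documented on this page. The function build_geoextent_command is a hypothetical wrapper, not part of geoextent.

```python
import shlex

def build_geoextent_command(source, max_download_size=None, metadata_only=False):
    """Assemble an argument list for a non-interactive geoextent call.

    Hypothetical wrapper; it only combines options documented above.
    """
    command = ["geoextent", "-b", "-t"]
    if metadata_only:
        # Metadata-only extraction downloads no data files
        command.append("--no-download-data")
    elif max_download_size is not None:
        # Choose a limit that is large enough for the expected download
        command.extend(["--max-download-size", max_download_size])
    command.append(source)
    return command

metadata_cmd = build_geoextent_command("10.18739/A2Z892H2J", metadata_only=True)
limited_cmd = build_geoextent_command("10.5281/zenodo.4593540", max_download_size="2GB")
print(shlex.join(metadata_cmd))
print(shlex.join(limited_cmd))
```

The resulting list can be passed to subprocess.run(command, check=True), which raises an exception if geoextent exits with an error.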

Comparing extraction modes: metadata, download, and convex hull

The following three calls on an Arctic Data Center dataset of ice wedge thermokarst polygons at Point Lay, Alaska illustrate how --no-download-data and --convex-hull affect the output geometry.

1. Metadata-only extraction (--no-download-data): Uses the bounding box stored in the repository metadata — fast, no file downloads. The bbox is slightly larger because it comes from the dataset-level metadata rather than the actual geometries:

geoextent -b -t --no-download-data 10.18739/A2Z892H2J

Output bbox: [-163.049, 69.721, -162.935, 69.760] with tbox [1949-01-01, 2020-01-01]. View on geojson.io

2. Full download extraction (default): Downloads the 2 GeoJSON files (1.6 MB) and computes the merged bounding box from the actual feature geometries — tighter than metadata:

geoextent -b -t 10.18739/A2Z892H2J

Output bbox: [-163.027, 69.723, -162.931, 69.751]. View on geojson.io

3. Convex hull extraction (--convex-hull): Downloads the same files but computes a convex hull around all feature vertices instead of an axis-aligned bounding box — most precise representation of the data footprint:

geoextent -b -t --convex-hull 10.18739/A2Z892H2J

View on geojson.io

The three modes yield progressively tighter representations: metadata bbox > download bbox > convex hull. Use --no-download-data for speed when approximate extents suffice, or --convex-hull for the most faithful footprint of the actual data.
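
Why a convex hull is at least as tight as a bounding box can be seen in a small planar example. The Python sketch below computes both for a handful of invented points, using Andrew's monotone chain for the hull and the shoelace formula for its area. It is an illustration only and makes no claim about how geoextent computes hulls.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Invented points: a diamond with one interior point
points = [(0, 2), (2, 0), (4, 2), (2, 4), (2, 2)]
min_x, min_y, max_x, max_y = bounding_box(points)
bbox_area = (max_x - min_x) * (max_y - min_y)
hull = convex_hull(points)
hull_area = polygon_area(hull)
print("bbox area:", bbox_area)
print("hull area:", hull_area)
```

For this diamond the hull covers half the area of the bounding box, which is the kind of gain --convex-hull offers for data that is not aligned with the coordinate axes.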

The output of a repository extraction is the combined bbox or tbox obtained by merging the results of the individual files (see: Supported file formats) inside the repository. The coordinate reference system (CRS) of the combined bbox is always EPSG:4326.
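
Merging per-file results into one extent amounts to taking the outermost values. The sketch below illustrates the idea only and is not geoextent's code: `merge_bboxes` and `merge_tboxes` are hypothetical names, the coordinates are invented, and it assumes every box is already in EPSG:4326, uses the order (min x, min y, max x, max y), and does not cross the antimeridian.

```python
def merge_bboxes(boxes):
    """Return the smallest box that contains every input box."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one bounding box is required")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def merge_tboxes(ranges):
    """Return the earliest start and latest end of (start, end) date pairs.

    Dates in %Y-%m-%d form sort correctly as plain strings.
    """
    ranges = list(ranges)
    if not ranges:
        raise ValueError("at least one time range is required")
    return (min(r[0] for r in ranges), max(r[1] for r in ranges))


# Invented per-file results for illustration.
file_boxes = [(7.60, 51.94, 7.64, 51.97), (7.00, 51.00, 7.50, 51.50)]
file_ranges = [("2018-11-14", "2018-11-14"), ("2017-01-01", "2019-05-02")]

print(merge_bboxes(file_boxes))   # (7.0, 51.0, 7.64, 51.97)
print(merge_tboxes(file_ranges))  # ('2017-01-01', '2019-05-02')
```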

For comprehensive examples including all supported repositories and advanced features, see Examples.

Parallel extraction

Use -p / --parallel to extract extents from files within a directory in parallel using multiple threads. This speeds up processing of directories with many geodata files:

# Auto-detect CPU count
geoextent -p -b -t path/to/directory

# Use 4 workers
geoextent -p 4 -b -t path/to/directory

# Parallel extraction from a remote repository
geoextent -p -b -t https://doi.org/10.5281/zenodo.4593540

Without -p, files are processed sequentially (the default).

Note

geoextent extracts spatial extents by reading file headers, which is very fast (a few milliseconds per file regardless of file size). Parallel extraction helps most when a directory contains many files (tens or more), where the per-file I/O latency adds up. For directories with only a few files, sequential processing is already fast and -p provides little benefit.
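
The general pattern of thread-based extraction over many small files can be sketched as follows. This is an illustration under stated assumptions, not geoextent's implementation: `read_header` is a stand-in that only reads the first bytes of a file, and `extract_all` is a hypothetical helper that falls back to sequential processing when no worker count is given.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_header(path, size=64):
    """Stand-in for per-file extraction: read only the first bytes of a file."""
    with open(path, "rb") as handle:
        return handle.read(size)


def extract_all(paths, workers=None):
    """Process files sequentially (workers=None) or with a thread pool."""
    if workers is None:
        return [read_header(p) for p in paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(read_header, paths))


with tempfile.TemporaryDirectory() as folder:
    paths = []
    for index in range(20):
        path = Path(folder) / f"file_{index}.geojson"
        path.write_text(f'{{"type": "FeatureCollection", "id": {index}}}')
        paths.append(path)

    sequential = extract_all(paths)
    # os.cpu_count() may return None, so fall back to a single worker.
    parallel = extract_all(paths, workers=os.cpu_count() or 1)
    print(sequential == parallel)  # True
```

Threads suit this workload because the work is dominated by file I/O rather than computation, and both modes produce the same results in the same order.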

Debugging

You can enable detailed logs by passing the --debug option, or by setting the environment variable GEOEXTENT_DEBUG=1.

geoextent --debug -b -t muenster_ring_zeit.geojson

GEOEXTENT_DEBUG=1 geoextent -b -t muenster_ring_zeit.geojson
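
A command-line flag and an environment variable are commonly combined as in the sketch below. It is not taken from geoextent's source: the function names and the logger name `geoextent_sketch` are hypothetical, and treating only the value "1" as enabling debug mode is an assumption based on the example above.

```python
import logging
import os


def debug_enabled(cli_flag, environ=os.environ):
    """Return True if --debug was passed or GEOEXTENT_DEBUG is set to "1"."""
    # Assumption: only the exact value "1" switches debug logging on.
    return bool(cli_flag) or environ.get("GEOEXTENT_DEBUG") == "1"


def configure_logging(cli_flag=False, environ=os.environ):
    """Set the level of a hypothetical logger according to flag and environment."""
    logger = logging.getLogger("geoextent_sketch")
    logger.setLevel(logging.DEBUG if debug_enabled(cli_flag, environ) else logging.WARNING)
    return logger


print(debug_enabled(False, {"GEOEXTENT_DEBUG": "1"}))  # True
print(debug_enabled(False, {}))                        # False
```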

Details

You can enable details for folders and ZIP files by passing the --details option. This option gives you access to the geoextent of the individual files inside the folder or ZIP file that were used to compute the aggregated bounding box (bbox) or time box (tbox).

geoextent --details -b -t folder_one_file
Processing directory: folder_one_file:   0%|          | 0/1 [00:00<?, ?item/s]
Processing directory: folder_one_file:   0%|          | 0/1 [00:00<?, ?item/s, Processing muenster_ring_zeit.geojson]
                                                                                                                     

Merging results: 0it [00:00, ?it/s]
Merging results: 0it [00:00, ?it/s, folder_one_file]
                                                    

{'format': 'folder',
 'crs': '4326',
 'bbox': {'type': 'Polygon',
  'coordinates': [[[51.94881477206191, 7.608118057250977],
    [51.953258408047034, 7.602796554565429],
    [51.96537036973145, 7.6016807556152335],
    [51.97361943924433, 7.606401443481445],
    [51.974624029877454, 7.62125015258789],
    [51.97240332571046, 7.636871337890624],
    [51.96817310852836, 7.645368576049805],
    [51.96780294552556, 7.645540237426757],
    [51.96330786509095, 7.6471710205078125],
    [51.95807185013927, 7.647256851196289],
    [51.953258408047034, 7.643308639526367],
    [51.94881477206191, 7.608118057250977]]]},
 'convex_hull': True,
 'tbox': ['2018-11-14', '2018-11-14']}

Map preview

Generate a map preview to a temporary file (requires pip install geoextent[preview]):

geoextent --map -b muenster_ring_zeit.geojson

Save the map to a specific file:

geoextent -b --map extent.png muenster_ring_zeit.geojson

Display the map directly in the terminal:

geoextent -b --preview muenster_ring_zeit.geojson

Save to a specific file and display in the terminal:

geoextent -b --map extent.png --preview muenster_ring_zeit.geojson

Customize the image dimensions (default: 600x400):

geoextent -b --map extent.png --map-dim 800x600 muenster_ring_zeit.geojson

The path of the saved map is always printed to stderr (suppressed by --quiet).

For more details on map preview options, see Core Features.

Export to file

Export extraction results to a file. The format is auto-detected from the file extension:

Single file to GeoPackage:

geoextent -b -t --output result.gpkg tests/testdata/geojson/muenster_ring_zeit.geojson

Directory to GeoJSON:

geoextent -b -t --output result.geojson tests/testdata/folders/folder_two_files

Multiple files to CSV:

geoextent -b -t --output result.csv file1.shp file2.geojson

Convex hull geometry:

geoextent -b --convex-hull --output hull.gpkg tests/testdata/folders/folder_two_files

CSV with WKB geometry (via --format):

geoextent -b --format wkb --output result.csv tests/testdata/folders/folder_two_files

For more details on export options, see Core Features.
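
Extension-based format detection can be sketched as a simple lookup. The mapping mirrors the extensions listed for --output (.gpkg, .geojson/.json, .csv). The function name `detect_export_format` is hypothetical, and case-insensitive matching is an assumption.

```python
from pathlib import Path

# Extensions documented for --output, mapped to their export format.
EXPORT_FORMATS = {
    ".gpkg": "GeoPackage",
    ".geojson": "GeoJSON",
    ".json": "GeoJSON",
    ".csv": "CSV",
}


def detect_export_format(output_path):
    """Return the export format implied by the file extension (hypothetical helper)."""
    suffix = Path(output_path).suffix.lower()  # assumption: matching ignores case
    if suffix not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export extension: {suffix!r}")
    return EXPORT_FORMATS[suffix]


for name in ("result.gpkg", "result.geojson", "result.csv"):
    print(name, "->", detect_export_format(name))
```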

Join export files

Merge multiple exported files into a single file. Summary rows are excluded — only individual-file features are kept. Input files can be any supported format; the output format is auto-detected from the extension:

geoextent --join --output merged.gpkg run1.gpkg run2.gpkg

Cross-format join (GeoJSON + GPKG -> CSV):

geoextent --join --output combined.csv run1.geojson run2.gpkg

For more details, see Core Features.
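
Conceptually, a join concatenates the per-file rows of several exports and leaves out the summary rows. The sketch below shows that idea on plain dictionaries. It is not geoextent's implementation: the `is_summary` key and the row layout are hypothetical, since this page does not describe how summary rows are marked in the export files.

```python
def join_exports(*runs):
    """Concatenate rows from several exports, skipping rows flagged as summaries."""
    merged = []
    for rows in runs:
        # Hypothetical marker: rows with is_summary=True describe a whole run.
        merged.extend(row for row in rows if not row.get("is_summary", False))
    return merged


# Invented rows for illustration.
run1 = [
    {"name": "a.geojson", "bbox": (7.60, 51.94, 7.64, 51.97)},
    {"name": "run1", "bbox": (7.60, 51.94, 7.64, 51.97), "is_summary": True},
]
run2 = [
    {"name": "b.shp", "bbox": (7.00, 51.00, 7.50, 51.50)},
    {"name": "run2", "bbox": (7.00, 51.00, 7.50, 51.50), "is_summary": True},
]

print([row["name"] for row in join_exports(run1, run2)])  # ['a.geojson', 'b.shp']
```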