Skip to content

Conversation

isaaccorley
Copy link
Collaborator

No description provided.

@isaaccorley isaaccorley self-assigned this Sep 1, 2025
@github-actions github-actions bot added documentation Improvements or additions to documentation datasets Geospatial or benchmark datasets testing Continuous integration testing dependencies Packaging and dependencies labels Sep 1, 2025
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR replaces the Fiona library with Geopandas everywhere in the codebase to consolidate geospatial file I/O operations. The changes maintain the same functionality while switching to a more modern and unified geospatial data handling approach.

  • Removes Fiona dependency from the project and updates all imports to use Geopandas
  • Refactors file reading operations to use gpd.read_file() instead of fiona.open()
  • Updates coordinate transformation logic to use Geopandas and PyProj instead of Fiona's transform functions

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
torchgeo/datasets/spacenet/base.py Replaces Fiona file reading with Geopandas for mask loading
torchgeo/datasets/pastis.py Updates metadata loading to use Geopandas instead of Fiona
torchgeo/datasets/openbuildings.py Converts building polygon processing to use Geopandas and PyProj
torchgeo/datasets/idtrees.py Replaces Fiona geometry loading with Geopandas iterrows
torchgeo/datasets/geo.py Updates vector dataset handling to use Geopandas throughout
torchgeo/datasets/enviroatlas.py Converts spatial index reading to use Geopandas
torchgeo/datasets/chesapeake.py Updates spatial index processing with Geopandas
tests/data/pastis/data.py Updates test data generation to use Geopandas
tests/data/eurocrops/data.py Converts test data creation to use Geopandas
tests/data/enviroatlas/data.py Updates spatial index generation with Geopandas
tests/data/README.md Updates documentation examples to use Geopandas
requirements/required.txt Removes Fiona dependency
requirements/min-reqs.old Removes Fiona from minimum requirements
pyproject.toml Removes Fiona from project dependencies
docs/conf.py Removes Fiona class from documentation ignore list

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

)
for _, row in gdf.iterrows():
if row['split'] in splits:
geometries.append(shapely.geometry.shape(row['geometry']))
Copy link
Preview

Copilot AI Sep 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When iterating over a GeoDataFrame with iterrows(), the geometry column is already a Shapely object, so shapely.geometry.shape() is not needed and may cause errors. Should be geometries.append(row.geometry).

Suggested change
geometries.append(shapely.geometry.shape(row['geometry']))
geometries.append(row['geometry'])

Copilot uses AI. Check for mistakes.

geometries.append(geometry)
except fiona.errors.FionaValueError:
# Skip files that fiona is unable to read
except pyogrio.errors.DataSourceError:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamjstewart unfortunately geopandas doesn't have it's own error that it throws when it can't read a file, it defaults to letting pyogrio throw an error. Basically, would have to add pyogrio as a dependency just for this error...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataSourceError is a subclass of RuntimeError, just catch that instead.

@@ -896,21 +903,6 @@ def __getitem__(self, query: GeoSlice) -> dict[str, Any]:

return sample

def get_label(self, feature: 'fiona.model.Feature') -> int:
Copy link
Collaborator Author

@isaaccorley isaaccorley Sep 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamjstewart Not sure if we need to deprecate this or not. This is not as performant as using geopandas directly as this function requires a for loop and my change using geopandas doesn't. The only gotcha here is that the Eurocrops dataset overrides this method with some custom logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is currently documented, so it's technically a public API function. If we can find a way to deprecate it, that would be ideal. If we can't, well that's what the major version bump is for. I'll mark this as currently backwards-incompatible for now.

@adamjstewart adamjstewart added the backwards-incompatible Changes that are not backwards compatible label Sep 3, 2025

ROOT = "data/cbf"
FILENAME = "Ontario.geojson"

rec = {"type": "Feature", "id": "0", "properties": OrderedDict(), "geometry": {"type": "Polygon", "coordinates": [[(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]]}}
with fiona.open(os.path.join(ROOT, FILENAME), "r") as src:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason we're using fiona in these docs is so that we can copy the src.meta profile from the real data and use it when generating our fake data. If we can ensure the same metadata is transferred using geopandas, then that's fine. If not, let's keep using fiona in the docs.

@@ -7,8 +7,7 @@
import shutil
from typing import Any

import fiona
import fiona.transform
import geopandas as gpd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here. We technically don't have to replace fiona in our data.py files if we don't want to, as this code isn't used at run-time or test-time. The only incentive to replace it with geopandas is that we may someday generate test data on-the-fly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I might leave the data.py files untouched for now since they are already working.

@adamjstewart adamjstewart added this to the 0.8.0 milestone Sep 3, 2025
@adamjstewart adamjstewart mentioned this pull request Sep 3, 2025
67 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backwards-incompatible Changes that are not backwards compatible datasets Geospatial or benchmark datasets dependencies Packaging and dependencies documentation Improvements or additions to documentation testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants