My Objective: I would like to use GDAL to convert a GeoPDF, writing the vector layers to shapefiles (.shp) and the raster layers to GeoTIFFs (.tif). I want to do this programmatically.
Edit: In reality, I want to do this with many geospatial PDFs. I’m prototyping the workflow using Python, but it will probably end up being C++. (End Edit)
The Problem: Naturally, the command to convert a vector layer differs from the one for a raster layer, and I don’t know how to tell (again, programmatically) which layers are vector and which are raster.
What I’ve Tried: First, here is my sample data https://www.terragotech.com/images/pdf/webmap_urbansample.pdf.
gdalinfo webmap_urbansample.pdf -mdd LAYERS
gives the layer names:
I can tell by looking at the data which layers are vector and which are raster, but I don’t know how to parse this information to decide whether to use ogr2ogr or gdal_translate for the conversion.
Then I thought I could use ogrinfo and just diff all the layers to deduce which ones are raster, but ogrinfo gives me:
1: Cadastral Boundaries (Polygon)
2: Water Lines (Line String)
3: Sewerage Lines (Line String)
4: Sewerage Jump-Ups (Line String)
6: Water Points (Point)
7: Sewerage Pump Stations (Point)
8: Sewerage Man Holes (Point)
9: BPS - Buildings (Polygon)
10: BPS - Facilities (Polygon)
11: BPS - Water Sources (Point)
So there’s no one-to-one correspondence between the layer names from gdalinfo and the layers listed by ogrinfo.
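For concreteness, the diff I had in mind looks roughly like the Python sketch below. It is only a sketch under assumptions: that the LAYERS metadata domain comes back as a dict with keys of the form LAYER_00_NAME, and that those names match the OGR layer names exactly (which, as noted, does not seem to hold for my file). The helper names layer_names_from_metadata, guess_raster_layers and read_names are my own, not part of GDAL.

```python
import re


def layer_names_from_metadata(layers_md):
    """Return layer names from a LAYERS metadata dict, ordered by index.

    Assumes keys of the form LAYER_00_NAME; any other keys are ignored.
    """
    pattern = re.compile(r"^LAYER_(\d+)_NAME$")
    indexed = []
    for key, value in layers_md.items():
        match = pattern.match(key)
        if match:
            indexed.append((int(match.group(1)), value))
    return [name for _, name in sorted(indexed)]


def guess_raster_layers(layers_md, vector_names):
    """Names listed in the PDF layer metadata that OGR does not report.

    This is only a guess: it relies on exact name matches between the two lists.
    """
    vector = set(vector_names)
    return [n for n in layer_names_from_metadata(layers_md) if n not in vector]


def read_names(pdf_path):
    """Read the LAYERS metadata and the OGR layer names from a PDF."""
    # Imported here so the helpers above can be used without GDAL installed.
    from osgeo import gdal

    gdal.UseExceptions()
    raster_ds = gdal.OpenEx(pdf_path, gdal.OF_RASTER)
    layers_md = raster_ds.GetMetadata("LAYERS") or {}
    vector_ds = gdal.OpenEx(pdf_path, gdal.OF_VECTOR)
    vector_names = [
        vector_ds.GetLayer(i).GetName()
        for i in range(vector_ds.GetLayerCount())
    ]
    return layers_md, vector_names
```

I would call it as guess_raster_layers(*read_names("webmap_urbansample.pdf")), but because the two lists do not line up, the result cannot be trusted as it stands.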
So my question is: Does anyone know how to make GDAL list the GeoPDF layers and indicate which are raster and which are vector? Or is there another way to infer this?