Converting Coastline Data (GSHHG) to Shapefile Format
This example shows how to:
Extract a subset of coastline data from the Global Self-consistent Hierarchical High-resolution Geography (GSHHG) data set
Manipulate polygon features to add lakes and other interior water bodies as inner polygon rings ("holes")
Save the modified data set to a shapefile for future use in MATLAB®, or for export to a geographic information system
The Global Self-consistent Hierarchical High-resolution Geography (GSHHG; formerly Global Self-consistent Hierarchical High-resolution Shorelines, or GSHHS) data set, by Paul Wessel and Walter H. F. Smith, provides a consistent set of hierarchically arranged closed polygons. They can be used to construct base maps, or in applications or analyses that involve operations like geographic searches or the statistical properties of coastlines.
Step 1: Define a Working Folder
This example creates several temporary files and uses the variable workingFolder
to denote their location. The value used here is determined by the output of the tempdir
command, but you could easily customize this.
workingFolder = tempdir;
Step 2: GNU® Unzip and Index the Coarse-Resolution GSHHG Layer
GSHHG is available in wide range of spatial resolutions. This example uses the lowest-resolution data, from the binary file gshhs_c.b
. A GNU zipped copy of this file is included in the Mapping Toolbox™ data folder, on the MATLAB path.
Use the MATLAB gunzip
function to decompress gshhs_c.b.gz
and create the file gshhs_c.b
in the location indicated by workingFolder
. Then create an index file, gshhs_c.i
, in the same folder. In general, having an index file helps to accelerate later calls to the gshhs
function. Note that when you use the 'createindex'
option, gshhs
does not extract data.
files = gunzip('gshhs_c.b.gz', workingFolder); filename = files{1}; indexfile = gshhs(filename, 'createindex');
Step 3: Import the GSHHG Data for South America
Select data for a specific latitude-longitude quadrangle and import it as a Mapping Toolbox "geostruct" array:
latlim = [-60 15]; lonlim = [-90 -30]; S = gshhs(filename, latlim, lonlim);
If you have finished extracting data, you can remove the decompressed GSHHS file and the index file.
delete(filename) delete(indexfile)
Step 4: Examine the Data Set
Examine the first element of the geostruct array S
. In addition to the Lat
and Lon
coordinate arrays, note the various attribute fields that are present.
S(1)
ans = struct with fields:
Geometry: 'Polygon'
BoundingBox: [2x2 double]
Lat: [-53.0004 -53 -52.5017 -52.7963 -52.0434 -52.0838 -52.3638 -51.9692 -52.4763 -51.7925 -51.5217 -51.1569 -51.4650 -51.7722 -51.5550 -52.0364 -51.6342 -52.0427 -51.6283 -51.1117 -50.9233 -50.6942 -50.3871 -50.8513 ... ] (1x972 double)
Lon: [-73.3617 -73.3626 -72.8909 -73.7034 -73.7392 -72.9983 -72.9884 -72.7708 -72.9184 -72.4792 -73.1442 -72.9335 -73.4259 -72.6031 -73.3626 -73.1710 -73.4026 -73.3898 -73.9309 -73.6609 -74.2375 -73.3392 -73.5700 ... ] (1x972 double)
South: -53.9004
North: 71.9942
West: 191.8947
East: 325.2054
Area: 3.7652e+07
Level: 1
LevelString: 'land'
NumPoints: 971
FormatVersion: 3
Source: 'WVS'
CrossesGreenwich: 0
GSHHS_ID: 1
GSHHS comprises four levels of shorelines:
Level 1 - "Land"
Level 2 - "Lake"
Level 3 - "Island in lake"
Level 4 - "Pond in island in lake"
Check to see which levels the data you've imported includes. The Level
field contains numerical level numbers.
levels = [S.Level]; unique(levels)
ans = 1×3
1 2 3
The LevelString
field provides their interpretation. For example,
S(104).LevelString
ans = 'lake'
shows that feature 104 is a lake (a Level 2 feature).
In this example, due either to the low resolution or to spatial subsetting, no Level 4 features are present.
Step 5: Extract the Top Two Levels into Separate Geostruct Arrays
This example manipulates the top two levels of the GSHHS hierarchy, inserting each "lake" into the surrounding land mass.
Extract GSHHS Level 1 (exterior coastlines of continents and oceanic islands):
L1 = S(levels == 1);
Extract Level 2 (coastlines of lakes and seas within Level 1 polygons):
L2 = S(levels == 2);
To see their spatial relationships, you can map Level 1 edges as blue lines and Level 2 edges as red lines:
figure axesm('mercator', 'MapLatLimit', latlim, 'MapLonLimit', lonlim) gridm; mlabel; plabel geoshow([L1.Lat], [L1.Lon], 'Color', 'blue') geoshow([L2.Lat], [L2.Lon], 'Color', 'red') tightmap
Step 6: Merge Level 2 Polygons into Level 1
Define an anonymous predicate function to detect bounding-box intersections (returning true if a pair of bounding boxes intersect and false otherwise). Inputs A
and B
are 2-by-2 bounding-box matrices of the form:
[min(lon) min(lat) max(lon) max(lat)].
boxesIntersect = ...
@(A,B) (~(any(A(2,:) < B(1,:)) || any(B(2,:) < A(1,:))));
For convenience in looping over them, copy the Level 1 bounding boxes to a 3-D array:
L1boxes = reshape([L1.BoundingBox],[2 2 numel(L1)]);
Check each Level 1 - Level 2 pair of features for possible intersection. See if polybool
returns any output or not, but avoid calling polybool
unless a bounding box intersection is detected first:
for k = 1:numel(L2) for j = 1:numel(L1) % See if bounding boxes intersect if boxesIntersect(L2(k).BoundingBox, L1boxes(:,:,j)) % See if actual features intersect if ~isempty(polybool('intersection', ... L2(k).Lon, L2(k).Lat, L1(j).Lon, L1(j).Lat)) % Reverse level 2 vertex order before merge to % correctly orient inner rings L1(j).Lon = [L1(j).Lon fliplr(L2(k).Lon) NaN]; L1(j).Lat = [L1(j).Lat fliplr(L2(k).Lat) NaN]; end end end end
Step 7: Save Results in a Shapefile
With a single call to shapewrite
, you can create a trio of files,
gshhs_c_SouthAmerica.shp gshhs_c_SouthAmerica.shx gshhs_c_SouthAmerica.dbf
in your working folder.
shapepath = fullfile(workingFolder,'gshhs_c_SouthAmerica.shp');
shapewrite(L1, shapepath)
Step 8: Validate the Shapefile
To validate the results of shapewrite
, read the new shapefile into the geospatial table southAmerica
:
southAmerica = readgeotable(shapepath,'CoordinateSystemType','geographic')
southAmerica=79×13 table
Shape South North West East Area Level LevelString NumPoints FormatVersi Source CrossesGree GSHHS_ID
____________ _______ _______ ______ ______ __________ _____ ___________ _________ ___________ ______ ___________ ________
geopolyshape -53.9 71.994 191.89 325.21 3.7652e+07 1 "land" 971 3 "WVS" 0 1
geopolyshape -55.055 -52.469 287.98 294.89 46269 1 "land" 19 3 "WVS" 0 42
geopolyshape -52.36 -51.235 300.26 302.29 4402.4 1 "land" 11 3 "WVS" 0 85
geopolyshape -55.675 -54.933 289.99 292.02 2669.1 1 "land" 10 3 "WVS" 0 97
geopolyshape -50.065 -48.67 284.53 285.62 4021.9 1 "land" 9 3 "WVS" 0 115
geopolyshape -52.26 -51.336 298.92 300.81 4326.2 1 "land" 9 3 "WVS" 0 121
geopolyshape -54.135 -53.38 286.33 287.85 3809.7 1 "land" 8 3 "WVS" 0 143
geopolyshape -43.448 -41.768 285.55 286.65 10188 1 "land" 8 3 "WVS" 0 144
geopolyshape -53.478 -52.73 287 288.46 3006.8 1 "land" 7 3 "WVS" 0 151
geopolyshape 10.041 10.841 298.07 299.09 5174 1 "land" 7 3 "WVS" 0 174
geopolyshape -54.276 -53.566 289.1 289.8 1420.3 1 "land" 5 3 "WVS" 0 292
geopolyshape -44.937 -44.364 286.54 287.31 955.5 1 "land" 5 3 "WVS" 0 329
geopolyshape -54.401 -53.936 288.29 289.04 704.5 1 "land" 5 3 "WVS" 0 330
geopolyshape -51.125 -50.71 285.04 285.64 462 1 "land" 5 3 "WVS" 0 339
geopolyshape -50.458 -50.001 284.54 285.24 612.9 1 "land" 5 3 "WVS" 0 341
geopolyshape -48.711 -48.137 284.98 285.53 1039.8 1 "land" 5 3 "WVS" 0 343
⋮
Note that the two longest fieldnames, 'FormatVersion'
and 'CrossesGreenwich'
, have been truncated to 11 characters. This happened during the call to shapewrite
and is unavoidable because of a rigid 11-character limit in the xBASE tables (.DBF format) used to store attributes in shapefiles. (In general, when writing shapefiles you may want to re-define fieldnames longer than 11 characters in order to avoid or control the effects of automatic truncation.)
Optionally, remove the new shapefiles from your working folder. (This example needs to clean up after itself; in a real application you would probably want to omit this step.)
delete([shapepath '.*'])
Display the data from the new shapefile. Note the various "holes" in the South America polygon indicating lakes and shorelines of other extended bodies of water in the interior of the continent.
figure ax = axesm('mercator', 'MapLatLimit', latlim, 'MapLonLimit', lonlim); ax.Color = 'cyan'; gridm; mlabel; plabel geoshow(southAmerica, 'FaceColor', [0.15 0.8 0.15]) tightmap
Reference
Wessel, P., and W. H. F. Smith, 1996, A global self-consistent, hierarchical, high-resolution shoreline database, Journal of Geophysical Research, Vol. 101, pp. 8741-8743.
Additional Data
The complete GSHHG data set may be downloaded from the U.S. National Oceanic and Atmospheric Administration (NOAA) web site. For more information about finding data sets, see Find Geospatial Vector Data.
Credits
The GSHHG data file is provided in the Mapping Toolbox courtesy of Dr. Paul Wessel of the University of Hawaii and Dr. Walter H. F. Smith of NOAA.
For more information, run:
>> type gshhs_c.txt