Geopandas Tutorial
Overview
teaching: 30 minutes
exercises: 0
questions:
How can I analyze and visualize vector data in Python with geopandas?
Pandas and Geopandas primer
Pandas is a core scientific Python library for working with “Panel Data” (PanDas). If your data lives in a spreadsheet or database, Pandas is a natural fit. It provides many input/output (I/O) functions and two core data structures: the “Series” and the “DataFrame”. Geopandas extends Pandas to work efficiently with collections of geographic vector data, that is, geometric shapes georeferenced to a position on Earth’s surface. As you might have guessed, Geopandas data objects are called “GeoSeries” and “GeoDataFrame”.
[1]:
# These libraries are mature but constantly improving, so it's good practice to record the versions you are using:
import pandas as pd
import geopandas as gpd
print('Pandas version: ', pd.__version__)
print('Geopandas version: ', gpd.__version__)
Pandas version: 1.0.4
Geopandas version: 0.7.0
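To make the two core Pandas structures concrete, here is a minimal sketch that builds a Series and a DataFrame by hand. The names `elevation` and `df_small` are illustrative only; the values are copied from the first rows of the volcano table used later in this tutorial.

```python
import pandas as pd

# A Series is a single labelled column of values
elevation = pd.Series([600, 1464, 893], name="Elevation")

# A DataFrame is a table whose columns are Series that share one index
df_small = pd.DataFrame({
    "Volcano_Name": [
        "West Eifel Volcanic Field",
        "Chaine des Puys",
        "Olot Volcanic Field",
    ],
    "Elevation": elevation,
})

# Selecting one column of a DataFrame gives back a Series
print(type(df_small["Elevation"]))
print(df_small.shape)
```

A GeoDataFrame behaves the same way, with the addition of a geometry column that holds shapes instead of plain values.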
Tabular data with Pandas
We’ll use the Smithsonian Global Volcanism Program database. The data could come from a local CSV or Excel file, a SQL database, or, as here, from a remote server (https://volcano.si.edu/database/webservices.cfm).
[2]:
# Load csv results from server into a Pandas DataFrame
server = 'https://webservices.volcano.si.edu/geoserver/GVP-VOTW/ows?'
query = 'service=WFS&version=2.0.0&request=GetFeature&typeName=GVP-VOTW:Smithsonian_VOTW_Holocene_Volcanoes&outputFormat=csv'
df = pd.read_csv(server+query)
print(type(df))
df.head()
<class 'pandas.core.frame.DataFrame'>
[2]:
 | FID | Volcano_Number | Volcano_Name | Primary_Volcano_Type | Last_Eruption_Year | Country | Geological_Summary | Region | Subregion | Latitude | Longitude | Elevation | Tectonic_Setting | Geologic_Epoch | Evidence_Category | Primary_Photo_Link | Primary_Photo_Caption | Primary_Photo_Credit | Major_Rock_Type | GeoLocation |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210010 | West Eifel Volcanic Field | Maar(s) | -8300.0 | Germany | The West Eifel Volcanic Field of western Germa... | Mediterranean and Western Asia | Western Europe | 50.170 | 6.85 | 600 | Rift zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0150... | The lake-filled Weinfelder maar is one of abou... | Photo by Richard Waitt, 1990 (U.S. Geological ... | Foidite | POINT (50.17 6.85) |
1 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210020 | Chaine des Puys | Lava dome(s) | -4040.0 | France | The Chaîne des Puys, prominent in the history ... | Mediterranean and Western Asia | Western Europe | 45.775 | 2.97 | 1464 | Rift zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0880... | The central part of the Chaîne des Puys volcan... | Photo by Ichio Moriya (Kanazawa University). | Basalt / Picro-Basalt | POINT (45.775 2.97) |
2 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210030 | Olot Volcanic Field | Pyroclastic cone(s) | NaN | Spain | The Olot volcanic field (also known as the Gar... | Mediterranean and Western Asia | Western Europe | 42.170 | 2.53 | 893 | Intraplate / Continental crust (> 25 km) | Holocene | Evidence Credible | https://volcano.si.edu/gallery/photos/GVP-1199... | The forested Volcà Montolivet scoria cone rise... | Photo by Puigalder (Wikimedia Commons). | Trachybasalt / Tephrite Basanite | POINT (42.17 2.53) |
3 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210040 | Calatrava Volcanic Field | Pyroclastic cone(s) | -3600.0 | Spain | The Calatrava volcanic field lies in central S... | Mediterranean and Western Asia | Western Europe | 38.870 | -4.02 | 1117 | Intraplate / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-1185... | Columba volcano, the youngest known vent of th... | Photo by Rafael Becerra Ramírez, 2006 (Univers... | Basalt / Picro-Basalt | POINT (38.87 -4.02) |
4 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 211003 | Vulsini | Caldera | -104.0 | Italy | The Vulsini volcanic complex in central Italy ... | Mediterranean and Western Asia | Italy | 42.600 | 11.93 | 800 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Observed | https://volcano.si.edu/gallery/photos/GVP-0150... | The 16-km-wide Bolsena caldera containing Lago... | Photo by Richard Waitt, 1985 (U.S. Geological ... | Trachyte / Trachydacite | POINT (42.6 11.93) |
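The query string passed to the server is a set of key/value pairs joined by `&`. As a sketch, it can be assembled from a dictionary with the standard library instead of being typed by hand. The `params` dictionary is an illustrative name, and its values are the ones used in the cell above.

```python
from urllib.parse import urlencode

server = 'https://webservices.volcano.si.edu/geoserver/GVP-VOTW/ows?'

# Each WFS request parameter as a key/value pair
params = {
    'service': 'WFS',
    'version': '2.0.0',
    'request': 'GetFeature',
    'typeName': 'GVP-VOTW:Smithsonian_VOTW_Holocene_Volcanoes',
    'outputFormat': 'csv',
}

# safe=':' keeps the colon in typeName unescaped, matching the hand-written query
query = urlencode(params, safe=':')
url = server + query
print(url)
```

Building the query this way makes it easier to change a single parameter, for example swapping `outputFormat` to request a different format.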
[3]:
# Use the dataframe indexing to extract subsets
df.iloc[2:5]
[3]:
 | FID | Volcano_Number | Volcano_Name | Primary_Volcano_Type | Last_Eruption_Year | Country | Geological_Summary | Region | Subregion | Latitude | Longitude | Elevation | Tectonic_Setting | Geologic_Epoch | Evidence_Category | Primary_Photo_Link | Primary_Photo_Caption | Primary_Photo_Credit | Major_Rock_Type | GeoLocation |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210030 | Olot Volcanic Field | Pyroclastic cone(s) | NaN | Spain | The Olot volcanic field (also known as the Gar... | Mediterranean and Western Asia | Western Europe | 42.17 | 2.53 | 893 | Intraplate / Continental crust (> 25 km) | Holocene | Evidence Credible | https://volcano.si.edu/gallery/photos/GVP-1199... | The forested Volcà Montolivet scoria cone rise... | Photo by Puigalder (Wikimedia Commons). | Trachybasalt / Tephrite Basanite | POINT (42.17 2.53) |
3 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210040 | Calatrava Volcanic Field | Pyroclastic cone(s) | -3600.0 | Spain | The Calatrava volcanic field lies in central S... | Mediterranean and Western Asia | Western Europe | 38.87 | -4.02 | 1117 | Intraplate / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-1185... | Columba volcano, the youngest known vent of th... | Photo by Rafael Becerra Ramírez, 2006 (Univers... | Basalt / Picro-Basalt | POINT (38.87 -4.02) |
4 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 211003 | Vulsini | Caldera | -104.0 | Italy | The Vulsini volcanic complex in central Italy ... | Mediterranean and Western Asia | Italy | 42.60 | 11.93 | 800 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Observed | https://volcano.si.edu/gallery/photos/GVP-0150... | The 16-km-wide Bolsena caldera containing Lago... | Photo by Richard Waitt, 1985 (U.S. Geological ... | Trachyte / Trachydacite | POINT (42.6 11.93) |
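Note that `iloc` slices by position and excludes the end of the slice, which is why `2:5` returns rows 2, 3 and 4. The sketch below contrasts this with label-based `loc`, using a small hand-built table (`volcanoes` is an illustrative name, with values taken from the rows shown above).

```python
import pandas as pd

volcanoes = pd.DataFrame({
    "Volcano_Name": [
        "West Eifel Volcanic Field",
        "Chaine des Puys",
        "Olot Volcanic Field",
        "Calatrava Volcanic Field",
        "Vulsini",
    ],
    "Country": ["Germany", "France", "Spain", "Spain", "Italy"],
})

by_position = volcanoes.iloc[2:5]  # positions 2, 3, 4 (end excluded)
by_label = volcanoes.loc[2:4]      # labels 2 through 4 (end included)

# With the default integer index both selections return the same rows
same = by_position.equals(by_label)
print(same)
```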
[4]:
# Query a column for a value of interest
df.query('Volcano_Name == "Shasta"')
[4]:
 | FID | Volcano_Number | Volcano_Name | Primary_Volcano_Type | Last_Eruption_Year | Country | Geological_Summary | Region | Subregion | Latitude | Longitude | Elevation | Tectonic_Setting | Geologic_Epoch | Evidence_Category | Primary_Photo_Link | Primary_Photo_Caption | Primary_Photo_Credit | Major_Rock_Type | GeoLocation |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
940 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 323010 | Shasta | Stratovolcano | 1250.0 | United States | The most voluminous of the Cascade volcanoes, ... | Canada and Western USA | USA (California) | 41.409 | -122.193 | 4317 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Observed | https://volcano.si.edu/gallery/photos/GVP-0321... | Mount Shasta in northern California, seen here... | Photo by Lyn Topinka, 1984 (U.S. Geological Su... | Andesite / Basaltic Andesite | POINT (41.409 -122.193) |
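A `query` expression is shorthand for filtering with a boolean mask. The following sketch shows both forms side by side on a small hand-built table (the `volcanoes` frame is illustrative; names and elevations are taken from rows shown in this tutorial).

```python
import pandas as pd

volcanoes = pd.DataFrame({
    "Volcano_Name": ["Shasta", "Vulsini", "Hood"],
    "Elevation": [4317, 800, 3426],
})

# String expression evaluated against the column names
by_query = volcanoes.query('Volcano_Name == "Shasta"')

# Equivalent boolean mask
by_mask = volcanoes[volcanoes["Volcano_Name"] == "Shasta"]

# Numeric comparisons work the same way
high = volcanoes.query("Elevation > 3000")
print(high)
```

Which form to use is mostly a matter of taste; masks are easier to build programmatically, while `query` strings tend to read better for simple conditions.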
[5]:
# Pandas is all about efficient data access and visualization
# Here are just a few examples
df.Last_Eruption_Year.describe()
[5]:
count 868.000000
mean 758.534562
std 2356.079056
min -10450.000000
25% 837.750000
50% 1906.000000
75% 2000.000000
max 2020.000000
Name: Last_Eruption_Year, dtype: float64
[6]:
df.Region.unique()
[6]:
array(['Mediterranean and Western Asia', 'Africa and Red Sea',
'Middle East and Indian Ocean', 'New Zealand to Fiji',
'Melanesia and Australia', 'Indonesia', 'Philippines and SE Asia',
'Japan, Taiwan, Marianas', 'Kuril Islands',
'Kamchatka and Mainland Asia', 'Alaska', 'Canada and Western USA',
'Hawaii and Pacific Ocean', 'México and Central America',
'South America', 'West Indies', 'Iceland and Arctic Ocean',
'Atlantic Ocean', 'Antarctica'], dtype=object)
[7]:
df.groupby('Region').Last_Eruption_Year.describe()
[7]:
Region | count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|
Africa and Red Sea | 43.0 | 244.790698 | 3333.202854 | -10450.0 | 1050.0 | 1888.0 | 2002.50 | 2020.0 |
Alaska | 58.0 | 1036.482759 | 1861.482754 | -7600.0 | 1287.5 | 1921.5 | 1997.75 | 2020.0 |
Antarctica | 16.0 | 527.062500 | 3094.533060 | -8350.0 | 1366.5 | 1936.5 | 2009.25 | 2020.0 |
Atlantic Ocean | 25.0 | 1184.160000 | 1566.565795 | -4500.0 | 1564.0 | 1865.0 | 1962.00 | 2015.0 |
Canada and Western USA | 49.0 | -1116.816327 | 3238.339265 | -9450.0 | -2850.0 | 440.0 | 1260.00 | 2008.0 |
Hawaii and Pacific Ocean | 27.0 | 1099.407407 | 1630.550291 | -3490.0 | 850.0 | 1972.0 | 1994.50 | 2018.0 |
Iceland and Arctic Ocean | 29.0 | 1034.620690 | 1535.082162 | -3500.0 | 950.0 | 1831.0 | 1973.00 | 2015.0 |
Indonesia | 79.0 | 1817.215190 | 1151.475182 | -8050.0 | 1938.5 | 2000.0 | 2015.50 | 2020.0 |
Japan, Taiwan, Marianas | 109.0 | 942.174312 | 2276.020475 | -9540.0 | 1190.0 | 1919.0 | 1996.00 | 2020.0 |
Kamchatka and Mainland Asia | 69.0 | -359.507246 | 2772.805352 | -8050.0 | -1550.0 | 390.0 | 1907.00 | 2020.0 |
Kuril Islands | 31.0 | 1647.354839 | 1694.989863 | -7480.0 | 1899.0 | 1957.0 | 2011.50 | 2020.0 |
Mediterranean and Western Asia | 30.0 | -590.166667 | 3040.350400 | -8300.0 | -1975.0 | 95.0 | 1864.00 | 2020.0 |
Melanesia and Australia | 43.0 | 1529.046512 | 1374.555235 | -4946.0 | 1899.0 | 1972.0 | 2014.00 | 2020.0 |
Middle East and Indian Ocean | 20.0 | 707.950000 | 2337.536658 | -6050.0 | 647.5 | 1801.5 | 2004.25 | 2020.0 |
México and Central America | 49.0 | 1061.857143 | 1902.070878 | -6050.0 | 1270.0 | 1953.0 | 2016.00 | 2020.0 |
New Zealand to Fiji | 38.0 | 1373.210526 | 1547.841866 | -5060.0 | 1577.5 | 1968.0 | 2008.00 | 2020.0 |
Philippines and SE Asia | 26.0 | 823.461538 | 2371.108825 | -6050.0 | 1376.0 | 1871.0 | 1949.50 | 2020.0 |
South America | 115.0 | 578.252174 | 2415.602579 | -6890.0 | 272.5 | 1850.0 | 1986.00 | 2020.0 |
West Indies | 12.0 | 1493.416667 | 691.394961 | 160.0 | 1182.5 | 1849.0 | 1983.50 | 2017.0 |
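If only a few statistics are needed, `agg` can replace `describe`. The sketch below uses a tiny hand-built sample (`sample` and `summary` are illustrative names; the eruption years are taken from rows shown in this tutorial) and also shows that missing values (NaN) are left out of `count`.

```python
import pandas as pd

sample = pd.DataFrame({
    "Region": ["Mediterranean and Western Asia"] * 3
              + ["Canada and Western USA"] * 2,
    "Last_Eruption_Year": [-8300.0, -4040.0, float("nan"), 1250.0, 2008.0],
})

# Split rows by region, then compute selected statistics per group
summary = sample.groupby("Region")["Last_Eruption_Year"].agg(["count", "min", "max"])
print(summary)
```

This is why the `count` reported by `describe` can be smaller than the number of rows: volcanoes without a dated eruption are skipped.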
[8]:
# Save the results of your analysis
results = df.groupby('Region').Last_Eruption_Year.describe()
results.to_csv('last_eruption_year_stats.csv')
[9]:
df.Elevation.plot.hist()
[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fd5b32fdd90>
[10]:
df.groupby('Region').Volcano_Name.count().sort_values().plot.barh()
[10]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fd5b2c79850>
Make a new plot!
Change the query to get eruption information
Vector data with Geopandas
Since the volcano database includes geolocation information, we can visualize it on a map!
[11]:
# Now load query results as json directly in geopandas
query = 'service=WFS&version=2.0.0&request=GetFeature&typeName=GVP-VOTW:Smithsonian_VOTW_Holocene_Volcanoes&outputFormat=json'
gf = gpd.read_file(server+query)
print(type(gf))
gf.head()
<class 'geopandas.geodataframe.GeoDataFrame'>
[11]:
 | id | Volcano_Number | Volcano_Name | Primary_Volcano_Type | Last_Eruption_Year | Country | Geological_Summary | Region | Subregion | Latitude | Longitude | Elevation | Tectonic_Setting | Geologic_Epoch | Evidence_Category | Primary_Photo_Link | Primary_Photo_Caption | Primary_Photo_Credit | Major_Rock_Type | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210010 | West Eifel Volcanic Field | Maar(s) | -8300.0 | Germany | The West Eifel Volcanic Field of western Germa... | Mediterranean and Western Asia | Western Europe | 50.170 | 6.85 | 600 | Rift zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0150... | The lake-filled Weinfelder maar is one of abou... | Photo by Richard Waitt, 1990 (U.S. Geological ... | Foidite | POINT (6.85000 50.17000) |
1 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210020 | Chaine des Puys | Lava dome(s) | -4040.0 | France | The Chaîne des Puys, prominent in the history ... | Mediterranean and Western Asia | Western Europe | 45.775 | 2.97 | 1464 | Rift zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0880... | The central part of the Chaîne des Puys volcan... | Photo by Ichio Moriya (Kanazawa University). | Basalt / Picro-Basalt | POINT (2.97000 45.77500) |
2 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210030 | Olot Volcanic Field | Pyroclastic cone(s) | NaN | Spain | The Olot volcanic field (also known as the Gar... | Mediterranean and Western Asia | Western Europe | 42.170 | 2.53 | 893 | Intraplate / Continental crust (> 25 km) | Holocene | Evidence Credible | https://volcano.si.edu/gallery/photos/GVP-1199... | The forested Volcà Montolivet scoria cone rise... | Photo by Puigalder (Wikimedia Commons). | Trachybasalt / Tephrite Basanite | POINT (2.53000 42.17000) |
3 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 210040 | Calatrava Volcanic Field | Pyroclastic cone(s) | -3600.0 | Spain | The Calatrava volcanic field lies in central S... | Mediterranean and Western Asia | Western Europe | 38.870 | -4.02 | 1117 | Intraplate / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-1185... | Columba volcano, the youngest known vent of th... | Photo by Rafael Becerra Ramírez, 2006 (Univers... | Basalt / Picro-Basalt | POINT (-4.02000 38.87000) |
4 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 211003 | Vulsini | Caldera | -104.0 | Italy | The Vulsini volcanic complex in central Italy ... | Mediterranean and Western Asia | Italy | 42.600 | 11.93 | 800 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Observed | https://volcano.si.edu/gallery/photos/GVP-0150... | The 16-km-wide Bolsena caldera containing Lago... | Photo by Richard Waitt, 1985 (U.S. Geological ... | Trachyte / Trachydacite | POINT (11.93000 42.60000) |
[12]:
# NOTE: this looks the same as the dataframe from before,
# but it is actually a 'geodataframe' with a specified coordinate reference system (crs)
print(type(gf))
print(gf.crs)
<class 'geopandas.geodataframe.GeoDataFrame'>
epsg:4326
[13]:
# The same indexing and operations work with geodataframes
gf.iloc[2]
[13]:
id Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae...
Volcano_Number 210030
Volcano_Name Olot Volcanic Field
Primary_Volcano_Type Pyroclastic cone(s)
Last_Eruption_Year NaN
Country Spain
Geological_Summary The Olot volcanic field (also known as the Gar...
Region Mediterranean and Western Asia
Subregion Western Europe
Latitude 42.17
Longitude 2.53
Elevation 893
Tectonic_Setting Intraplate / Continental crust (> 25 km)
Geologic_Epoch Holocene
Evidence_Category Evidence Credible
Primary_Photo_Link https://volcano.si.edu/gallery/photos/GVP-1199...
Primary_Photo_Caption The forested Volcà Montolivet scoria cone rise...
Primary_Photo_Credit Photo by Puigalder (Wikimedia Commons).
Major_Rock_Type Trachybasalt / Tephrite Basanite
geometry POINT (2.53 42.17)
Name: 2, dtype: object
[14]:
# But now we have a variety of spatial operations at our disposal
# Subsetting is very easy in Geopandas. Often we only want points in a certain bounding box
ymin, ymax, xmin, xmax = [45, 49, -124, -120]
subset = gf.cx[xmin:xmax, ymin:ymax]
subset
[14]:
 | id | Volcano_Number | Volcano_Name | Primary_Volcano_Type | Last_Eruption_Year | Country | Geological_Summary | Region | Subregion | Latitude | Longitude | Elevation | Tectonic_Setting | Geologic_Epoch | Evidence_Category | Primary_Photo_Link | Primary_Photo_Caption | Primary_Photo_Credit | Major_Rock_Type | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
919 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 321010 | Baker | Stratovolcano(es) | 1880.0 | United States | Mount Baker, the northernmost of Washington's ... | Canada and Western USA | USA (Washington) | 48.777 | -121.813 | 3285 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Observed | https://volcano.si.edu/gallery/photos/GVP-0273... | The glaciated Mount Baker is the northernmost ... | Photo by Lee Siebert, 1981 (Smithsonian Instit... | Andesite / Basaltic Andesite | POINT (-121.81300 48.77700) |
920 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 321020 | Glacier Peak | Stratovolcano | 1700.0 | United States | Glacier Peak, the most isolated of the Cascade... | Canada and Western USA | USA (Washington) | 48.112 | -121.113 | 3213 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0276... | Glacier Peak rises above the forested slopes o... | Photo by Lee Siebert, 1985 (Smithsonian Instit... | Dacite | POINT (-121.11300 48.11200) |
921 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 321030 | Rainier | Stratovolcano | 1450.0 | United States | Mount Rainier, the highest peak in the Cascade... | Canada and Western USA | USA (Washington) | 46.853 | -121.760 | 4392 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0456... | Mount Rainier is located east of the Puget Sou... | Photo by Lee Siebert, 1981 (Smithsonian Instit... | Andesite / Basaltic Andesite | POINT (-121.76000 46.85300) |
922 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 321040 | Adams | Stratovolcano | 950.0 | United States | Although lower in height than its neighbor to ... | Canada and Western USA | USA (Washington) | 46.206 | -121.490 | 3742 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0281... | Mount Adams in the Cascade Range is seen here ... | Photo by Lee Siebert, 1981 (Smithsonian Instit... | Andesite / Basaltic Andesite | POINT (-121.49000 46.20600) |
923 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 321050 | St. Helens | Stratovolcano | 2008.0 | United States | Prior to 1980, Mount St. Helens formed a conic... | Canada and Western USA | USA (Washington) | 46.200 | -122.180 | 2549 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Observed | https://volcano.si.edu/gallery/photos/GVP-0549... | The 1980 eruption of Mount St. Helens dramatic... | Photo by Lyn Topinka, 1981 (U.S. Geological Su... | Dacite | POINT (-122.18000 46.20000) |
924 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 321060 | West Crater | Volcanic field | -5750.0 | United States | West Crater, a small andesitic lava dome with ... | Canada and Western USA | USA (Washington) | 45.880 | -122.080 | 1329 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-1072... | West Crater is a little-known Quaternary volca... | Photo by Lee Siebert, 2002 (Smithsonian Instit... | Andesite / Basaltic Andesite | POINT (-122.08000 45.88000) |
925 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 321070 | Indian Heaven | Shield(s) | -6250.0 | United States | The Pleistocene-to-Holocene Indian Heaven volc... | Canada and Western USA | USA (Washington) | 45.930 | -121.820 | 1806 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Dated | https://volcano.si.edu/gallery/photos/GVP-0295... | The youngest eruption of the Indian Heaven vol... | Photo by Lee Siebert, 1995 (Smithsonian Instit... | Basalt / Picro-Basalt | POINT (-121.82000 45.93000) |
926 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 322010 | Hood | Stratovolcano | 1866.0 | United States | Mount Hood, Oregon's highest peak, forms a pro... | Canada and Western USA | USA (Oregon) | 45.374 | -121.695 | 3426 | Subduction zone / Continental crust (> 25 km) | Holocene | Eruption Observed | https://volcano.si.edu/gallery/photos/GVP-0296... | Sharp-topped Mount Hood, Oregon's highest peak... | Photo by Richard Fiske (Smithsonian Institution). | Andesite / Basaltic Andesite | POINT (-121.69500 45.37400) |
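For point data, a bounding-box selection amounts to a range filter on the coordinates. As a rough sketch, the same subset can be approximated on a plain DataFrame using the Latitude and Longitude columns. The `pts` table is illustrative, with coordinates copied from rows in this tutorial; for lines or polygons `.cx` selects anything that intersects the box, which this sketch does not cover.

```python
import pandas as pd

pts = pd.DataFrame({
    "Volcano_Name": ["Baker", "Rainier", "Hood", "Shasta"],
    "Latitude": [48.777, 46.853, 45.374, 41.409],
    "Longitude": [-121.813, -121.760, -121.695, -122.193],
})

# Bounding box: x is longitude, y is latitude
xmin, xmax, ymin, ymax = -124, -120, 45, 49

# Keep rows whose coordinates fall inside the box (bounds included)
in_box = pts["Longitude"].between(xmin, xmax) & pts["Latitude"].between(ymin, ymax)
subset_names = pts.loc[in_box, "Volcano_Name"].tolist()
print(subset_names)
```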
[15]:
# By default, Geopandas plots the geometry of each entry (row) in a table; here, a point at each longitude and latitude
subset.plot()
[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fd5b2b588d0>
[16]:
# Maybe we want to get a polygon that encloses all those points
# Geopandas uses shapely under the hood
import shapely.geometry
point_collection = shapely.geometry.MultiPoint(subset.geometry.tolist())
polygon = point_collection.convex_hull
polygon
[16]:
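A convex hull is the smallest convex polygon that encloses a set of points. The following sketch reproduces the idea with shapely alone, using a handful of (longitude, latitude) pairs copied from the table above; `coords` and `hull` are illustrative names.

```python
from shapely.geometry import MultiPoint, Point

# (longitude, latitude) pairs for Baker, Glacier Peak, Rainier, Hood, St. Helens
coords = [
    (-121.813, 48.777),
    (-121.113, 48.112),
    (-121.760, 46.853),
    (-121.695, 45.374),
    (-122.180, 46.200),
]

hull = MultiPoint(coords).convex_hull
print(hull.geom_type)

# Points that are not hull vertices end up inside the polygon
rainier_inside = hull.contains(Point(-121.760, 46.853))
print(rainier_inside)
```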
[17]:
# We can convert that polygon to a new CRS easily with geopandas
# For example, convert to UTM to get area in units of square meters
# https://spatialreference.org/ref/epsg/wgs-84-utm-zone-10n/
# EPSG:32610
# First wrap the polygon in a GeoDataFrame, declaring its current CRS (EPSG:4326)
gfShape = gpd.GeoDataFrame(geometry=[polygon], crs='EPSG:4326')
gfShape
[17]:
 | geometry |
---|---|
0 | POLYGON ((-121.69500 45.37400, -122.08000 45.8... |
[18]:
print('Polygon area km^2')
area = gfShape.to_crs(epsg=32610).area * 1e-6
area
Polygon area km^2
[18]:
0 16918.631068
dtype: float64
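Reprojecting matters because an area computed directly on longitude/latitude coordinates is in "square degrees", which has no physical meaning. The sketch below illustrates this with a hypothetical 1-degree by 1-degree box (not the tutorial's polygon), assuming geopandas and pyproj are installed.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical 1-degree by 1-degree box inside UTM zone 10N
square = gpd.GeoSeries([box(-122, 45, -121, 46)], crs="EPSG:4326")

# Area of the raw shapely geometry: units are square degrees
area_deg2 = square.iloc[0].area

# Area after projecting to UTM zone 10N (meters), converted to km^2
area_km2 = square.to_crs(epsg=32610).area.iloc[0] * 1e-6

print(area_deg2, area_km2)
```

UTM zone 10N is only appropriate for data near longitudes 126°W to 120°W; for other regions a different projected CRS would be needed.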
[19]:
# Save shape as geospatial vector format for GIS software
myshape = gfShape.to_crs(epsg=32610)
myshape.to_file('myshape.gpkg', driver='GPKG')
[20]:
# Finally, let's say you have a different polygon and want to extract all the volcanoes in it
# This is referred to as a 'spatial join' http://geopandas.org/mergingdata.html
# gpd has some built-in datasets from the Natural Earth project https://www.naturalearthdata.com
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world
[20]:
 | pop_est | continent | name | iso_a3 | gdp_md_est | geometry |
---|---|---|---|---|---|---|
0 | 920938 | Oceania | Fiji | FJI | 8374.0 | MULTIPOLYGON (((180.00000 -16.06713, 180.00000... |
1 | 53950935 | Africa | Tanzania | TZA | 150600.0 | POLYGON ((33.90371 -0.95000, 34.07262 -1.05982... |
2 | 603253 | Africa | W. Sahara | ESH | 906.5 | POLYGON ((-8.66559 27.65643, -8.66512 27.58948... |
3 | 35623680 | North America | Canada | CAN | 1674000.0 | MULTIPOLYGON (((-122.84000 49.00000, -122.9742... |
4 | 326625791 | North America | United States of America | USA | 18560000.0 | MULTIPOLYGON (((-122.84000 49.00000, -120.0000... |
... | ... | ... | ... | ... | ... | ... |
172 | 7111024 | Europe | Serbia | SRB | 101800.0 | POLYGON ((18.82982 45.90887, 18.82984 45.90888... |
173 | 642550 | Europe | Montenegro | MNE | 10610.0 | POLYGON ((20.07070 42.58863, 19.80161 42.50009... |
174 | 1895250 | Europe | Kosovo | -99 | 18490.0 | POLYGON ((20.59025 41.85541, 20.52295 42.21787... |
175 | 1218208 | North America | Trinidad and Tobago | TTO | 43570.0 | POLYGON ((-61.68000 10.76000, -61.10500 10.890... |
176 | 13026129 | Africa | S. Sudan | SSD | 20880.0 | POLYGON ((30.83385 3.50917, 29.95350 4.17370, ... |
177 rows × 6 columns
[21]:
# Get volcanoes of Colombia
colombia = world.query('name == "Colombia"')
colombia
[21]:
 | pop_est | continent | name | iso_a3 | gdp_md_est | geometry |
---|---|---|---|---|---|---|
32 | 47698524 | South America | Colombia | COL | 688000.0 | POLYGON ((-66.87633 1.25336, -67.06505 1.13011... |
[22]:
# Note: geopandas 0.10 and later name this argument `predicate` instead of `op`
colombian_volcanoes = gpd.sjoin(gf, colombia, how="inner", op='within')
colombian_volcanoes
[22]:
 | id | Volcano_Number | Volcano_Name | Primary_Volcano_Type | Last_Eruption_Year | Country | Geological_Summary | Region | Subregion | Latitude | ... | Primary_Photo_Caption | Primary_Photo_Credit | Major_Rock_Type | geometry | index_right | pop_est | continent | name | iso_a3 | gdp_md_est |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1106 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351011 | Romeral | Stratovolcano | -5390.0 | Colombia | Recent work has documented the northernmost Ho... | South America | Colombia | 5.203 | ... | Romeral, a recently documented Holocene volcan... | NASA Landsat 7 image (worldwind.arc.nasa.gov) | Andesite / Basaltic Andesite | POINT (-75.36300 5.20300) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1107 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351012 | Bravo, Cerro | Stratovolcano | 1720.0 | Colombia | Cerro Bravo is a relatively low dominantly dac... | South America | Colombia | 5.091 | ... | Cerro Bravo is seen from Delgaditas on its eas... | Photo by David Lescinsky, 1988 (University of ... | Dacite | POINT (-75.29300 5.09100) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1108 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351021 | Santa Isabel | Shield | -850.0 | Colombia | Santa Isabel is a small andesitic shield volca... | South America | Colombia | 4.818 | ... | Santa Isabel is a small, glacier-clad shield v... | Photo by Norm Banks, 1985 (U.S. Geological Sur... | Andesite / Basaltic Andesite | POINT (-75.36500 4.81800) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1109 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351020 | Ruiz, Nevado del | Stratovolcano | 2020.0 | Colombia | Nevado del Ruiz is a broad, glacier-covered vo... | South America | Colombia | 4.892 | ... | Nevado del Ruiz is a broad, glacier-covered sh... | Photo by Norm Banks, 1985 (U.S. Geological Sur... | Andesite / Basaltic Andesite | POINT (-75.32400 4.89200) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1110 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351030 | Tolima, Nevado del | Stratovolcano | 1943.0 | Colombia | The steep-sided, glacier-clad Nevado del Tolim... | South America | Colombia | 4.658 | ... | The steep-sided, glacier-clad Tolima volcano i... | Photo by Tom Pierson, 1985 (U.S. Geological Su... | Andesite / Basaltic Andesite | POINT (-75.33000 4.65800) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1111 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351040 | Machin | Stratovolcano | 1180.0 | Colombia | The small Cerro Machín stratovolcano lies at t... | South America | Colombia | 4.487 | ... | Two central dacitic domes of Cerro Machín volc... | Photo by José Macías, 1996 (Universidad Autómo... | Dacite | POINT (-75.38900 4.48700) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1112 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351050 | Huila, Nevado del | Stratovolcano | 2012.0 | Colombia | Nevado del Huila, the highest peak in the Colo... | South America | Colombia | 2.930 | ... | Huila, the highest active volcano in Colombia,... | Photo by Juan Carlos Diago, 1995 (courtesy of ... | Andesite / Basaltic Andesite | POINT (-76.03000 2.93000) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1113 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351061 | Sotara | Stratovolcano | NaN | Colombia | Volcán Sotará, also known as Cerro Azafatudo, ... | South America | Colombia | 2.108 | ... | None | None | Andesite / Basaltic Andesite | POINT (-76.59200 2.10800) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1114 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351060 | Purace | Stratovolcano(es) | 1977.0 | Colombia | One of the most active volcanoes of Colombia, ... | South America | Colombia | 2.320 | ... | Snow-capped Puracé volcano has a 500-m-wide su... | Photo by Federmán Escobar Chávez, 2005. | Andesite / Basaltic Andesite | POINT (-76.40000 2.32000) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1115 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351070 | Dona Juana | Stratovolcano | 1906.0 | Colombia | The forested Doña Juana stratovolcano contains... | South America | Colombia | 1.500 | ... | None | None | Andesite / Basaltic Andesite | POINT (-76.93600 1.50000) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1116 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351080 | Galeras | Complex | 2014.0 | Colombia | Galeras, a stratovolcano with a large breached... | South America | Colombia | 1.220 | ... | Galeras, a stratovolcano with a large breached... | Photo by Norm Banks, 1989 (U.S. Geological Sur... | Andesite / Basaltic Andesite | POINT (-77.37000 1.22000) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1117 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351090 | Azufral | Stratovolcano | -930.0 | Colombia | Azufral stratovolcano in southern Colombia, al... | South America | Colombia | 1.080 | ... | Azufral stratovolcano in southern Colombia, se... | Photo by Norm Banks, 1989 (U.S. Geological Sur... | Dacite | POINT (-77.68000 1.08000) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
1118 | Smithsonian_VOTW_Holocene_Volcanoes.fid--71eae... | 351100 | Cumbal | Stratovolcano | 1926.0 | Colombia | Many youthful lava flows extend from the glaci... | South America | Colombia | 0.950 | ... | Cumbal is the southernmost historically active... | Photo by Norm Banks, 1989 (U.S. Geological Sur... | Andesite / Basaltic Andesite | POINT (-77.87000 0.95000) | 32 | 47698524 | South America | Colombia | COL | 688000.0 |
13 rows × 26 columns
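At its core, a spatial join with the 'within' relationship tests each point against each polygon. The sketch below shows that test with shapely alone. The rectangular `region` is a hypothetical stand-in for a country outline, and the point coordinates are copied from rows in this tutorial.

```python
from shapely.geometry import Point, box

# Hypothetical rectangular region standing in for a country polygon
region = box(-80, 0, -70, 10)

# (longitude, latitude) points for three volcanoes
volcano_points = {
    "Galeras": Point(-77.37, 1.22),
    "Purace": Point(-76.40, 2.32),
    "Shasta": Point(-122.193, 41.409),
}

# Keep the names of points that fall inside the region
inside = [name for name, pt in volcano_points.items() if pt.within(region)]
print(inside)
```

A spatial join performs this test for every pair of rows and, in addition, attaches the columns of the matching polygon to each point, which is why the result above has extra columns such as `pop_est` and `continent`.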
Visualization with holoviz
The holoviz libraries (here geoviews and hvplot) are well suited to plotting geographic data on interactive maps.
[23]:
import geoviews as gv
import hvplot.pandas
print('Geoviews version: ', gv.__version__)
print('hvplot version: ', hvplot.__version__)
Geoviews version: 1.8.1
hvplot version: 0.6.0
[24]:
# Geoviews offers many basemaps
tiles = gv.tile_sources.StamenTerrain()
tiles
[24]:
[25]:
# hvplot makes it easy to plot dataframes or geodataframes
volcano_names = gf.loc[:,['Volcano_Name','geometry']]
points = volcano_names.hvplot(geo=True, hover_cols=['Volcano_Name'], frame_width=600)
points
[25]:
[26]:
# Combining data in geoviews is done like so:
tiles * points
[26]:
Recreate bar and histogram plots with hvplot!