Scraping and visualizing 117 years of hurricane data, Part 2

Per the Part 1 post, we now have two data files: one with storm-level information and another, larger file with path-level information. Now let’s build some visualizations and try to answer our questions about Atlantic hurricanes.

For my workbook, I joined the two files on the field hLink, which is a snippet of the URL path where we scraped the data for each hurricane. A few storms in 2014 and 2015 instead link to PDF files on the NOAA website, which I used to supplement a handful of cases of missing path-level data. This wasn’t as painful as it sounds, since the PDFs had tables in basically the same format as wunderground’s. The important part about the hLink field is that it’s distinct to each storm. Something like storm name wouldn’t be, since names can repeat, and it wasn’t until the 1950s that we even started using human names for storms.
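If you’d rather do the join outside Tableau, here is a minimal pandas sketch of the same idea. Only the hLink field comes from the actual data; the other column names and all of the values below are made-up placeholders, not scraped records. Note the two storms that share a name but have different hLink values, which is why the join key can’t be the name.

```python
import pandas as pd

# Storm-level file: one row per storm (placeholder values, not real data).
storms = pd.DataFrame({
    "hLink": ["/year-a/storm-x", "/year-b/storm-x", "/year-b/storm-y"],
    "name": ["X", "X", "Y"],  # names can repeat, hLink does not
})

# Path-level file: many rows per storm, one per observation.
paths = pd.DataFrame({
    "hLink": ["/year-a/storm-x", "/year-a/storm-x", "/year-b/storm-x", "/year-b/storm-y"],
    "lat": [25.0, 26.1, 18.4, 30.2],
})

# Left join keeps every path row; validate raises an error if hLink
# turns out not to be unique on the storm-level side.
joined = paths.merge(storms, on="hLink", how="left", validate="many_to_one")
```

The validate argument is a cheap way to confirm the "distinct to each storm" assumption before building anything on top of the join.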

The scraped longitude value requires one minor tweak for visualization work. In the tables, longitude is sometimes cited as a positive number with a “W” to indicate that it’s west of the prime meridian. Tableau interprets west longitudes as negative numbers, so we just create an adjusted longitude value. Remember that we only scraped Atlantic storms, so every adjusted longitude should be negative.
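The same adjustment can be sketched in Python. The exact raw format is an assumption here: I’m guessing at strings like "80.3 W" or already-signed values like "-80.3", and adjust_longitude is just an illustrative helper name.

```python
def adjust_longitude(raw: str) -> float:
    """Convert a scraped longitude string to a signed decimal value.

    Assumed input formats: "80.3 W", "80.3W", "12.5E", or an already
    signed number such as "-80.3". West is negative, east is positive.
    """
    text = raw.strip().upper()
    if text.endswith("W"):
        return -abs(float(text[:-1]))
    if text.endswith("E"):
        return abs(float(text[:-1]))
    return float(text)
```

For example, adjust_longitude("80.3 W") gives -80.3, and a value that is already negative passes through unchanged.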

Another calculation I found useful was to isolate the top strength of each storm, which can be done with a level-of-detail calculation that finds the max category strength at the storm level. Tropical depressions carry a category value of -1, so a storm that never rose above tropical depression strength has a Max Category of -1. Using this calc, I filtered those storms out.
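In Tableau this would presumably be a FIXED level-of-detail expression along the lines of { FIXED [hLink] : MAX([Category]) }, where the field name Category is my guess at the naming. A rough pandas equivalent is a grouped transform. The sample rows below are invented, and treating 0 as tropical storm strength is an assumption; only the -1 for tropical depressions comes from the data described above.

```python
import pandas as pd

# Path-level rows with a category per observation (invented values).
# Assumed coding: -1 = tropical depression, 0 = tropical storm, 1-5 = hurricane.
paths = pd.DataFrame({
    "hLink": ["storm-a", "storm-a", "storm-a", "storm-b", "storm-b"],
    "category": [-1, 0, 2, -1, -1],
})

# Storm-level max, repeated on every row of that storm (like a FIXED LOD).
paths["max_category"] = paths.groupby("hLink")["category"].transform("max")

# Drop storms that never rose above tropical depression strength.
kept = paths[paths["max_category"] > -1]
```

Using transform rather than a plain aggregate keeps the result at the path level, so the filter removes every row of a weak storm at once.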

Now we can answer some research questions:

  1. When is hurricane season, and how common is it for storms to form in off-peak months?
    It’s conventional wisdom that storm season reaches its height in late summer and early fall, but rare off-peak storms can spring up in any month. If you read XKCD, you probably knew this already.
  2. Has there been an increase in the number of storms in recent years?
    It’s been widely reported in climate research that as the seas warm, hurricanes have an easier time forming. The number of named storms is noisy from year to year, but if we run a 10-year moving average, you can see a noticeable uptick in storm volume since 1990.

    To look at this another way, we can rank the top years for storm activity. Notice how the majority of the top 10 are after the year 2000.
  3. Are storms growing more intense over time?
    Finally, some good news. While storms appear to be growing more numerous, we aren’t necessarily breaking wind speed records. Hurricanes Camille (1969) and Allen (1980) are the only two Atlantic storms to hit the 190 mph mark on wind speed.
  4. What are the typical paths for hurricanes striking Florida?
    If we break out the Florida-crossing storms by decade, it’s interesting that the 1940s was the most active period for Floridians. The line thickness and color are encoded based on the strength of the hurricane, so you can see the paths for stronger hurricanes like Andrew (1992), Donna (1960), and Matthew (2016).
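For question 2 above, the 10-year moving average and the ranking of top years can be sketched in pandas as follows. The yearly counts here are deliberately synthetic (ten flat years followed by two busier ones) so the arithmetic is easy to check by eye; they are not the real storm counts, and the years are just labels.

```python
import pandas as pd

# Synthetic named-storm counts per year: ten years at 10, then two at 20.
counts = pd.Series([10] * 10 + [20] * 2, index=range(2001, 2013), name="named_storms")

# Trailing 10-year moving average; the first nine years have no full window.
moving_avg = counts.rolling(window=10).mean()

# Rank the most active years.
top_years = counts.nlargest(2)
```

With this toy series the moving average is undefined for the first nine years, sits at 10.0 in 2010, and climbs to 11.0 and 12.0 as the two busier years enter the window, which is the kind of smoothed uptick the chart shows.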

In Part 3, I’ll go into some of the styling techniques used to create these visualizations. In the meantime, here’s a Tableau Public visualization for exploring every year of Atlantic hurricanes.
