Data story behind a Global Fishing Power BI report

, 2019-03-01

(2019-Mar-01) The first time I heard about the Data For A Cause initiatives was at one of the Power BI meetups in Toronto. And I actually didn't attend that meeting myself, but only saw pictures and learned a bit more about this virtual community of data professionals and Olga Tsubiks, who is the lead of this group. They help not-for-profit organizations to explore their datasets and create visualizations to assist them build awareness and better understanding. 

This time the Data for a Cause challenge was to work with Global Fishing Watch and visualize the fishing activity of vessels fishing in other country's waters. I was able to create a Power BI report using their data, however, this blog post is not about best practices of creating analytical reports in Power BI, but it's rather a data story of working with provided datasets and a thinking process of creating an interactive tool to visually explore fishing data.

Link to Global Fishing Power BI report.





Data

Fishing dataset had been presented in a flat file format with the following list of attributes:
       fishing_year
      ,fishing_date: a string in format “YYYY-MM-DD”
      ,fishing_lat_bin: grid cell
      ,fishing_lon_bin: grid cell
      ,fishing_mmsi: unique ID of the fishing vessel, AIS tracking device
      ,fishing_fishing_hours: hours the vessels were present in this location on this day
      ,vessel_mmsi: same as fishing_mmsi
      ,vessel_shipname
      ,vessel_callsign
      ,vessel_flag:  the country of the fishing vessel, in iso3 value
      ,vessel_imo
      ,vessel_registry_geartype: vessel gear types. The column Vessel Registry Geartype refers to the official purpose of the vessel. By analyzing the movement of the boat, GFW identified the vessel as a fishing vessel (even if registered otherwise) - Vessel Inferred Geartype.
      ,vessel_inferred_geartype
      ,vessel_inferred_geartype_score
      ,vessel_inferred_subgeartype
      ,vessel_inferred_subgeartype_score
      ,vessel_registry_length
      ,vessel_inferred_length
      ,vessel_registry_tonnage
      ,vessel_inferred_tonnage
      ,vessel_registry_engine_power
      ,vessel_inferred_engine_power
      ,vessel_source
      ,ISO_Ter1: the country where the fishing occurred (an exclusive economic zone of, in iso3). 
      ,ISO_Ter2
      ,ISO_Ter3

Data file contained 6,929,885 rows and it covered 5 years of fishing activities from 2012 to 2016.

Additionally, I pulled the list country codes with their names from this Wikipedia resource - https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes.
And I also found a Capture fisheries production dataset from the World Data Bank that contained data for the period of time from 1960 to 2016. This datasets was in an Excel file format and where one of the worksheets contained list of countries with some additional attributes, which I thought would be a good candidate for my country profile data table.


Eventually, after pulling data from the main data file, Wikipedia county codes list and fish capture data, my final data model in Power BI looked the following way, where I replicated the county code table both for Vessel & Fishing Area country info.




Metrics & Relationships between different data attributes

This was my thinking process to create a visual story with this data model:
1) Metrics:
    - Total Fishing Hours
    - Counts of Vessels (by Country, by Types/SubTypes)
    - Fish Caught (other data set)
2) Relationships:
    - Fish Caught by Country and its fishing activity on the map
    - Relationships between Vessel countries and Fishing area countries
3) Additional consideration to use maps since the Latitude/Longitude coordinates were provided.

Creating visualizations in Power BI

Summary report




For the summary page report, I used the following custom and standard visual components:
- Information cards to show counts of vessels, fishing hours and fish caught. 
Enlighten World Flag Slicer to filter vessel countries
Drill-down donut chart by ZoomCharts to show country fleet vessel distribution by vessel type
- Area and Bar charts to show fish capture data
- Narrative Science for Power BI component to create natural language narratives based on fish capture time series. I really liked this component in Power BI since it allows me to automatically and interactively generate a textual description of my numeric data, it also helps to show any potential trendlines in my dataset. And I had already blogged about this Narrative Science component for Power BI in the past: http://datanrg.blogspot.com/search?q=narrative+science.

Example of narratives generated for one selected country:



Heat map




For the second page of my report, I decided to use Heatmap visualization to show fishing hours density across different regions of the world. Latitude & longitude coordinates helped me to plot vessel fishing events on this map along with fishing hours to show different scales of heat.
Timeline slicer was used to select a date range for my heat map using the fishing_date attribute.


Flow map



Knowing that my main data set contained information about vessel and fishing area countries, it came naturally to assess how both of them relate to each, i.e. to analyze how each vessel country fish in another country's waters.

And Flow map visualization worked well for this type of analysis, where you can set Origin and Destination fields from your dataset. Also, you can set whether your connections would be in a form of straight or curved lines. 

In overall this Flow Map visualization helped to analyze fishing activities of richer countries in poorer countries' waters. And you could select a country vessels flow with the help of the filter at the top of this report page.

Summary
- Working with open data sets is always an interesting experience, especially when it contains geo coordinates that you can utilize with various map visualizations.
- I hope that this Power BI report does help to analyze fishing activities in the world. And if you are aware of other nonprofit organizations that I can help with this type of data analytics, please let me know, it would be my way of giving back to the community.

Rate

Share

Share

Rate

Related content

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

2009-02-23

1,567 reads

Networking - Part 4

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...

2009-02-17

1,530 reads

Speaking at Community Events - More Thoughts

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.

2009-02-13

360 reads