Here are a few things that people have asked me to explain.
I decided to go w/ Are We Alone In The Universe b/c there is a lot of information on Wikipedia (the required data source) about planets discovered by the Kepler Spacecraft.
I wanted to start w/ a Splash Page to set the tone and (hopefully) grab the viewer's attention. I used a couple of free pictures I found on the web for the iPad and the background sky. Since March is Women in History (and science) month, I was SURE to use an image w/ a female hand.
It was easy to copy/paste this data into a spreadsheet.... but I wanted additional data that could only be found by clicking on the Planet link. That's where Python came in handy. Python allowed me to quickly scrape data from 548 webpages in just a few minutes. Here's the code and what it does.
import requests
from bs4 import BeautifulSoup
import json
I set my stars = all of the 548 planets found on the above link (this was as simple as copying the data in my planet column, pasting it between the brackets [ ], and adding single quotes around each planet name).
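For anyone who would rather not quote 548 names by hand, the list can also be built from the pasted column itself. This is only a minimal sketch, not the code used for the Viz: the three planet names are sample values, `build_star_list` and `wiki_url` are hypothetical helpers, and the URL pattern (en.wikipedia.org/wiki/ followed by the planet name) is an assumption about where each planet's page lives.

```python
from urllib.parse import quote

# Sample names only; the real column had 548 entries.
PASTED_COLUMN = """
Kepler-4b
Kepler-10b
Kepler-22b
"""

def build_star_list(pasted):
    """Turn a pasted spreadsheet column into a list of clean names."""
    return [line.strip() for line in pasted.splitlines() if line.strip()]

def wiki_url(name):
    """Assumed URL pattern: article titles use underscores for spaces."""
    return "https://en.wikipedia.org/wiki/" + quote(name.replace(" ", "_"))

stars = build_star_list(PASTED_COLUMN)
urls = [wiki_url(star) for star in stars]
```

Each entry in `urls` could then be handed to requests and BeautifulSoup, one page at a time.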
Note: for a better explanation of how to use Python, check out Curtis Harris' blog. http://curtisharris.weebly.com/blog/iron-viz-significant-births-throughout-history
My scraper ran surprisingly fast but returned a lot of data that I didn't need. I only wanted the Coordinates, Constellation, and Temperature. (see picture)
For me, it was easier to then just query the data clob for the key words 'Coordinates:', 'Constellation', and 'Temperature'. (Eventually I will improve my Python skills so I can scrape just the data that I need.) I then added a couple of columns to my result set (which was in Excel) and pasted in the results.
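The same keyword lookup could also be done in Python instead of by hand. Here is a minimal sketch, assuming the scraped text for each page is one long string with each label sitting at the start of a line next to its value. `SAMPLE_TEXT` is invented sample text and `extract_fields` is a hypothetical helper, not something from the original script.

```python
import re

KEYWORDS = ["Coordinates", "Constellation", "Temperature"]

# Invented text standing in for one scraped page.
SAMPLE_TEXT = """Discovery date 2011
Constellation Cygnus
Coordinates: 19h 16m 52s +47d 53m 04s
Temperature 262 K
"""

def extract_fields(text, keywords=KEYWORDS):
    """Return {keyword: value} for each keyword found at the start of a line."""
    found = {}
    for keyword in keywords:
        # Match the keyword, an optional colon, then the rest of that line.
        pattern = rf"^{re.escape(keyword)}:?\s*(.+)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        found[keyword] = match.group(1).strip() if match else None
    return found

fields = extract_fields(SAMPLE_TEXT)
```

Pages that are missing a keyword simply come back with None for that field, which lines up w/ the blank cells in the spreadsheet.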
As you can see, not all of the links had results, so I did a Left Outer join between my two data sets.
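For readers who have not used one, a left outer join keeps every row from the left table and leaves blanks where the right table has no match. Below is a small sketch in plain Python, assuming both data sets are keyed by planet name w/ at most one scraped row per planet. The rows and the `left_outer_join` helper are illustrative only, not the actual data or tool used.

```python
def left_outer_join(left_rows, right_rows, key):
    """Keep every left row; attach matching right-side fields or None."""
    right_by_key = {row[key]: row for row in right_rows}
    # Every right-side column except the join key, in first-seen order.
    extra_cols = []
    for row in right_rows:
        for col in row:
            if col != key and col not in extra_cols:
                extra_cols.append(col)
    joined = []
    for row in left_rows:
        match = right_by_key.get(row[key], {})
        merged = dict(row)
        for col in extra_cols:
            merged[col] = match.get(col)  # None when the page had no result
        joined.append(merged)
    return joined

# Made-up rows: the planet list on the left, scraped results on the right.
planets = [{"Planet": "Planet A", "Radius (Rj)": 0.10},
           {"Planet": "Planet B", "Radius (Rj)": 0.90}]
scraped = [{"Planet": "Planet A", "Constellation": "Lyra"}]

result = left_outer_join(planets, scraped, "Planet")
```

Planet B stays in the result even though nothing was scraped for it, which is the whole point of joining from the left.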
When I start a Viz I tend to throw in EVERYTHING, then start chip, chip, chipping away by asking three questions:
1. Does the information/workbook/graph relate directly to the subject?
2. Is the information/workbook/graph interesting and helpful?
3. Does the information/workbook/graph distract from the Viz?
Example: I had created a workbook called The Drake Equation (calculates the # of possible alien civilizations in the Universe.) I thought it was BRILLIANT!!! Users could play around w/ the variables (parameters) which affected the results... but when I asked some friends to check out the workbook, I found that most were confused. Each variable had to be explained and the results were pretty vague... so while Fun for the creator, it distracted from the Viz.
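For reference, the Drake Equation in its standard form multiplies seven factors together: N = R* x fp x ne x fl x fi x fc x L. It is usually stated for our own galaxy. The sketch below shows why each variable needed explaining. The function name and the input values are placeholders chosen only to show the arithmetic, not the parameters from the workbook.

```python
def drake(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """Multiply the seven Drake factors together.

    r_star   : stars formed per year
    f_p      : fraction of stars with planets
    n_e      : habitable planets per system with planets
    f_l      : fraction of those where life appears
    f_i      : fraction of those where intelligence appears
    f_c      : fraction of those that send detectable signals
    lifetime : years such a civilization keeps signaling
    """
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime

# Placeholder inputs, not estimates anyone should quote.
example = drake(r_star=1.0, f_p=0.5, n_e=2, f_l=1.0,
                f_i=0.1, f_c=0.1, lifetime=1000)
```

Because everything is multiplied, doubling any one guess doubles the answer, which is a big part of why the results feel so vague.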
For the Star Constellation Chart, I wanted to add line coordinates that could be used later w/ tool tips. I found the image that I wanted to use on Wikipedia and then drew some lines intersecting the stars I wanted to identify. I then pasted the picture into Excel and added #'s to the columns and rows to figure out my X and Y coordinates.
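One way to turn those grid numbers into plottable points is sketched below. It rests on a few assumptions: Excel rows are numbered from the top, so the row number is flipped to get a Y that counts upward, and each line is stored as one row per point w/ a drawing order. The grid positions, the 20-row grid, and both helper names are hypothetical.

```python
def grid_to_xy(col, row, total_rows):
    """Flip the top-down row number so Y counts up from the bottom."""
    return col, total_rows - row + 1

def line_points(line_name, cells, total_rows):
    """One output row per point: line name, drawing order, X, Y."""
    points = []
    for order, (col, row) in enumerate(cells, start=1):
        x, y = grid_to_xy(col, row, total_rows)
        points.append({"Line": line_name, "Order": order, "X": x, "Y": y})
    return points

# Hypothetical grid cells for one line through three stars.
sample = line_points("Line 1", [(3, 18), (7, 12), (12, 5)], total_rows=20)
```

Rows shaped like this can be pasted back into a spreadsheet, w/ X and Y as the axes and Order controlling how the points connect.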
For the Size of Planets Grouped, I wanted the Reference Planet on the top of each column... this required the creation of some calculated fields.
Planet Size Grouping Order is a simple IF statement that puts each planet into a numbered group based on size. I used this for my ordering (Earth Size first, then Super Earth Size, etc.)
if [Radius (Rj)] <= .1115 then 1
elseif [Radius (Rj)] > .1115 and [Radius (Rj)] <= .1784 then 2
elseif [Radius (Rj)] > .1784 and [Radius (Rj)] <= .5352 then 3
elseif [Radius (Rj)] > .5352 and [Radius (Rj)] <= 1.338 then 4
elseif [Radius (Rj)] > 1.338 then 5
end
Planet Size Grouping is the same IF statement except I used Descriptions instead of Numbers. This gave me my labels at the bottom.
if [Radius (Rj)] <= .1115 then 'Earth Size'
elseif [Radius (Rj)] > .1115 and [Radius (Rj)] <= .1784 then 'Super Earth Size'
elseif [Radius (Rj)] > .1784 and [Radius (Rj)] <= .5352 then 'Neptune Size'
elseif [Radius (Rj)] > .5352 and [Radius (Rj)] <= 1.338 then 'Jupiter Size'
elseif [Radius (Rj)] > 1.338 then 'Larger'
end
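A quick way to sanity-check both calculated fields outside Tableau is to mirror them in Python. Here is a sketch using the same four cutoffs (they work out to 1.25, 2, 6, and 15 Earth radii if one Earth radius is taken as 0.0892 Rj). The `size_group` helper is hypothetical and swaps the chain of IF tests for a sorted-list lookup.

```python
from bisect import bisect_left

CUTOFFS = [0.1115, 0.1784, 0.5352, 1.338]  # upper bounds in Jupiter radii
LABELS = ["Earth Size", "Super Earth Size", "Neptune Size",
          "Jupiter Size", "Larger"]

def size_group(radius_rj):
    """Return (order, label), or None when the radius is missing."""
    if radius_rj is None:
        return None
    # bisect_left counts the cutoffs strictly below the radius, so a radius
    # exactly on a cutoff stays in the lower group, matching the <= tests.
    index = bisect_left(CUTOFFS, radius_rj)
    return index + 1, LABELS[index]
```

For example, `size_group(0.12)` lands in group 2, 'Super Earth Size', the same answer the two IF statements give.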
CNTD(Planet) is a simple Count Distinct of the planets... creating the ordered layers going up the column (showing how many planets fell into each category).
AGG(Planet Count Plus) is used to display the Planet Shapes on the top of each layer. Here's the calculation.
I then made it a Dual Axis and Shape; assigning a different planet to each shape. I used the same Planet Size Grouping