Amen to this.
When tools to produce refined-looking graphics are only accessible to, or usable by, professionals and/or when expert-to-public translation leads to inaccuracies, a fear of elegant-looking graphics, and a concomitant exploratory-explanatory divide, are understandable. The divide often leaves behind the aforementioned idea that exceptionally fancy graphics, and the time invested to make them, are only for public consumption, and not really useful for serious scientists or others pursuing deep quantitative analysis.
As the “democratization of data” continues, more and more services are making data behind both scholarly journal figures and public outreach graphics freely accessible. These open data sets represent a wealth of new information that researchers can combine with more traditional data acquisitions in their inquiries. If it’s quick and easy to get the data behind explanatory graphics, scientists will use those data, and learn more.
To generalise on that, take a particularly obvious lesson from nudge theory: software needs to be easy to use if people are going to use it. Visualisation is innately fun, but installing software is not. Interface design really matters: I'm far more likely to experiment with something if the basics just involve pressing a button. If I have to pause to write even ten lines of code, well, I'm not going to do that. Case in point: HI source extraction. If the only software available doesn't let you record detected galaxies very easily, you might go away thinking that human-based detection is very difficult. In reality it is not; it's simply the lack of a sensible recording interface that makes it tedious. Humans are great at this, but they get bored by having to write down long numbers. Visualisation software should make it as easy as possible for humans to do what they're good at and ease the burden of the less interesting tasks, as in the sketch below.
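For instance, here's a minimal sketch (in Python with matplotlib, not any particular existing tool) of what a sensible recording interface might look like: the human clicks on each detection and the computer does the bookkeeping. The data array and output filename are just placeholders.

```python
# Click-to-record sketch: the human does the pattern recognition,
# the machine writes down the long numbers.
import csv
import numpy as np
import matplotlib.pyplot as plt

data = np.random.rand(200, 200)          # stand-in for a real HI data slice
fig, ax = plt.subplots()
ax.imshow(data, origin="lower", cmap="gray")

detections = []

def on_click(event):
    # Log each clicked position and mark it so you can see what's done.
    if event.inaxes is ax:
        detections.append((event.xdata, event.ydata))
        ax.plot(event.xdata, event.ydata, "r+")
        fig.canvas.draw_idle()

fig.canvas.mpl_connect("button_press_event", on_click)
plt.show()

# Once the window closes, dump everything to disk in one go.
with open("detections.csv", "w", newline="") as f:
    csv.writer(f).writerows(detections)
```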
This problem is particularly acute in fields where everyone writes their own code. Also, on a related point, visualisation software should have a freakin' GUI. No, I don't want to have to type commands to generate a plot; that's just plain silly. Code should be used to manipulate data and not, wherever possible, to visualise it. Major caveat: it should always be possible to access the underlying code for experimentation with non-standard, custom techniques. Modern versions of Blender make it very easy to access the appropriate commands to control each module, thus giving the best of both worlds.
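As a rough illustration of the Blender point (this runs inside Blender's bundled Python, not a standalone interpreter, and the coordinates are made up): every GUI action maps onto a bpy call, so the same machinery the buttons drive can be scripted when you need something custom.

```python
# Each GUI action has a Python equivalent exposed through bpy,
# so custom techniques can reuse the standard machinery.
import bpy

# Place a small sphere at each data point, exactly as clicking
# "Add > Mesh > UV Sphere" in the GUI would.
points = [(0, 0, 0), (2, 1, 0), (1, 3, 2)]   # stand-in coordinates
for x, y, z in points:
    bpy.ops.mesh.primitive_uv_sphere_add(radius=0.2, location=(x, y, z))
```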
Sometimes, even though tools permit easy explanatory-exploratory travel, sociology or culture prohibits it. By way of a very simple example, consider color. To a physicist portraying temperature, the color blue encodes “hot,” since bluer photons have higher energy, but in popular Western culture, blue is used to mean cold. So, a figure colored correctly for a physicist will not necessarily work for public outreach. Still, though, a physicist’s figure produced in an exploratory system like the one portrayed in Figure 2 would work fine as an explanatory graphic for other physicists reading a scholarly report on the new findings.
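A quick sketch of that colour-convention problem, using matplotlib's coolwarm colormap as an arbitrary example and a random array as a stand-in temperature field: the same data can serve either audience just by reversing the map.

```python
# Same data, two colour conventions: blue-is-hot for physicists,
# blue-is-cold for the general public.
import numpy as np
import matplotlib.pyplot as plt

temperature = np.random.rand(50, 50)     # stand-in temperature field

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(temperature, cmap="coolwarm_r")   # high values mapped to blue
ax1.set_title("For physicists")
ax2.imshow(temperature, cmap="coolwarm")     # high values mapped to red
ax2.set_title("For the public")
plt.show()
```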
It might be interesting to have some app/website that lets people play with the raw data behind images and assemble them themselves. After a while, you start to lose the bias of thinking that what you can see with your eyes is an especially privileged view of the Universe, e.g. http://www.rhysy.net/the-hydrogen-sky.html
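Under the hood, such a tool wouldn't need much more than something like the following sketch, which uses astropy's make_lupton_rgb on three hypothetical FITS planes (the filenames are placeholders). The channel assignment is a free choice, and swapping it around makes the arbitrariness of the "true" view obvious.

```python
# Compile your own colour composite from raw data planes.
from astropy.io import fits
from astropy.visualization import make_lupton_rgb
import matplotlib.pyplot as plt

r = fits.getdata("band_r.fits")   # placeholder filenames;
g = fits.getdata("band_g.fits")   # any three same-shape planes work
b = fits.getdata("band_b.fits")

# Swap the assignments and the "true" appearance changes:
# there is no single privileged view, just choices of mapping.
rgb = make_lupton_rgb(r, g, b, stretch=0.5, Q=10)
plt.imshow(rgb, origin="lower")
plt.show()
```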
In 2006, no one quite knew what a “data scientist” was, but today, those words describe one of the most in-demand, high-paying professions of the 21st century. Data volume is rising faster and faster, as is the diversity of data sets available, both in the commercial and academic sectors. Despite the rise of data science, though, today’s students are typically not trained, at any level of their education, in data visualization. Even the best graduate students in science at Harvard typically arrive completely naive about what visualization researchers have learned about how humans perceive graphical displays of information.
Over the past decade or so, more and more PhD students in science fields have been taking computer science and data science courses. These courses often focus almost entirely on purely statistical approaches to data analysis, and they foster the idea that machine learning and AI are all that is needed for insight. They do not foster the ideas that one of the 20th century's greatest statisticians, John Tukey, put forward about visualization: 1) that it can give unanticipated insight, to be followed up later with quantitative, statistical analysis; and 2) that algorithms can make errors that are easily discovered and understood with visualization.
Exactly. It's true that human pattern recognition is fallible. However, it's at least equally true that statistical analyses can be fallible too. Having an objective procedure is not at all the same as being objectively correct. Working in concert, visualisation and statistical measurements are more than the sum of their parts. Finding a pattern suggests new ways to measure data, which in turn forces you to consider what it is you're actually measuring.
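A toy demonstration of that point (synthetic data, nothing from the paper): two datasets with comparably high correlation coefficients that a scatter plot instantly reveals to be very different beasts.

```python
# Similar summary statistics, very different structure:
# the plot catches what the number alone would hide.
import numpy as np
from scipy.stats import pearsonr
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
linear = 2 * x + rng.normal(0, 2, size=x.size)   # noisy straight line
curved = 0.2 * x ** 2                            # smooth parabola

for name, y in [("linear", linear), ("curved", curved)]:
    r, _ = pearsonr(x, y)
    print(f"{name}: Pearson r = {r:.2f}")   # both come out high

fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True)
ax1.scatter(x, linear, s=5)
ax2.scatter(x, curved, s=5)
plt.show()
```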
https://arxiv.org/abs/1805.11300
"Having an objective procedure is not at all the same as being objectively correct." Amen. Added immediately to quotes.txt