
The release date for the first XSEDE Data Visualization Badge certification was February 2017.

Update - June 2017: We are currently working to connect the Data Visualization Badge with upcoming workshops provided at XSEDE sites. 

Background

XSEDE proposes to offer a badge certification for beginning, intermediate, and advanced data visualization expertise.

Our first step was to create a badge design document defining which visualization badges we plan to offer and the goal of each badge. For example, the Beginner badge would be given to learners who demonstrate a basic understanding of the key concepts of data visualization. They would do this by passing a quiz and possibly by creating a simple visualization from detailed instructions.

Part 1: Knowledge Assessment

This assessment is targeted towards university-level faculty and students interested in assessing their knowledge of data visualization. This is a beginner-level assessment. It does not include questions pertaining to topics such as high-performance computing or big data. Questions avoid topics that are specific to a particular software application; instead, they assess knowledge of data visualization principles and practices which may apply to most software applications.

Learning Objectives

Our set of learning objectives was guided in part by the Computational Science Undergraduate Competency list provided by the HPC University site at http://shodor.org/media/content/hpcu/website/educators/undergradCompetencies

We sorted the elements from Area 7: Visualization into Beginner/Intermediate/Advanced. Then we extracted the Beginner competencies and modified them to incorporate some of the more recently emerging visualization methodologies.

In order to pass this assessment, learners will need to be able to:

- Define the purpose of visualizing scientific data.

- Differentiate between scientific visualization and information visualization.

- Give examples of scientific visualization.

- Explain the basic neurophysiology of the visual system as it pertains to perception of information.

- Describe basic principles which a good visualization leverages (e.g. contrast, alignment, scale, orientation, etc.).

- Explain the differences between the various visualization methods (e.g. charts, isosurfaces, streamlines, graph networks), and identify the types of data for which they are most appropriate and why.

- List coordinate systems relevant to visualization.

- Explain the difference between scalar data and vector data and describe visualization methods which may be applied to either.

- Be aware of the many different visualization resources and applications available to them.

- Describe a ‘glyph’ and explain why glyphs are useful.

- Differentiate between 2D and 3D visualization applications and explain when one is more appropriate than the other.

- Explain what a GPU is and describe its advantages/disadvantages.

- Explain why polygon count can significantly impact performance in a 3D visualization.

- For a specific 3-dimensional dataset, describe what point, surface, and volumetric visualization techniques might be appropriately applied and why.

- Determine whether a dataset is univariate or multivariate and how that impacts visualization decisions.

- Demonstrate how to add dynamic animation and interactivity and determine when it is appropriate.


Assessment Questions

The initial set of assessment questions is targeted towards university-level faculty and students interested in assessing their knowledge of scientific visualization. This is a beginner-level assessment. It does not include questions pertaining to topics such as high-performance computing or big data. Questions avoid topics that pertain to a specific software program; instead, they assess knowledge of scientific visualization principles and practices which may apply to many software programs.

We are in the process of creating an initial quiz for a beginner-level visualization badge. At the same time, we have mapped the quiz questions to the objectives. We are maintaining and updating a shared Google Doc of the assessment-to-objective mapping. Because these Confluence pages are public but the assessment content needs to remain private, we cannot post links to our assessment material. You may send a request for information to XSEDE support at support@xsede.org.

An initial set of ~30 questions was created, and Alan Craig kindly reviewed it and provided valuable feedback. The most significant issue was the lack of assessments pertaining to the application of a specific visualization to a specific type of data.

We created ~10 additional questions to address this gap, and added some questions for objectives which were not as well represented as others.

The final version of this assessment will be implemented on the XSEDE HPC Training Moodle. Access to the badge assessment will be by request only. You may view an HTML version of the initial assessment prototype at:

http://users.sdsc.edu/~jsale/xsede2/badge_assessments/viz_quiz1.html

 

Acknowledgements

In addition to Alan Craig, we are grateful for additional assistance from Mark Van Moer of NCSA, Katia Oleinik of Boston U., and Amit Chourasia of SDSC.

 

Notes

We may want to offer multiple quizzes and have learners level up to earning the badge, but that is probably more appropriate for the intermediate and advanced badges.

Part 2: Practical Assessment

In the practical component of this badge certification, users will demonstrate their understanding of scientific visualization principles and practices by applying them to a dataset provided for them to download.

About The Weather Dataset

Users will be given a dataset containing one month’s worth of weather data for 124 weather stations distributed throughout San Diego County. The data source is a publicly available web page at the following URL:

http://hpwren.ucsd.edu/cgi-bin/sm_sdge2.pl

The data were collected from the above link every 10 minutes, preprocessed to remove the HTML code, and saved each time to an individual comma-separated ASCII text file.

These data are spatiotemporal. In addition to a temporal parameter in the form of a time stamp, the data include spatial parameters in the form of latitude and longitude for each weather station.

The data include the following parameters (note that values for average, maximum, and minimum assume 10 minute intervals unless otherwise stated):

  • Station abbreviation
  • Date and time
  • Average wind speed
  • Average wind direction
  • Maximum wind gust
  • Maximum wind gust direction
  • Temperature
  • Dew Point
  • Relative humidity
  • Maximum temperature since midnight
  • Minimum temperature since midnight
  • Maximum wind gust since midnight
  • Time of maximum wind gust since midnight
  • Direction of maximum wind gust since midnight
  • Latitude
  • Longitude
  • Location
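Several of these parameters combine naturally: wind speed and wind direction together form a vector quantity, whereas temperature and relative humidity are scalars. The following is a minimal Python sketch of that pairing. It assumes the direction is reported in degrees using the meteorological convention (the direction the wind blows from, measured clockwise from north); the dataset description does not state the units or the convention, and the function name is illustrative only.

```python
import math


def wind_components(speed: float, direction_deg: float) -> tuple[float, float]:
    """Convert wind speed and direction into eastward (u) and northward (v) components.

    Assumption: direction_deg is the direction the wind blows FROM,
    in degrees clockwise from north (meteorological convention).
    """
    theta = math.radians(direction_deg)
    u = -speed * math.sin(theta)  # eastward component
    v = -speed * math.cos(theta)  # northward component
    return u, v


# A wind of speed 10 from the west (270 degrees) blows toward the east,
# so u is close to +10 and v is close to 0.
u, v = wind_components(10.0, 270.0)
```

Components in this form can be drawn as arrow glyphs at each station's longitude and latitude, while a scalar such as relative humidity would more typically be mapped to color or height.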

Requirements For This Assessment

For this part of the badge certification, users are required to select one or more of the above parameters highlighted in red and visualize the entire month’s worth of data for all 124 stations in 3 dimensions in a way that will allow those viewing the visualization to clearly see how the parameters change over time and space.

Users will submit the following:

  • A sufficient number of images (5 or fewer) to clearly show how your chosen parameter changes over space and time

  • A short description (500 words or less) of the visualization to assist your audience with interpretation

  • A short description (500 words or less) of the process you used

  • A short discussion (500 words or less) regarding the limitations and alternatives one might consider in attempting a similar visualization.

An example of such a visualization might look something like the one shown below for the relative humidity parameter:


*Feedback from Mark Van Moer:

I tried to put myself in the mindset of a person totally new to vis (though that's hard). We just hired a new person into our group who has an I/O performance and analytics background, but no vis experience, so I was thinking about the kinds of questions he's been asking, too.

This material covers sci vis, info vis, and computer graphics (plus some human perception items). I think dropping computer graphics should be considered. I hate to say this, because that was my background, I did computer graphics of one kind or another for years before I got into sci vis, but when I think about how something like OpenGL has changed in that time, not to mention how GPUs and display tech have changed, it seems like more than a domain scientist needs to know to get started.

The other high level thing, and this is hard to address until there's course material, is I thought there was a disconnect between the learning objectives and what was in the quizzes. (Also, both quizzes have the same learning objectives? The URL is different, but they have the same content.)

For example, objectives like "list coordinate systems relevant to visualization", "explain what a GPU is and describe its advantages/disadvantages", "Explain why polygon count can significantly impact performance in a 3D visualization", etc., aren't touched on in the quizzes. Since these are compiled from Shodor, do they have course material that covers all these things? (Again, I'd consider dropping the computer graphics specific items.)

For Quiz 1, Question 2 gets kind of repeated in the following questions, which is fine, but for Question 6 I'd look for a different image instead of repeating it.  

For Quiz 2, I had to look up what a space-time cube was; apparently those are used in GIS quite a bit? I never work with GIS material because there's a different group here specifically for that. If those end up being covered in course material then leaving them in would make sense. But, say the course material was really a modified VisIt tutorial (like what Amit C. usually presents) or modified ParaView tutorial (what I usually present), then space-time cubes wouldn't ever come up, unlike isosurfaces, streamlines, etc., which are more general techniques.

For Question 25, when I hear clipping plane I think of the Clip filter in ParaView, which is a general clipping plane. To me, if the question is asking about the view frustum clip planes, it should say specifically near and far clipping planes, but maybe I'm being pedantic.

For both quizzes, I thought some of the images were too low res or too small to see clearly. E.g., Q13 in Quiz 2 is asking people to inspect thumbnail-sized images for features. I could do it, because I already know what an isosurface is, but I could see that being a tough visual task for a newbie.

For the Practical Test, I like the idea of this type of data, because it really mixes concepts from both info vis and sci vis into one activity, but I can also imagine a first year grad student that works with FE meshes quickly losing interest in doing the badge. I don't have a good answer for this, because obviously it's not feasible to have separate projects for every possible type of science.

I looked at the data in ParaView using their CSV importer. (Also tried VisIt, but I'm not familiar enough with VisIt's spreadsheet plot to get anything useful out of it.)

The ParaView importer is not very smart, e.g., timestamps like 20140531.224000 are going to be imported as floats. So, one could try to make a space-time cube out of may_2014_weather_data.csv, but they'd have to figure out how to make the timestamps be useful Z coords. Alternatively, to make an animation, the separate files could be read in, but for ParaView to understand this is a time series, all the files have to be renamed or be handed a wrapper file like a PVD or an XDMF to associate files to timesteps. I don't think there's a way for ParaView to understand may_2014_weather_data.csv as containing multiple timesteps without some additional scripting. ParaView also won't recognize a time series unless all the base names are the same, so I ended up with something like

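# Copy each file to a common base name with a sequential index
# so that ParaView treats the set as a time series.
i=0
for f in *.txt; do
  cp "$f" "sdge_data.$i.txt"
  i=$((i+1))
done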

to get files of the name sdge_data.[0-4463].txt. There are 175 empty files, which might throw a newbie for a loop. This all requires someone already be moderately familiar with ParaView. Do GIS apps handle all these issues automatically? Maybe this would be a better project for an advanced badge. I think for an intermediate badge maybe a straightforward rectilinear mesh might be a better starting place.
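One way to make the timestamps usable as Z coordinates is to parse them as text rather than as floats and convert them to elapsed time. The following is a minimal Python sketch, not a tested ParaView workflow. It assumes the stamps follow a YYYYMMDD.HHMMSS layout (inferred from the example 20140531.224000, not confirmed against the data source); the function name and the choice of minutes as the unit are illustrative.

```python
from datetime import datetime

# Assumed layout of the time stamp column: YYYYMMDD.HHMMSS
STAMP_FORMAT = "%Y%m%d.%H%M%S"


def timestamp_to_minutes(stamp: str, origin: str) -> float:
    """Return the minutes elapsed between origin and stamp.

    Both arguments are strings in the assumed YYYYMMDD.HHMMSS layout.
    Keeping them as strings avoids the precision and trailing-zero
    problems that arise when they are read as floating-point numbers.
    """
    delta = datetime.strptime(stamp, STAMP_FORMAT) - datetime.strptime(origin, STAMP_FORMAT)
    return delta.total_seconds() / 60.0


# Minutes from the start of May 2014 to the example stamp in the text.
print(timestamp_to_minutes("20140531.224000", "20140501.000000"))  # prints 44560.0
```

Dividing the result by 10 gives a timestep index for 10-minute data, and the resulting value can be scaled as needed so that the time axis is visually comparable to the latitude and longitude ranges.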



