Assessing Aesthetic Judgment and Related Skills – Blog Post #1

The purpose of this project is to examine the potential of virtual reality (VR) training to improve aesthetic skills. Our main goal is to create an aesthetic quality scale for measuring the aesthetic quality of 3D drawings created in a VR setting. In future research, such a scale could be used to test whether VR training has a significant effect on aesthetic skills, by checking whether the aesthetic quality of the 3D drawings people produce improves after training. The scale may also prove useful for other research on aesthetic quality. For our purposes, it provides a way to measure the observable quality of participants’ drawings in order to assess potential changes in an internal characteristic, aesthetic sensitivity. Before that research can be conducted, the project will focus on developing a suitable aesthetic quality scale, assessing its validity, and determining whether the scores assigned to drawings with the scale can serve as a proxy for an individual’s aesthetic sensitivity.

The data used in this project were collected for a different study, which examines the link between VR training and creativity. In that study, 3D drawings produced by participants in a VR setting were assessed on a scale designed to measure creativity. Each participant was instructed to draw an animal of their choosing (real or imaginary) using the program. Prior to producing the drawing, all participants completed a revised version (VAST-R) of the Visual Aesthetic Sensitivity Test (VAST), an accepted measure of aesthetic sensitivity. In the current study, we will examine how the VAST-R scores relate to the scores given to the drawings on the aesthetic quality scale. If participants with higher VAST-R scores also have their drawings scored higher on the aesthetic quality scale, that will support the validity of the scale. The VAST-R scores are also an important variable to consider in their own right, since the project is examining the potential for VR to improve aesthetic sensitivity.

The VR training itself consisted of various immersive experiences. Participants watched a music video which fully surrounded them, explored an underwater scene, played an interactive story game, and interacted with a recorded scene in which the researcher provided tactile stimulation mimicking what the participant would expect to feel during the experience. This training was used in the previous study, which involved a group of 68 participants, each of whom created their 3D drawing after experiencing the VR training. Each of the drawings produced in the study was scored for creativity using the Consensual Assessment Technique (CAT), in which raters are given a number of creative productions and freely assign each one the score they think best suits it on a multipoint scale. In this study, the judges were asked to rate the 68 productions for creativity on a seven-point scale, and also on the new, similarly structured aesthetic quality scale. Judges were also given the VAST-R prior to scoring, to see whether their aesthetic sensitivity would affect the scores they gave on either scale.

Currently, we are focused on establishing the validity of the aesthetic quality scale in particular. A valid scale is one whose scores accurately reflect the construct it is intended to measure. Validity is vital because if the scores reflect something other than that construct, conclusions drawn from them will not hold. In the case of the aesthetic quality scale, it has not yet been established that the scores are indicative of aesthetic quality, so that is our immediate next step. To determine whether the scale is valid, we will be looking for trends in how individuals score items using the aesthetic quality scale.

If the aesthetic quality scale measures what it is intended to measure, then the score given to each 3D drawing will likely reflect the aesthetic sensitivity of the individual who produced it. To that end, we will examine whether each participant’s VAST-R score is predictive of the aesthetic quality rating assigned to their drawing. If higher VAST-R scores predict higher scores on the aesthetic quality scale, that will suggest that the aesthetic quality of a drawing is, at least in part, a result of its creator’s aesthetic sensitivity. This may make it possible for future studies to determine whether aesthetic sensitivity can be improved through VR training. Furthermore, comparing the scores given on the aesthetic quality scale with the scores given on the creativity scale will indicate the extent to which the construct of aesthetic quality is separate from the construct of creativity. To establish the scale’s validity, we need to show that it measures something specific and distinct from creativity.
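As a rough illustration of the two comparisons described above, here is a minimal Python sketch that computes a Pearson correlation. The choice of Pearson's r is an assumption made for illustration (the project may well use regression or another method), and all of the numbers are invented placeholders, not data from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Invented placeholder values for six participants (NOT study data).
vast_r = [12, 15, 9, 20, 17, 11]                     # VAST-R score
aesthetic_quality = [3.2, 4.1, 2.8, 5.5, 4.9, 3.0]   # mean judge rating of the drawing
creativity = [4.0, 3.5, 2.9, 5.1, 3.8, 4.4]          # mean judge rating of the drawing

# Does VAST-R relate to the aesthetic quality of the drawing?
r_vast_quality = pearson_r(vast_r, aesthetic_quality)
# How strongly do the aesthetic quality and creativity ratings overlap?
r_quality_creativity = pearson_r(aesthetic_quality, creativity)

print(f"VAST-R vs aesthetic quality:     r = {r_vast_quality:.2f}")
print(f"Aesthetic quality vs creativity: r = {r_quality_creativity:.2f}")
```

A clearly positive first correlation would be in line with the validity argument above, while a second correlation close to 1 would suggest that the two scales are not measuring separate constructs.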

In addition to establishing validity, we will look into whether the scale is reliable. If the aesthetic quality scale is reliable, then the scores assigned to a drawing will be consistent between raters. If the scores are not consistent between raters, we will examine how individual judges differ in the scores they assign to drawings. Since we administered the VAST-R to the judges, their VAST-R scores will be our primary focus when examining which judges are more likely to produce valid scores. It may be the case that judges need a high VAST-R score if their judgment is to be relied upon. If there is strong interrater reliability, however, the judging can likely be done by anyone who understands the scale. Establishing how reliable the scale is will determine how the aesthetic quality scale can be used in the future.
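One common way to put a number on consistency between raters is Cronbach's alpha, computed with the judges treated as "items". This post does not commit to a particular reliability index (an intraclass correlation would be another reasonable choice), so the sketch below is only one possible approach, and the ratings in it are invented.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """Cronbach's alpha for a table of ratings.

    ratings: list of rows, one row per drawing, one rating per judge in each row.
    """
    n_judges = len(ratings[0])
    if len(ratings) < 2 or n_judges < 2:
        raise ValueError("need at least 2 drawings and 2 judges")
    if any(len(row) != n_judges for row in ratings):
        raise ValueError("every drawing needs a rating from every judge")
    # Variance of each judge's ratings across drawings.
    judge_vars = [variance([row[j] for row in ratings]) for j in range(n_judges)]
    # Variance of the summed ratings across drawings.
    total_var = variance([sum(row) for row in ratings])
    if total_var == 0:
        raise ValueError("alpha is undefined when all drawings have the same total")
    return n_judges / (n_judges - 1) * (1 - sum(judge_vars) / total_var)

# Invented seven-point ratings: 5 drawings (rows) x 3 judges (columns).
example_ratings = [
    [2, 3, 2],
    [5, 5, 6],
    [4, 4, 3],
    [6, 7, 6],
    [1, 2, 2],
]
alpha = cronbach_alpha(example_ratings)
print(f"Cronbach's alpha across judges: {alpha:.2f}")
```

Values near 1 indicate that the judges rank the drawings very similarly. If the value turned out to be low, the same function could be rerun on a subset of judges (for example, only those with high VAST-R scores) to see whether agreement improves.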

We are also working on writing syntax to automate the IRT scoring of the VAST-R. IRT refers to Item Response Theory, a framework used in designing, analyzing, and scoring measures of abilities, attitudes, and other psychological dimensions. IRT models an individual’s observable response to each item as a function of two kinds of unobserved quantities: the ability of the individual and the characteristics of the item (both of which are treated as fixed values for a given individual and item). In the process, these unobservable parameters (notably, each individual’s latent ability) are estimated, which can make IRT scoring a more accurate alternative to regular “sum” scoring.
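To make the idea concrete, here is a small Python sketch of IRT scoring. It is not the project's actual syntax. It assumes a two-parameter logistic (2PL) model with right/wrong items and item parameters that have already been estimated, and it scores a person with an expected a posteriori (EAP) estimate under a standard normal prior. The model choice, the function names, and the item parameters are all assumptions made for illustration.

```python
from math import exp

def p_correct(theta, a, b):
    """2PL model: probability of a correct response.

    theta: person ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def eap_ability(responses, items, n_points=81):
    """EAP ability estimate with a standard normal prior.

    responses: list of 0/1 answers; items: list of (a, b) tuples, same order.
    """
    if len(responses) != len(items):
        raise ValueError("need one response per item")
    # Evenly spaced ability grid from -4 to 4.
    grid = [-4.0 + 8.0 * i / (n_points - 1) for i in range(n_points)]
    weights = []
    for theta in grid:
        prior = exp(-0.5 * theta ** 2)  # unnormalised N(0, 1) density
        likelihood = 1.0
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            likelihood *= p if u == 1 else 1.0 - p
        weights.append(prior * likelihood)
    # Posterior mean of theta over the grid.
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# Hypothetical (discrimination, difficulty) pairs, NOT real VAST-R parameters.
items = [(0.8, -1.0), (1.2, 0.0), (2.0, 1.0)]

# Two people with the same sum score (2 out of 3) but different answer patterns.
person_a = [1, 1, 0]  # missed the hardest, most discriminating item
person_b = [0, 1, 1]  # missed the easiest, least discriminating item
print(f"Person A: theta = {eap_ability(person_a, items):.2f}")
print(f"Person B: theta = {eap_ability(person_b, items):.2f}")
```

Sum scoring would treat these two people as identical, whereas under the 2PL assumption the ability estimate also depends on which items were answered correctly.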

This is my first experience writing syntax, as well as my first encounter with the IRT framework, so I am learning quite a bit as I go. I hope to become familiar with these skills, since they will be useful in future projects. Additionally, once the syntax for automating the IRT scoring process is written, it can be reused by me and by anyone else who needs to score the VAST-R in the future. Both this syntax and the progress being made towards the project’s ultimate goal of assessing aesthetic judgment and creating an aesthetic quality scale are informative in and of themselves and will be useful for future projects. I hope to gain more knowledge about psychometrics in the process of working on this project, as well as to produce results that future studies can build upon.
