After obtaining all the scores given to each drawing on the Creativity scale and the Aesthetic Quality scale, I have been working on establishing the reliability of the scales, the construct validity of the Aesthetic Quality scale, and the potential relationship between the scores that drawings obtained on each scale and the scores that participants obtained on the VAST.
First, inter-rater reliability was assessed by computing Cronbach’s alpha for each scale with the alpha function in R. Alpha was .80 for the Creativity scale and .82 for the Aesthetic Quality scale, which suggests that the raters gave consistent scores to each drawing. A more in-depth analysis of each rater showed that removing the scores given by any single rater would change the inter-rater reliability very little, which further suggests that the raters scored the drawings consistently.
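For illustration only, the calculation behind these figures can be sketched in Python. This is not the R call used for the analysis; it assumes the standard formula for Cronbach’s alpha with raters treated as items, and the function names and ratings below are hypothetical, made-up values.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a table with one row per drawing and one column per rater."""
    k = len(ratings[0])  # number of raters (treated as items)
    # Sample variance of each rater's scores across drawings
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Sample variance of the summed score each drawing received
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


def alpha_if_rater_dropped(ratings):
    """Alpha recomputed with each rater removed in turn (one value per rater)."""
    k = len(ratings[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != drop] for row in ratings])
        for drop in range(k)
    ]


# Hypothetical ratings: 5 drawings scored by 4 raters (not the study data)
example_ratings = [
    [3, 4, 3, 4],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
    [1, 2, 2, 1],
]

print(round(cronbach_alpha(example_ratings), 2))
print([round(a, 2) for a in alpha_if_rater_dropped(example_ratings)])
```

If every value in the second list stays close to the overall alpha, no single rater is driving or weakening the agreement, which is the pattern described above.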
Given that raters were consistent in their scoring of the drawings, I included all of the scores when creating a composite score for each drawing. For each scale, I computed the mean of all the scores assigned to a particular drawing. This was simple enough to calculate within LibreOffice (a program used to create and manage data in a spreadsheet). There was a good deal of variation among the drawings’ mean scores, suggesting that the individual drawings differed in their apparent creativity and aesthetic quality.
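The composite scores were computed in a spreadsheet, but the same step can be sketched in Python for readers who prefer code. The drawing labels and scores below are hypothetical placeholders, not the study data.

```python
from statistics import mean, stdev

# Hypothetical scores on one scale: each drawing maps to the scores its raters gave
scores_by_drawing = {
    "drawing_01": [3, 4, 3, 4],
    "drawing_02": [5, 5, 4, 5],
    "drawing_03": [2, 3, 2, 2],
}

# Composite score = mean of all rater scores for that drawing
composite = {drawing: mean(scores) for drawing, scores in scores_by_drawing.items()}

# Spread of the composite scores across drawings (a rough check on variation)
spread = stdev(list(composite.values()))

print(composite)
print(spread)
```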
Using Pearson’s product-moment correlation test (the “cor.test” function in R), I found the correlation between the mean scores assigned to drawings on the Aesthetic Quality and Creativity scales to be approximately .81 (p < .05). I ran the same test on the relationship between the mean scores on each scale and the participants’ VAST scores, which returned non-significant results (weak correlations, p > .05). These results suggest that the Aesthetic Quality and Creativity scales are being scored similarly, meaning that the two constructs are not as distinct from each other as they should be. There also does not appear to be a significant correlation between participants’ VAST scores and the scores given to their drawings on either scale.
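As a rough illustration of what the correlation test computes, the Python sketch below calculates Pearson’s r and the associated t statistic, which is compared against a t distribution with n - 2 degrees of freedom to obtain the p-value. It does not compute the p-value itself, the function names are hypothetical, and the values are made up rather than taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical mean scores per drawing on each scale (not the study data)
creativity_means = [3.5, 4.75, 2.25, 4.0, 1.5, 3.0]
aesthetic_means = [3.25, 4.5, 2.75, 3.5, 2.0, 3.5]

r = pearson_r(creativity_means, aesthetic_means)
print(r, t_statistic(r, len(creativity_means)))
```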
Moving forward, I still intend to examine whether any individual raters differentiated between the two scales and, if so, whether those raters’ VAST scores affected how they scored the test. Given the strong inter-rater reliability, however, I do not expect to find anything particularly significant. In addition, I will look into whether the constructs can be better distinguished in the instructions given to the raters.
There was some difficulty in matching the scores of participants’ drawings to their VAST scores, since the scores on both the Creativity and Aesthetic Quality scales identified participants differently than the VAST did. After obtaining a file that linked the two methods of identification, I created two new files to organize the data, one for the Creativity scale and one for the Aesthetic Quality scale. In each file, I linked the VAST user ID number to the participant’s average score and their VAST score. The VAST was administered twice, so I included both the pretest and posttest scores, but I am focusing primarily on the VAST pretest data since not every participant completed the second VAST. I was able to link all but one participant, whose user ID did not match any of the available VAST scores. I intend to look into that further later, but for now that participant is not included in analyses involving the VAST scores.
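The linking step was done by hand in separate files, but its logic can be sketched in Python under some assumptions: the identifier formats, variable names, and values below are all hypothetical, and the linking file is represented as a simple lookup from one identifier to the other.

```python
def link_scores(mean_scores, crosswalk, vast_pretest):
    """Join drawing scores to VAST pretest scores through an ID lookup.

    Returns (linked, unmatched): linked rows are (vast_user_id, mean_score,
    vast_score); unmatched lists drawing IDs with no available VAST score.
    """
    linked, unmatched = [], []
    for drawing_id, score in mean_scores.items():
        user_id = crosswalk.get(drawing_id)  # None if the ID is not in the linking file
        if user_id in vast_pretest:
            linked.append((user_id, score, vast_pretest[user_id]))
        else:
            unmatched.append(drawing_id)
    return linked, unmatched


# Hypothetical data: drawing-side IDs, the linking file, and VAST pretest scores
mean_scores = {"P01": 3.5, "P02": 4.75, "P03": 2.25}
crosswalk = {"P01": 1001, "P02": 1002, "P03": 1003}
vast_pretest = {1001: 12, 1002: 15}  # no VAST score on record for user 1003

linked, unmatched = link_scores(mean_scores, crosswalk, vast_pretest)
print(linked)
print(unmatched)
```

Keeping the unmatched IDs in their own list, rather than silently dropping them, makes it easy to return to the one participant who could not be linked.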
Given that Aesthetic Quality does not appear to have been scored differently from Creativity, future research on Aesthetic Quality will need to differentiate the two constructs. I intend to look further into how to do that, referring to the literature on art and aesthetics that attempts to define aesthetic quality. Barring any particularly noteworthy results from examining the individual raters, defining Aesthetic Quality in a way that is more distinct from Creativity will be the next step if progress is to be made toward determining whether training in virtual reality can improve skills related to aesthetic judgment or sensitivity.