Keyboard binding for model rendition previews

Problem description

When comparing images, the brain retains a sharp, detailed memory of one image and can notice the tiniest differences in another, provided the visual receptive field stays in the same spot and the pixels abruptly change to those of the other picture. The change is perceived as motion, and the human visual system is highly tuned to detect motion, which in the case of image comparison translates to image differences. If a person is forced to move their eyes, or an intermediate image is introduced while swapping images, a significant amount of visual memory of the previous image is lost. This leaves the user unable to perceive critical differences they wanted to consider between the image renditions being compared.

In Gigapixel there are two completely different views for comparing images, and unfortunately both suffer from the core issue above. In the comparison view, the user is forced to move their eyes across different regions (tiles) of the screen. Moving the eyes discards much of the visual memory of the image they were just focused on, and with it their ability to notice all the differences.

When using the main application view, the user is also forced to move their eyes, this time over to the “AI Models” selection pane to find the model description to click. They must then move the cursor to that description, click it, and move their eyes back to the preview area. Even then, the preview does not immediately transition to the new model rendition: it first briefly cycles to a point-resized version of the source image, and only then to the new rendition. This introduces a significant amount of undesirable visual stimulation, causing the user to lose contrastive visual information and, consequently, the ability to pick out any but the most glaring differences between model renditions.

As such, the product presently offers no satisfactory way to compare and contrast different model outputs, only two very coarse-grained options, both of which are insufficient for assessing the actual differences between the source and model renditions, or between different model renditions.

Principle for addressing

Minimize visual disruption when swapping between different renditions.

Offering a way to reduce this perceptive interruption would increase users’ ability to perceive even the tiniest differences between renditions, including against the source image.

Example of how it could be realized

Easy-to-access key-bindings that map to the respective model previews (triggering the same behavior as clicking a model in the AI Models pane), plus a binding for the Original image, would allow the eyes to stay focused on one spot. Since human vision is highly tuned to notice movement, pixel changes between one rendition (or the source) and another would make it very easy to spot exactly how the model outputs differ.

This approach is used to great effect in both the MSU VQMT (video clip comparison) tool and Video Comparer, which bind the number keys, starting with 1, to the respective renditions they offer, making the keys easily accessible, easy to remember, and natural to reach for (a sequence).
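To make the proposal concrete, here is a minimal sketch of the key-to-rendition mapping. All names here (`RENDITIONS`, `on_key_press`, the model labels) are illustrative assumptions, not Gigapixel’s actual API; in the real product the list would be populated from the AI Models pane.

```python
# Hypothetical sketch of number-key bindings for rendition previews.
# "Original" gets key 1, and model renditions follow in pane order,
# mirroring the 1-based scheme used by MSU VQMT and Video Comparer.
# The rendition names below are placeholders.

RENDITIONS = ["Original", "Model A", "Model B", "Model C"]

# Build the binding table: "1" -> Original, "2" -> Model A, ...
KEY_BINDINGS = {str(i + 1): name for i, name in enumerate(RENDITIONS)}

def on_key_press(key: str, current: str) -> str:
    """Return the rendition to display after a key-press.

    Unbound keys leave the current rendition unchanged, so the preview
    pixels only swap when a valid binding is hit -- no intermediate
    image is ever shown.
    """
    return KEY_BINDINGS.get(key, current)
```

The essential property is that a key-press swaps the preview pixels in place, with no intermediate frame, so the user’s gaze never has to leave the preview area.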

Below is a mockup (animated GIF) of how it could look and behave in Gigapixel.

Note: the popup showing key-presses would of course not be rendered in the actual product, as it would be needlessly distracting. I’m including it only to convey which key-press triggers each application response.


Thank you for the thoughtful feedback, @jojje. I’ll be sure to bring this up with our product owners to better investigate the complexity of implementing this.