Hi, I'm Steve
Postdoc @ TU BS Computer Graphics Lab
About
I studied computer science, did my doctorate with an emphasis on computer graphics, and am currently working as a post-doctoral researcher at the Computer Graphics Lab at TU Braunschweig, Germany.
My (professional) research so far has concentrated on human visual perception in immersive environments with projects on in-HMD eye tracking, foveated rendering and visual attention guidance. Most of this was done with C++, Python, GLSL, OpenGL, and OpenCV.
With the exception of the most recent project (which is about ... well, have a look down here), most of my private projects are somehow web-related, e.g. a genealogy website with a network visualization and an infinite editing history, a markdown editor implemented as an offline-capable progressive web app, or a simple drag-and-drop file-sharing website with full shared access, i.e. not only upload and download, but also permission to replace or delete files for everyone with access (not yet added here, please have a look on GitHub 😅).
Education
Research topic: Investigating and developing various visual stimuli to unobtrusively guide user gaze/attention in immersive environments.
Advisor: Prof. Dr. Marcus Magnor
Final project topic: Designing and building a head-mounted display with an integrated optical eye tracking system.
Advisors: Prof. Dr. Marcus Magnor and Dr. Michael Stengel
Final project topic: Investigating delayed visual response on user input in an interactive virtual environment.
Advisor: Prof. Dr. Douglas W. Cunningham
Personal Projects
Here you can find a selection of my private projects.
I typically provide them as open source (or under a license close to that) on my GitHub page, so if you're interested, I'm happy to see forks and pull requests ;)
Digital Mirror Chamber
This project started because I wanted to test the new Meta Avatars.
Unfortunately, the premiere avatars app (Horizon) is not yet available in Germany, and I did not readily find another app that uses the avatars in a way that lets you properly check out "yourself". So I started browsing through the docs and found that the current version was only available for the Unity Engine. While I gained some experience with the Unreal Engine during my master's thesis, I had never used Unity until then — you got it ... great opportunity to take a look.
The app idea was quickly found, as indicated before: a small environment that uses your customized Meta avatar such that you can properly see it yourself.
Here it comes, the Digital Mirror Chamber, a kind of fitting room with mirror-like displays arranged around the player.
Why displays instead of actual mirrors, you might ask.
I actually started digging through the Unity docs to see how to use the stencil buffer to create a "classical" mirror rendering, but discarded this approach quickly, as it wouldn't have led to the goal of enabling players to view themselves (their avatars) from all sides — without having to look over their shoulders, as you would have to with a "real" mirror.
That's why there is now a virtual webcam in front of the player and human-sized video walls arranged around them.
This simple setup already has the effect that, while each display shows the exact same "camera feed", players see a different side of their avatar simply by turning towards any of them.
After a few lessons on the Unity Learn platform, from how to even navigate through Unity, to setting up a VR project, I felt ready to go. I rather quickly found some neat assets to compose a cozy "chamber" with four displays and some knick-knacks in-between them.
I was a bit surprised that many objects from the Unity Asset Store did not render correctly with the (if I did not mix something up) default Universal Render Pipeline, showing up in solid pink.
Maybe there was a switch in the defaults at some point?
I don't know and, fortunately, don't have to, since simply switching the material's shader (and sometimes re-assigning the textures to their material slots) fixed the issue rather easily.
Another hurdle was the fact that the app needs access to users' avatars, for which I needed the app ID of an appropriate Meta app, for which I needed to register such an app in the developer dashboard, for which I needed to create an account with a developer organization 🥴. While none of these steps took too long, it took me some time to find them, and to realize that I actually needed all this already during development if I wanted to see my own actual avatar instead of only pre-configured local ones. But — once everything is set up, the avatars work pretty smoothly. There is a prefab bundled with the Avatars SDK that simplifies loading user avatars, with an automatic fallback to a local avatar if the user has not yet configured one.
Setting up the virtual webcam with an attached render texture, which is used in the material of the four screens, was very straightforward in Unity.
To give the whole thing some interaction, I've added a tablet with some simple controls to, e.g., toggle lights and music.
Since the tablet should always be at hand without looking glued to it, I added a cute (at least I think so 😊) effect that shrinks it into a ring when not in use and re-materializes it upon tapping the ring.
While there are still a ton of ideas to extend this little app, the next thing to look into (if there is time) is most probably profiling and optimization in Unity, as it sometimes feels a bit low-FPS.
Instant Markdown
By now I have found a few pretty good candidates, but back then I started this project because of the absence of (or my inability to find) a proper in-place live-formatting markdown editor.
While concepts like a live side-by-side preview or manually switching between code and preview do work, I personally find them rather suboptimal. The side-by-side preview just wastes space and the toggle can become annoying quickly. So I thought of quickly (haha 😅) writing a simple markdown editor web app.
Haha because, first, markdown formatting is not as simple as it seemed to me at first glance, and, second, in-place live formatting turned out to be not that easy either.
My initial approach was to use the contenteditable HTML feature to allow typing directly in the formatted output, which is generated by transparently injecting HTML formatting tags that do not change the actual text.
Well, that didn't work out for me, because browsers seem to uncontrollably insert new HTML tags (e.g. paragraphs when hitting enter) when typing in such editable HTML elements.
So, now the user is typing into an invisible textarea behind the formatted preview, which is continuously re-generated upon user input.
Currently supported features include basic inline text formatting, i.e. bold, italic and strikethrough styles. Links are either automatically generated from recognized URLs or, as for images, explicitly defined with a separate link text. For text structuring, (6 levels of nested) headings, bullet points and numbered lists, quotes, and horizontal lines are supported. Code snippets may be included inline or as whole paragraphs.
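To give an idea of why even these "simple" inline styles are trickier than they look: the naive approach is a handful of ordered regex substitutions, and the order (plus the interaction with code spans and raw HTML) matters a lot. The editor itself is plain JavaScript running in the browser; the following is just an illustrative Python sketch of that naive inline pass:

```python
import html
import re

def render_inline(text):
    # Naive in-place formatting: escape raw HTML first, then substitute the markers.
    out = html.escape(text)
    out = re.sub(r"`([^`]+)`", r"<code>\1</code>", out)         # inline code
    out = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", out)  # bold (before italic!)
    out = re.sub(r"\*(.+?)\*", r"<em>\1</em>", out)              # italic
    out = re.sub(r"~~(.+?)~~", r"<del>\1</del>", out)            # strikethrough
    return out
```

Even this tiny version already gets it wrong for markers inside code spans (they still get replaced), which is exactly where the "not as simple as it seemed" part begins.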
You can either just use it to write (and copy) Markdown code, or even edit local files via the File System Access API. Going beyond that, it is installable as a PWA to almost give the look and feel of a native system app, and even allows using it offline. It's the first time I tried to extend a web tool with PWA support, and I'm actually surprised how close you can get to a native app.
Two other browser features that I extensively tried for the first time in this project are automatic light/dark mode rendering depending on the system/browser setting, and CSS variables. By overloading the latter, I also included options for detailed customization of the whole editor style.
There are still some issues, especially the (almost completely) missing mouse interaction, but things are moving forward.
Family Net
This project is about collecting and visualizing information about the whole network of your family, including really all your relatives and their relations.
While there are many tools and websites out there in the field of genealogy, they can only show direct ancestors and descendants of a person in a tree-like structure.
None of them seems to be able to show a complete family network, which, of course, is not simple because of the quickly exploding number of people in there and all the overlapping connections that represent their relations.
So I started out to create a system to manage and show all my family members in one large network, doing my best to maintain readability.
As usual, as soon as the basics were done and I started to fill in the first information, new feature requests popped up in my mind. Since then the system has grown quite a bit, by now including e.g. storage of personal data (first/last/birth names, birth/death dates, additional notes), relations between persons, sources (annotatable document photos), access control with different user roles (admin, editor, viewer, guest), a git-based editing history, and 1½ additional layouts to automatically arrange the family tree (up and down) of a selected person.
The latest feature is an automatic calculation of the relationship between any two (connected) persons. I started with "simple" relations, like lineal consanguinity (great-...-grandparents or children) and collateral relatives with common ancestors (aunts/uncles, brothers/sisters or nieces/nephews), which was relatively easy.
Doing this, I picked up some fun useless knowledge:
For what is {"", "grand", "great grand"} parents/children in English, we say {"", "Groß", "Urgroß"} eltern/kinder in German.
Let's call these three prefix generation levels.
So far so similar.
Now here it comes ... Since we Germans seem to love well-defined rules (i.a., to name things), we combine these three prefixes with another 7 distinct stems, one for every further three ancestor generations.
That's 3 prefix levels × 8 stems (the base one plus the 7 new ones) = 24 (!) generations of ancestors that can be precisely named without doing the great-great-great... thing.
Assuming an optimistic age of 20 during the first childbirth, that's almost 500 years 🤯.
Crazy Germans ...
Back to business. Time to demystify those distant relatives we meet every once in a while at big family birthday parties — those third cousins of the husband of our great-granduncle's wife's niece — got it? Me neither. So, while finding the connecting path in the network is fairly easy using a basic breadth-first search, generating a readily understandable wording for these cases turned out to be pretty tricky, but was worth it. In combination with highlighting the actual path in the network, this gives a nice tool to better understand family relations (and opportunities for some smart-assing at the next family reunion 🤓).
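For the curious, the path-finding half really is just textbook breadth-first search over the relation graph; a minimal Python sketch (the data layout and names here are made up for illustration, not the actual implementation):

```python
from collections import deque

def connecting_path(network, start, goal):
    """network: person id -> iterable of (related person id, relation label)."""
    came_from = {start: None}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if person == goal:
            break
        for neighbor, relation in network.get(person, ()):
            if neighbor not in came_from:
                came_from[neighbor] = (person, relation)
                queue.append(neighbor)
    if goal not in came_from:
        return None  # the two persons are not connected at all
    # Walk back from goal to start and reverse.
    path, node = [], goal
    while came_from[node] is not None:
        parent, relation = came_from[node]
        path.append((parent, relation, node))
        node = parent
    return list(reversed(path))
```

The actual work then lies in collapsing such a chain of "parent of", "married to", "child of" steps into a phrase a human would actually say.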
PhD Research
Below is a selection of the academic research projects I have been most involved in to date.
You can find a full list of publications on my website at the TU BS Computer Graphics Lab.
Immersive Attention Guidance
In this project we investigated visual means to guide user attention through immersive virtual environments.
For that, we evaluated existing and developed new visual stimuli that can be integrated into a virtual scene to unobtrusively steer viewing behavior, i.e. when people look where.
Applications range from storytelling in virtual adventures, where you don't want your viewers to miss important actions, to digital tours through museums, where certain details of exhibition pieces should be highlighted without changing the overall look of these pieces (too much).

Gaze tracking heatmap without and with guidance
To study viewing behavior, we've had the special requirement of (live) eye tracking capabilities for our VR systems.
While, by now, this is not a big deal anymore, no such HMD was available when we started this project.
Luckily, back then we'd just finished our preceding project on building just that — an eye-tracking HMD.
To be fair: We've later replaced our custom-built device with a commercially upgraded HTC Vive for better display quality and, mainly, because it was more sturdily built.
The second device that we've used is the ICG Dome, a 5m dome lit by projectors, in combination with eye-tracking glasses.
In the first phase, we examined existing subtle visual stimuli that were originally developed for desktop settings.
They mostly exploit the weak visual acuity in our peripheral visual field, i.e. the regions we're not looking at directly, in combination with timely deactivation, i.e. hiding stimuli before our eyes can focus on them.
We then investigated how to extend these (desktop) methods to immersive (VR) systems, e.g. via viewing-angle-based scaling in peripheral vision, and perspective distortion correction.
Building upon our insights, we then devised a new method (SIBM) that may, but does not necessarily need to, exploit the visual acuity degradation or timely deactivation to remain unobtrusive.
The method instead exploits our brain's sensitivity to mismatches in our binocular vision, i.e. when our eyes look at the same spot but see (slightly) different things.
In our case the slight difference is a smoothly blended light–dark adjustment.
Light for one eye, dark for the other.
This creates the perception of a slight shimmer, but it's not really discernible, even when looking directly at it.
Binocular stimulus yields conflicting information in our brain
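Conceptually, the stimulus boils down to a smoothly faded, opposite-signed brightness offset around the target position in the two eyes' images. The real implementation is a GLSL shader inside the VR renderer; the following numpy sketch only illustrates the idea, and the Gaussian shape and parameter names are my assumptions here:

```python
import numpy as np

def eye_offsets(height, width, target_xy, sigma, strength, blend):
    """Opposite-signed brightness offsets for the left/right eye images.

    blend in [0, 1] fades the stimulus smoothly in and out; sigma, strength
    and the Gaussian falloff are illustrative assumptions.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    blob = np.exp(-((xs - target_xy[0]) ** 2 + (ys - target_xy[1]) ** 2) / (2 * sigma ** 2))
    delta = strength * blend * blob
    return delta, -delta  # add to one eye's image, subtract from the other
```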
Eye-tracking Head-mounted Display
In my master's thesis (and my early days as a PhD candidate) we developed one of the first ever head-mounted displays (HMDs) with an integrated eye tracking system to capture the user's gaze in real time.
The whole case is custom-designed and 3D-printed to house all necessary components: a 5.6" LCD (1280×800 pixels); lenses borrowed from an Oculus Rift Development Kit 1, kept in place by custom 3D-adjustable lens holders with integrated infrared LEDs to illuminate the eyes; two angled infrared mirrors that enable a frontal view onto the eyes for the two cameras attached to the sides; and a front-facing camera and orientation sensor for head tracking.
We've created different software components to use the system with all its parts.
The HMD's display is recognized as a generic additional monitor, i.e. the windows of VR apps are simply positioned on that screen.
As users can freely adjust the lens positions for best viewing quality, we implemented a calibration procedure to find each lens's position relative to the screen, and appropriate warping shaders to correct the resulting lens distortion.
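The warping itself is the usual radial-distortion correction: the fragment shader samples the rendered image with radially rescaled texture coordinates around the calibrated lens center. In Python terms, the remapping looks roughly like this (k1, k2 are placeholder coefficients, not our calibrated values):

```python
import numpy as np

def distort_uv(uv, lens_center, k1, k2):
    """Radially remap texture coordinates around the lens center.

    uv and lens_center are arrays in normalized [0, 1] texture space;
    k1, k2 are placeholder radial distortion coefficients.
    """
    d = uv - lens_center
    r2 = np.sum(d * d, axis=-1, keepdims=True)
    return lens_center + d * (1.0 + k1 * r2 + k2 * r2 * r2)
```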
Head tracking is split in two parts: orientation and position.
Head-orientation tracking is computed from gyro and acceleration data from a 6-DOF IMU sensor chip (run by an Arduino at 200 Hz) for instant feedback to the slightest head movements.
SLAM-based inside-out tracking using a front-facing camera is used for consistent head position tracking.
Both head tracking data streams are received and processed in a motion tracking module, giving access to the final data for any application via local sockets.
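Any application can then just connect to that module locally. As a sketch, a Python client could look like the following, assuming a hypothetical fixed-size packet of three position floats plus an orientation quaternion (the actual protocol differs):

```python
import socket
import struct

# Hypothetical packet layout: position (x, y, z) + orientation quaternion (x, y, z, w).
POSE_FORMAT = "<3f4f"
POSE_SIZE = struct.calcsize(POSE_FORMAT)

def read_poses(host="127.0.0.1", port=5555):
    """Yield (position, orientation) tuples from the motion tracking module."""
    with socket.create_connection((host, port)) as conn:
        buffer = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                return
            buffer += chunk
            while len(buffer) >= POSE_SIZE:
                packet, buffer = buffer[:POSE_SIZE], buffer[POSE_SIZE:]
                px, py, pz, qx, qy, qz, qw = struct.unpack(POSE_FORMAT, packet)
                yield (px, py, pz), (qx, qy, qz, qw)
```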
The video streams from the two eye tracking cameras are processed by a separate module.
Here, the pupils are first localized assuming they're the darkest regions in the images (exploiting that typically hardly any light comes back out of the pupil).
Afterwards, a starburst-inspired algorithm extracts the exact pupil position, which is then mapped to screen coordinates based on the calibration data — all in real time (at least 60 Hz).
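The first, coarse localization step is pretty much classic OpenCV; a simplified Python sketch of that "darkest blob" idea (not the actual pipeline code, and the threshold offset is made up):

```python
import cv2

def rough_pupil_center(eye_gray):
    """Coarse pupil localization in an infrared eye image (grayscale, uint8)."""
    blurred = cv2.GaussianBlur(eye_gray, (9, 9), 0)
    min_val, _, min_loc, _ = cv2.minMaxLoc(blurred)
    # Keep everything close to the darkest value and take the largest dark blob.
    _, mask = cv2.threshold(blurred, min_val + 15, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return min_loc
    largest = max(contours, key=cv2.contourArea)
    m = cv2.moments(largest)
    if m["m00"] == 0:
        return min_loc
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])  # blob centroid
```

The starburst-style refinement and the calibration-based mapping to screen coordinates then build on top of such a coarse estimate.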
The system was evaluated in a custom OpenGL renderer and as a plugin in Unreal Engine 4.
Application: Foveated Rendering
The major application we've investigated with this system is gaze-contingent adaptive rendering, a.k.a. foveated rendering. The idea is to reduce rendering workload by reducing image quality in a way that users will not notice. This is achieved by rendering the image in full quality only in the visual center (the gazed-at/foveal region) while reducing the quality for the rest, i.e. where we are not looking directly.
Technically, the approach builds upon discarding the shading of "probably less relevant" individual pixels, using a GPU feature called the early fragment/depth test. For this, the expensive fragment shader (lighting) computation is conditionally skipped based on a probability map. For the well-known Sponza scene, we were able to reduce the number of fully shaded pixels to about 40%, overall (including sampling and infill computation) yielding a render time reduction of about 25%.
Perceptually, the approach exploits the severe degradation of our visual acuity towards the periphery. A radial fall-off contributes the basic distribution to the probability map. On top of this, we also increase the sampling rate in "relevant" image regions, which we've defined as geometry edges, areas with fine details (textures), and highly specular (mirror-like) areas.
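In spirit, the per-pixel shading probability is just "radial fall-off from the gaze point, boosted where the image is interesting". The real thing runs on the GPU, but a conceptual numpy sketch (with invented weights) could look like this:

```python
import numpy as np

def shading_probability(dist_to_gaze, edges, detail, specular,
                        falloff=2.0, base=0.05, boost=0.5):
    """dist_to_gaze: per-pixel distance to the gaze point (e.g. in degrees);
    edges/detail/specular: per-pixel relevance maps in [0, 1].
    All weights here are illustrative, not the values used in the paper."""
    radial = 1.0 / (1.0 + falloff * dist_to_gaze ** 2)  # high in the fovea, falls off outward
    relevance = np.clip(edges + detail + specular, 0.0, 1.0)
    return np.clip(radial + boost * relevance, base, 1.0)

def shading_mask(probability, rng=np.random.default_rng()):
    # Conceptually what the early-depth trick enforces: shade a pixel only if it "wins" the draw.
    return rng.random(probability.shape) < probability
```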
Awards
ICG Dome
This research device can be best described as a small-sized planetarium. It's a (slightly angled) half sphere whose inner surface is entirely lit by projectors.
A key issue for many VR devices is that, despite the great progress made since the Sword of Damocles, they're still too heavy and uncomfortable for long-term use.
This is one important advantage of a dome: at most, lightweight 3D shutter glasses (with some trackable markers for head tracking) are required inside, allowing free movement with no loss of comfort.
Moreover, there is no space restriction that caps the possible resolution (you can always put in more projectors 😏), which yields a clear image with no visible pixel grid.
While this was already possible with good old CAVE systems, which even allowed a full surrounding projection, the spherical shape inherently avoids any visible construction edges or corners.
Because every system has its pros and cons, feature-wise a dome perfectly complements a set of recent HMDs.
In detail, this specific system comprises the dome itself, six 2.5K 120 Hz Barco projectors, and six OptiTrack motion capture cameras attached around the bottom edge of the dome. The projectors are wired to six dedicated rendering machines, controlled by an additional workstation.


Warping rendered views before projection onto the curved dome surface
To drive this rather complex system, an OpenGL-based application framework has been developed over the course of working with the system.
By now, it includes basic rendering capabilities (various 3D model formats, geometry, instances, volumes, lights, shadows, environment mapping, ...), position & orientation tracking along with warping for the curved dome surface, network communication and synchronization, gesture interaction, eye tracking via external or head-mounted systems, and many more.
Most features were implemented as required for our research projects, some emerged from student projects.
Have a look at the project page for an overview.
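To give an idea of the dome warping mentioned above: per projector pixel, you need to know which view direction (from the dome center) that pixel ends up covering, and then sample the rendered image there. One common way to set this up, sketched under my own assumptions rather than taken from the framework:

```python
import numpy as np

def dome_direction(ray_origin, ray_dir, dome_center, dome_radius):
    """Intersect a projector pixel's ray with the dome sphere and return the
    unit direction from the dome center towards the hit point (ray_dir is unit length)."""
    oc = ray_origin - dome_center
    b = np.dot(oc, ray_dir)
    c = np.dot(oc, oc) - dome_radius ** 2
    t = -b + np.sqrt(b * b - c)  # far root = hit on the inner dome surface
    hit = ray_origin + t * ray_dir
    d = hit - dome_center
    return d / np.linalg.norm(d)

def equirect_uv(direction):
    """Map a unit direction to equirectangular texture coordinates in [0, 1]."""
    x, y, z = direction
    u = 0.5 + np.arctan2(x, -z) / (2.0 * np.pi)
    v = 0.5 + np.arcsin(np.clip(y, -1.0, 1.0)) / np.pi
    return u, v
```

Pre-computing such a lookup once per projector (using its calibrated pose) then turns the per-frame warp into a plain texture fetch.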
As a current side project, we're in the process of getting the dome ready as an output display system for Unreal Engine via its nDisplay module, which would be great to see — fingers crossed!
Contact
Steve Grogorick
LinkedIn: Stay connected.
GitHub: Check out and contribute to my projects.
Google Scholar: Get a quick overview of my research papers.
TU BS Computer Graphics Lab: Get free author-version PDFs and additional details about the papers.