Scripted motion in VR: Begone!

 

Considering that a Virtual Reality ‘suit’ can take in all of a player’s movement, and that every one of those movements can be resisted with haptic feedback, one long-standing element of gaming can now be done away with: scripted motion.

 

The formal note for this blog: I have designed a suit capable of the ranges of motion detailed here, and I am seeking to present the idea to an appropriate, interested company. The blog is secondary to the design; the ‘suit’ came first and inspired every post on here.

 

Aside from low-resolution textures, it is perhaps the most visually conspicuous and disaffecting feature of games in this generation and every one before it: a certain movement repeated over and over again. Scripted motion denies the freedom of true movement, and the expression that goes with it, and it adds an element rarely sought in any interactive experience: predictability. Whatever immersion you had is broken when the game ‘takes over’, and what’s worse, many scripted motions occur at pivotal moments, when a very specific activity is taking place. You have fought hard to reach an objective, suffered innumerable trials and injuries, solved problems and overcome challenges which tried your patience for hours at a time, only to be presented with a brief animation of your character doing something you could have done yourself, and so be relieved of all responsibility for it. With this ‘suit’, you can do it yourself, and you will have to. But that joy which is visited on you is mirrored by dread in every game programmer, because scripted motion is a workable solution to an intensely complex problem.

 

The principle of ‘Ragdoll’ physics, while it can be intensely realistic, is usually also intensive to compute. Any kind of physics which reacts to stimuli is many times more difficult to program and render than simply playing an animation in 3D space, and that is partly why scripted motion has seen such extensive use in video games. Another reason is that, increasingly often, scripted motion for human avatars is ‘filmed’ using motion-capture sensors placed on actors, who act out the scene; this is many times easier than compiling a cartoon-like animation by describing what motion takes place frame by frame, or even using existing code to manipulate virtual avatars. Once a performance has been recorded, it is very much inadvisable (in other words, simply not done) to alter the motions which have been captured. Finally, complicated and unique motions are difficult to command with the limited amount of variation possible from the controls of any modern handset.

 

If Ragdoll were made available and, for instance, the player shoved one of the game’s characters off a cliff, a cutscene would now have to be altered to react to that event. In a milder example, the player simply getting in a character’s way is likely to ruin the timing of certain events and make dialogue run late, so denying the player the chance to make that mistake is judged to be worth the loss of realism. In short, if you allow the player to influence all events, then you must program NPCs with every conceivable reaction, and over any lengthy period that quickly becomes an almost infinite set of specific reactions. Scripted motion is an easy solution.

 

Perhaps you don’t bemoan scripted motion much and have simply gotten used to it, but remember: you are acting out a virtual series of events, with as much emphasis placed on your individual involvement and personal choice as possible. So why shouldn’t it be possible for you to stick out your foot, trip the hero and prevent him from making the leap onto a departing ship while serious music plays and the credits roll? Just the once. You’ll reset and watch him go like a good player, next time. Maybe.

 

Besides, could it not be said that we have justified this request by making the whole business of games design much easier, by instantiating the Generic and Branded Object Libraries, which do away with weeks of laborious coding and tweaking? We have created some slack, so now we can ask for other things to be improved in order to take it up.

 

True motion is an excellent use for this extra space. We might not do away with cutscenes entirely, as surely they will always have a place, but for the most part you’ll appreciate being able to chuckle some suitably dry remark before finishing off a boss you’ve chased for days, or decide whether to let them live on after all, without a message popping up on the screen saying “Make a decision – here are your choices”. The choices remain, but it will truly feel like you could have done anything if all you see is a defeated enemy at your feet and a gun in your hand, rather than being told that there were only two possibilities. You can express yourself in pivotal moments such as these, and so truly claim them as your own: nobody else in the world will have a recording of any game’s ending which is quite the same as yours.

 

So, let’s address the problem of Ragdoll and movement adjustment, at least in relation to player-NPC interaction.

 

Now, I’m no game programmer, but from what I observe of the games I’ve played which allow Ragdoll to interrupt scripted motion, such as the GTA series since GTA IV, it works roughly thus:

The game makes an avatar’s legs move in the direction they are forced once the centre of mass travels far enough past the current positions of the feet; in this way, avatars that are pushed, stumble. The second component appears to be that their arms are generally raised to certain (usually defensive) positions, to which they gradually return if moved or obstructed. Finally, the hands seem to protect avatars from a fall: the avatar registers the direction in which it is falling and gradually places its hands in the way, palms facing the ground, all the while the physics engine computes the force exerted on it by stimuli.
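
I can’t claim this is exactly how any of those engines implement it, but as a minimal sketch of the balance-and-recovery behaviour just described (written in Python purely for readability, with every helper such as step_towards, blend_towards_guard_pose and reach_hands invented for illustration rather than taken from any real engine), it might look something like this:

import math

STEP_THRESHOLD = 0.15    # metres the centre of mass may drift past the feet (illustrative)
ARM_RECOVERY_RATE = 2.0  # how quickly displaced arms ease back to the guard pose

def update_balance(avatar, dt):
    # Stumble heuristic: take a step when the centre of mass drifts
    # too far beyond the midpoint of the two feet.
    mid_x = (avatar.left_foot.x + avatar.right_foot.x) / 2
    mid_z = (avatar.left_foot.z + avatar.right_foot.z) / 2
    drift_x = avatar.centre_of_mass.x - mid_x
    drift_z = avatar.centre_of_mass.z - mid_z
    drift = math.hypot(drift_x, drift_z)

    if drift > STEP_THRESHOLD:
        # Step in the direction the avatar is being forced, so pushed avatars stumble.
        avatar.step_towards(drift_x / drift, drift_z / drift)

    # Arms gradually return to their usual (defensive) positions if moved or obstructed.
    for arm in (avatar.left_arm, avatar.right_arm):
        arm.blend_towards_guard_pose(ARM_RECOVERY_RATE * dt)

    # When falling, place the hands in the way, palms facing the ground.
    if avatar.is_falling():
        avatar.reach_hands(avatar.fall_direction(), palms_down=True)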

 

These basic steps toward simulating a human being off balance are at times extremely realistic, while at other times they make NPCs appear like frightened children when the player bumps gently into them. It is quite difficult to program the full range of human instincts and body-language patterns into every NPC, but this is precisely what I think should be done, and I have a shortcut in mind.

 

If we take a moment to reflect, we remember from other blog posts that motion input, formats for objects, menus, interfaces and the default physics engine have all been standardised (in the hypothetical sense: we have established that doing so is beneficial, but we have not yet designed them all explicitly), and it makes a certain amount of sense to add NPCs to the Object Libraries. They are, after all, entirely fictional people without any personality or intelligence of their own, and as such it is difficult to think of them as anything other than objects, from a programmer’s point of view.
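
As a small illustration of what treating NPCs as library objects could look like, here is a hypothetical Python sketch; the class names and fields are mine, not anything from an existing engine, and the point is only that the shared response code is written once and inherited by every character:

class LibraryObject:
    # Base class for anything held in the (hypothetical) Object Libraries.
    def __init__(self, mass, volume):
        self.mass = mass
        self.volume = volume

class NPCObject(LibraryObject):
    # An NPC is just another library object: the standard reflex and
    # emotion code lives here once, and individual characters only
    # supply parameters such as temperament.
    def __init__(self, mass, volume, temperament=1.0):
        super().__init__(mass, volume)
        self.temperament = temperament   # how readily 'annoying' stimuli raise anger
        self.emotions = {"anger": 0, "fear": 0, "annoyance": 0}

    def receive_stimulus(self, stimulus):
        # Standardised, inherited response code would go here;
        # individual games never need to rewrite it.
        pass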

 

So, with this in mind, I propose that the following be done to begin addressing complex, free-form motion for NPC objects in Virtual Reality (a rough code sketch of how these pieces might fit together follows the list):

 

  1. Innate human reflexes, such as the knee-jerk reflex, the ducking of the head at a loud noise, raising of the hands to protect the face and closing of the eyes if lunged at or if a fall is taken, etc., be catalogued to some degree (perhaps not exhaustively for early versions of the API) and that code which describes these actions be linked to NPC objects,

  2. Certain elements of human body language, such as the clasping of hands over the chest when frightened, curling inward of the fingers and clenching of the teeth when angry, raising of the hands to the back or top of the head when shocked or distraught, also be catalogued (certainly not exhaustively – I think we’ll be lucky to get even the most common behaviours agreed on, due to the objectionable nature of much of psychology) and put into code which is again linked to NPC objects,

  3. NPC objects themselves be described according to medical literature, for instance accounting for the typical relative sizes of the lower leg bones and lower arm bones (these are approximately equal in length for most humans), as well as typical ratios by which an increase in height from the average corresponds with broader shoulders, thicker facial bones, larger hands, etc.,

  4. For the purposes of gore and wound simulation, which are surely integral to many genres of both games and training simulations, the quantity of blood contained within an NPC be decided upon as a variable, relying on body mass and volume to give at least a roughly accurate figure; the long bones be considered an amalgamation of objects (bone shards) into which they disintegrate if struck with enough force (this is much easier, I think, than trying to decompose a single object); and, if a game or simulator is striving for true realism, the limbs themselves be objects which together comprise the whole body, so that they can be removed by stimuli,

  5. The emotions, again catalogued from the study of psychology, be described by variables whose values affect the behaviour of NPC objects appropriately and are themselves affected by stimuli, in order to simulate emotion to some small extent, e.g. anger making physical motions more muscular and forceful, the voice louder and its consonants more pronounced, and the face contorted,

  6. Finally, that the face area of NPCs be allocated more code than other areas, as this is where humans instinctively look, and that the major muscles there be simulated, their extensions kept as variables.
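
To show how those six points might hang together in code, here is a rough Python sketch; all of the names and numbers are placeholders of mine (except the blood figure, which uses the commonly quoted approximation of roughly 70 ml per kilogram of body mass):

# Illustrative data for an NPC object; none of this is an existing API.

REFLEXES = {                      # 1. innate reflexes, catalogued once
    "loud_noise":    "duck_head",
    "lunge_at_face": "raise_hands_and_close_eyes",
    "fall":          "brace_with_palms",
}

BODY_LANGUAGE = {                 # 2. common body-language patterns
    "frightened": ["clasp_hands_over_chest"],
    "angry":      ["curl_fingers", "clench_teeth"],
    "shocked":    ["hands_to_back_of_head"],
}

class NPCBody:
    def __init__(self, height_m, mass_kg):
        # 3. proportions scaled from medical averages (placeholder ratio)
        self.forearm_length = 0.15 * height_m
        self.lower_leg_length = self.forearm_length   # roughly equal, per the text above
        # 4. blood volume as a variable derived from body mass (~70 ml/kg)
        self.blood_litres = 0.07 * mass_kg
        # 5. emotions as 0-100 variables that stimuli push up or down
        self.emotions = {"anger": 0, "fear": 0, "annoyance": 0}
        # 6. the face gets finer-grained state than the rest of the body
        self.facial_muscles = {f"facialMuscle{i}": 0 for i in range(1, 21)}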

 

Some of these, naturally, have already been implemented in past game engines, and others perhaps demonstrate my lack of experience in this particular field, but they all seem valid enough to me, so if none of the above has offended technical knowledge superior to mine, then we can proceed.

 

It seems to me that all of these variables, such as the extension of individual facial muscles, the violence of a reaction (how sharply an NPC’s head is ducked at a loud noise, say), the depth of anger felt, etc., need only be considered between extremes, with all stimuli acting as contributors or detractors to their value.

For example, the ‘anger’ variable is manifest in how far the eyes are narrowed, how tightly the teeth and fists are clenched, how loud the voice is and so on, and so long as these all have a base point (relaxed face: anger = 0, facialMuscle1 = 0, facialMuscle2 = 0, voiceVolumeAtSource = 50, etc.) and an extreme (fury: anger = 100, facialMuscle1 = 100, facialMuscle2 = 100, voiceVolumeAtSource = 100, etc.), then by simple unary operations (increments and decrements) on local variables we can adjust all of these things, according to a logical rule base which reflects human psychology.

This is a complex process, but remember: because this will be standardised code inherited by all NPCs, it need only be written once and then adjusted with unique parameters, such as the specific temperament of a character making them more susceptible to rage from ‘annoying’ stimuli. We need not simulate personality directly; we just need to decide which stimuli are ‘annoying’, which other variables are affected by the ‘annoyance’ variable, and so on for the other emotions.
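
A minimal sketch of that arrangement, using the same variable names as above and with everything else invented purely for illustration, might be:

def clamp(value, lo=0, hi=100):
    return max(lo, min(hi, value))

class EmotionState:
    # A single 'anger' value held between two extremes; stimuli add to it,
    # time subtracts from it, and the outward signs are derived from it.
    def __init__(self, temperament=1.0):
        self.temperament = temperament   # per-character susceptibility to rage
        self.anger = 0

    def apply_stimulus(self, annoyance):
        # 'Annoying' stimuli push anger up, scaled by temperament.
        self.anger = clamp(self.anger + annoyance * self.temperament)

    def cool_down(self, dt, rate=5.0):
        # Anger decays gradually rather than switching off.
        self.anger = clamp(self.anger - rate * dt)

    def expression(self):
        # Relaxed face at anger = 0, fury at anger = 100.
        return {
            "facialMuscle1": self.anger,                  # eyes narrowed
            "facialMuscle2": self.anger,                  # jaw clenched
            "voiceVolumeAtSource": 50 + self.anger / 2,   # 50 at rest, 100 in fury
        }

The only per-character tuning here is the temperament parameter, which is exactly the ‘written once, adjusted with unique parameters’ arrangement described above.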

 

The nature of human muscles fits right in with this scheme: since our muscles only ‘pull’, each can be expressed at all times by a figure between 0 for minimum extension and 100 for maximum extension. By contrast, the precise value of annoyance brought about by any given event is somewhat arbitrary, but a healthy medium shouldn’t take too long to reach, at the discretion of the ‘director’ of any scene.

 

Lastly, interaction with NPC movement would ideally be based on many of the same principles that current Ragdoll uses, except without making NPCs ‘play stunned’ every time the player touches them. In the same way that gravity battles with the efforts of the legs to stand upright, so can the player’s input slow down or stop NPC motions, which in response are either exaggerated or abandoned. Simple obstruction of, or impact to, a limb is met with either aggressive or submissive actions, decided upon as a combination of those low-level reflexes and high-level psychological responses. One example:

The player discourages an NPC from shouting for help by pointing a gun at them. The NPC’s code recognises that an object is moving both (1) in close proximity to the face and (2) towards the face, and this triggers the arms to jerk upward to cover it. The NPC’s anger variable is increased, causing their teeth and fists to clench and the volume of their voice to rise. Finally, a unique sound file containing an angry statement or threat is played in time with their mouth moving. Because the anger variable decays gradually, they slowly calm down rather than flipping like a binary digit the moment the player gets dust on their shoes.
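
In the same hypothetical Python as before (with the vector helpers, the EmotionState from the previous sketch and the NPC methods all assumed rather than real), that example might read:

FACE_RADIUS = 0.5   # metres: what counts as 'in close proximity to the face' (illustrative)

def on_object_update(npc, obj, dt):
    # The gun-to-the-face example: a reflex fires when an object is both
    # close to the face and moving towards it, and emotion follows.
    to_face = npc.face_position() - obj.position
    approaching = obj.velocity.dot(to_face) > 0        # moving towards the face

    if to_face.length() < FACE_RADIUS and approaching:
        npc.trigger_reflex("raise_hands_to_face")      # low-level reflex: arms jerk upward
        npc.emotions.apply_stimulus(annoyance=30)      # anger rises: teeth and fists clench
        npc.play_voice_line("angry_threat")            # unique sound file, in time with the mouth

    # Anger decays over time, so the NPC calms down gradually
    # rather than flipping like a binary digit.
    npc.emotions.cool_down(dt)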

 

This level of variability is necessary if scripted motion is to be expelled from games, and the knowledge that the framework is as purpose-neutral as possible, and therefore applicable to all games while providing the best possible overall realism, is what warrants the considerable effort of building a ‘one for all’ family of game engines. It will also mean that games become massively easier to make, even to the point where amateur programmers can successfully build a richly detailed, expansive game that relies on their originality of plot and dialogue rather than hard-earned programming expertise and a huge graphics budget.
