Automating the conversion of natural language fiction to multi-modal 3D animated virtual environments
- Authors: Glass, Kevin Robert
- Date: 2009
- Subjects: Virtual computer systems , Virtual storage (Computer science) , Virtual reality , Computer animation , Fiction -- Computer programs , Narration (Rhetoric) -- Computer simulation , Animation (Cinematography) , Natural language processing (Computer Science)
- Language: English
- Type: Thesis , Doctoral , PhD
- Identifier: vital:4632 , http://hdl.handle.net/10962/d1006518
- Description: Popular fiction books describe rich visual environments that contain characters, objects, and behaviour. This research develops automated processes for converting text sourced from fiction books into animated virtual environments and multi-modal films. This involves the analysis of unrestricted natural language fiction to identify appropriate visual descriptions, and the interpretation of the identified descriptions for constructing animated 3D virtual environments. The goal of the text analysis stage is the creation of annotated fiction text, which identifies visual descriptions in a structured manner. A hierarchical rule-based learning system is created that induces patterns from example annotations provided by a human, and uses these for the creation of additional annotations. Patterns are expressed as tree structures that abstract the input text on different levels according to structural (token, sentence) and syntactic (parts-of-speech, syntactic function) categories. Patterns are generalized using pair-wise merging, where dissimilar sub-trees are replaced with wild-cards. The result is a small set of generalized patterns that are able to create correct annotations. A set of generalized patterns represents a model of an annotator's mental process regarding a particular annotation category. Annotated text is interpreted automatically for constructing detailed scene descriptions. This includes identifying which scenes to visualize, and identifying the contents and behaviour in each scene. Entity behaviour in a 3D virtual environment is formulated using time-based constraints that are automatically derived from annotations. Constraints are expressed as non-linear symbolic functions that restrict the trajectories of a pair of entities over a continuous interval of time. Solutions to these constraints specify precise behaviour. 
We create an innovative quantified constraint optimizer for locating sound solutions, which uses interval arithmetic for treating time and space as contiguous quantities. This optimization method uses a technique of constraint relaxation and tightening that allows solution approximations to be located where constraint systems are inconsistent (an ability not previously explored in interval-based quantified constraint solving). 3D virtual environments are populated by automatically selecting geometric models or procedural geometry-creation methods from a library. 3D models are animated according to trajectories derived from constraint solutions. The final animated film is sequenced using a range of modalities including animated 3D graphics, textual subtitles, audio narrations, and foleys. Hierarchical rule-based learning is evaluated over a range of annotation categories. Models are induced for different categories of annotation without modifying the core learning algorithms, and these models are shown to be applicable to different types of books. Models are induced automatically with accuracies ranging between 51.4% and 90.4%, depending on the category. We show that models are refined if further examples are provided, and this supports a boot-strapping process for training the learning mechanism. The task of interpreting annotated fiction text and populating 3D virtual environments is successfully automated using our described techniques. Detailed scene descriptions are created accurately, where between 83% and 96% of the automatically generated descriptions require no manual modification (depending on the type of description). The interval-based quantified constraint optimizer fully automates the behaviour specification process. Sample animated multi-modal 3D films are created using extracts from fiction books that are unrestricted in terms of complexity or subject matter (unlike existing text-to-graphics systems). 
These examples demonstrate that: behaviour is visualized that corresponds to the descriptions in the original text; appropriate geometry is selected (or created) for visualizing entities in each scene; sequences of scenes are created for a film-like presentation of the story; and that multiple modalities are combined to create a coherent multi-modal representation of the fiction text. This research demonstrates that visual descriptions in fiction text can be automatically identified, and that these descriptions can be converted into corresponding animated virtual environments. Unlike existing text-to-graphics systems, we describe techniques that function over unrestricted natural language text and perform the conversion process without the need for manually constructed repositories of world knowledge. This enables the rapid production of animated 3D virtual environments, allowing the human designer to focus on creative aspects.
- Date Issued: 2009
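The pair-wise merging described in the record above (dissimilar sub-trees replaced with wild-cards) can be illustrated with a minimal sketch. The tree encoding as `(label, children)` tuples, the `"*"` wild-card symbol, and the function names `merge` and `matches` are all assumptions made for illustration; they are not taken from the thesis.

```python
WILDCARD = "*"  # hypothetical wild-card symbol

def merge(a, b):
    """Generalize two pattern trees; dissimilar sub-trees become a wild-card."""
    # Leaves are plain strings: keep them if identical, otherwise generalize.
    if isinstance(a, str) or isinstance(b, str):
        return a if a == b else WILDCARD
    # Internal nodes are (label, children) tuples.
    label_a, kids_a = a
    label_b, kids_b = b
    if label_a != label_b or len(kids_a) != len(kids_b):
        return WILDCARD
    return (label_a, tuple(merge(x, y) for x, y in zip(kids_a, kids_b)))

def matches(pattern, tree):
    """Check whether a (possibly generalized) pattern covers a concrete tree."""
    if pattern == WILDCARD:
        return True
    if isinstance(pattern, str) or isinstance(tree, str):
        return pattern == tree
    (p_label, p_kids), (t_label, t_kids) = pattern, tree
    return (p_label == t_label and len(p_kids) == len(t_kids)
            and all(matches(p, t) for p, t in zip(p_kids, t_kids)))

# Two example annotations that differ only in the noun token.
s1 = ("sentence", (("noun", ("dog",)), ("verb", ("ran",))))
s2 = ("sentence", (("noun", ("cat",)), ("verb", ("ran",))))
general = merge(s1, s2)

# The merged pattern covers an unseen sentence with the same structure.
s3 = ("sentence", (("noun", ("bird",)), ("verb", ("ran",))))
print(general, matches(general, s3))
```

In this toy encoding, merging two examples yields one pattern that also matches new text with the same structure, which is the sense in which a small set of generalized patterns can produce additional annotations.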
Using virtual reality to monitor and control an industrial robot via the Internet
- Authors: Vermeulen, Heinrich
- Date: 2001
- Subjects: Virtual reality , Robotics -- Computer programs
- Language: English
- Type: Thesis , Masters , MTech (Engineering)
- Identifier: vital:10820 , http://hdl.handle.net/10948/74
- Description: Manufacturing processes may be modeled in various ways, including 3D modeling. There is a need to visualise, control and monitor manufacturing processes remotely via the Internet. Virtual Reality (VR) can be described as the science of integrating man with information. It is based on three distinct environments: three-dimensional, interactive and computer-generated. VR has come to the Internet in the form of VR modeling. The evolution of Web technologies in recent years has enabled the use of VR modeling for visualisation of manufacturing processes. The VR modeling language (VRML), which has become the standard for transmitting 3D virtual worlds across the Internet, can be used to control and monitor manufacturing processes visually. A 3D model of a manufacturing process, specifically an industrial robot arm, was created for this project. This model was successfully linked to the industrial robot that it represents in order to control and monitor the robot’s actions remotely via the Internet using Web technologies. This dissertation proves the viability of using Virtual Reality to effectively visualise, monitor and control an industrial robot via the Internet. It also describes the methodology that was followed in modeling the industrial robot arm in VRML as well as linking the model to the real world application.
- Date Issued: 2001
Designing and implementing a virtual reality interaction framework
- Authors: Rorke, Michael
- Date: 2000
- Subjects: Virtual reality , Computer simulation , Human-computer interaction , Computer graphics
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4623 , http://hdl.handle.net/10962/d1006491
- Description: Virtual Reality offers the possibility for humans to interact in a more natural way with the computer and its applications. Currently, Virtual Reality is used mainly in the field of visualisation where 3D graphics allow users to more easily view complex sets of data or structures. The field of interaction in Virtual Reality has been largely neglected due mainly to problems with input devices and equipment costs. Recent research has aimed to overcome these interaction problems, thereby creating a usable interaction platform for Virtual Reality. This thesis presents a background into the field of interaction in Virtual Reality. It goes on to propose a generic framework for the implementation of common interaction techniques into a homogeneous application development environment. This framework adds a new layer to the standard Virtual Reality toolkit – the interaction abstraction layer, or interactor layer. This separation is in line with current HCI practices. The interactor layer is further divided into specific sections – input component, interaction component, system component, intermediaries, entities and widgets. Each of these performs a specific function, with clearly defined interfaces between the different components to promote easy object-oriented implementation of the framework. The validity of the framework is shown in comparison with accepted taxonomies in the area of Virtual Reality interaction, thus demonstrating that the framework covers all the relevant factors involved in the field. Furthermore, the thesis describes an implementation of this framework. The implementation was completed using the Rhodes University CoRgi Virtual Reality toolkit. Several postgraduate students in the Rhodes University Computer Science Department utilised the framework implementation to develop a set of case studies. 
These case studies demonstrate the practical use of the framework to create useful Virtual Reality applications, as well as demonstrating the generic nature of the framework and its extensibility to be able to handle new interaction techniques. Finally, the generic nature of the framework is further demonstrated by moving it from the standard CoRgi Virtual Reality toolkit, to a distributed version of this toolkit. The distributed implementation of the framework utilises the Common Object Request Broker Architecture (CORBA) to implement the distribution of the objects in the system. Using this distributed implementation, we are able to ascertain that CORBA is useful in the field of distributed real-time Virtual Reality, even taking into account the extra overhead introduced by the additional abstraction layer. We conclude from this thesis that it is important to abstract the interaction layer from the other layers of a Virtual Reality toolkit in order to provide a consistent interface to developers. We have shown that our framework is implementable and useful in the field, making it easier for developers to include interaction in their Virtual Reality applications. Our framework is able to handle all the current aspects of interaction in Virtual Reality, as well as being general enough to implement future interaction techniques. The framework is also applicable to different Virtual Reality toolkits and development platforms, making it ideal for developing general, cross-platform interactive Virtual Reality applications.
- Date Issued: 2000
Development of the components of a low cost, distributed facial virtual conferencing system
- Authors: Panagou, Soterios
- Date: 2000 , 2011-11-10
- Subjects: Virtual computer systems , Virtual reality , Computer conferencing
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4622 , http://hdl.handle.net/10962/d1006490
- Description: This thesis investigates the development of a low cost, component based facial virtual conferencing system. The design is decomposed into an encoding phase and a decoding phase, which communicate with each other via a network connection. The encoding phase is composed of three components: model acquisition (which handles avatar generation), pose estimation and expression analysis. Audio is not considered part of the encoding and decoding process, and as such is not evaluated. The model acquisition component is implemented using a visual hull reconstruction algorithm that is able to reconstruct real-world objects using only sets of images of the object as input. The object to be reconstructed is assumed to lie in a bounding volume of voxels. The reconstruction process involves the following stages: space carving for basic shape extraction; isosurface extraction to remove voxels not part of the surface of the reconstruction; mesh connection to generate a closed, connected polyhedral mesh; texture generation, achieved by Gouraud shading the reconstruction with a vertex colour map; and mesh decimation to simplify the object. The original algorithm has complexity O(n), but suffers from an inability to reconstruct concave surfaces that do not form part of the visual hull of the object. A novel extension to this algorithm based on Normalised Cross Correlation (NCC) is proposed to overcome this problem. An extension to speed up traditional NCC evaluations is proposed which reduces the NCC search space from a 2D search problem down to a single evaluation. Pose estimation and expression analysis are performed by tracking six fiducial points on the face of a subject. A tracking algorithm is developed that uses Normalised Cross Correlation to facilitate robust tracking that is invariant to changing lighting conditions, rotations and scaling. 
Pose estimation involves the recovery of the head position and orientation through the tracking of the triangle formed by the subject's eyebrows and nose tip. A rule-based evaluation of points that are tracked around the subject's mouth forms the basis of the expression analysis. A user assisted feedback loop and caching mechanism is used to overcome tracking errors due to fast motion or occlusions. The NCC tracker is shown to achieve a tracking performance of 10 fps when tracking the six fiducial points. The decoding phase is divided into 3 tasks, namely: avatar movement, expression generation and expression management. Avatar movement is implemented using the base VR system. Expression generation is facilitated using a Vertex Interpolation Deformation method. A weighting system is proposed for expression management. Its function is to gradually transform from one expression to the next. The use of the vertex interpolation method allows real-time deformations of the avatar representation, achieving 16 fps when applied to a model consisting of 7500 vertices. An Expression Parameter Lookup Table (EPLT) facilitates an independent mapping between the two phases. It defines a list of generic expressions that are known to the system and associates an Expression ID with each one. For each generic expression, it relates the expression analysis rules for any subject with the expression generation parameters for any avatar model. The result is that facial expression replication between any subject and avatar combination can be performed by transferring only the Expression ID from the encoder application to the decoder application. The ideas developed in the thesis are demonstrated in an implementation using the CoRgi Virtual Reality system. It is shown that the virtual-conferencing application based on this design requires only a bandwidth of 2 Kbps.
- Date Issued: 2000
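The record above relies on Normalised Cross Correlation (NCC) for tracking that tolerates changing lighting. A minimal sketch of the standard zero-mean NCC score between two equally sized patches follows; the function name `ncc`, the flat-list patch representation, and the zero-mean variant are assumptions for illustration, and the thesis may use a different formulation or the accelerated evaluation it proposes.

```python
import math

def ncc(patch_a, patch_b):
    """Zero-mean normalised cross correlation of two equal-length patches.

    Patches are flat lists of pixel intensities (an illustrative choice).
    The score lies in [-1, 1]; 1 means identical up to gain and offset.
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    # Subtracting the mean removes a constant brightness offset.
    dev_a = [v - mean_a for v in patch_a]
    dev_b = [v - mean_b for v in patch_b]
    # Dividing by the norms removes a multiplicative gain (contrast) change.
    denom = math.sqrt(sum(v * v for v in dev_a) * sum(v * v for v in dev_b))
    if denom == 0.0:
        return 0.0  # a flat patch carries no structure to correlate
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom

template = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in template]   # same pattern under a lighting change
inverted = list(reversed(template))        # opposite intensity ramp

print(ncc(template, brighter), ncc(template, inverted))
```

Because the score is unchanged by a gain and offset applied to either patch, a tracker can compare a stored template against candidate image regions without re-calibrating for lighting; invariance to rotation and scaling, which the record also claims, is not captured by this sketch.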
Minimal motion capture with inverse kinematics for articulated human figure animation
- Authors: Casanueva, Luis
- Date: 2000
- Subjects: Virtual reality , Image processing -- Digital techniques
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4620 , http://hdl.handle.net/10962/d1006485
- Description: Animating an articulated figure usually requires expensive hardware in terms of motion capture equipment, processing power and rendering power. This implies a high cost system and thus eliminates the use of personal computers to drive avatars in virtual environments. We propose a system to animate an articulated human upper body in real-time, using minimal motion capture trackers to provide position and orientation for the limbs. The system has to drive an avatar in a virtual environment on a low-end computer. The cost of the motion capture equipment must be relatively low (hence the use of minimal trackers). We discuss the various types of motion capture equipment and decide to use electromagnetic trackers which are adequate for our requirements while being reasonably priced. We also discuss the use of inverse kinematics to solve for the articulated chains making up the topology of the articulated figure. Furthermore, we offer a method to describe articulated chains as well as a process to specify the reach of up to four link chains with various levels of redundancy for use in articulated figures. We then provide various types of constraints to reduce the redundancy of non-defined articulated chains, specifically for chains found in an articulated human upper body. Such methods include a way to solve for the redundancy in the orientation of the neck link, as well as three different methods to solve the redundancy of the articulated human arm. The first method involves eliminating a degree of freedom from the chain, thus reducing its redundancy. The second method calculates the elevation angle of the elbow position from the elevation angle of the hand. The third method determines the actual position of the elbow from an average of previous positions of the elbow according to the position and orientation of the hand. The previous positions of the elbow are captured during the calibration process. 
The redundancy of the neck is easily solved due to the small amount of redundancy in the chain. When solving the arm, the first method, which should give a perfect result in theory, gives a poor result in practice due to the limitations of both the motion capture equipment and the design. The second method provides an adequate result for the position of the redundant elbow in most cases, although it fails in some cases. Still, it benefits from a simple approach as well as very little need for calibration. The third method provides the most accurate result of the three for the position of the redundant elbow, although it also fails in some cases. This method, however, requires a long calibration session for each user. The last two methods allow the calibration data to be reused in later sessions, thus considerably reducing the calibration required. In combination with a virtual reality system, these processes allow for the real-time animation of an articulated figure to drive avatars in virtual environments or for low quality animation on a low-end computer.
- Date Issued: 2000
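The elbow redundancy discussed in the record above can be illustrated with a simplified planar analogue: a two-link arm whose hand position is known but whose elbow can sit on either side of the shoulder-to-hand line. This is a textbook closed-form solution, not the thesis's method; the function names, the `flip` parameter, and the link lengths are assumptions for illustration.

```python
import math

def two_link_ik(x, y, upper, fore, flip=False):
    """Joint angles (shoulder, elbow) placing the hand of a planar arm at (x, y).

    `upper` and `fore` are the two link lengths. `flip` selects the mirrored
    elbow solution, which reaches the same hand position.
    """
    dist_sq = x * x + y * y
    # Law of cosines gives the elbow bend from the shoulder-to-hand distance.
    cos_elbow = (dist_sq - upper * upper - fore * fore) / (2 * upper * fore)
    if not -1.0 <= cos_elbow <= 1.0:
        raise ValueError("target is out of reach")
    elbow = math.acos(cos_elbow)
    if flip:
        elbow = -elbow
    shoulder = math.atan2(y, x) - math.atan2(fore * math.sin(elbow),
                                             upper + fore * math.cos(elbow))
    return shoulder, elbow

def forward(shoulder, elbow, upper, fore):
    """Return (elbow position, hand position) for the given joint angles."""
    ex, ey = upper * math.cos(shoulder), upper * math.sin(shoulder)
    hx = ex + fore * math.cos(shoulder + elbow)
    hy = ey + fore * math.sin(shoulder + elbow)
    return (ex, ey), (hx, hy)

# Same hand target, two different elbow placements.
elbow_a, hand_a = forward(*two_link_ik(1.0, 1.0, 1.0, 1.0), 1.0, 1.0)
elbow_b, hand_b = forward(*two_link_ik(1.0, 1.0, 1.0, 1.0, flip=True), 1.0, 1.0)
print(elbow_a, elbow_b, hand_a, hand_b)
```

Both solutions reach the same hand position with different elbow positions, so a tracker on the hand alone cannot distinguish them. In 3D the ambiguity is a continuous range of elbow positions rather than two, which is why the record describes extra constraints or calibration data for choosing one.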
Virtual sculpting : an investigation of directly manipulated free-form deformation in a virtual environment
- Authors: Gain, James Edward
- Date: 1996
- Subjects: Computer simulation , Computer graphics , Virtual reality
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4660 , http://hdl.handle.net/10962/d1006661
- Description: This thesis presents a Virtual Sculpting system, which addresses the problem of Free-Form Solid Modelling. The disparate elements of a Polygon-Mesh representation, a Directly Manipulated Free-Form Deformation sculpting tool, and a Virtual Environment are drawn into a cohesive whole under the mantle of a clay-sculpting metaphor. This enables a user to mould and manipulate a synthetic solid interactively as if it were composed of malleable clay. The focus of this study is on the interactivity, intuitivity and versatility of such a system. To this end, a range of improvements is investigated which significantly enhances the efficiency and correctness of Directly Manipulated Free-Form Deformation, both separately and as a seamless component of the Virtual Sculpting system.
- Date Issued: 1996
Parallel implementation of a virtual reality system on a transputer architecture
- Authors: Bangay, Shaun Douglas
- Date: 1994 , 2012-10-11
- Subjects: Virtual reality , Computer simulation , Transputers
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4668 , http://hdl.handle.net/10962/d1006687
- Description: A Virtual Reality is a computer model of an environment, actual or imagined, presented to a user in as realistic a fashion as possible. Stereo goggles may be used to provide the user with a view of the modelled environment from within the environment, while a data-glove is used to interact with the environment. To simulate reality on a computer, the machine has to produce realistic images rapidly. Such a requirement usually necessitates expensive equipment. This thesis presents an implementation of a virtual reality system on a transputer architecture. The system is general, and is intended to provide support for the development of various virtual environments. The three main components of the system are the output device drivers, the input device drivers, and the virtual world kernel. This last component is responsible for the simulation of the virtual world. The rendering system is described in detail. Various methods for implementing the components of the graphics pipeline are discussed. These are then generalised to make use of the facilities provided by the transputer processor for parallel processing. A number of different decomposition techniques are implemented and compared. The emphasis in this section is on the speed at which the world can be rendered, and the interaction latency involved. In the best case, where almost linear speedup is obtained, a world containing over 250 polygons is rendered at 32 frames/second. The bandwidth of the transputer links is the major factor limiting speedup. A description is given of an input device driver which makes use of a powerglove. Techniques for overcoming the limitations of this device, and for interacting with the virtual world, are discussed. The virtual world kernel is designed to make extensive use of the parallel processing facilities provided by transputers. It is capable of providing support for multiple worlds concurrently, and for multiple users interacting with these worlds. 
Two applications are described that were successfully implemented using this system. The design of the system is compared with other recently developed virtual reality systems. Features that are common or advantageous in each of the systems are discussed. The system described in this thesis compares favourably, particularly in its use of parallel processors.
- Date Issued: 1994