Finger tracking
In the field of technology and image processing, finger tracking is a high-resolution technique used to determine the successive positions of the user's fingers and hence to represent objects in 3D. In addition, finger tracking serves as a computer input tool, acting as an external device similar to a keyboard and a mouse.
Introduction
The finger tracking system is focused on user-data interaction, where the user interacts with virtual data by manipulating with the fingers the volume of the 3D object to be represented. The system arose from the human-computer interaction problem. The objective is to make communication between the user and the computer, through gestures and hand movements, more intuitive. To that end, finger tracking systems have been created. These systems track in real time the 3D and 2D position and orientation of the fingers or of each marker, and use intuitive hand movements and gestures to interact.
Types of tracking
There are many options for the implementation of finger tracking, and a large number of theses have been written in this field with the aim of producing a general classification. The technique can be divided into tracking with an interface and tracking without an interface. The first requires an intermediate external device, used as a tool for executing different instructions. The second computes a sequence estimation over the images, in which the hand is separated from the background.
Tracking with interface
This system uses motion capture: the location of the markers and patterns is tracked in 3D, and the system identifies them and labels each marker according to the position of the user's fingers. The 3D coordinates of the labeled markers are produced in real time and made available to other applications.
Markers
Some optical systems, such as Vicon, can capture hand motion through markers. Each hand carries one marker per "operative" finger. Three high-resolution cameras capture each marker and measure its position, which is only possible while the cameras can see the marker. The visual markers, usually known as rings or bracelets, are used to recognize the user's gestures in 3D. In addition, as the classification indicates, these rings act as an interface in 2D.
Occlusion as an interaction method
Visual occlusion is a very intuitive method for providing a more realistic view of virtual information in three dimensions. The interfaces provide more natural 3D interaction techniques over base 6.
Marker functionality
Markers operate through interaction points, which are usually set in advance so that their regions are known. Because of that, it is not necessary to follow each marker all the time; multiple pointers can be treated in the same way as a single operating pointer. To detect such pointers during an interaction, ultrasound and infrared sensors are enabled. Handling many pointers as one helps when the system has to operate under difficult conditions such as bad illumination, motion blur, malformation of the marker, or occlusion. The system can keep following the object even if some markers are not visible: because the spatial relationships between all the markers are known, the positions of the markers that are not visible can be computed from the markers that are. There are several methods for marker detection, such as the border marker and estimated marker methods.
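The recovery of hidden markers from a known marker layout can be illustrated with a minimal sketch. It is not taken from any particular tracking system: it assumes a rigid marker layout, works in 2D for brevity (a 3D version would need a full rotation fit), and the names `fit_rigid_2d` and `recover_hidden` are hypothetical.

```python
import math

def fit_rigid_2d(template, observed):
    """Least-squares rotation and translation mapping template points onto observed points."""
    n = len(template)
    tcx = sum(p[0] for p in template) / n
    tcy = sum(p[1] for p in template) / n
    ocx = sum(p[0] for p in observed) / n
    ocy = sum(p[1] for p in observed) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(template, observed):
        ax, ay = px - tcx, py - tcy      # template point relative to its centroid
        bx, by = qx - ocx, qy - ocy      # observed point relative to its centroid
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)       # optimal rotation angle in 2D
    c, s = math.cos(theta), math.sin(theta)
    # Translation that moves the rotated template centroid onto the observed centroid.
    shift_x = ocx - (c * tcx - s * tcy)
    shift_y = ocy - (s * tcx + c * tcy)
    return theta, (shift_x, shift_y)

def recover_hidden(template, visible):
    """Estimate positions of markers missing from `visible`.

    template: dict name -> (x, y) in the layout's own frame (assumed known).
    visible:  dict name -> (x, y) as currently observed (at least two markers).
    """
    names = sorted(visible)
    if len(names) < 2:
        raise ValueError("need at least two visible markers")
    theta, (shift_x, shift_y) = fit_rigid_2d(
        [template[k] for k in names], [visible[k] for k in names])
    c, s = math.cos(theta), math.sin(theta)
    return {name: (c * x - s * y + shift_x, s * x + c * y + shift_y)
            for name, (x, y) in template.items() if name not in visible}

# Layout rotated by 90 degrees and shifted by (2, 3); marker "c" is occluded.
layout = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0)}
seen = {"a": (2.0, 3.0), "b": (2.0, 4.0)}
hidden = recover_hidden(layout, seen)
```

With two or more visible markers the fit is determined, so every occluded marker in the layout can be placed from the same rotation and translation.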
- The Homer technique combines ray selection with direct handling: an object is selected, and then its position and orientation are handled as if it were connected directly to the hand.
- The Conner technique presents a set of 3D widgets that permit indirect interaction with the virtual objects through a virtual widget that acts as an intermediary.
Articulated hand tracking
This technique is interesting because it is simpler and less expensive: it needs only one camera. The simplicity comes at the cost of lower precision than the previous technique. It provides a new basis for interaction in modeling, animation control, and added realism. It uses a glove composed of a set of colors that are assigned according to the position of the fingers. The color test is limited to the computer's vision system; from the captured image and the position of each color, the position of the hand is determined.
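As a rough illustration of the color test, the sketch below assigns each pixel to the nearest glove color and averages the pixel positions per color. The palette, the distance threshold, and the function names are assumptions made for the example, not values from an actual glove design.

```python
# Hypothetical glove palette: one reference RGB color per finger.
GLOVE_COLORS = {
    "thumb": (255, 0, 0),
    "index": (0, 255, 0),
    "middle": (0, 0, 255),
}

def nearest_label(rgb, palette=GLOVE_COLORS, max_dist=100.0):
    """Return the finger whose reference color is closest to rgb, or None if none is close."""
    best, best_d = None, max_dist ** 2
    for label, ref in palette.items():
        d = sum((a - b) ** 2 for a, b in zip(rgb, ref))
        if d < best_d:
            best, best_d = label, d
    return best

def finger_centroids(pixels):
    """pixels: iterable of (x, y, (r, g, b)). Returns the mean position per finger label."""
    sums = {}
    for x, y, rgb in pixels:
        label = nearest_label(rgb)
        if label is None:
            continue  # pixel does not match any glove color
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

sample = [
    (10, 10, (250, 5, 5)),
    (12, 10, (255, 0, 0)),
    (40, 20, (0, 240, 10)),
    (0, 0, (128, 128, 128)),  # background gray, ignored
]
centroids = finger_centroids(sample)
```

A real system would work on a full camera frame and would probably compare colors in a space less sensitive to lighting than RGB; plain RGB distance is used here only to keep the sketch short.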
Tracking without interface
In terms of visual perception, the legs and hands can be modeled as articulated mechanisms: systems of rigid bodies connected to each other by joints with one or more degrees of freedom. This model can be applied at a smaller scale to describe hand motion and at a larger scale to describe full-body motion. A particular finger motion, for example, can be recognized from its usual joint angles, independently of the position of the hand relative to the camera.
Many tracking systems are based on a model posed as a sequence estimation problem: given a sequence of images and a model of change, the 3D configuration is estimated for each frame.
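One simple way to combine a sequence of measurements with a model of change is an alpha-beta filter with a constant-velocity motion model. This is a stand-in chosen for brevity, not the estimator used by any specific system described here; the gains `alpha` and `beta` and the function name are assumptions.

```python
def alpha_beta_track(measurements, alpha=0.5, beta=0.5, dt=1.0):
    """Track a state vector over time.

    measurements: list of per-frame state vectors (lists of floats).
    Each frame, the state is first predicted with a constant-velocity
    model of change and then corrected toward the new measurement.
    """
    state = list(measurements[0])
    velocity = [0.0] * len(state)
    estimates = [list(state)]
    for z in measurements[1:]:
        predicted = [x + v * dt for x, v in zip(state, velocity)]
        residual = [zi - pi for zi, pi in zip(z, predicted)]
        state = [p + alpha * r for p, r in zip(predicted, residual)]
        velocity = [v + beta * r / dt for v, r in zip(velocity, residual)]
        estimates.append(list(state))
    return estimates

# A single joint angle that increases by one unit per frame.
track = alpha_beta_track([[0.0], [1.0], [2.0], [3.0]])
```

Each component of the state (a joint angle or a palm coordinate) is filtered independently here; a fuller treatment would model the coupling between components.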
All possible hand configurations are represented by vectors in a state space, which encodes the position of the hand and the angles of the finger joints. Each hand configuration generates a set of images through the detection of the occlusion borders of the finger joints. The estimate for each image is calculated by finding the state vector that best fits the measured characteristics. The finger joints add 21 states to the rigid body movement of the palm, which increases the computational cost of the estimation. The technique consists of labeling each finger joint; each link is modeled as a cylinder. Axes are placed at each joint, and the bisector of this axis is the projection of the joint. Hence we use 3 DOF, because there are only 3 degrees of movement.
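Finding the state vector that best fits the measurements can be sketched with a deliberately small model: a planar finger with three joint angles, fitted by exhaustive search over a coarse grid. The link lengths, the grid resolution, and the function names are assumptions for illustration; a practical tracker would use a far larger state and an iterative minimizer rather than brute force.

```python
import math
from itertools import product

# Hypothetical link lengths (proximal, middle, distal), in centimetres.
LINK_LENGTHS = (4.0, 2.5, 2.0)

def joint_positions(angles, links=LINK_LENGTHS):
    """Forward kinematics of a planar finger: the end point of each link."""
    x = y = heading = 0.0
    points = []
    for angle, length in zip(angles, links):
        heading += angle                  # joint angles accumulate along the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points

def fit_error(angles, measured):
    """Sum of squared distances between predicted and measured link end points."""
    return sum((px - mx) ** 2 + (py - my) ** 2
               for (px, py), (mx, my) in zip(joint_positions(angles), measured))

def best_state(measured, step_deg=10, max_deg=90):
    """Search a coarse grid of joint angles for the state with the smallest error."""
    candidates = [math.radians(d) for d in range(0, max_deg + 1, step_deg)]
    return min(product(candidates, repeat=len(LINK_LENGTHS)),
               key=lambda state: fit_error(state, measured))

true_state = tuple(math.radians(d) for d in (20, 40, 10))
estimate = best_state(joint_positions(true_state))
estimate_deg = [round(math.degrees(a)) for a in estimate]
```

The grid has only 1,000 states for three angles, but it grows exponentially with the number of joints, which illustrates why the 21 additional finger states raise the cost of estimation.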
As with the previous type of tracking, there is a wide variety of theses on this subject, so the steps and processing techniques differ depending on the purpose and needs of the person who will use the technique. In general, however, most systems carry out the following steps:
- Background subtraction: all captured images are convolved with a 5x5 Gaussian filter and then scaled to reduce noisy pixel data.
- Segmentation: a binary mask is applied so that the pixels belonging to the hand are shown in white and all other pixels in black.
- Region extraction: the left and right hands are detected based on a comparison between them.
- Characteristic extraction: the fingertips are located and each point is classified as a peak or a valley. To classify the points, they are transformed into 3D vectors in the xy-plane, usually called pseudo vectors, and their cross product is computed. If the z component of the cross product is positive, the point is considered a peak; if it is negative, the point is a valley.
- Point and pinch gesture recognition: a gesture is associated according to the reference points (fingertips) that are visible.
- Pose estimation: the position of the hands is identified using algorithms that compute the distances between positions.
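The peak and valley test from the characteristic extraction step can be sketched as follows. The sketch assumes the hand contour is given as a counterclockwise polygon in coordinates where y points up; with a clockwise contour or image coordinates where y points down, the signs are reversed. The function names are hypothetical.

```python
def cross_z(prev_pt, pt, next_pt):
    """z component of the cross product of the two contour edges meeting at pt.

    The edges are treated as 3D vectors lying in the xy-plane (z = 0),
    so only the z component of their cross product can be non-zero.
    """
    v1 = (pt[0] - prev_pt[0], pt[1] - prev_pt[1])
    v2 = (next_pt[0] - pt[0], next_pt[1] - pt[1])
    return v1[0] * v2[1] - v1[1] * v2[0]

def classify_contour(points):
    """Label every vertex of a closed counterclockwise contour as peak, valley, or flat."""
    labels = []
    n = len(points)
    for i in range(n):
        z = cross_z(points[i - 1], points[i], points[(i + 1) % n])
        labels.append("peak" if z > 0 else "valley" if z < 0 else "flat")
    return labels

# Two raised "fingers" at (4, 3) and (0, 3) with a notch between them at (2, 1).
contour = [(0, 0), (4, 0), (4, 3), (2, 1), (0, 3)]
labels = classify_contour(contour)
```

On a real contour the neighbors would be taken several pixels away along the outline rather than immediately adjacent, so that pixel-level noise does not flip the sign.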
Other tracking techniques
It is also possible to perform active tracking of fingers. The Smart Laser Scanner is a marker-less finger tracking system using a modified laser scanner/projector, developed at the University of Tokyo in 2003-2004. It can acquire three-dimensional coordinates in real time without any image processing at all: essentially, it is a rangefinder scanner that, instead of continuously scanning the full field of view, restricts its scanning area to a very narrow window precisely the size of the target. Gesture recognition has been demonstrated with this system. The sampling rate can be very high (500 Hz), so smooth trajectories can be acquired without filtering (such as Kalman filtering).
Application
Finger tracking systems are used above all to represent virtual reality. However, their application has extended to professional 3D modeling and to companies and projects devoted directly to this field. Such systems have rarely been used in consumer applications because of their high price and complexity. In any case, the main objective is to make it easier to give commands to the computer through natural language or gesture interaction.
The objective is centered on the following idea: computers should be easier to use if they can be operated through natural language or gesture interaction. The main application of this technique is in 3D design and animation, where software such as Maya and 3D Studio Max employs these kinds of tools, because they allow more accurate and simpler control of the instructions to be executed. This technology offers many possibilities, the most important of which is sculpting, building, and modeling in 3D in real time on a computer.
External links
- http://www.vicon.com/
- http://www.dgp.toronto.edu/~ravin/videos/graphite2006_proxy.mov
- http://actuality-medical.com/Home.html
- http://www.dgp.toronto.edu/
- http://www.k2.t.u-tokyo.ac.jp/perception/SmartLaserTracking/