Development and Performance Evaluation of Distance Measurements Detection Using 3 Axis Binocular Vision System

Abstract
The world is currently entering Industry 4.0, in which industrial processes become automated and remotely controlled by computers with little input from human operators. Binocular vision results from signals sent to the brain from both eyes simultaneously, which gives the advantage of depth perception. This project concerns the distance measurement of an object in three axes using the advantages of a binocular vision system. Two webcams were used as the binocular vision system to capture images and measure the distance of the sample objects in three axes. The cameras must be calibrated before the distance is computed because of the fisheye effect caused by the lenses. The cameras were calibrated using a chessboard pattern with 9x6 internal corners because of its simple geometry. The distance measurement algorithm is executed once the cameras are calibrated, and the distance is computed using trigonometry.


Introduction
The world is currently entering Industry 4.0, in which computers and automation come together in an entirely new way: robots are connected remotely to computer systems equipped with machine learning algorithms that can learn and control them with very little input from human operators. A vision sensor is a tool consisting of a video camera, a display and interface, and a computer processor, used to automate industrial processes and decisions. It uses the images captured by a camera to determine the presence, orientation, and accuracy of parts. Vision sensors are widely used in applications such as measurement and the inspection of observable characteristics related to product quality.
Binocular vision is the result of signals sent from the eyes to the brain, where the brain usually receives signals from both eyes simultaneously. The information in the signal from each eye is slightly different, and with well-functioning binocular vision the brain can use these differences to judge distance or coordinate eye movement. One benefit of binocular vision is the ability to judge the speed and depth of objects; people with poorly functioning binocular vision may have difficulty performing daily tasks such as pouring water into a glass (Hamed et al., 2013).
Many researchers have tried to simulate human binocular vision in robot applications using vision sensors and software, since robot applications have been rising rapidly in a technologically well-developed society. The University of Washington and Microsoft developed a wide-baseline stereo vision system in 2003; the system used images taken by the same camera at different positions of the Mars Reconnaissance Orbiter to determine the precise location of the camera, using a non-linear optimization algorithm based on the observed landscape. In addition, the Massachusetts Institute of Technology proposed a new sensor fusion algorithm for intelligent transportation to determine target location at high rates by combining the rough target depth from a radar system with binocular vision, together with an improved fingerprint image segmentation method (Wang et al., 2017).
Computer vision for binocular vision systems has been implemented in many robotic and safety applications, where it is concerned with the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images. It also involves developing the theoretical and algorithmic basis needed to achieve automated visual understanding. Based on previous research, the overlapping fields of view of the stereo cameras define intersection points that can be solved by trigonometry to determine the distance of the object from the baseline. Furthermore, the choice of measurement detection algorithm is an issue that must be addressed, since the field offers various measurement detection calculations, each with a distinct use.

Hardware Set Up
A vision sensor can be constructed from a pair of low-cost cameras such as webcams (Abu Hassan et al., 2017), while a binocular stereo vision system can be constructed according to the requirements of the particular application. In this project, two Microsoft Lifecam HD-5000 units were used, as these cameras provide a maximum video resolution of 1280 x 720 and a maximum still image resolution of 1280 x 800. The camera also has a diagonal field of view of 66° and allows image parameters such as brightness, exposure, sharpness, saturation, contrast, and focus to be adjusted (Microsoft, 2012). Figure 1 shows the experimental set-up of the stereo cameras used in this project as the prototype binocular vision system. The hardware was set up at the Automation and Materials Lab in KDU University College.
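As a rough illustration of what these specifications imply, the focal length in pixels can be estimated from the diagonal field of view under an ideal pinhole model. The sketch below assumes that the 66° diagonal field of view applies to the 1280 x 720 video mode and ignores lens distortion; the function name focal_length_px is hypothetical, and the value obtained from calibration should be preferred in practice.

```python
import math


def focal_length_px(width_px, height_px, diag_fov_deg):
    """Estimate the focal length in pixels from the diagonal field of view.

    Assumes an ideal pinhole camera with no lens distortion.
    """
    half_diagonal = math.hypot(width_px, height_px) / 2.0
    return half_diagonal / math.tan(math.radians(diag_fov_deg) / 2.0)


# Assumed: the 66 degree diagonal field of view refers to the 1280 x 720 mode.
f_px = focal_length_px(1280, 720, 66.0)
print(f"estimated focal length: {f_px:.1f} px")
```

Under these assumptions the estimate is about 1130 pixels, which is useful only as a sanity check on the focal length returned by calibration.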

Camera Calibration
There are various methods of camera calibration for a binocular vision system. According to the literature, two methods are available for calibrating low-cost stereo cameras: a chessboard pattern and a feature descriptor pattern. Calibration with the chessboard pattern produced a lower mean re-projection error than calibration with the feature descriptor pattern. Chessboard calibration requires the two cameras to capture multiple views of the planar chessboard pattern at different orientations. The method consists of a closed-form solution followed by a nonlinear refinement based on the maximum likelihood criterion (Abu Hassan et al., 2017).
Therefore, chessboard pattern calibration was chosen for this project because of its simple geometry. The stereo calibration procedure involved taking several sets of raw image pairs, executing the calibration algorithm in OpenCV, undistorting the images, rectifying them, and saving the intrinsic and extrinsic camera parameters (Abdullah et al., 2015).
Stereo matching is the process of matching a 3D point in the two different camera views, which can be computed only over the visual areas in which the views of the two cameras overlap (Bradski & Kaehler, 2008). This is the most crucial step in stereo vision, because many factors of the 3D scene are combined and represented by a single grey value when projected onto the 2D image. These factors include illumination, the geometric shape and physical properties of the object, noise and distortion, and the camera properties (Zou & Li, 2010). The disparity map is generated by stereo correspondence, and this information is used to determine the distance of the object in this project.
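The calibration procedure described above can be sketched with OpenCV's Python bindings as follows. This is a minimal sketch rather than the program used in this project: it assumes that the opencv-python and NumPy packages are installed, that the image pairs have already been loaded as 8-bit greyscale arrays, and that the square size (25 mm here) is a placeholder. The function names chessboard_object_points and calibrate_stereo_pair are hypothetical.

```python
PATTERN_SIZE = (9, 6)     # internal corners per row and per column, as in this project
SQUARE_SIZE_MM = 25.0     # assumed square size; replace with the printed board's value


def chessboard_object_points(pattern_size=PATTERN_SIZE, square=SQUARE_SIZE_MM):
    """Return the 3D corner coordinates of the planar board (Z = 0), row by row."""
    cols, rows = pattern_size
    return [(c * square, r * square, 0.0) for r in range(rows) for c in range(cols)]


def calibrate_stereo_pair(gray_pairs):
    """Calibrate both cameras from a list of (left, right) greyscale board images."""
    import cv2            # imported here so the helper above runs without OpenCV
    import numpy as np

    board = np.array(chessboard_object_points(), dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
    obj_pts, left_pts, right_pts = [], [], []
    for left, right in gray_pairs:
        ok_l, corners_l = cv2.findChessboardCorners(left, PATTERN_SIZE)
        ok_r, corners_r = cv2.findChessboardCorners(right, PATTERN_SIZE)
        if not (ok_l and ok_r):
            continue      # keep only pairs in which both views show the full board
        # Refine the detected corners to sub-pixel accuracy.
        corners_l = cv2.cornerSubPix(left, corners_l, (11, 11), (-1, -1), criteria)
        corners_r = cv2.cornerSubPix(right, corners_r, (11, 11), (-1, -1), criteria)
        obj_pts.append(board)
        left_pts.append(corners_l)
        right_pts.append(corners_r)
    if not obj_pts:
        raise ValueError("no image pair contained a fully detected chessboard")

    size = gray_pairs[0][0].shape[::-1]   # (width, height)
    # Intrinsic parameters and distortion coefficients of each camera.
    _, k_l, d_l, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
    _, k_r, d_r, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
    # Extrinsic parameters (rotation and translation) between the two cameras.
    rms, k_l, d_l, k_r, d_r, rot, trans, _, _ = cv2.stereoCalibrate(
        obj_pts, left_pts, right_pts, k_l, d_l, k_r, d_r, size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    # Rectification transforms and the disparity-to-depth matrix Q.
    r1, r2, p1, p2, q, _, _ = cv2.stereoRectify(k_l, d_l, k_r, d_r, size, rot, trans)
    return {"rms": rms, "K_left": k_l, "D_left": d_l, "K_right": k_r, "D_right": d_r,
            "R": rot, "T": trans, "R1": r1, "R2": r2, "P1": p1, "P2": p2, "Q": q}
```

The returned dictionary holds the intrinsic and extrinsic parameters that would be saved for later use; the reported rms value is the re-projection error of the stereo calibration in pixels.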
The block matching algorithm creates the disparity map, also known as the depth map, in three phases: pre-filtering, correspondence searching and computation, and post-filtering, as shown in Figure 3.
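The correspondence search at the centre of block matching can be illustrated with a sum-of-absolute-differences (SAD) search along one image row. The sketch below is a simplified pure-Python illustration, not the implementation used in this project: it omits the pre-filtering and post-filtering phases, assumes rectified greyscale images stored as lists of rows, and uses a hypothetical function name sad_disparity with hypothetical parameters.

```python
def sad_disparity(left, right, row, col, half=1, max_disp=16):
    """Return the disparity of the block centred at (row, col) in the left image.

    The block is compared with blocks shifted to the left in the right image,
    and the shift with the lowest sum of absolute differences is returned.
    """
    height, width = len(left), len(left[0])
    if not (half <= row < height - half and half <= col < width - half):
        raise ValueError("block must lie fully inside the image")

    def block_cost(d):
        cost = 0
        for r in range(row - half, row + half + 1):
            for c in range(col - half, col + half + 1):
                cost += abs(left[r][c] - right[r][c - d])
        return cost

    # A point at column x in the left image appears at x - d in the right image,
    # so only shifts that keep the block inside the right image are tested.
    candidates = range(0, min(max_disp, col - half) + 1)
    return min(candidates, key=block_cost)


# Synthetic example: the right image is the left image shifted by 3 pixels.
left = [[c * c + r for c in range(12)] for r in range(5)]
right = [[row[c + 3] if c + 3 < len(row) else 0 for c in range(len(row))]
         for row in left]
print(sad_disparity(left, right, row=2, col=6, half=1, max_disp=5))  # prints 3
```

Repeating this search for every pixel yields the disparity map; practical implementations add the filtering phases and reject ambiguous matches.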

Distance Calculation
Based on related research projects, researchers have used trigonometry to determine measurements with a binocular vision system (Abu Hassan et al., 2017; Huang & Cheng, 2013). Trigonometry provides a method of determining the distance in three axes. Figure 4 shows the analysis of the binocular vision system. The system mainly uses the difference between the horizontal coordinates of the target point, xl and xr, in the images from the left and right cameras; this disparity has an inverse relationship with the distance Z from the target point to the imaginary plane, as shown in (1), (2), and (3).
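For a rectified stereo pair with parallel optical axes, the inverse relationship between disparity and depth follows from similar triangles. The relations below are the standard textbook form and are given only as a reference sketch: the symbols f (focal length in pixels), B (baseline between the two camera centres), d (disparity), and X (lateral position of the target in the left camera frame) are assumed notation and may differ from the numbering and symbols used in (1) to (3). The coordinates x_l and x_r are measured from the principal point of each image.

```latex
\begin{align*}
x_l &= \frac{f X}{Z}, \qquad x_r = \frac{f (X - B)}{Z}, \\
d &= x_l - x_r = \frac{f B}{Z}, \\
Z &= \frac{f B}{d}.
\end{align*}
```

Because Z varies as 1/d, a fixed disparity error \delta d changes the depth by roughly Z^2 \delta d / (f B), so the same matching error costs more depth accuracy for distant objects than for near ones.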
Furthermore, the distances along the X and Y axes can be determined once the depth has been determined, using the usual camera projection formulas shown in (4) and (5).
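A minimal sketch of how the three coordinates might be computed from a matched pixel pair is given below. It assumes the standard rectified pinhole model with the origin at the left camera, a focal length f_px in pixels, a baseline in the desired output unit, and a principal point (cx, cy). These names, and the numeric values in the usage line, are hypothetical and are not the calibration results of this project.

```python
def triangulate(x_left, x_right, y, f_px, baseline, cx, cy):
    """Return (X, Y, Z) of a point seen at (x_left, y) and (x_right, y).

    Assumes rectified images, so the point lies on the same row in both views.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = f_px * baseline / disparity      # depth from the inverse disparity relation
    x = (x_left - cx) * z / f_px         # pinhole projection inverted for X
    y_out = (y - cy) * z / f_px          # pinhole projection inverted for Y
    return x, y_out, z


# Hypothetical values: f = 1000 px, baseline = 100 mm, principal point (640, 360).
print(triangulate(700.0, 650.0, 400.0, 1000.0, 100.0, 640.0, 360.0))
# prints (120.0, 80.0, 2000.0)
```

With these illustrative numbers a disparity of 50 pixels corresponds to a depth of 2000 mm, and halving the disparity would double the depth.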

Results and Discussion
Table 2, Table 3, and Table 4 show the field measurements and the distances measured by the binocular vision system in three axes for objects in the three primary colours. Possible sources of error in the field measurement were parallax error and the resolution of the ruler scale, which indicated only one decimal place, whereas the binocular vision system can report two or more decimal places. Moreover, the distance detection results of this system rely heavily on the calibration data, because the calibration algorithm generates the intrinsic and extrinsic camera parameters for the fixed set-up, and these parameters are used to compute the distance of the object in three axes. If the set-up is moved but the calibration data remain unchanged, the disparity computed by block matching will not be accurate, which also affects the accuracy of the distance measurement.
Figure 6 shows the percentage error of the Z-axis measurement against the field measurement. Figure 7 and Figure 8 show the percentage error of the distance measurement with the objects located at different depths. These figures show no significant trend and no indication of better measurement for a particular colour. Hence, the colour of the object did not affect the distance measurement of the binocular vision system. In addition, the system determines distance from the disparity map produced by stereo matching, where a matched block appears on the disparity map regardless of the object colour, in white, grey, or black depending on the object location.
Table 5 and Figure 9 show the average error of the distance measurement in three axes with the object located at different depths, regardless of object colour. The X-axis measurement has the highest average percentage error of the three axes.
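The percentage errors reported in these tables and figures are taken here to be the absolute difference between the system measurement and the field measurement, relative to the field measurement. This definition is an assumption, as are the function names, and the numbers in the usage lines are illustrative rather than measured data.

```python
def percentage_error(measured, reference):
    """Absolute error of a measurement as a percentage of the reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(measured - reference) / abs(reference) * 100.0


def mean_percentage_error(pairs):
    """Average percentage error over (measured, reference) pairs."""
    errors = [percentage_error(m, ref) for m, ref in pairs]
    return sum(errors) / len(errors)


# Illustrative values only (not taken from Tables 2 to 5).
samples = [(52.0, 50.0), (97.0, 100.0), (156.0, 150.0)]
print(f"{mean_percentage_error(samples):.2f} %")
```

Averaging per axis over all depths and colours in this way would give one figure per axis, comparable in form to the averages in Table 5.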
Furthermore, the trend lines for the three axes show that the accuracy of the distance measurement decreased as the object was located further away.
Distance measurement sensors used in industry, such as optical sensors, ultrasonic sensors, eddy current sensors, and laser sensors, can provide highly accurate output. Generally, these sensors send a signal to the target surface and receive the reflected signal to generate the output. In addition, these sensors can only detect the surface of the target and measure distance along a single axis.
Although the distance measurement of the binocular vision system might not be as accurate as these sensors, the system has several advantages in application. It can detect the colour and geometry of the object, which allows it to identify the target object together with its coordinates. Furthermore, the system can measure the distance of the object from the camera in three axes at once, which saves time compared with other applications.

Conclusion
The proposed program algorithms for the binocular vision system to detect the 3-axis location of RGB objects were created successfully, although the accuracy might be lower than that of other applications. Furthermore, the system was able to filter colours and detect the RGB objects, and its accuracy was evaluated with a laser distance meter.