“Robot eyes” boost productivity within the manufacturing sector
The story behind the development of Canon's 3-D Machine Vision Systems

3-D Machine Vision Systems (3DMV) recognize the position and orientation of objects; in other words, they function as “robot eyes.” By helping to improve productivity, these systems are expected to record significant market growth. The collaboration of Canon hardware and software engineers enabled the Company to enter this new market.

Introducing Canon's RV series of 3-D Machine Vision Systems

Developers interviewed

Takashi Urakawa

In charge of mechanical design.

Hiroyuki Yuki

In charge of optical design.

Hiroshi Yoshikawa

In charge of software development.

Koji Dokai

In charge of application engineering.

POINT 1

Compact, maintenance-free design

Canon's 3DMV RV-series lineup targets production plants in the automobile, electronic equipment and other manufacturing sectors, which typically require the picking of randomly piled parts. How were Canon engineers able to fulfill the strict demands from customers for high speed and high accuracy, ease of installation, resistance to dust and water, and maintenance-free operation?

Takashi Urakawa

In charge of mechanical design.

Joined Canon in 2000; works on the design of components that hold optical elements and on the housing design of such products as MR Systems and ultra-compact projectors. Strives to develop products that will captivate a wide range of users.

From a global perspective, 3DMV is still a new market. When did Canon decide to seize this market opportunity?

Application example: Automobile manufacturer

Takashi Urakawa
We started discussing the component technology in 2008 and it was around 2012 that we set our sights on commercializing a product. Considering Canon's strength in optical technology and image processing, we believed we could dominate this field.

Generally speaking, 3DMV systems can be divided into stationary types and compact types that are installed in a picking robot. We launched this project based on our belief that there was demand for a stationary type capable of picking individual parts from a pile of randomly stacked parts. Our first step was to combine a Canon projector and camera in order to create a prototype capable of picking parts. The prototype at the time was quite large, about three or four times the size of the current product.

How did you determine the specifications for the RV series?

Urakawa
After interviewing companies that were considering the introduction of 3DMV, we learned that their biggest challenge was setting up the system.

Hiroshi Yoshikawa
3DMV is able to detect the position and orientation of individual parts by projecting patterns onto randomly piled parts ([1] in Fig. below) and measuring the distance to the parts ([2] in Fig. below), after which the data is compared against a pre-registered pattern dictionary ([3] in Fig. below). With currently available 3DMV systems, registering parts in the pattern dictionary is a complicated and time-consuming process. Even for the initial system that we developed, it was necessary to place each part on a rotating stage, photograph it from various angles using ten cameras, and then have a team of engineers with specialized knowledge perform minute parameter settings based on the image data.

Our goal was a system in which customers, particularly those lacking any specialized knowledge, could register the CAD (computer-aided design) data for the targeted parts by simply taking five photographs of the randomly piled parts, after which the system would automatically register the data in the pattern dictionary. We realized that achieving this goal would give us a huge advantage over the competition.

3DMV Workflow

The RV series measures 25 cm wide and 20 cm high and weighs about 6 kg, so it is extremely compact compared with competing products. The camera and projector are integrated into a single unit, which makes for very simple installation. Wasn't it difficult to enable such high-precision recognition capabilities in such a compact unit?

Urakawa
In order to recognize the shapes of the parts and their distances, the RV series uses the Active Stereo Method. A projector projects light onto the targeted object, the reflection of the light is measured with a camera, and the object's position is computed. The greater the distance, or baseline length, between the projector and the camera, the easier it is to reduce measurement errors. But since we had to respond to the need for a compact size, it wouldn't do for us to make the system considerably larger than rival products. We determined the baseline length for the RV series early in the product-planning stage, and everyone in charge of optical design, mechanical design and software worked together in concert to realize that target.
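The link between baseline length and measurement error mentioned above can be illustrated with the standard first-order error model for triangulation sensors. This is only a sketch, not Canon's calculation: the function name, the focal length, the working distance, the matching error, and the baseline values are all assumptions chosen for illustration and are not RV-series specifications.

```python
def depth_error(distance_m, baseline_m, focal_px, match_err_px=0.1):
    """First-order depth uncertainty of a triangulation sensor.

    Assumes the textbook relation Z = f * b / d (d = disparity in pixels),
    which gives dZ ~= Z**2 / (f * b) * dd for a small matching error dd.
    All parameter values used below are hypothetical.
    """
    return distance_m ** 2 / (focal_px * baseline_m) * match_err_px


# Compare a few hypothetical baselines at a 1 m working distance.
for baseline in (0.05, 0.10, 0.20):
    err_mm = depth_error(1.0, baseline, focal_px=2000.0) * 1000.0
    print(f"baseline {baseline:.2f} m -> depth error about {err_mm:.2f} mm")
```

In this model, doubling the baseline halves the depth error, which is why a compact housing and high accuracy pull in opposite directions.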

What innovations were employed in terms of mechanical design?

Urakawa
The structure of the unit's exterior, and the way it's assembled, are unique in some respects. It is constructed somewhat like an intricately intertwined wire puzzle, where the exterior is assembled up to a certain point before the interior is assembled.

Hiroyuki Yuki
Those in charge of assembly at the production plant must have been quite surprised (laughs).

Features of the Canon RV series

Urakawa
If it had been okay to make the exterior bigger, we would have adopted a simpler structure. But partway into the development process, we were made aware of the need for dust- and water-resistance and for natural air cooling. Initially, the specs called for the same kind of cooling fan used in rival products. But because of the large amount of oil mist generated in automobile assembly plants and similar manufacturing sites, we realized such fans could easily break down. And if a fan were to fail, an entire production line could be shut down, which led to the demand to avoid, as much as possible, using anything with a drive shaft. In addition, there was a growing need to make the product maintenance-free so that, once installed, it wouldn't require any upkeep. That's because maintenance performed on-site can cause errors in precisely adjusted calibration settings.

In short, we arrived at this structure as a result of our thorough efforts to prevail over the competition by keeping the size small, realizing dust- and water-resistance, providing a natural cooling system, and achieving maintenance-free operation. Among ourselves, we call this our “ship in a bottle” design (laughs).

How did you go about designing such a complex structure?

Urakawa
Using our CAD system, we repeatedly checked to make sure the parts would fit properly. We carefully moved the parts as if we were actually assembling the unit on the CAD system itself. I remember this as being a slow and tedious effort (laughs). Still, we remained a bit anxious thinking that we might have overlooked something.

POINT 2

Accurately tracking the optical path to ensure outstanding optical performance

This 3DMV system uses a camera and projector to compute the distance to targeted parts. Engineers ran repeated simulations and tests to achieve optical performance unlike that of conventional cameras.

Hiroyuki Yuki

In charge of optical design.

Joined Canon in 2004; works on the optical design for semiconductor lithography equipment and components. Strives to develop truly unique technologies made possible through teamwork and curiosity.

What kind of issues did you face in dealing with the optical system?

Yuki
Our biggest challenge was ensuring high accuracy. The optical performance required for 3DMV systems is quite different from that of conventional lenses.

Conventional lenses must minimize image distortion and deliver sufficient resolution and brightness. To put it in extreme terms, it doesn't matter how light passes through the lens as long as there is no problem with the final image.

In contrast, to ensure high speed and high accuracy, 3DMV uses the stereo ranging method, an approach based on the principle of the pinhole camera. I can explain the principle behind the stereo ranging method in simple terms using a basic illustration (Fig. 1 below). The projector illuminates a single pixel, the light of which is reflected off of an object and captured by a single pixel of the camera. Since light travels in a straight line, the light from the projector travels as a straight line that passes through the pinhole (the dotted line in Fig. 1). In the same manner, the light captured by the camera passes through the pinhole in a straight line. Therefore, the point where the two straight lines intersect is the reflection point; in other words, the location of the object. It is this principle that enables us to determine an object's location in three dimensions.

If, however, we were actually to place a pinhole in the path of the light, the intensity of the light passing through the pinhole would be insufficient for detection, so we would use a lens. Fig. 2 shows an example of light shining on the portion of an object indicated by the red circle. The actual path that the light flux travels is contained within the red lines. With the stereo ranging method, because the calculation is performed based on the pinhole camera model, if we were to use a conventional lens, the path shown by the dotted line would be used to calculate the path of light, which would result in a discrepancy between the computed reflection point and the actual reflection point.

For the 3DMV lens, therefore, in addition to the performance of a conventional lens, a far more rigorous optical design was needed to ensure that the light travels along the same path that it would in a pinhole camera.
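The intersection principle behind stereo ranging can be sketched in two dimensions, matching the flat layout of a pinhole diagram. The code below is a minimal illustration under the ideal pinhole assumption, not the RV series implementation; the function name, the pinhole positions, and the ray directions are hypothetical.

```python
def intersect_rays(p_origin, p_dir, c_origin, c_dir):
    """Intersect two 2-D lines, each given as origin + t * direction.

    p_* describes the projector ray and c_* the camera ray, both passing
    through their respective pinholes. Coordinates are (x, z) in metres.
    """
    def cross(a, b):
        return a[0] * b[1] - a[1] * b[0]

    denom = cross(p_dir, c_dir)
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique intersection")
    offset = (c_origin[0] - p_origin[0], c_origin[1] - p_origin[1])
    t = cross(offset, c_dir) / denom  # distance parameter along the projector ray
    return (p_origin[0] + t * p_dir[0], p_origin[1] + t * p_dir[1])


# Hypothetical setup: projector pinhole at the origin, camera pinhole 0.2 m away.
point = intersect_rays((0.0, 0.0), (0.05, 1.0), (0.2, 0.0), (-0.15, 1.0))
print("reflection point (x, z):", point)
```

If a real lens bends a ray away from the straight pinhole path, the direction fed into this calculation is wrong and the computed point shifts, which is the discrepancy the optical design had to eliminate.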

Figure 1: Pinhole Camera and Stereo Ranging
Figure 2: Differences Between a Conventional Lens and a 3DMV Lens
Stereo Ranging Method

Were you able to increase the accuracy of the optical design through computer simulations?

Yuki
We were able to solve some problems through design, but when we actually fabricated the lens, in some instances we weren't able to achieve the accuracy we had assumed. We had to ask ourselves, “What is the principle behind this phenomenon?” We developed a hypothesis, carried out tests, and tried to come up with solutions. It was a process we had to repeat over and over.

Did you find the optical design often affecting the mechanical design?

Yuki
On the mechanical side, we established the baseline length first. We then designed an optical system to achieve it and studied how to fit that system into the body. Development proceeded as we went back and forth between the two.

Urakawa
We changed the seating arrangements of the people in charge of optical design and mechanical design so they could work side by side. This enabled the mechanical design team to respond immediately to requests from the optics team.

Yuki
The optical performance of the lens is affected by heat, so we had a lot of back and forth, saying things like, “Please make sure there are no fluctuations in heat here.”

POINT 3

Algorithms for high-speed, high-precision parts recognition

The system recognizes the shape of each part and can pick individual parts out of a mound of randomly piled parts. To enable production lines to operate at high speed, the RV series measures the distance to a part and recognizes it in roughly 1.8 to 2.5 seconds. To achieve a balance of accuracy and speed, the software engineers developed new algorithms.

Hiroshi Yoshikawa

In charge of software development.

Joined Canon in 2005; after handling the optical design and assessment of visual inspection equipment, was assigned to machine vision software development. Strives to view things from various perspectives and come up with ideas that are different from those of others.

The software is responsible for the image processing that recognizes the shape of targeted parts and determines their locations, right? Can you explain in simple terms how this works?

Yoshikawa
3DMV operations can be divided into three basic steps: pattern projection, distance measurement, and part recognition.

Pattern projection involves the projection of specific patterns from a projector onto randomly piled parts and capturing the resulting images with a camera. The RV series uses a method known as space encoding.

The space encoding method works by projecting patterns of black and white stripes onto the target object (Fig. 1) and capturing images of the resulting scenes with a camera (Fig. 2). The black and white striped patterns shown in Fig. 1 employ a binary code called a Gray code that is frequently used with systems employing sensors. To project eight stripes from the projector, three pattern variations are prepared with stripes of differing widths. Using the camera to capture images of the patterns as they change results in images like those shown in Fig. 3. Depending on the height of the target object, the projected stripes will shift.

The black portions of captured images are assigned the number “0” and the white portions are given a “1.” Applying this process to all of the pixels in the captured image results in the assignment of a three-digit binary number to each pixel, as shown at the bottom of Fig. 3. Because the binary number differs for each stripe, each stripe can be identified, and from that the direction in which the projector emitted the light can be determined. Triangulation is then carried out using the boundaries between the stripes projected by the projector. The red line extending from the camera in Fig. 2 shows an example of the triangulation that occurs at the boundary between the fourth and fifth stripes.

My explanation uses eight stripes, but the RV series actually projects more than 1,000 such stripes. As a result, we can obtain a highly detailed group of measurement points representing the three-dimensional shape of an entire mound of randomly piled parts.
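The eight-stripe example above can be reproduced with a few lines of code. The sketch below generates three Gray-code stripe patterns and decodes the three bits seen by one camera pixel back into a stripe number. The function names, the zero-based stripe numbering, and the most-significant-bit-first pattern order are assumptions made for illustration; the bit layout in the article's Fig. 3 may differ.

```python
def gray_encode(n):
    """Convert a stripe index to its Gray code."""
    return n ^ (n >> 1)


def gray_decode(g):
    """Convert a Gray code back to the stripe index."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def stripe_patterns(num_bits):
    """Return num_bits patterns; patterns[k][s] is 0 (black) or 1 (white)
    for stripe s in pattern k, coarsest pattern first."""
    stripes = 1 << num_bits
    return [
        [(gray_encode(s) >> (num_bits - 1 - k)) & 1 for s in range(stripes)]
        for k in range(num_bits)
    ]


def decode_pixel(bits):
    """Turn the black/white sequence observed at one pixel into a stripe index."""
    g = 0
    for b in bits:
        g = (g << 1) | b
    return gray_decode(g)


patterns = stripe_patterns(3)  # three patterns identify eight stripes
for k, row in enumerate(patterns):
    print(f"pattern {k}:", "".join(str(v) for v in row))

# A pixel that saw black, white, white across the three patterns:
print("stripe index:", decode_pixel([0, 1, 1]))
```

Because neighbouring stripes differ in only one bit under a Gray code, a pixel that straddles a stripe boundary can be misread by at most one stripe, which is one reason this code is commonly chosen for structured-light patterns.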

Once the group of measurement points has been acquired, the part recognition process can begin. Part recognition makes use of a dictionary of pre-registered part patterns that enables the general determination of the position and orientation of parts. Finally, by fitting 3D CAD (computer-aided design) data to the measured values, the system is able to measure the position and orientation of parts with a high degree of precision. The system then determines whether the picking robot is able to pick up the part and conveys the result to the robot controller, and the picking robot acts accordingly.

Figure 1: Pattern Images
Figure 2: Schematic Diagram of Projection and Image Capture
Figure 3: Images Captured by the Camera
The space encoding method

What modifications were made in terms of the software?

Yoshikawa
The algorithms were modified to reduce the number of calculations and accelerate the measurement and detection of parts.

For example, conventionally, when the patterns are projected, negative images of the black-and-white stripes are prepared and both patterns are used as a set at each stage to boost measurement accuracy. But this requires the processing of twice the number of patterns. That's why we decided to use negative image patterns only for the last step, thus cutting in half the number of patterns needed.
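The saving described above can be checked with simple arithmetic. The sketch below assumes that each pattern contributes one bit of the stripe code, that "the last step" means the finest pattern, and that 1,024 stripes are projected; none of these details is confirmed by the interview, and the function names are hypothetical. It also shows the usual reason for projecting a negative: comparing a pixel under a pattern and under its negative avoids choosing a fixed brightness threshold.

```python
def bit_planes(num_stripes):
    """Number of binary patterns needed to give every stripe a unique code."""
    return max(1, (num_stripes - 1).bit_length())


def pattern_counts(num_stripes):
    """Patterns to project under the two schemes (assumed interpretation)."""
    n = bit_planes(num_stripes)
    return {
        "bit_planes": n,
        "paired_every_stage": 2 * n,        # positive + negative at each stage
        "negative_last_stage_only": n + 1,  # negative only for the finest pattern
    }


def is_white(positive, negative):
    """Classify a pixel by comparing its brightness under a pattern and its negative."""
    return positive > negative


print(pattern_counts(1024))
print(is_white(200, 40), is_white(30, 180))
```

Under these assumptions, 1,024 stripes need 10 bit planes: 20 patterns when every stage is paired with its negative, and 11 when only the last stage is, which is close to half.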

Another innovation was in the algorithms for part detection. With the RV series, when initially registering the parts for the pattern dictionary, five images are needed of the randomly piled parts arranged in different patterns. By capturing grayscale images of the surface state as viewed from various angles, the system can easily determine the boundaries between parts, thus increasing part detection accuracy. A little while ago, I mentioned how the system fits 3D CAD data of the part to the group of measurement points. Combining that with the use of 2-D images that display changes in grayscale reduces the likelihood of errors in part detection. Adopting this method reduced the amount of calculation processing required, which enabled us to successfully reduce the time required for part detection.

The part detection method uses CG (computer graphics) to learn every view of a workpiece in accordance with changes in orientation

Koji Dokai

In charge of application engineering.

Joined Canon in 2003; after handling the electrical design of semiconductor lithography equipment, joined the R&D division and is currently in charge of machine vision business promotion. Aims to become someone who can provide customers with solutions grounded in an understanding of both the technological and marketing sides of the business.

What was the reaction from your customers?

Koji Dokai
With conventional 3DMV, the setup is very difficult. Completing the setup process with the RV series, however, only requires that you snap five pictures of the randomly piled parts and press the Create Dictionary button. This feature has earned us a lot of high praise. The software user interface has been well received for its ease of use.

However, the parts used in the plants of our customers are more varied than we had anticipated: some are thin, others are small, some are transparent. We need to further expand the range of parts that the system can handle.

Also, some of our customers want us to increase recognition speed even further, while others have asked us to focus more on reliability. That's because, for customers who place the highest priority on reducing costs across the entire production line, halts in production can result in huge losses.

Yoshikawa
To respond to these needs, we'll work with the product planning team to upgrade the software.

POINT 4

New markets focused on “robot eyes”

Hardware and software engineers collaborated to commercialize the RV series. To enter this new market, engineers from several disciplines had to work together in order to meet customer needs.

What lies ahead in the future of 3DMV?

Dokai
The RV series targets the picking of randomly piled parts, but going forward, we plan to develop products for use in product inspection and other fields as well as models attached to robot arms.

Urakawa
Models attached to robot arms will be able to not only pick up parts, but also perform assembly processes. But such systems won't benefit from the extra time that the RV series has to detect the next part while the robot arm is in motion, which is why we'll have to make the measurement and part-detection processes faster than they currently are. To do this, we'll need to develop new algorithms.

Dokai
We've also been asked to accommodate industries outside of the manufacturing sector, such as distribution and warehousing. For such new applications, we'll have to re-examine performance and other aspects.

In terms of the development of the RV series, what demands do you anticipate for future product development?

Dokai
There are no longer any products that can be realized through just one technological domain. Our product development efforts now require the creation of entire systems that combine technologies spanning a range of fields. It's something that's long been said, but engineers need to look at the big picture.

Urakawa
This is very much the case with the RV series, which comprises products in which the hardware and software are inseparable. We held regular meetings in which those in charge of the various areas would gather and closely share information.

Yuki
What was probably unique about the RV series project was that all the participants joined forces to quickly realize a product. The key to our success may have been that it was just a few of us who gathered together, without any boundaries between hardware and software, and we were determined to do whatever it took to give form to our vision.

Yoshikawa
Speaking of software, we are always striving for improvement by actively introducing recent AI (artificial intelligence) and other innovative technologies. But just because an algorithm is highly advanced doesn't necessarily mean that it's good. Sometimes it's the simple algorithms that are fast and easy to use. What we need to keep asking ourselves during development is: “Will this really work or won't it?”

Dokai
Today, simply creating good products is no guarantee that they will sell. We need to get to know our customers and develop strong bonds of trust with them. I believe those are the kind of developers we need.

There is a major change taking place now in the industrial sector. With the advent of the IoT (Internet of Things), in which devices can communicate with each other and automatically process information, along with big data analysis capable of generating knowledge from vast amounts of data, the manufacturing industry is completely different from what it once was. Robots and 3DMV are also critical in accelerating this trend. In the future, in addition to picking parts, it's a given that these systems will be able to automate tasks that workers now perform by hand, such as assembly and inspection procedures.

We are witnessing the emergence of a world that further promotes the coexistence of people and machines. We plan to keep an eye on what kind of impact 3DMV and other technologies will have on the industry.

Interview & Composition
Tatsuya Yamaji
Born in 1970. After working as a magazine editor, he became a freelance writer/editor and has been active as a researcher, interviewer and writer in the fields of IT, science and the environment.
Publications include The Day Apple and Google Become Gods (co-author), New Guide to Superconductivity, 72 Hours of Google (co-author), Affirmation (co-author), and others.
