When the user measures several targets of the same type, it would be cool if the machine remembered the first target, compared the rest to it, and automatically found the edges of interest.
Image registration is used to find the correspondence between the first target (the template) and the rest of them.
For a VMM, the transform is essentially rigid, not even affine, so it should be an easy task. Two types of targets have to be considered, though:
(1) targets with rich texture information, such as printed circuit boards (PCBs) and colored particles.
(2) targets with only edge information, such as plugs and small mechanical parts.
I'm now testing two strategies. The first is the one I used in my MSc thesis to estimate camera motion: identify some landmark points, match them, and then compute the affine transformation. The second strategy is based on chamfer matching (Borgefors, 1988).
The first strategy proves very accurate on PCB images but performs poorly on the second type of target. Its steps are:
(1) detect landmarks (with something like cvGoodFeaturesToTrack)
(2) match the landmarks by neighborhood correlation
(3) compute the affine transformation with iterated outlier removal.
To speed up the computation, I run this on an image pyramid, i.e. the same image at several resolutions. First compute the affine model on the coarsest image, then use that model to localize step (2) on the next finer image, and so on.
The average registration error (defined in much of the literature; it's not convenient to type formulas in a blog) decreases at each level of the pyramid. In the example below, the average error is 60, 53 and 26 for the successive layers of the hierarchy.
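Step (3) can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code from my thesis: it assumes the matching step has already produced point pairs, and the names (`fit_affine`, `robust_affine`), the `keep_factor` rule and the number of rounds are all made up for the example. The mean of `residuals(...)` over the surviving pairs plays the role of the average registration error.

```python
import math

def solve3(m, v):
    """Solve the 3x3 system m * x = v by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def fit_affine(src, dst):
    """Least-squares affine model (a, b, c, d, e, f) with
    x' = a*x + b*y + c and y' = d*x + e*y + f (normal equations)."""
    n = [[0.0] * 3 for _ in range(3)]
    rx, ry = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        p = (x, y, 1.0)
        for i in range(3):
            rx[i] += p[i] * u
            ry[i] += p[i] * v
            for j in range(3):
                n[i][j] += p[i] * p[j]
    return tuple(solve3(n, rx) + solve3(n, ry))

def residuals(model, src, dst):
    """Distance between each transformed source point and its matched point."""
    a, b, c, d, e, f = model
    return [math.hypot(a * x + b * y + c - u, d * x + e * y + f - v)
            for (x, y), (u, v) in zip(src, dst)]

def robust_affine(src, dst, rounds=5, keep_factor=2.0):
    """Refit repeatedly, dropping pairs whose residual exceeds
    keep_factor times the mean residual (an assumed pruning rule)."""
    keep = list(range(len(src)))
    for _ in range(rounds):
        s = [src[i] for i in keep]
        t = [dst[i] for i in keep]
        res = residuals(fit_affine(s, t), s, t)
        mean = sum(res) / len(res)
        survivors = [i for i, r in zip(keep, res) if r <= keep_factor * mean]
        if mean < 1e-9 or len(survivors) == len(keep) or len(survivors) < 3:
            break
        keep = survivors
    model = fit_affine([src[i] for i in keep], [dst[i] for i in keep])
    return model, keep
```

On a pyramid, the same function would be called once per level, with the previous model used to restrict where matches are searched.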
fig. layer 3 source image
fig. layer 3 dst image
fig. layer 2 dst image
fig. layer 2 source image
BTW: some of the literature prefers optical-flow-style methods, but these ideas are unsuitable for VMM, because the fundamental hypothesis that the image sequence is 'continuous' doesn't hold here. When a user places a target on the platform, even a minimal offset shifts the image a lot, so the source and target images won't be spatially close at all. Optical flow would need a very large search window here, which is undesirable in terms of both time and accuracy.
Things get ugly at very high zoom: curves become zig-zag and straight lines are not straight anymore.
The Hough algorithm won't solve every problem. It is good at identifying parametric shapes, even when the shapes are broken or incomplete.
But when the shape itself cannot be approximated by parametric equations, Hough algorithms find many scattered local maxima in the parameter space.
So when the images get ugly, we gotta fit something instead of detecting a nice shape in the image.
To cope with the noise and to prune adjoining edges away from the one we want, an iterated fitting strategy should be used: points with large residuals are removed under a decreasing threshold.
The steps are:
(1) track the edges
(2) fit a line/circle/ellipse to them
(3) prune the points whose residual is larger than the current threshold
(4) decrease the threshold and go to (2)
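For the line case, steps (2) to (4) could look roughly like the sketch below. It assumes the tracked edge is already a list of (x, y) points; the orthogonal (total least squares) fit, the function names and the threshold schedule are illustrative choices, not the settings of the real system.

```python
import math

def fit_line(points):
    """Total least squares line: returns (cx, cy, theta), i.e. the centroid
    and the direction angle of the principal axis of the points."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return cx, cy, theta

def residual(line, p):
    """Perpendicular distance from point p to the fitted line."""
    cx, cy, theta = line
    return abs(-(p[0] - cx) * math.sin(theta) + (p[1] - cy) * math.cos(theta))

def iterated_fit(points, thresholds=(8.0, 4.0, 2.0, 1.0)):
    """Fit, prune points above the current threshold, tighten, refit."""
    kept = list(points)
    line = fit_line(kept)                      # step (2)
    for t in thresholds:                       # step (4): decreasing thresholds
        survivors = [p for p in kept if residual(line, p) <= t]  # step (3)
        if len(survivors) < 2:
            break
        kept = survivors
        line = fit_line(kept)                  # back to step (2)
    return line, kept
```

A circle or ellipse fit would slot into the same loop by swapping `fit_line` and `residual`.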
fig. A plug zoomed 30 times, with a line fitted.
I'm now working on a Visual Measuring Machine (VMM). It's something like an automatic microscope.
With its powerful optical system, it can magnify the image many times. After snapping a picture, the user points out which edge he/she wants to measure; the software then needs to recognize simple geometries such as lines and circles in the picture and locate their end points.
Once these measurements are obtained in pixels, the real length can be computed using the scale factor.
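The pixel-to-length conversion is just a multiplication. A tiny sketch, assuming a single isotropic scale factor in millimetres per pixel (the function name and the unit are my choice for the example):

```python
import math

def pixel_length_to_mm(p0, p1, mm_per_pixel):
    """Real-world length of the segment p0-p1, given its endpoints in pixels."""
    return math.hypot(p1[0] - p0[0], p1[1] - p0[1]) * mm_per_pixel
```

For instance, a segment from (0, 0) to (300, 400) is 500 pixels long, so at 0.002 mm per pixel it measures about 1 mm.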
The user can, of course, trace the line on the picture manually, but that would be a boring task. The targets being measured are usually small and are manufactured according to CAD drawings, so a vectorization of the picture is often both desirable and feasible.
The development has 2 stages:
1. recognize lines and circles, to assist the user with recognition.
2. implement a full-scale vectorization feature and use CAD data to check the vectorization result automatically. This would be a very useful feature: after a product is manufactured, its every edge and surface is reconstructed by the system and compared with the original design to find defects.
The first stage is almost done. It's relatively easy. Anyway, here are the pictures.
1. the source image
2. after canny filtering
3. edge tracking (to extract the edge points chosen by the user)
4. The resulting edge with endpoints
After a ferocious screening test, the Heracles of the software industry, Microsoft, offered me a free 3-day round trip to Beijing to visit Microsoft Research Asia.
It will be the first time I take a flight. ^__^
And I'm gonna meet so many geeks there!
The process is:
(1) recognize the green spots.
(2) solve the correspondence problem between the current frame and the registered global scenario. ref: veenjman
(3) find the affine transform model by solving a linear system
(4) transform the registered global scenario and update the current positions.
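Step (4) might be sketched like this, taking the affine model from step (3) as given. The nearest-neighbour matching within `radius`, and the rule that an unmatched detection is registered as a new spot, are assumptions made for the example; the real correspondence step is more involved.

```python
import math

def apply_affine(model, p):
    """Apply (a, b, c, d, e, f): x' = a*x + b*y + c, y' = d*x + e*y + f."""
    a, b, c, d, e, f = model
    return (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)

def update_scenario(registered, detections, model, radius=5.0):
    """Move the registered spots into the current frame, snap each to the
    nearest unused detection within radius, and register leftover
    detections as new spots. The count is the length of the result."""
    moved = [apply_affine(model, p) for p in registered]
    unused = list(detections)
    updated = []
    for p in moved:
        if unused:
            q = min(unused, key=lambda det: math.dist(p, det))
            if math.dist(p, q) <= radius:
                unused.remove(q)
                p = q
        updated.append(p)
    return updated + unused
```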
Some of the frames:
The global view and result of counting:
It is common knowledge that RHT (randomized Hough transform) is faster than HT (Hough transform). In each round, RHT draws several samples and determines a single point in the parameter space from them, whereas HT takes one sample and computes a whole bunch of parameters (a curve or a surface in the parameter space). The higher the dimension, the slower HT becomes. But my experiments show that the "randomized" nature of RHT becomes problematic when there are many sample points, because the samples generated by the computer are not random enough! So when I detect long curves using RHT, the samples tend to cluster on one part of the curve, and the precision of the algorithm suffers badly.
So, how to solve this problem?
One IEEE paper proposes a recursive RHT algorithm:
run RHT once, then narrow down the samples and the parameter space and run RHT again at higher precision.
This approach, which I tested on the tile quality checking system below, is still not good enough, although it runs very fast.
My method is:
run RHT first, to narrow down the parameter space far enough.
Then use brute-force HT on what is left.
Thanks to the RHT step, the HT can be very fast and still retain its precision.
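To make the idea concrete, here is a rough sketch for straight lines in normal form (theta, rho). It is not the implementation from my paper, which deals with curves: the bin sizes, the trial count, the window of two coarse rho bins and all function names are assumptions for illustration.

```python
import math
import random
from collections import Counter

def rht_line(points, trials, d_theta, d_rho, rng):
    """Coarse RHT: each random pair of points votes for exactly one cell."""
    votes = Counter()
    for _ in range(trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if (x1, y1) == (x2, y2):
            continue
        # Angle of the line's normal, folded into [0, pi).
        theta = math.atan2(x2 - x1, -(y2 - y1)) % math.pi
        rho = x1 * math.cos(theta) + y1 * math.sin(theta)
        votes[(round(theta / d_theta), round(rho / d_rho))] += 1
    (ti, ri), _ = votes.most_common(1)[0]
    return ti * d_theta, ri * d_rho

def ht_refine(points, theta0, rho0, d_theta, d_rho, n=20):
    """Brute-force HT restricted to a small window around the RHT peak,
    with cells n times finer than the coarse ones."""
    votes = Counter()
    for k in range(-n, n + 1):
        theta = theta0 + k * d_theta / n
        c, s = math.cos(theta), math.sin(theta)
        for x, y in points:
            offset = x * c + y * s - rho0
            if abs(offset) <= 2 * d_rho:
                votes[(k, round(offset * n / d_rho))] += 1
    (k, j), _ = votes.most_common(1)[0]
    return theta0 + k * d_theta / n, rho0 + j * d_rho / n

def detect_line(points, seed=0, trials=500):
    """RHT to narrow the parameter space, then HT inside that window."""
    rng = random.Random(seed)
    d_theta, d_rho = math.radians(2.0), 2.0
    theta0, rho0 = rht_line(points, trials, d_theta, d_rho, rng)
    return ht_refine(points, theta0, rho0, d_theta, d_rho)
```

The full HT only ever visits 41 angles and a narrow band of rho here, which is why it stays cheap while every point still gets to vote.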
This optimization is thoroughly discussed in my paper.
Snap of the digits.
Picture after preprocessing it to extract the red channel.
After segmentation
Hoorah! The result.
These are steel bars running on the production line at a speed of 5 m/s. We use a CCD camera to recognize and track them. When the number of steel bars that have passed reaches the preset count, we stop the line and signal the workers to wrap them.
Our algorithm runs in real time (processing 25 frames per second), but it has one big problem: it might lose track of a steel bar when the bars vibrate sharply. Although the chance is very low (about once per day), it is still annoying. We are working on this problem.
After the counting is done, we will work out a plan to wrap the steel bars with a robot hand. This requires a very good tracking algorithm and control strategy. We gotta simulate a human hand to do this job.
Above is a ceramic tile with one of its borders broken. Below is the broken edge that the system detects. The system can also detect spots on the tile and check its quality automatically.
All the algorithms have been worked out and respond in real time. But, embarrassingly, I've got a problem finding a buyer now.
This project is about measuring steel pipes on the production line. The key point is to get an accurate length of the pipe; then, with the weight from the weighing system and the known diameter of the pipe, we can calculate the wall thickness of the pipe.
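The thickness follows from the mass of a hollow cylinder. A small sketch, assuming that the known diameter is the outer diameter, that everything is in SI units, and that the steel density is about 7850 kg/m^3 (a typical textbook value, not a figure from the project):

```python
import math

STEEL_DENSITY = 7850.0  # kg/m^3, assumed typical value for steel

def wall_thickness(mass_kg, length_m, outer_diameter_m, density=STEEL_DENSITY):
    """Wall thickness t of a pipe from its mass, length and outer diameter D.
    Cross-section area A = mass / (density * length) = pi * (D*t - t**2),
    so t = (D - sqrt(D**2 - 4*A/pi)) / 2."""
    area = mass_kg / (density * length_m)
    disc = outer_diameter_m ** 2 - 4.0 * area / math.pi
    if disc < 0.0:
        raise ValueError("mass is too large for a pipe of this size")
    return (outer_diameter_m - math.sqrt(disc)) / 2.0
```

Since the area is the mass divided by the length, any error in the measured length goes straight into the computed thickness, which is why the length has to be accurate.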
We use computer vision to solve the problem. But before explaining our solution, let's go over some background of the project.
The oldest solution belongs to a German company. They use a laser device to get a reflection from the end of the pipe; by measuring the time the reflection takes, they can calculate the distance from the light source to the end of the steel pipe. The plan works well for thick pipes but is virtually useless for smaller ones: as the pipes get smaller and thinner, one can hardly aim a laser emitter accurately at the end of the pipe, so getting a reflection becomes impossible.
Then another company came along. They use line scan cameras fixed at each end of the pipe; after measuring each end of the pipe, they simply add the distance between the two cameras to get the whole length. This plan proves effective for pipes of any diameter, yet it has big problems, even worse than those of the German idea.
For one thing, vibrations caused by all kinds of machinery are inevitable in the factory. These subtle vibrations make the line scan cameras vibrate too, and the camera output changes drastically, just like in an earthquake, so you can imagine how bad the result might be. For another, when the scan line picks up noise, it is very hard to get rid of, because all you have is a single line of data. There is no more data to recover from the noise.
So here we come to the rescue: we use ordinary CCD cameras.
As the picture shows, we use 11 cameras. The first one points at the head of the pipe; the other 10 cameras are divided into 5 groups, so we can choose one group to measure pipes of a particular length.
Like the second solution, we measure the head and the tail of the pipe, then add the distance between them to get the total length.
But how to cope with vibration? The point is that we fix 11 standard boxes on the manufacturing line. When the pipe vibrates, the boxes vibrate as well. And since the dimensions of the standard boxes are known, we can calculate the ratio between pixel count and real-world length. So taking the standard boxes as references makes things simpler: we don't even need to calibrate the cameras by hand, and even if the cameras are out of focus, we can still get an accurate length. The picture below demonstrates how we achieve this:
In this picture, the standard box and the pipe are both blurred because the camera is out of focus. On a scanline, the gray-level transition is then not sharp; instead it is a slope, which I have drawn as a green curve. So we cannot locate the edge point with a Gaussian operator, or we would introduce a bigger error.
First we average all the green curves of the standard box to get another curve, the black one. This curve describes how the gray level changes across the edge of the standard box. We can then use it as a template to find the horizontal displacement of the pipe's edge relative to the box's edge.
Then we do the same thing for the pipe and get its edge curve.
Now we can shift the first curve from left to right and look for the best matching position, where the two black curves fit best. That position is where the 'true' edge of the pipe lies.
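The curve matching can be sketched as follows. The sum-of-squared-differences cost, the whole-pixel shifts and the function names are assumptions for the example; the source only says that the curves are shifted until they fit best. The box of known size gives the millimetres-per-pixel ratio.

```python
def average_profile(scanlines):
    """Average several gray-level scanlines column by column."""
    n = len(scanlines)
    return [sum(col) / n for col in zip(*scanlines)]

def best_shift(template, profile, max_shift):
    """Shift (in pixels) of the template that best matches the profile,
    using the mean squared difference over the overlapping samples."""
    best, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        pairs = [(template[i], profile[i + s])
                 for i in range(len(template)) if 0 <= i + s < len(profile)]
        if not pairs:
            continue
        cost = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best, best_cost = s, cost
    return best

def mm_per_pixel(box_size_mm, box_size_px):
    """Scale factor from a reference box of known real-world size."""
    return box_size_mm / box_size_px
```

The displacement of the pipe's edge from the box's edge is then `best_shift(...) * mm_per_pixel(...)`; because the box and the pipe are blurred alike, the blur largely cancels out in the comparison.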
Other hard stuff includes capturing 11 frames in a round-robin way from 2 capture boards and getting the weight data over an RS232 cable.
The project is going well and is in the tuning phase.
This is a screenshot of our system (in Chinese); you can see the standard boxes and a pipe at the top. We got an accuracy of around 5 mm, which is much better than expected. Vibrations are no problem at all: even if you kick the camera (joke) or tap it with your hand, the system still works well.
This is the preview system for tuning the 11 cameras; some of the cameras are not switched on yet.