This was my B4 level project. The goal wasn’t much more than experimenting with some interesting concepts and making some sort of system; in other words, it was just a training project. But let’s have a look at it anyway. The general idea was to make a line stripe sensor using two web cams and a laser line generator. The data from each sensor would be processed and features extracted. The two cameras would give slightly different environment features, which is where sensor fusion comes in: the features from each sensor (just two in my case) would be fused in some way (simply the average in my case; in hindsight a weighted average based on each sensor’s specific accuracy would have been better). The fused data would then be given to the SLAM module, which would plot a map of the environment and simultaneously figure out the position of the vehicle (well, the thing carrying the sensors in my case :P)
Line stripe sensor
The line stripe sensor basically uses geometry to extract depth from a 2D camera image. Observe the figure on the left: as the object moves further from the sensor, the angle theta becomes smaller. By this principle, depth can be extracted from the image.
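A minimal sketch of that triangulation, assuming a simple pinhole camera model; the field of view, baseline, and pixel values are made-up numbers for illustration, not the project's calibration:

```python
import math

def depth_from_pixel(pixel_x, image_width, fov_deg, baseline_m):
    """Estimate depth from the laser stripe's horizontal pixel position.

    The camera and the laser plane sit `baseline_m` apart. The stripe's
    pixel offset from the image centre gives the viewing angle theta;
    simple triangulation then yields depth. All values are illustrative.
    """
    # Focal length in pixels under a pinhole model
    focal_px = (image_width / 2) / math.tan(math.radians(fov_deg) / 2)
    offset_px = pixel_x - image_width / 2
    theta = math.atan2(offset_px, focal_px)  # viewing angle to the stripe
    # As the object moves away, theta shrinks and the computed depth grows
    return baseline_m / math.tan(theta)

# A stripe seen 100 px right of centre in a 640 px wide, 60-degree FOV
# image, with a 10 cm camera-to-laser baseline:
z = depth_from_pixel(420, 640, 60.0, 0.10)
```

Note the inverse relation: the farther the object, the smaller the pixel offset (and theta), which is why accuracy degrades with range.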
The sensor platform puts the two cameras at different positions. This allows me to play around with data fusion and also changes the accuracy and sensor range of each camera. In theory, by merging these two sensors you could get greater accuracy and detection range on the features.
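The weighted-average fusion I mentioned above can be sketched as an inverse-variance weighting; the variance values below are made up for illustration:

```python
def fuse(z1, var1, z2, var2):
    """Inverse-variance weighted average of two sensor readings.

    The more accurate sensor (smaller variance) gets more weight; with
    equal variances this reduces to the plain average I actually used.
    """
    w1, w2 = 1.0 / var1, 1.0 / var2
    fused = (w1 * z1 + w2 * z2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)  # fused estimate beats either sensor alone
    return fused, fused_var

# Camera A reads 1.02 m (variance 0.01), camera B reads 0.95 m (variance 0.04)
est, var = fuse(1.02, 0.01, 0.95, 0.04)
```

The fused variance is always smaller than either input variance, which is the formal version of "merging two sensors gives greater accuracy".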
Somehow I initially thought this would be simple, yet it turned out to be my biggest headache.
In brief, the line stripe sensor (check the link on the left) would make a bright line (due to the CCD camera’s sensitivity to infrared, the laser line was bright as a star). Using edge extraction and a pre-made filter (whose values were tuned from statistical data on what worked and what didn’t), I would get a binary map of the image. Then, using closest-neighborhood clustering, I would merge the points into lines. Using the split-and-merge algorithm I would then get a set of parameterized lines, and applying one final merger of lines, a parametric set of laser lines in the image (shown on the next page). Finally, by converting the lines from the two cameras into a global coordinate frame, I was able to merge them.
Let me give a quick overview of my image processing scheme. First, an image is taken by both cameras while the laser line generator projects a line stripe. Then edge extraction and the filter find the line (more or less), resulting in a binary image. The points of the binary image are then clustered by closest neighborhood. From there, split and merge makes short work of converting the clustered points into parametric lines.
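The "split" half of split-and-merge can be sketched as follows: fit each cluster by recursively breaking it at the point farthest from the chord between its endpoints. The threshold and sample points are illustrative, not the values I actually tuned:

```python
import math

def split(points, thresh):
    """Recursive split step of split-and-merge over a list of 2D points.

    If the point farthest from the line joining the cluster's endpoints
    exceeds `thresh`, break the cluster there and recurse on both halves;
    otherwise emit one parameterized segment (its two endpoints).
    """
    (x1, y1), (x2, y2) = points[0], points[-1]
    norm = math.hypot(x2 - x1, y2 - y1) or 1e-9
    # Perpendicular distance of each point from the endpoint chord
    dists = [abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1) / norm
             for x, y in points]
    i = max(range(len(dists)), key=dists.__getitem__)
    if dists[i] > thresh and 0 < i < len(points) - 1:
        return split(points[:i + 1], thresh) + split(points[i:], thresh)
    return [(points[0], points[-1])]

# An L-shaped cluster: should come out as two segments (a corner)
pts = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]
segments = split(pts, thresh=0.3)
```

The follow-up merge step (not shown) would recombine adjacent segments that are nearly collinear, which is what the final merger of lines in my pipeline did.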
Ok, now as for SLAM (simultaneous localization and map building). Map-building algorithms have been around for quite a long while now. The problem was that the vehicle’s position (assuming the sensors are mounted on it) needed to be known with precision. Odometry-based solutions drift over time, and relying on specially prepared environmental landmarks rules out new (or changing) terrains. A human, on the other hand, is able to figure out where he/she is based on sensor data and at the same time make a map of the environment. Fortunately, the problem was solved in recent years; the main thrust towards the solution came from the realization that the vehicle and map positions are correlated. From there, an extended Kalman filter was used by many to solve the problem.
The EKF SLAM algorithm runs recursively in prediction and update stages; these two stages predict the states (think state space) of the system and then correct them using sensory data. In the process, a covariance matrix is continuously updated; this matrix basically decides how much the next iteration’s estimate is biased towards the model or towards the sensors (decided online from the statistical accuracy of each). In recent years, museum guide robots have used this algorithm with impressive results. Below are figures from my simulation. The green line represents the desired path, which the robot tries to follow to within some distance. Red is where the algorithm thinks the objects (the box, the line, and the robot) are, and blue is their actual positions. The sensory data given to the algorithm has noise injected into it (which is partly why the robot always travels a slightly different path, as can be seen in the black path lines). Amazingly enough, the object positions converge to the true positions over time. In general, a robot armed with SLAM is like a human in the sense that you can’t get lost as long as you keep seeing and recognizing known landmarks (assuming the map doesn’t change too much).
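The predict/update loop and the role of the covariance can be shown with a scalar Kalman filter (the full EKF SLAM version uses matrices and a nonlinear motion model, but the structure is the same); all the noise numbers below are made up:

```python
def predict(x, P, u, Q):
    """Prediction step: propagate the state with the motion model.

    The covariance P grows by the model noise Q, i.e. trust in the
    model alone decreases between measurements.
    """
    return x + u, P + Q

def update(x, P, z, R):
    """Update step: blend the prediction with the measurement z.

    The Kalman gain K is exactly the 'bias' described above: a large P
    (uncertain model) pulls the estimate towards the sensor, a large R
    (noisy sensor) pulls it towards the model.
    """
    K = P / (P + R)
    return x + K * (z - x), (1 - K) * P

# Toy 1D run: command 1 m forward per step, measure position noisily
x, P = 0.0, 1.0
for z in [1.1, 1.9, 3.05]:
    x, P = predict(x, P, u=1.0, Q=0.1)
    x, P = update(x, P, z, R=0.5)
```

After a few steps the estimate tracks the true position (about 3 m here) and P shrinks, which mirrors how the simulated object positions converge over time.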
My originality regarding SLAM (not very original, although it was educational) was to create a line-based SLAM algorithm. This can be done by simply keeping track of which feature points are associated with each other while using the endpoints of the lines to update the Kalman filter. One problem with line-based SLAM, though, is dealing with incomplete line data. For example, if the camera is turning to face a wall, then initially it will not see the entire wall, as part of it lies at the edge of its vision. At every update step the algorithm might therefore add a new, slightly longer version of the same line. To avoid this, the algorithm keeps lines associated with points near the edges of the detection range in a candidate line vector. At every update step it merges overlapping line candidates and checks whether any candidate has had both of its endpoints clearly detected (i.e. not near the detection edges); if so, the candidate is promoted to a full line. Likewise, line endpoints near the edges of the detection range are marked as invalid features, avoiding unnecessary features that would increase the chance of incorrectly associating features.
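The candidate-line bookkeeping above can be sketched like this. Each line is represented as a 1D interval along the field of view; the interval representation, function name, and numbers are illustrative, not my actual data structures:

```python
def promote_candidates(candidates, fov_min, fov_max, margin):
    """Merge overlapping candidate lines, then promote the ones whose
    BOTH endpoints sit clear of the detection edges; the rest stay
    candidates for the next scan."""
    # Merge overlapping intervals (the same wall seen in several scans)
    merged = []
    for s, e in sorted(candidates):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    # Promote only fully observed lines
    full, still_candidates = [], []
    for s, e in merged:
        if s > fov_min + margin and e < fov_max - margin:
            full.append((s, e))
        else:
            still_candidates.append((s, e))
    return full, still_candidates

# Two partial views of the same wall, plus a line still touching the edge
full, cand = promote_candidates([(1.0, 2.5), (2.0, 3.0), (8.5, 10.0)],
                                fov_min=0.0, fov_max=10.0, margin=0.5)
```

The two overlapping views merge into one wall and get promoted, while the line at the edge of the detection range stays a candidate, so no duplicate, slightly-longer copies ever reach the Kalman filter.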