Computer Vision: Lane Finding Through Image Processing

Lane Detection using Edge Detection and Hough Transform

Image Courtesy: Udacity

How does coding a lane-finding system on a live video feed sound? It seemed ubercool to me when I did this project with Udacity back in 2017. At that point, the aim was just to learn enough of the basics to code a working system and get a feel for the power of computer vision, even without understanding every detail behind the scenes. The main aim of this post, however, is to explain the image processing techniques involved. It is more about the journey than the destination. You might end up with a cool project at the end, but there are always some side effects 🙂

The aim is to identify the lane lines in an image and then extend the algorithm to a video, so that the left and right lane lines are identified effectively irrespective of their shape (solid vs dashed), colour (white vs yellow) or slope (straight, angled or curved). The images and video used for the project are from a US highway where the lanes are clearly marked and mostly straight rather than curved.

The following features have been used for the identification:

1- Colour: Lane lines are generally light coloured (white/yellow) compared to the road (dark grey). So a black-and-white image works better, as the lanes can easily be distinguished from the background.

2- Shape: Lane lines are usually solid or dashed lines, which can be used to separate them from other objects in the image. Edge detection algorithms such as Canny can be used to find all the edges/lines in the image. Then we can use further information to decide which edges qualify as lane lines.

3- Orientation: Highway lane lines are closer to vertical than to horizontal. So the slope of the lines detected in an image can be used to check whether a line is a possible candidate for a lane or not.

4- Position in image: In a regular highway image taken by a dash cam mounted on the car looking ahead, lane lines typically appear in the lower half of the image. So the search area can be narrowed down to a region of interest to reduce noise.

Colour Selection

Each image can be thought of as being made up of different colour channels. In the RGB colour scheme, every pixel in the image is made up of values for the Red, Green and Blue channels. These values vary from 0 to 255, with 0 being the darkest and 255 the brightest. In this way, white is [255, 255, 255] and black is [0, 0, 0]. If you take the average of the three values and use that for every pixel, the resulting image is grayscale.
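
As a quick illustration (a minimal sketch, not part of the project pipeline; 'test.jpg' stands in for any RGB image), channel averaging in NumPy looks like this:

import matplotlib.image as mpimg
import numpy as np

image = mpimg.imread('test.jpg')   # RGB image, shape (height, width, 3)
gray = image.mean(axis=2)          # average the R, G, B values per pixel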

Image Courtesy: Udacity

To achieve this, I used the following code:

# Define our color criteria
red_threshold = 0
green_threshold = 0
blue_threshold = 0
rgb_threshold = [red_threshold, green_threshold, blue_threshold]

Here a colour threshold (rgb_threshold) is defined from the variables red_threshold, green_threshold, and blue_threshold. This vector contains the minimum values for red, green, and blue (R, G, B) that are allowed in the selection; anything below those is set to zero (blacked out). This helps particularly in a case like this, as lane lines generally contrast with the road, and hence we can set a minimum value. (The thresholds of 0 above keep everything; sensible values are around 200, as in the combined code below.)

# Mask pixels below the threshold
color_select = np.copy(image)  # start from a copy of the original image
color_thresholds = (image[:,:,0] < rgb_threshold[0]) | \
                   (image[:,:,1] < rgb_threshold[1]) | \
                   (image[:,:,2] < rgb_threshold[2])
color_select[color_thresholds] = [0,0,0]

The result, color_select, is an image in which pixels that were above the threshold have been retained, and pixels below the threshold have been blacked out.

Region Masking

The next task is region masking. If we assume the images come from a dash cam mounted in a fixed position on the car, the lane lines will only appear in a certain region of the image. This can be utilised to narrow down the search region. Here we define a triangle, as shown in the figure below, and use only the region within it for our search.

Image Courtesy: Udacity

The variables left_bottom, right_bottom, and apex are used in the code below as the vertices of a triangular region that I would like to retain for my colour selection, while masking everything else out. Here a triangular mask is used, but in principle you could use any polygon.
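
For an arbitrary polygon, OpenCV's cv2.fillPoly and cv2.bitwise_and offer a convenient alternative to the inequality-based masking used below. This is a sketch only; the helper name region_mask and the vertex values are illustrative, and a 3-channel image is assumed:

import numpy as np
import cv2

def region_mask(img, vertices):
    # Keep only the pixels inside the polygon defined by vertices
    # (illustrative helper; assumes a 3-channel image)
    mask = np.zeros_like(img)
    cv2.fillPoly(mask, vertices, (255,) * img.shape[2])  # fill polygon with white
    return cv2.bitwise_and(img, mask)

# Illustrative vertices for a 960x540 image: bottom-left, apex, bottom-right
vertices = np.array([[(0, 539), (475, 320), (900, 539)]], dtype=np.int32)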

Once the masking has been applied, the final image looks like:

Image Courtesy: Udacity

We can combine both strategies as shown in the code below:

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np

# Read in the image
image = mpimg.imread('test.jpg')

# Grab the x and y sizes and make two copies of the image
# With one copy we'll extract only the pixels that meet our selection,
# then we'll paint those pixels red in the original image to see our selection
# overlaid on the original.
ysize = image.shape[0]
xsize = image.shape[1]
color_select= np.copy(image)
line_image = np.copy(image)

# Define our color criteria
red_threshold = 200
green_threshold = 200
blue_threshold = 200
rgb_threshold = [red_threshold, green_threshold, blue_threshold]

# Define a triangular region of interest
# Keep in mind the origin (x=0, y=0) is at the upper left in image processing,
# so y grows downwards. (Example values for a 960x540 image.)
left_bottom = [0, 539]
right_bottom = [900, 539]
apex = [475, 320]

fit_left = np.polyfit((left_bottom[0], apex[0]), (left_bottom[1], apex[1]), 1)
fit_right = np.polyfit((right_bottom[0], apex[0]), (right_bottom[1], apex[1]), 1)
fit_bottom = np.polyfit((left_bottom[0], right_bottom[0]), (left_bottom[1], right_bottom[1]), 1)

# Mask pixels below the threshold
color_thresholds = (image[:,:,0] < rgb_threshold[0]) | \
                   (image[:,:,1] < rgb_threshold[1]) | \
                   (image[:,:,2] < rgb_threshold[2])

# Find the region inside the lines
XX, YY = np.meshgrid(np.arange(0, xsize), np.arange(0, ysize))
region_thresholds = (YY > (XX*fit_left[0] + fit_left[1])) & \
                    (YY > (XX*fit_right[0] + fit_right[1])) & \
                    (YY < (XX*fit_bottom[0] + fit_bottom[1]))
# Mask color selection
color_select[color_thresholds] = [0,0,0]
# Find where image is both colored right and in the region
line_image[~color_thresholds & region_thresholds] = [255,0,0]

# Display our two output images (show each separately, or the
# second imshow call will simply overwrite the first)
plt.imshow(color_select)
plt.show()
plt.imshow(line_image)
plt.show()

The result after colour and region masking, superimposed on the original image to indicate the lane lines:

Image Courtesy: Udacity

As shown in the figure, colour detection was able to mark the lane lines. However, it is still not a great way to find them. It is easiest when the lines are a single colour: white on a dark road. But lines often come in different colours (yellow, white, etc.) and may be solid or dashed. So we need a more intelligent algorithm to find the lane lines in a video stream, which is where edge detection comes in.

Canny Edge Detection

Sometimes in a computer vision project, the shape of an object can be utilised for its detection. To determine the shape of the objects in an image, edge detection is required.

The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986 — Wikipedia

A grayscale image is generally used, as it is easier to find edges in. A grayscale image can be imagined as a 2D matrix, a function of x and y, with each pixel as a matrix element containing a brightness value. Hence we can perform mathematical operations such as taking gradients in the x and y directions. An edge is just a rapid change in the brightness of neighbouring pixels.

Image Courtesy: Udacity

Gradients are taken in both the x and y directions, which give the change in intensity at each pixel. The gradient often produces thick edges, to which we then apply a threshold to get to the actual edge pixels.

Image Courtesy: Udacity

The Canny edge detection algorithm is composed of the following steps:

1. Noise reduction: Since the mathematics involved in the algorithm is mainly based on derivatives, edge detection results are highly sensitive to image noise. One way to get rid of the noise in the image is to apply a Gaussian blur to smooth it. To do so, an image convolution is applied with a Gaussian kernel (3×3, 5×5, 7×7, etc.). The kernel size depends on the expected blurring effect: basically, the smaller the kernel, the less visible the blur.

2. Gradient calculation: The smoothed image is then filtered with a Sobel kernel in both the horizontal and vertical directions to get the first derivatives in both directions. From these two derivative images, we can find the edge gradient and direction for each pixel as follows:

Image Courtesy: Udacity
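
In symbols (this is the standard formulation, which the figure above shows): if $G_x$ and $G_y$ are the horizontal and vertical Sobel derivatives, then

$$G = \sqrt{G_x^2 + G_y^2}, \qquad \theta = \arctan\left(\frac{G_y}{G_x}\right)$$

give the gradient magnitude and direction at each pixel.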

3. Non-maximum suppression: After getting the gradient magnitude and direction, a full scan of the image is done to remove any unwanted pixels which may not constitute an edge. For this, every pixel is checked to see whether it is a local maximum in its neighbourhood in the direction of the gradient.

4. Edge tracking by hysteresis: This stage decides which edges are really edges and which are not. For this, we need two threshold values, minVal and maxVal. Any edges with an intensity gradient above maxVal are sure to be edges, and those below minVal are sure to be non-edges and are discarded. Those that lie between the two thresholds are classified as edges or non-edges based on their connectivity.

In OpenCV, the cv2.Canny function can be used to find edges in a smoothed grayscale image. Here is the full code:

# Do relevant imports
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2

# Read in and grayscale the image
image = mpimg.imread('solidYellowCurve.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)

# Define a kernel size and apply Gaussian smoothing
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0)

# Define our parameters for Canny and apply
low_threshold = 30
high_threshold = 100
edges = cv2.Canny(blur_gray, low_threshold, high_threshold)
plt.imshow(edges, cmap='Greys_r')

The resulting image after edge detection:

Image Courtesy: Udacity

All the edges in the image are visible now, but how would you extract lane lines from this edge-detected image? Now that we have all the points that represent the edges, we need something to connect them. This is where the Hough Transform comes in.

Hough Transform

In image space, a line is plotted as x vs. y, but in 1962, Paul Hough devised a method for representing lines in parameter space, which we call “Hough space” in his honour.

A line's equation can be represented as y = mx + b, where m is the slope and b is the intercept. In Hough space, I can represent my "x vs. y" line as a point in "m vs. b" instead. The Hough Transform is just this conversion from image space to Hough space: the characterisation of a line in image space becomes a single point at the position (m, b) in Hough space. For example, the image-space line y = 2x + 1 maps to the single Hough-space point (2, 1).

Image Courtesy: Udacity

A single point in image space has many possible lines that pass through it, but not just any lines: only those with particular combinations of the m and b parameters. Rearranging the equation of a line, we find that a single point (x, y) corresponds to the line b = y - xm. Therefore, a point in image space corresponds to a line in Hough space. Hence, if we plot several points in image space and their corresponding lines in Hough space, the intersection point of those lines in Hough space represents the line passing through all of those points in image space. This concept is used to identify the lines in an image.

Image Courtesy: Udacity

Using this Cartesian representation of lines poses problems when the lines are vertical, as the slope is then infinite. To remove this anomaly, polar coordinates are used.
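
Concretely, in polar form a line is written as

$$\rho = x\cos\theta + y\sin\theta$$

where ρ is the perpendicular distance of the line from the origin and θ is the angle of that perpendicular; a vertical line simply has θ = 0 and a finite ρ. A point in image space now maps to a sinusoid, rather than a straight line, in (θ, ρ) space, but intersections still identify lines, exactly as before.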

Image Courtesy: Udacity

To apply the Hough Transform in OpenCV, a function called cv2.HoughLinesP is used, which takes several parameters.

lines = cv2.HoughLinesP(masked_edges, rho, theta, threshold, np.array([]), min_line_length, max_line_gap)

In this case, we are operating on the image masked_edges (the output from Canny) and the output from HoughLinesP will be lines, which will simply be an array containing the endpoints (x1, y1, x2, y2) of all line segments detected by the transform operation. The other parameters define just what kind of line segments we’re looking for.

First off, rho and theta are the distance and angular resolution of our grid in Hough space. You need to specify rho in units of pixels and theta in units of radians.

The threshold parameter specifies the minimum number of votes (intersections in a given grid cell) a candidate line needs to have to make it into the output. The empty np.array([]) is just a placeholder, no need to change it. min_line_length is the minimum length of a line (in pixels) that you will accept in the output, and max_line_gap is the maximum distance (again, in pixels) between segments that you will allow to be connected into a single line. You can then iterate through your output lines and draw them onto the image to see what you got!

Here is the final code:

# Do relevant imports
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2

# Read in and grayscale the image
image = mpimg.imread('exit-ramp.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)

# Define a kernel size and apply Gaussian smoothing
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0)

# Define our parameters for Canny and apply
low_threshold = 50
high_threshold = 150
masked_edges = cv2.Canny(blur_gray, low_threshold, high_threshold)

# Define the Hough transform parameters
# Make a blank the same size as our image to draw on
rho = 1
theta = np.pi/180
threshold = 10
min_line_length = 20
max_line_gap = 30
line_image = np.copy(image)*0 #creating a blank to draw lines on

# Run Hough on edge detected image
lines = cv2.HoughLinesP(masked_edges, rho, theta, threshold, np.array([]), min_line_length, max_line_gap)

# Iterate over the output "lines" and draw lines on the blank
for line in lines:
    for x1,y1,x2,y2 in line:
        cv2.line(line_image,(x1,y1),(x2,y2),(255,0,0),10)

# Create a "color" binary image to combine with line image
color_edges = np.dstack((masked_edges, masked_edges, masked_edges))

# Draw the lines on the edge image
combo = cv2.addWeighted(color_edges, 0.8, line_image, 1, 0)
plt.imshow(combo)

The resulting image after applying Canny edge detection and the Hough transform:

Image Courtesy: Udacity

Putting Everything Together

Now all these concepts are utilised to identify the lane lines in a video. The pipeline uses the following steps to mark the lane lines:

1- Converting the image to grayscale: The image is converted to grayscale using the OpenCV function cv2.cvtColor(). This makes it easier to identify lanes and also helps with the next step of edge detection. A Gaussian blur is applied to average out the pixel values in order to reduce noise before taking the gradient.

2- Canny edge detection: Next, the gradient of the image is taken to compute the boundaries or edges of the objects in the image. cv2.Canny(blur_gray, low_threshold, high_threshold) is the function used to perform Canny edge detection, where the low and high thresholds control how strong an edge must be in order to be detected. We used low_threshold = 50 and high_threshold = 150.

3- Region of interest: Next, a four-sided polygon is defined to mask the region of interest. As mentioned before, the parts of the image where lane lines can be found are identified, which in our case was a polygon in the lower half of the image; the other regions of the image are masked out to limit the search to that area.

4- Hough transform: Once the edge pixels are detected in an image, a Hough transform is applied to connect these edge pixels to form lane lines. cv2.HoughLinesP is the function used, which takes the output image from Canny and a few other parameters in order to define the lane lines. The output from the Hough transform contains all the lane line segments (solid/dashed). The next step is to find a common average line to smooth them.

5- Finding the average equation of a line and smoothing: Once the line segments are identified in the image, they are passed to a draw_lines(line_img, lines) function which finds the average equation of each lane line. It separates the left and right lane lines by calculating slopes: in image coordinates, where the y-axis points down, negative slopes correspond to the left line and positive slopes to the right (see the sketch after this list).

Once you have separate left and right lines, the slope and intercept of each side are averaged out by accumulating all the points on each side and finding an average fit through them.

6- Extrapolation: Once you have the average left/right lines, they are extrapolated to generate a single solid lane line on each side. Once the pipeline works on the test images, the same is done for each frame of the video to generate the lane lines for the whole video.
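
Here is a minimal sketch of steps 5 and 6, assuming a hypothetical helper named average_lines that takes the original image (for its dimensions) and the raw output of cv2.HoughLinesP; the actual project's draw_lines differs in detail:

import numpy as np
import cv2

def average_lines(img, lines):
    # Separate Hough segments by slope, average each side, and
    # extrapolate to one solid line per lane (sketch only).
    left_pts, right_pts = [], []
    for line in lines:
        for x1, y1, x2, y2 in line:
            if x1 == x2:
                continue                              # skip vertical segments
            slope = (y2 - y1) / (x2 - x1)
            if abs(slope) < 0.3:
                continue                              # reject near-horizontal noise
            if slope < 0:                             # image y-axis points down, so
                left_pts += [(x1, y1), (x2, y2)]      # negative slope = left lane
            else:
                right_pts += [(x1, y1), (x2, y2)]     # positive slope = right lane

    ysize = img.shape[0]
    y_top = int(ysize * 0.6)      # extend lines from the bottom up to ~60% of the height
    out = np.zeros_like(img)
    for pts in (left_pts, right_pts):
        if len(pts) < 2:
            continue                                  # nothing detected on this side
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        slope, intercept = np.polyfit(xs, ys, 1)      # average fit through all points
        x_bottom = int((ysize - 1 - intercept) / slope)
        x_top = int((y_top - intercept) / slope)
        cv2.line(out, (x_bottom, ysize - 1), (x_top, y_top), (255, 0, 0), 10)
    return out

The returned blank-with-lines image can then be blended onto the original frame with cv2.addWeighted, exactly as in the Hough transform example above.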

The final pipeline was able to find the lane lines in the videos even when the lane lines are of different colours (yellow, white) and different shapes (solid, dashed etc).

Image Courtesy: Udacity

Even though the code worked well when the lines were straight, it did not work well when the lines were more curved, since we are fitting straight lines. Also, under varied lighting the algorithm finds it hard to detect the lanes. Possible improvements include using a second-degree polynomial to account for curved lanes and using different lane identification techniques for better accuracy.
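
As a sketch of the first improvement (with hypothetical inputs: xs and ys are arrays of candidate pixel coordinates for one lane line, e.g. collected from the edge image), a second-degree fit would replace the straight-line fit:

import numpy as np

# Hypothetical inputs: pixel coordinates of one lane line
xs = np.array([290, 310, 340, 380, 430])
ys = np.array([540, 500, 460, 420, 380])

coeffs = np.polyfit(ys, xs, 2)               # fit x = a*y^2 + b*y + c
plot_y = np.linspace(ys.min(), ys.max(), 50)
plot_x = np.polyval(coeffs, plot_y)          # sample the curve for drawing

Fitting x as a function of y (rather than y as a function of x) is the usual choice here, since lane lines are close to vertical in the image.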

Image Courtesy: Udacity

The final code for the project can be found at my GitHub repo.

Written while listening to Prateek Kuhad