I want to know the size of bounding box in object-detection api
I wrote a full answer on how to find the bounding box coordinates here and thought it might be useful to someone on this thread too.
Google Object Detection API returns bounding boxes in the format [ymin, xmin, ymax, xmax] and in normalised form (full explanation here), so every value lies between 0 and 1. To find the (x, y) pixel coordinates, multiply the x values by the width of the image and the y values by its height. First get the width and height of your image:
width, height = image.size
Then extract ymin, xmin, ymax, xmax from the boxes object (i is the index of the detection you are interested in) and multiply to get the (x, y) coordinates:
ymin = boxes[0][i][0]*height
xmin = boxes[0][i][1]*width
ymax = boxes[0][i][2]*height
xmax = boxes[0][i][3]*width
Finally print the coordinates of the box corners:
print('Top left')
print((xmin, ymin))
print('Bottom right')
print((xmax, ymax))
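The steps above can be collected into one small helper. This is only a minimal sketch, not part of the API: the function name box_to_pixels and the sample values are made up for illustration, and it assumes each box is a normalised [ymin, xmin, ymax, xmax] sequence as described above.

```python
def box_to_pixels(box, width, height):
    """Convert one normalised [ymin, xmin, ymax, xmax] box to pixel corners.

    Returns ((xmin, ymin), (xmax, ymax)), i.e. top-left and bottom-right.
    """
    ymin, xmin, ymax, xmax = box
    top_left = (xmin * width, ymin * height)
    bottom_right = (xmax * width, ymax * height)
    return top_left, bottom_right


# Hypothetical stand-in for the boxes array returned by sess.run: shape (1, m, 4).
boxes = [[[0.25, 0.5, 0.75, 1.0],
          [0.0, 0.0, 0.5, 0.25]]]
width, height = 640, 480  # e.g. taken from image.size on a PIL image

# boxes[0] is the list of detections for the single image in the batch.
for i, box in enumerate(boxes[0]):
    top_left, bottom_right = box_to_pixels(box, width, height)
    print(i, 'Top left', top_left, 'Bottom right', bottom_right)
```

Looping over boxes[0] is what the index i in the snippet above stands for: one iteration per detected box.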
You can get the boxes tensor from the detection graph like this:
boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
Do the same for scores and classes (the tensors are named 'detection_scores:0' and 'detection_classes:0'), then evaluate all of them in a single sess.run call:
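# imageFile must be the image as an array with a leading batch dimension,
# i.e. shape (1, height, width, 3).
(boxes, scores, classes) = sess.run(
    [boxes, scores, classes],
    feed_dict={image_tensor: imageFile})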
Just to extend Beta's answer:
You can get the predicted bounding boxes from the detection graph. An example of this is given in the tutorial IPython notebook on GitHub, which is where Beta's code snippet comes from. Access the detection_graph and extract the coordinates of the predicted bounding boxes from the detection_boxes tensor.
Calling np.squeeze(boxes) on the result reshapes it to (m, 4), where m denotes the number of predicted boxes. You can now access the individual boxes and compute the length, the area, or whatever else you need.
But remember that the predicted box coordinates are normalized! They are in the following order:
[ymin, xmin, ymax, xmax]
So computing the horizontal length (width) of a box in pixels would be something like this, where IMG_WIDTH is the width of your image in pixels:
def length_of_bounding_box(bbox):
    return bbox[3]*IMG_WIDTH - bbox[1]*IMG_WIDTH
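The same idea extends to every box at once. The sketch below is only an illustration under a few assumptions: boxes has the shape (1, m, 4) that sess.run returns, and IMG_WIDTH, IMG_HEIGHT, the sample array and the function name box_sizes_in_pixels are placeholders that you would replace with your own values.

```python
import numpy as np

IMG_WIDTH, IMG_HEIGHT = 640, 480  # assumed image size in pixels


def box_sizes_in_pixels(boxes, img_width, img_height):
    """Return per-box (widths, heights, areas) in pixels.

    `boxes` is expected to have shape (1, m, 4) with normalised
    [ymin, xmin, ymax, xmax] rows.
    """
    # Drop only the batch dimension: (1, m, 4) -> (m, 4).
    boxes = np.squeeze(np.asarray(boxes, dtype=float), axis=0)
    # Transposing gives four arrays of length m, one per coordinate.
    ymin, xmin, ymax, xmax = boxes.T
    widths = (xmax - xmin) * img_width
    heights = (ymax - ymin) * img_height
    return widths, heights, widths * heights


# Made-up detections standing in for real model output.
sample = np.array([[[0.25, 0.5, 0.75, 1.0],
                    [0.0, 0.0, 0.5, 0.25]]])
widths, heights, areas = box_sizes_in_pixels(sample, IMG_WIDTH, IMG_HEIGHT)
print(widths, heights, areas)
```

Passing axis=0 to np.squeeze removes just the batch dimension, so the result stays two-dimensional even when the model returns a single box.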