Cloud Vision API - PDF OCR

On April 6, 2018, support for PDF and TIFF files in document text detection was added to the Google Cloud Vision API (see the Release Notes).

According to the documentation:

The Vision API can detect and transcribe text from PDF and TIFF files stored in Google Cloud Storage.
Document text detection from PDF and TIFF must be requested using the asyncBatchAnnotate function, which performs an asynchronous request and provides its status using the operations resources.
Output from a PDF/TIFF request is written to a JSON file created in the specified Google Cloud Storage bucket.

Example:

1) Upload a file to your Google Cloud Storage bucket

2) Make a POST request to perform PDF/TIFF document text detection

Request:

POST https://vision.googleapis.com/v1p2beta1/files:asyncBatchAnnotate
Authorization: Bearer <your access token>

{
  "requests":[
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://<your bucket name>/input.pdf"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "outputConfig": {
        "gcsDestination": {
          "uri": "gs://<your bucket name>/output/"
        },
        "batchSize": 1
      }
    }
  ]
}

Response:

{
  "name": "operations/9b1f9d773d216406"
}
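The request in step 2 could be sent from Python roughly as follows. This is a minimal sketch using only the standard library; the helper names `build_request_body` and `start_ocr` are made up for illustration, and it assumes you already have a valid OAuth access token and that the input file is in your bucket.

```python
import json
import urllib.request

# Endpoint taken from the request shown in step 2.
ENDPOINT = "https://vision.googleapis.com/v1p2beta1/files:asyncBatchAnnotate"


def build_request_body(input_uri, output_uri, mime_type="application/pdf", batch_size=1):
    """Return the JSON body from step 2 as a Python dict (hypothetical helper)."""
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": input_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    "gcsDestination": {"uri": output_uri},
                    "batchSize": batch_size,
                },
            }
        ]
    }


def start_ocr(access_token, input_uri, output_uri):
    """POST the request and return the operation name (hypothetical helper)."""
    body = json.dumps(build_request_body(input_uri, output_uri)).encode("utf-8")
    request = urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
    # The response body is a small JSON object with a "name" field.
    with urllib.request.urlopen(request) as response:
        return json.load(response)["name"]
```

Calling `start_ocr(token, "gs://<your bucket name>/input.pdf", "gs://<your bucket name>/output/")` should return a string like the `operations/...` name shown in the response above, which is what step 3 needs.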

3) Make a GET request to check if document text detection is done

Request:

GET https://vision.googleapis.com/v1/operations/9b1f9d773d216406
Authorization: Bearer <your access token>

Response:

{
    "name": "operations/9b1f9d773d216406",
    "metadata": {
        "@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.OperationMetadata",
        "state": "RUNNING",
        "updateTime": "2018-06-17T20:18:09.117787733Z"
    },
    "done": true,
    "response": {
        "@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.AsyncBatchAnnotateFilesResponse",
        "responses": [
            {
                "outputConfig": {
                    "gcsDestination": {
                        "uri": "gs://<your bucket name>/output/"
                    },
                    "batchSize": 1
                }
            }
        ]
    }
}
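Because the request is asynchronous, step 3 usually has to be repeated until `done` is true. Below is a rough polling sketch in Python; `fetch_operation` and `wait_for_operation` are hypothetical helper names, and the interval and attempt limit are arbitrary assumptions, not values from the documentation.

```python
import json
import time
import urllib.request


def fetch_operation(access_token, operation_name):
    """GET the operation resource from step 3 and return it as a dict."""
    url = f"https://vision.googleapis.com/v1/{operation_name}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_operation(fetch, interval_seconds=5.0, max_attempts=60):
    """Call fetch() until the operation reports done, then return it.

    `fetch` is any zero-argument callable returning the operation dict,
    which keeps the loop easy to try out without network access.
    """
    for _ in range(max_attempts):
        operation = fetch()
        if operation.get("done"):
            # A finished long-running operation carries either
            # "response" or "error".
            if "error" in operation:
                raise RuntimeError(f"OCR operation failed: {operation['error']}")
            return operation
        time.sleep(interval_seconds)
    raise TimeoutError("operation did not finish in time")
```

With a real token this would be called as `wait_for_operation(lambda: fetch_operation(token, "operations/9b1f9d773d216406"))`.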

4) Check the results in the specified Google Cloud Storage folder
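Once an output JSON file has been downloaded from the bucket, the recognized text can be pulled out with a few lines of Python. This sketch assumes each output file contains a top-level `responses` list whose entries carry `fullTextAnnotation.text` and `context.pageNumber`; check one of your own output files to confirm the layout. `extract_text` is a hypothetical helper name.

```python
import json


def extract_text(output_json):
    """Return (page_number, text) pairs from one output file's JSON string.

    Assumes the layout described above; pages without recognized text
    yield an empty string.
    """
    document = json.loads(output_json)
    pages = []
    for response in document.get("responses", []):
        page_number = response.get("context", {}).get("pageNumber")
        text = response.get("fullTextAnnotation", {}).get("text", "")
        pages.append((page_number, text))
    return pages
```

With `"batchSize": 1`, as in the request above, each output file should hold the result for a single page, so the returned list would normally have one entry per file.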

https://cloud.google.com/vision/docs/pdf

I know this question is old, but Google Cloud Vision now supports PDF!

In 2016, PDF and TIFF formats were not supported by Cloud Vision.

The accepted formats were (taken from the docs):

JPEG
PNG8
PNG24
GIF
Animated GIF (first frame only)
BMP
WEBP
RAW
ICO

PDF and TIFF support has since been added.

Docs for images (JPEG, PNG, etc.):

https://cloud.google.com/vision/docs/ocr

Docs for PDF and TIFF:

https://cloud.google.com/vision/docs/pdf
