How can I locate something on my screen quickly in Python?

Wrap the search in a function (you can run it in a thread if needed) and pass a confidence value, which requires OpenCV:

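import pyautogui
import threading  # only needed if you run locate_cat() in a background thread

def locate_cat():
    cat = None
    # Keep searching until the image is found.
    while cat is None:
        try:
            cat = pyautogui.locateOnScreen('Pictures/cat.png', confidence=.65,
                                           region=(1722, 748, 200, 450))
        except pyautogui.ImageNotFoundException:
            # Newer PyAutoGUI versions raise instead of returning None.
            cat = None
    return cat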

You can use the region argument if you know roughly where the image is on screen. The region is a (left, top, width, height) tuple.

In some cases you can locate the image once, store the returned box in a variable, and pass it as region=somevar on later calls, so the search looks in the same place it found the image last time. This speeds up detection.

For example:

import pyautogui

front_door_save = None  # region where the front door was last found


def locate(image, region):
    # Newer PyAutoGUI versions raise instead of returning None on a miss.
    try:
        return pyautogui.locateOnScreen(image, confidence=.95, region=region)
    except pyautogui.ImageNotFoundException:
        return None


def first_find():
    # Initial search in a fixed region; remember where the door was found.
    global front_door_save
    while front_door_save is None:
        front_door_save = locate('frontdoor.png', (1722, 748, 200, 450))
    return front_door_save


def second_find():
    # Later searches only look where the door was found last time.
    return locate('frontdoor.png', front_door_save)


def find_person():
    # Look for the person inside the front-door region.
    return locate('person.png', front_door_save)


while True:
    first_find()
    front_door = second_find()
    if front_door is not None:
        find_person()
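
A variation on reusing the last match: rather than searching exactly the box that was found, you can pad it by a few pixels so that small movements of the target are still caught. This is only a minimal sketch. The name padded_region, the 20-pixel padding and the 1920x1080 default are assumptions made for illustration; none of them come from PyAutoGUI.

```python
def padded_region(box, pad=20, screen_size=(1920, 1080)):
    """Grow a (left, top, width, height) box by `pad` pixels on each side,
    clamp it to the screen, and return it as a region tuple.

    `padded_region` is a hypothetical helper, not a PyAutoGUI function.
    """
    left, top, width, height = box
    screen_w, screen_h = screen_size
    new_left = max(0, left - pad)
    new_top = max(0, top - pad)
    new_right = min(screen_w, left + width + pad)
    new_bottom = min(screen_h, top + height + pad)
    return (new_left, new_top, new_right - new_left, new_bottom - new_top)
```

The returned tuple can be passed as region= to pyautogui.locateOnScreen, and pyautogui.size() can supply the real screen size instead of the assumed default.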

I faced the same issue with pyautogui. Though it is a very convenient library, it is quite slow.

I got a speedup of roughly 10x by using cv2 and PIL directly:

def benchmark_opencv_pil(method):
    img = ImageGrab.grab(bbox=REGION)
    img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
    res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
    # print(res)
    return (res >= 0.8).any()

TM_CCOEFF_NORMED worked well for me. You can also adjust the 0.8 threshold.

Source: Fast locateOnScreen with Python
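
The function above only reports whether a match exists. If you also need its position, the score map returned by cv.matchTemplate can be turned into a box, since each entry of the map corresponds to the top-left corner of a candidate match. Here is a rough sketch using NumPy only; best_match_box and its parameters are hypothetical names, and it assumes a normalized method where higher scores are better (such as cv.TM_CCOEFF_NORMED).

```python
import numpy as np


def best_match_box(res, template_shape, region_origin=(0, 0), threshold=0.8):
    """Return (left, top, width, height) in screen coordinates for the best
    score in `res`, or None if no score reaches `threshold`.

    Hypothetical helper. Assumes `res` comes from a normalized method where
    a higher score means a better match, e.g. cv.TM_CCOEFF_NORMED.
    """
    row, col = np.unravel_index(np.argmax(res), res.shape)
    if res[row, col] < threshold:
        return None
    height, width = template_shape[:2]
    # res is indexed (row, col) = (y, x), relative to the grabbed region.
    left = int(region_origin[0] + col)
    top = int(region_origin[1] + row)
    return (left, top, int(width), int(height))
```

With the names from the snippet above, a call could look like best_match_box(res, GAME_OVER_PICTURE_CV.shape, REGION[:2]). Note that ImageGrab.grab takes bbox as (left, top, right, bottom), while PyAutoGUI's region is (left, top, width, height); the two only coincide when the region starts at (0, 0).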

For the sake of completeness, here is the full benchmark:

import pyautogui as pg
import numpy as np
import cv2 as cv
from PIL import ImageGrab, Image
import time

REGION = (0, 0, 400, 400)
GAME_OVER_PICTURE_PIL = Image.open("./balloon_fight_game_over.png")
GAME_OVER_PICTURE_CV = cv.imread('./balloon_fight_game_over.png')


def timing(f):
    def wrap(*args, **kwargs):
        time1 = time.time()
        ret = f(*args, **kwargs)
        time2 = time.time()
        print('{:s} function took {:.3f} ms'.format(
            f.__name__, (time2-time1)*1000.0))

        return ret
    return wrap


@timing
def benchmark_pyautogui():
    res = pg.locateOnScreen(GAME_OVER_PICTURE_PIL,
                            grayscale=True,  # should provide a speedup
                            confidence=0.8,
                            region=REGION)
    return res is not None


@timing
def benchmark_opencv_pil(method):
    img = ImageGrab.grab(bbox=REGION)
    img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
    res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
    # print(res)
    return (res >= 0.8).any()


if __name__ == "__main__":

    im_pyautogui = benchmark_pyautogui()
    print(im_pyautogui)

    methods = ['TM_CCOEFF', 'TM_CCOEFF_NORMED', 'TM_CCORR',
               'TM_CCORR_NORMED', 'TM_SQDIFF', 'TM_SQDIFF_NORMED']

    # cv.TM_CCOEFF_NORMED actually seems to be the most relevant method
    for method in methods:
        print('cv.' + method)
        # Look the constant up by name instead of calling eval().
        im_opencv = benchmark_opencv_pil(getattr(cv, method))
        print(im_opencv)

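The results show a speedup of roughly 8x (about 176 ms versus 20 to 25 ms). The True/False values are only meaningful for the normalized methods: the unnormalized scores are not on a 0 to 1 scale, so comparing them with 0.8 says little, and for the TM_SQDIFF methods a lower score means a better match.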

benchmark_pyautogui function took 175.712 ms
False
cv.TM_CCOEFF
benchmark_opencv_pil function took 21.283 ms
True
cv.TM_CCOEFF_NORMED
benchmark_opencv_pil function took 23.377 ms
False
cv.TM_CCORR
benchmark_opencv_pil function took 20.465 ms
True
cv.TM_CCORR_NORMED
benchmark_opencv_pil function took 25.347 ms
False
cv.TM_SQDIFF
benchmark_opencv_pil function took 23.799 ms
True
cv.TM_SQDIFF_NORMED
benchmark_opencv_pil function took 22.882 ms
True
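
Each function above is timed once, so the numbers include some noise (first-call overhead, other processes). A small sketch of how the measurement could be repeated and summarized; time_repeated is a hypothetical helper, and the default of 20 repeats is an arbitrary assumption.

```python
import statistics
import time


def time_repeated(func, repeats=20):
    """Call func() `repeats` times and return (median_ms, min_ms).

    Hypothetical helper for illustration; `func` takes no arguments.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), min(samples)
```

It would be called with an undecorated version of the benchmark functions, for example wrapped in a lambda that supplies the matching method, so the @timing decorator does not print on every repeat.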

The official documentation says it should take 1-2 seconds on a 1920x1080 screen, so your time seems to be a bit slow. I would try to optimize:

Use grayscale matching unless color information is important (grayscale=True is supposed to give a speedup of about 30%).
Use a smaller image to locate (for example only a part of it, if that part already identifies the position uniquely).
Don't load the image to locate from file on every call; keep it in memory.
Pass a region argument if you already know something about the possible locations (e.g. from previous runs).

This is all described in the documentation linked above.

If this is still not fast enough, you can check the source of pyautogui and see that locateOnScreen uses a specific algorithm (the Knuth-Morris-Pratt search algorithm) implemented in Python. Implementing this part in C may therefore give quite a pronounced speedup.
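
For readers unfamiliar with the algorithm named above, here is a minimal sketch of Knuth-Morris-Pratt search over a one-dimensional sequence of values. It is not pyautogui's actual code; build_failure_table and kmp_find are hypothetical names, and the sketch only illustrates the kind of pure-Python loop that a C implementation would replace.

```python
def build_failure_table(pattern):
    """For each prefix of pattern, the length of its longest proper prefix
    that is also a suffix (the KMP failure table)."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_find(haystack, needle):
    """Yield every start index where needle occurs in haystack."""
    if not needle:
        return
    table = build_failure_table(needle)
    k = 0  # number of needle items matched so far
    for i, value in enumerate(haystack):
        while k > 0 and value != needle[k]:
            k = table[k - 1]  # fall back without re-reading the haystack
        if value == needle[k]:
            k += 1
        if k == len(needle):
            yield i - k + 1
            k = table[k - 1]  # keep going to find overlapping matches
```

For example, list(kmp_find(screen_row, needle_row)) would return the offsets at which one row of needle pixels occurs in one row of screen pixels, where both names are placeholders for sequences of pixel values.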
