Fastest way to count the number of features in a feature class?
I am using an example with 1 million randomly generated points inside a file geodatabase, attached here.
Here is some code to get us started:
import time
import arcpy

arcpy.env.workspace = r"C:\CountTest.gdb"

time.sleep(5)  # Let the cpu/ram calm before proceeding!

"""Method 1"""
StartTime = time.clock()
with arcpy.da.SearchCursor("RandomPoints", ["OBJECTID"]) as cursor:
    rows = {row[0] for row in cursor}
count = 0
for row in rows:
    count += 1
EndTime = time.clock()
print "Finished in %s seconds" % (EndTime - StartTime)
print "%s features" % count

time.sleep(5)  # Let the cpu/ram calm before proceeding!

"""Method 2"""
StartTime2 = time.clock()
arcpy.MakeTableView_management("RandomPoints", "myTableView")
count = int(arcpy.GetCount_management("myTableView").getOutput(0))
EndTime2 = time.clock()
print "Finished in %s seconds" % (EndTime2 - StartTime2)
print "%s features" % count
And some initial results:
>>>
Finished in 6.75540050237 seconds
1000000 features
Finished in 0.801474780332 seconds
1000000 features
>>> =============================== RESTART ===============================
>>>
Finished in 6.56968596918 seconds
1000000 features
Finished in 0.812731769756 seconds
1000000 features
>>> =============================== RESTART ===============================
>>>
Finished in 6.58207512487 seconds
1000000 features
Finished in 0.841122157314 seconds
1000000 features
Imagine larger, more complex datasets: the SearchCursor approach will just crawl on and on.
I am not at all dissatisfied with the results; however, the data access (arcpy.da) module is being used extensively in our GIS development circle, and I am looking to rebuild some of our function definitions with it, since it is more flexible than a MakeTableView + GetCount methodology.
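For reference, one da-based pattern for those helpers would be to count with a generator rather than materializing all the OBJECTIDs into a set first. A minimal sketch, assuming a count_features name and an optional where_clause parameter (both mine, not from the code above); it still walks every row, so I would not expect it to beat GetCount on big feature classes:

import arcpy

arcpy.env.workspace = r"C:\CountTest.gdb"

def count_features(feature_class, where_clause=None):
    """Count rows via arcpy.da.SearchCursor without building a set in memory."""
    # OID@ asks the cursor for the ObjectID; summing a generator keeps memory use flat
    with arcpy.da.SearchCursor(feature_class, ["OID@"], where_clause) as cursor:
        return sum(1 for _ in cursor)

print "%s features" % count_features("RandomPoints")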
I have tested the solution from the answer above, and on my real-world data the difference is negligible. Contrary to the results in the other answer, my times for arcpy.MakeTableView_management and arcpy.da.SearchCursor within ArcMap are about the same.
I tested variations with and without a query; please see the code for the query version and the final measured results below:
@staticmethod
def query_features(feature_class, query):
    # Method 1
    time.sleep(5)  # Let the cpu/ram calm before proceeding!
    start_time = time.clock()
    count = len(list(i for i in arcpy.da.SearchCursor(feature_class, ["OBJECTID"], query)))
    end_time = time.clock()
    arcpy.AddMessage("Method 1 finished in {} seconds".format(end_time - start_time))
    arcpy.AddMessage("{} features".format(count))

    # Method 2
    time.sleep(5)  # Let the cpu/ram calm before proceeding!
    start_time = time.clock()
    arcpy.MakeTableView_management(feature_class, "myTableView", query)
    count = int(arcpy.GetCount_management("myTableView").getOutput(0))
    end_time = time.clock()
    arcpy.AddMessage("Method 2 in {} seconds".format(end_time - start_time))
    arcpy.AddMessage("{} features".format(count))
The results below:
No query:
Method 1 finished in 5.3616442 seconds
804140 features
Method 2 in 4.2843138 seconds
804140 features
Many results query:
Method 1 finished in 12.7124766 seconds
518852 features
Method 2 in 12.1396602 seconds
518852 features
Few results query:
Method 1 finished in 11.1421476 seconds
8 features
Method 2 in 11.2232503 seconds
8 features
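Since query_features above is a static method, a call would look roughly like this sketch; the class name, geodatabase path, and where clause are placeholders for illustration only, not my actual data:

import time
import arcpy

class CountBenchmark(object):

    @staticmethod
    def query_features(feature_class, query):
        # ... body exactly as shown above ...
        pass

# Placeholder path and query -- substitute your own feature class and definition query
CountBenchmark.query_features(r"C:\Data\Test.gdb\Parcels", "STATUS = 'ACTIVE'")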