How are people using Python data structures and classes in ArcPy?

Many arcpy functions that take multiple inputs accept Python list objects.

For example, the Dissolve_management function accepts a list of field names to dissolve on:

arcpy.Dissolve_management("taxlots", "C:/output/output.gdb/taxlots_dissolved",
    ["LANDUSE", "TAXCODE"], "", "SINGLE_PART", "DISSOLVE_LINES")

A tuple can be used in place of a list when you do not need to modify the order or number of elements, as tuples are immutable. They are a useful data structure for heterogeneous but related pieces of data, such as the elements of a timestamp or the coordinates of a point. You will often see lists of tuples, where a tuple serves as a distinct record with a fixed number of attributes, while the list could easily change size, be re-ordered (sorted), etc. See this StackOverflow question for more on the uses of lists vs. tuples.
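To make the list-of-tuples pattern concrete, here is a minimal sketch. The taxlot IDs, land-use codes, and areas are made-up values, not taken from any real dataset:

```python
# Each tuple is one fixed-shape record: (taxlot_id, landuse, area_acres).
# The values are hypothetical, for illustration only.
taxlots = [
    ("R103", "COM", 2.5),
    ("R101", "RES", 0.4),
    ("R102", "RES", 1.1),
]

# The list can be re-ordered or grown; each tuple stays intact.
taxlots.sort(key=lambda record: record[2], reverse=True)  # largest area first
taxlots.append(("R104", "IND", 7.8))

# Tuples unpack cleanly into named variables.
for taxlot_id, landuse, area_acres in taxlots:
    print("%s %s %.1f" % (taxlot_id, landuse, area_acres))

largest_id = taxlots[0][0]  # ID of the largest taxlot in the sorted portion
```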

A dictionary can be used as a fast lookup table to cache a relatively small but frequently-used set of key-value pairs into memory. I saw an interesting example of this on the ArcGIS forums: http://forums.arcgis.com/threads/55099-Update-cursor-with-joined-tables-work-around-w-dictionaries

Their use of a dictionary instead of a join sped up their calculation from 3.5 hours to 15 minutes.
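The forum poster's code is not reproduced here, but the general idea can be sketched in plain Python, with lists standing in for the rows a cursor would return. The field layout and values are hypothetical:

```python
# Stand-ins for rows read from two tables with cursors.
# Field layout and values are hypothetical.
owner_rows = [(101, "Smith"), (102, "Nguyen"), (103, "Garcia")]  # (parcel_id, owner)
parcel_rows = [[101, None], [103, None], [999, None]]            # [parcel_id, owner]

# One pass over the lookup table builds the dictionary.
owner_by_parcel = dict(owner_rows)

# One pass over the target table fills in values by key; no join is needed.
for row in parcel_rows:
    row[1] = owner_by_parcel.get(row[0])  # None when there is no match
```

Each lookup is a constant-time hash access, which is likely where most of the speedup over a joined table comes from.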

A simpler example: suppose you have a million address records whose state attribute holds the abbreviation (CA), but for display purposes you want the full name (California). You could use a dictionary as a lookup table when populating a full state name field.
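A minimal sketch of that state-name lookup follows. The STATE_NAMES table is deliberately partial and the full_state_name helper is a hypothetical name, not part of arcpy:

```python
# Hypothetical lookup table; a real one would cover every state.
STATE_NAMES = {"CA": "California", "OR": "Oregon", "WA": "Washington"}

def full_state_name(abbreviation):
    """Return the spelled-out state name, or the input unchanged if unknown."""
    key = abbreviation.strip().upper()
    return STATE_NAMES.get(key, abbreviation)

# Stand-in for values read from an address table.
abbreviations = ["CA", "or", "WA ", "XX"]
full_names = [full_state_name(a) for a in abbreviations]
```

In a real script, the same function would be called once per row inside an update cursor loop.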

I have not found a need to write a class in Python for use in arcpy myself, but that's not to say there isn't such a use case. A class might be useful when you have a set of closely-related functions (behaviors) that operate on some input (data), and you want to be able to use those data and behaviors in an object-oriented way, but this is more likely going to be business-logic specific and not related to arcpy.
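For a sense of what such a class might look like, here is a hypothetical sketch. BoundingBox and its methods are invented for illustration and do not depend on arcpy; the class simply bundles some data with the functions that operate on it:

```python
class BoundingBox(object):
    """Hypothetical example class: an axis-aligned extent plus related behaviors."""

    def __init__(self, xmin, ymin, xmax, ymax):
        self.xmin, self.ymin, self.xmax, self.ymax = xmin, ymin, xmax, ymax

    @property
    def width(self):
        return self.xmax - self.xmin

    @property
    def height(self):
        return self.ymax - self.ymin

    def contains(self, x, y):
        """Return True if the point (x, y) falls inside or on the boundary."""
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def buffered(self, distance):
        """Return a new box grown by `distance` on every side."""
        return BoundingBox(self.xmin - distance, self.ymin - distance,
                           self.xmax + distance, self.ymax + distance)


study_area = BoundingBox(0.0, 0.0, 100.0, 50.0)
bigger = study_area.buffered(10.0)
```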

Blah238 covers this topic well, so I will just add a couple of examples from my own work. I develop a lot of airport data, and one of the things I have to do regularly is read the surveyed centerline points of a runway in order along the runway. You'd think that these points would already be in order in the GIS database, but they rarely are. The centerline points occur every 10 feet along the centerline and are flanked on either side by two other rows of survey points spaced 10 feet apart. You get the picture: a plethora of points, usually all mixed in together database-wise. With what I am doing in my scripts, it is usually easiest to select out the centerline points by attributes (or spatially if need be), read the coordinates for each, and dump the results into a Python list. I can then sort, pop, reverse, etc. the list however I need, and it's fast.
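As a rough sketch of that workflow, assume the centerline coordinates have already been read into a list and that one runway end is known. The coordinates and the distance_from_end helper below are hypothetical, and sorting by distance from one end only works for a straight centerline:

```python
import math

# Stand-in for centerline coordinates read from a cursor, in arbitrary order.
# Hypothetical values in feet, spaced 10 ft apart along a straight line.
centerline = [(30.0, 40.0), (0.0, 0.0), (18.0, 24.0),
              (6.0, 8.0), (12.0, 16.0), (24.0, 32.0)]

runway_end = (0.0, 0.0)  # assumed known starting end of the runway

def distance_from_end(point):
    """Planar distance from the starting runway end to the point."""
    return math.hypot(point[0] - runway_end[0], point[1] - runway_end[1])

# Sorting by distance from one end puts a straight centerline in order.
ordered = sorted(centerline, key=distance_from_end)

# Ordinary list operations then apply.
far_end = ordered.pop()  # remove and return the last (farthest) point
ordered.reverse()        # now runs from the far side back toward runway_end
```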

Likewise, I use Python dictionaries extensively (probably far more than some would approve of). I have to create sets of 3D unit vectors for each runway end at an airport, and I access these constantly within a script and do this in many of my scripts. I keep many other sets of regularly accessed data in dictionaries, too. Like lists, they are fast and flexible. Highly recommended.
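A minimal sketch of keeping unit vectors in a dictionary keyed by runway end might look like this. The runway designators, coordinates, and unit_vector helper are assumptions made for illustration, not the actual script:

```python
import math

def unit_vector(start, end):
    """Return the 3D unit vector pointing from start to end (both x, y, z tuples)."""
    dx, dy, dz = (e - s for s, e in zip(start, end))
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0:
        raise ValueError("start and end are the same point")
    return (dx / length, dy / length, dz / length)

# Hypothetical runway end coordinates (x, y, z).
end_09 = (0.0, 0.0, 100.0)
end_27 = (3000.0, 4000.0, 100.0)

# Keyed by runway end designator: computed once, then looked up as needed.
runway_vectors = {
    "09": unit_vector(end_09, end_27),
    "27": unit_vector(end_27, end_09),
}
```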

As far as classes go, like Blah238, I haven't found a need to create any. There are probably a few cases where a class would be preferred in my scripts, but I really haven't been able to identify those places. Someone with more programming experience would probably find them quickly.

I too love dictionaries - use 'em all the time. This method gets some spatial reference properties and stores them in a dict:

def get_coord_sys(self, in_dataset):
    """Get and return info on dataset coord sys/projection"""
    spatial_ref = arcpy.Describe(in_dataset).spatialReference
    # Get spatial ref props and put in dictionary
    spat_ref_dict = {}
    spat_ref_dict["name"] = spatial_ref.name
    spat_ref_dict["type"] = spatial_ref.type
    spat_ref_dict["gcs_code"] = spatial_ref.GCSCode
    spat_ref_dict["gcs_name"] = spatial_ref.GCSName
    spat_ref_dict["pcs_code"] = spatial_ref.PCSCode
    spat_ref_dict["pcs_name"] = spatial_ref.PCSName
    return spat_ref_dict

This method snippet extracts point geometries from two feature classes; I then use the geometries later on to do some trig:

def build_fields_of_view(self):
        """For all KOPs in a study area, build left, right, center FoV triangles"""
        try:    
            fcs = {os.path.join(self.gdb, "WindFarmArray"):[], os.path.join(self.gdb, "KOPs"):[]}
            # Build a dict of WTG and KOP array geometries, looks like:
            #  {'KOPs': [[1, -10049.2697098718, 10856.699451165374], 
            #            [2, 6690.4377855260946, 15602.12386816188]], 
            #   'WindFarmArray': [[1, 5834.9321158060666, 7909.3822339441513], 
            #                     [2, 6111.1759513214511, 7316.9684107396561]]}
            # items() works on both Python 2 and Python 3
            for k, v in fcs.items():
                rows = arcpy.SearchCursor(k, "", self.sr)
                for row in rows:
                    geom = row.shape
                    point = geom.getPart()
                    # 'oid' avoids shadowing the built-in id()
                    oid = row.getValue("OBJECTID")
                    v.append([oid, point.X, point.Y])

            kops = fcs[os.path.join(self.gdb, "KOPs")] # KOP array
            wtgs = fcs[os.path.join(self.gdb, "WindFarmArray")] # WTG array

A LOT of what I am currently working on involves extracting the coordinates and attributes from vector feature classes and rasters so the data can be pushed into another piece of software that doesn't even know what GIS data is. So, I use lists and dictionaries a lot for this.
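One way this kind of hand-off might look, sketched with the standard csv module and written for Python 3. The records, field names, and the choice of CSV as the exchange format are assumptions for illustration:

```python
import csv
import io

# Stand-in for records pulled from a feature class: (id, x, y, kind).
# Values are hypothetical.
records = [
    (1, 5834.93, 7909.38, "WTG"),
    (2, 6111.18, 7316.97, "WTG"),
]

# A list of dictionaries maps directly onto a delimited text file.
rows = [{"id": r[0], "x": r[1], "y": r[2], "kind": r[3]} for r in records]

# In-memory buffer for the sketch; open a real file with newline="" instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "x", "y", "kind"],
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```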
