There are a few requirements before you can make an Arcpy process suitable for use with Python’s Multiprocessing library:
- The most calculation-intensive (time-consuming) part of the code must be able to be made into a Python ‘module’ and parallelised, which will be described in the following posts.
- Once it is made into a module, there must be no issues with data access: each invocation of the module should either write to a different output database (Arc locks the entire *.gdb in use, not just the feature class being accessed) or pass its data back so that it is written only at a later stage.
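
The second option in the last requirement (workers pass data back, and a single process writes it later) can be sketched roughly as below. This is a minimal illustration, assuming Python 3 and using plain numbers in place of real features: `summarise_chunk`, `collect_results` and the sample `chunks` are hypothetical names invented for the sketch, and no Arcpy calls are made. In a real script the calculation-intensive Arcpy step would sit inside the worker function.

```python
import multiprocessing


def summarise_chunk(chunk):
    """Worker: compute a result for one chunk and return it.

    `chunk` is a hypothetical (chunk_id, values) pair standing in for a
    batch of features. The worker writes nothing to disk, so no
    geodatabase lock can be contended between processes.
    """
    chunk_id, values = chunk
    return chunk_id, sum(values)


def collect_results(pairs):
    """Parent-side step: gather the returned pairs into one dictionary."""
    return {chunk_id: total for chunk_id, total in pairs}


if __name__ == "__main__":
    # Hypothetical input: three chunks of weight values.
    chunks = [(1, [1.0, 2.0]), (2, [3.5]), (3, [])]
    with multiprocessing.Pool(processes=2) as pool:
        pairs = pool.map(summarise_chunk, chunks)
    # Only the parent process handles output, after all workers finish.
    print(collect_results(pairs))
```

The `if __name__ == "__main__":` guard matters on Windows, where each worker process re-imports the script; without it the pool would be created again in every worker.
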
To explain using Multiprocessing with Python, I will set out a hypothetical example. The objective of the example is to count, by type, all the Polygons within a certain distance of each Point feature, and to accumulate a weight value for them.
The Polygon feature class has ‘polyType’ and ‘polyWeight’ attributes. This is just a simplified example that I thought up for explaining Multiprocessing – sorry if there is a better method than what follows for actually doing this!
Method pseudo-code:
    # get variables from Arc
    # check all inputs are valid
    # for Polygon types:
    #     make feature layer of Polygons
    #     make feature layer of Points
    #     for rows in Points:
    #         get PointID
    #         select [the] Point row corresponding to PointID (only way I could find to make a row selection)
    #         select by location: Polygons within the search distance
    #         for Polygon rows (within the selection):
    #             store the sums of weighting and count to a Python dictionary by PointID and polygonType
    # for rows in Points:
    #     for Polygon types:
    #         access data from dictionary by PointID and Type
    #         write value to row
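
The dictionary bookkeeping in the pseudo-code can be tried out without Arc at all. The sketch below is only an approximation under stated assumptions: each Polygon is reduced to a single representative x/y coordinate, and a straight-line distance test stands in for the select-by-location step, so it does not reproduce what Arc would select for real polygon geometry. The function name `accumulate` and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def accumulate(points, polygons, search_distance):
    """Return {(point_id, poly_type): [count, weight_sum]}.

    points:   list of (point_id, x, y)
    polygons: list of (poly_type, poly_weight, x, y), where x/y is one
              representative coordinate per polygon (an assumption that
              replaces the real select-by-location step).
    """
    totals = defaultdict(lambda: [0, 0.0])
    for point_id, px, py in points:
        for poly_type, poly_weight, qx, qy in polygons:
            # Stand-in for "select by location within the search distance".
            if math.hypot(qx - px, qy - py) <= search_distance:
                entry = totals[(point_id, poly_type)]
                entry[0] += 1            # count of polygons of this type
                entry[1] += poly_weight  # accumulated polyWeight
    return dict(totals)


# Hypothetical sample data: two points, three polygons of types A and B.
points = [(1, 0.0, 0.0), (2, 10.0, 0.0)]
polygons = [("A", 2.0, 1.0, 0.0), ("A", 3.0, 0.0, 2.0), ("B", 1.5, 9.0, 0.0)]
result = accumulate(points, polygons, search_distance=3.0)

# Write-back step: look up every (PointID, type) pair, defaulting to zero.
for point_id, _, _ in points:
    for poly_type in ("A", "B"):
        count, weight = result.get((point_id, poly_type), [0, 0.0])
        print(point_id, poly_type, count, weight)
```

Keying the dictionary on the (PointID, type) pair keeps the accumulation step separate from the final write, which is the property the data-access requirement above relies on.
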
