October 31, 2006 / Steven Pousty

Geoprocessing error translation – the middle finger

So I run a buffer on 11,227 features, and of course I can’t choose dissolve in the buffer procedure, otherwise it crashes. I check the geometry several times to make sure it is clean, and it is fine (according to GP). My buffer completes, and then I decide to dissolve as a separate process. I am using a PGDB, so I compress it first, check the geometry one more time, and then fire off:

Executing (Dissolve_3): Dissolve crlf2mi C:\data\newModels\geodatabase\scratch.mdb\crlf2midiss # # MULTI_PART
Start Time: Mon Oct 30 13:25:39 2006
Sorting Geometries…
Dissolving Geometries…
Error in executing function
Failed to execute (Dissolve_3).
End Time: Mon Oct 30 20:08:30 2006 (Elapsed Time: 6 hours 42 minutes 51 seconds)

DO YOU SEE THAT TIME! I sat there without using my machine for most of the afternoon (because I don’t want to have someone tell me that I was using memory that GP needed). I left it running overnight and that is what greeted me this morning. I kept checking to make sure my PGDB was well below 2 gigs, and it never got close. Thank goodness I had other work to do yesterday afternoon on another computer, but I am PO’ed. Can someone please tell me how GP is the way forward? Oh yeah, and just for kicks, GP doesn’t release the memory it was consuming, so I am left with 1.7 gigs still in use.

And now what do I do? Export to coverage and use ArcInfo – I am sure that will work but can anybody else besides me see how lame that is?

OK and on that note I am off to try and actually get something to work…

9 Comments

  1. Kirk / Oct 31 2006 11:04 am

    Hey Steve –
    I’ve struggled with buffering and unioning overlays (dissolve basically does a union). Maybe try slightly generalizing the geometries (so there are no circular arcs) before doing the dissolve.

    Seems to be a lot more stable in 9.2 http://forums.esri.com/Thread.asp?c=93&f=1741&t=203842&mc=3#611503

    Kirk

  2. dylan / Oct 31 2006 11:25 am

    Hi Steve.

    How about doing the operations one at a time? i.e. buffer, dissolve, etc. ?

    My wife is currently taking a GIS course (more like an ESRI button-pushing course, but I digress) and this type of problem is always coming up. Doing the multi-part analysis in pieces always seems to do the trick. Or you might just try using PostGIS for such large operations…

    Cheers,

    Dylan

  3. Dave / Oct 31 2006 1:57 pm

    Steve,

    I’m right there with you on GP being a big fat scam. The overhead involved in wrapping everything up to a lovely GP object is huge, and it shows in the performance/stability. I’m not calling for a return to “Arc:” commandline, but if they can’t even come close to the same performance, maybe they should look at making some changes.

    ArcGIS is supposed to be the king kahuna of GIS packages and it takes 6 hours to deal with 11,000 features? What’s that – a whopping feature every 2 seconds (6 * 60 * 60 = 21,600 seconds; 21,600 / 11,000 features = ~2 seconds each).

    So what happens if I’m going to try to analyze a large dataset, say with 1,000,000 features? Assuming it finishes, you’re looking at 2 million seconds, or about 555 HOURS (roughly 23 days) of processing on Steve’s box. Nice.

    It would be interesting to get some performance feedback from doing things like this in PostGIS just for comparison. Or hell – maybe in NetTopologySuite (http://nts.sourceforge.net/)

    Dave
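
A rough sketch of the arithmetic in the comment above, using the feature count and the elapsed time from the log in the post. It assumes processing time scales linearly with feature count, which is only an assumption (the next comment questions it), and the function names are purely illustrative.

```python
# Back-of-envelope throughput check for the failed Dissolve run.
# Assumption: time scales linearly with the number of features.

SECONDS_PER_DAY = 24 * 60 * 60

# Elapsed time from the log: 6 hours 42 minutes 51 seconds.
elapsed_s = 6 * 3600 + 42 * 60 + 51
# Feature count from the post.
features = 11_227


def seconds_per_feature(elapsed: float, n_features: int) -> float:
    """Average wall-clock seconds spent per feature."""
    return elapsed / n_features


def extrapolate_days(n_features: int, rate_s: float) -> float:
    """Days needed for n_features at rate_s seconds per feature (linear model)."""
    return n_features * rate_s / SECONDS_PER_DAY


rate = seconds_per_feature(elapsed_s, features)

if __name__ == "__main__":
    print(f"{rate:.2f} seconds per feature")
    print(f"{extrapolate_days(1_000_000, rate):.1f} days for 1,000,000 features")
    # Same estimate with the rounded figure of 2 seconds per feature.
    print(f"{extrapolate_days(1_000_000, 2.0):.1f} days at 2 s per feature")
```

Under the linear assumption a million features works out to a few weeks rather than years; the real figure could be better or worse depending on how the dissolve actually scales.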

  4. dylan / Oct 31 2006 2:01 pm

    If any of you guys are on the PostGIS mailing list (I am), we should put up some benchmarks of common tasks. Since not everyone has access to the same type of system, they would be a bit difficult to compare – but at least it would be a starting point.

    Also, Dave, your calculations might be off if the GP functions do not scale linearly – or if there is some type of caching. One more reason for more data points.

    Steve, if you can post a subset of your data with the intent of the analysis, I will try and replicate it in PostGIS.

    Cheers,

  5. thesteve0 / Oct 31 2006 4:30 pm

    Just some quick notes. I can get buffer to run on over 11K features, but only if I turn off dissolve during the buffer operation. So I did the buffer and then ran the dissolve as a separate step.

    I don’t think I am going to go down the benchmarking route for fear of legal issues. I am sure somewhere in my EULA there is something saying I can’t run benchmarks and then publish them.
    I may do it internally and then just say which one we decided to go with, though I am so swamped right now I am not sure I have the time.

    And while I agree with Dave that 6 hours seems long to dissolve 11K features, the real bummer for me is that it took 6 hours to tell me it couldn’t do it. So who knows how long it would have taken to complete if it had worked. And then the error message gives me no hint as to how I should try to make it work. In a better GP world it would have done some sort of scan of the geometry and told me in 3 minutes or less that there was going to be a problem, and its error message should at least give me a better hint about what is wrong.
    Could you imagine if I had embedded this in a ModelBuilder model that had already run for several hours? How would you even begin to debug a model like that while still staying on budget or meeting deadlines?

  6. Ben Slater / Nov 1 2006 6:42 am

    I’ve found ArcView 3.x to be able to dissolve large datasets like this much faster and more reliably than ArcGIS 9.1. Have you tried exporting to a shapefile and running this analysis in old reliable 3.x?

  7. priour / Nov 1 2006 3:36 pm

    Steve,
    I am actively testing Manifold System 7 GIS on my computer. I’ve been UNIMPRESSED with the results. I’m pretty much just testing map display & cartographic capabilities with a ~350,000 feature dataset & 256 NAIP images. To date it is HORRIBLE and crashes nearly every time. UDig, QGIS, & ArcGIS 8.3 all kick its ass. I’m gonna post on all this sometime soon, but I’ve been totally swamped.

    Anyway, I’d love to test Manifold’s geoprocessing functions. Is that where they get their superiority complex from? Cause it ain’t from the display abilities!

    If you can send me the dataset, I’ll happily test it and report the relative performance on Manifold. I think it would be great to see ArcGIS, ArcView, Manifold, & PostGIS compared on a decent sized dataset.
    Matt

  8. Jeff / Nov 2 2006 3:59 am

    Hi!

    If you have the possibility, try doing this kind of operation in FME (or in the Data Interoperability Extension for ArcGIS, which is actually an ESRI-branded version of FME).

    It is designed for performance and it should buffer and dissolve your features in no time!

    Jeff

  9. tom / Feb 14 2007 7:09 am

    FME will do the trick… we had a similar something going where it looked like ESRI was going to run for about 30 hours on a top end workstation…… took 1 minute 37 seconds using FME workbench……..

    I just kind of went weeeeahhhh……!

    Lol best of luck… you can always try their eval version…..
