Steve’s list of improvements to geodatabases
I know that ESRI appreciates customer feedback and so this post is written in that vein. I have spent some time now wrestling with trying to put all the data for a project into a geodatabase, both my spatial and non-spatial data. In the end I decided this was not a good idea given the smaller size of the project, the requirements of the project, and the machinations I would have been forced to use.
I could have done a bunch of what I wanted/expected with a class extension but the trade-off in using them was too much. It wasn’t the writing of the code that was the problem, since there is already a sample which does 50% of what I want and I could see where to put the rest. The real problem is that if the dll isn’t installed there is nothing, and I mean nothing, that ArcObjects can do with the dataset (I tried a bunch of different ways to look at the data and they all failed). And locking up a dataset like that freaks me out. I know the personal geodb is locked up but as long as someone goes and buys ArcView they can look at the data. If I add the class extension, then if, for some strange and unforeseen reason, the dll goes missing, there is no way anyone else can look at the data.
Using application logic wasn’t satisfactory either because there was nothing to stop users from opening ArcMap and editing away. One reason I use a database is that I would like to enforce some rules about the data no matter how the data is accessed.
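To make "rules enforced no matter how the data is accessed" concrete, here is a minimal sketch using a plain SQLite database through Python's `sqlite3` module. SQLite is only a stand-in here, not a geodatabase, and the `parcel` table, its columns, and the `try_insert` helper are hypothetical names made up for illustration. The point is that the rule lives in the table definition, so any client that writes to the table hits it.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption: not a geodatabase).
conn = sqlite3.connect(":memory:")

# Hypothetical table: the rule about allowed land-use values is part of the schema.
conn.execute(
    """
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        land_use  TEXT NOT NULL
                  CHECK (land_use IN ('residential', 'commercial', 'open'))
    )
    """
)


def try_insert(conn, land_use):
    """Return True if the database accepted the row, False if it rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO parcel (land_use) VALUES (?)", (land_use,))
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when a CHECK or NOT NULL constraint fails.
        return False


accepted = try_insert(conn, "residential")   # satisfies the CHECK constraint
rejected = try_insert(conn, "swamp castle")  # violates the CHECK constraint
row_count = conn.execute("SELECT COUNT(*) FROM parcel").fetchone()[0]
```

Nothing in `try_insert` knows what the rule is. A different tool writing to the same table would be stopped the same way, which is the behavior that application logic alone cannot give.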
So here is an abbreviated list of some of the improvements I would like to see ESRI do for geodatabases.
1. Ability to generate a primary key. There is no built-in geodb type that I can use to autogenerate a primary key. As I mentioned before you cannot use ObjectID. If you look at most of the models that are shared on the ESRI site they use text keys, usually because there is some key already generated for the item, such as a FIPS code. Depending on the RDBMS used, you can go in after the geodb is generated and assign an autogeneration column type by hand. The problem with personal geodbs is that Access only allows 1 autoincrementing column per table, which is taken by ObjectID.
2. Ability to make a unique index on one of my columns and still be able to edit the data. This would help with the item above, since I could have the users enter text or something else and then ensure that it is unique.
3. Make up your mind as to whether or not we should use UML. If you look on this page you will notice that ESRI provides its models in UML and says that this can be used as a tool, but then later on the page they say that UML is not really recommended. If UML is not the right tool, which I partially agree with since making the UML models for geodbs is rather tedious, then what should I use for modelling? For anything past a few tables and feature classes I want to use a graphical tool. I need that picture to help visualize the relationships and I need it to help communicate with other developers and users. You know it is not that big a deal to me that I can’t model a topology. What tool should I use?
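For items 1 and 2, this is roughly the behavior being asked for, sketched against plain SQLite via Python's `sqlite3` module rather than against a geodatabase. The `site` table, its columns, the index name, and the `add_site` helper are all hypothetical; the sketch only shows a database-generated key sitting next to a user-entered column that the database keeps unique while the table stays editable.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption: not a geodatabase).
conn = sqlite3.connect(":memory:")

conn.execute(
    """
    CREATE TABLE site (
        site_id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated by the database
        site_code TEXT NOT NULL                       -- typed in by the user
    )
    """
)
# Item 2: a unique index on a user-editable column.
conn.execute("CREATE UNIQUE INDEX site_code_uq ON site (site_code)")


def add_site(conn, site_code):
    """Insert a site and return its generated key, or None for a duplicate code."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO site (site_code) VALUES (?)", (site_code,)
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # The unique index rejected a code that already exists.
        return None


first = add_site(conn, "A-100")      # key generated by the database
second = add_site(conn, "B-200")     # next generated key
duplicate = add_site(conn, "A-100")  # rejected by the unique index
```

The user never supplies `site_id`, and the duplicate `site_code` is refused by the index rather than by whatever application happened to do the edit.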
Which I think leads to the crux of the problem – the way ESRI has implemented object relational is not working for me. They have basically taken the path of the grape – and so there are three options, none of which look awfully likely.
1) Go full on object – be up front about it and take the hit. State that the DB is just an object store and people shouldn’t mess around with it. Using this route it will be clear that the only way into the geodatastore is through ArcObjects and we shouldn’t spend time or effort trying to go down any other route. This would probably cause a lot of customers, especially the bigger ones who have a large investment in RDBMSs, to complain loudly. For now, the file geodatabase is an answer to this request. There is no way into a file geodb without ArcObjects. I am also sure this is why there will eventually be ODBC and other interfaces to the file geodb, which will bring us back to the current state.
2) Go much closer to the relational route: stop storing domains as blobs, give me primary keys, timestamps, let me design with ER diagrams… This has the benefit of giving people a technology that is familiar to them, and they would also have better tools to work with than the geodatabase offers. This would be a lot of work on ESRI’s part as good chunks of the object model would have to be rewritten and there would be much less business logic in ArcObjects.
3) Do a better job of supporting both sides of the object relational equation. Build a diagramming tool that will help professionals model their geodb in a way that makes sense for geodbs. Add into the default ArcObjects a class extension that makes primary keys and timestamp datatypes. It would also help if there was better organization of the documentation and samples (or even a better search). I haven’t thought of all the things I would need but it seems like the relational side needs the most help.
Since none of these options are going to happen in the near future I guess I will maintain separate data stores and probably try to keep things out of the geodb unless I absolutely need them in there.
My final list for this post is:
1) I am hoping that some of my issues will be pointed out as stupid by those with more experience or knowledge of geodbs. I will probably argue back for a bit and then be thankful that someone showed me that it does not have to be as bad as my experience.
2) I am sure the geodb is working fine for quite a lot of people, so these are things which just made my work life ugly for the last week or so. I actually see how this could work for larger shops with larger IT staffs, or for smaller shops who have more programmers on staff.