Wednesday, November 12, 2008

A subjective review of database modeling software

Background

I maintain and continue the development of an administrative database for the small college where I work. As part of a proposal for a Monbukagakusho grant (Japan's national department of education and science, among a few other things), I am proposing to have some outside developers help me to rewrite our db app in Ruby on Rails. I chose ROR because of the overwhelming community of Rubyists here in Kyushu, and I chose to hire developers because I will be learning Ruby from scratch and need to get this done fast.

The developers told me that they needed something more detailed than the .dot file (Graphviz) and the HTML documentation that is produced by our current framework, ERW. They suggested ERWin, which is over 100,000 yen (one thousand dollars) even with an educational discount. So I asked http://www.experts-exchange.com/ and combed the search engines for recommendations. I tested quite a lot of software, and these are my disorganized findings and recommendations.

Criteria

I have a large database with lots of unused tables and some design problems that need to be fixed. I need software that can accurately reverse engineer my current database so that I don't have to do everything by hand. I also need good printing capabilities.

My db is running on Postgresql7.3 on an intranet Apache RHEL3 server. I tried quite a few programs and lost a lot of time trying to make things work with my database in its current version. Support personnel from MicroOLAP and from ModelRight told me flat out that most software won't support reverse engineering from Postgresql 7.3 and that the ODBC drivers for 7.3 are not very good. They suggested that I pg_dump the db structure and install on Postgresql8.2 or 8.3, so I set up EnterpriseDB's one-click Postgresql package on my Windows VM and then tried them all again.

Findings

Toad Modeler (http://www.toadsoft.com/toaddm/toad_data_modeler.htm) was my first test. It does not support Postgresql 7.3, so I could not try it after I first installed it. I found a download for its previous version, x-casestudio, which does support 7.3. However, the company wouldn't give me a trial key, so I couldn't try that out, either. After I finally got the database duplicated on Postgresql8.3, the Toad trial period (less than the 30 days offer by other software) had ended, so I never got to try Toad. Toad does seem to have a lot of documentation online and a community, including screencasts, so it looks like it would be worth looking into.

I spent a long time trying MicroOLAP Database Designer for Postgresql (http://www.microolap.com/products/database/postgresql-designer/). I experimented and sent in bug reports to them as I tried to get it to work with my Pg7.3 db. Even then, it engineered nicely but crashed when I tried to save. When I finally installed Pg8.3 and tried it, I found it to be responsive, light, and graphically very easy to look at. It supports Postgresql natively and correctly reverse engineered everything. The support staff was very good, too, and the president of the company even responded to an email with advice and a coupon code for half price (making it about 70 USD). The relationship lines are drawn really well, by which I mean that when you move a table around, the links move with it, and they never bunch up or overlap, and lines automatically bend at right angles at places and go diagonal at others. The drawback to MicroOLAP was the lack of an explicit cardinality determination on the relationship properties. It is not obvious how to declare two tables in a IS_A relationship, which is 1:1, for example. I assume that it is done by setting the Null and Unique properties on the foreign keys in the child table, but that would be better as a property of the relationship. For traditional ERD with Postgresql, this was my favorite.

I installed and used HappyFish (http://www.polderij.nl/happyfish/) for a while. It is really, really easy to use, has great printing capabilities, and it offers all the features you need. The UI is intuitive and snappy, and support for Happy Fish was really good, too. The online docs are a little lacking, but the condemning factor for me was the fact that the relationship lines are straight and bend at right angles. They also snap to the grid, so a diagram of any complexity ends up having overlapping relationship lines and becomes hard to understand. Other than that, great software, and it pulled my db out through ODBC even with Pg7.3 pretty well.

I tried ConceptDraw (http://www.conceptdraw.com/en/products/cd5/main.php) and was happy to see that it is Mac-PC, so I installed it on the Mac but couldn't find the macro that is supposed to reverse engineer databases. ConceptDraw support informed me that this feature is only available on Windows, so I installed it on my Windows VM and tried again. It took a very long time to reverse engineer, and when it was done, there were no relationships. The other drawback to ConceptDraw is that the ERD it makes is just a diagram, without detailed properties. But I must say that if you are not reverse engineering and not working with a really, really large and complex database, ConceptDraw is a very good buy because it is very useful for other illustrations and diagrams. I still might consider ConceptDraw because one purchase would give me a great program for documenting our campus network, organizational flowcharts, and team processes for us in grant proposals like the one I am working on now. Fabulous website and really good tutorials and screencasts.

ModelRight (http://www.modelright.com/) is really good, in many ways. They have a Community edition with a smaller feature set for free, and they have a free read-only version that you can give to clients and contractors who don't own ModelRight so that they can inspect your models. I love this idea and hope that others emulate it. The view options and the customizations are great, but what I liked most of all was that ModelRight has a really active online community with forums and tutorials, so you will get prompt support and advice. It is a really good deal. However, ModelRight also failed to reverse engineer my relationships, though the soon-to-be-released version 3.5 (support said it will take a few more months) Postgresql version of ModelRight will probably do this better. For generic db ERD, ModelRight would be a very good buy. With a proactive support crew, a great online community, and a very feature-rich and easy-to-use interface, I think 3.5 will be my favorite ERD tool, but I cannot wait a few months, so the search goes on.

ERCreator (http://www.modelcreator.com/) has a lot of users, and it seemed pretty good on the website, but when I tried to reverse engineer from the 7.3 and then from the 8.3, it just froze. I had to use Process Explorer to kill it. Sorry that I cannot evaluate it, but it choked and died three times on me. I don't think I have done it justice, so do try it yourself.

A member of experts-exchange.com named "thewild" suggested Enterprise Architect (http://www.sparxsystems.com.au/), so I gave it a shot. It offers the features that other systems offer, but it is also a full-featured UML 2.1 modeling program. There are many versions, each of which is offered with educational discount for as low as 60USD. Another plus is that Sparxsystems offers EALite, which, like the lite version of ModelRight, is a free read-only program that allows clients and contractors to inspect your models.

EA reverse-engineered my db and the relationships just fine, but the resulting diagram was not an ERD, and I am ashamed to say that intimidated me. I studied database design formally and have read two different editions of Date's Intro to Database Systems and enjoyed it both times. I honestly enjoy relational databases, Venn diagrams, and set theory. I really like Chen ERDs, and I feel so confident with them that I sketch them free-hand when talking about domain issues with users. However, EA did not produce Chen but UML Data Modeling Profile (UML-DMP).

The EA help system states that EA implements the UML Data Modeling Profile, which is not yet ratified as UML but is widely used and pretty much is closer than any other UML database modeling framework to becoming an official part of UML, and UML itself is on the way to being an ISO standard. But I am in a hurry to get this done, and I don't know UML at all, so I was even more intimidate. I could not figure out what all these properties options were, and I didn't want to make a fool of myself. Because I am in a hurry to finish this diagram and get a formal estimate, I put down EA and went back to MicroOLAP.

But the issue stuck with me, and I read the EA Help system's explanation of the UML-DMP (http://www.sparxsystems.com.au/resources/uml_datamodel.html), and then I looked it up online and found a couple of informative articles.

Donald Golden at http://findarticles.com/p/articles/mi_qa4041/is_200604/ai_n17186203/pg_1?tag=artBody;col1 describes why Cleveland State University made UML their modeling tool for all information science classes, including database design. CSU students use UML in database design, software design, and networking classes, which means that they can construct a UML project that includes all aspects of a complete system and describes how the parts interact.

Fred Lewis at http://wescosoftware.com/papers/umldbase.html goes into the nuts and bolts of how UML replaces the ERD and expands on the information that can be modeled.

Conclusion

While ModelRight has the most features, it does not yet support Postgresql natively and did not successfully reverse engineer my db, leaving MicroOLAP the best candidate among the ERD tools. However, Enterprise Architect (EA) is not an ERD tool, so I was actually faced with a choice between ERD MicroOLAP and UML EA. ERD tools, and in particular Postgresql-specific tools like MicroOLAP can help you maintain and administer a database through the program, generating and applying SQL modification code through the diagramming software itself. This is something that EA does not offer. However, EA opens the door to the rewarding (if labor-intensive) acquisition of a new skill set and the possibility of applying UML to other parts of your work.

So I am going to go with Enterprise Architect, but I should make clear the misgivings I have about it:
  1. Printing: The printing system is not as easy to use as the ERD tools. Most of them have a page boundary layer on the screen that lets you see how many pages the diagram will take and even arrange the tables so that they don't break across pages. ModelRight, HappyFish, and MicroOLAP do this really well. EA's printing system allows you to specify how many pages wide and high of what size page to fit the diagram into, but you cannot prevent graphics being on the edges between pages.
  2. Speed: The software is slow. I am using an XP VM on Parallels under Mac, but I have given the VM 64 MB of video ram and 2GB of RAM, and EA is still jerky and slow. Where MicroOLAP and ModelRight tables glide across the screen under your cursor, EA lets you move a blank frame and then the table jumps into place; also, moving a table to the edge of the screen to scroll to an unused piece of drawing space takes a long time. Actually, all scrolling and panning takes too long and is jerky. Also, after clicking a class or attribute in the navigation panes there is a little lag before something happens. I would love to see EA speed up.
  3. Navigation: Zoom and span capability is a little lacking. The span and zoom window is fine, but you have to move the cursor over there and use the slide bar or move the frame in the window. Other tools let me use the mouse wheel to scroll up and down (no modifying key), left and right (when holding shift), and zoom in and out (when holding Ctrl), which is much more efficient and faster.
These issues aside, I am choosing and recommending EA because, for a very small fraction of the cost of Rational Rose or other enterprise-scale program, you will get a fully UML2.1-ready UML diagramming tool that also does the new Data Modeling Profile and can reverse-engineer existing databases and generate SQL to create databases. The price list at http://www.sparxsystems.com.au/products/ea/purchase.html lists the Desktop version at 135 USD, and the educational pricelist at http://www.sparxsystems.com/products/academic_pricing.html lists it at 65 USD. Also, choosing EA for database diagrams means that I will have to learn something new in order to use it. That is a burden but also a plus, because in addition to enlarging my skill set, this opens the door to using UML for other things, like network diagrams and organizational process analysis.

I finally chose to get into using UML for database modeling as a means to develop professionally and satisfy my interest in UML, but this might also mean getting in early on the next evolution in database modeling and design.