Object-Oriented Databases

Very hot 10-12 years ago.
* Big "buzz"
* Several start-ups
* Tons of DB research
No longer mainstream
* Object-relational databases fought back

Major motivations

Relax data model limitations
* Identity, encapsulation, inheritance
* Composite objects (w/sharing)
* Versions/configurations (& long xacts)
New language features

Language Motivation

Shrink the "impedance mismatch" problem for application programmers
* DB vs. PL type systems
* Declarative vs. procedural programming
* Set-at-a-time vs. instance-at-a-time compilation
Computationally complex methods (e.g. C++)
Complex object "queries"

The OODBMS Manifesto (Atkinson/Bancilhon/DeWitt/Dittrich/Maier/Zdonik)

MUST have:
* Complex objects (tuples, sets, bags, arrays + constructors & ops)
* Object identity
* Encapsulation (ADTs / info hiding / implementation vs. interface)
* Class/type hierarchies (inheritance, substitution for specialization)
* Late binding (polymorphism, "virtual" functions in C++ terms)
* Computational completeness (methods)
* Extensibility (system & user types are the same)
* Persistence (orthogonal to type)
* Secondary storage (large DBs)
* Concurrency control
* Recovery
* Ad hoc query facility (declarative, optimized)
MAY additionally have:
* Multiple inheritance
* Type checking
* Distribution (client/server)
* Long transactions
* Version management

ObjectStore

By far the most successful OODBMS vendor
Still in business!!
Picked C++, and added
* persistence
* collections
* queries
* constraints, versions, transactions, ...

Pointer Swizzling

Objects move between (multiple locations in) memory and disk.
How to manage pointers to such objects?
One way is to always do an extra look-up.
ObjectStore technique is:
* Carefully map (relevant parts of the) DB into Virtual Memory.
* Use VM protection mechanism to fault in pages
* "Swizzle" pointers upon each page load.
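The swizzling steps above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not ObjectStore's implementation: Python has no raw pointers or VM protection, so an "unloaded" object shell stands in for a reserved but protected virtual-memory page, and attribute access on that shell stands in for the page fault. The names `Oid`, `Node`, `PageCache`, and `DISK` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oid:
    """On-disk reference: (page number, slot within the page)."""
    page: int
    slot: int


class Node:
    """In-memory shell for one stored object (a reserved 'address')."""

    def __init__(self, cache, page):
        self._cache = cache
        self._page = page

    def __getattr__(self, name):
        # Only reached when `name` is missing, i.e. the page is not loaded.
        # This plays the role of the VM protection fault.
        if name.startswith("_"):
            raise AttributeError(name)
        self._cache.fault_in(self._page)
        try:
            return self.__dict__[name]
        except KeyError:
            raise AttributeError(name) from None


class PageCache:
    def __init__(self, disk):
        self.disk = disk
        self.frames = {}     # page -> list of Node shells (reserved slots)
        self.loaded = set()  # pages whose contents have been read
        self.faults = 0

    def _reserve(self, page):
        """Reserve in-memory slots for a page without reading its contents."""
        if page not in self.frames:
            self.frames[page] = [Node(self, page) for _ in self.disk[page]]
        return self.frames[page]

    def fault_in(self, page):
        """Read one page and swizzle every reference stored on it."""
        if page in self.loaded:
            return
        self.faults += 1
        for node, (value, ref) in zip(self._reserve(page), self.disk[page]):
            # Swizzle: replace the on-disk Oid with a direct reference to
            # the target's shell; the target page itself is not read yet.
            target = None if ref is None else self._reserve(ref.page)[ref.slot]
            node.__dict__.update(value=value, next=target)
        self.loaded.add(page)

    def root(self, oid):
        return self._reserve(oid.page)[oid.slot]


# Hypothetical "disk": page number -> list of (value, next Oid or None).
DISK = {
    0: [("a", Oid(1, 0))],
    1: [("b", Oid(1, 1)), ("c", None)],
}

cache = PageCache(DISK)
head = cache.root(Oid(0, 0))

values = []
node = head
while node is not None:      # plain reference chasing, no per-hop look-up
    values.append(node.value)
    node = node.next
```

After a page has been faulted in, following `next` is an ordinary reference dereference; the extra look-up from OID to location happens once per page load rather than once per access, which is the trade-off the slide contrasts with the "always do an extra look-up" approach.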

Object-Relational (OR) Motivation

Need to handle non-traditional data
* Structured, complex objects
* New types
* User-defined types and methods
Competition from OO systems

Challenge

Relational systems were designed for a well-specified relational model with a small number of operators.
How to accommodate a wide variety of operators and methods?
* Safety
* Optimization

ADT in Ingres

Define a standard interface for an access method
Implement this interface appropriately for each new type.
* Open(relation-name)/Close(descriptor)
* get-first(descriptor, OPR, value)/get-next()
* insert(descriptor, tuple)/delete()/replace()
* etc.
Store info. on operators in special tables.
Worry a lot about logical logging and smart locking. But do nothing?
Safety versus efficiency managed by having two modes to run user-supplied code
* safe mode, in different address space, for development and debugging
* efficient mode, in same address space, for normal operation.

Ingres Query Opt.

Need to know size and selectivity estimates, which are stored in the operator info. table.
Can use indices if operator is of suitable type.
Makes it look easy.
Methods may be more complex than simple operators.
Cost estimation in general is very hard, and is the big obstacle to query optimization.

BLOB

"Binary Large Object"
Used as an attribute value
Database faithfully stores uninterpreted bytes
Application is responsible for all operations on the BLOB
In particular, no indexing is available.

Large Objects

What to do if an object is larger than a page?
Use object as the granule for buffering, I/O, locking, etc.
Buffer manager has to allocate contiguous space in multiple pages
* use clever techniques to minimize copying
Logging, locking, etc. make (small) changes.

"Post-Relational" Systems

Starburst (IBM Almaden)
* Next big project after R*
* Ultimately finding its way into DB2.
Exodus (Wisconsin)
* Not a complete database
* Used widely as a DB "toolkit"
* Solid base for many research prototypes.
* Particularly OO.
* New version called "Shore"

Postgres

Research project at Berkeley.
Originally written in LISP, later re-implemented in C.
Packed with many new ideas
Productized as Illustra/Informix.
Freeware version available from http://www.postgresql.org

Postgres Data Model

Co-opt the OO terminology
* class = relation
* instance = tuple
* object-id = tuple-id
* method = attribute or function of attributes
Support class (multiple) inheritance.
New base types can be defined.
Collection types are allowed.
* Arrays, class extents, and sets.
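
The terminology mapping above can be made concrete with a small analogy in Python. This is a sketch for illustration only, not Postgres syntax or its implementation: the `Relation` base class, the `extent()` helper, and the example classes are all hypothetical. Each class keeps an extent of its instances (class = relation, instance = tuple), every instance gets an `oid`, a method is a function of attributes, and querying a class's extent also returns instances of its subclasses, including under multiple inheritance.

```python
import itertools

_next_oid = itertools.count(1)  # hypothetical object-id generator


class Relation:
    """Base for 'classes'; each subclass stores its own list of tuples."""

    _own = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._own = []  # tuples stored directly in this class

    def __init__(self, **attrs):
        self.oid = next(_next_oid)      # object-id = tuple-id
        self.__dict__.update(attrs)     # attributes of the tuple
        type(self)._own.append(self)

    @classmethod
    def extent(cls):
        """All instances of cls and of its subclasses, without duplicates."""
        seen = {t.oid: t for t in cls._own}
        for sub in cls.__subclasses__():
            seen.update((t.oid, t) for t in sub.extent())
        return sorted(seen.values(), key=lambda t: t.oid)


class Person(Relation):
    def initials(self):
        # "method = function of attributes"
        return "".join(word[0] for word in self.name.split())


class Student(Person):
    pass


class Employee(Person):
    pass


class TeachingAssistant(Student, Employee):  # multiple inheritance
    pass


# `phones` is an array-valued attribute (a collection type).
p = Person(name="Ada Lovelace", phones=["555-0100"])
s = Student(name="Bob Stone", phones=[])
ta = TeachingAssistant(name="Cy Young", phones=["555-0101", "555-0102"])

names = [t.name for t in Person.extent()]
```

Here `Person.extent()` reaches the teaching assistant through both `Student` and `Employee`, so the sketch deduplicates by `oid`; `Employee.extent()` contains only the teaching assistant, since no plain `Employee` was created.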