Revolution Newsletter

Articles

Other News

A Technical Introduction to Valentina by Ruslan Zasukhin, VP Engineering & New Technology Paradigma Software

There are many, many database systems available for developers - so it gives developers a lot of latitude to choose one that meets their needs in terms of cost, special features and specialty applications around them. Yet so many of them are virtually the same internally and because of that, it's rare to see any real innovation in databases - especially at the engine level. That's where Valentina is different. As a developer, you need to look underneath the surface of a pretty interface to make your application more responsive, useful, innovative and even scalable. The database that forms a critical piece of your solution should add value to make it more responsive, useful, innovative and scalable - and modern. This technical introduction to Valentina examines moving from conception to implementation of a database system that works with an address book application, and how following the Valentina way yields huge gains in performance and scalability. Examples rely on SQL so that they are more portable. Part I: Building Your Logical Database Structure When you start design your database structure, first of all you need to determine what real world entities you need describe in your database. Usually it is a good idea that each entity is mirrored by a separate database table. Write down all such entities and their properties - sometimes a structure suggests itself if you see it diagramed out. This gives you what is known as a logical database structure. Let's look at a very simple database consisting of two tables Person and Phone - which mirror two real world entities. Here is a first step example: Person { FirstName, LastName, BirthDate } Phone { Kind, AreaCode, Number } Second step in design of database structure is to determinate relationships between entities and the kind of relationships they can have. In database terms, these include: ONE to ONE, ONE to MANY or MANY to MANY. It is very easy actually, enough to answer the question: how many entities of object A can be connected to object B? For our example: How many Phones can a Person have? The answer is MANY - work, home, mobile, fax and more. How many Persons can be associated with a Phone? The answer is MANY. Think about general phone numbers at specific office locations of companies, where everyone is on an exchange. You might want to be able to look up all known contacts at a particular company by phone number. These entities then, Person and Phone, must be linked together in a many to many relationship. In database notation, this is referred to as M:M. Part II: Building Physical Database Structure After you defined all entities (objects), their properties and all relations (links), you have the logical database structure - the logical way to connect all data. But it doesn't stop there, because next we have to develop a physical database structure or implementation. There are many existing database models - Hierarchical, Network, Relational, Object-Relational and Object-Oriented. A DBMS usually is based on only one model - but if you understand the strengths of the model that you are using, you can maximize performance from the very beginning of development. Now that we know the logical database structure, we need to transform it to the physical world: this is the process of expressing objects, properties and links of the logical model into database structures. The Relational model is familiar to anyone who has been involved in database development of almost any sort - Access and Filemaker are both relational databases. Valentina supports the Relational model, but also the Extended Network and Object-Relational models. Why? It's all about performance as you scale your solution. If you take what may first appear to be the easiest and simplest route based on pre-conceptions you have from creating a database with a product like FileMaker, you may end up building in some severe limitations into your software. Let's compare the above simple example with two tables to be expressed in a relational model and by comparison, the unique object-relational model of Valentina. We will show how much easier and natural is work with the modern OR model of Valentina. Building the Database the Relational Way The relational model suggests using tables which have columns and rows. If we did this the relational way, we could create this database with two tables in SQL: CREATE TABLE Person( FirstName VarChar(128), LastName VarChar(128), BirthDate Date ) CREATE TABLE Phone( Kind BYTE, AreaCode INTEGER, Number INTEGER ) Next, we link records of the Person table to records of Phone table. The relational model uses an additional table to contain links. So we need to add KEY fields into our tables at this point. A KEY field is a field that has a unique, NOT NULL value in the table scope. Note: Relational key field links use a "pointer by value" method; in other words such fields will keep the same value as a KEY field of the table to which they point. So how do you pick a unique key for the Person table? Social security number? But what if you are not in USA? A passport number? This is not such an easy task given that there really isn't a universally common way to track people. Some DBMS offer an auto-increment integer field - it assigns a unique value into this field when you insert new records into the table - usually just adding a "1" to differentiate it from the last record. So lets correct our tables in SQL and add in a primary key: CREATE TABLE Person( Person_ID INTEGER PRIMARY KEY AUTOINCREMENT, FirstName VarChar(128), LastName VarChar(128), BirthDate Date ) CREATE TABLE Phone( Phone_ID INTEGER PRIMARY KEY AUTOINCREMENT, Kind BYTE, AreaCode INTEGER, Number INTEGER ) This is a shame. The beauty of our logical structure is beginning to fade- our tables now have some "garbage" fields and values inserted into them in a rather illogical way by having these keys. My web developer friends liken this to old style HTML which muddles up display logic with content versus nice clean content modified by external CSS. Now lets return to the M:M linking. We still have not solved this task, but only made some initial steps. Now that we have ID fields in both tables, we can apply a classic Relational Model solution to get a M:M link between two tables: we create a third helper table with two fields: CREATE TABLE PersonPhone( Person_ptr INTEGER INDEXED, Phone_ptr INTEGER INDEXED ) This table will contain information about links of Person records to Phone records. For example if Person with ID = 1 is linked to Phones with ID 2, 5, 77, this table will have three records: 1 2 1 5 1 77 That is, every conceivable link between combinations must be taken into account. As your database grows in size, this table can grow enormously. Building the Database the Valentina Way Now let's solve the same task using the Valentina Object-Relational model. We are still using SQL to generate our database. Valentina does not require the creation of KEY fields - you can avoid the headache of filling in unique values. Each Valentina Table has a special RecID field, which provides a unique number for a record of this table. Taking this into account we can make our tables as simple as: CREATE TABLE Person( FirstName VarChar(128), LastName VarChar(128), BirthDate Date ) CREATE TABLE Phone( Kind BYTE, AreaCode INTEGER, Number INTEGER )To establish the all important M:M link between these two tables, Valentina offers a very special feature: the BinaryLink. CREATE BINARY LINK PersonPhone(Person, Phone) AS MANY TO MANY And there it is. We created links but we haven't introduced any garbage fields into our tables - and remarkably, it means our physical structure more closely resembles our logical structure. The Benefits of the Valentina Object-Relational Way Okay, in the relational model we have additional tables and in OR model we also have an additional object in schema - this BinaryLink. Is this some kind of naming game? Not really, and this is where you see the benefits. Cleaner Schema. Using BinaryLinks, one Valentina developer told us that they were able to reduce an 800 table solution (using a traditional relational database) to about 500 tables. Your needs may not be this complex - today. Easier Transformation. The process of transforming your logical model into a physical implementation is much more straight forward and it follows conventions of separating what should be logic from content. Smaller Consumption of Resources Versus Primary Keys. RecID fields do not consume any disk space. Compare this with a primary key with associated indexes. Cut Fat Indexes in Half. At worst, a BinaryLink consumes less than half the disk space of adding a third table with indexes. JOINs are Faster and Easier. When BinaryLinks are used to JOIN two tables, they are at a minimum, 8 times faster than using a third table. SQL queries with JOIN are also much simpler as a result. Compare SQL queries you would have to write with JOIN for MS SQL Server or mySQL: SELECT * FROM Person JOIN PersonPhone ON Person.id = PersonPhone.person_ptr JOIN Phone ON PersonPhone.phone_ptr = Phone.phone_id If designed the Valentina way, you would execute the same query with Valentina: SELECT * FROM Person JOIN Phone If you designed your database using the relational way with SELECT * FROM Person JOIN PersonPhone JOIN Phone Your next steps would be to establish in your application logic how to connect the tables, that are then written into your Valentina database.