Separate Reading, Processing, and Writing

If an engine reads all the data prior to processing, then the reads can be optimized. The query can be structured to grab exactly the required data, utilize the best indexes, and avoid querying the same data multiple times. Similarly, separating out the write ensures the write is done in the best possible fashion.

Model Read or SQL

Engines as a rule should simply avoid the model or DAO read. The only time these perform well is if there is a cache, and the cache hits the majority of the time. Otherwise, the overhead the Platform adds to the read is just too great for an engine. Primarily this overhead comes from the permission checks the Platform guarantees. That is valuable for controlling a user, but an engine, on the other hand, is developed by ONE and can be trusted to obey the rules.

The SQL read should be done as a prepared statement IF it is to be repeated hundreds or thousands of times. But even if a query is only run once, the prepared statement offers natural protection against any form of SQL injection. A very low probability for an engine – but a simple safeguard.
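A minimal JDBC sketch of such a read. The SITE table, its columns, and the readSites method are hypothetical, purely for illustration:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Read all sites for one enterprise in a single prepared-statement query.
    static void readSites(Connection conn, long enterpriseId) throws SQLException {
        String sql = "SELECT SITE_ID, SITE_NAME FROM SITE WHERE ENTERPRISE_ID = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, enterpriseId);  // bind variable: parsed once, immune to injection
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    long siteId = rs.getLong("SITE_ID");
                    String siteName = rs.getString("SITE_NAME");
                    // ... load into the engine's in-memory structures
                }
            }
        }
    }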

When reading master data, there is often a desire to do a join between objects such as sites and site lanes. This is acceptable, but the performance implication must be understood: it forces the database server to do the join. Database time is gold; application server time is cheap. Sometimes a couple of simple, fast queries, relying on the app server to join the data, is better. No clear-cut rule here – but something to consider.
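A sketch of the app-server alternative, again with hypothetical SITE and SITE_LANE tables: two cheap single-table queries, with a HashMap doing the join in the middle tier instead of on the database:

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Two cheap single-table queries; the "join" happens in app-server memory.
    static Map<String, List<String>> lanesBySiteName(Connection conn) throws SQLException {
        Map<Long, String> siteNames = new HashMap<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT SITE_ID, SITE_NAME FROM SITE")) {
            while (rs.next()) siteNames.put(rs.getLong(1), rs.getString(2));
        }
        Map<String, List<String>> result = new HashMap<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT SITE_ID, LANE_NAME FROM SITE_LANE")) {
            while (rs.next()) {
                String siteName = siteNames.get(rs.getLong(1)); // in-memory lookup replaces the DB join
                result.computeIfAbsent(siteName, k -> new ArrayList<>()).add(rs.getString(2));
            }
        }
        return result;
    }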

Model Write or SQL

Engines may use Model Writes but must understand the performance implication. Remember that model writes must look up IDs if given a natural key, or look up the natural key if given IDs and the model has denormalized fields.

What?

A Buffer has an Item Name. If you insert a buffer and you provide an Item Name, then it must look up an Item ID. If you insert a buffer and provide the Item ID, then it must look up the Item Name to fill out that denormalized item name on the buffer.

That takes time.
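To make that cost concrete, here is roughly that extra round trip sketched as plain JDBC. The BUFFER and ITEM tables and their columns are hypothetical stand-ins for what the platform does internally:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Sketch of the hidden lookup a model write implies when given a natural key.
    static void insertBuffer(Connection conn, long bufferId, String itemName) throws SQLException {
        long itemId;
        // Hidden step: resolve the natural key to an ID before anything else can happen
        try (PreparedStatement ps =
                 conn.prepareStatement("SELECT ITEM_ID FROM ITEM WHERE ITEM_NAME = ?")) {
            ps.setString(1, itemName);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) throw new SQLException("Unknown item: " + itemName);
                itemId = rs.getLong(1);
            }
        }
        // Only now can the actual insert run, carrying both the ID and the denormalized name
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO BUFFER (BUFFER_ID, ITEM_ID, ITEM_NAME) VALUES (?, ?, ?)")) {
            ps.setLong(1, bufferId);
            ps.setLong(2, itemId);
            ps.setString(3, itemName);
            ps.executeUpdate();
        }
    }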

Model Writes also have the security check; a user (the engine) cannot write data outside of its permissions.

This takes time.

The advantage of the Model Write is that it allows for a callback on that action, which gives instant extensibility. This is powerful but ultimately limits performance.

Instead of the Model Write, direct SQL is fast. Even faster are prepared statements. Engines are often inserting or updating millions of records, which means millions of queries. Prepared statements lower the burden on Oracle in that the statement is parsed only once. A few milliseconds maybe, but repeated a million times on the database server, whose time is gold, the savings are real.
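A sketch of the batched form, against a hypothetical DEMAND_RATIO table: the statement is parsed once, and rows are shipped to Oracle a thousand at a time (the batch size is a tuning knob, not a magic number):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.Map;

    // One parse, many executions: batch inserts through a single prepared statement.
    static void insertRatios(Connection conn, Map<Long, Double> ratiosByItem) throws SQLException {
        String sql = "INSERT INTO DEMAND_RATIO (ITEM_ID, RATIO) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Map.Entry<Long, Double> e : ratiosByItem.entrySet()) {
                ps.setLong(1, e.getKey());
                ps.setDouble(2, e.getValue());
                ps.addBatch();
                if (++pending == 1000) {        // flush periodically to bound memory
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) ps.executeBatch(); // flush the remainder
        }
    }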

Merge Into, Update, or Truncate/Insert

Engines often are calculating over the same network or time window for which they’ve already run. In these cases, they are simply updating the values. For example, if an engine is calculating the ratio of demand for an item vs. the demand of its whole product group, this ratio may have been calculated a thousand times before. But it may have shifted since the last calculation, and this run of the engine must recalculate it.

The problem is that the engine may not have calculated it before, which means the record may not exist. Now we have the issue of deciding whether to insert a new ratio or update the existing one.

Oracle provides a statement called Merge Into. A Merge Into has a select that determines the existence of a record, an update clause for when the record exists, and an insert clause for when it does not.
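The shape of the statement, using the ratio example from above (the DEMAND_RATIO table and its columns are hypothetical), issued here through a prepared statement like any other write:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // One statement decides update-vs-insert inside the database.
    static void mergeRatio(Connection conn, long itemId, double ratio) throws SQLException {
        String sql =
            "MERGE INTO DEMAND_RATIO t " +
            "USING (SELECT ? AS ITEM_ID, ? AS RATIO FROM DUAL) s " +
            "ON (t.ITEM_ID = s.ITEM_ID) " +
            "WHEN MATCHED THEN UPDATE SET t.RATIO = s.RATIO " +
            "WHEN NOT MATCHED THEN INSERT (ITEM_ID, RATIO) VALUES (s.ITEM_ID, s.RATIO)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, itemId);
            ps.setDouble(2, ratio);
            ps.executeUpdate();  // Oracle selects on ITEM_ID, then updates or inserts
        }
    }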

Basically, for every row that the engine must update, the database must first query for it. Oracle won’t return the row, of course, so no I/O time is wasted between the DB and App servers. But it must look the row up and find it; based on its existence, it then updates or inserts. In effect, this adds the select time to every update.

The advantage of this technique is that the developer can rely on the DB to ensure that it’s either an insert or an update. If the select is tuned and the indexes are good, then the Merge Into can be quite fast.

The only way to mimic the Merge Into in the App server is to select the rows from the output table and bring them to the middle tier. This can be a massive amount of data, and the I/O time is not insignificant. And the application ends up doing the same existence check the database would have done. Any performance benefit is arguable: the app server does allow horizontal scalability, BUT the DB moves from being tied up with processing to being tied up with I/O. But... if this is done, then a simple Update can be run.
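A sketch of that mimic, assuming the same hypothetical DEMAND_RATIO table: pull the existing keys once, then route every row to an UPDATE batch or an INSERT batch:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // App-server mimic of Merge Into: one big read, then routed batch writes.
    static void upsertRatios(Connection conn, Map<Long, Double> ratios) throws SQLException {
        Set<Long> existing = new HashSet<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT ITEM_ID FROM DEMAND_RATIO")) {
            while (rs.next()) existing.add(rs.getLong(1)); // the (possibly massive) I/O cost
        }
        try (PreparedStatement upd = conn.prepareStatement(
                 "UPDATE DEMAND_RATIO SET RATIO = ? WHERE ITEM_ID = ?");
             PreparedStatement ins = conn.prepareStatement(
                 "INSERT INTO DEMAND_RATIO (ITEM_ID, RATIO) VALUES (?, ?)")) {
            for (Map.Entry<Long, Double> e : ratios.entrySet()) {
                if (existing.contains(e.getKey())) {
                    upd.setDouble(1, e.getValue());
                    upd.setLong(2, e.getKey());
                    upd.addBatch();
                } else {
                    ins.setLong(1, e.getKey());
                    ins.setDouble(2, e.getValue());
                    ins.addBatch();
                }
            }
            upd.executeBatch();
            ins.executeBatch();
        }
    }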

The other way to accomplish this is Delete and Insert. Deletes are horribly expensive: there’s the redo log, locks, cascading deletes, etc. This is massively slower than the Merge Into. The trick is to “mark” records as deleted. If records have a state, then they can be marked deleted without the cost of a physical delete and its cascades, and all the new records can simply be inserted. This works well if we are dealing with a new transaction or a transaction with a state machine.
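A sketch of the mark, assuming a hypothetical SHIPMENT table with a STATE column and a RUN_ID that scopes the engine’s rows; one set-based UPDATE retires the old rows, and the fresh rows are then batch-inserted as usual:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // "Mark" instead of delete: a single set-based UPDATE flips the state.
    static void markDeleted(Connection conn, long runId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE SHIPMENT SET STATE = 'DELETED' WHERE RUN_ID = ?")) {
            ps.setLong(1, runId);
            ps.executeUpdate(); // far cheaper than a physical DELETE with cascades
        }
    }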

If the transaction does not have a state, then it’s possible to do a Truncate/Insert. It can be as simple as truncating the entire table and just letting the engine insert. This violates the Value Chain concept but may be acceptable for certain instances (huge instances that would be an issue here are always on their own database). But there may be data we need to keep – perhaps we are only running for a date range and must preserve the older data. This is where creating a table from a select can pay off: create a table selecting only the data we want to keep, truncate the original table, then insert the kept data back into it.
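A sketch of that keep/truncate/reload sequence, with hypothetical table names and date filter (the filter is inlined because bind variables are not allowed in DDL):

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Park the rows to keep, truncate, reload, clean up.
    static void truncateKeepingOldData(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE DEMAND_RATIO_KEEP AS "
                     + "SELECT * FROM DEMAND_RATIO WHERE CALC_DATE < DATE '2024-01-01'");
            st.execute("TRUNCATE TABLE DEMAND_RATIO");  // fast: no per-row redo
            st.execute("INSERT INTO DEMAND_RATIO SELECT * FROM DEMAND_RATIO_KEEP");
            st.execute("DROP TABLE DEMAND_RATIO_KEEP");
        }
    }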

These techniques sound painful and of little value. But in terms of performance, the inserts (if done as prepared statements) are very fast.