2PC and 3PC (Commit Protocols) in DBMS

The Two-Phase Commit (2PC) and Three-Phase Commit (3PC) protocols are both widely used in distributed DBMSs because they enforce atomicity: either every participating node commits a transaction or none does. It is an all-or-nothing proposition. Both protocols share a Prepare (Voting) phase and a Commit/Abort phase. 3PC adds a pre-commit phase between them: once every participant has voted to commit, the coordinator announces that outcome and waits for acknowledgements before the global commit is issued. By comparison, 2PC may be characterized as sending the decision and hoping for the best, since participants that have voted learn the outcome only from the final commit or abort message. The votes returned by the participants determine the global commit or abort status. The pre-commit step in 3PC is intended to remove the blocking that 2PC can suffer when a failure occurs between the vote and the global decision. Because every participant learns the outcome of the vote before anything is committed, the nodes can “act independently in the event of a failure” (Connolly & Begg, 2015). This is an important distinction. In 2PC, a single abort vote or a missing vote aborts the entire transaction. In 3PC, once the pre-commit phase has completed with a unanimous commit vote, a subsequent coordinator timeout does not by itself force a global abort.
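The voting logic described above can be illustrated with a minimal Python sketch. It is a single-process simulation, not a networked implementation; the names Participant, Vote, will_commit, and two_phase_commit are hypothetical and chosen only for illustration, and failures and timeouts are deliberately left out.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Participant:
    """Hypothetical participant node; will_commit simulates its local outcome."""

    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.state = "INITIAL"

    def prepare(self):
        # Phase 1 (voting): decide locally, record the state, and return a vote.
        self.state = "PREPARED" if self.will_commit else "ABORTED"
        return Vote.COMMIT if self.will_commit else Vote.ABORT

    def finish(self, decision):
        # Phase 2 (decision): apply whatever the coordinator decided globally.
        self.state = "COMMITTED" if decision is Vote.COMMIT else "ABORTED"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant voted to commit."""
    votes = [p.prepare() for p in participants]
    unanimous = all(v is Vote.COMMIT for v in votes)
    decision = Vote.COMMIT if unanimous else Vote.ABORT
    for p in participants:
        p.finish(decision)
    return decision
```

With two willing participants the sketch returns Vote.COMMIT; changing one of them to will_commit=False makes the same call return Vote.ABORT for every node, which is the all-or-nothing behavior the protocol is meant to guarantee.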

According to Connolly & Begg (2015), the differences between these protocols are most critical in how they terminate. 2PC can block because, after voting, the participants wait for a commit or abort message from the coordinator before they can act on the global decision. If a network partition occurs, they are stuck until communication with the coordinator is re-established. A power failure is more catastrophic, as it may involve multiple nodes and the coordinator. In both 2PC and 3PC, recovery procedures are then activated, but 2PC participants that have already voted remain in a blocked state. Of course, there are tradeoffs overall. The major issue with 3PC is its communication overhead, which is to be expected with the extra phase (Kumar, 2016).
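The contrast in termination behavior can be summarized as a small decision function. This is a simplified sketch of the commonly described participant timeout rules, assuming site failures only; it ignores coordinator election and network partitions, and the function name and state labels are hypothetical.

```python
def on_coordinator_timeout(protocol, state):
    """Return what a participant does when the coordinator stops responding.

    protocol: "2PC" or "3PC"
    state: the participant's last recorded state (hypothetical labels)
    """
    if state == "INITIAL":
        # No vote has been cast yet, so aborting unilaterally is safe.
        return "ABORT"
    if protocol == "2PC" and state == "PREPARED":
        # The participant voted but cannot know the global decision: it blocks.
        return "BLOCK"
    if protocol == "3PC" and state == "PREPARED":
        # No pre-commit was received, so no node can have committed yet.
        return "ABORT"
    if protocol == "3PC" and state == "PRECOMMITTED":
        # Pre-commit implies every participant voted to commit.
        return "COMMIT"
    raise ValueError(f"unsupported combination: {protocol}, {state}")
```

For example, on_coordinator_timeout("2PC", "PREPARED") returns "BLOCK", while on_coordinator_timeout("3PC", "PRECOMMITTED") returns "COMMIT". The extra state is what lets a 3PC participant reach a decision on its own, at the cost of one more round of messages.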

References

Connolly, T., & Begg, C. (2015). Database Systems: A Practical Approach to Design, Implementation, and Management (6th ed.). London, UK: Pearson.

Kumar, M. (2016). Commit protocols in distributed database system: A comparison. International Journal for Innovative Research in Science & Technology, 2(12), 277-281.

NXD and RDBMS Solutions

Comparing native XML database (NXD) and relational DBMS (RDBMS) solutions is close to comparing apples and oranges. Both are roughly spherical fruit, but they have very different flavors, applications, and characteristics. RDBMS has been around for a long time and is much more established than NXD; as a result, there is less collective knowledge around NXD and its implementations. RDBMS solutions are practically ubiquitous and have a number of different implementations, both open-source and proprietary. Tables are typically normalized for transactional workloads, or arranged in a fact/dimension model (star schema) for analytical ones.

On the other hand, comparable NXD solutions rely on containers and documents in a simple tree structure. Complex joins and queries that are routine in an RDBMS are typically more difficult in NXD (Pavlovic-Lazetic, 2007). One area in which NXD shows promise is Web-enabled data warehousing (Salem, Boussaïd, & Darmont, 2013). Bringing multiple sources of unstructured and structured data together in an Active XML Repository addresses data heterogeneity, distribution, and interoperability issues.

A typical RDBMS implementation for business is a data warehouse in which structured data from various systems of record is brought into a common area and reconciled. These systems of record may include proprietary relational database systems, mainframe non-relational databases, and data exported to delimited formats. A data dictionary may be maintained, and reconciliation policies may be drawn up by a central data governance board. The output from this data warehouse allows users from different divisions, working with different systems of record, to understand a common organization-wide data taxonomy.
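As a rough illustration of this kind of reconciliation, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, source systems, and sample values are all hypothetical; a real warehouse would involve far more staging and governance than a single join.

```python
import sqlite3


def reconcile():
    """Map records from two source systems onto one common taxonomy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE data_dictionary (
            source_system TEXT, source_code TEXT, common_term TEXT);
        CREATE TABLE staged_records (
            source_system TEXT, source_code TEXT, amount REAL);
        """
    )
    # The data dictionary translates each system's local code to a shared term.
    conn.executemany(
        "INSERT INTO data_dictionary VALUES (?, ?, ?)",
        [("billing", "CUST", "customer"),
         ("mainframe", "C01", "customer"),
         ("billing", "VEND", "vendor")],
    )
    conn.executemany(
        "INSERT INTO staged_records VALUES (?, ?, ?)",
        [("billing", "CUST", 100.0),
         ("mainframe", "C01", 50.0),
         ("billing", "VEND", 25.0)],
    )
    # Join staged data to the dictionary and aggregate by the common term.
    rows = conn.execute(
        """
        SELECT d.common_term, SUM(s.amount)
        FROM staged_records AS s
        JOIN data_dictionary AS d
          ON d.source_system = s.source_system
         AND d.source_code = s.source_code
        GROUP BY d.common_term
        ORDER BY d.common_term
        """
    ).fetchall()
    conn.close()
    return rows
```

Here the billing code CUST and the mainframe code C01 both resolve to the shared term customer, so their amounts are reported together regardless of which system of record they came from.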

One possible NXD solution involves an IoT data environment. Imagine a number of environmental sensors (e.g., temperature, humidity, pressure) being read at regular intervals and pushed to a central web location. In a typical XML tree structure, the readings from each sensor, or from each central controller handling multiple sensors, could be placed in an XML document. This data does not require complex joins and is much better suited to an NXD solution.
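A document of this kind might look like the one built in the following sketch, which uses Python's standard xml.etree.ElementTree module. The element names, attribute names, identifiers, and sample readings are assumptions made for illustration; the scenario above does not prescribe a schema.

```python
import xml.etree.ElementTree as ET


def build_reading_document(controller_id, readings):
    """Build one XML document per controller, with one element per reading."""
    root = ET.Element("controller", id=controller_id)
    for sensor_id, kind, value, unit, timestamp in readings:
        reading = ET.SubElement(
            root, "reading",
            sensor=sensor_id, type=kind, unit=unit, timestamp=timestamp,
        )
        reading.text = str(value)
    return root


# Hypothetical sample data for a single controller.
doc = build_reading_document("ctrl-01", [
    ("s1", "temperature", 21.5, "C", "2024-01-01T00:00:00Z"),
    ("s2", "humidity", 40.0, "%", "2024-01-01T00:00:00Z"),
    ("s1", "temperature", 22.0, "C", "2024-01-01T00:05:00Z"),
])

# A simple path query selects readings by type; no join is needed.
temperatures = [
    float(r.text) for r in doc.findall("./reading[@type='temperature']")
]
```

Each document is self-contained, so a query walks a single tree rather than combining several tables, which is the property that makes the workload a natural fit for document-oriented storage.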

References

Pavlovic-Lazetic, G. (2007). Native XML databases vs. relational databases in dealing with XML documents. Kragujevac Journal of Mathematics, 30, 181-199.

Salem, R., Boussaïd, O., & Darmont, J. (2013). Active XML-based web data integration. Information Systems Frontiers, 15(3), 371-398.