    #include <sm_vas.h> // which includes sm.h

    static rc_t begin_xct(long timeout = WAIT_SPECIFIED_BY_THREAD);
    static rc_t begin_xct(tid_t& tid, long timeout = WAIT_SPECIFIED_BY_THREAD);
    static rc_t commit_xct(bool lazy = false);
    static rc_t abort_xct();
    static rc_t chain_xct(bool lazy = false);
    static rc_t save_work(sm_save_point_t& sp);
    static rc_t rollback_work(const sm_save_point_t& sp);
    static xct_state_t state_xct(const xct_t*);
    static xct_t* tid_to_xct(const tid_t& tid);
    static const tid_t& xct_to_tid(const xct_t*);

    #define max_gtid_len 256
    #define max_server_handle_len 100
    typedef opaque_quantity<max_gtid_len> gtid_t;
    typedef opaque_quantity<max_server_handle_len> server_handle_t;

    enum vote_t {
        vote_bad,      // illegitimate
        vote_readonly, // no log written
        vote_abort,    // cannot commit
        vote_commit,   // can commit if so directed
    };

    static rc_t prepare_xct(vote_t &v);
    static rc_t enter_2pc(const gtid_t &);
    static rc_t recover_2pc(const gtid_t &, bool mayblock, tid_t& tid);
    static rc_t set_coordinator(const server_handle_t &h);
    static rc_t query_prepared_xct(int &numtids);
    static rc_t query_prepared_xct(int numtids, gtid_t l[]);
begin_xct(timeout)
begin_xct(tid, timeout)
state_xct()
tid_to_xct()
The Shore storage manager can participate in transactions coordinated by other software modules that employ the "presumed abort" two-phase commit protocol. The coordinator in such a situation is external to the Shore storage manager; it is assumed to have its own stable storage, and it is assumed to recover from failures in a short time, the precise meaning of which is given below.
A prepared transaction, like an active transaction, consumes log space and holds locks. Even if a prepared transaction does not hold locks needed by other transactions, it consumes resources in a way that can interfere with other transactions. If a prepared transaction remains in the system for a long time while other transactions are running, eventually the storage manager needs the log space used (reserved) by the prepared transaction. A coordinator must resolve its prepared transactions before the storage manager effectively runs out of log space for other transactions in the system. The amount of time involved is a function of the size of the log and of the demands of the other transactions in the system.
For the purpose of this discussion, the portion of a global transaction that involves a single Shore transaction is called a thread of the global transaction.
A Shore transaction participates as a thread of a global transaction as follows:
A global transaction identifier is an opaque value to the Shore storage manager. It uses a template class defined as follows:
    template <int LEN>
    class opaque_quantity {
    private:
        uint4         _length;
        unsigned char _opaque[LEN];
    public:
        opaque_quantity();
        opaque_quantity(const char* s);

        friend bool operator ==(const opaque_quantity<LEN>&,
                                const opaque_quantity<LEN>&);
        friend ostream& operator <<(ostream &o, const opaque_quantity<LEN>&);

        opaque_quantity<LEN>& operator=(const opaque_quantity<LEN>&);
        opaque_quantity<LEN>& operator=(const char*);
        opaque_quantity<LEN>& operator+=(const char*);
        opaque_quantity<LEN>& operator-=(uint4 len);
        opaque_quantity<LEN>& append(const void* data, uint4 len);
        opaque_quantity<LEN>& zero();  // zero entire max-sized data structure
        opaque_quantity<LEN>& clear(); // zero length only

        operator const char *();
        void * data_at_offset(uint i) const;
        uint4 wholelength() const;  // including _length member
        uint4 length() const;       // excluding _length member
        uint4 set_length(uint4 l);  // of _opaque part only
        void ntoh();                // put in host byte-order
        void hton();                // put in net byte-order
        bool is_aligned() const;    // to sizeof(int)
    };
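The relationship between length() and wholelength() can be illustrated with a small stand-in. This is a minimal sketch, not the Shore class: opaque_sketch is a hypothetical name, only a few members are modelled, and truncating an append that would overflow the buffer is an assumption of the sketch rather than documented behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for opaque_quantity<LEN>; models only a few members.
template <int LEN>
class opaque_sketch {
    std::uint32_t _length = 0;    // number of valid bytes in _opaque
    unsigned char _opaque[LEN] = {};
public:
    // Append raw bytes. ASSUMPTION: input that does not fit is truncated.
    opaque_sketch& append(const void* data, std::uint32_t len) {
        const std::uint32_t room = static_cast<std::uint32_t>(LEN) - _length;
        if (len > room) len = room;
        std::memcpy(_opaque + _length, data, len);
        _length += len;
        return *this;
    }
    // Zero the length only; the bytes in _opaque are left in place.
    opaque_sketch& clear() { _length = 0; return *this; }
    // Valid bytes, excluding the _length member.
    std::uint32_t length() const { return _length; }
    // Valid bytes plus the size of the _length member itself.
    std::uint32_t wholelength() const {
        return static_cast<std::uint32_t>(sizeof(_length)) + _length;
    }
    // Two values are equal when their valid bytes match.
    bool operator==(const opaque_sketch& o) const {
        return _length == o._length &&
               std::memcmp(_opaque, o._opaque, _length) == 0;
    }
};
```

In this sketch, appending the three bytes "abc" to an empty opaque_sketch<8> yields length() == 3 and wholelength() == 7, because the length field occupies four bytes.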
The Shore storage manager implements the "read-only optimization" for presumed-abort. If a prepared transaction did not log any updates, the transaction is committed at the time it is prepared, and the vote returned indicates that the transaction thread is read-only. Once the vote is communicated to the coordinator, and the coordinator has recorded this vote on stable storage, this thread of the global transaction can be omitted from all further processing of the transaction.
The votes are {vote_bad, vote_readonly, vote_abort, vote_commit}.
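How an external coordinator might act on these votes can be sketched as follows. The vote_t enumeration mirrors the synopsis; outcome_t, decision_t, and decide() are hypothetical names introduced for illustration and are not part of the Shore interface. The sketch assumes that vote_bad is treated like vote_abort, and that only threads that voted vote_commit remain prepared and need a second-phase call.

```cpp
#include <cassert>
#include <vector>

// Mirrors the vote_t enumeration from the synopsis.
enum vote_t { vote_bad, vote_readonly, vote_abort, vote_commit };

// Hypothetical coordinator-side types; not part of the Shore API.
enum class outcome_t { commit, abort };

struct decision_t {
    outcome_t        outcome;         // global decision
    std::vector<int> phase2_threads;  // indices of threads still awaiting the decision
};

// Decide the global outcome from one vote per transaction thread.
decision_t decide(const std::vector<vote_t>& votes) {
    decision_t d{outcome_t::commit, {}};
    for (int i = 0; i < static_cast<int>(votes.size()); ++i) {
        switch (votes[i]) {
        case vote_readonly:
            // Read-only optimization: the thread committed at prepare time
            // and is omitted from further processing.
            break;
        case vote_commit:
            // Thread is prepared and must be told the outcome.
            d.phase2_threads.push_back(i);
            break;
        case vote_abort:
        case vote_bad:   // ASSUMPTION: an illegitimate vote forces an abort
        default:
            d.outcome = outcome_t::abort;
            break;
        }
    }
    return d;
}
```

For the votes {vote_commit, vote_readonly, vote_commit}, this sketch decides to commit and lists only threads 0 and 2 for the second phase.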
If the application (value-added server) should crash during a two-phase commit, a new application (representing the coordinator) must run, and it must contact the Shore storage manager in order to complete the two-phase-commit protocol.
If the application crashes before the prepare is done, the transaction thread is aborted.
If the application crashes during the first phase (after the prepare is done, but before the vote is written to stable storage), the application must retry the prepare phase to get the vote and resolve the transaction.
If a crash occurs during the second phase (after the prepare is done and its vote is written to stable storage, but before the transaction is resolved), the application cannot always tell if the second phase completed. It is always safe to try again to complete the transaction thread. If the transaction thread is unknown to the Shore storage manager at this point, the second phase completed.
In order to locate a prepared transaction after a crash, the application calls recover_2pc. If a prepared thread with the given global transaction identifier is found, the (local) Shore transaction identifier is returned, and the thread is attached. The application can subsequently call commit_xct or abort_xct.
The Boolean argument mayblock indicates whether the application considers it acceptable for the recover_2pc call to block (e.g., in the event that it is awaiting connection to its internal coordinator).
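The retry rule for the second phase can be sketched with an in-memory stand-in for the storage manager. Everything here is hypothetical: mock_sm, resolution_t, and finish_second_phase() are illustrative names, global transaction identifiers are modelled as strings, and the stand-in methods return plain values instead of rc_t. The mayblock argument and the returned tid are omitted.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the storage manager's set of prepared threads.
struct mock_sm {
    std::set<std::string> prepared;  // gtids of prepared transaction threads

    // Stand-in for recover_2pc: true if a prepared thread with this gtid exists.
    bool recover_2pc(const std::string& gtid) const {
        return prepared.count(gtid) != 0;
    }
    // Stand-ins for commit_xct / abort_xct on the attached thread.
    void commit_xct(const std::string& gtid) { prepared.erase(gtid); }
    void abort_xct(const std::string& gtid)  { prepared.erase(gtid); }
};

enum class resolution_t { resolved_now, already_resolved };

// Safe to call repeatedly after a crash: an unknown thread means the
// second phase already completed before the crash.
resolution_t finish_second_phase(mock_sm& sm, const std::string& gtid,
                                 bool commit) {
    if (!sm.recover_2pc(gtid)) {
        return resolution_t::already_resolved;
    }
    if (commit) {
        sm.commit_xct(gtid);
    } else {
        sm.abort_xct(gtid);
    }
    return resolution_t::resolved_now;
}
```

Calling finish_second_phase() a second time for the same identifier reports already_resolved, which is what makes the retry harmless in this model.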
After recovering from a crash, a value-added server can discover which transactions were prepared and need recovery by calling the two forms of query_prepared_xct. The first form returns the number of such transactions. With that information, the value-added server can allocate memory for storing the global transaction identifiers of the prepared transactions. The value-added server then invokes the second form of query_prepared_xct to get a list of the global transaction identifiers, and then recovers the prepared transactions.
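The count-then-fetch pattern can be sketched as follows. The two query_prepared_xct overloads here are members of a hypothetical mock_prepared_list, with gtid_t modelled as a string and an int status (0 for success) standing in for rc_t; list_prepared() is an illustrative helper, not a Shore function.

```cpp
#include <cassert>
#include <string>
#include <vector>

using gtid_t = std::string;  // stand-in for the real opaque gtid_t

// Hypothetical stand-in exposing the two query forms.
struct mock_prepared_list {
    std::vector<gtid_t> prepared;

    // First form: report how many prepared transactions exist.
    int query_prepared_xct(int& numtids) const {
        numtids = static_cast<int>(prepared.size());
        return 0;
    }
    // Second form: copy at most numtids identifiers into the caller's array.
    int query_prepared_xct(int numtids, gtid_t l[]) const {
        const int n = static_cast<int>(prepared.size());
        for (int i = 0; i < numtids && i < n; ++i) {
            l[i] = prepared[i];
        }
        return 0;
    }
};

// Ask for the count, allocate storage, then fetch the identifiers.
std::vector<gtid_t> list_prepared(const mock_prepared_list& sm) {
    int n = 0;
    if (sm.query_prepared_xct(n) != 0 || n <= 0) {
        return {};
    }
    std::vector<gtid_t> gtids(static_cast<std::size_t>(n));
    if (sm.query_prepared_xct(n, gtids.data()) != 0) {
        return {};
    }
    return gtids;
}
```

Each identifier returned by list_prepared() would then be passed to recover_2pc so that the corresponding thread can be committed or aborted.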