Introduction

A distributed database is a database in which the storage devices are not all attached to a common processing unit; instead, the data is stored on multiple computers. In today’s world of information technology, society demands that all information be available on hand. In a distributed database, users can access information anywhere and at any time across the network. There is also a requirement for secure and reliable communication of information.
This paper presents the concurrency control and security issues in distributed databases.

Theory

Distributed database systems improve communication and data processing because their data is distributed across different network sites. Not only is data access faster, but a single point of failure is also less likely to occur, and users retain local control of their data. A distributed database is a single logical database that is spread physically across computers in multiple locations that are connected by data communication links. One advantage of a distributed database is that, because the data is distributed, network traffic is reduced. If the company network is temporarily down, the local database is not affected and continues to work. Because the distributed database is stored on multiple computers, a problem in one branch does not affect the work of the other branches. However, it is more complex to ensure that the data and indexes in a distributed system are not corrupted.
A distributed database is also inefficient when there is heavy interaction between sites, and security becomes a concern because the data is distributed.

Literature review

The design of a distributed database involves three issues: fragmentation, replication and data allocation. According to the research of Shin and Irani [1], fragmentation is a design technique that divides a relation into two or more partitions. There are several types of fragmentation: horizontal, vertical and hybrid.
Horizontal fragmentation is defined using the selection operation. It divides a table horizontally by selecting the relevant rows, and these fragments can be assigned to different sites in the distributed system. Vertical fragmentation is defined using the projection operation and divides the relation by columns. The primary key of the table must be included in each vertical fragment to allow reconstruction.
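The two operations can be illustrated with a minimal Python sketch. The `employees` relation, its column names and the helper functions `horizontal_fragment`, `vertical_fragment` and `rejoin_vertical` are all hypothetical and chosen only for illustration; a real system would express the same idea through its own query language.

```python
# Hypothetical EMPLOYEE relation; column names and rows are illustrative only.
employees = [
    {"emp_id": 1, "name": "Ana", "branch": "north", "salary": 5200},
    {"emp_id": 2, "name": "Ben", "branch": "south", "salary": 4100},
    {"emp_id": 3, "name": "Chen", "branch": "north", "salary": 6100},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep whole rows that satisfy the predicate."""
    return [dict(row) for row in relation if predicate(row)]


def vertical_fragment(relation, key, columns):
    """Projection: keep the primary key plus the chosen columns."""
    return [{c: row[c] for c in (key, *columns)} for row in relation]


def rejoin_vertical(left, right, key):
    """Rebuild full rows by joining two vertical fragments on the primary key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal fragments: one per branch, each could live at a different site.
north = horizontal_fragment(employees, lambda r: r["branch"] == "north")
south = horizontal_fragment(employees, lambda r: r["branch"] == "south")

# Vertical fragments: both carry emp_id so the relation can be reconstructed.
personal = vertical_fragment(employees, "emp_id", ["name", "branch"])
payroll = vertical_fragment(employees, "emp_id", ["salary"])
rebuilt = rejoin_vertical(personal, payroll, "emp_id")
```

The union of the horizontal fragments and the key-based join of the vertical fragments both give back the original relation, which is why the primary key has to appear in every vertical fragment.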
Hybrid fragmentation is also called mixed fragmentation because it combines horizontal and vertical fragmentation. It is more complex than the other two types because the table is divided into arbitrary blocks based on requirements. Carlos and Steven describe data replication as the storage of data copies at multiple sites served by a computer network [2]. The problem in managing replicated data is maintaining the consistency of the data. Replicated data are subject to the mutual consistency rule, which requires that all copies of a data fragment be identical. According to the research of Swati, Kuntal and Bhawna [3], data allocation is the process of deciding where to locate data.
Data allocation strategies are classified into centralized, partitioned and replicated data allocation.

Result outcome

Concurrency control is the activity of coordinating concurrent accesses to a database in a distributed database system. A transaction is a set of read or write operations used to perform a unit of work. Hence, a database transaction must be atomic, consistent, isolated and durable. Several distributed concurrency control techniques have been discussed. The first is the basic timestamp ordering algorithm, which is used to determine how outdated a request is with respect to the data objects and to order events with respect to one another based on their timestamp values.
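The outdatedness check in basic timestamp ordering can be sketched in a few lines of Python. This is a simplified single-site illustration, not a full distributed implementation: the class names `DataItem` and `TimestampScheduler` are hypothetical, timestamps are assumed to be positive integers, and a rejected request simply returns `False` where a real scheduler would abort and restart the transaction.

```python
class DataItem:
    """A data object with the largest timestamps that have read and written it."""

    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0
        self.write_ts = 0


class TimestampScheduler:
    """Basic timestamp ordering for one site (illustrative sketch)."""

    def __init__(self):
        self.items = {}

    def _item(self, name):
        return self.items.setdefault(name, DataItem())

    def read(self, ts, name):
        item = self._item(name)
        if ts < item.write_ts:
            # Outdated request: a younger transaction already wrote this item.
            return False, None
        item.read_ts = max(item.read_ts, ts)
        return True, item.value

    def write(self, ts, name, value):
        item = self._item(name)
        if ts < item.read_ts or ts < item.write_ts:
            # Outdated request: a younger transaction already read or wrote it.
            return False
        item.value = value
        item.write_ts = ts
        return True


sched = TimestampScheduler()
first_write = sched.write(5, "x", "a")      # accepted
read_ok, read_value = sched.read(7, "x")    # accepted, sees "a"
late_write = sched.write(6, "x", "b")       # rejected: item was read at timestamp 7
late_read, _ = sched.read(3, "x")           # rejected: item was written at timestamp 5
```

Because every conflicting pair of operations is accepted only in timestamp order, the accepted operations are equivalent to running the transactions serially in that order.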
A timestamp is a unique identifier for a transaction. The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations [4]. The next technique is distributed two-phase locking (2PL). The main approach of the two-phase locking protocol is “read any, write all” [5]. The 2PL protocol oversees locks by determining when transactions can acquire and release locks. Each transaction executes in two phases: a growing phase and a shrinking phase. Lock managers are distributed to all sites, and each is responsible for the locks on the data at that site. In addition, the distributed optimistic protocol (OPT) operates by exchanging certification information during the commit protocol.
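To make the two-phase rule and “read any, write all” concrete, the following Python sketch models one lock manager per site. It is a deliberately simplified illustration under stated assumptions: the names `LockManager` and `TwoPhaseTransaction` are hypothetical, only exclusive locks are modelled (reads do not share locks), and a lock request that cannot be granted returns `False` instead of waiting.

```python
class LockManager:
    """Lock table for the data stored at one site (exclusive locks only)."""

    def __init__(self):
        self.owner = {}  # item name -> id of the transaction holding the lock

    def acquire(self, txn_id, item):
        # Grant the lock if it is free or already held by the same transaction.
        holder = self.owner.setdefault(item, txn_id)
        return holder == txn_id

    def release_all(self, txn_id):
        for item in [i for i, t in self.owner.items() if t == txn_id]:
            del self.owner[item]


class TwoPhaseTransaction:
    def __init__(self, txn_id, sites):
        self.txn_id = txn_id
        self.sites = sites        # site name -> LockManager
        self.shrinking = False    # False while in the growing phase

    def lock_for_read(self, item, site):
        """Read any: lock the copy at a single site."""
        return self._lock(item, [site])

    def lock_for_write(self, item):
        """Write all: lock the copy at every site."""
        return self._lock(item, list(self.sites))

    def _lock(self, item, site_names):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot lock after releasing")
        # Stops at the first refusal; the caller is expected to release and retry.
        return all(self.sites[s].acquire(self.txn_id, item) for s in site_names)

    def release(self):
        """Enter the shrinking phase and give up every lock held."""
        self.shrinking = True
        for manager in self.sites.values():
            manager.release_all(self.txn_id)


sites = {"A": LockManager(), "B": LockManager()}
t1 = TwoPhaseTransaction("T1", sites)
t2 = TwoPhaseTransaction("T2", sites)

t1_read = t1.lock_for_read("x", "A")    # granted at site A
t2_write = t2.lock_for_write("x")       # refused: T1 holds x at site A
t1.release()                            # T1 enters its shrinking phase
t2_retry = t2.lock_for_write("x")       # granted at both sites
```

Once `release` has been called, any further lock request by the same transaction raises an error, which is the rule that separates the growing phase from the shrinking phase.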
Under OPT, a transaction is assigned a globally unique timestamp when all of its cohorts have completed their work and reported back to the master. This timestamp is sent to each cohort in the “prepare to commit” message, and each cohort uses it to locally certify all of its reads and writes [6]. Security, on the other hand, protects data or information from modification and unauthorized access.
There are five security concerns on the client and server side. First, the workstation’s user approval mechanism may be partial or non-existent. Second, it is possible to automate the login procedure. Third, the workstation may be installed in a public or high-risk area. Fourth, the workstation may run powerful utilities or development tools and thereby try to bypass the security mechanism.
Finally, in extreme cases, a user may pretend to be another user and infiltrate the system. Authentication, authorization, encryption and access control are the main security measures for a distributed database system. Deadlock is a major problem in distributed systems. It is the state in which a set of processes are each waiting for one another [7].

Conclusion

Fragmentation, replication and data allocation are important issues in designing a distributed database. This paper has discussed the basic concept of concurrency control in distributed database systems and the various techniques for concurrency control in distributed environments.
The need for distributed databases in today’s business environment has been increasing. Hence, it is important to ensure that the system operates securely and with integrity. Deadlock is a major problem in distributed databases. Hence, we need to find methods of data distribution and access that minimize deadlock and thus result in proper utilization of resources.