Theories for Database and Data Management
Introduction
Theories of databases and data management span a wide range of topics in computer science, from finite model theory to database design theory. The field has been researched extensively over the years. As data storage and management requirements grow, new approaches and solutions emerge to handle the resulting challenges. Modern data management systems handle enormous datasets and therefore need theoretical models that borrow from fields such as machine learning, statistics, and database theory (Kotu & Deshpande, 2019). One of the most widely applied bodies of theory in data management and database theory is finite model theory (FMT). FMT is the branch of model theory that uses logic to study finite mathematical structures, such as finite sets and graphs (Viswanathan, 2018). This paper discusses finite model theory and its application in databases and data management. It also examines the divide between the theory and its use in practice, outlining the causes and impacts of this gap.
The Finite Model Theory
Finite model theory (FMT), as stated above, is the branch of model theory that restricts attention to finite structures (Viswanathan, 2018). Model theory (MT) is a mathematical field concerned with structures such as sets and graphs and with the formal statements that hold in them. Its central concern is the relationship between a formal language (its syntax) and the interpretations that give the sentences of that language their meaning (its semantics). FMT studies this relationship between syntax and semantics when the structures are finite (Kolaitis, 2007). Database theory and data management models rely on FMT to analyze and design query languages because the data stored in any computing system is finite. Relational databases use Structured Query Language (SQL) to store and retrieve data. The core of SQL is closely based on first-order logic, one of the main logics studied in FMT. It is, therefore, prudent to understand the concepts and principles of finite model theory and its application in database theory before looking at the gap between the theory and its use.
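As a rough illustration of the link between SQL and first-order logic, the sketch below answers one question in two ways: as a SQL query run through Python's built-in sqlite3 module, and as a direct evaluation of the formula phi(x) = (exists y. Edge(x, y)) and not (exists z. Edge(z, x)) over the same finite relation. The Edge table, its column names, and the sample rows are assumptions made for this example and do not come from the cited sources.

```python
import sqlite3

# Hypothetical finite structure: a binary relation Edge over a small domain.
EDGES = {(1, 2), (2, 3), (3, 2), (4, 1)}

def sources_via_sql(edges):
    """Nodes with an outgoing edge and no incoming edge, computed in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Edge (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO Edge VALUES (?, ?)", sorted(edges))
    rows = conn.execute(
        """
        SELECT DISTINCT e.src
        FROM Edge AS e
        WHERE NOT EXISTS (SELECT 1 FROM Edge AS f WHERE f.dst = e.src)
        """
    ).fetchall()
    conn.close()
    return {row[0] for row in rows}

def sources_via_logic(edges):
    """Same answer from phi(x) = (exists y. Edge(x, y)) and not (exists z. Edge(z, x))."""
    domain = {v for edge in edges for v in edge}
    return {
        x
        for x in domain
        if any((x, y) in edges for y in domain)
        and not any((z, x) in edges for z in domain)
    }

sql_answer = sources_via_sql(EDGES)
logic_answer = sources_via_logic(EDGES)
print(sql_answer, logic_answer)
```

Both functions range over the same finite set of rows, which is why the quantifiers in the formula can be checked by plain iteration. The NOT EXISTS subquery plays the role of the negated existential quantifier.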
Before delving deeper into FMT, it should be noted that several central theorems of MT do not hold when restricted to finite structures. These include the completeness theorem and the compactness theorem, which hold in MT but fail when only finite structures are considered (Dawar, Grädel, Kolaitis, & Schwentick, 2017). However, data in computer science is described by finite sets and relations, making finite models uniquely suited to representing this data and translating it into information. First-order logic is, by far, the most useful logic studied in finite model theory in its application to computer science and data management. First-order logic, also known as predicate logic, is a formal system whose sentences use predicates together with quantified variables that range over the elements of a structure (Kolaitis, 2007). Most database theories that make use of FMT rely on the concepts developed in first-order logic.
The real power of first-order logic becomes more evident when it is used with finite models, where collections of these models can be defined by sets of first-order sentences (Viswanathan, 2018). Since FMT aims to distinguish structures only up to isomorphism, first-order logic comes in handy in identifying the attributes that the members of a collection have in common. These defining attributes can be expressed as conditions over the elements of a large dataset to carve out finite subsets, which can then be retrieved using regular SQL syntax. This concept of FMT is applicable in most practical scenarios, especially in the modern world where computer systems are awash with data (Srivastava & Venkatasubramanian, 2009). It gives scientists the ability to quantify data and make trade-offs based on the gains and losses associated with specific aspects of the information, such as its relevance and cost to the project.
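The claim that first-order sentences cannot tell isomorphic structures apart can be checked by brute force on small examples. The sketch below assumes a structure is simply a domain plus one binary relation E, and uses the sentence "forall x. exists y. E(x, y)" as an arbitrary example. It renames the elements of a finite structure under every bijection of the domain and confirms that the sentence keeps its truth value. All function names are hypothetical.

```python
from itertools import permutations

def every_node_has_successor(domain, edges):
    """Truth of the sentence: forall x. exists y. E(x, y)."""
    return all(any((x, y) in edges for y in domain) for x in domain)

def rename(edges, mapping):
    """Image of the edge relation under a bijection of the domain."""
    return {(mapping[a], mapping[b]) for a, b in edges}

domain = [0, 1, 2]
cycle = {(0, 1), (1, 2), (2, 0)}   # satisfies the sentence
path = {(0, 1), (1, 2)}            # node 2 has no successor

def invariant_under_renaming(edges):
    """True if every isomorphic copy on the same domain gives the same truth value."""
    expected = every_node_has_successor(domain, edges)
    return all(
        every_node_has_successor(domain, rename(edges, dict(zip(domain, perm)))) == expected
        for perm in permutations(domain)
    )

print(every_node_has_successor(domain, cycle), invariant_under_renaming(cycle))
print(every_node_has_successor(domain, path), invariant_under_renaming(path))
```

A check on one sentence and two three-element graphs is only a sanity test, not a proof, but it shows concretely what "up to isomorphism" means: the labels of the elements do not matter, only the pattern of the relation.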
The Gap Between Theory and Practice
The transition from database theory to practical application often poses a significant challenge to computing students and scientists (Myers & Skinner, 1997). The need to understand the reason for applying specific concepts when implementing a software system design takes a back seat to the implementation procedure itself. For most scientists, the ‘how’ supersedes the ‘why’ of the procedure. This bias creates a significant gap between knowledge of database theories and the practical side of implementation. This section discusses the gap that exists between the theory of FMT and its practical applications in databases and data management.
FMT is a mathematical field that focuses on proofs about finite mathematical structures (Viswanathan, 2018). The theory that accompanies most ideas in FMT involves extensive mathematical notation for predicates, propositions, and first-order logic (Kolaitis, 2007). These concepts can look daunting to most students and prove even more challenging for database developers without a background in mathematics. For most developers, the needs of the clients are more important than the underlying concepts involved in the design of databases (Myers & Skinner, 1997). Developers focus on the outcomes of the software under development, mainly the system’s efficiency in performing the required tasks. The technologies and the inner workings of the databases or data management systems are secondary to these needs. They may only concern the developer if they interfere directly with the efficiency of the system. Mathematical concepts, such as those involved in FMT, rarely come into focus during the practical development of these systems, since developers assume that the creators of the database system have already taken these concepts into consideration.
Consider the following example of a theoretical definition for the notion of definability: A collection K of finite structures is definable if there is a sentence A such that [[A]] = K, where [[A]] denotes the collection of finite structures that satisfy A (Viswanathan, 2018). From a theoretical perspective, this statement is a vital tool in illuminating the real power of first-order logic as applied to the characterization of sets. The statement forms the basis for various studies and is critical in characterizing when a collection of models is not definable (Viswanathan, 2018). Understanding this statement would be vital for a computer scientist trying to understand database theory and create better models for finite systems and structures. For a software developer with a deadline from the client, this statement may be useful, but it is not critical to the development of the system. Since it is assumed that modern relational databases apply the best standards based on these theories, the developer can create a database system without referring to this theoretical model.
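To make the definition of definability more concrete, the sketch below enumerates every directed graph on a fixed two-element domain and keeps those that satisfy one example sentence A, namely symmetry: forall x, y. E(x, y) -> E(y, x). The full collection [[A]] contains structures of every finite size, so the code only computes the slice of it with exactly n elements. The helper names and the choice of sentence are assumptions made for illustration.

```python
from itertools import combinations, product

def all_structures(n):
    """Yield every directed graph (domain, edges) on the domain {0, ..., n-1}."""
    domain = list(range(n))
    pairs = list(product(domain, repeat=2))
    for k in range(len(pairs) + 1):
        for chosen in combinations(pairs, k):
            yield domain, frozenset(chosen)

def is_symmetric(domain, edges):
    """Sentence A: forall x, y. E(x, y) -> E(y, x)."""
    return all((y, x) in edges for (x, y) in edges)

def models(sentence, n):
    """The slice of [[A]] made of structures with exactly n elements."""
    return [edges for domain, edges in all_structures(n) if sentence(domain, edges)]

symmetric_graphs = models(is_symmetric, 2)
print(len(symmetric_graphs))
```

Here the collection K of symmetric graphs is definable because a single sentence picks out exactly its members. Showing that a collection is not definable requires the proof techniques of FMT rather than enumeration.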
Causes and Impact of the Gap
The strategies applied by students and system developers vary depending on the goals set at the start of the project and the outcomes expected at the end of the specified duration (Myers & Skinner, 1997). Although the system developed is likely to rely on theory-based models, time constraints often make it impossible for the developer to delve into the details of the model. System developers and students, therefore, have to rely on the expertise of the creators of the underlying database systems for efficient, stable, theory-based databases on which they can build. Due to these time constraints and this over-reliance on the database system creators, many modern computer science students shy away from theories such as FMT that are the building blocks of most relational database systems.
Most stable database systems in the modern computing world, such as MySQL, also provide platform-independent user interfaces. These systems abstract away the concepts behind the inner workings of the database system, requiring system developers to learn only SQL as the main channel of communication. This simplification has both positive and negative effects, depending on the user of the system. For computer science students, the impact of highly abstracted database systems is a lack of awareness of the underlying theoretical concepts (Myers & Skinner, 1997). For example, students using the MySQL database often learn the query syntax associated with each command but require an expert to teach them the relationship between these queries and finite model theory.
On the other hand, clients need quality software on a timely basis. Highly abstracted database software makes the work of the developer easier by hiding the complexities associated with theoretical models. This simplicity makes it easier to create a stable product with minimal effort. The gap created by these highly abstracted database models, therefore, has a positive impact for developers wishing to forgo these details. They can focus on the practical aspects of the database system without needing to worry about the theoretical models used to develop the underlying database system.
Conclusion
This paper discussed database and data management theories, focusing on finite model theory as applied in data management. FMT is the sub-field of MT that focuses on finite mathematical structures. It differs from MT by restricting these structures to a finite universe. In database theory, FMT is applied in defining the syntax and semantics of SQL. The theoretical aspects of FMT, however, do not translate fully into practice, leaving a gap that students and other practitioners have to bridge if they hope to understand the model entirely. This gap nevertheless has some positive effects, such as the simplification of the database development process for developers hoping to create database systems quickly.
References