KEY-VALUE DATABASE DATA MODELLING
Abstract
One of the great benefits of emerging technologies is their ability to work as users expect. Earlier technologies met those expectations only to some extent. When relational databases first emerged, they appeared to be the best database model, until they were challenged by NoSQL databases. NoSQL database models are grouped into key-value models, document-based models, column-based models and graph-based models. Among the four, the key-value model is considered to be superior because of its many benefits compared to the rest. This paper focuses on key-value modelling. It starts with a systematic examination of data modelling as a concept in its own right, outlines how the process is carried out in databases and the two approaches involved, and explains the benefits of data modelling. The second part of the paper introduces its main subject, key-value database models. It presents the distinct benefits of this NoSQL database model, describes the data modelling process specific to it, outlines the unique features that set the key-value model apart from the other three (document-based, column-based and graph-based models), and finally discusses the current changes that have been incorporated into the key-value model to make it more efficient.
Key words: Data Modelling, NoSQL databases, HTML, JSON, XML, Key-Value model, Oracle
Introduction to database data modeling
Data modelling denotes the process of making sense of data by first defining and categorizing it, and then establishing standard definitions and descriptors so that it can be easily consumed by all the information systems in the same organization. Data modelling is performed for two main reasons: (1) as part of building an overall strategy for information systems within an organization and (2) as part of developing new databases. In general, the role of data modelling in strategic planning is to determine the kind of data needed for certain business processes, while in the analysis context it focuses more on describing the existing data and how to categorize it. In Big Data cases, data modelling requires the analyst to find the similarities between data obtained from disparate sources and to confirm that they describe the same thing. In either approach, however, the ultimate goal is to arrive at a data representation that can easily be replicated in a database architecture (Worboys, Hearnshaw, & Maguire, 2015). This paper specifically describes the key-value data modelling approach, which is a technique used to model data in a database.
How data modeling is done in a database
To come up with a database that precisely classifies the data in question, an in-depth understanding of the data types and their descriptors is of paramount importance, especially if the data was extracted from different sources. This is how an analyst arrives at interoperability in an environment involving multiple systems and multiple data sources. In arriving at a model suitable for architecting a certain database, choosing the methodology is the first step (Worboys, Hearnshaw, & Maguire, 2015), after which a series of data models can be produced, proceeding from business-oriented needs to technical needs. There are two common methodologies used in data modeling:
The bottom-up approach has proved more effective when an existing database is being re-engineered than when a new strategic approach is being developed. The model usually starts with an existing data structure, which can be an existing database, spreadsheets, forms or reports. In this approach, data may originate from other data tools and proprietary systems that were not originally engineered for data sharing or Big Data analytics; data engineers refer to this as “siloed” data. The existing data structures are foundational, and a data model is built from that foundation.
The top-down approach, on the other hand, is used for strategic data modeling, where no reference is made to previously existing systems. In this approach, data experts and subject-matter experts work together to outline the business requirements of the organization concerned and come up with logical data models that can support those requirements (Worboys, Hearnshaw, & Maguire, 2015).
In most cases, the two methods are mixed when creating database data models, by merging the outlined organizational requirements with the prevailing data structures, which are in some cases application-specific and must be extended to include emerging information systems. This is seen as a third methodology, also referred to as a mixed-method approach.
Benefits of Data Modeling in a database
Managing Data as a Resource
Modeling data makes it possible to normalize it and to define what it is and the attributes it accommodates. Data modeling also provides the tools needed to query the database and obtain reports (Livingstone, Manallack, & Tetko, 2017). Without a modelled database, an organization is likely to end up with a great deal of data and no efficient way, or no way at all, to use it. A modelled and well-designed database, by contrast, makes it easy for business users to access information.
Integration of the existing systems
Many organizations have their data in different systems that cannot communicate with each other as required. Modeling the data in each of these systems makes it easier to understand the relationships and redundancies in the data. Through that, discrepancies can be resolved and the disparate systems integrated so that they work together (Livingstone, Manallack, & Tetko, 2017).
Business Intelligence
With data modelling, an organization’s requirements are gathered in full and merged from different sources, which makes it possible to meet querying and reporting requirements. This opens up business intelligence opportunities that are very rare when data sits in silos in haphazardly designed databases. Business trends, spending patterns and predictions can be spotted easily, which can help a business navigate challenges and opportunities (Livingstone, Manallack, & Tetko, 2017).
KEY-VALUE DATABASE MODELS
Key-value databases, also referred to as key-value stores, are the most flexible form of NoSQL database. This type of database emerged as an alternative to the limitations of the relational databases of the past, where data was structured into tables and predefining a schema was compulsory. In these models, no schema is involved and data values are opaque to the database. In contrast to relational databases, the values under this model are identified and accessed through a key. The stored values can be numbers, strings, binaries, counters, HTML, JSON, XML, images, short videos and more. In regard to flexibility, this is the most flexible NoSQL model because the application has complete control over the stored value (Han, Haihong & Du, 2011).
In key-value databases, data is represented as a collection of key–value pairs. The key–value model is among the simplest non-trivial data models for a database, and richer data models are often implemented on top of it. These databases commonly expose RESTful APIs and protocol buffer interfaces for accessing data. In addition, key-value databases like Riak support some additional features: search functionality, a distributed search engine with a built-in query language; secondary indexes, which allow stored objects to be tagged with additional values and queried by exact match or range; and MapReduce, a non-key-based querying mechanism intended mainly for large datasets (Han, Haihong & Du, 2011).
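As a minimal sketch of the model described above, the toy store below keeps opaque byte values under string keys and leaves all interpretation of the value to the application. The class name `KeyValueStore`, its methods and the sample keys are illustrative assumptions, not the API of Riak or any other product.

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store; values are opaque bytes to the store."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


store = KeyValueStore()

# The application, not the store, decides how each value is encoded.
store.put("user:42", json.dumps({"name": "Ada", "plan": "pro"}).encode("utf-8"))
store.put("page:home", b"<html><body>Welcome</body></html>")

# Reading back: the store returns bytes, the application decodes them.
raw = store.get("user:42")
profile = json.loads(raw) if raw is not None else None
```

The store never inspects the JSON or HTML it holds, which is what "opaque" means here; a richer model (documents, tables) would be layered on top by code like the `json` calls above.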
BENEFITS OF KEY VALUE DATABASE MODELS
Currently, there are business applications that do not fit traditional relational databases. Such applications benefit from key-value models, which have a number of advantages, including (Travis, 2014):
Flexibility of data modeling: because a key-value database does not impose any data structure, data can be modelled flexibly to meet the overall requirements of the application.
High performance: key-value architectures do not need to perform the union, lock, join and other operations involved when working with objects in relational databases, which helps them outperform relational databases. Unlike relational databases, key-value databases do not search through columns and tables to get an object. When the key is known, locating the object is very fast.
Massive scalability: generally, the complexity of a database’s query engine corresponds to its scaling difficulty. Most relational databases have complicated query engines. In contrast, most key-value databases do not have query engines at all, since their lookup path can be traced directly from the request to where the object is located in memory. For that reason, most key-value databases are easier to scale than relational databases, especially distributed databases designed to run across different servers. Relational databases also face scalability limits that depend on where their relation indexes are stored, the amount of data in the system, the speed of the network within the distributed system, and several other factors. Key-value databases largely avoid such limits, since data relations are not calculated by the query engine.
High availability: some key-value databases use masterless distributed architectures, which eliminate the single point of failure and thus maximize the resiliency of the database. This makes these databases easier to operate and less complex than relational databases.
Data handling restrictions: the data handling restrictions evident in the column-family format have implications for how data is stored inside a system and the way query engines process requests. These restrictions and implications in turn affect the scaling profiles of the databases involved. Key-value databases are free from such restrictions and rely mainly on the application code to break down the data. As a result, key-value databases are easy to scale irrespective of the data types being stored. This is especially true of distributed databases.
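The performance point in the list above can be illustrated with a small comparison. This is only a sketch under simplifying assumptions: the "table" is an unindexed Python list that must be scanned, whereas real relational databases usually have indexes, and the names `rows`, `find_by_scan` and `find_by_key` are made up for the example.

```python
from typing import Dict, List, Optional, Tuple

# Table-like storage: rows in a list, located by scanning a column.
rows: List[Tuple[int, str]] = [(i, f"user-{i}") for i in range(1000)]


def find_by_scan(user_id: int) -> Optional[str]:
    """Walk the rows until the id column matches (cost grows with table size)."""
    for row_id, name in rows:
        if row_id == user_id:
            return name
    return None


# Key-value storage: the key leads straight to the value via a hash lookup.
by_key: Dict[int, str] = {row_id: name for row_id, name in rows}


def find_by_key(user_id: int) -> Optional[str]:
    """Single lookup whose cost does not depend on how many keys are stored."""
    return by_key.get(user_id)
```

Both functions return the same answer; the difference is that the keyed lookup does no searching, which is the property the list attributes to key-value stores.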
Data modeling process in key value databases
Best practices in key-value data modeling focus mainly on access patterns. Developers are therefore encouraged to approach the problem from the point of view of how the application fetches data. When data is written in the format required by the application that fetches it, the data model is considered nearly transparent; a good key-value data model “falls out” of the access patterns and this approach to design.
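A short sketch may make access-pattern-driven modeling concrete. Suppose, purely as an assumption for illustration, that the application has a screen showing a customer's three most recent orders. The value is then written in exactly the shape that screen reads, under a key derived from the customer id. The key format and function names below are hypothetical.

```python
import json
from typing import Dict, List

store: Dict[str, str] = {}  # stand-in for a key-value database


def order_summary_key(customer_id: int) -> str:
    """Key named after the access pattern, not after a table."""
    return f"customer:{customer_id}:recent_orders"


def record_order(customer_id: int, order_id: str, total: float, keep: int = 3) -> None:
    """On write, maintain the value in the form the reader needs."""
    key = order_summary_key(customer_id)
    recent: List[dict] = json.loads(store.get(key, "[]"))
    recent.insert(0, {"order_id": order_id, "total": total})
    store[key] = json.dumps(recent[:keep])


def recent_orders(customer_id: int) -> List[dict]:
    """On read, one key lookup returns the finished answer."""
    return json.loads(store.get(order_summary_key(customer_id), "[]"))


for n in range(1, 6):
    record_order(7, f"A{n}", 10.0 * n)
```

The design choice is to do the work at write time so that the read is a single fetch with no joining or sorting, which is the sense in which the model "falls out" of the access pattern.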
In key-value databases, modeling is expected to take place in the application layer. However, in NoSQL databases with more restrictive APIs, such as graph databases, which deal with both nodes and edges, modeling is recommended to take place within the database itself.
Besides access patterns, the design considerations are: whether the data should be encrypted, versioned or modified when it is persisted; whether it will be written or read more often; and whether it will be altered at all. Data that is never modified is called “immutable” data, and such data provides advantages to the system architecture (Travis, 2014).
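One way to combine the versioning and immutability considerations above is to write every update under a new versioned key and never overwrite an existing one. The following is a sketch under that assumption; the `#v<n>` key suffix and the `VersionedStore` class are invented for the example.

```python
from typing import Dict, Optional


class VersionedStore:
    """Each put creates a new immutable version instead of overwriting."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._latest: Dict[str, int] = {}

    def put(self, key: str, value: bytes) -> int:
        version = self._latest.get(key, 0) + 1
        self._data[f"{key}#v{version}"] = value  # written once, never changed
        self._latest[key] = version
        return version

    def get(self, key: str, version: Optional[int] = None) -> Optional[bytes]:
        if version is None:
            version = self._latest.get(key)  # default to the newest version
        if version is None:
            return None
        return self._data.get(f"{key}#v{version}")


docs = VersionedStore()
first = docs.put("contract:1", b"draft")
second = docs.put("contract:1", b"signed")
```

Because stored versions never change, they can be cached or copied to other servers without any coordination about later modifications, which is one of the architectural advantages the paragraph alludes to.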
What sets key-value databases apart from the rest of NoSQL databases?
Massive scalability
Generally, the complexity of a database’s query engine corresponds to its scaling difficulty. Most relational databases have complicated query engines. In contrast, most key-value databases do not have query engines at all, since their lookup path can be traced directly from the request to where the object is located in memory. For that reason, most key-value databases are easier to scale than relational databases, especially distributed databases designed to run across different servers. Relational databases also face scalability limits that depend on where their relation indexes are stored, the amount of data in the system, the speed of the network within the distributed system, and several other factors. Key-value databases largely avoid such limits, since data relations are not calculated by the query engine (Travis, 2014).
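The direct lookup path described above is what makes it straightforward to spread keys over several servers: the key alone determines where the value lives. The sketch below assumes simple hash-modulo placement and models each server as a dictionary. The node names are hypothetical, and production systems often use consistent hashing instead so that adding a server moves fewer keys.

```python
import hashlib
from typing import Dict, List, Optional


class PartitionedStore:
    """Routes each key to one of several nodes using a hash of the key."""

    def __init__(self, node_names: List[str]) -> None:
        self._names = list(node_names)
        self._nodes: Dict[str, Dict[str, bytes]] = {n: {} for n in self._names}

    def node_for(self, key: str) -> str:
        # A stable hash, so every client computes the same placement.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def put(self, key: str, value: bytes) -> None:
        self._nodes[self.node_for(key)][key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._nodes[self.node_for(key)].get(key)

    def sizes(self) -> Dict[str, int]:
        return {name: len(data) for name, data in self._nodes.items()}


cluster = PartitionedStore(["node-a", "node-b", "node-c"])
for i in range(100):
    cluster.put(f"user:{i}", str(i).encode("utf-8"))
```

No node needs to know about the data held by the others to answer a request, since no relations between keys are evaluated.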
Data handling restrictions
The data handling restrictions evident in the column-family format have implications for how data is stored inside a system and the way query engines process requests. These restrictions and implications in turn affect the scaling profiles of the databases involved. Key-value databases are free from such restrictions and rely mainly on the application code to break down the data. As a result, key-value databases are easy to scale irrespective of the data types being stored. This is especially true of distributed databases.
High performance
Key-value architectures do not need to perform the union, lock, join and other operations involved when working with objects in relational databases, which helps them outperform relational databases. Unlike relational databases, key-value databases do not search through columns and tables to get an object. When the key is known, locating the object is very fast (Travis, 2014).
High availability
Some key-value databases use masterless distributed architectures, which eliminate the single point of failure and thus maximize the resiliency of the database. This makes these databases easier to operate and less complex than relational databases (Travis, 2014).
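A heavily simplified sketch of the masterless idea follows. It assumes that every key is written to a fixed number of replica nodes chosen from the key's hash, and that a read may be served by any live replica. The class, method and node names are invented for illustration, and real systems add mechanisms this sketch leaves out, such as conflict resolution and repair of nodes that missed writes.

```python
import hashlib
from typing import Dict, List, Optional, Set


class ReplicatedStore:
    """No master node: any replica in a key's preference list can serve it."""

    def __init__(self, node_names: List[str], replicas: int = 2) -> None:
        if not 1 <= replicas <= len(node_names):
            raise ValueError("replicas must be between 1 and the node count")
        self._names = list(node_names)
        self._replicas = replicas
        self._nodes: Dict[str, Dict[str, bytes]] = {n: {} for n in self._names}
        self._down: Set[str] = set()

    def preference_list(self, key: str) -> List[str]:
        """Replica nodes for a key: a hash picks the start, neighbours follow."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self._names)
        count = len(self._names)
        return [self._names[(start + i) % count] for i in range(self._replicas)]

    def fail(self, name: str) -> None:
        """Simulate a node becoming unreachable."""
        self._down.add(name)

    def put(self, key: str, value: bytes) -> None:
        for name in self.preference_list(key):
            if name not in self._down:
                self._nodes[name][key] = value

    def get(self, key: str) -> Optional[bytes]:
        for name in self.preference_list(key):
            if name not in self._down and key in self._nodes[name]:
                return self._nodes[name][key]
        return None


replicated = ReplicatedStore(["node-a", "node-b", "node-c"], replicas=2)
replicated.put("cart:9", b"3 items")
replicated.fail(replicated.preference_list("cart:9")[0])  # lose the first replica
value_after_failure = replicated.get("cart:9")            # second replica answers
```

Losing one node does not make the key unavailable, because no single node is the sole owner of it.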
Flexibility of data modeling
Because a key-value database does not impose any data structure, data can be modelled flexibly to meet the overall requirements of the application.
Current changes in Key-Value database modelling
Initially, key-value database users were either very smart or cheap. The former were technologically shrewd enough to leverage the strengths of such a NoSQL store, whereas the latter had few options other than learning the considerably different terminology and technologies required to put it to work.
Consequently, a third type of consumer has emerged, one who is neither technologically astute nor financially strapped, yet needs to reap the benefits of combining NoSQL and relational technology. This need arises from constantly shifting data management approaches, which have recently been altered by the solidification of Big Data, the constant need for data and the consumerization of IT (Travis, 2014).
This third consumer has played a crucial role in the recent developments observed in the key-value database sphere, signaling a shift in the technology that has brought increased SQL access, transactional data support, greater integration capabilities and advanced computations.
According to Robert Greene, the principal product strategist for Oracle NoSQL database technologies, the change from a technology that appeared reserved for a chosen few to one that can be widely embraced by many data users was more than inevitable:
“For this stuff to become more widely adopted and moved into production, you have to get away from the small group of guys who were at or near the top of their class, and went into startups. It’s necessary to get it to the average every day Joe working for a basic Fortune 2000 company on a regular project, but he has to think about how to make it work and he might get confused, so the technology is evolving to take on familiar concepts and terminology.”
Non-Relational SQL: The Access Paradigm
Traditionally, key-value stores used range-based access within a specific key space that would return the values for the data. This technique was less precise than conventional declarative and imperative querying, which is why vendors have gradually extended data access (not data storage) support to conventional SQL-based querying (which some vendors do with a subset of SQL based on their own particular database) (DeCandia et al, 2017).
As a result, end users can not only exploit the traditional architectural benefits of speed, availability and scalability of key-value stores, but also query the data through a familiar relational language. Just as important is the fact that the expanded SQL access also allows for data modeling in a conventional tabular representation, which most SQL users are already comfortable with. Regardless of how the data is stored, the table metamodel overlays the storage architecture and greatly reduces the complexity of modeling data for any number of applications. Deploying this approach reduces time to production and results in more reliable application building.
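The idea of a table metamodel laid over key-value storage can be sketched as follows. The example assumes an ordered key space, stores each row as a JSON value under a `<table>/<primary key>` key, and answers a select with a range scan over that prefix plus a filter. `SortedKV`, `Table` and the key layout are assumptions made for illustration and do not reflect any vendor's SQL layer.

```python
import bisect
import json
from typing import Callable, Dict, Iterator, List, Tuple


class SortedKV:
    """Key-value store that keeps keys ordered so ranges can be scanned."""

    def __init__(self) -> None:
        self._keys: List[str] = []
        self._values: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start: str, end: str) -> Iterator[Tuple[str, str]]:
        """Yield pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


class Table:
    """Tabular view: named columns on top, plain key-value pairs underneath."""

    def __init__(self, kv: SortedKV, name: str, columns: List[str]) -> None:
        self._kv, self._name, self._columns = kv, name, set(columns)

    def insert(self, pk: str, **row: object) -> None:
        unknown = set(row) - self._columns
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        self._kv.put(f"{self._name}/{pk}", json.dumps(row))

    def select(self, where: Callable[[dict], bool] = lambda row: True) -> List[dict]:
        # "0" is the character after "/" in ASCII, so this range covers
        # exactly the keys that begin with "<table>/".
        pairs = self._kv.scan(f"{self._name}/", f"{self._name}0")
        return [row for row in (json.loads(v) for _, v in pairs) if where(row)]


kv = SortedKV()
users = Table(kv, "users", ["name", "age"])
users.insert("001", name="Ada", age=36)
users.insert("002", name="Bob", age=25)
users.insert("003", name="Cy", age=41)
kv.put("orders/001", json.dumps({"total": 12}))  # a different "table"

older = users.select(lambda row: row["age"] > 30)
```

The `select` call plays the role of a `SELECT ... WHERE age > 30` query: the user thinks in rows and columns while the storage underneath remains key-value pairs.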
Transactions
One of the more notable trends to affect key-value stores is increasing support for transactions and transactional data, which was typically limited in NoSQL options. However, a number of vendors have recently entered the marketplace announcing support for transactional data, heralding a development in which virtually all of the major players (Oracle, Cassandra, Google) have introduced options for transactional support (DeCandia et al, 2017).
This change in key-value stores is another case in which the technology for these databases is improving their usability, in this case by making it considerably easier for developers to program. Greater transaction support and the deployment of table metamodels for data modeling make application development much more practical for the enterprise, while the return to SQL access encourages a level of integration necessary for use cases where workloads require aggregation. In addition, offering support for transactions substantially broadens the workload capabilities of key-value stores, increasing their utility for users (DeCandia et al, 2017).
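What transactional support adds for the developer can be sketched with an all-or-nothing update across several keys. The example is a single-process illustration that stages writes on a private copy and applies them only if the whole unit of work succeeds. It makes no claim about how Oracle, Cassandra or Google implement transactions, and the names `TransactionalStore`, `transact` and `transfer` are hypothetical.

```python
from typing import Callable, Dict


class TransactionalStore:
    """Applies a group of key updates atomically: all of them or none."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> int:
        return self._data.get(key, 0)

    def transact(self, work: Callable[[Dict[str, int]], None]) -> bool:
        staged = dict(self._data)  # work on a private copy
        try:
            work(staged)
        except Exception:
            return False           # abort: the live data is untouched
        self._data = staged        # commit: every write becomes visible at once
        return True


def transfer(src: str, dst: str, amount: int) -> Callable[[Dict[str, int]], None]:
    """Build a unit of work that moves `amount` between two keys."""
    def work(data: Dict[str, int]) -> None:
        data[dst] = data.get(dst, 0) + amount        # this write is staged first
        if data.get(src, 0) < amount:
            raise ValueError("insufficient funds")   # staged write is discarded
        data[src] = data[src] - amount
    return work


bank = TransactionalStore()
bank.transact(lambda data: data.update(alice=100))
succeeded = bank.transact(transfer("alice", "bob", 30))
rejected = bank.transact(transfer("alice", "bob", 500))
```

Without such support, the application itself would have to undo the credit to `bob` when the debit fails, which is the kind of programming burden the paragraph says transactions remove.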
Cache Computations
The utility of key-value databases is also growing because of a greater emphasis on in-memory distributed computing. NoSQL databases can often cache data more effectively than many distributed cache systems because of NoSQL’s high level of reliability and consistency, attributed to their chief architectural benefits, which Greene says is responsible for their replacement of grid caching for many of Oracle’s customers. The recent emphasis on the distributed capabilities of NoSQL stores will likely result in more significant computation closer to where the data actually lives, which will lead to improved analytics, data collection, aggregation and computation processing (DeCandia et al, 2017).
Conclusion
Key-value databases are the most fundamental databases of all, because they are the simplest in representation and do not need query planners. For that reason, key-value databases form the base of most data platforms considered sophisticated. Data platforms are built for diverse use cases, such as fault-tolerant or highly available systems. If such data platforms are established on a robust foundation, the choice of data model becomes a matter of developer convenience rather than an operational tradeoff. Future data platforms are expected to be built upon key-value databases, with richer query languages, richer data models and APIs exposed over time.
References
Worboys, M. F., Hearnshaw, H. M., & Maguire, D. J. (2015). Object-oriented data modelling for spatial databases. International journal of geographical information system, 4(4), 369-383.
Livingstone, D. J., Manallack, D. T., & Tetko, I. V. (2017). Data modelling with neural networks: advantages and limitations. Journal of computer-aided molecular design, 11(2), 135-142.
Han, J., Haihong, E., Le, G., & Du, J. (2011, October). Survey on NoSQL database. In Pervasive computing and applications (ICPCA), 2011 6th international conference on (pp. 363-366). IEEE.
Travis, J. (2014). U.S. Patent No. 8,745,014. Washington, DC: U.S. Patent and Trademark Office.
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., … & Vogels, W. (2017, October). Dynamo: Amazon’s highly available key-value store. In ACM SIGOPS operating systems review (Vol. 41, No. 6, pp. 205-220). ACM.