
SPATIAL-BASED DATA ASSESSMENT


Student’s Name

 

 

Course Name

Professor

University

City and State

Date

 

 

 

Spatial-Based Data Assessment

Fitness for purpose

Introduction

Site selection is a fundamental process in determining the suitability of a geographical area to host a given facility. Researchers and developers use geographic information systems (G.I.S.) to analyze and evaluate spatial data so that decisions rest on an accurate representation of the relevant variables. Developing a functional G.I.S. application involves several steps: data sourcing, data capture and integration, data analysis, and the generation of usable output. The complexity of this process makes it prone to errors. This report, therefore, evaluates the quality of the data used in the site evaluation process for a bioenergy plant and points out the error issues related to data processing. It then concludes by suggesting improvements that are necessary to enhance the accuracy of the output of the G.I.S. application.

Criteria for Assessing Fitness for Purpose

Fitness for purpose refers to how easily and efficiently data can be interpreted and used in a way that meets the data needs of a specific project. In devising fitness-for-purpose criteria, a developer or researcher should first define the problem the data is meant to address and then identify the relevant variables (Hong and Huang 2017). The use of indicators makes this process easier, provided the researcher settles on indicators that meet the needs of the G.I.S. application. The site selection process for a bioenergy plant presents several problems, both social and geographical. In designing the criteria for data to be used in the development of the G.I.S. application, the environmental consultancy should therefore make several considerations.

The currency of the available data sets is the first criterion that applies to this case study. This criterion is especially important for attributes such as population density, which is likely to change over time due to migration. For this application, however, the currency of the population data set is outside the operator’s control, because the government determines the intervals at which the census is carried out. When data is readily available from many sources, it is easy to set a threshold for its currency (Daniels 2011). Therefore, the threshold for conservation data should be set to less than 10 years. This limit holds because new conservation areas are likely to have been designated within such a period.

The scale of a data set is the other criterion used to determine its fitness for the application. The scale determines the accuracy of the data, especially in terms of how various errors affect the output. To get accurate output, the scale of the source data should be larger than 1:10000. This threshold is justified because the area that the site selection process covers is large.

The data content is also crucial in determining the fitness of the data used for this G.I.S. application. The content should directly help in generating a solution to the problem that the application seeks to solve.

Moreover, it is crucial to consider the positional accuracy of the data used in the G.I.S. application. The data used in maps to represent a geographical phenomenon should be the nearest possible representation of its exact physical position. To be fit for purpose in terms of positional accuracy (Daniels 2011), the maps used in this application should employ at least a two-dimensional orthogonal coordinate system. This positional accuracy would minimize the effects of errors during data capture and integration.

Semantic accuracy should also be taken into account when determining the fitness of data for this application. This aspect is relevant because some of the data sets provided are detailed enough to be examined for this quality.

Finally, it is vital to examine the data intended for use in this application for repeatability and completeness. The operator should inspect the data for possible gaps and unclassified areas.
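The currency and scale thresholds described above can be expressed as a simple metadata check. The following is a minimal Python sketch, not part of the case study: the function name `check_fitness`, its parameters, and the example years are hypothetical, and only the two thresholds (an age of less than 10 years, stated above for conservation data, and a scale of about 1:10000 or larger) come from the criteria.

```python
def check_fitness(collected_year, scale_denominator, assessment_year,
                  max_age_years=10, max_scale_denominator=10000):
    """Check one data set against the currency and scale thresholds.

    A smaller scale denominator means a larger, more detailed map scale,
    so 1:5000 passes a 1:10000 threshold while 1:25000 does not.
    Treating exactly 1:10000 as acceptable is an assumption of this sketch.
    """
    age_years = assessment_year - collected_year
    return {
        "currency": age_years < max_age_years,
        "scale": scale_denominator <= max_scale_denominator,
    }


# Hypothetical metadata: a data set updated in 2018 at 1:25000, assessed in 2020.
print(check_fitness(2018, 25000, 2020))
```

Returning one flag per criterion, rather than a single pass or fail, keeps it visible which threshold a data set misses.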

Assessment of Available Data

The environmental consultancy firm’s G.I.S. application needs to provide accurate output. As such, the data used should reflect current spatial information. This need raises the question of whether the elevation data set collected in 2000 is fit for purpose. It is important to note that NASA flew the Shuttle Radar Topography Mission in 2000, and it provides the most recent digital elevation model. As such, the elevation data set used is current. The available population data is based on the most recent U.K. census, although it was collected more than ten years ago. The information on road access was last updated in 2018. Although road access is unlikely to have changed significantly since then, it is noteworthy that more current road access data is available. The data set is nevertheless fit for purpose, since the error that may arise from the time difference is negligible. The conservation data provided is up to date. Overall, the currency of the data sets used in the G.I.S. application is acceptable; hence, in terms of currency, the data can be used to achieve accurate results.

In the bioenergy plant case study, the scale of most of the data sets meets the scale criterion and can provide accurate outputs and projections. However, the 1:25000 scale of the accessibility data set is coarser than the threshold. This scale jeopardizes the quality of the output of the G.I.S. application, since it leads to a larger positional error. The resultant raster grid may also be inaccurate because of the large cell size. The table below shows the relationship between positional error and map scale.

Table 1. Map scale and permitted positional error

Scale                    1:5000   1:10000   1:25000   1:50000   1:100000
Spatial resolution (m)   25       50        125       250       500
Permitted RMSE (m)       10       20        50        100       125
Permitted Emax (m)       20       40        100       200       250

(Adapted from Chen and Zhou 2013)
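The values in Table 1 can also be used as a lookup when judging whether a data set's measured error is acceptable at a given scale. The sketch below is illustrative only: the figures are transcribed from the table, while the dictionary layout and the function name `scales_supported` are hypothetical.

```python
# Values transcribed from Table 1 (adapted from Chen and Zhou 2013).
# Keys are scale denominators; all values are in metres.
PERMITTED_ERROR = {
    5000: {"resolution": 25, "rmse": 10, "emax": 20},
    10000: {"resolution": 50, "rmse": 20, "emax": 40},
    25000: {"resolution": 125, "rmse": 50, "emax": 100},
    50000: {"resolution": 250, "rmse": 100, "emax": 200},
    100000: {"resolution": 500, "rmse": 125, "emax": 250},
}


def scales_supported(measured_rmse):
    """Return the scale denominators whose permitted RMSE covers the measured RMSE."""
    return [
        denominator
        for denominator, limits in sorted(PERMITTED_ERROR.items())
        if measured_rmse <= limits["rmse"]
    ]


# A hypothetical data set with a measured RMSE of 30 m.
print(scales_supported(30))
```

With a measured RMSE of 30 m, the function returns the 1:25000, 1:50000, and 1:100000 scales, which mirrors the point that coarser scales tolerate a larger positional error.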

Moreover, because the differences in scale between the data sets are not large, it will be easy to integrate them into the G.I.S. application. Furthermore, the area needed for the plant is relatively large; hence the scales used in the data sets can be relied on to produce accurate outputs from the G.I.S. application.

The content of the data should be relevant to the problem that necessitates the G.I.S. application. In the case of the bioenergy plant, the data should help in finding areas that meet the selection criteria outlined for the site. Therefore, to be fit for purpose, the data used in this project should help in calculating population density, evaluating road access, and assessing the environmental impact on conservation areas. Notably, the contents of the data sets used in this case study are relevant and meet the criteria set out for site selection.

Positional accuracy represents the nearness of the values of a geographical phenomenon in a coordinate system to the actual position of the phenomenon. This criterion, therefore, applies mainly to the data sets representing conservation and terrain. Both data sets are positionally accurate since they use a two-dimensional orthogonal coordinate system (Robinson, Webber, and Eifrem 2015). However, the conservation XY coordinates need to be entered manually during data integration. The resultant data may, therefore, be prone to errors and may not give an accurate output.
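One way to quantify the risk from manually entered coordinates is to compare a sample of the entered points with reference positions and compute the root mean square error (RMSE). The sketch below assumes paired (x, y) coordinates in the same projected coordinate system; the function name and the sample points are hypothetical.

```python
import math


def positional_rmse(recorded, reference):
    """RMSE of the horizontal offset between paired (x, y) points, in map units."""
    if not recorded or len(recorded) != len(reference):
        raise ValueError("both lists must be non-empty and of equal length")
    squared_offsets = [
        (x_rec - x_ref) ** 2 + (y_rec - y_ref) ** 2
        for (x_rec, y_rec), (x_ref, y_ref) in zip(recorded, reference)
    ]
    return math.sqrt(sum(squared_offsets) / len(squared_offsets))


# Hypothetical manually entered points and their reference positions (metres).
entered = [(1003.0, 2004.0), (5000.0, 7000.0)]
surveyed = [(1000.0, 2000.0), (5000.0, 7000.0)]
print(positional_rmse(entered, surveyed))
```

The result can then be compared with the permitted RMSE for the working scale.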

Considering the data sets provided, the information contained in them meets the completeness criterion. This is because the attributes of the data relate directly to the standards used in the site assessment, so the resultant G.I.S. application output can be expected to be accurate. For instance, the attributes of the conservation data set, such as National Reserve boundaries, bear directly on whether the chosen site will meet environmental conservation standards, which shows that it is a complete data set. The population census figures are likewise directly connected to population density. Moreover, it is easier to determine the completeness of the data sets because none of them offers time-lapse data.

Furthermore, the fitness-for-purpose assessment should take into account the semantic accuracy of the data sets presented. The data sets provided in the bioenergy plant case study are semantically accurate because the map formats are detailed. For example, the D.E.M. provided for the elevation data set is in 3D, so the environmental consultancy can derive accurate semantic meaning from it. The population density data set is also semantically accurate since it is based on Lower Layer Super Output Areas (LSOAs), which are small census units and hence give detailed data.

To determine whether the data sets provided are fit for purpose, the environmental consultancy firm should also check for repeatability. The data sets provided meet this criterion. However, it is also essential to consider repeatability in connection with the output data (Baptista et al. 2014). A study of the data lineage in the G.I.S. application suggests that the processes involved in data integration and analysis would not compromise the repeatability of the output. The conservation XY coordinate data may nonetheless be a risk area, because it involves the manual input of coordinates.

 

Errors

G.I.S. applications are usually complicated due to the numerous processes involved, which makes them prone to errors. These errors are either external or internal. External errors result from a G.I.S. user implementing a wrong G.I.S. operation or using erroneous G.I.S. data. Internal G.I.S. errors, on the other hand, are those that are not directly influenced by the user; they relate to aspects of data quality. It is important to note that internal errors can aggravate small external mistakes through error propagation. This propagation results from the cyclic nature of data production: the data used in a G.I.S. go through the same production process as the data produced by the G.I.S. application. Therefore, errors present in the data incorporated in the G.I.S., or introduced while creating the input data, will affect the quality of the output of the G.I.S. (Connolly and Begg 2015). Moreover, it is impossible to represent reality perfectly through conceptual and mathematical models. This section of the report identifies and evaluates the errors that can occur in the G.I.S. steps followed throughout the bioenergy plant case study.
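Error propagation can be illustrated with a common simplification: if the errors introduced at each stage are assumed to be independent, their RMSE values combine as the square root of the sum of squares. The Python sketch below applies that assumption; the stage names and error values are hypothetical and do not come from the case study.

```python
import math


def combined_rmse(stage_rmses):
    """Combine per-stage RMSE values, assuming the stage errors are independent."""
    return math.sqrt(sum(rmse ** 2 for rmse in stage_rmses))


# Hypothetical per-stage errors in metres: source data, digitizing, rasterizing.
stages = {"source": 20.0, "digitizing": 10.0, "rasterizing": 25.0}
print(round(combined_rmse(stages.values()), 1))
```

Under this assumption the combined error is always at least as large as the largest single-stage error, which is why a small mistake early in the data stream cannot be removed by careful work at later stages.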

Understanding the data stream is critical to understanding the mistakes in G.I.S. applications. Because G.I.S. development is a multi-stage process, errors can be introduced at any stage. For example, after data is captured, it needs to be manipulated and edited. Even if a researcher avoids errors during manipulation, an undetected error may already have crept in during data capture.

Two types of errors may be introduced with the source data in the bioenergy case study. One is the sampling or measurement error that can occur while the source data is produced (Shi, Wu, and Stein 2016). In the case of D.E.M.s, the spatial resolution used to generate a source data set can hugely influence the margin of error associated with the data.

The figure below illustrates this.

 

 

 

Fig. 1. Comparison of three-dimensional perspective views using generalized D.E.M.s at spatial resolutions of (a) 90 m, (b) 125 m, and (c) 250 m, from left to right (adapted from Chen and Zhou 2013).

For example, in producing the D.E.M. used in the elevation data set, NASA could have used sparse grids; hence the final representation of the terrain could be less detailed. However, the effect of this error may not be large due to the scale used.
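The loss of terrain detail at coarser resolutions can be demonstrated on a toy grid. The sketch below coarsens a small elevation grid by averaging square blocks of cells. Block averaging is only one possible generalization method and is an assumption here, not the method used to produce the case study D.E.M.; the elevation values are hypothetical.

```python
def aggregate_dem(dem, factor):
    """Coarsen a D.E.M. (list of equal-length rows) by averaging factor x factor blocks."""
    rows, cols = len(dem), len(dem[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [dem[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def relief(dem):
    """Difference between the highest and lowest elevation in the grid."""
    values = [value for row in dem for value in row]
    return max(values) - min(values)


# Hypothetical 4 x 4 elevation grid (metres) with two isolated peaks.
fine = [
    [100, 100, 100, 100],
    [100, 180, 100, 100],
    [100, 100, 100, 100],
    [100, 100, 100, 140],
]
coarse = aggregate_dem(fine, 2)
print(relief(fine), relief(coarse))
```

In this example the relief drops from 80 m on the fine grid to 20 m on the coarse grid, because each peak is averaged with its flat neighbours.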

The source data for the G.I.S. application may also contain digitizing errors. These errors occur in the process of capturing data in a format that is used inside the G.I.S. In the bioenergy plant G.I.S. application, such errors could recur. When producing the digital maps used as data sources, the manual digitizers may have made both psychological and physiological errors. However, these errors could result in consistent discrepancies between the digital and physical maps and hence not have a considerable effect on the final output of the G.I.S. application.

In the process of data capture and integration for the bioenergy plant site selection, rasterizing errors may occur in the G.I.S. application. These may be positional or attribute errors, and they may shift the positions of features. Besides, during rasterization, connectivity may be lost, or created in the wrong places (Rajabifard 2020). The resultant effect would be the loss of critical information in terms of boundaries. Moreover, the person performing the rasterization may place two grids over the same vector map at different angles. This error may, however, not occur in this case study because almost all the source data used in this G.I.S. application was generated using the same map projection system. To minimize the effects of the errors introduced through data integration, data cleaning and editing are vital.
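The effect of cell size on rasterizing error can be shown with a small example. The sketch below rasterizes an axis-aligned rectangle on a square grid anchored at the origin and assigns a cell to the rectangle when the cell centre falls inside it. The cell-centre rule, the rectangle, and the cell sizes are assumptions chosen for illustration.

```python
def rasterized_area(x_min, y_min, x_max, y_max, cell_size, grid_cells):
    """Area of an axis-aligned rectangle after rasterizing on a square grid.

    The grid starts at the origin and has grid_cells cells per side.
    A cell counts as part of the rectangle when its centre lies inside it.
    """
    count = 0
    for row in range(grid_cells):
        for col in range(grid_cells):
            centre_x = (col + 0.5) * cell_size
            centre_y = (row + 0.5) * cell_size
            if x_min <= centre_x < x_max and y_min <= centre_y < y_max:
                count += 1
    return count * cell_size ** 2


# A hypothetical 25 m x 25 m parcel with a true area of 625 square metres.
true_area = 25 * 25
coarse_area = rasterized_area(0, 0, 25, 25, cell_size=10, grid_cells=5)
fine_area = rasterized_area(0, 0, 25, 25, cell_size=5, grid_cells=10)
print(true_area, coarse_area, fine_area)
```

With 10 m cells the parcel is represented by four cells (400 square metres), so part of its boundary is lost; with 5 m cells, which align with the parcel edges, the area is preserved.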

It is also important to note that data editing and integration may themselves be a source of errors. In this case study, the data layers are large, and this volume of data is itself a risk factor for mistakes during the data editing process. However, since the source data is mainly in vector form, the operators can eliminate or minimise these errors by specifying the tolerances for the operation.

Another error that may be present in the G.I.S. is an error in map overlay. In this case study, there is a high risk of this error occurring since large data sets are involved. For instance, there are two conservation maps whose features are identical save for the boundaries. In this case, sliver polygons are likely to arise. This error results mainly from the multiplier effect of errors produced during data capture and integration.
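Sliver polygons produced by an overlay are typically long and thin, so one simple way to flag them is a compactness measure that compares a polygon's area with its perimeter. The sketch below uses the ratio 4πA/P², which equals 1 for a circle and approaches 0 for a thin strip. The threshold of 0.1 and the example polygons are hypothetical and would need tuning for real data.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices):
    """Sum of the edge lengths of the polygon."""
    return sum(
        math.dist(vertices[i], vertices[(i + 1) % len(vertices)])
        for i in range(len(vertices))
    )


def is_sliver(vertices, max_compactness=0.1):
    """Flag a polygon as a sliver when its compactness 4*pi*A/P^2 is very low."""
    perimeter = polygon_perimeter(vertices)
    if perimeter == 0:
        return True  # degenerate polygon with no extent
    compactness = 4 * math.pi * polygon_area(vertices) / perimeter ** 2
    return compactness < max_compactness


# Hypothetical overlay results: a 100 m x 1 m strip and a 10 m x 10 m square.
strip = [(0, 0), (100, 0), (100, 1), (0, 1)]
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(is_sliver(strip), is_sliver(square))
```

Both polygons have the same area of 100 square metres, but only the strip is flagged, which is why area alone is not a reliable sliver test.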

Conclusion

In conclusion, the process of determining the fitness for purpose of the data used in the G.I.S. application is detailed and complex. It involves devising criteria for assessing the data and ensuring that these criteria are relevant to the needs the G.I.S. application should address. Therefore, the operator should not only examine the data for the various vital components but also close the simple loopholes that are usually sources of error in the G.I.S. application design process.

 

 

 

 

 

 

 

 

 

 

 

Reference List

Chen, Y. and Zhou, Q. (2013) ‘A scale-adaptive DEM for multi-scale terrain analysis’, International Journal of Geographical Information Science, 27(7), pp. 1329–1348.

Connolly, T. and Begg, C. (2015) Database Systems: A Practical Approach to Design, Implementation and Management. 6th ed. Harlow, England: Pearson Education.

Daniels, J. A. (2011) Advances in Environmental Research. Volume 10. New York: Nova Science Publishers. http://site.ebrary.com/id/10661590.

De Souza Baptista, C., Pires, C. E. S., Leite, D. F. B., and de Oliveira, M. G. (2014) ‘NoSQL geographic databases: an overview’, in Pourabbas, E. (ed.) Geographical Information Systems: Trends and Technologies. Boca Raton, Florida: CRC Press, pp. 73–103.

Hong, J. and Huang, M. (2017) ‘Enabling smart data selection based on data completeness measures: a quality-aware approach’, International Journal of Geographical Information Science, 31(6), pp. 1178–1197. doi: 10.1080/13658816.2017.1290251

Rajabifard, A. (2020) Sustainable Development Goals Connectivity Dilemma: Land and Geospatial Information for Urban and Rural Resilience. Boca Raton, New York: CRC Press.

Robinson, I., Webber, J., and Eifrem, E. (2015) Graph Databases: New Opportunities for Connected Data. 2nd ed. O’Reilly.

Shi, W., Wu, B., and Stein, A. (2016) Uncertainty Modeling and Quality Control for Spatial Data.

 

 

