Accounting solutions to questions

Question 2

ETL (extract, transform, load) is the process by which data is extracted from source systems that are not optimized for analytics and then transferred to a central host (Niinimäki and Niemi, 2009). It has three stages, after which the data is ready for analysis. Extraction pulls the data from one or more databases; the sources may differ from one another. Transformation converts the data into a common form and stages it in another database: the data is standardized, cleansed, transposed and given surrogate keys. Loading writes the transformed data into a target data warehouse.

Extraction

The raw data below shows United States federal spending for fiscal year 2020. It was extracted from a United States federal government website. The values are in $ billions.

Pensions- 1156.8, Health Care- 1340.2, Education- 209.7, Defense- 997.9, Protection- 47.0, Welfare- 368.5, General Government- 63.7, Interest- 376.2, Transportation- 101.6, Other spending- 128.3, Total spending- 4789.8
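The extraction step can be illustrated with a short sketch. It assumes the raw data arrives as a single line of text in the "Name- value" format shown above; the function name `extract` and the variable names are hypothetical and chosen only for illustration.

```python
# Raw text as it appears in the source (values in $ billions).
RAW = (
    "Pensions- 1156.8, Health Care- 1340.2, Education- 209.7, "
    "Defense- 997.9, Protection- 47.0, Welfare- 368.5, "
    "General Government- 63.7, Interest- 376.2, Transportation- 101.6, "
    "Other spending- 128.3, Total spending- 4789.8"
)


def extract(raw):
    """Parse comma-separated 'Name- value' pairs into a dict of floats."""
    records = {}
    for pair in raw.split(","):
        # Split on the last hyphen so the name keeps any inner spaces.
        name, value = pair.rsplit("-", 1)
        records[name.strip()] = float(value)
    return records


spending = extract(RAW)
# Separate the reported total from the individual categories.
reported_total = spending.pop("Total spending")
category_sum = sum(spending.values())

print(len(spending), "categories")
print("sum of categories:", round(category_sum, 1))
print("reported total:", reported_total)
```

The ten categories add up to 4789.9, which is 0.1 above the reported total of 4789.8; a gap of that size is consistent with rounding in the published figures.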

Transformation

Transformation arranges the data in a consistent format in a database so that it is easier to analyze. Here, the values above are rounded to the nearest whole number and placed in a table.

Item	Expenditure in $ billions
Pensions	1157
Health care	1340
Education	210
Defense	998
Welfare	369
Protection	47
Transportation	102
General government	64
Other spending	128
Interest	376
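The rounding used to build the table can be reproduced with a minimal sketch. One assumption is worth stating: the table rounds 368.5 up to 369, which is "round half up". Python's built-in `round` rounds halves to the nearest even number and would give 368, so the sketch uses the `decimal` module instead. The helper name `round_half_up` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Raw values from the extraction step, in $ billions.
raw_spending = {
    "Pensions": 1156.8,
    "Health care": 1340.2,
    "Education": 209.7,
    "Defense": 997.9,
    "Welfare": 368.5,
    "Protection": 47.0,
    "Transportation": 101.6,
    "General government": 63.7,
    "Other spending": 128.3,
    "Interest": 376.2,
}


def round_half_up(value):
    """Round to the nearest whole number, with .5 always rounding up."""
    # Going through str() avoids binary floating-point noise in Decimal.
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


table = {item: round_half_up(amount) for item, amount in raw_spending.items()}

print("Item\tExpenditure in $ billions")
for item, amount in table.items():
    print(f"{item}\t{amount}")
```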

Loading

The transformed data can then be loaded into a target such as a pie chart. From the pie chart, further analysis can be done, including calculating each item's percentage of total spending.
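The percentage calculation mentioned above can be sketched as follows. This is an illustration only: it assumes the percentages are taken against the sum of the rounded table values rather than against the reported total, and the variable names are hypothetical.

```python
# Rounded values from the transformation table, in $ billions.
spending = {
    "Pensions": 1157,
    "Health care": 1340,
    "Education": 210,
    "Defense": 998,
    "Welfare": 369,
    "Protection": 47,
    "Transportation": 102,
    "General government": 64,
    "Other spending": 128,
    "Interest": 376,
}

total = sum(spending.values())

# Each item's share of total spending, as a percentage.
shares = {item: 100 * amount / total for item, amount in spending.items()}

# Print the largest slices first, as they would dominate a pie chart.
for item, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item:<20}{pct:5.1f}%")
```

Sorting by share makes it easy to see that health care and pensions are the two largest slices.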

Question 3

This question involves analyzing a dataset using both supervised and unsupervised approaches. A supervised approach uses an algorithm that learns from known input-output pairs in order to predict the output for a new input (Gogoi et al., 2010). In an unsupervised approach, there is no output variable to predict.

Below is a data set that illustrates the supervised approach:

x	2	3	4	5
y	5	7	9	11

The algorithm is a linear regression of the form y = ax + c,

where a = 2 and c = 1.

To check the model against the table, take x = 5:

y = (2 × 5) + 1 = 11

For the next input, x = 6, the model predicts y = (2 × 6) + 1 = 13.

The algorithm, which is the equation of a straight line, predicts the output for any given input.
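The values a = 2 and c = 1 can be recovered from the data set with an ordinary least-squares fit. The sketch below is one way to do this using only the standard library; the function names `fit_line` and `predict` are hypothetical, and the source does not say how its coefficients were obtained.

```python
# Supervised data set from the table above.
xs = [2, 3, 4, 5]
ys = [5, 7, 9, 11]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + c; returns (a, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    c = mean_y - a * mean_x
    return a, c


a, c = fit_line(xs, ys)


def predict(x):
    """Predict y for a new input x using the fitted line."""
    return a * x + c


print("a =", a, "c =", c)
print("prediction for x = 6:", predict(6))
```

Because the four points lie exactly on a line, the fit reproduces every y value in the table and extends naturally to inputs outside it.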

An unsupervised approach has no output variable, so instead of predicting values it looks for structure in the data itself. Clustering is one such technique. Consider the data below:

Age of students in a class in years: 30, 24, 28, 30, 28, 28, 26, 27, 24, 22, 22, 22, 21, 21, 29, 26, 24, 28, 29, 28, 29, 27, 21

The data can be grouped into clusters for easier analysis, as shown below:

Age clusters

Cluster	21-22	23-24	25-26	27-28	29-30
Number of students	6	3	2	7	5

There are five clusters, each covering two consecutive ages.
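The counts in the table can be reproduced with a short sketch. Note the assumption: the grouping here is fixed-width binning into two-year ranges starting at age 21, which matches the table, rather than a learned clustering algorithm such as k-means. The helper name `cluster_label` is hypothetical.

```python
from collections import Counter

# Ages of the students in the class, in years.
ages = [30, 24, 28, 30, 28, 28, 26, 27, 24, 22, 22, 22,
        21, 21, 29, 26, 24, 28, 29, 28, 29, 27, 21]


def cluster_label(age, start=21, width=2):
    """Return the label of the fixed-width age range containing `age`."""
    low = start + ((age - start) // width) * width
    return f"{low}-{low + width - 1}"


# Count how many students fall into each age range.
counts = Counter(cluster_label(age) for age in ages)

for label in sorted(counts):
    print(label, counts[label])
```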

Question 4

This question involves visualizing the results of both analyses (supervised and unsupervised) and justifying the choice of chart. The two data sets above can be visualized in various ways. For the supervised approach, a straight-line graph is appropriate because the algorithm is the equation of a straight line, y = ax + c, where a is the gradient and c is the y-intercept. The graph plots the four (x, y) points from the table, all of which lie on the line y = 2x + 1.

The unsupervised result can be visualized with a bar chart, with one bar per age cluster. A bar chart suits this data because the height of each bar shows the number of students in a cluster at a glance, which makes the clusters easy to compare.
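As a rough illustration of the bar chart, the sketch below draws the cluster counts as a text chart using only the standard library. It is a stand-in for a graphical chart, not the figure from the original answer, and the function name `text_bar_chart` is hypothetical.

```python
# Number of students per age cluster, from the clustering table.
cluster_counts = {
    "21-22": 6,
    "23-24": 3,
    "25-26": 2,
    "27-28": 7,
    "29-30": 5,
}


def text_bar_chart(counts, symbol="#"):
    """Build a horizontal bar chart with one row per cluster."""
    lines = []
    for label, n in counts.items():
        # Bar length equals the number of students in the cluster.
        lines.append(f"{label} | {symbol * n} {n}")
    return "\n".join(lines)


chart = text_bar_chart(cluster_counts)
print(chart)
```

Even in text form, the longest bars make it clear that the 27-28 and 21-22 clusters hold the most students.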

References

Gogoi, P., Borah, B. and Bhattacharyya, D.K., 2010. Anomaly detection analysis of intrusion data using supervised & unsupervised approach. Journal of Convergence Information Technology, 5(1), pp.95-110.

Niinimäki, M. and Niemi, T., 2009. An ETL process for OLAP using RDF/OWL ontologies. In Journal on Data Semantics XIII (pp. 97-119). Springer, Berlin, Heidelberg.
