Capstone Project- Statistics

The two variables chosen for the proposal are the amount of knowledge in mathematics and the hours of study. It is believed that the hours of study have an impact on the amount of knowledge in mathematics (Gauer & Jackson, 2018). Mathematics is essential in daily life because it has numerous applications. Understanding mathematics requires constant practice, which is reflected in the number of hours spent studying it. The hours of study are therefore expected to affect the amount of knowledge in mathematics. Statistics has a wide range of applications, so extensive knowledge of mathematics is essential for a broader specialization.


The target population of the research will be all college students at the university, from whom respondents will be selected randomly. College students are suitable respondents because they can report both their study time and their amount of knowledge. The ethical issues involved in the research will also be considered. The principal concern will be informed consent from the participants. Respondents are to be told in advance about the study’s aim. The identities and names of the respondents will be hidden to meet the university’s code of ethics. Anonymity will also enable the respondents to give honest answers, which will reduce bias in the gathered data. Confidentiality of the information will be paramount to increase the students’ confidence in answering the questions.

The hypothesis behind the survey is that the hours of study affect the amount of knowledge in mathematics. The following are the survey questions involved in the research:

  1. How many hours do you study mathematics in a day?
  2. What amount of knowledge in mathematics do you have?

Data and Descriptive Statistics

Descriptive statistics are summary coefficients for a given set of data. The data may represent a sample or the whole population involved in the analysis (Bickel & Lehmann, 2011). This type of statistics is useful for understanding and describing the features of a particular data set. The description is carried out through summary measures computed from the data. The standard categories of descriptive statistics include measures of central tendency, namely the mode, median, and mean. These are the primary examples used at all levels of statistics and mathematics.
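The three measures of central tendency named above can be illustrated with a minimal sketch. The study-hour values below are hypothetical and are not data from the survey; the sketch assumes Python's standard `statistics` module rather than the Excel spreadsheets planned for the research.

```python
import statistics

# Hypothetical daily study hours for seven respondents (illustrative only,
# not data collected in this research).
study_hours = [1, 2, 2, 3, 4, 2, 5]

# Mean: the arithmetic average of all values.
mean_hours = statistics.mean(study_hours)

# Median: the middle value once the data are sorted.
median_hours = statistics.median(study_hours)

# Mode: the most frequently occurring value.
mode_hours = statistics.mode(study_hours)

print(f"mean={mean_hours:.2f}, median={median_hours}, mode={mode_hours}")
```

With real survey responses, the same three calls would be applied to each variable separately.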

Sampling Method

The method of data collection intended for the research is the use of questionnaires among college students at the university. Respondents will be selected at random from the chosen sample population. The internet will also be used to issue the questionnaires, and some of the respondents are to be reached by telephone. Questionnaires will be issued randomly, but care will be taken to avoid bias or leaning towards one gender. The research is intended to take place at the university upon reporting back, and the submission is expected after about three months. A random sample can sometimes turn out, by chance, to be skewed towards a certain age or gender. Such bias leads to unreliable and invalid information and conclusions.
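The random selection of respondents described above could be sketched as follows. The roster of student identifiers, the sample size of 25, and the helper name `draw_sample` are all assumptions made for illustration; the research does not specify how the draw will actually be carried out.

```python
import random

def draw_sample(roster, k, seed=None):
    """Return k distinct respondents chosen uniformly at random from roster.

    A seed is accepted only so the draw can be reproduced when checking work.
    """
    rng = random.Random(seed)
    return rng.sample(roster, k)

# Hypothetical roster of 200 anonymized student identifiers.
roster = [f"student_{i:03d}" for i in range(1, 201)]

# Draw 25 respondents without replacement.
selected = draw_sample(roster, k=25, seed=42)
print(selected[:5])
```

Using anonymized identifiers rather than names is in keeping with the confidentiality requirement of the study.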


Data were collected using a survey of about twenty cases according to the set plans. To meet good ethical practice when surveying human subjects, it is crucial that participants consent to being surveyed and that their information is protected. The table below shows a sample layout of the data expected in the research. Tables, figures, and charts were used to represent the collected data for better comparison and conclusions.

Table 1. Sample data entry.

Participant | Variable 1: study hours | Variable 2: the amount of knowledge


A sample of about 25 respondents was deemed sufficient for accomplishing the set objectives of the research. Comparison and discussion were carried out to arrive at valid and reliable conclusions.

Variable 1- Descriptive Statistics

The first variable will be the number of hours spent studying mathematics. The subject of mathematics requires constant practice, and this is realized by the time spent studying. Different summaries will be computed, including the interquartile range, five-number summary, standard deviation, and mean. Excel spreadsheets will be used to plot histograms describing the distribution of the data. The shape of the data, together with a comparison of the median and mean, will then be discussed. The mean is often a good description of the center of the data because it uses every value, although it is sensitive to extreme values. The median is the middle number of the ordered data, and it may differ from the mean when the distribution is skewed.
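The summaries listed above can be sketched in a few lines. The data are hypothetical, the helper name `five_number_summary` is introduced only for this example, and the quartiles use the "inclusive" interpolation convention of Python's `statistics.quantiles`, which is one of several conventions and may give slightly different quartiles from other tools.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), and Q3.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

# Hypothetical daily study hours (illustrative only).
study_hours = [1, 2, 2, 3, 4, 5, 6, 8]

summary = five_number_summary(study_hours)
iqr = summary[3] - summary[1]            # interquartile range = Q3 - Q1
mean_hours = statistics.mean(study_hours)
spread = statistics.stdev(study_hours)   # sample standard deviation (n - 1)

print(summary, iqr, mean_hours, round(spread, 2))
```

Comparing `mean_hours` with the median in `summary[2]` gives a quick indication of skew: a mean well above the median suggests a longer right tail.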

Variable 2- Descriptive Statistics

The second variable will be the amount of knowledge in mathematics. The analysis involved here will be similar to that of variable one.


The research had a limited timeline and budget. The respondents were not pressured to give information about the topic. From the sample results above, it can be said that the amount of knowledge in mathematics increases as the hours of studying increase. Standard deviation will be used to describe the spread of the collected data.

Z-Score and Outliers

It is expected that the data collected will contain outliers in both variables. These may result from random errors, caused mainly by human error. Several data points will lie more than two standard deviations from the mean, which serves as the reference point (Altman, Iwanicz‐Drozdowska, Laitinen, & Suvas, 2017). The data points furthest from the mean will be identified and named as outliers. Usually, outliers are omitted from the final calculations and conclusions because they can cause erroneous results. Using Excel spreadsheets, various summary statistics will be determined, including the mean and standard deviation, followed by the Z-score. An outlier can lie below or above the mean calculated from the raw information recorded. The Z-score will also be used to determine whether values fall outside the fences for the variable involved.
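The Z-score check described above can be sketched as follows. The data are hypothetical, the helper names `z_scores` and `find_outliers` are introduced only for this example, and the sketch assumes the sample standard deviation and the cutoff of two standard deviations mentioned in the text.

```python
import statistics

def z_scores(values):
    """Return the Z-score of each value: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in values]

def find_outliers(values, threshold=2.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical daily study hours with one suspicious entry (illustrative only).
study_hours = [2, 3, 3, 4, 2, 3, 4, 3, 3, 12]

outliers = find_outliers(study_hours)
print(outliers)
```

A flagged value is only a candidate for removal. It should first be checked against the original questionnaire to see whether it is a recording error or a genuine response.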

Correlation and Linear Regression

Referring to part one of the assignment, the relationship that can be conceived between the two variables is that the amount of knowledge in mathematics depends on the hours spent studying mathematics. The fewer the study hours, the smaller the amount of knowledge in a given student or respondent. Linear regression will be used to analyze the quantitative variables mentioned above. The explanatory variable, in this case, will be the time spent studying mathematics, measured in hours, and it will be placed on the x-axis. The amount of knowledge in mathematics will be placed on the y-axis as the response variable because it depends on the time of study. When a student spends less time studying, the likelihood that the student increases the amount of knowledge in statistics is almost zero. The reverse is also expected to be true.

The level of knowledge is the dependent variable while the hours of study are the independent variable, and this is the reason for allocating each to its respective axis. With the quantitative data, different software will be used to record and plot various figures and charts for better interpretation. A scatterplot will be selected because it works well with linear regression in illustrating the strength, form, and direction of the relationship between the variables. The discussion is expected to use the appropriate statistical terminology. Using an Excel spreadsheet, the correlation will be determined together with the equation for linear regression. The expectation is that the observed correlation will agree with the formulated hypothesis. The number of data sources will be kept small to save time and to ensure the accuracy and precision of data and information.
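The correlation and regression equation described above can be sketched without a spreadsheet. The data are hypothetical, the 0 to 100 knowledge score is an assumption (the survey does not state a scale), and the helper names `pearson_r` and `fit_line` are introduced only for this example. The sketch uses the Pearson correlation coefficient and an ordinary least-squares line.

```python
import math

def _sums(x, y):
    """Return mean_x, mean_y, Sxy, Sxx, Syy for paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return mean_x, mean_y, sxy, sxx, syy

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    _, _, sxy, sxx, syy = _sums(x, y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y, sxy, sxx, _ = _sums(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired observations (illustrative only):
# x = daily study hours, y = knowledge score on an assumed 0-100 scale.
hours = [1, 2, 2, 3, 4, 5]
knowledge = [40, 55, 50, 65, 70, 85]

r = pearson_r(hours, knowledge)
slope, intercept = fit_line(hours, knowledge)
print(f"r={r:.3f}, knowledge = {intercept:.2f} + {slope:.2f} * hours")
```

The sign of `r` gives the direction of the relationship and its magnitude gives the strength, while `slope` estimates the change in knowledge score for each additional hour of study.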


The initial belief of the analysis will be tested using correlation and linear regression of the variables. The sources of data have been streamlined to reduce the chance of errors while maintaining the precision and accuracy of the raw data and of the results derived from it. The presence of outliers affects the precision of the data, which might harm the overall results. A large gap between the mean and an outlier may indicate that data were recorded or transferred incorrectly, which distorts the information. Hence, including outliers can make the discussion and conclusion invalid and unreliable. The sampling method used will be random sampling. This technique involves picking respondents by chance rather than by a deliberate selection strategy. Its limitation is that it can be time-consuming, and the resulting sample may still be biased.




