Descriptive Vs. Inferential Statistics

Descriptive statistics describe a data set, while inferential statistics draw inferences from a sample about a larger group. This article lists the differences between the two, with examples.

The Basics
A statistical study requires a population, the group from which information is gathered. This information is called data.
Statistics is a major branch of mathematics that deals with the study, classification, analysis, and representation of data. It encompasses many methodologies and procedures for data research and analysis, which are used in scientific, mathematical, industrial, and even societal applications. One important concept is the null hypothesis, a statement that there is no relationship between two quantities. If the null hypothesis is rejected, a contradictory statement, called the alternative hypothesis, is taken into consideration. In general, statistics comprises two major branches for data analysis: descriptive and inferential statistics.
Descriptive Statistics
  • In descriptive statistics, the data set at hand is described accurately and in full; complete information is provided about that particular data set.
  • That said, the data can only be described; the description applies to this data set alone and cannot be assumed to hold for a similar larger group.
  • Graphical elements are used to describe this data set as well; visual representation helps us understand the data better.
  • Consider a simple example of descriptive statistics. Assume that there are 70 students in a class, and their marks in 5 subjects have to be displayed.
  • This data can be presented in a number of ways. The marks can be listed down from highest to lowest, for each subject, and the students can be categorized accordingly.
  • Or, the subjects can be ranked as per importance, and the students' marks can be presented from highest to lowest.
  • The average scores in every subject can be listed. Or, the overall average score among the students can be calculated and presented. Thus, the data set mentioned above can be described in a variety of ways: subject-wise, student-wise, score-wise - from highest to lowest.
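The presentations described above can be sketched in a few lines of Python. This is a minimal illustration only: the student names, subjects, and marks are hypothetical stand-ins for the 70-student class, and the helper names (`ranked_by_subject`, `subject_averages`, `overall_average`) are invented for this sketch.

```python
from statistics import mean

# Hypothetical marks for four students in three subjects
# (a small stand-in for the class of 70 students and 5 subjects).
marks = {
    "Asha":  {"Math": 82, "Science": 74, "History": 91},
    "Ben":   {"Math": 65, "Science": 88, "History": 70},
    "Chloe": {"Math": 93, "Science": 79, "History": 85},
    "Dev":   {"Math": 71, "Science": 60, "History": 77},
}
subjects = ["Math", "Science", "History"]


def ranked_by_subject(marks, subject):
    """Return (student, mark) pairs for one subject, highest mark first."""
    pairs = [(name, scores[subject]) for name, scores in marks.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def subject_averages(marks, subjects):
    """Return the average mark in each subject."""
    return {s: mean(scores[s] for scores in marks.values()) for s in subjects}


def overall_average(marks):
    """Return the average over every mark of every student."""
    return mean(m for scores in marks.values() for m in scores.values())


for subject in subjects:
    print(subject, ranked_by_subject(marks, subject))
print("Subject averages:", subject_averages(marks, subjects))
print("Overall average:", round(overall_average(marks), 2))
```

Each function only rearranges or summarizes the marks that were collected; none of them says anything about students outside this class, which is what makes the exercise descriptive.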
Inferential Statistics
  • Inferential statistics involves studying a sample of data; the term implies that information has to be inferred from the presented data.
  • A sample of the data is considered, studied, and analyzed.
  • Unlike descriptive statistics, this data analysis can extend to a similar larger group and can be visually represented by means of graphic elements.
  • Consider a country's population. For the sake of convenience, a smaller sample of the population is considered, results are drawn, and the analysis is extended to the larger data set.
  • Assume that you want to find out if the citizens in a state like a particular author. In such a case, data is collected from samples of people in several cities, the samples are described graphically, and conclusions are drawn.
  • If 45% of the people sampled in 5 of the state's 8 cities say they like that particular author, you can, to an extent, assume that about 45% of the state population likes that author.
  • You have to make use of certain other methods as well, to reach a slightly more reliable conclusion.
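One such method is to attach a margin of error to the sample percentage. The sketch below assumes a hypothetical simple random sample of 1,000 people, 450 of whom like the author, and uses the normal-approximation (Wald) interval for a proportion. The sample size, the 95% level, and the function name `proportion_interval` are assumptions made for illustration; they do not come from the example above.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Returns (estimate, lower bound, upper bound). Assumes a simple
    random sample that is large enough for the approximation to hold.
    """
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical survey: 450 of 1,000 sampled people like the author.
estimate, low, high = proportion_interval(450, 1000)
print(f"estimate {estimate:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

With these assumed numbers the interval spans roughly 42% to 48%, so the claim about the state becomes "about 45%, give or take three points" instead of a bare 45%. The interval is only as trustworthy as the sampling: if the 5 surveyed cities differ systematically from the other 3, no formula corrects for that.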
Descriptive Vs. Inferential Statistics
Under each heading below, descriptive statistics is covered first and inferential statistics second.
Definition
Descriptive statistics: It helps you describe, organize, and summarize the data. It presents information in a manageable form.
Inferential statistics: It helps you observe and analyze a sample of the data. It is used to generalize and make judgments about the larger group.
Types
Descriptive statistics:
  1. Measures of Central Tendency: In this method, a single value is relied upon to describe the data. The three measures of central tendency are mean, median, and mode. The mean is better known as the average; it is the sum of the values divided by the number of values. The median is the middle value when the data set is sorted. The mode is the value that appears most often in the set.
  2. Measures of Dispersion: In this method, the spread of the data around the average is considered. The three measures of dispersion are range, variance, and standard deviation. Range is the difference between the largest and smallest values in the data set. To find the variance, subtract the mean from every value in the set, square each difference, add the squares, and divide the sum by the number of values in the set. Standard deviation is the square root of the variance.
Inferential statistics:
  1. Confidence Interval: It uses one sample to give a range of values for an unknown population parameter. It is an observed interval estimate.
  2. Hypothesis Testing: It starts from an assumption about the population and uses a sample to judge whether that assumption should be rejected. The inferences drawn are based on probability, so they are uncertain and may or may not be true.
Graphical Methodologies Used
Descriptive statistics:
  1. Pie charts
  2. Bar graphs
  3. Histograms
  4. Frequency distribution chart
  5. Mean analysis graphs
Inferential statistics:
  1. Correlation analysis
  2. Survival analysis
  3. Linear regression graph
  4. ANOVA
  5. Structural Equation Modeling
Examples
Descriptive statistics:
  • Consider an example of a student who has scored the following marks in 5 subjects: 40, 45, 42, 45, and 45.
  • In this case, the mean is (40+45+42+45+45)/5, which is equal to 43.4.
  • Sorted from lowest to highest, the marks are 40, 42, 45, 45, and 45. This is a data set of five elements; therefore, the median is the third value, which is 45.
  • The number 45 appears 3 times, therefore it is the mode.
  • Similarly, in the above set, the range is (45-40) = 5.
  • The sum of the squared deviations is (40-43.4)² + (45-43.4)² + (42-43.4)² + (45-43.4)² + (45-43.4)² = 21.2. Dividing by the 5 values gives a variance of 21.2/5 = 4.24.
  • The standard deviation will be about 2.06, which is the square root of 4.24.
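The same measures can be checked with Python's standard `statistics` module. This is a small sketch, not part of the original example. It uses the population forms `pvariance` and `pstdev`, which divide by the number of values as the definition of variance given earlier does; the sample forms `variance` and `stdev` divide by one less than the number of values and give larger results.

```python
from statistics import mean, median, mode, pstdev, pvariance

# The five subject marks from the example.
marks = [40, 45, 42, 45, 45]

summary = {
    "mean": mean(marks),
    "median": median(marks),          # middle value after sorting
    "mode": mode(marks),              # most frequent value
    "range": max(marks) - min(marks),
    "variance": pvariance(marks),     # sum of squared deviations / number of values
    "standard deviation": pstdev(marks),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```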
Inferential statistics:
  • To calculate a confidence interval, you will require a reasonably large sample; the larger the sample, the narrower the interval.
  • You will need to evaluate the mean, the standard deviation, and the margin of error.
  • You will need to select a confidence level among 90%, 95%, and 99% (these are the regular values).
  • Also called the multiplier, the critical value is standard for these confidence levels: approximately 1.645, 1.96, and 2.576, respectively, when the normal distribution is used.
  • The margin of error is calculated by multiplying the critical value by the standard error, which is the standard deviation divided by the square root of the sample size.
  • The confidence interval runs from the mean minus the margin of error to the mean plus the margin of error.
  • Having an approximate confidence interval for a set of data helps you draw conclusions about the reliability of results, especially in surveys.
  • In case of hypothesis testing, we may consider the mean or standard deviation of the sample, and use probability to judge how likely the observed result would be if the null hypothesis were true. The conclusion is therefore a probabilistic judgment, not a certainty.
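The steps for the confidence interval and the hypothesis test can be sketched as follows, taking the margin of error as the critical value times the standard error (the standard deviation divided by the square root of the sample size). The sample figures (100 observations, mean 50, standard deviation 8), the null-hypothesis mean of 52, and all function names are hypothetical, and the normal distribution is assumed throughout, which is reasonable only for fairly large samples.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def critical_value(confidence):
    """Two-sided critical value (multiplier) for a level such as 0.95."""
    return STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2)


def mean_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Confidence interval for a mean: mean +/- critical value * standard error."""
    margin = critical_value(confidence) * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def z_test_p_value(sample_mean, sample_sd, n, null_mean):
    """Two-sided p-value for the null hypothesis 'population mean == null_mean'."""
    z = (sample_mean - null_mean) / (sample_sd / sqrt(n))
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


# Multipliers for the three regular confidence levels.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: critical value {critical_value(level):.3f}")

# Hypothetical sample: n = 100, mean = 50, standard deviation = 8.
low, high = mean_interval(50, 8, 100)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")

# Hypothetical null hypothesis: the population mean is 52.
print(f"p-value: {z_test_p_value(50, 8, 100, null_mean=52):.4f}")
```

A small p-value means a sample like this one would be unusual if the null hypothesis were true, which is the basis for rejecting it. It does not prove the alternative hypothesis; the conclusion remains a statement about probability.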
Inferential statistics is used when we have to generalize from the available data to a larger group. It is used in salary, population, and many other similar statistics, where estimates are calculated using a sample. Descriptive statistics, by contrast, may be used to describe a sample or the whole population, but cannot be used where conclusions have to be drawn that reach beyond the data at hand.
