Week 2 Lecture
Introduction
All right, let us jump into statistics from the beginning. For some reason, when students hear or see the word “statistics,” a frighten look appears on their face, and they back away. I am not sure why! The word “statistic” did not suddenly jump out at you with hands ready to grab you; however, no matter how much statistics might frighten you, be brave and face it.
MGMT650, Statistics for Managerial Decision Making introduces statistical procedures to the graduate students. Once you completed the course, you might want to take a course that is solely devoted to statistics.
Putting Statistics into Context
Are you wondering as you undertake perhaps your first course in statistics—what do statistics have to do with me, my life, or my profession? Many students ask (or want to ask) this question in my face-to-face classes.
Here are some ideas about what statistics have to do with our lives. Managers and all kinds of professionals, along with conscientious citizens and family members, need to understand research and statistics. We need to be thoughtful, obtain accurate information, and make informed decisions. Today, it is hard to read business reports, the newspaper or your favorite journal or magazine without coming across quantitative data, and important conclusions based on analyses of these data. Research and statistics are simply everywhere. Here are some examples of research findings you may come across: cash bonuses work better than recognition in increasing employees’ performance; productivity will increase if staff works in the office rather than telecommuting; and if you exercise more your quality of health will increase.
The UMUC Graduate School wants you to know accurate research principles and to be able to interpret statistical test results. We all need to make decisions, hopefully the “right” decisions. Properly executed research and correct statistical analyses increase the odds of making decisions that are likely to be the right ones.
We begin this course with an overview of the context in which statistics are applied and interpreted. Statistics are used in the context of a research project. Notice we are not just jumping straight into statistical testing to start our course. If we did a t-test today, you could follow the directions in your text, use Excel, and run a t-test. However, without putting statistics into the larger context of a research process, you are not likely to understand the purpose of the t-test or what it tells you. We might get a t-test statistical result of t = 2.06 and p level = .10. However, what does that mean? So, larger picture (context) first.
What Do We Do With Statistics?
As just noted, statistics are used in the context of a research process in which people have a question or a problem to solve. They try to describe or hypothesize what is causing the problem. Then, they collect data and statistically analyze the raw data. Based on the results of the statistical tests, they draw conclusions with a known probability of accuracy.
The figure below definitions statistics:
Sometimes, people use statistics to describe concepts, like how many men versus women use the Internet. However, describing data is easy and not the focus of our course. Most people use statistics for more serious and insightful purposes, which include evaluating data and testing hypotheses and to allow us to sample and infer conclusions about the population. When we engage in correct research methods and use proper statistics, we greatly increase the chances we will obtain accurate information and make a correct decision.
Types of Statistical Applications in Business and Elsewhere
Today, managers—people like you and like me—use statistics in many areas of our professional work.
We are also likely to use statistics in many areas of our personal lives. Research and statistics often inform us or help us answer questions such as, “Will my children get a better education if I send them to private rather than public school? If I take a GRE prep course, am I likely to earn a higher score?”
Mean, Median, and Mode
The mean, median, and mode are the most common feature concerning descriptive statistics. Luckily, Microsoft Excel will calculate the means, median, and mode for you. There are a few different methods of calculating the raw data for central tendency.
Mean
Remember your grade school days in arithmetic and math classes. The teacher wrote a series of number on the board and asked you to find the average. That is, the central number on bell curves. This is easy; just add the numbers together divide by how many numbers: 1 + 2 + 3 + 4 + 5/5 = 3. You can perform this operation on a calculator.
Median
Now, you want to find the middle position of the set of numbers. Again, you have a series of numbers: 2, 4, 3, 6, 7, 3, 5, 9, 2, 8. First, I suggest rearranging the numbers in sequential order: 2, 2, 3, 3, 4, 5, 6, 7, 8, 9. Look for the half-way, middle section, or midpoint (various terms found in statistical textbooks) of the data set. You discover that 4.5, between 4 and 5, would be the middle section.
Mode
Now, you want to find the most reoccurring number in the data set. Let us take another data: 4, 6, 8, 7, 9, 3, 8, 2, 8, 5. Again, rearrange the numbers in sequential order: 2, 3, 4, 5, 6, 7, 8, 8, 8, 9. You will notice that the number 8 appears more times than the other numbers.
After you have calculated the means, median, and mode, you say, “So what!” What should I do with the results of the calculation? This may depend on the purpose of the study. The numbers show you the distribution of scores.
Baseball story. The baseball season has ended, and you wish to perform some simple at bat calculations. You have all of the necessary data, thousands of numbers, to perform this task. First, you decide to find the overall hitting average in the league. The means would serve the purpose. Second, you make a decision the hitting average (means) only told some of the statistical story, also call descriptive statistics. Now you want to find out the midpoint of the batting averages and which players’ average fell in the upper 50 percent and lower 50 percent brackets, the median would serve the purpose; however, you can do one more operation for further information to impress your friends. Finally, because you like this statistical work, you want to find out which baseball player’s hitting average appears the most in the league. In this case, the mode would help you.
The above scenario is just a simple story to get you into central tendency.
Below is another mean, median, and mode example from Neil Salkind’s Statistics for People Who (Think They) Hate Statistics: The Excel Edition (2007).
Below is the data set:
Score1 | Score2 | Score3 |
3 | 0 | 154 |
7 | 54 | 167 |
5 | 17 | 132 |
4 | 26 | 145 |
5 | 34 | 154 |
6 | 25 | 145 |
7 | 14 | 113 |
8 | 24 | 156 |
6 | 25 | 154 |
5 | 23 | 123 |
Using Microsoft Excel – Data Analysis command
Using Excel Data Analysis command calculating means, median, and mode comes quite easy.
1. After locating the Data Analysis command, click in and locate in the menu — Descriptive Statistics – click into the operation.
2. You will see textboxes, in the input range scroll the three data sets into the textbook, check Labels in First Row, check Summary Statistics, and select an output option. Platy with it a bit to see what works best for you. If you are concerned about overwriting existing data, select the option “New Worksheet Ply” with a title in the textbook. I like using “Descriptive Stat.” The name does not matter! – Very easy to do!
It should look like this:
Click okay and the output will automatically appear as a new spreadsheet:
As you can see there are many categories, but you only need to look at the Mean (3rd row), Median (5th row), and Mode (7th row). You should also notice the Standard Deviation (7th row), which will be discuss in the next section.
Standard Deviation
When reading empirical research reports, you might see “sd,” “s,” or “SD” = followed by a number” in the Result section. The abbreviations stand for standard deviation.
Ok, what is a standard deviation? Without getting technical, the standard deviation is the spread of numbers from the means on the bell curve. Not all data will fall directly on the middle axis of the curve. Numbers on the bell curve will be on the right side (negative side) and the left side (positive side) of the means. Large standard deviation means higher spread from the means and small standard deviation means smaller or tighter spread from the means.
Look back to the data from the previous section.
As you can see in row 7, the standard deviations for the scores. Let us look at the Score 1 column, the mean = 5.6, and the standard deviation = 1.50 (low number), and the Score 2 column has a means of 24.2 and standard deviation of 13.87 (high number). Score 1 output is closer to the means, i.e., tight and gathered together, but Score 2 output is further away from the means. In addition, the bell curve for Score 1 would have a steep bell shape curve and the bell curve for Score 2 would be flatter. What does this mean? This would depend on the study. If you are comparing test scores from one group to another group, the low standard deviation may mean that the test-takers share the same knowledge of the topic; however, the higher standard deviation may mean the test-takers do not have the same knowledge of the topic. Is this good or bad situation? The researcher would need to explore the topic for additional information. You might have to look at the target population to sample population scheme. You might have to look at the actual test for content validity and construct validity.