Bus Q600 Blogs (Fall 2010): 2010

Thursday, December 9, 2010

Extra Office Hours

Please note that the TAs for our course (Sandy Wu and Majid Taghavi) will hold extra office hours from 2:30 to 5:30 pm on Tuesday, December 14, 2010.

I will also be available in my office on December 16 from 2:30 to 3:30 pm.

Sunday, December 5, 2010

Summary for Week 12 (November 30 - December 2)

This week we started Chapter 11, which is concerned with the relationship between two variables (usually, one independent and the other dependent).

We talked about covariance (not too useful by itself) and the correlation coefficient (more useful).
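To see why covariance is hard to interpret by itself, here is a minimal Python sketch. The data values are made up for illustration (they are not the numbers used in class). Changing the units of x rescales the covariance, but the correlation coefficient stays the same.

```python
from math import sqrt

def sample_cov(x, y):
    """Sample covariance, using the n - 1 divisor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def corr(x, y):
    """Correlation coefficient r = cov(x, y) / (s_x * s_y)."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))

# Hypothetical data: x in thousands, y in thousands of dollars
x = [2, 4, 6, 8, 10]
y = [60, 75, 85, 105, 115]
x_scaled = [1000 * v for v in x]   # same x, measured in single units

print(sample_cov(x, y), sample_cov(x_scaled, y))  # covariance depends on the units
print(corr(x, y), corr(x_scaled, y))              # correlation does not
```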

We then moved on to a discussion of simple linear regression where the objective is to see if an independent variable (e.g., the student population in the example) can be used to estimate the value of the dependent variable (e.g., the monthly sales in the example). We calculated the regression equation's coefficients and performed a hypothesis test to see if the model was "significant."
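The calculation can be sketched in Python as follows. The data are hypothetical (not the class example), and the function name fit_line is mine. The sketch computes the least-squares intercept b0 and slope b1, plus the t statistic for testing whether the slope is zero; that statistic would be compared with a t-table value with n - 2 degrees of freedom.

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares line y = b0 + b1*x and the t statistic for H0: slope = 0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    s = sqrt(sse / (n - 2))        # standard error of the estimate
    t = b1 / (s / sqrt(sxx))       # slope divided by its standard error
    return b0, b1, t

# Hypothetical data: student population (thousands) and monthly sales
x = [2, 4, 6, 8, 10]
y = [60, 75, 85, 105, 115]
b0, b1, t = fit_line(x, y)
print(b0, b1, t)
```

With n = 5 there are 3 degrees of freedom, so for a two-sided test at the 5% level the t statistic would be compared with 3.182.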

We will complete our discussion of this chapter in the final week by learning about the coefficient of determination (r squared).

Sunday, November 28, 2010

Summary for Week 11 (November 23-25)

We ended Chapter 9 with a discussion of hypothesis testing for the equality of two variances. If the null hypothesis cannot be rejected, then the assumptions we made about the equality of variances in Chapter 9 are justified. The test statistic used was the F-statistic, which will be helpful in solving some problems in Chapters 10, 11 and 12.

We started Chapter 10 by looking at a farming problem in which a farmer is trying to determine whether she should use a low, medium or high level of fertilizer to maximize the yield. To obtain samples, we use an experimental design with one factor.

The hypothesis that all levels of fertilizer are equally effective is tested using Analysis of Variance (ANOVA). We compare the between-group and within-group variabilities, which leads to a one-sided F-test. We found in the example that we can reject the equality hypothesis with a very low p-value. We then looked at the confidence intervals for the differences between the means to see if one particular level of fertilization is better than the others. Chapter 10 ended by solving the same problem using MegaStat and examining the Summary Table, which includes all the numbers we found manually.
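For those who want to check the manual calculations, here is a rough Python sketch of the one-factor ANOVA arithmetic. The yields are invented for illustration and the function name one_way_anova is mine; the MegaStat output remains the reference.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, F) for a one-factor design."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return ss_between, ss_within, f_stat

# Hypothetical yields for low, medium and high fertilizer levels
low, medium, high = [10, 12, 11], [14, 15, 16], [20, 19, 21]
ssb, ssw, f_stat = one_way_anova([low, medium, high])
print(ssb, ssw, f_stat)
```

A large F means the variability between the group means is large relative to the variability within the groups, which is evidence against the equality hypothesis.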

In Week 12 we will do the course evaluations.

Tuesday, November 23, 2010

Summary for Week 10 (November 16-18)

We completed the discussion of Chapter 8 by considering a bottling problem (two-sided test) and then a problem with unknown population variance (which required us to use the t-distribution).

Chapter 9 started with a problem comparing the weight losses from the Atkins diet and a conventional diet. In this chapter we still make use of confidence intervals and hypothesis testing, but for two populations. Therefore, we don't encounter any new theory, but some of the formulas differ slightly from what we saw in Chapters 7 and 8. For example, when the variances are not known, we may assume their equality (which must be tested) and then compute the pooled variance.
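As a small illustration of the pooled variance, here is a Python sketch with made-up weight-loss figures (not the data used in class). It assumes equal population variances and weights each sample variance by its degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_variance(a, b):
    """Pooled sample variance, assuming the two population variances are equal."""
    n1, n2 = len(a), len(b)
    return ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)

# Hypothetical weight losses for the two diets
atkins = [12, 9, 14, 11, 10]
conventional = [8, 7, 10, 6, 9]

sp2 = pooled_variance(atkins, conventional)
# t statistic for H0: the two population means are equal
t = (mean(atkins) - mean(conventional)) / sqrt(sp2 * (1 / len(atkins) + 1 / len(conventional)))
print(sp2, t)
```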

The class ended with two students tossing 15 tennis balls into a bucket and a test of the hypothesis that their success rates are equal. This material is not going to be on the final exam, but we did it to illustrate the use of MegaStat and to have some fun.

We will complete Chapter 9 in Week 11.

Saturday, November 20, 2010

Posting of assignment and exam marks

Dear Bus Q600 students:

As you know, I have been posting your assignment and exam marks on the course web site as a .pdf file after removing your first and last names. The only information that remains on the .pdf file about you is your student number which you should not reveal to others.

If you are not comfortable with this method of reporting your marks, please let me know so I can remove all information from the .pdf file related to your student number and your marks.

Monday, November 15, 2010

Midterm exam results

We have finally sorted out the problems with missing/incomplete student numbers on the scan sheets. The exam marks can be accessed by clicking on this link.

The population mean is μ = 81%, which is very close to the mean of 79% that I reported for a sample of n = 5 exams. The distribution is bimodal: there is a large number of marks around 80%, but also a bunching around 55%.

I am pleased with these results, but I hope that the students who had low marks will try harder in the final exam.

Sunday, November 14, 2010

Summary for Week 9 (November 9-11)

I started by informing the class that the exams were still not completely marked because a few students entered their student numbers incorrectly on the scan sheets. I explained that this was taking us some time to sort out and that the results would be available early in Week 10. However, a random sample of n = 5 exams which were marked manually revealed a sample mean of 79%, with a 95% CI ranging from 70% to 88%.

I then returned to the confidence interval problem for a proportion and talked about the election polls where interviewing roughly 1000 respondents is sufficient to get a 95% CI with a 3.1% margin of error. We found the correct sample size from a formula for n. This completed the discussion of Chapter 7.
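The "roughly 1000 respondents" figure can be checked with a short Python sketch. I am assuming the usual sample-size formula n = z² p(1 - p) / E² with the worst-case value p = 0.5; the function name is mine.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Smallest n giving the requested margin of error (p = 0.5 is the worst case)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

n = sample_size_for_proportion(margin=0.031)
print(n)
```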

Chapter 8 started with a taste test where a student claimed that he/she could tell the difference between Coke and Pepsi. The null hypothesis was that he/she would just be guessing. See this link for details of the experiment.

Null and alternative hypotheses were discussed in greater detail, followed by a definition and examples of Type I and Type II errors. A one-sided z-test example involving cigarette tar content was presented. I then talked about the very important matter of the p-value and showed that this value was very small in the cigarette example, resulting in rejecting H0 for all α > p. The class ended with an example involving the DSB GMAT scores for 2005 (where the p-value was very large, given the sample mean).
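Here is a minimal Python sketch of the p-value calculation for an upper-tailed z-test with known σ. The numbers are hypothetical (they are not the cigarette or GMAT figures), and the function name is mine.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p_value(xbar, mu0, sigma, n):
    """z statistic and p-value for H0: mu = mu0 against Ha: mu > mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return z, 1 - NormalDist().cdf(z)

# Hypothetical sample: mean 51.2 from n = 64 observations, sigma = 4, mu0 = 50
z, p = upper_tail_p_value(xbar=51.2, mu0=50, sigma=4, n=64)
alpha = 0.05
reject_h0 = alpha > p   # reject H0 for every alpha larger than the p-value
print(z, p, reject_h0)
```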

Saturday, November 13, 2010

Taste test experiment with Coke and Pepsi (Week 9)

To motivate the discussion of hypothesis testing, we did a taste test similar to the "Lady Tasting Tea" experiment that was done in Cambridge in the late 1920s. (The idea belongs to the late Professor R. A. Fisher, the father of modern statistics.)

A student in each class volunteered to be the subject of the test. The students claimed that they could tell the difference between Coke and Pepsi. Similar to the "Tea" test, I brought 8 cups to the class and filled 4 with Coke and 4 with Pepsi. The null hypothesis H0 is that the student can't tell the difference, and that he/she is just guessing. (The students were Melissa, Sean and Bryan in EC01, C01 and C02, resp.)

If the subject is just guessing, there is a 1/70 chance (1.4% probability) that all 8 cups will be identified correctly. So, I would reject H0 if that happens. (This is the p-value.)

Melissa got all 8 correct in the second attempt, Sean got all 8 correct, and Bryan made 1 mistake. So, I rejected my H0 in Melissa's and Sean's cases by stating that I believed they could tell the difference between Coke and Pepsi. But I couldn't reject H0 in the last case because there is a 16/70 probability (about 23%) that a person guessing will make 1 mistake. (These probabilities are the result of using the hypergeometric distribution.)
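The 1/70 and 16/70 figures can be reproduced with a few lines of Python. The sketch assumes that the subject knows there are 4 cups of each drink and therefore labels exactly 4 cups as Coke; the function name is mine.

```python
from fractions import Fraction
from math import comb

def prob_correct_coke(k, cups_per_brand=4):
    """P(exactly k of the cups labelled Coke really are Coke) under pure guessing."""
    total = comb(2 * cups_per_brand, cups_per_brand)   # 70 ways to choose 4 of 8 cups
    ways = comb(cups_per_brand, k) * comb(cups_per_brand, cups_per_brand - k)
    return Fraction(ways, total)

p_all_correct = prob_correct_coke(4)   # no mistakes: 1/70
p_one_swap = prob_correct_coke(3)      # one Coke and one Pepsi swapped: 16/70
print(p_all_correct, float(p_all_correct))
print(p_one_swap, float(p_one_swap))
```

Fraction reduces 16/70 to 8/35, which is about 0.229.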

Here are some still pictures of the experiment in Section C02 with Bryan doing the tasting:

Tuesday, November 9, 2010

Experiment with the inflatable globe in Bus Q600

In Week 8 we discussed confidence intervals for the population mean μ and the population proportion p. To illustrate the calculation of the point estimate for p and the corresponding confidence interval, the students participated in an experiment where they randomly picked points on an inflatable globe.

We did this 30 times, and in each section where the experiment was performed, water (lakes, oceans) was hit about 21 or 22 times. This resulted in an estimate of about 70%, which is very close to the true proportion of 70.8%.
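As a rough check, here is a Python sketch of the point estimate and the 95% confidence interval for 21 water hits out of 30. It uses the usual normal approximation for a proportion, which is my assumption about the formula; the function name is also mine.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Point estimate and normal-approximation CI for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - half_width, p_hat + half_width

p_hat, lower, upper = proportion_ci(21, 30)
print(p_hat, lower, upper)
```

With only 30 points the interval is wide, but it does contain the true proportion of 70.8%.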

Here is the video of this experiment (on YouTube) as it was performed in Section C02 (Thursday class).

Monday, November 8, 2010

Summary for Week 8 (November 2-4)

We continued with the z-based confidence intervals (CI) where σ is known. We completed the discussion of the CI calculation for the physicians' taxable incomes example. The important thing to remember is that a 95% CI comes from a procedure that, over repeated samples, captures the true but unknown mean μ in 95% of the intervals it produces.

When σ is not known, then we can't use the z-distribution, but fortunately, we have the t-distribution at our disposal (due to William Gosset, a.k.a. "Student"). This distribution is more variable than z, but when the sample size increases beyond 30, it too is approximated by the normal.

The important question of how to find the best sample size n was discussed in the context of the physicians' taxable incomes. We also looked at the problem when σ is not known. (Take a preliminary sample of size m, and continue!)

What about the CI for a proportion? This was motivated by looking at the election polling results. Towards the end of the class, we did an experiment using an inflatable globe and found the estimate of the proportion of water (oceans, lakes, etc.) to the total surface area of the globe. In each section the estimated proportion ended up being very close to the true 70.8% after taking only 30 samples. (The students randomly picked a point on the globe and recorded it as either water (W) or land (L).) I will post the video I took in Section C02 on YouTube a little later.

The midterm exam took place this week on November 5, 2010, at 5:00 pm in Great Hall, RJC.

Monday, November 1, 2010

Felipe Senisterra

Felipe, please contact Dr. Parlar regarding your recent e-mail.

Thursday, October 28, 2010

Bus Q600 classes in Week 8

Dear Bus Q600 students:

Please note that Bus Q600 classes will NOT be cancelled during Week 8 (Nov. 2-4). We will continue with the discussion of Chapter 7, which was started this week.

Good luck in your upcoming exams (accounting and economics).

Summary for Week 7 (October 26-28)

We started with a (detailed) review of the stock returns problem and looked at the n=2 and n=3 sample cases. In all cases we have the mean of the sample averages equal to the population mean, and the variance of the sample averages equal to the variance of the population divided by the sample size.
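Both facts can be verified by brute force. The Python sketch below uses a hypothetical population of four returns (not the stock data from class) and lists every possible sample of size n drawn with replacement; the function name is mine.

```python
from itertools import product
from statistics import mean, pvariance

def xbar_moments(population, n):
    """Mean and variance of the sample average over all samples of size n (with replacement)."""
    xbars = [mean(sample) for sample in product(population, repeat=n)]
    return mean(xbars), pvariance(xbars)

population = [10, 20, 30, 40]   # hypothetical returns
mu, sigma2 = mean(population), pvariance(population)

for n in (2, 3):
    m, v = xbar_moments(population, n)
    print(n, m, v, sigma2 / n)   # m matches mu, v matches sigma2 / n
```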

We discussed the adjustment necessary in the standard deviation of the sample means if the population is finite and sampling is done without replacement. Next, we looked at the MPG of the Zebra 501 GT sports car using Visual Statistics (that comes with the book CD). The concepts of unbiasedness and minimum variance estimators were discussed. We continued with the example of Mercury speedboat engines.
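The finite-population adjustment can be checked the same way. In this Python sketch the population is hypothetical, and I am assuming the usual correction factor sqrt((N - n) / (N - 1)); the enumeration covers every sample of size n drawn without replacement.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

population = [10, 20, 30, 40]   # hypothetical finite population
N, n = len(population), 2

# Every possible sample of size n drawn without replacement
xbars = [mean(sample) for sample in combinations(population, n)]
sd_enumerated = pstdev(xbars)

# sigma / sqrt(n), multiplied by the finite-population correction factor
sd_formula = pstdev(population) / sqrt(n) * sqrt((N - n) / (N - 1))
print(sd_enumerated, sd_formula)
```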

The chapter ended with a discussion of the sample proportion, which was motivated using the disastrous "New Coke" campaign in 1985.

The new Chapter 7 was introduced by an example involving the taxable incomes of physicians where μ is not known but σ is known. We will continue with this example next week.

Friday, October 22, 2010

Lost accounting textbook

Dear Bus Q600 students: I received the following e-mail from Vijay, who has lost his book. Please contact him if you have found it.

``Hello Dr. Parlar,
My name is Vijay Somers and I am in your evening Q600 class on Tuesday. I believe I may have lost my accounting(A600) textbook in RJC214 last Tuesday. Is there anyway you could send out an email to the class to see if anybody picked it up by accident?

Thanks,
Vijay Somers''

Thursday, October 21, 2010

Summary for Week 6 (October 19-21)

We continued our apple juice can example where the content of the cans was uniformly distributed between 950mL and 1050mL. We calculated the probability that a randomly selected can had between 980 and 1020 mL of apple juice by simply finding the area of a rectangle.
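The rectangle calculation is simple enough to do by hand, but here it is as a short Python sketch (the function name is mine): the height of the rectangle is 1/(1050 - 950) and its width is 1020 - 980.

```python
def uniform_prob(a, b, lo, hi):
    """P(lo <= X <= hi) for X uniform on [a, b]: rectangle width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)   # clip the interval to the range of X
    return max(hi - lo, 0) / (b - a)

p = uniform_prob(950, 1050, 980, 1020)
print(p)
```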

In Section 5c, the normal distribution was introduced. I illustrated this distribution by using data sets of heights and exam results. We also looked at some web sites describing Galton's board, which illustrates the principle that the binomial distribution converges to the normal.

This chapter ended with a discussion of probability calculations which became possible by converting the X r.v. to a standardized Z r.v. We also analyzed the reverse problem of calculating the z-values, given the probability.
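Both directions of the calculation can be sketched in Python with the standard library's NormalDist. The mean and standard deviation below are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

mu, sigma = 170, 10          # hypothetical mean and standard deviation
std_normal = NormalDist()    # Z with mean 0 and standard deviation 1

# Forward problem: P(X <= 185) by standardizing X to Z
z = (185 - mu) / sigma
p = std_normal.cdf(z)

# Reverse problem: the z-value with 97.5% of the area to its left, converted back to x
z_975 = std_normal.inv_cdf(0.975)
x_975 = mu + z_975 * sigma
print(z, p, z_975, x_975)
```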

Chapter 6 is concerned with sampling distributions and the first topic discussed here was the distribution of the sample mean. We started a detailed example of four groups of stocks and compared its probabilistic properties to the properties of the sample mean and its distribution.

We will continue and finish Chapter 6 in Week 7.

Please use your mcmaster.ca e-mail accounts when corresponding with McMaster faculty

Dear Bus Q600 students:

It is important to use your mcmaster.ca e-mail accounts when you correspond with me and other instructors. If I receive an e-mail from a student's Gmail, Yahoo!, etc., account, I may not respond to that e-mail.

McMaster Policy: ``Students who wish to correspond with instructors directly via email must send messages that originate from their official McMaster University email account. This protects the confidentiality and sensitivity of information as well as confirms the identity of the student.''

Wednesday, October 20, 2010

Assignment 3 due dates

Please see this link for Assignment 3 information and the due dates.

Sunday, October 17, 2010

Summary for Week 5 (October 12-14)

We returned to the tire molding example (where tires are molded in pairs), and introduced the concept of a probability distribution (a list of possible values the random variable can take and their corresponding probabilities). In this example we calculated the average number of defectives that one may find as a result of 100 runs (where each run produces two tires).

The concept of the expected (mean) value of a random variable X was formalized with a formula and illustrated with a physical analogy, i.e., the centre of gravity. Discussion continued with an important example of a home insurance policy where, even though the expected profit for the company is positive, it may not be desirable to stay in business if the company insures only a few homes.

We then learned about the variance of a random variable and compared its formula to the formula we used for a population's variance. Three simple examples involving a coin flip were used to illustrate the case of increasing variance and zero variance (loaded coin).
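The mean and variance formulas for a discrete random variable can be sketched in Python as below. The three coin games are my own hypothetical payoffs, chosen only to show the variance growing with the stakes and dropping to zero for a coin that always lands the same way.

```python
def mean_var(dist):
    """Mean and variance of a discrete random variable; dist maps each value x to p(x)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

small_bet = {-1: 0.5, 1: 0.5}     # fair coin, win or lose $1
large_bet = {-10: 0.5, 10: 0.5}   # fair coin, win or lose $10
loaded = {1: 1.0}                 # loaded coin that always wins $1

for game in (small_bet, large_bet, loaded):
    print(mean_var(game))
```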

We looked at the important special discrete distribution known as binomial and calculated the probability of getting x successes in n trials. (I illustrated this with three tennis balls and a bucket.) We ended Chapter 4 with an application of binomial distribution to a new drug purchase problem which hospital boards may face.
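Here is a minimal Python sketch of the binomial probability of x successes in n trials. The success probability p = 0.4 for the three tennis balls is a made-up value, not something we measured in class.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, 0.4   # three tosses; p is hypothetical
dist = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
print(dist)
```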

Since Chapter 4 is now complete, the due dates for Assignment 2 are during Week 6; see http://www.business.mcmaster.ca/courses/q600/Assignments/HW2/HW2.html

Chapter 5 is concerned with continuous random variables (such as waiting time, amount of apple juice in a can, or temperature). I illustrated the concept of a continuous random variable with a YouTube video of a bottle filling process. We then looked at the special case of a uniform random variable using the example of the random content of an apple juice can.

The lecture notes for Week 5 (the Thursday section C02) are here.

We will continue with Chapter 5 and finish it during Week 6.

Wednesday, October 13, 2010

Assignment 1 marks (adjusted)

I noticed that the solution for Q8.4 (population standard deviation) did not print correctly, thus resulting in the erroneous marking of this question. The correct value is 73.829.

In order not to penalize the students who gave the correct answer, I have decided to increase everyone's mark by 1.

See the results at http://www.business.mcmaster.ca/courses/q600/documents/Q600-2010-101013-MP-Posted.pdf.

Monday, October 11, 2010

Informal Course Evaluations

Dear Bus Q600 students:

Thank you for the frank and constructive comments you provided on the informal evaluation that took place during Week 4. I am pleased to report that a large majority of you were happy with how the course was being delivered. There were a few isolated complaints and several suggestions; I will try to implement some minor changes based on these comments.

I analyzed the ratings you provided for Question 3 (The Effectiveness of the Instructor) and found that in all three sections the median score was 9. This is very encouraging, and I hope that you will find the course even more interesting as we cover new topics in the coming weeks.

Assignment 2 due dates

Section EC01 : October 19, 2010 (Tue.) 7:00 pm (in class)
Section C01 : October 20, 2010 (Wed.) 2:30 pm (in class)
Section C02 : October 21, 2010 (Thu.) 11:30 am (in class)

Please also remember the following:

Put your section number, your name and your student number on the top right-hand corner of the cover page/first page of your assignments. This is crucial for easy sorting into alphabetical order and by section.
The assignments can be hand-written (except where you have an Excel/MegaStat output in which case you must also submit the hardcopy output).
Provide your explanations in full sentences, not in point form.
You must submit your assignments at or before the due date/time indicated for your section.
Late assignments will not be accepted.
Please note that the assignments e-mailed to Dr. Parlar or to any of the TAs will not be accepted.

Thursday, October 7, 2010

Summary for Week 4 (October 5-7)

We started by reviewing last week's material. We then defined an "event" and discussed the method of calculating the probability of an event.

Next, the complement, union and intersection of events were discussed and used in an example of newspaper readership. The formula for the probability of a union of two events was presented and mutually exclusive events were also defined.

Discussion continued with the definition of conditional probability which was used in the analysis of the problem involving hiring of management trainees.

Independent events were introduced and related to the results in the trainees example. This completed Chapter 3.
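The union formula, conditional probability and the independence check from this week can be put together in a few lines of Python. The probabilities are hypothetical (they are not the newspaper or trainee figures), and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Hypothetical probabilities for two events A and B
p_a = Fraction(5, 10)
p_b = Fraction(4, 10)
p_a_and_b = Fraction(2, 10)

p_a_or_b = p_a + p_b - p_a_and_b          # probability of the union
p_a_given_b = p_a_and_b / p_b             # conditional probability P(A | B)
independent = p_a_and_b == p_a * p_b      # same as checking P(A | B) == P(A)
mutually_exclusive = p_a_and_b == 0       # A and B cannot happen together

print(p_a_or_b, p_a_given_b, independent, mutually_exclusive)
```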

We started Chapter 4 by discussing random variables which associate numerical values with the outcomes of an experiment. Several examples were discussed that involved discrete and continuous random variables. As the last example we looked at a tire molding problem where there were four outcomes but three values for the random variable (defectives in the pair).
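The tire molding example can be listed out in Python to see how four outcomes collapse into three values of the random variable. The labels G (good) and D (defective) are my own notation.

```python
from itertools import product

# Each run molds two tires; each tire is good (G) or defective (D)
outcomes = list(product("GD", repeat=2))

# The random variable counts the defectives in the pair
defectives = {outcome: outcome.count("D") for outcome in outcomes}
values = sorted(set(defectives.values()))

print(outcomes)
print(values)
```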

In this class the students also provided their comments on the informal course evaluation. I will say more about this in a different post.

Saturday, October 2, 2010

Summary for Week 3 (September 28-30)

We started by reviewing the Empirical Rule and the related concept of Tolerance Intervals and 6-sigma. We continued with a discussion of the "z-scores." ("zed", not "zee".) We made use of the Tolerance Interval idea by looking at an example of quality improvement, i.e., the coffee temperature case.
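The percentages in the Empirical Rule can be recovered from the standard normal distribution, as in the Python sketch below. The z_score helper and its example values are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

std_normal = NormalDist()
# Proportion of a normal population within k standard deviations of the mean
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

print(z_score(75, mu=70, sigma=5))
print(coverage)
```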

Next, we looked at another measure of variation, namely, the pth percentile. We illustrated this concept by discussing two examples with n = 8, and n = 7. The calculation of the percentile was formalized by introducing a rule that involved two steps. We completed Chapter 2 with a discussion of the Box-and-Whisker plots.

The new Chapter 3 (Probability) was motivated with examples of coin toss, Lotto 6/49, the birthday problem, and Monty Hall's car-and-goats problem. We then looked at three methods for calculating probabilities. Next, a random experiment was defined and this was followed by sample spaces and events which were illustrated using the coin tossing and backgammon examples.
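For the birthday problem, here is a minimal Python sketch. It assumes 365 equally likely birthdays and ignores leap years; the function name is mine.

```python
def birthday_collision_prob(n, days=365):
    """P(at least two of n people share a birthday), assuming equally likely days."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1 - p_all_distinct

for n in (10, 23, 50):
    print(n, birthday_collision_prob(n))
```

With 23 people the probability already exceeds one half.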

I hope you found the YouTube videos and the car-goats problem interesting.

We will continue with Chapter 3 (and finish it) next week.

Thursday, September 23, 2010

Summary for Week 2 (September 21-23)

We started Chapter 2 by discussing an interesting (and historical) graph by C. Minard who described the movement of Napoleon's army in the 1812-13 Russian campaign. (We also heard a piece of classical music.)

We continued by looking at the shape of a distribution and learned about stem-and-leaf diagrams, dot plots and histograms. In all cases we used MegaStat to draw these graphs. We discussed a method for determining the best values of K (the number of classes) and L (the class length), and drew a histogram with the correct number of classes and class lengths.

We discovered that many data sets are symmetrical, but they could also be positively or negatively skewed. (The McMaster salaries we looked at were positively skewed.)

Next, we reviewed the mean, median and mode as measures of central tendency. The variability measures of range (not too useful), variance and standard deviation were introduced. (Recall the example I gave with two buckets of water.) The sample standard deviation has a slightly different formula for which I gave an intuitive explanation.
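The difference between the population and sample formulas is easy to see in Python, whose statistics module provides both. The data set is hypothetical.

```python
from statistics import mean, median, mode, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

print(mean(data), median(data), mode(data))   # measures of central tendency
print(max(data) - min(data))                  # range

pop_sd = pstdev(data)     # population formula: divide by N
sample_sd = stdev(data)   # sample formula: divide by n - 1
print(pop_sd, sample_sd)
```

The sample standard deviation is slightly larger because the sum of squared deviations is divided by n - 1 instead of N.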

We will continue with the Empirical Rule for "normal populations," which we have not finished yet.

I plan to complete Chapter 2 next week.

Tuesday, September 21, 2010

Summary for Week 1 (September 14-16)

We started by taking a quick look at the course outline.

In Chapter 1, we discussed the use of statistics in different disciplines, defined populations and samples and looked at the sampling problem associated with the "Dewey defeats Truman" incident in 1948.

Next, we learned how to generate random numbers (marbles in a bucket, random number table, and finally by MegaStat). We classified data as quantitative vs. qualitative, and cross-sectional vs. time-series. For the time series case we looked at the coffee temperature case (Stella Liebeck vs. McDonald's).

The two concepts of being in statistical control and being capable were also introduced. We wrapped up our discussion with a look at blood pressure data taken over time, thus completing Chapter 1.

Friday, September 17, 2010

Help on Installing MegaStat

As I indicated on page 4 of the course outline, please refer to,
(i) Excel's help on "add-in", and
(ii) MegaStat's <GettingStarted-MegaStat07.pdf> file for installation instructions. This .pdf file is on the CD that comes with the book.

Thursday, September 16, 2010

Section C02 (2010-09-16)

I was planning/hoping to spend a minute or two on plotting a MegaStat graph of 24 coffee temperature values in the Excel file on the course web site:

http://www.business.mcmaster.ca/courses/q600/ChapterComments/documents/CoffeeTemp.xls

As you remember, the system turned itself off at the end of the class and I couldn't show you the Excel file.

Here are the steps to follow to plot the graph (assuming that you have installed MegaStat):

Double-click on the file and start Excel | Add-Ins > MegaStat > Descriptive Statistics > Input Range (A2 to A25) > Choose Runs Plot > OK

Does the Runs Plot indicate that the process (coffee temps) is in statistical control, and capable?

Wednesday, September 8, 2010

First blog from Dr. Parlar

Dear Bus Q600 students:

This is the first time I am using a blog in any of my courses. I hope you will find it useful.

Let me know how we can make good use of this new tool to improve the learning process in our course.

Please sign in (on the right) using your Google, Twitter or Yahoo accounts.