This page uses software programs that we are not using in this course and focuses only on the t-test assuming **UNEQUAL** variances. See our tutorials on your software for information about how to read your software output and be sure to understand how to handle both t-tests and how to use the test for equality of variances to determine which t-test to conduct.

Please note that, while the information will vary in setup and format from program to program, the values are the same, since the same data is used for all the printouts. In addition, the format of output will be different for other tests (ex. correlation, regression, ANOVA); thus, this information pertains only to the output for t-tests.

As reference, here is information on the problem data that was analyzed in the printouts:

This question was asked of a random sample of 239 college students, who were to answer on a scale of 1 to 25. An answer of 1 means personality has maximum importance and looks no importance at all, whereas an answer of 25 means looks have maximum importance and personality no importance at all. The purpose of this survey was to examine whether males and females differ with respect to the importance of looks vs. personality.

**R Output:**

R was used to calculate the two independent samples t-test result, using gender (Gender) as the categorical explanatory/independent variable and the importance of personality (looks) as the quantitative response/dependent variable. The data command used to produce this output is as follows:

The top line tells us that a two sample t-test was calculated.

The second line tells us specifically what was analyzed (ex. data:)

The third line has three values:

- t = -4.6574, which is the value of the t-statistic
- df = 182.973, which is the degrees of freedom. It is used when determining the critical value.
- p-value = 6.143e-06, which is the probability, assuming the null hypothesis is true, of getting a t-statistic at least as extreme as the one observed. The value is written in scientific notation, which needs to be converted to read it as an ordinary decimal. Since this value ends in e-06, the decimal point needs to be moved 6 places to the left. Thus, 6.143e-06 equals .000006143, which is very, very small. Since the P-value is less than .0001, we can reject the null hypothesis that there is no difference between the mean score of Females and the mean score of Males on the importance of looks.
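As a quick check of the scientific-notation conversion described above, here is a minimal Python sketch. Python is not one of the programs shown on this page, and the variable names are our own; the only input is the P-value printed in the R output.

```python
# P-value exactly as printed in the R output (scientific notation).
p_value = float("6.143e-06")

# Write it as an ordinary decimal with 9 digits after the decimal point.
p_decimal = format(p_value, ".9f")
print(p_decimal)

# Check whether the P-value is below .0001.
is_below = p_value < 0.0001
print(is_below)
```

Moving the decimal point 6 places to the left by hand and formatting the number in software should give the same digits.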

The fourth line describes the alternative hypothesis. In this case, it states that the “true difference in means is not equal to 0”; thus, the test assesses the claim that there is a difference, but not that one group is higher or lower than the other group, which makes this a two-sided, two independent samples t-test.

The fifth line tells us that the sixth line contains the 95% confidence interval that can be used to assess the null hypothesis (H_{0} : μ_{1} – μ_{2} = 0). Thus, the 95% confidence interval is: (-3.695865, -1.496292). Since 0 does not fall within the 95% confidence interval, we can reject the null hypothesis that there is no difference between the mean score of Females and the mean score of Males on the importance of looks. As it should be, these results are consistent with the above results, where the P-value was used.

The last three lines give us information on the samples; in this case, group means.

- mean personality score for females: 10.73333
- mean personality score for males: 13.32941

**StatCrunch (edited) Output:**

StatCrunch was used to produce the following three charts, with gender (Gender) as the categorical explanatory/independent variable and the importance of personality (looks) as the quantitative response/dependent variable.

- The first chart contains summary statistics for the two samples (Females and Males) on the quantitative response/dependent variable (looks).
- The second chart contains the (two-sided) two independent samples t-test result, using gender (Gender) as the categorical explanatory/independent variable and the importance of personality (looks) as the quantitative response/dependent variable with the P-value method for assessing the null hypothesis.
- The third chart contains the (two-sided) two independent samples t-test result, using gender (Gender) as the categorical explanatory/independent variable and the importance of personality (looks) as the quantitative response/dependent variable with the 95% confidence interval method for assessing the null hypothesis.

The above chart displays information on the characteristics of the samples.

The first line gives a description of the information contained in each column, while the first column tells us what information is contained in each of the rows:

- Column 1 contains information on the x (explanatory/independent) categorical variable. In this case, there are two levels of the x variable, Gender: males and females. Thus, row 2 refers only to information on the sample of Females, while row 3 refers only to information on the sample of Males.
- Column 2 contains information on the number of observations in each sample: 150 Females and 85 Males.
- Column 3 contains the means for each of the samples: 10.733334 for Females and 13.3294115 for Males.
- Column 4 contains the standard deviation for each of the samples: 4.254751 for Females and 4.0189676 for Males.
- Column 5 contains the standard deviation of the sampling distribution (which is also known as the standard error): 0.347399 for Females and 0.43591824 for Males.
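The standard errors in Column 5 can be reproduced from Columns 2 and 4, since the standard error of a sample mean is the standard deviation divided by the square root of the sample size. The sketch below is in Python, which is not one of the programs shown on this page; the function name `standard_error` is our own.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)

# Standard deviations and sample sizes taken from the summary statistics above.
se_females = standard_error(4.254751, 150)
se_males = standard_error(4.0189676, 85)

print(round(se_females, 6), round(se_males, 6))
```

The two printed values should agree with Column 5 up to rounding.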

The above chart shows the results of the two-sided (since the alternative hypothesis is: H_{a} : μ_{1} – μ_{2} ≠ 0), two independent samples t-test, using the P-value method.

The first line gives a description of the information contained in each column:

- Column 1 contains information on the difference that is being assessed: μ_{1} – μ_{2}, where μ_{1} is the population mean for Females and μ_{2} is the population mean for Males.
- Column 2 contains the difference between the sample means that is being assessed in the t-test. Thus, the mean from the sample of Females (10.733334) minus the mean from the sample of Males (13.3294115), or 10.733334 – 13.3294115, which StatCrunch reports as -2.5960784.
- Column 3 contains the standard error.
- Column 4 contains the degrees of freedom, which, for the t-test assuming unequal variances, is based on the sample sizes and the sample standard deviations and is used to determine the critical value.
- Column 5 contains the t-statistic, which is -4.657358.
- Column 6 contains the P-value, which is <0.0001. Thus, we can reject the null hypothesis that there is no difference between the mean score of Females and the mean score of Males on the importance of looks.
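The t-statistic and degrees of freedom in this table can be reproduced from the summary statistics alone. The sketch below uses the standard unequal-variances (Welch) formulas, which we assume are what the programs on this page apply, since the results match. It is written in Python, which is not one of the programs shown here, and the function name `welch_t_test` is our own. It does not compute the P-value, which would need a t-distribution function that is not in Python's standard library.

```python
import math

def welch_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (standard error of the difference, t-statistic, degrees of freedom)."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first sample mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second sample mean
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se_diff, t_stat, df

# Females are sample 1 and Males are sample 2, as in the output above.
se_diff, t_stat, df = welch_t_test(10.733334, 4.254751, 150,
                                   13.3294115, 4.0189676, 85)
print(round(se_diff, 4), round(t_stat, 3), round(df, 2))
```

Small differences in the last digits are expected, because the inputs here are the rounded values shown in the summary table.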

The above chart shows the results of the two-sided (since the alternative hypothesis is: H_{a} : μ_{1} – μ_{2} ≠ 0), two independent samples t-test, using 95% confidence intervals.

The first line gives a description of the information contained in each column:

- Column 1 contains information on the difference that is being assessed: μ_{1} – μ_{2}, where μ_{1} is the population mean for Females and μ_{2} is the population mean for Males.
- Column 2 contains the difference between the sample means that is being assessed in the t-test. Thus, the mean from the sample of Females (10.733334) minus the mean from the sample of Males (13.3294115), or 10.733334 – 13.3294115, which StatCrunch reports as -2.5960784.
- Column 3 contains the standard error.
- Column 4 contains the degrees of freedom, which, for the t-test assuming unequal variances, is based on the sample sizes and the sample standard deviations and is used to determine the critical value.
- Column 5 contains the lower limit and Column 6 contains the upper limit of the 95% confidence interval that can be used to assess the null hypothesis (H_{0} : μ_{1} – μ_{2} = 0). Thus, the 95% confidence interval is: (-3.6958647, -1.4962921). Since 0 does not fall within the 95% confidence interval, we can reject the null hypothesis that there is no difference between the mean score of Females and the mean score of Males on the importance of looks. As it should be, these results are consistent with the results from the previous table, where the P-value was used.
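A confidence interval for a difference in means is centered on the sample difference, so the limits above can be checked against Column 2. The short Python sketch below (Python is not one of the programs shown on this page, and the variable names are our own) recovers the midpoint and the margin of error from the two limits and tests whether 0 lies inside the interval.

```python
# Limits of the 95% confidence interval from the table above.
lower, upper = -3.6958647, -1.4962921

midpoint = (lower + upper) / 2  # should match the difference between sample means
margin = (upper - lower) / 2    # margin of error on either side of the midpoint
contains_zero = lower <= 0 <= upper

print(round(midpoint, 7), round(margin, 7), contains_zero)
```

If the interval does not contain 0, the two-sided test at the 5% significance level rejects the null hypothesis, which is the same conclusion reached with the P-value.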

**Minitab Output:**

Minitab was used to calculate the two independent samples t-test result, using gender (Gender) as the categorical explanatory/independent variable and the importance of personality (looks) as the quantitative response/dependent variable. The below chart shows the results of the two-sided (since the alternative hypothesis is: H_{a} : μ_{1} – μ_{2} ≠ 0), two independent samples t-test, using both the P-value and 95% confidence interval to assess the null hypothesis.

The top line tells us that a two sample t-test was calculated, using score as the response/dependent variable and gender as the explanatory/independent variable.

The second line tells us that the two sample t-test was calculated with score as the response/dependent variable.

The third, fourth, and fifth lines comprise a table of summary statistics for the response/dependent variable. The third line gives a description of the information contained in each column, while the first column tells us what information is contained in each of the rows:

- Column 1 contains information on the x (explanatory/independent) categorical variable. In this case, there are two levels of the x variable, Gender: males and females. Thus, row 2 refers only to information on the sample of Females, while row 3 refers only to information on the sample of Males.
- Column 2 contains information on the number of observations in each sample: 150 Females and 85 Males.
- Column 3 contains the means for each of the samples: 10.73 for Females and 13.33 for Males.
- Column 4 contains the standard deviation for each of the samples: 4.25 for Females and 4.02 for Males.
- Column 5 contains the standard deviation of the sampling distribution (which is also known as the standard error): 0.35 for Females and 0.44 for Males.

The sixth line contains information on the difference that is being assessed: μ_{1} – μ_{2}; in this case, the population mean of males is subtracted from the population mean of females.

The seventh line contains the difference between the sample means that is being assessed in the t-test. Thus, the mean from the sample of Females (10.73) minus the mean from the sample of Males (13.33), which Minitab reports as -2.596 because it works from the unrounded means.

The eighth line contains the 95% confidence interval that can be used to assess the null hypothesis (H_{0} : μ_{1} – μ_{2} = 0). Thus, the 95% confidence interval is: (-3.696, -1.496). Since 0 does not fall within the 95% confidence interval, we can reject the null hypothesis that there is no difference between the mean score of Females and the mean score of Males on the importance of looks.

The ninth line has three values:

- T-Test of difference = 0 (vs not =): T-Value = -4.66, which tells us that this is a two-sided two sample t-test, with a t-statistic value of -4.66.
- P-value = 0.000, which is the probability, assuming the null hypothesis is true, of getting a t-statistic at least as extreme as the one observed, rounded to three decimal places. Since the P-value cannot actually be 0, it is reported as P < .001, and it shows that we can reject the null hypothesis that there is no difference between the mean score of Females and the mean score of Males on the importance of looks. As it should be, these results are consistent with the results from the previous line (line 8), where the 95% confidence interval was used.
- DF = 182, which is the degrees of freedom. It is used when determining the critical value.
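The Minitab values differ from the R values only in how they are rounded for display. The Python sketch below illustrates this; Python is not one of the programs shown on this page, and the helper `report_p_value` is a hypothetical function of our own that follows one common reporting convention. It also shows that DF = 182 is what you get by rounding the R value of 182.973 down to a whole number, which appears to be what Minitab does here.

```python
import math

def report_p_value(p, decimals=3):
    """Format a P-value for reporting; very small values become 'P < ...'."""
    threshold = 10 ** -decimals
    if p < threshold:
        return f"P < {threshold:.{decimals}f}"
    return f"P = {p:.{decimals}f}"

# The P-value from the R output displays as 0.000 at three decimal places.
displayed = f"{6.143e-06:.3f}"
reported = report_p_value(6.143e-06)

# Degrees of freedom from the R output, rounded down to a whole number.
df_whole = math.floor(182.973)

print(displayed, "|", reported, "|", df_whole)
```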

**TI Calculator Output:**

The TI calculator was used to calculate the two independent samples t-test result, using gender (Gender) as the categorical explanatory/independent variable and the importance of personality (looks) as the quantitative response/dependent variable. The below chart shows the results of the two-sided (since the alternative hypothesis is: H_{A} : μ_{1} ≠ μ_{2}), two independent samples t-test, using the P-value to assess the null hypothesis.

The top lines of both the left and right images tell us that a two sample t-test was calculated, while the second lines tell us that the alternative hypothesis is H_{A} : μ_{1} ≠ μ_{2}. Thus, these are the results for a two-sided two sample t-test.

**Left Chart:**

The third line shows the t-statistic, which is -4.657358347.

The fourth line shows the p-value = 6.1426775e-06, which is the probability, assuming the null hypothesis is true, of getting a t-statistic at least as extreme as the one observed. The value is written in scientific notation, which needs to be converted to read it as an ordinary decimal. Since this value ends in e-06, the decimal point needs to be moved 6 places to the left. Thus, 6.1426775e-06 equals .0000061426775, which is very, very small. Since the P-value is less than .0001, we can reject the null hypothesis that there is no difference between the mean score of Females and the mean score of Males on the importance of looks.

The fifth line contains the degrees of freedom, which is df = 182.9726756. It is used when determining the critical value.

The sixth line contains the mean for the first sample, which is Females: 10.73333333.

The seventh line contains the mean for the second sample, which is Males: 13.32941176.

**Right Chart (which is a continuation of the left chart, when viewed on the calculator screen):**

The third line contains the mean for the second sample, which is Males: 13.32941176.

The fourth line contains the standard deviation for the first sample, which is Females: 4.25475126.

The fifth line contains the standard deviation for the second sample, which is Males: 4.01896763.

The sixth line contains the sample size for the first sample, which is 150 Females.

The seventh line contains the sample size for the second sample, which is 85 Males.

This document is linked from Two Independent Samples.

A line is described by a set of points **(X,Y)** that obey a particular relationship between **X** and **Y**. That relationship is called the equation of the line, which we will express in the following form: **Y = a + bX**. In this equation, **a** and **b** are constants that can be either negative or positive. The reason to write the line in this form is that the constants **a** and **b** tell us what the line looks like, as follows:

- The **intercept (a)** is the value that **Y** takes when **X** = 0.
- The **slope (b)** is the change in **Y** for every increase of 1 unit in **X**.
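The two definitions above can be checked numerically. The following Python sketch is only an illustration: the function name `line` and the constants a = 2 and b = 3 are hypothetical choices of our own, not values from this page.

```python
def line(a, b):
    """Return a function that computes Y = a + b*X for the given constants."""
    def y(x):
        return a + b * x
    return y

# Hypothetical constants chosen only for illustration.
y = line(a=2, b=3)

intercept = y(0)     # value of Y when X = 0
slope = y(5) - y(4)  # change in Y when X increases by 1 unit

print(intercept, slope)
```

Whatever pair of X values one unit apart is used, the change in Y is the same, which is what makes the slope a single constant for the whole line.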

The slope and intercept are indicated with arrows on the following diagram:

The technique that specifies the dependence of the response variable on the explanatory variable is called **regression**. When that dependence is linear (which is the case in our examples in this section), the technique is called **linear regression**. Linear regression is therefore the technique of finding the line that best fits the pattern of the linear relationship (or in other words, the line that best describes how the response variable linearly depends on the explanatory variable).

To understand how such a line is chosen, consider the following very simplified version of the age-distance example (we left just 6 of the drivers on the scatterplot):

Consider the line: **Y = 1 + (1/3)X**

The intercept is 1. The slope is 1/3, and the graph of this line is, therefore:

Consider the line: **Y = 1 – (1/3)X**

The intercept is 1. The slope is -1/3, and the graph of this line is, therefore:
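To see how the sign of the slope changes the line, the sketch below evaluates both lines (intercept 1 with slope 1/3, and intercept 1 with slope -1/3) at a few values of X. It is written in Python with exact fractions; the function name `y_on_line` and the chosen X values are our own.

```python
from fractions import Fraction

def y_on_line(intercept, slope, x):
    """Value of Y on the line Y = intercept + slope * X."""
    return intercept + slope * x

x_values = (0, 3, 6)

# Same intercept, opposite slopes.
rising = [y_on_line(1, Fraction(1, 3), x) for x in x_values]
falling = [y_on_line(1, Fraction(-1, 3), x) for x in x_values]

print([str(v) for v in rising])
print([str(v) for v in falling])
```

Both lines pass through Y = 1 at X = 0. With the positive slope, Y goes up by 1 for every 3 units of X; with the negative slope, it goes down by 1.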

This document is linked from Linear Relationships – Linear Regression.
