The document provides information about the SYSTAT statistical software. It describes SYSTAT as a powerful system for statistical analysis and graphical presentation of data. It offers features ranging from basic descriptive statistics to advanced statistical algorithms. The user interface has output, data editing, and graph editing panels. It allows importing and exporting data from various formats and performing statistical analyses through an intuitive Windows interface.
SYSTAT is a powerful statistical analysis and graphical presentation software. It offers numerous statistical features ranging from simple descriptive statistics to advanced algorithms. It has an intuitive Windows interface and handles various data formats.
SYSTAT provides statistical analysis, data handling, graphing capabilities and an intuitive user interface. It offers features like drag and drop, right click functionality, comprehensive dialogs and a flexible command language.
The user interface is organized into the Viewspace, Workspace and Commandspace. The Viewspace has tabs for output, data and graphs. The Workspace has tabs for output organizer and dynamic explorer. The Commandspace has tabs for interactive, batch mode and log commands.
1. Introduction
SYSTAT was designed for statistical analysis and graphical presentation of scientific and engineering data. To use this tutorial, familiarity with Windows 95/98/2000/NT/XP is helpful.
SYSTAT provides a powerful statistical and graphical analysis system in a new graphical user interface environment using descriptive menus, toolbars and dialog boxes. It offers numerous statistical features from simple descriptive statistics to highly sophisticated statistical algorithms.
Taking advantage of the enhanced user interface and environment, SYSTAT offers many major performance enhancements for speed and increased ease of use. Most tasks can be accomplished simply by pointing and clicking the mouse. SYSTAT makes extensive use of drag-and-drop and right-click mouse functionality. SYSTAT's intuitive Windows interface and flexible command language are designed to make your research more efficient. You can quickly locate advanced options through clear, comprehensive dialogs.
SYSTAT also offers a huge data worksheet for powerful data handling. SYSTAT handles most of the popular data formats, such as Excel, SPSS, SAS, BMDP, MINITAB, S-Plus, Statistica, Stata, JMP, and ASCII. All matrix operations and computations are menu driven.
The Graphics module of SYSTAT 11 is an enhanced version of the graphics module of SYSTAT 10.2. This module offers better user interactivity for working with all graphical outputs of the SYSTAT application. Users can easily create 2D and 3D graphs using the appropriate top toolbar icons, which provide tool tip descriptions of graphs. Graphs can be created from the Graph menu on the top toolbar or by using the Graph Gallery, which makes it possible to produce complex graphs (e.g. global map with contour, 3D surface plots with contour projections, etc.) with a point and click of the mouse. Double-clicking a graph brings up a dialog for editing most graph attributes from one comprehensive 'dynamic dialogue'. Each graph attribute, such as line thickness, scale, or choice of symbols, can be changed with mouse clicks. Thus simple or complex changes to a graph or set of graphs can be made quickly and exactly as the user requires.
2. Getting Started with SYSTAT
2.1 Opening SYSTAT for Windows
To start SYSTAT for Windows NT4, 98, 2000, ME and XP, choose: Start > All Programs > SYSTAT11 > SYSTAT 11
Alternatively, you can double-click the SYSTAT icon to get started with SYSTAT.
2.2 User Interface The user interface of SYSTAT is organized into three spaces: I. Viewspace II. Workspace III. Commandspace
I. Viewspace has the following tabs: Output Pane. Graphs and statistical results appear in the Output Pane. You can edit, print and save the output displayed in the Output Pane. Data Editor. The Data Editor displays the data in a row-by-column format. Each row is a case and each column is a variable. You can enter, edit, view, and save data in the Data Editor. Graph Editor. You can edit and save graphs in the Graph Editor.
The Output Pane is fixed in the Viewspace, whereas the Data Editor and Graph Editor can be moved to the Workspace and restored by double-clicking on the tab. The advantage is that any two of these tabs can be viewed simultaneously.
II. Workspace has the following tabs: Output Organizer. The Output Organizer tab helps primarily to navigate through the results of your statistical analysis. You can quickly navigate to specific portions of output without having to use the Output Pane scrollbars. Dynamic Explorer. The Dynamic Explorer can be used to rotate 3-D graphs, apply power transformations to values on one or more axes, and change the confidence intervals, ellipses, and kernels in scatter plots. By default, the Dynamic Explorer appears automatically when the Graph Editor tab is active.
III. Commandspace has the following tabs: Interactive. In the Interactive tab, you can enter commands at the command prompt (>) and issue them by pressing the Enter key. Untitled. The Untitled tab enables you to run commands in batch mode. You can open, edit, submit and save SYSTAT command files (.syc or .cmd). Log. In the Log tab, you can view the record of the commands issued during the SYSTAT session (through dialogs or in the Interactive mode). You can cycle through the three tabs using the following keyboard shortcuts: CTRL+ALT+TAB. Shifts focus one tab to the right. CTRL+ALT+SHIFT+TAB. Shifts focus one tab to the left.
3. SYSTAT Data, Command and Output files
Data files. You can save data files with the (.SYD) extension. Command files. A command file is a text file that contains SYSTAT commands. Saving your analyses in a command file allows you to repeat them at a later date. These files are saved with the (.SYC) extension. Output files. SYSTAT displays statistical and graphical output in the Output Pane. You can save the output in SYSTAT output format (.SYO), Rich Text Format (.RTF) and HyperText Markup Language format (.HTM).
4. The Data Editor The Data Editor is used for entering, editing, and saving data. Entering data is a straightforward process. Editing data includes changing variable names or attributes, adding and deleting cases or variables, moving variables or cases, and correcting data errors.
SYSTAT imports and exports data in all popular formats, including Excel, ASCII Text, Lotus, BMDP Data, SPSS, SAS, StatView, Stata, Statistica, JMP, Minitab and S-Plus as well as from any ODBC compliant application. Data can be entered or imported in SYSTAT in the following ways:
Entering data Consider the following data that has records about seven dinners from the frozen-food section of a grocery store.
Brand$             Calories   Fat
Lean Cuisine          240       5
Weight Watchers       220       6
Healthy Choice        250       3
Stouffer              370      19
Gourmet               440      26
Tyson                 330      14
Swanson               300      12

To enter these data into the Data Editor, from the menu choose: File Menu Data
This opens the Data Editor (or clears its contents if it is already open).
Double-click (VAR00001) to open the Variable Properties dialog box.
Type BRAND$ for the name. The dollar sign ($) at the end of the variable name indicates that the variable is a string or a character variable, as opposed to numeric variable. Note: Variable names cannot exceed 12 characters.
Select String as the Variable type. Enter the number of characters in the Characters box. From the Character drop-down list, select the desired number of characters. Click OK to complete the variable definition for VAR00001. To type CALORIES as Variable name, double-click (VAR00002) to open the Variable Properties dialog box. Select Numeric as the Variable type. Enter the number of characters in the Characters box. [The decimal point is considered as a character.] Select the number of Decimal places to display. Click OK to complete the variable definition for VAR00002. Repeat this process for the FAT variable, selecting Numeric as the variable type. Click the top left data cell (under the name of the first variable) and enter the data. To move across rows, press Enter or Tab after each entry. To move down columns, press the down arrow key.
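The naming rule noted earlier (a trailing $ marks a string variable, and names are limited to 12 characters) can be sketched as a small check. This is an illustration in Python, not SYSTAT's own code: `describe_variable_name` is a hypothetical helper, and counting the trailing `$` toward the 12-character limit is an assumption.

```python
MAX_NAME_LENGTH = 12  # limit stated in the note on variable names


def describe_variable_name(name: str) -> str:
    """Classify a variable name as 'string' or 'numeric' (hypothetical helper)."""
    # Assumption: the trailing "$" counts toward the 12-character limit.
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"{name!r} exceeds {MAX_NAME_LENGTH} characters")
    return "string" if name.endswith("$") else "numeric"


for name in ("BRAND$", "CALORIES", "FAT"):
    print(name, "->", describe_variable_name(name))
```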
Note: To change how the Enter key navigates in the Data Editor, from the menu choose: Edit > Options > Data
Click either of the two radio buttons below Data Editor Cursor. Once the data are entered in the Data Editor, the data file should look something like this:
To save the data, from the menu choose: File > Save As
Importing Data
To import IRIS.xls (data in Excel format), from the menu choose: File > Open > Data...
From the Files of type drop-down list, choose Microsoft Excel. Select the IRIS.xls file. Select the desired Excel sheet and click OK. The data file in the Data Editor should look something like this:
Statistical Analyses through SYSTAT
5. Descriptive Statistics Descriptive Statistics offers basic statistics and stem-and-leaf plot for columns as well as rows. The basic statistics are: number of observations (N), minimum, maximum, mean, sum, standard deviation, variance, coefficient of variation (CV), range, median, standard error of mean, etc. Besides the above options, you can perform the Shapiro-Wilk test for normality.
Example 5.1: We will use the IRIS data to compute descriptive statistics. This data set consists of four measurements made on 50 random samples of Iris flowers from each of the three species of Setosa, Versicolor, and Virginica (coded as 1, 2, and 3, respectively). The four measurements are Sepal length, Sepal width, Petal length, and Petal width in cm. This is a famous data set from Fisher (1936).
To calculate basic statistics for the iris data, from the menu choose: Analysis > Descriptive Statistics > Basic Statistics
Choose SEPALWID and add it to the Selected variable(s) list. Select N, Mean, SD, Minimum, Maximum. To check for normality, select the Shapiro-Wilk normality test option. Click OK.
The following output is displayed in the Output Pane:
                 SEPALWID
N of cases            150
Minimum             2.000
Maximum             4.400
Mean                3.057
Standard Dev        0.436
SW Statistic        0.985
SW P-Value          0.101
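For readers who want to see how these summary measures are defined, here is a minimal Python sketch. The IRIS values are not listed in this tutorial, so the sketch uses the Calories column of the frozen-dinner data entered earlier. The formulas are the standard textbook ones (sample SD with n - 1 in the denominator); that SYSTAT uses exactly these conventions is an assumption.

```python
import math
import statistics

# Calories column from the frozen-dinner data entered earlier in this tutorial.
calories = [240, 220, 250, 370, 440, 330, 300]

n = len(calories)
mean = statistics.mean(calories)
sd = statistics.stdev(calories)        # sample SD (n - 1 in the denominator)
summary = {
    "N of cases": n,
    "Minimum": min(calories),
    "Maximum": max(calories),
    "Range": max(calories) - min(calories),
    "Sum": sum(calories),
    "Median": statistics.median(calories),
    "Mean": mean,
    "Standard Dev": sd,
    "Variance": statistics.variance(calories),
    "Std. Error": sd / math.sqrt(n),   # standard error of the mean
    "C.V.": sd / mean,                 # coefficient of variation
}

for label, value in summary.items():
    print(f"{label:<14}{value:>12.3f}")
```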
6. Correlation The Correlation feature computes correlations and measures of similarity and distance.
Example 6.1: In the previous example, we computed basic statistics for SEPALWID. We will now compute the correlations between the four variables SEPALLEN, SEPALWID, PETALLEN and PETALWID. From the menu choose: Analysis > Correlations > Simple...
Often, we may want to compute certain statistics separately for each group defined by certain variable(s) in the data set. In this case, we may want to examine if the correlations are of the same magnitude in the three species. SYSTAT facilitates such computations through its By Groups feature. Let us use By Groups in the Data menu to request separate results for each level of SPECIES (the grouping variable). From the menu choose: Data > By Groups
In the By Groups dialog box, select SPECIES as the variable. Click OK. Return to the Simple Correlations dialog box. Select all four variables and add them to the Selected variable(s) list. Click OK.
The following output is displayed in the Output Pane:
[Quick Graph: scatterplot matrix (SPLOM) of SEPALLEN, SEPALWID, PETALLEN and PETALWID]
Number of observations: 50
Quick Graphs. Quick Graphs are graphs which are produced along with numeric output without the user invoking the Graph menu. A number of SYSTAT procedures include Quick Graphs. The Quick Graphs above are automatically generated when you request correlations (with the Quick Graphs option on). If you want to turn off the Quick Graph facility: Under the Edit menu, click Options. In the Global Options dialog, select the Output tab. Turn off the Display statistical Quick Graphs option.
The Quick Graphs in this example are scatterplot matrices (SPLOMs). In each SPLOM there is one bivariate scatterplot corresponding to each entry in the correlation matrix that follows. A univariate histogram for each variable is displayed along the diagonal, and 75% normal distribution-based confidence ellipses are displayed within each plot. For species 3 (i.e. Virginica), the plot of SEPALLEN and PETALLEN has the narrowest ellipse, and thus the strongest correlation, which is 0.8642.
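As a reminder of what a simple (Pearson) correlation measures, the following Python sketch computes it by hand. The IRIS values are not reproduced in this tutorial, so the example uses the Calories and Fat columns of the frozen-dinner data entered earlier. `pearson_r` is an illustrative helper, not a SYSTAT function.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Calories and Fat columns from the frozen-dinner data entered earlier.
calories = [240, 220, 250, 370, 440, 330, 300]
fat = [5, 6, 3, 19, 26, 14, 12]

r = pearson_r(calories, fat)
print(f"r(CALORIES, FAT) = {r:.3f}")
```

Applying the same function to the subset of cases belonging to each level of a grouping variable mirrors what By Groups does for the correlations.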
7. Hypothesis Testing
SYSTAT provides several parametric tests of hypotheses and confidence intervals for means, variances, proportions, and correlations. This section provides examples of the one-sample t-test and the paired t-test.
a. One-Sample t-test The one-sample t test is used to test if the mean of the population (from which the data set is a sample) is equal to a hypothesized value.
Example 7.1: One-Sample t-test
Let us study the effect of cigarette smoking on the carbon monoxide diffusing capacity (DL) of the lung. Ronald Knudson, Walter Klatenborn, and Benjamin Burrows found that current smokers had DL readings significantly lower than those of ex-smokers or nonsmokers. Do the data indicate that the mean DL reading ($\mu$) for current smokers is significantly lower than 100 DL?
The null hypothesis is $H_0: \mu = 100$ against the alternative hypothesis $H_1: \mu < 100$.
The carbon monoxide diffusing capacities for a random sample of n=20 are entered in the Data Editor.
To perform the one-sample t-test, from the menu choose: Analysis > Hypothesis testing > Mean > One-Sample t-test
Add DL_Reading to the Selected variable(s) list. Enter Mean 100. From the drop-down list, select the alternative type as less than. Click OK.
The following output is displayed:

One-sample t-test of DL_READING with 20 cases
Ho: Mean = 100.000 against Alternative = 'less than'
Mean = 89.855
95.00% confidence bound = 95.617
SD = 14.904
t = -3.044
df = 19
p-value = 0.003

[Quick Graph: histogram of DL_READING (x-axis: DL_READING, 60 to 130; y-axis: Count, 0 to 7)]

Conclusion: We observe that the one-sided p-value is 0.003, which is highly significant. The mean DL reading ($\mu$) for current smokers is significantly lower than 100 DL.
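The t statistic and the confidence bound in the output can be checked from the reported summary statistics. The sketch below is a hand calculation in Python, not SYSTAT code. The one-sided 95% critical value of Student's t with 19 degrees of freedom (1.729) is taken from a standard t table and is an assumption of this sketch.

```python
import math

# Summary statistics reported in the output above.
n = 20
sample_mean = 89.855
sample_sd = 14.904
mu0 = 100.0                      # hypothesized mean under H0

# Assumption: one-sided 95% critical value of Student's t with 19 df,
# taken from a standard t table (not computed here).
T_CRIT_95_DF19 = 1.729

standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu0) / standard_error
df = n - 1
upper_bound = sample_mean + T_CRIT_95_DF19 * standard_error

print(f"t = {t_stat:.3f}, df = {df}")
print(f"95% upper confidence bound = {upper_bound:.3f}")
```

Because the upper bound lies below the hypothesized value of 100, the bound and the one-sided test lead to the same conclusion.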
b. Paired t-test The paired t-test assesses the equality of two means in experiments involving paired measurements. Example 7.2: Paired t-test To illustrate the paired t-test we use the data from Hand et al. (1996). The data were collected on the systolic blood pressure of 15 patients (MacGregor et al., 1979). The interest is to see if there is any difference in the systolic blood pressure of the patients, before and after the administration of a drug called captopril. The BP data file gives the supine systolic and diastolic blood pressures (mm Hg) for 15 patients with moderate essential hypertension, immediately before and two hours after administering the drug.
The null hypothesis is $H_0: \mu_d = 0$ (i.e. there is no difference in the systolic blood pressure of the patients before and after the administration of the drug). The alternative hypothesis is $H_1: \mu_d > 0$ (i.e. there is a positive difference in the systolic blood pressure of the patients between before and after the administration of the drug, indicating that the drug has the desired effect).
To perform the paired t-test, from the menu choose: Analysis > Hypothesis testing > Mean > Paired t-test
Add SYSBP_BEFORE and SYSBP_AFTER in the Selected variable(s) list. From the drop-down list, select the alternative type as greater than. Click OK.
The output is displayed in the Output Organizer:

Paired samples t-test on SYSBP_BEFORE vs SYSBP_AFTER with 15 cases
Alternative = 'greater than'
Mean SYSBP_BEFORE = 176.933
Mean SYSBP_AFTER = 158.000
Mean difference = 18.933
95.00% confidence bound = 14.828
SD of difference = 9.027
t = 8.123
df = 14
p-value = 0.000

[Quick Graph: SYSBP_BEFORE and SYSBP_AFTER values by Index of Case (Value axis: 120 to 220)]
From the above graph, it is seen that the systolic blood pressure has decreased after the administration of the drug captopril. The test results (mean difference=18.933, p=0.000) indicate that the drug captopril reduces the systolic blood pressure.
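A paired t-test is a one-sample t-test applied to the within-pair differences. The sketch below shows that mechanism in Python. The individual blood pressure readings are not listed in this tutorial, so the `before` and `after` lists are hypothetical values for illustration only; the final lines cross-check the t statistic against the summary values reported in the output above.

```python
import math
import statistics


def paired_t(before, after):
    """Return (mean difference, SD of differences, t, df) for paired data."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d, sd_d, mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical readings for illustration only (NOT the captopril data).
before = [180, 172, 165, 190]
after = [165, 160, 158, 170]
print(paired_t(before, after))

# Cross-check against the summary statistics reported in the output above.
t_reported = 18.933 / (9.027 / math.sqrt(15))
print(f"t from reported summary = {t_reported:.3f}")
```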
8. R x C Contingency Table A contingency table provides a display of (joint) frequencies of categorical (or discrete) data to study relationships between two or more variables. Using Crosstabulation, you can analyze and save frequency tables that are formed by categorical variables.
Example 8.1: Contingency Table This example uses questionnaire data from a community survey (Afifi et al., 2004). The survey was conducted to study depression and help-seeking behavior among adults. The CESD depression index was constructed by asking people to respond to 20 items. The SURVEY2.SYD data file includes a record (case) for each of the 256 subjects in the sample. The data set consists of following variables:
ID SEX AGE MARITAL EDUCATN EMPLOY INCOME RELIGION BLUE DEPRESS LONELY CRY SAD FEARFUL FAILURE AS_GOOD HOPEFUL HAPPY ENJOY BOTHERED NO_EAT EFFORT BADSLEEP GETGOING MIND TALKLESS UNFRNDLY DISLIKE TOTAL CASECONT DRINK HEALTHY DOCTOR MEDS BED_DAYS ILLNESS CHRONIC MARITAL$ SEX$ AGE$ EDUC$
To study the relationship between depression and education, label the values of EDUCATN and CASECONT with category names using the Label dialog box. To open the Label dialog box, from the menu choose: Data > Label
Select EDUCATN as the variable. Type the value(s) that require labels. Type the label for each specified value. Click OK. Repeat the process for the variable CASECONT and label the value 1 as depressed and 0 as normal.
To tabulate, from the menu choose: Analysis > Tables > Two-Way
Select EDUCATN as the Row variable(s) and CASECONT as the Column variable. Below Tables, check the Frequencies and the Row percents check boxes. Click OK.
The output is displayed in the Output Pane.

Frequencies
EDUCATN (rows) by CASECONT (columns)

            depressed    normal     Total
          +----------------------+
Dropout   |     14          36   |     50
HS grad   |     18          80   |     98
College   |     11          75   |     86
Degree +  |      1          21   |     22
          +----------------------+
Total           44         212        256

Row percents
EDUCATN (rows) by CASECONT (columns)

            depressed    normal      Total       N
          +----------------------+
Dropout   |   28.000     72.000  |  100.000     50
HS grad   |   18.367     81.633  |  100.000     98
College   |   12.791     87.209  |  100.000     86
Degree +  |    4.545     95.455  |  100.000     22
          +----------------------+
Total         17.187     82.813     100.000
N                 44        212                256

Test statistic          Value      df     Prob
Pearson Chi-square      7.841    3.000   0.049
Conclusion: As the level of education increases, the proportion of depressed subjects decreases. Of those not graduating from high school (Dropout), 28% are depressed, and 4.55% of those with advanced degrees are depressed. Notice that the Pearson chi-square is marginally significant (p value = 0.049). It tests the hypothesis that the percentage of depressed is the same in all education groups.
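The Pearson chi-square reported above can be reproduced from the frequency table alone. The following Python sketch computes the expected counts under independence, the test statistic, and the row percents. The helper `chi2_sf_df3` uses the closed-form upper-tail probability that holds only for 3 degrees of freedom; it is an illustrative shortcut, not how SYSTAT computes the p-value.

```python
import math

# Observed frequencies from the table above:
# rows = EDUCATN (Dropout, HS grad, College, Degree +),
# columns = CASECONT (depressed, normal).
observed = [[14, 36], [18, 80], [11, 75], [1, 21]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count under independence of rows and columns.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variate with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


row_percents = [[100 * c / sum(row) for c in row] for row in observed]
p_value = chi2_sf_df3(chi_square)
print(f"Pearson chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
print([round(r[0], 3) for r in row_percents])  # percent depressed per row
```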
9. Fitting Distributions The Fitting Distributions feature enables you to assess whether the observed data can be modeled by a distribution from a parametric family of distributions with appropriately chosen parameter values.
Example 9.1: Fitting Normal Distribution
The data in FOREARM1 contain the length of the forearm (in inches) from Pearson and Lee (1903). A normal distribution may be an appropriate model to describe the data on the forearm length. To fit a normal distribution, from the menu choose: Analysis > Fitting Distributions > Continuous
Add ARMLENGTH in the Selected variable(s) list. Select Distribution as Normal.
The output is displayed in the Output Pane:
Variable Name: ARMLENGTH
Distribution: Normal
Estimated: Location or mean (mu) = 18.802143
           Scale or SD (sigma) = 1.116466
Estimation of parameter(s): Maximum likelihood method.

Test Results:
   LimitL      LimitU    Observed    Expected
      .       17.1600      11.0        9.8934
   17.1600    17.6900      12.0       12.4498
   17.6900    18.2200      16.0       19.8022
   18.2200    18.7500      29.0       25.2471
   18.7500    19.2800      22.0       25.8024
   19.2800    19.8100      24.0       21.1380
   19.8100    20.3400      11.0       13.8807
   20.3400       .         15.0       11.7865
                          140.0      140.0000

Chi-square test statistic = 3.849814   df = 5   p-value = 0.571236
Kolmogorov-Smirnov test statistic = 0.047870   Lilliefors Probability (2-tail) = 0.554270
Shapiro-Wilk test statistic for normality = 0.991759   p-value = 0.590263

[Quick Graph: FITTED DISTRIBUTION (x-axis: ARMLENGTH, 16 to 22; left axis: Count, 0 to 30; right axis: Proportion per Bar, 0.0 to 0.2)]
Conclusion: The above analysis indicates that a normal distribution fits the data well.
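The chi-square goodness-of-fit statistic in the output follows directly from the Observed and Expected columns. A short Python check, using only the values printed in the Test Results table, is given below. The degrees-of-freedom rule (classes minus one, minus the number of estimated parameters) is the usual textbook convention and agrees with the df = 5 reported.

```python
# Observed and expected counts from the "Test Results" table above.
observed = [11.0, 12.0, 16.0, 29.0, 22.0, 24.0, 11.0, 15.0]
expected = [9.8934, 12.4498, 19.8022, 25.2471, 25.8024, 21.1380, 13.8807, 11.7865]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: classes - 1 - number of estimated parameters (mu, sigma).
n_estimated = 2
df = len(observed) - 1 - n_estimated

print(f"Chi-square = {chi_square:.4f}, df = {df}")
```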
10. Analysis of Variance We used the t-test for comparing the mean of one sample with a specified value or for comparing the means of two groups. In many situations there is a need to compare several means and to test the significance of differences between three or more means from independently sampled populations.
Example 10.1: One Way ANOVA This example uses a one-way design to compare average typing speed for three groups of typists. Fourteen beginning typists were randomly assigned to three types of machines and given speed tests. The following are their typing speeds in words per minute:
Electric    Word processor    Plain old
   52             67              52
   47             73              43
   51             70              47
   49             75              44
   53             64

Does the equipment influence typing performance?
$H_0$: The average speeds of the three machines are the same. $H_1$: The average speeds of the three machines are not all the same. To carry out analysis of variance using the above data, we need to reorganize the data in a form suitable for SYSTAT. This is done by using the 'Reshape' feature and 'wrapping' the columns as follows. Wrapping puts the group variable in one column and the measurement variable in another column. Thus we need to wrap the data into two columns, for which from the menu choose: Data > Reshape
The data file looks as below:
The variable MEASURE is the typing speed using three types of machines. The levels 1, 2 and 3 in the TRIAL column correspond to the machines ELECTRIC, WORD PROCESSOR and PLAIN OLD, respectively. Of course, you might like to rename 'Trial' as 'Equipment$' and 'Measure' as 'Speed' using the Variable Properties dialog. Now let us do one-way analysis of variance using the wrapped data. To perform One-Way ANOVA, from the menu choose:
Analysis > Analysis of Variance > Estimate Model
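The idea behind wrapping (one column for the group, one for the measurement) can be sketched in Python as follows. This illustrates the reshaping only and is not SYSTAT's implementation. The row order produced here, and the use of `None` for the empty cell, are assumptions.

```python
# Typing speeds from the example data; None marks the empty cell in the
# "Plain old" column (that group has only four typists).
wide = {
    "Electric":       [52, 47, 51, 49, 53],
    "Word processor": [67, 73, 70, 75, 64],
    "Plain old":      [52, 43, 47, 44, None],
}

# Wrap: one row per cell, with the column index in TRIAL and the value in MEASURE.
# Assumption: rows are emitted case by case; SYSTAT's actual row order may differ.
n_cases = len(wide["Electric"])
long_rows = [
    (trial, values[case])
    for case in range(n_cases)
    for trial, values in enumerate(wide.values(), start=1)
]

for trial, measure in long_rows:
    print(trial, measure)
```

The wrapped data have 15 rows, one of which has a missing MEASURE value. This is consistent with the message about one case deleted due to missing data in the ANOVA output.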
Add Measure as the Dependent variable. Add TRIAL as the Factor. Click OK.
The output is displayed in the Output Pane:
Effects coding used for categorical variables in model.
Categorical values encountered during processing are: TRIAL (3 levels) 1, 2, 3 1 case(s) deleted due to missing data.
Analysis of Variance
Source     Sum-of-Squares    df    Mean-Square    F-ratio       P
TRIAL         1469.3571       2      734.6786     53.5196    0.0000
Error          151.0000      11       13.7273

[Quick Graph: Least Squares Means (x-axis: TRIAL, levels 1, 2, 3; y-axis: MEASURE, 37.0 to 78.0)]

Conclusion: We reject the null hypothesis as the p value is small. The Quick Graph illustrates this finding. Although the typists using electric and plain old typewriters have similar average speeds (50.4 and 46.5, respectively), the word processor group has a much higher average speed.
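The ANOVA table can be verified by hand from the typing data. The Python sketch below computes the group means, the between-group and error sums of squares, and the F-ratio using the standard one-way ANOVA formulas. It is a cross-check, not SYSTAT code.

```python
groups = {
    1: [52, 47, 51, 49, 53],   # Electric
    2: [67, 73, 70, 75, 64],   # Word processor
    3: [52, 43, 47, 44],       # Plain old (missing case already dropped)
}

all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)
means = {k: sum(v) / len(v) for k, v in groups.items()}

# Between-group and within-group (error) sums of squares.
ss_between = sum(len(v) * (means[k] - grand_mean) ** 2 for k, v in groups.items())
ss_error = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)

df_between = len(groups) - 1
df_error = len(all_values) - len(groups)
f_ratio = (ss_between / df_between) / (ss_error / df_error)

print(means)
print(f"SS(TRIAL) = {ss_between:.4f}, SS(Error) = {ss_error:.4f}, F = {f_ratio:.4f}")
```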
Example 10.2: Two Way ANOVA
Consider the following data from a two-factor (Drug & Disease) experiment, from Afifi and Azen (1972), cited in Kutner (1974). The dependent variable, SYSINCR, is the change in systolic blood pressure after administering one of four different drugs to patients with one of three different diseases. Patients were assigned randomly to one of the possible drugs. The data are stored in the SYSTAT file AFIFI.
Conclusion: In two-way ANOVA, begin the analysis by looking at the interaction effect. The DRUG * DISEASE interaction is not significant (p = 0.396), so shift your focus to the main effects. The DRUG effect is significant (p < 0.0005), but the DISEASE effect is not (p = 0.164). Thus, at least one of the drugs differs from the others with respect to blood pressure change, but blood pressure change does not vary significantly across diseases.
Note: Along with the ANOVA table, SYSTAT also displays the estimates of the model parameters. To get the estimates, you need to select LONG as the Print option. To do so, from the menu choose: Edit > Options. Select the Output tab. From the Output results, select Length as Long.
11. Linear Regression Regression analysis is used to investigate the relationship between a response variable and one or more predictors.
Example 11.1: Let us study the relationship between noise exposure (predictor or independent variable) and hypertension (dependent or response variable). The following data were collected on Y (blood pressure rise in millimeters of mercury) and X (sound pressure level in decibels).
Analysis of Variance
Source         Sum-of-Squares    df    Mean-Square    F-ratio      P
Regression         92.934         1       92.934      53.502    0.000
Residual           31.266        18        1.737

*** WARNING ***
Case 5 is an outlier (Studentized Residual = 2.741)

Durbin-Watson D Statistic        2.290
First Order Autocorrelation     -0.179

Conclusion. The estimates of the regression coefficients are -10.132 and 0.174, so the regression equation is: Y = -10.132 + 0.174X. The F-ratio in the analysis of variance table is used to test the hypothesis that the slope is 0 (or, for multiple regression, that all slopes are 0). The F is large when the independent variable(s) help to explain the variation in the dependent variable. Here, there is a significant linear relation between Y and X. Thus, we reject the hypothesis that the slope of the regression line is zero (F-ratio = 53.502, p value (P) < 0.0005). SYSTAT also outputs statistics and warnings for outlier detection and for testing the assumptions of linear regression.
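The relations between the entries of the regression ANOVA table can be sketched in Python using only the reported values. The raw data are not reproduced in this tutorial, so nothing is re-estimated here. The R-squared line uses the usual definition SSR / (SSR + SSE), and the input X = 90 passed to `predict` is a hypothetical value chosen for illustration.

```python
# Values reported in the output above.
ss_regression, df_regression = 92.934, 1
ss_residual, df_residual = 31.266, 18
intercept, slope = -10.132, 0.174

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
r_squared = ss_regression / (ss_regression + ss_residual)


def predict(x):
    """Fitted blood pressure rise (mm Hg) at sound pressure level x (dB)."""
    return intercept + slope * x


print(f"F = {f_ratio:.3f}, R-squared = {r_squared:.3f}")
print(f"Predicted Y at X = 90 (hypothetical input): {predict(90):.3f}")
```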
12. Logistic Regression Logistic regression describes the relationship between a dichotomous response variable and a set of explanatory (predictor or independents) variables. The explanatory variables may be continuous or (with dummy variables) discrete.
Example 12.1: Binary Logistic Regression
To illustrate the use of binary logistic regression, we consider a hypothetical data set. For 15 skiers, falling down on a ski run (0 = not falling, 1 = falling) is tested against the difficulty of the run (on an ordered scale from 1 to 3, treated as if continuous) and the season (a categorical variable where 1 = autumn, 2 = winter, and 3 = spring). To perform logistic regression, from the menu choose: Analysis > Regression > Logit > Estimate Model
Select FALL as the Dependent variable. Select DIFFICULTY and SEASON as the Independent variables.
Let us use Category tab to recode the variable SEASON.
Select SEASON as the Categorical variable. Select Coding type as dummy coding. Click OK.
The output is displayed in the Output Pane:

Categorical values encountered during processing are:
SEASON (3 levels) 1, 2, 3
FALL (2 levels) 0, 1
Categorical variables are dummy coded with the highest value as reference.

Binary LOGIT Analysis.

Dependent variable: FALL
Input records: 15
Records for analysis: 15
Sample split

L-L at iteration 1 is -10.3972
L-L at iteration 2 is -8.8005
L-L at iteration 3 is -8.7411
L-L at iteration 4 is -8.7404
L-L at iteration 5 is -8.7404
Log Likelihood: -8.7404

    Parameter      Estimate      S.E.     t-ratio    p-value
1   CONSTANT        -1.7768    1.8898     -0.9402     0.3471
2   DIFFICULTY       1.0108    0.8960      1.1281     0.2593
3   SEASON_1         0.9275    1.5894      0.5836     0.5595
4   SEASON_2        -0.4185    1.3866     -0.3018     0.7628

                                  95.0 % bounds
    Parameter    Odds Ratio      Upper      Lower
2   DIFFICULTY       2.7478    15.9106     0.4745
3   SEASON_1         2.5282    56.9781     0.1122
4   SEASON_2         0.6581     9.9666     0.0434

Log Likelihood of constants only model = LL(0) = -10.0952
2*[LL(N)-LL(0)] = 2.7096 with 3 df Chi-sq p-value = 0.4386
McFadden's Rho-Squared = 0.1342
Conclusion. We see that none of the coefficients is significant. The likelihood-ratio statistic of 2.7096 is chi-squared with three degrees of freedom and a p-value of 0.4386.
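Several quantities in the logistic regression output are simple functions of the reported log likelihoods and estimates. The Python sketch below recomputes the likelihood-ratio statistic, McFadden's Rho-Squared, and the odds ratios with their 95% bounds. Using 1.96 for normal-approximation (Wald) bounds is an assumption of this sketch, although it closely reproduces the printed values.

```python
import math

log_lik_model = -8.7404       # LL(N), final log likelihood
log_lik_constants = -10.0952  # LL(0), constants-only model

lr_statistic = 2 * (log_lik_model - log_lik_constants)
mcfadden_rho2 = 1 - log_lik_model / log_lik_constants

# (estimate, standard error) for each non-constant parameter, from the output.
params = {
    "DIFFICULTY": (1.0108, 0.8960),
    "SEASON_1": (0.9275, 1.5894),
    "SEASON_2": (-0.4185, 1.3866),
}
Z_95 = 1.96  # assumption: normal-approximation (Wald) bounds

odds = {}
for name, (estimate, se) in params.items():
    odds[name] = (
        math.exp(estimate),              # odds ratio
        math.exp(estimate + Z_95 * se),  # upper bound
        math.exp(estimate - Z_95 * se),  # lower bound
    )

print(f"LR statistic = {lr_statistic:.4f}, McFadden's Rho-Squared = {mcfadden_rho2:.4f}")
for name, (ratio, upper, lower) in odds.items():
    print(f"{name:<12}{ratio:>10.4f}{upper:>10.4f}{lower:>10.4f}")
```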
13. Graphs
SYSTAT offers a wide variety of graphical analysis tools that enable better visualization of the data. The editing options in SYSTAT allow you to fine-tune and change the display of the graph. To create Summary charts, Density displays, or Plots, click the Graph toolbar menu or select the icon from the Graph toolbox.
Note. Graph menus are available when a data file is in use.
Example 13.1: Simple Scatter Plot Let us create a simple scatter plot. Consider the following data file. In various international cities, how long must people work to earn enough to buy a Big Mac? How does this time relate to the length of a typical work week? We plot BIG_MAC, the working time (in minutes) to buy a Big Mac against WORKWEEK, the length of the work week (in hours). The data are in the RCITY file that has 46 cases, one for each city.
Open the RCITY.SYD data file from the DATA folder of the main SYSTAT directory. Note. By default, the file location is C:\Program Files\SYSTAT 11\Data. You can also change the default path. To do so, from the menu choose: Edit > Options. Select the File Locations tab. Select the radio button Set custom directories. Change the path for Open data.
To plot BIG_MAC against WORKWEEK, from the menu choose: Graph > Plots > Scatterplot
Select WORKWEEK as the X-variable(s). Select BIG_MAC as the Y-variable. Click OK. The Output Pane displays the following graph:
[Graph: scatterplot of BIG_MAC (y-axis, 0 to 300) against WORKWEEK (x-axis, 30 to 50)]
Customization of an existing graph
Once you have created a graph, you can use the Graph Editor tab to change many of its features without recreating the graph. Using the Graph menu, you can change properties such as color, axes, labels, symbols, titles and graph size.
Note: To view the graph in the Graph Editor, either double click on it or click the Graph Editor tab or double click the corresponding node in the tree formed in the Output Organizer tab.
To Edit Graph Axes
From the menu choose: Graph > Options > Axes
The Axes dialog enables you to alter the axes of your graphs. It has three tabs: Labels, Scale, and Tick Marks.
Labels tab
To enter new labels for the axes of your graph, select the Labels tab. Change WORKWEEK in the X-axis label to Average working hours per week. Click OK.
Alternatively, you can edit the labels of your graph by right-clicking on the graph.
Please note that the above menus are also available in the main Scatterplot dialog box.
Scale tab
You can define a range for the scale of each axis on the graph.
Note: Any data points that fall outside the range do not appear on the graph. To flip the axes, check the Transpose X-Y check box.
Tick Marks
The Tick Marks tab allows changing the X and Y-axis tick intervals along with the tick mark style.
To Edit the Graph Layout
From the menu choose: Graph > Options > Layout
The Layout dialog box enables you to alter the graph title, legend, and layout of frames. It has three tabs: Graph Title, Frame Layout, and Legend.
Graph Title
Enter a new title for your graph, say, WORKWEEK Vs BIG_MAC.
Frame Layout
Frame Layout allows you to enter a title for individual frames, and change the position and size of the graph.
In the Frame size, enter Height and Width equal to 3. Note: For graphs consisting of one frame, no frame title can be specified.
Legend
The Legend tab allows you to alter the position of the graph legend, its title, and its item labels.
Note. Usually the Legend tab is active when a grouping variable is selected while creating a graph. Since no grouping variable has been selected here, all fields in the Legend tab are inactive.
To Edit Appearance of the Graph
From the menu choose: Graph > Options > Appearance
The Appearance dialog box enables you to alter the color, fill and symbol of the graph. It has three tabs: Color, Fill, and Symbol and Label.
Color
To change the color for the elements in the graph, select the option Select color. Select a color from the Color drop-down list for each of the y variables.
Fill
To change the fill pattern for the elements in the graph, select the option Select fill. Select a fill pattern from the Fill Pattern drop-down list for each of the y variables.
Symbol and Label
You can change the symbol type by using any of SYSTAT's 23 built-in symbols.
After performing the above steps, the edited graph looks like this:
[Graph: scatterplot titled WORKWEEK Vs BIG_Mac (x-axis: WORKWEEK, 30 to 50; y-axis: BIG_MAC, 0 to 300)]
14. Getting Help
SYSTAT uses the standard HTML Help system to provide the information you need to use SYSTAT and to understand the results. This section contains a brief description of the Help system and the kinds of help provided with SYSTAT. The best way to find out more about the Help system is to use it. You can ask for help in any of these ways:
Click the button in a SYSTAT dialog box. This takes you directly to a topic describing the use of the dialog box. This is the fastest way to learn how to use a dialog box.
Right-click on any dialog box item, and select 'What's this?' to get help on that particular item.
Select Contents or Search from the Help menu.
For help on commands, from the command prompt (on the Interactive tab of the Commandspace) type: