Download the statistics program with the key. Free programs for statistical data analysis

A brief guide to basic operations in Statistica 6.0

Preparing data for processing

All data must be presented in table form.

Each row of the table represents one study participant. That is, if a total of, for example, 42 people were examined (both experimental and control groups together), then the table contains 42 rows plus headings.

Each table column is a variable.

When preparing data, we will treat any piece of information about a study participant as a variable. For example, the first variable (the first column of the table) can be a sequence number or a unique name for the test subject. The name itself is NOT required in the study. It is only useful for entering all the information about this particular person accurately.

The next variable could be the group type: experimental or control. You can call the variable “group”. This variable must be filled in for all study participants. Please note: the SAME designation must be used for all participants in the SAME group. For example, exp.g. for all participants in the experimental group and contr.g. for all participants in the control group. Next, you can specify the gender of the study participants.

In the example data file, the first variable is Pol (gender). The next variable is age, given simply in years. Next comes the variable Edu, the level of education. This variable can take only 3 values: “secondary-specialized,” “higher,” and “incomplete higher.” It is followed by the length of service in years. The next variable, marital status, can also take several values. In this example, the first six variables contain general sociodemographic information; these are not yet results of the research methods.

The next three variables (No. 9, 10, 11) correspond to three scales of the Maslach methodology (the names of the scales are not important to us now). Each of them can take values from 0 up to a certain maximum, which is not important right now.

Variables 12, 13 and 14 are assessments of the components of the socio-psychological climate: emotional, cognitive and behavioral. They are calculated according to the method and can take only three values: -1, 0, 1.

In total, in our example we get 14 variables.

Note that the variables are of different kinds. We are interested primarily in the division of variables into metric and nominative. Metric variables (for example, age, scores on an intelligence scale, etc.) can take different values within a certain range, with a higher or lower value corresponding to a higher or lower level of the trait being measured.

Nominative variables can take a fixed number of values. For example, the variable “gender” can take two values: M or F. The variable “level of education” can take three values: secondary-specialized, higher, incomplete higher. The “group type” variable is also nominative; it specifies whether the participant belongs to the experimental or control group.

Question: determine which variables from your research are metric and which are nominative. This is extremely important for the choice of research methods.

The result of this stage of work is a table with data (compiled on paper or - better - in Excel), plus an understanding of which variables are metric and which are nominative.
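
As an illustration only, the table layout described above can be sketched in Python. The rows, the values, and the names `rows`, `VARIABLE_TYPES`, `check_table` and `levels` are hypothetical and have nothing to do with Statistica itself; the sketch merely shows one row per participant, one column per variable, and an explicit note of which variables are metric and which are nominative.

```python
# Hypothetical example rows; the values are invented for illustration only.
rows = [
    {"id": 1, "group": "experimental", "gender": "F", "age": 34},
    {"id": 2, "group": "control", "gender": "M", "age": 41},
    {"id": 3, "group": "experimental", "gender": "M", "age": 29},
]

# The researcher declares the type of every variable explicitly.
VARIABLE_TYPES = {
    "group": "nominative",
    "gender": "nominative",
    "age": "metric",
}


def check_table(rows, variable_types):
    """Return a list of problems: missing cells or non-numeric metric values."""
    problems = []
    for i, row in enumerate(rows, start=1):
        for name, kind in variable_types.items():
            if name not in row or row[name] in ("", None):
                problems.append(f"row {i}: '{name}' is missing")
            elif kind == "metric" and not isinstance(row[name], (int, float)):
                problems.append(f"row {i}: '{name}' should be a number")
    return problems


def levels(rows, name):
    """Distinct values of a nominative variable, to spot inconsistent labels."""
    return sorted({row[name] for row in rows})


print(check_table(rows, VARIABLE_TYPES))
print(levels(rows, "group"))
```

Listing the distinct labels of a nominative variable is a quick way to catch the mistake warned about earlier, where the same group is written in two different ways.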

Creating a new file in Statistica 6.0

Open the program and select File–New from the top menu. (I recommend using the English version of the program)

A window will appear in which you can set the required number of variables (Number of Variables) and the number of observations (Number of Cases). In our example there will be 14 variables and 78 observations. Click OK.


We get a blank file into which you can enter the research results. The sheet may not be completely visible, so there are scroll bars at the bottom and on the right.

The result of this stage is a blank sheet on which the results of the study can be entered.

An example of such a sheet is below.

Data input

If you created a data table in Excel, you can copy the data from there into Statistica.

(Generally speaking, Statistica supports importing data from Excel, but for this the data must be organized very carefully and the import itself performed very carefully; it is easy to make mistakes. Therefore, I suggest transferring the data “manually.”)

How to create variable names

When a new file is created, all the variables in it already have names: Var1, Var2, Var3, etc. To make work more convenient, rename them. To do this, double-click a variable header with the left mouse button. A window will open. In it, click the “All Specs...” button, as shown in the figure.

A window will open in which you can label all variables.

After that, click OK. The names of the variables that you write will appear instead of Var1, etc. The numbering of variables will remain, and this is normal.

Next, you need to fill out the entire table with data. If you have already entered data into the Excel program, then you can select a range with data there (without any numbering and without variable names), copy it, and paste it into the Statistica program.

After this, it is advisable to save the data file: choose File–Save As... from the menu, then indicate where the file should be placed and what to name it. The program sets the file type automatically. To save, click the “Save” button. After the file is saved, its name appears on the screen, on a blue background in the title bar. It looks something like this:

The result of this stage is a completed and saved file with the research results.

Calculations in the program

From now on, the most useful item in the top menu is Statistics.

Comparison of means in two groups - Student's T-test

This test can be used to compare the average values of ONLY metric variables and ONLY in TWO groups (not three, four, ...).

In our example, the variables are metric:

    No. 3 – Age – age

    No. 5 – Stajj – work experience

    No. 7 – ProfStress – an indicator of professional stress

    No. 9 – Maslach_1 – the first indicator of the Maslach method

    No. 10 – Maslach_2 – the second indicator of the Maslach method

    No. 11 – Maslach_3 – the third indicator of the Maslach method

The “Gender” variable divides all participants into two groups – men and women.

The “Group” variable divides all participants into two groups – the experimental group and the control group.

Accordingly, in our example, using Student's t-test, we can check 1) whether the average values of the variables listed above differ between men and women; 2) whether the average values of those variables differ between participants in the experimental and control groups.

In the top menu, select Statistics – in it Basic Statistics/Tables.

In the window that opens, select the t-test for independent groups and click OK.

A window with settings appears. First of all, we need to select the variables for which we want to carry out the calculation. To do this, click the Variables button as shown in the figure:

The variable selection window appears.

Here, on the left side (Dependent variables), indicate the metric variables whose average values we want to compare. For example, these are variables 3, 5, 7, 9-11 (age, experience, stress, etc.). You can select the variables from the list or type their numbers in the empty field.

On the right side (Grouping variable), indicate ONE variable that divides the sample into two groups. For example, you can select the 1-Pol variable, and then we will compare the indicators of men and women. Or you can select the 2-Group variable, and then we will compare the experimental and control groups. If we are interested in both options, we will have to apply the t-test twice, because only one variable can be selected at a time on the right side of the window.
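
To make the roles of the two lists concrete, here is a minimal Python sketch of what the comparison starts from: the sample is split by the single grouping variable, and the mean of each dependent variable is computed within each group. The data values and the function name `group_means` are hypothetical and are not taken from the example file or from Statistica.

```python
from statistics import mean

# Hypothetical data: one dict per participant (values invented for illustration).
rows = [
    {"Pol": "F", "Age": 40, "Stajj": 18},
    {"Pol": "F", "Age": 42, "Stajj": 16},
    {"Pol": "M", "Age": 38, "Stajj": 15},
    {"Pol": "M", "Age": 40, "Stajj": 19},
]


def group_means(rows, dependent, grouping):
    """Mean of each dependent variable within each level of the grouping variable."""
    groups = {}
    for row in rows:
        groups.setdefault(row[grouping], []).append(row)
    return {
        level: {var: mean(r[var] for r in members) for var in dependent}
        for level, members in groups.items()
    }


result = group_means(rows, dependent=["Age", "Stajj"], grouping="Pol")
print(result)
```

Changing `grouping` to another two-valued variable corresponds to running the comparison a second time with a different grouping variable.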

Now let's look at an example with the 1-Pol variable. It will look like this:

Now OK.

The program returns us to the previous window. To perform the calculations, click one of the two Summary buttons shown in the picture.

Another window will appear on the screen – Workbook1. The program will write all calculation results to this file.

Let us consider the results obtained in detail.

The table lists, on a gray background on the left, the variables whose average values we compared. The columns “Meanж” and “Meanм” contain the average values of the variables for women (ж) and men (м), respectively. That is, the average age of women is 40.68 years and the average age of men is 39.15 years. The average length of service is 17.44 years for women and 16.87 years for men. Next, the t-value column contains the value of the t-statistic; we don’t need it. The df column gives the number of degrees of freedom; we don’t need that either. (That is, when presenting the results of statistical processing in your work, it is good to report these numbers, but there is no need to interpret them.) The next column, p, is essential. It is the significance level of the differences in average values, and it is probably the most important column in this table.

Theoretical digression. To test whether the means of the two groups differ, we first calculate these means. Almost always, the average values in the two groups will differ at least somewhat. That is, we almost always get DIFFERENT average values. Our example is no exception: the average values for women and men differ for all variables. But in some cases they differ more, in others less. And we cannot determine “by eye” whether the average values differ “a little” or “a lot.” This can only be determined using statistical tests, for example, Student's t-test.

Without going into details of the calculations, I suggest you remember:

The average values of a variable in two groups are significantly different if p < 0.05 (in the program, these variables are highlighted in red).

In this case, they also say that the differences in mean values ​​are reliable (or statistically significant) at the 5% level.

Sometimes, if p is greater than 0.05 but less than 0.1, then the differences are said to be at the level of a statistical trend. That is, these are less pronounced differences.

But usually if p>0.05, then they say that no significant differences have been identified/not established/not found. But EVEN IF p>0.1, YOU CANNOT SAY THAT THE AVERAGE VALUES ARE THE SAME.

Thus, in this case, only indicators of professional stress differ significantly for men and women (p value = 0.029, which is less than 0.05). At the trend level, there are differences in the Maslach_2 indicator (here p = 0.051, this is more than 0.05, but less than 0.1). No significant differences were found for other variables.
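
The rule of thumb above can be written down as a small Python function. This is only a sketch of the wording used in this guide; the function name `interpret_p` is hypothetical, and treating a p-value of exactly 0.05 as a trend is an assumption, since the text does not cover that boundary.

```python
def interpret_p(p):
    """Translate a p-value into the wording used in this guide."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p < 0.05:
        return "significant difference"
    if p < 0.1:
        return "difference at the level of a statistical trend"
    # Note: this does NOT mean that the average values are the same.
    return "no significant difference found"


print(interpret_p(0.029))  # professional stress in the example
print(interpret_p(0.051))  # Maslach_2 in the example
```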

Now let's look at a comparison of the average values in the experimental and control groups.

Again, in the top menu, select Statistics, then Basic Statistics/Tables. Since we have already launched this program module, a window will appear on the screen.

You can select "Continue current" to continue the calculation.

To switch to the comparison between the experimental and control groups, click the Variables button. On the right side of the window (Grouping variable), select variable number 2. Click OK. Click Summary as in the pictures above.

We get the following result.

Please note that for the participants in the experimental and control groups, the average age, the average length of service and the average values of the Maslach_2 indicator are significantly different. No significant differences were found for the other variables.

How to close the program.

First you need to close all calculations. To do this, click the rectangle in the lower left corner; the calculation window will open. Close it with the cross or the Cancel button.

The second step is to close the Workbook1 window – also with a cross. You can save this file, but it is not necessary.

The third step is to close the data file.

Fourth, close the program.

I'll add later:

Comparison of means in two groups - a non-parametric method.

Comparison of means in three or more groups - analysis of variance

Analysis of contingency tables - Chi-square.

Using the Chi-square criterion, we find that the distribution according to the attribute “like/dislike ice cream” among boys and girls is significantly different. That is, they have “different” attitudes towards ice cream.

Here, using Chi-square, we find that no significant differences were found. That is, boys and girls “do not differ” in their love/dislike for computer games.

We check whether the educational level of participants in the experimental and control groups differs.

Correlation coefficients.

Transferring results to Excel

Overview of statistical programs









The productivity of any work is closely related to the tools used. According to legend, Archimedes said that he could move the Earth if he were given a fulcrum and a lever. But the great philosopher did not have the necessary tools, and our planet still travels along its orbit. A similar situation arises in the statistical analysis of research results. It is quite possible to process data statistically with only a pencil and paper, but it is much faster and more efficient to do so with special tools, namely statistical software. Strictly speaking, software packages used for statistical analysis belong to the class of mathematical programs, so in this article the terms “mathematical” and “statistical” are used interchangeably.

As a rule, young scientists take their first steps in statistics using spreadsheet applications, the vast majority of them with MS Excel. The second most popular spreadsheet application today is Calc from the OpenOffice.org office suite. Unfortunately, some researchers regard these programs as the most convenient and suitable tool for analysis. They are mistaken. Such software is appropriate when you need to perform simple operations such as sorting data, calculating descriptive statistics or constructing certain types of graphs, or simply to store the primary data of an experiment and keep a laboratory journal. In other words, full statistical processing of research results in Excel is impossible. It is an office application, not a scientific one.

All scientific mathematical applications can be divided into two large groups: programs with a graphical interface and programs without one. You should not think that a graphical interface says anything about the quality of a software product; the two properties are in no way related. Nevertheless, the division is of great practical importance, because not everyone can work comfortably on the command line. Today, many computer users would not even consider giving up the point-and-click interfaces on which an impressive part of the modern IT industry rests. However, mathematical calculations are still more conveniently performed by typing commands on the keyboard than by clicking numerous buttons on the screen. Therefore, serious applications offer both a command-line mode with a built-in programming language and a graphical interface.

First, let's get acquainted with R, a statistical computing environment and programming language. Its origins lie in the S programming language, with which it has a lot in common. The standard configuration of R does not include the graphical interface familiar to many users. As a result, a number of researchers mistakenly believe that this tool can only perform numerical calculations and has no graphing capabilities. This is wrong. R offers ample facilities for statistical data processing, including graphics, and a windowed interface can be installed as an additional application. But keep in mind that graphical user interfaces for R are noticeably inferior to those of other statistical packages.

You can install the R environment on a computer running Windows, MacOS, or Linux. When starting the R system, an inexperienced user will have the question: “Where should I enter data?” Due to the lack of a built-in table editor, the analyzed information is either entered directly into the command line as an argument to the corresponding functions, or loaded from external files. The first option is convenient when working with single values, and the second - in cases where it is necessary to work with tables. The tables themselves can be created in any spreadsheet processor, and the files can be saved in *.csv format, which is easily loaded into R.
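
The idea of preparing a table in a spreadsheet and loading it from a *.csv file is not specific to R. As a rough sketch, shown in Python rather than R, the following reads a small comma-separated table. The column names, the values and the function name `load_table` are hypothetical, and the text is held in a string only so that the example is self-contained; a real file would be opened with `open(...)` instead.

```python
import csv
import io

# Hypothetical CSV content, as it might be exported from a spreadsheet.
csv_text = """id,group,age
1,exp,34
2,control,41
3,exp,29
"""


def load_table(text):
    """Parse CSV text into a list of rows, converting 'age' to a number."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["age"] = float(row["age"])
        rows.append(row)
    return rows


table = load_table(csv_text)
ages = [row["age"] for row in table]
print(len(table), ages)
```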

Having loaded information into variables, you can begin to process it using a huge number of functions implemented in R. But you should remember that all intermediate data when working with this language is stored not in temporary files, but directly in RAM. This feature must be kept in mind when processing very large amounts of information: R will use a significant part of the computer's RAM.
The syntax of the language is quite simple and easy to learn. To date, more than a hundred books have been written on various uses of the R statistical computing environment, but all of them are in English. Unfortunately, there is still very little Russian-language information, and it exists only as scattered articles on particular aspects of using this programming language. It is this lack of information that holds back the spread of a high-quality software package in our country (even though it is free).

R's reliability comes from its origins. The language was created as a free implementation of the very powerful S programming language, whose history dates back to 1976, when the first working version appeared. Today, the S language is the basis of the S-PLUS application developed by TIBCO Software Inc., which, unlike R, is a commercial product. S-PLUS has a convenient graphical interface in which data can be entered by loading from an external file or database, or by copying a table from a text file or spreadsheet application. S-PLUS, like R, runs on different operating systems and supports both numerical and graphical methods of analysis.

Another popular statistical application is SAS, which originated in the 1960s at North Carolina State University as an application for analyzing agricultural research results. Today, the system continues to be developed by the SAS Institute, which has already released the ninth version of the program. SAS is used in a wide variety of scientific research, business analytics, etc.

The system consists of modules, each of which performs a specific range of tasks. The BASE and STAT modules are used most often in statistical processing. SAS implements its own programming language, whose syntax is closer to BASIC and unlike that of R or S. The system allows you to load data from external files or enter them directly in the terminal window. With SAS you can perform statistical data processing at different levels of complexity, according to the task at hand. Interaction with the program is possible both in console mode and through a graphical interface, which is a graphical shell for simplified entry of SAS programming language commands.

Programs that primarily use a command-line interface also include Stata, developed by the American corporation StataCorp. The application runs on Windows, MacOS and Linux. Data can be entered either by loading from external files or with the built-in table editor, which is quite simple but allows you to perform all the necessary manipulations with tables. The principles of working with Stata are no different from those of the programs described above. Users who find terminal mode inconvenient can use the program menu to generate commands of the built-in programming language automatically.

All the statistical packages described can be used for any type of statistical analysis. For example, the functionality of the R language can be extended by adding libraries of functions aimed at a strictly specific type of task. In addition, anyone with sufficient knowledge and experience of this language can create their own functions and libraries that match the needs of a particular user.

But in addition to “general profile” statistical software, there are programs aimed at scientists working in the field of biomedical research. Thus, the MedCalc program, developed since 1993 by the Belgian company MedCalc Software, is positioned as a full-fledged statistical application created in accordance with the needs of biomedical researchers. The developers focus the attention of researchers on the ease of use of MedCalc for analyzing ROC curves.

The program is convenient in that it does not offer redundant functionality, which often confuses an unprepared person starting to work with universal applications. In addition to this, the ability to work only in a graphical interface without using the command line makes the program less flexible, but more attractive for use in this field of science, since specialists with medical education very rarely can boast of extensive experience working with mathematical programs.

To date, the twelfth version of the program has been released. Unfortunately, it is available only to Windows users, but this disadvantage is offset by relatively modest system requirements and the ability to run the application on anything from Windows 2000 to Windows 7. Those who have never used the program can download a fully functional demo version from the medcalc.org website, which works without restrictions for fifteen days. In addition, the package includes demo files containing data sets and examples of their analysis.

Data are entered into MedCalc in an integrated spreadsheet editor or by importing files in various formats, such as *.csv, Excel, etc. To open the built-in editor, just select the Spreadsheet command in the menu, after which you can start building a table. In statistical programs, the columns of tables are called “variables” and the rows “cases.” When creating a table, it is useful to follow several rules:
  • The first variable should contain the serial numbers of the cases. This makes it possible to restore their original order after re-sorting the values.
  • Numeric values should be entered without rounding to avoid losing information.
  • If some values are missing, you can skip them, leaving empty cells in the table.
  • Each variable must have only one value for each case.

After saving the table or loading a file with data, the information processing stage begins. To perform statistical analysis, select the appropriate item in the Statistics menu. Each type of analysis has its own set of settings, for which help can be obtained by clicking the Help button.

At the stage of planning an experiment, the researcher will find the functions in the Sampling menu useful: they determine the required group sizes for some of the most common research tasks. Among the functions implemented in MedCalc, special mention should be made of the ability to conduct basic types of statistical analysis without having the sample values, i.e. based on mean values, measures of spread, etc. This can be useful when studying published data, since publications do not provide complete information about the primary results of an experiment. For example, to compare sample means using Student's t-test, it is enough to know the arithmetic means themselves, the standard deviations and the sizes of both samples. These data should be entered in the window called Tests > Comparison of > means (t-test), and the comparison result will be displayed in the same window. The other functions in the Tests menu are used in the same way.
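
The calculation behind such a comparison can be sketched in a few lines of Python. This is not MedCalc's code: it assumes the classical Student's t-test with a pooled variance (equal variances in both groups), and the function names and the numbers in the usage example are hypothetical. The p-value is obtained by numerically integrating the t density with Simpson's rule, which keeps the sketch free of third-party libraries.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus twice the area under the density on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def t_test_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t-test from means, standard deviations and sample sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, two_sided_p(t, df)


# Hypothetical summary statistics, as they might be reported in a publication.
t, df, p = t_test_from_summary(10.0, 2.0, 6, 8.0, 2.0, 6)
print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
```

If the variances of the two groups differ noticeably, the pooled form is not appropriate and a different variant of the test would be needed; that case is outside this sketch.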

Thus, the MedCalc program, at a relatively low price, provides the user with a convenient interface without unnecessary “functionality”, equipped with a good spreadsheet editor. All calculations and diagrams are saved in one file and are easily sorted in a special list on the left side of the main program window. Statistical analysis is performed using conveniently organized menus, equipped with concise and understandable reference material. In this regard, the program will be very useful for scientists performing biomedical research and inexperienced in mathematical applications.

MedCalc is a simple and easy-to-use program, but not every user can get everything needed for the job from it. Among those who place very high demands on statistical software and are willing to pay several thousand dollars for it, applications such as Statistica or SPSS Statistics are popular. Both programs are real “monsters” in comparison with MedCalc, both in cost and in computing capabilities. It is impossible to describe them in detail within a single article; that would take a book of several hundred pages, so we will limit ourselves to a brief introduction.

Statistica is developed by StatSoft. As of today, the latest version is Statistica 9. The SPSS program, whose name is an abbreviation of Statistical Package for the Social Sciences, relatively recently became the property of IBM and changed its name to PASW (Predictive Analytics SoftWare) Statistics. Both programs have an excellent graphical interface, a built-in programming language and the ability to integrate with the statistical computing language R.

It should be noted that the almost limitless possibilities in statistical processing provided by these tools require large computer resources. Thus, SPSS requires at least 1 GB of RAM to run. Operating systems that can run SPSS: Windows, MacOS and Linux. Statistica is developed only for Windows, which somewhat reduces the number of its users.

As always, work in programs begins with data entry. The integrated table processor allows you to create tables using the methods familiar to every user of office applications. Saved tables, as well as calculation results, graphs and reports in Statistica can be conveniently arranged in one file called a “Workbook”, while the organization of the workspace in SPSS is less convenient, but still quite acceptable for use after a short period of adaptation.

The programs contain all the most popular statistical methods: frequency analysis, calculation of statistical characteristics, contingency tables, correlations, plotting, t-tests and a large number of nonparametric tests, multivariate linear regression analysis, discriminant analysis, factor analysis, cluster analysis, variance analysis, reliability analysis, multidimensional scaling and a number of others. Calling these statistical procedures is done by selecting the appropriate windows from the menu and making the necessary settings in them. All types of analysis are divided into groups, which helps you quickly navigate the application interface.

The STATISTICA and SPSS systems have extensive graphics capabilities. They include a wide variety of categories and types of charts, including scientific, business, 3D and 2D charts in various coordinate systems, and specialized statistical graphs: histograms, matrix graphs, categorized graphs, etc.

The statistical functions available in both applications are striking in their variety. It seems that these statistical analysis tools allow you to do anything, provided that the user has thoroughly studied how they work. The main obstacle to mastering these programs is the time that needs to be spent on training. It is precisely because of the user’s lack of knowledge that, in most cases, the power of statistical packages of this level is not even half used.

As you can see, there are many applications for statistical analysis in the world. Only a small part of them has been briefly described in this article. Left out were such programs as Minitab, MATLAB, Octave, GenStat, JMP, Analyse-it, the domestically developed STADIA and many other programs, large and small, expensive and free. However, such an abundance of software should not frighten the researcher: it is enough to make a thoughtful choice once in favor of one or two programs and study the intricacies of their use carefully, and they will serve as faithful assistants in the statistical analysis of experimental results for many years.


Introduction

STATISTICA is an integrated data analysis and management system. STATISTICA is a tool for developing custom applications in business, economics, finance, industry, medicine, insurance and other fields. STATISTICA is easy to learn and use.

All analytical tools in the system are available to the user and can be selected through an alternative user interface. The user can fully automate the work, from simple macros that automate routine actions to in-depth projects, including integrating the system with other applications or the Internet. Automation technology allows even an inexperienced user to customize the system for a project.

STATISTICA system procedures have high speed and calculation accuracy.

Flexible and powerful data access technology allows you to work effectively both with data tables on the local disk and with remote data stores.

The system has the following generally recognized advantages:

  • contains a full set of classical methods of data analysis, from basic statistical methods to advanced ones, which allows flexible organization of analysis;
  • is a means of building applications in specific areas;
  • The delivery set includes specially selected examples that allow you to systematically master analysis methods;
  • meets all Windows standards, which allows you to make the analysis highly interactive;
  • the system can be integrated into the Internet;
  • supports web formats: HTML, JPEG, PNG;
  • is easy to learn, and experience shows that users from all areas of application quickly master the system;
  • STATISTICA system data can be easily converted into various databases and spreadsheets;
  • supports high-quality graphics that allow you to effectively visualize data and conduct graphical analysis;
  • is an open system: it contains programming languages that allow you to extend the system and run it from other Windows applications, for example, from Excel.

STATISTICA consists of a set of modules, each of which contains thematically coherent groups of procedures. When switching modules, you can either leave only one window of the STATISTICA application open, or all previously called modules, since each of them can be executed in a separate window (as a separate Windows application).

When executing STATISTICA modules as stand-alone applications at any time in any module there is direct access to “common” resources (data tables, BASIC and SCL languages, graphical procedures).

When installing the system, the installation program ( Setup) creates a group of applications on the desktop called STATISTICA and places there icons of the Module Switcher window (the STATISTICA icon is the first in the group, see figure), the Basic Statistics and Tables module and some other programs ( Help, Setup). The user may find it more convenient to launch modules by clicking their icons on the desktop (rather than using the Module Selector window); so he will probably want to create additional icons for the modules beyond those automatically created by the installer ( Setup). To create another icon in this group, follow the standard procedure Windows(select New from the menu File in the window Dispatcher programs ( Program Management r) and create a new program element).

Setting up the STATISTICA system. Many characteristics of the system and its interface can be customized to suit user preferences. For example, you can change the startup behavior (turn off the default full-screen mode) and the appearance of the launch pad, toolbar, data tables, and other elements.

Configuring general system settings. You can change the general system settings at any time while working with the program. These settings define:

  • general aspects of program behavior (maximizing the STATISTICA window at startup, Workbooks, Drag-and-Drop, automatic links between graphs and data, multitasking mode, etc.),
  • output mode (e.g. automatic printing of tables or graphs, report formats, buffering, etc.),
  • general view of the application window (icons, toolbars, etc.),
  • appearance of document windows (colors, fonts).

Each of these settings can be configured in the corresponding window, which is opened from the Tools menu. The following figures show two examples of such windows.


All general settings can be configured regardless of the type of document window (for example, table or graph) that is active at the moment.

Customizing the user interface. When working with STATISTICA, you can customize the user interface so that it better fits the needs of a particular user.


Depending on the requirements of the task and personal preferences (as well as aesthetic considerations), you can use a variety of “modes” and operating conditions for the program.


Support for multiple STATISTICA configurations. STATISTICA keeps all current settings and defaults until you explicitly change them.

Because system configuration information is stored in the folder from which STATISTICA is started, you can keep different program configurations for different projects or types of work. For example, you can start the program from different folders on disk, each containing a coherent set of documents, and configure the system separately for each folder with its own output settings, default graph options, etc. You can create several STATISTICA icons in different application groups on the Windows desktop (each corresponding to a specific project or type of work) and give each a different value in the Working Directory field (using the Windows Program Item Properties dialog box).

Multitasking. STATISTICA supports multitasking mode (between its modules or other applications).


When processing very large amounts of data or performing complex analysis procedures, you can switch to another STATISTICA module (or another Windows application) while the data processing continues in the background.

Working in one window of the STATISTICA application (instead of multi-window mode). One of the global system settings lets the user choose the mode in which the program runs by default: in a single application window or as a set of applications (each in its own window). One immediate consequence of this choice is how the Module Switcher window behaves: when you double-click a module name in this window, the selected module either opens in place of the one already open, or a new application window opens for it while the previous window remains open.

The operating mode is chosen with the Switching modules: single application mode check box in the Default settings: general settings dialog box (opened from the Tools menu). If this box is checked, STATISTICA runs in single application mode.


Single Application Mode. When single application mode is selected, switching from one module to another does not open new windows. Each new module opens in the same window, replacing the previous one. Some users will prefer this “simple” mode of operation, since all analysis takes place in one application window and the number of active programs on the desktop is kept to a minimum.

Approximately the same effect can be achieved by clicking the Finish and switch button in the Module Switcher dialog box; in this case the application window of the current module closes, but it is not replaced in place; instead, the system opens the “next” application window.

Multi-application mode. The main advantage of multi-application mode is the ability to run different analysis procedures (modules) in parallel in separate, simultaneously open application windows. You can switch between modules without closing the previous ones and work with independent queues of result tables and graphs in the application windows of different modules. This mode has clear advantages for most data analysis tasks and makes it possible to apply different analysis methods and compare their results.

Interactive data analysis in STATISTICA. The system does not require you to specify, before the analysis is run, everything that should be displayed. Even a simple design can generate a large number of result tables and a practically unlimited number of graphs, so before the main results have been examined it is hard to know which graphs or tables should be looked at first. For this reason STATISTICA lets you select specific types of output and interactively carry out follow-up comparisons and modeling after the data has been processed and the main results obtained.

The number of windows displayed can also be adjusted so as not to overload the computer screen.


STATISTICA's flexible computational procedures and its wide range of graphical methods for data of any type give the user extensive possibilities for exploratory analysis and for testing statistical hypotheses.

What features do workbooks provide? Workbooks help organize sets of files (e.g., result tables, graphs, text/graphical reports, user programs, etc.) that were created or used (e.g., viewed) during the analysis of a data set. Workbooks store a list of all files used with the current data set.


An updated list of these files is saved automatically with the data file. If you check the Auto box next to a file name, that file opens automatically with the current data set.

Help system and online (electronic) manual. For more information about a particular feature, press the Help key (F1) while the corresponding command or menu item is highlighted. STATISTICA contains an Electronic Manual: reference information for all procedures and functions of the program, available in context-sensitive mode by pressing F1 or by clicking the help button in the title bar of any dialog box (the reference contains over 10 megabytes of compressed documentation). Because the Electronic Manual is organized dynamically through hyperlinks (and offers various customization options), it is generally faster to use the help system than to look up the same information in printed form. Help can also be opened by double-clicking the message field of the status bar at the bottom of the STATISTICA application window (the message field also displays brief comments on drop-down menu items and toolbar buttons when a menu item is selected or a button is clicked).

Statistical Advisor. The Statistical Advisor is an interactive help system. After you select the Advisor item from the Help drop-down menu, the program asks simple questions about the nature of the problem and the type of source data, and then offers a list of the most appropriate procedures (and explains where to find them in STATISTICA).


Hyperlinks can take you directly from the Statistical Advisor section to the detailed description of relevant statistical methods and procedures in the Introductory Overview section.

Applications. All of the facilities considered here (available at any time while working with the system) can serve as a substantial alternative or complement to the usual interactive user interface, since they let you automate the routine repetition of the same tasks, including very complex ones. For example, a macro (run by clicking a button on the AutoTask Buttons toolbar or by pressing a single key) can contain a long list of variables, a frequently used graph, an embedding operation, etc.

Automatic reports and automatic printing of result tables. Whether processing runs in batch mode or is requested interactively by the user, you can select the Auto report output mode. This mode automatically prints (or sends to a report window or file) the contents of all output windows produced during the analysis, without any user action.

Automatically outputting every result table and/or graph shown on the screen is useful not only for producing a complete report of the analysis but also during exploratory data analysis, when you need to return to a previous step and review results from earlier stages of processing. To do this, all output (result tables and graphs) can be sent to a temporary scrolling Text/Output Window and then, if necessary, saved, printed, or copied to a text editor file.

Automatic printing of graphs. The mode for automatically printing all graphics that appear on the screen is especially useful as a means of batch graphic printing.


Printing graphs typically takes quite a long time. It therefore makes sense to use this mode to print a sequence (“cascade”) of graphs produced by certain analysis methods (for example, visualizing patterns of means when studying higher-order interactions in analysis of variance requires a long sequence of graphs, and multiway tables require a cascade of three-dimensional histograms for pairs of variables).

However, it is much more efficient to direct the generated sequence of graphs to the Text/Output Window. STATISTICA can batch print all previously saved graphs and result tables; to do this, select Print files from the File drop-down menu.

Clipboard. The fastest and in many cases easiest way to bring in data from other Windows applications (for example, spreadsheets) is the clipboard; STATISTICA supports the special clipboard data formats created by applications such as MS Excel or Lotus for Windows. For example, STATISTICA correctly interprets formatted values (for example, 1,000,000 or $10) and text values. The clipboard and data file conversion can also be used to export data from STATISTICA to other formats. STATISTICA uses the same set of formats and data types when importing and exporting data.
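To make the idea of interpreting formatted values concrete, here is a minimal Python sketch. It is not STATISTICA's actual clipboard logic; the function names and the rule set (strip thousands separators and a leading dollar sign, otherwise keep the cell as text) are assumptions made for illustration.

```python
import re

def parse_formatted_number(text):
    """Return a float for strings such as '1,000,000' or '$10', else None.

    Hypothetical rule set: remove thousands separators and one leading '$'.
    """
    cleaned = text.strip().replace(",", "").lstrip("$")
    if re.fullmatch(r"[+-]?\d+(\.\d+)?", cleaned):
        return float(cleaned)
    return None  # not a number under these rules

def interpret_cell(text):
    """Keep a cell as a number when it parses, otherwise as a text value."""
    number = parse_formatted_number(text)
    return number if number is not None else text

# A pasted row mixing formatted numbers and a text value
row = [interpret_cell(cell) for cell in ["1,000,000", "$10", "higher"]]
print(row)
```

A real import would also need to handle locale-specific separators and other currency symbols, which this sketch ignores.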

File import functions. Data files from Windows applications and from other operating systems can also be converted to the STATISTICA format using the file import functions, which include access to databases (via ODBC support) as well as the ability to import formatted text files and free-format text files (ASCII). Importing files without the clipboard has its advantages:

  • it lets the user specify exactly how the import should be carried out (for example, select ranges of values from files, choose whether to import variable names, text values, and case names, and specify how they should be interpreted);
  • it gives the user access to data types that are inaccessible (or hard to access) through clipboard operations (such as long value labels or special missing-data codes).

DDE connections. STATISTICA supports dynamic data exchange (DDE) conventions, which let you dynamically link a range of data in a source data table to a data set from other Windows applications. The procedure is much simpler than it may seem and can be learned without technical knowledge of the DDE mechanism, especially when the Establish Link command is used (instead of typing in a description of the link). DDE links can be established between a source file (the server), such as an MS Excel spreadsheet, and a STATISTICA data file (the client), so that when the source file changes, the data in the corresponding part of the STATISTICA source data table (the client file) is updated automatically.


A typical use of dynamic linking is in industrial settings, where a measurement device is connected to the serial port of the computer that holds the STATISTICA data file (for example, to update certain measurements automatically every hour).

DDE links can be established using the Set Link command from the Edit drop-down menu of the source data table, or by entering a link definition in the Long name (label, formula, link) field of the variable specification dialog box.


Once a link is established, you can manage it in the Link Manager dialog box (opened with the Connections... command from the Edit drop-down menu).


Date and Time Formats. In STATISTICA data files (which are organized as databases), the display format for values applies to the entire variable rather than to individual cells (as in Excel). Therefore, values that were formatted as dates in Excel appear as Julian (integer) values in the STATISTICA file (for example, 34092 instead of May 3, 1993) unless the Date or Time format is set for the corresponding variables.

Does STATISTICA support the ODBC interface? Yes. This is done through the Data Import commands, available from the File drop-down menu of any module. The STATISTICA ODBC interface can combine fields from multiple tables and provides access to a variety of database formats (e.g., dBASE for Windows, Paradox, Sybase, Oracle, SAS, etc.).
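The relationship between the integer value and the calendar date in the example above can be checked with a short Python sketch. It assumes the Excel-style day count, with day 0 falling on December 30, 1899; that epoch is an assumption, chosen because it reproduces the pair given in the text (34092 and May 3, 1993). The function names are illustrative.

```python
from datetime import date, timedelta

# Assumed epoch: the Excel-style serial date system, where day 0 is 1899-12-30.
EPOCH = date(1899, 12, 30)

def serial_to_date(serial):
    """Convert an integer day count into a calendar date."""
    return EPOCH + timedelta(days=serial)

def date_to_serial(d):
    """Convert a calendar date back into its integer day count."""
    return (d - EPOCH).days

print(serial_to_date(34092))              # 1993-05-03
print(date_to_serial(date(1993, 5, 3)))   # 34092
```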


Import via ODBC can be automated using the ODBC/Templates function or SCL programs.
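Combining fields from several tables during a database import amounts to a join query. The Python sketch below illustrates the idea only: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for an ODBC data source, and the table and column names (subjects, measurements, subject_id, age, scale_score) are hypothetical.

```python
import sqlite3

def import_joined(conn):
    """Fetch one row per subject, combining fields from two tables."""
    query = """
        SELECT s.subject_id, s.age, m.scale_score
        FROM subjects AS s
        JOIN measurements AS m ON m.subject_id = s.subject_id
        ORDER BY s.subject_id
    """
    return conn.execute(query).fetchall()

# Stand-in data source (a real import would connect through an ODBC driver)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subjects (subject_id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE measurements (subject_id INTEGER, scale_score REAL);
    INSERT INTO subjects VALUES (1, 34), (2, 41);
    INSERT INTO measurements VALUES (1, 12.0), (2, 17.5);
""")
rows = import_joined(conn)
print(rows)
```

Each returned tuple corresponds to one row of the resulting data table, with one column per selected field.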

Types of objects. If the New object mode is set, the type of object to create is selected from the list of Windows applications that support OLE. After you select the type and click OK, the corresponding application window opens so that you can create the new object. If the Object from file mode is set, the type of object to insert is likewise selected from the list of Windows applications that support OLE; after you select the type, all previously saved files of that application are shown. In the Picture from file mode, you can insert an object that is not OLE-compatible but is stored in one of the Windows graphic formats: metafile (a file with the *.wmf extension) or bitmap (a file with the *.bmp extension).


Linking and embedding. STATISTICA supports OLE (object linking and embedding) in both client and server mode. Thus you can not only dynamically link STATISTICA graphs into other applications (server mode) but also embed, and subsequently modify, OLE-compatible objects from other applications (for example, graphs or tables), or your own objects, in STATISTICA graphs. In other words, in addition to attaching external elements to STATISTICA graphs, you can use paste to access objects stored in a file on disk directly (for example, drag them straight from the File Manager or Windows Explorer window and drop them onto a STATISTICA graph).


STATISTICA supports both linked (that is, dynamically attached) and embedded (that is, statically inserted) objects. These objects can reside in any file created by a Windows application, including files in STATISTICA's own graphic format (with the *.stg extension). Moreover, STATISTICA can act as an OLE client and server at the same time, which makes it possible to create nested compound documents (up to and including four levels deep): a STATISTICA document with an embedded document can in turn be embedded in another STATISTICA document.

Note that each of these two attachment methods (linking and embedding) has its own advantages and disadvantages.

Linked objects. Graphs with linked objects are slower to redraw because they may involve links to external files. On the other hand, these graphs are updated automatically (the status of the links can be set in the Data and Graph Links dialog box, opened from the Edit menu of the graph window), which makes it easy to create compound documents that always include the “current” contents of other files.


Embedded objects. Graphs with embedded objects are redrawn faster than with linked objects, since there are no links to updated external files. If you double-click on an embedded object, the server application (that is, the source) will be called, in which you can change this object. There are two ways to update an embedded object: edit it or replace it manually.

From the Edit menu you can configure all parameters of external objects (linked or embedded), as well as their connections with other components of the graph. In addition, right-clicking an object lets you select the relevant configuration commands from the context menu. The only exception is the method of attaching the object (linking or embedding), which is fixed at the time the file is attached. After that, a linked object can be converted to an embedded one, but not vice versa (see the Convert to Embedded command in the Edit drop-down menu).

Configure linked or embedded OLE objects. STATISTICA OLE graph objects can be edited by double-clicking on the object; in this case, the source application will be opened in OLE server mode with the object ready for editing. If this object is a STATISTICA graph, then a new graphic window will open in the current module, which will allow the system to simultaneously act as both a client and a server.


When editing is complete, you can use any of the standard OLE conventions to exit server mode and update the graph in STATISTICA (using the Refresh, Refresh and Return to..., and similar commands in the application's File drop-down menu; these commands are available only when the application is running in server mode).

Graphic formats Metafile and Raster image. To insert a graph into applications that do not support OLE, use the Save Metafile or Save Bitmap commands (from the File drop-down menu of the graph window). A graph in Windows metafile format is written to a file with the *.wmf extension, and a graph in raster image format to a file with the *.bmp extension. These formats, described in the following paragraphs, do not give access to all the customization capabilities of STATISTICA graphics, but they are compatible with all applications that support Windows graphics formats.

What is a Windows Metafile? Metafile graphics format is one of the standards for recording graphic files (with the *.wmf extension) and presenting them on the Windows clipboard. It contains a picture in the form of descriptions and definitions of all components of the graph and its attributes (for example, line elements, their colors and patterns, filling patterns, descriptions of text and its parameters).

Compared to the bitmap standard (see below), the metafile format allows for more flexible configuration of OLE-incompatible objects in Windows applications.

For example, opening a metafile in Microsoft Draw lets you “expand” the graph image, select and change individual lines, fill patterns, or colors, and edit the text and change its attributes.


However, not all Windows applications fully support every metafile capability available in STATISTICA. Some attributes of graphs written by STATISTICA in this format may change when they are displayed in other applications. For example, the rotation of some fonts may be lost. Therefore, whenever possible, use the STATISTICA graphics format and OLE methods for working with graphs in other applications, so that all the customization capabilities of STATISTICA itself remain available.

Limitations of the standard Windows Metafile format. Complex graphs produced by STATISTICA may contain too many data points to be written in the metafile format that Windows uses by default for most graphic linking and embedding operations. In such cases, you need to use a raster image. For more information, refer to the Electronic Manual from the Extra options dialog box, which is opened from the Graphics tab of the Page/Output Options dialog box.

What is a raster image format? The Bitmap format is the second standard Windows graphics format, which is used to represent graphic files (with the *.bmp extension) and transfer the image via the clipboard (like the Metafile format). This format does not save any additional data or parameters other than the image of the picture itself.

Unlike a metafile, a bitmap is a “passive” point-by-point representation of the graph window. The options for customizing such a graph in other Windows applications are very limited; typically they amount to stretching, shrinking, cutting, pasting, and drawing on top of the graph. As noted above, when working with graphs in other applications it is more convenient to save them in the STATISTICA graphic format and use OLE methods, so that all the configuration options of STATISTICA itself remain available.

What is STATISTICA's own graphics format? STATISTICA graphic files have the extension *.stg. Their main difference from metafiles and raster images is that they contain not only the picture but also all the information needed to customize the graph and analyze the data. All the data shown on the graph, their links, fitted equations, parameters of embedded objects, links between graphs and drawings, etc. are stored here. Graphs saved in this format can later be opened in any STATISTICA module to continue customizing them and analyzing the data. They can also be printed in batch using the Print files command from the File drop-down menu. Graphic files in STATISTICA's native format can be dynamically linked to documents of Windows applications using OLE methods.
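The practical difference between the two formats can be illustrated with a toy Python sketch. This is a conceptual model only, not the actual WMF or BMP file layout: a “metafile-like” picture is held as a list of drawing records with editable attributes, while the “bitmap-like” picture is just the grid of points produced from them. All names are hypothetical.

```python
# "Metafile-like" picture: drawing records whose attributes stay editable.
vector_picture = [
    {"shape": "hline", "row": 1, "start": 0, "end": 3, "color": "R"},
]

def rasterize(picture, width, height, background="."):
    """Render drawing records into a grid of points (one character per point)."""
    grid = [[background] * width for _ in range(height)]
    for item in picture:
        if item["shape"] == "hline":
            for col in range(item["start"], item["end"] + 1):
                grid[item["row"]][col] = item["color"]
    return ["".join(row) for row in grid]

# The "bitmap-like" form keeps only the points, not the line as an object.
bitmap = rasterize(vector_picture, width=5, height=3)

# Editing the record form: change one attribute and redraw.
vector_picture[0]["color"] = "B"
recolored = rasterize(vector_picture, width=5, height=3)

print(bitmap)
print(recolored)
```

Recoloring the line in the record form is a one-attribute change, whereas in the grid form the program no longer knows which points belonged to the line.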

Export via clipboard (paste or paste special using OLE methods). Using the clipboard is the fastest way to export a graph to another application. When copied to the clipboard, three graphical representations of the object are created: in native STATISTICA format, in Windows metafile format, and in bitmap format. Each of them can be used in other applications.

STATISTICA system graphs can be present in other applications (editors or spreadsheets) either as linked or embedded objects. When OLE methods are used, they remain linked to the STATISTICA system and can therefore be edited interactively within other applications.

Access to all chart data. The data shown on a graph, whatever their type, can be viewed and modified directly in the built-in Graph Data Editor. The data may be raw data, parts of a results table, or a series of calculated values (for example, for a probability plot).

For each graph, an associated “child” Editor window is created, which closes together with its graph window. The editor is organized into groups of columns representing the individual plots of the graph (see next paragraph).


Categorized graphics. To create categorized graphs, data is divided into subgroups. Several graphs will be presented simultaneously on one image, one for each of the specified subgroups. For example, you can build graphs separately for male and female subjects, divide patients into groups of women with high blood pressure, women with low blood pressure, men with high blood pressure, divide products by quality, country of origin, etc. Dividing data into homogeneous groups and exploring the connections between these groups is an extremely important data analysis technique.


Categorized graphs can be accessed in several ways:

  • They are available in most analysis dialog boxes (these graphs are created automatically in procedures that analyze groups or subgroups of data, such as classification, t-tests, analysis of variance, discriminant analysis, and nonparametric analysis).
  • They are available in the Quick Statistical Graphs list in the context menus of all source data tables and result tables.
  • They can be called from the Statistical graphs list (in the Graphics drop-down menu), which offers a wide choice of data categorization methods.

How are “categories” defined for categorized graphs? So, first you need to divide the data into groups. When you build categorized graphs from analysis dialog boxes, subgroups of data are automatically identified (because this division is part of data exploration). When constructing statistical graphs, various ways to define subgroups based on one or two grouping variables are offered. In addition, the division into subgroups can be organized by the user himself, using any combination of variables from the current data set.

There are several methods for identifying categories:

  • by integer values of the grouping variables (Whole numbers),
  • by dividing the grouping variables into a given number of intervals (Categories),
  • by dividing the grouping variables into intervals with given boundary values (Borders),
  • by specifying particular values (codes) of the grouping variables (Codes),
  • by forming complex subgroups (Complex subgroups); to do this, the user can enter case selection conditions of virtually unlimited complexity using the values of any variable in the current data file, as shown below.
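The logic behind three of these methods can be sketched in a few lines of Python. This is an illustration of the categorization idea, not STATISTICA code; the function names, the rule that a value equal to a boundary falls in the lower interval, and the sample variable names are assumptions.

```python
def by_boundaries(value, boundaries):
    """'Borders': index of the interval, given ascending boundary values.

    Assumption: a value equal to a boundary belongs to the lower interval.
    """
    for index, boundary in enumerate(boundaries):
        if value <= boundary:
            return index
    return len(boundaries)

def by_codes(value, codes):
    """'Codes': index of the matching code, or None if the value is not listed."""
    return codes.index(value) if value in codes else None

def by_conditions(case, conditions):
    """'Complex subgroups': index of the first selection condition the case meets."""
    for index, condition in enumerate(conditions):
        if condition(case):
            return index
    return None

# Hypothetical case with made-up variable names
case = {"age": 37, "group": "exp", "score": 14}
conditions = [
    lambda c: c["group"] == "exp" and c["score"] > 10,
    lambda c: c["group"] == "ctrl",
]
print(by_boundaries(case["age"], [30, 40, 50]))  # 1
print(by_codes(case["group"], ["ctrl", "exp"]))  # 1
print(by_conditions(case, conditions))           # 0
```

Cases that match no code or no condition return None here, meaning they would be left out of every subgroup.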

The following figure shows a fairly complex graph, categorized by two criteria. In this case, a mixed method was used to identify the subgroups. Categorization by two criteria means that the elements of the graph are arranged like the cells of a two-way table obtained by applying two different categorization methods.

The two rows in the above graph represent the division into subgroups according to the values of the variable Measure5. The three columns of the graph represent subgroups defined in a special way by observation numbers (null variable) and the values of the Home_7 variable. Below is the dialog box where the parameters of this graph were set.


Each small graph shows the relationship between the variables Work_1 and Work_2 (as X and Y, respectively). The first categorization (Categories by X, the “columns” of graphs) is defined with the Complex subgroups method in the dialog box opened by the Set subgroups button:

The second categorization (Categories by Y, the “rows” of graphs) is determined by the grouping variable Home_2. The range of this variable is divided into two equal intervals. To do this, the value 2 is entered in the Categories field of the dialog box for setting graph parameters (in this case, the cases are split into two groups: those for which Home_2 is less than or equal to 104.62, and those for which it is greater than this number).
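A two-way categorization of this kind can be sketched in Python as follows. The sketch assumes that the intervals are of equal width over the observed range and that a value equal to a boundary goes to the lower interval (consistent with the “less than or equal to” rule above). The data values, the key names, and the column rules are made up for illustration and are not the ones behind the figure.

```python
def equal_interval_boundaries(values, n_categories):
    """Inner boundaries that split the observed range into equal-width intervals."""
    low, high = min(values), max(values)
    width = (high - low) / n_categories
    return [low + width * k for k in range(1, n_categories)]

def interval_index(value, boundaries):
    """Index of the interval; a value equal to a boundary goes to the lower one."""
    for index, boundary in enumerate(boundaries):
        if value <= boundary:
            return index
    return len(boundaries)

def two_way_cells(cases, row_key, n_rows, column_rules):
    """Group cases into (row, column) cells of a two-way layout."""
    boundaries = equal_interval_boundaries([c[row_key] for c in cases], n_rows)
    cells = {}
    for case in cases:
        column = next((j for j, rule in enumerate(column_rules) if rule(case)), None)
        if column is None:
            continue  # case matches no column subgroup
        row = interval_index(case[row_key], boundaries)
        cells.setdefault((row, column), []).append(case)
    return cells

# Hypothetical data: 'home' plays the role of the row grouping variable
cases = [
    {"id": 1, "home": 100.0}, {"id": 2, "home": 110.0},
    {"id": 3, "home": 104.0}, {"id": 4, "home": 106.0},
]
column_rules = [lambda c: c["id"] <= 2, lambda c: c["id"] > 2]

print(equal_interval_boundaries([c["home"] for c in cases], 2))  # [105.0]
cells = two_way_cells(cases, "home", 2, column_rules)
print(sorted(cells))
```

Each key of the resulting dictionary identifies one small graph in the grid, and its value holds the cases plotted there.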

Ternary surface plots and level line maps. When outputting the results of mixture design analyses in the Experimental Design module, you can plot ternary graphs as 3D surfaces or as level line maps.

Ternary plots can be built from the Statistical XYZ Graphs, Statistical Categorized Graphs, and Custom Graphs submenus of the Graphics drop-down menu.

Graphs in polar coordinates. Some types of graphs can be plotted in polar coordinates. These include scatter plots, line graphs, and sequential nested graphs from the Statistical 2D graphs submenu (opened from the Graphics drop-down menu).


Categorized graphs can also be constructed in polar coordinates.

Many graphs drawn in a conventional rectangular coordinate system can also be displayed in polar coordinates. To do this, set the corresponding switch in the General Layout dialog box to the Polar position.
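The geometry behind such a switch is the standard conversion between polar and rectangular coordinates. The Python sketch below assumes that one variable is read as an angle in degrees and the other as a radius; how STATISTICA itself maps variables to angle and radius is not stated here, so the mapping and the function names are illustrative.

```python
import math

def polar_to_rectangular(angle_degrees, radius):
    """Position of a point given as (angle, radius) in rectangular coordinates."""
    theta = math.radians(angle_degrees)
    return radius * math.cos(theta), radius * math.sin(theta)

def rectangular_to_polar(x, y):
    """Inverse conversion: returns (angle in degrees, radius)."""
    return math.degrees(math.atan2(y, x)), math.hypot(x, y)

# A point at 90 degrees with radius 2 lies (up to rounding) at x = 0, y = 2
x, y = polar_to_rectangular(90.0, 2.0)
angle, radius = rectangular_to_polar(x, y)
print(round(x, 6), round(y, 6), round(angle, 6), round(radius, 6))
```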


How to place a graphical object from another application on a STATISTICA graph? To insert any Windows-compatible graphical object, you can use all the clipboard pasting operations described above (including linking and embedding using OLE methods). These operations can be performed on raster objects, Windows metafiles, STATISTICA graphics, and any OLE-compatible objects.

How to place text on a STATISTICA graph (reports, tables, etc.)? Using the clipboard operations described above, you can place a very large text object (for example, a report several pages long) on STATISTICA graphs. This text is edited and modified in the Graph Text Editor window of STATISTICA or in the corresponding application acting as the OLE server.

All pasting and clipboard operations described in the previous section apply to any compatible Windows graphical objects, and linking and embedding operations are performed on all objects that support OLE methods.

Gallery of STATISTICA graphs. This button opens the STATISTICA Graph Gallery dialog box. This button is present in the dialog box of each chart type.


From here you can quickly and easily call up all statistical and custom graphs, empty graph windows, and user-defined statistical graphs. To do this, highlight the name of the desired graph type and double-click it (or click OK).

Custom and statistical graphs. In addition to specialized graphs, which are called directly from the results dialog box of any statistical procedure, there are two main types of graphs available from the menu or toolbar of any table: custom graphs and statistical (and quick statistical) graphs.

The main difference between the two main types of graphs is the source of the data to be displayed. These differences are described in more detail in the following sections.

Custom graphs. A custom graph lets you display any user-specified combination of values from source data tables or result tables (as well as any combination of their rows and/or columns). The menu offers five types of such graphs: 2D custom plots, 3D custom sequential plots, 3D custom scatter and surface plots, custom matrix plots, and custom icon plots. When you select one of them, the corresponding dialog box opens, where you can set the range of data from the current table to be displayed on the graph. The contents of this dialog box depend on the selected custom graph type. The initial selection of data offered in the dialog box is determined by the position of the cursor in the current table. In each custom graph dialog box you can select a specific graph type (within the main type) when setting the parameters. The graph type can also be chosen after the graph has been built (using the General Layout or Graph Placement dialog boxes, which are opened by double-clicking the background area of the graph window or by selecting the corresponding item from the Layout drop-down menu).

Statistical graphs. Unlike custom graphs, which are a means of visualizing the numerical data of any table (source data or results, see above), statistical graphs offer hundreds of predefined types of graphical representations, including analytical summaries of statistical data. They are called from the Graph Gallery dialog box, which is opened with the toolbar button of the same name or from the Graphics drop-down menu.


When constructing such graphs, values ​​are used directly from the data file, which do not depend on the contents of the current table, block selection and cursor position. In this case, either standard methods of graphical analysis of source data are offered (various scatter graphs of values, histograms, graphs of average values, for example, medians), or standard analytical methods of research (normal distribution density graphs, probability graphs with the excluded trend or graphs of confidence intervals of regression lines) . When constructing statistical graphs, the program takes into account the selection conditions and weights of observations.

The most widely used types of statistical graphs (accessed from the Graphs menu, see the previous paragraph) are collected in the Quick statistical graphs menu. These lists do not offer as wide a range of options as the Statistical graphs menus, but they simplify and speed up graph construction. Quick statistical graphs:

  • called from context menus or from the toolbar of any table (usually they do not require access to drop-down menus or dialog boxes),
  • do not require the user to select variables (this choice is determined by the current position of the cursor in the table) and intermediate settings of parameters (the format of the corresponding graphs is determined by default).

When you select the Quick statistical graphs item (using the toolbar button, the context menu, or the Graphs drop-down menu), a menu for selecting a statistical graph appears for the current table variable, that is, the one the cursor is currently pointing to.


If the cursor does not point to any variable, you will be prompted to select a variable from a list before any graph from the Quick statistical graphs menu is plotted. When creating such graphs, STATISTICA takes into account the current case selection conditions and case weights.

Block statistical graphs. These (custom) graphs are called from the context menu items Block statistics by column and Block statistics by row, or from the Graph Gallery dialog box.

Either option builds a summary statistical graph for the selected block in order to compare values across rows (Block statistics by row) or across columns (Block statistics by column). Graphs of this type are similar to the custom graphs that display data from the current table block.

Other specialized graphs. In addition to the standard set of quick statistical graphs, some tables let you build more specialized statistical graphs (for example, time sequence plots in the Time series module, icon plots of regression residuals, and contour plots in the Cluster analysis module). As mentioned earlier, specialized graphs that are associated not with a specific results table but with a specific method of analysis (for example, plots of fitted functions in the Nonlinear estimation module or plots of means in the Analysis of variance module) are called directly from the dialog box with the analysis results (that is, from the window containing the output options of the analysis method used).

Setting up a chart before and after plotting it. Any changes to the chart parameters in STATISTICA are made from the active graphics window (after the chart is displayed on the screen). As a rule, it makes sense to first plot the graph using the default parameter values, and then make various changes. However, in those rare cases where plotting takes too long (when creating complex composite graphics or processing large data sets), you can intervene in the process to make the necessary adjustments. You can interrupt drawing with a single keystroke or mouse click anywhere on the screen, and then continue drawing after making the necessary changes.

There are two main ways to customize a graph: adding and editing custom graphic objects, and changing the structural elements of the graph.

Do different types of graphs require different customization methods?

No. Regardless of how you create a graph, you can use any of the features provided in STATISTICA to customize and change it. You can add a new plot to any graph, merge it with another graph, or place a linked or embedded object in it. In addition, the graph can be modified in any way, you can draw on it, and you can apply various function-fitting methods. The same customization methods are available for graphs that were previously saved and are opened from a disk file.

Customizing a statistical graph before and after it is built. The section How to customize a STATISTICA graph shows that most customization options (hundreds of different graphical presentation options) are available immediately after plotting. To use them, just click a specific element of the graph or select the corresponding item in the General Layout or Graph Placement dialog boxes, which are called from the Layout drop-down menu.

At the same time, certain parameters that define the data source must be set before plotting, for example the variables, the categorization method, label values, case names, and axis labels. In this example, before building the graph you need to select the variables and the categorization methods and, if necessary, set some parameters using the Options button (which is not used here).

Now let's return to our example. After the graph is plotted, clicking anywhere in the background of the graphics window opens the General Layout dialog box, in which the parameters of the general layout of the graph are adjusted.


In this window you can change the graph type and select a contour (level-line) map (use the Graph type field). In addition, you can change the Number of sections parameter from the default 15 x 15 to 25 x 25 (this parameter determines how accurately the contour map is drawn):

After making changes, click OK, and you will see a new graph:


Return to the General Layout dialog box and select Zone as the contour line type. In addition, place the control characters @F in each of the first three lines of the graph title so that the equations of the fitted quadratic function are written there: the first parameter in square brackets is 1 (the first dependence), and the second parameter is 1, 2, or 3 (one for each of the three separate plots):

For the fastest display and full formatting of function equations, it is better to use the Options dialog box, which is called from the Statistical graphs dialog box. Click OK, and you will see the modified graph:


Now you can continue exploring the various ways to set up a chart. The easiest (and fastest) way to change the parameters of an element is to double-click on it. In addition, with a single right-click on a given object, you can call up the corresponding context menu.

For example, when you right-click on one of the graph axes, the context menu shown below will appear, offering a choice of settings for that axis:


In the graph shown below, a different aspect ratio of the graphics window was selected with the toolbar button; in addition, the status of the legend was changed from fixed to movable, and its text was edited, rearranged, and moved to another location.


Can graphs be automatically updated when the data file changes?

Yes, they can. All graphs retain links to the source data table from which they were built. Unless updating is set to manual or the links are removed, a graph is updated automatically whenever the source data change. Links are managed in a special dialog box, Connections between data and graphics, which is called from the Graphs drop-down menu.


Here you can set the automatic link mode, in which the graph is updated automatically when the data on which it is built change. You can also set the Manual mode or temporarily lock the link. In addition, you can set the Link to current data file mode and build the same graph or series of graphs for other data files. The link mode can be changed globally using a command in the Service drop-down menu.

STATISTICA also supports nested links with other applications. For example, you can link a graph to data in an Excel 5 spreadsheet using Dynamic Data Exchange (DDE). When you press F9 to recalculate the Excel table, both the data in that table and the corresponding graph in STATISTICA are updated automatically. See also the next two points.

STATISTICA graphic format. Graphs and drawings can be saved in the STATISTICA graphic format in a file with the *.stg extension, using the Save and Save as... commands from the File drop-down menu. This is the recommended format if the file is to be opened again in STATISTICA or attached to other applications using OLE. Unlike other graphic formats, the STATISTICA format stores not only the picture itself but also the Graph Data Editor with all the data shown on the graph, all analytical parameters (fitted equations, ellipses, etc.), and other parameters that let you continue analyzing the graph data later. This format is most useful when linking or embedding a graph into another STATISTICA graph. Files saved in this format can be printed in batch mode using the Printing files command from the File drop-down menu.

STATISTICA Command Language (SCL)

STATISTICA contains two built-in programming languages: STATISTICA BASIC and SCL (command language). Both languages ​​are designed to work in the STATISTICA environment and contain built-in operations for accessing source data tables, result tables and graphical functions.

STATISTICA BASIC is a simple yet quite powerful programming language. It can be used to create a wide range of applications, from simple data transformation programs to complex user procedures for comprehensive analysis and output of information.

This programming language is suitable for solving large computing problems, since the processed data arrays can have up to 8 dimensions and there are no restrictions on the size of the arrays. In this way the user can use all available memory and create procedures that involve operations on large multidimensional matrices.

The built-in STATISTICA BASIC language is available at any time during analysis, along with an integrated environment that allows you to write, edit, check, debug (pre-run) and execute programs.

Like any regular programming language, STATISTICA BASIC supports loops and conditional jumps, functions and subroutines, and work with dynamic-link libraries (DLLs). At the same time, it "understands" the structure of STATISTICA data files and lets you organize interactive data processing within the system itself using custom dialog boxes. With this language, users can create their own complex data analysis programs while using the ready-made computational and plotting algorithms provided in STATISTICA.

The command language SCL (STATISTICA Command Language) is intended for organizing batch data processing and creating your own applications based on the procedures contained in the STATISTICA system. In order for the user to implement his own calculation algorithms, it is possible to integrate the STATISTICA BASIC and SCL languages.

Programs written in the built-in languages ​​of the STATISTICA system are available in any module of the system and at any stage of data analysis, and they can be called and executed either using autotask buttons or directly from the editing window. The user also has the opportunity to create his own libraries of functions and subroutines and thus significantly expand the proposed set of data processing procedures and presentation of results.

Input and execution of SCL programs. STATISTICA can operate in “true” batch mode as a command-driven system using the built-in application control language SCL (STATISTICA Command Language), accessible in any system module from the Analysis drop-down menu. You can enter a sequence of commands to perform specific actions, and then run them as many times as you like in batch mode.

Another approach is also possible: using the Command Master dialog box to quickly select and enter the required list of commands.


To write and debug "batches" of commands, the integrated SCL environment is used. It includes a text editor combined with the Command Master window (see the illustration above; the Command Master button is on the Command language toolbar), a language syntax help system with examples, and integrated program verification tools (accessible from the Service drop-down menu).

Custom SCL language extensions. SCL programs can include not only predefined options and commands for statistical processing, data management, and graphical output (see the Help: examples and Help: syntax buttons on the toolbar), but also custom "commands" defined using the Assign Keys (SendKeys) tool (following the conventions adopted in MS Visual Basic).

Programs written in this way can, for example, perform clipboard operations (Copy, Paste), change the default output options of various procedures, and perform other functions.

SCL programs may also include programs and procedures written in STATISTICA BASIC (STATISTICA's language for transforming and manipulating data and graphs, accessible from any module of the package). For example, user-defined graphical or computational procedures written in STATISTICA BASIC can be executed as part of a batch of SCL commands.

Interactive user interface for SCL programs. Although the SCL command language does not itself contain a dedicated interactive user interface, you can use STATISTICA BASIC programs called from SCL programs for this purpose, for example to create dialog boxes for selecting variables, data files, etc. during program execution (see the examples in the On-screen Manual).

STATISTICA executable module. The command language contains a special Executable module that allows you to develop turnkey applications, which are called by double-clicking the icon of the corresponding “user application” on the Windows desktop.

This feature saves the user time when the same procedure or sequence of analysis procedures is repeated many times, and also makes it possible to use SCL programs by users who are not familiar with the conventions of the STATISTICA system.


To create such a turnkey application, you first need to write the SCL program itself and save it in the usual way (for example, in the file Program1.scl). Then, in the Windows Program Manager window, you need to create an icon for the executable module named Sta_run.exe (it is located in the STATISTICA folder on the disk).

In the command field you need to specify the name of the SCL program to be executed (for example, d:\data\program1.scl). Now, when you click this icon, execution of the program (in this case Program1.scl) will begin. In this way you can create any number of custom applications and, using the Program Manager window, give them meaningful names that correspond to the data analysis tasks they perform.

Autotask buttons

Autotask buttons are a pop-up, customizable toolbar (you can turn it on or off using the CTRL+M keys).


The buttons on this toolbar can be assigned or reassigned using the Settings... button (or by clicking the corresponding button while holding down the CTRL key). In the dialog box that opens, you can assign names to existing and new buttons.

Let's move on to a more systematic presentation.

Often, when performing a complex task, you need to repeat the same sequence of actions, for example opening previously saved graphs, data, or program listings. The constant need to perform tasks that are unrelated to your main job can be time-consuming or even annoying. The STATISTICA system provides features that relieve the user of monotonous operations and help create comfortable working conditions.

Autotask buttons are a customizable panel that you can easily remove from the screen or restore as needed (using the CTRL+M key combination). On the Autotask buttons panel, click the Settings... button. The window for setting up autotask buttons will open. In the central part of the window there is a column of buttons that let you:

  • Change or set a button. By clicking this button, you can define a sequence of keystrokes. To record such a sequence, just click the Record button on the right side of the dialog box. From this moment on, the system automatically remembers your actions and translates them into the command language. For example, pressing the Alt key takes you to the main menu, which you can navigate using the arrow keys and Enter. The Tab key helps you move within dialog boxes, and so on. To end recording, press CTRL+F3. The navigation keys and their corresponding syntax are described at the bottom of the Setting up autotask buttons window.
  • Remove a button. At any time you can remove a button that is no longer needed.
  • Set a sequence of functions or operations in the STATISTICA Command Language (SCL).
  • Use computational procedures written in STATISTICA BASIC, data transformations, data management operations, graphical procedures, as well as procedures written in any other programming language and called from STATISTICA BASIC.
  • Open data files and any supporting files of the STATISTICA system.
  • Create and edit macros (sequences of keystrokes) that correspond to frequently performed procedures, tasks, or settings. Such editable commands can be entered as text or, for example, as a sequence of mouse movements.

Each of the windows described above provides the ability to create hotkey combinations. You can assign a combination of the CTRL key and any letter from A to Z or numbers from 0 to 9. After saving this setting, you will only need to press a certain key combination, which will be equivalent to pressing the autotask button.

The toolbar can be global or local and contains large libraries of custom jobs and procedures. A local toolbar is associated with a specific module or project. The name of the currently open panel is displayed in the title bar of the dialog box.

The customized Autotask buttons toolbar can then be saved using the commands of the Settings... dialog box.


The Autotask buttons toolbar can be used as a convenient interface for custom extensions of standard procedures.

Toolbar sizes can be changed using the mouse:

The panel can be fixed by moving it to the edge of the STATISTICA application window, as shown in the following figure.


As already noted, the buttons of the Autotask buttons toolbar can be configured or reassigned in the Setting up autotask buttons dialog box (which opens with the Settings... button on the toolbar). In addition, individual buttons can be edited and/or reassigned directly in the corresponding settings window; to do this, click the button while holding down the CTRL key.


This will open the configuration window for that specific button.

By selecting the last item in the context menu that appears when you right-click anywhere on the toolbar, you can quickly switch between different previously saved Autotask buttons toolbars.


A look into the future

STATISTICA is constantly evolving, opening up new opportunities for users. In short, the system is developing in step with modern Windows technologies. Flexible customization for specific project tasks, a wide range of statistical functions available to the user from other applications, global integration with other applications (for example, using VB, C++, Java), and optimization for Web and multimedia applications are the near-term prospects for STATISTICA. It will be possible to embed various objects into data tables (multimedia spreadsheets): sound, photos, etc.

First steps in the STATISTICA system

Our acquaintance with the STATISTICA system, of course, should begin with data entry. You will see how easy it is to enter a wide variety of data into STATISTICA. It is assumed that the STATISTICA system is installed on your computer and you consistently repeat the described steps.

As a specific subject area, let's choose a medical example.

As you already know, the source data in the STATISTICA system is organized in the form of tables. If you have experience working with spreadsheets (such as MS Excel), you will quickly get used to STATISTICA tables. Note that the tabular data structure of STATISTICA allows you to naturally display most real data.

A spreadsheet consists of rows and columns. The columns of a STATISTICA table are called variables (Variables), and the rows are called observations (Cases).

For example, in medicine the observations are patients, and the variables are gender, age, date of admission to hospital, date of diagnosis, date of operation, transfer to another hospital, discharge, etc. You can think of such a table as a page of a doctor's notebook, where the rows are, for example, the patients' names and the columns are characteristics (variables describing the course of the disease).

To create a table with data, do the following:

1. Start the STATISTICA program.

2. The Statistical modules menu (STATISTICA Module Switcher) will open.

3. Select the Basic statistics and tables module from the menu and click on it with the mouse.

4. You are now in the Basic statistics and tables module, in which you can select any statistical procedure included in this module. But since you have a different goal, just click the Exit (Cancel) button.

So, you are in the working window of the Basic statistics and tables module of the STATISTICA system. In the main working window, move the mouse cursor to File on the menu bar and left-click. Select the Create data command from the drop-down menu. The Data Creation window immediately appears on the screen (see picture below).

In this window you can enter a file name, for example medicine1.sta (the file can be named in Russian, but for a number of reasons it is better to use English names).

Now place the mouse cursor in the File name field and type the desired name from the keyboard.


After you press Enter on the keyboard or click the Save button, the program will create an empty table containing 10 rows and 10 columns.


You can easily increase or decrease both the number of rows and the number of columns of this table. Create as many rows and columns in the table as needed. To do this, use the buttons on the toolbar.

For example, click the Observations button. A menu will appear offering the following choices for table observations: add, move, copy, delete, enter case names. For example, select Add by double-clicking the left mouse button. A window will open in which you can specify the number of observations to add to the table:

Click OK, and the number of rows (observations) in the table will increase by 2, that is, it will become 12. Change the number of variables in the table in the same way. In this case, 11 variables are needed. Click the Variables button on the toolbar. Using the mouse cursor, select Add from the drop-down menu. A window will appear on the screen where you can make the settings as shown below.

Click the Observations button again and select the Names menu item. A dialog box appears in which you can specify how many characters will be reserved for case names in the table. You can also widen the field for observation names using the mouse.

So, you've taken the first step towards achieving your goal - creating a spreadsheet that has 11 columns and 12 rows, as well as space for entering case names (see figure).


Now you need to enter the table name (its title) and variable names. You work using a mouse and keyboard. Remember the basic principle: double-clicking title fields opens dialog boxes that allow you to enter titles, describe variables, etc. Enter a table title. To do this, double-click on the top row of the table, the empty row that is located above the variables. In the window that appears, enter a table title.


Type the title using the keyboard and click OK. The entered text appears in the table header. In the File information and notes field you can record additional information that will be useful when working with the file.

The names of variables and observations are edited in the same way. For example, to enter names, double-click in the Observation name field and enter the names of the patients in the window that appears:

To describe a variable, double-click its name. For example, after you click the title variable1 (VAR1), a window will open in which you can set its name (or rename it), the variable format, label, link, etc.

Now fill the created table with data. Data is entered directly from the keyboard. We will discuss export options, for example to MS Word, later. To enter numeric data, use the keyboard and the cursor arrows: place the cursor on the desired table cell and type the number. Text values are entered differently. Move the cursor to a cell of a variable with text values and double-click. The code 9999 will appear in the cell; this is the code for missing values. Erase the code using the DEL key on the keyboard, then enter the desired text value. As a result, you can get the following table:


Thus, you have learned how to create tables and enter data into them. By repeating the described actions several times with other data, you will firmly consolidate the acquired skills.

Since STATISTICA is a regular Windows application, you can easily and quickly transfer data from STATISTICA to another Windows application, such as MS Word.

The best way to do this is to press ALT and F3 simultaneously. A crosshair will appear on the screen in place of the mouse cursor. Using the mouse, place the crosshair in the upper left corner of the table. Then press the left mouse button to anchor the crosshair and, while holding the button down, drag the crosshair to a new location on the table. The selected part of the table will be marked with a rectangular frame. After you release the mouse button, the marked part of the table is placed on the clipboard. If you now open the required Word document and press CTRL+V, the selected table segment will be copied into the document.

Notes. You worked in the Basic statistics and tables module; you can enter data in the same way in any module of the STATISTICA system. In terms of general data management capabilities, the system modules are identical.

The STATISTICA system has a special Data management module, which contains advanced features that let you quickly create a spreadsheet, combine two tables, cut out part of a table, and sort observations by any criterion: for example, arrange patient names in alphabetical order or order them by age, etc. (see picture below).

Exercise. Sort the data in the medicine1.sta file by patient age and city. Use the Data management module and the Sorting observations option.
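The logic of sorting by two keys can also be sketched outside STATISTICA. The Python snippet below is only an illustration of the idea, not the Data management module itself; the patient records and the field names name, age, and city are hypothetical and are not taken from the tutorial's data file.

```python
from operator import itemgetter

# Hypothetical patient records (not taken from the tutorial's data file).
patients = [
    {"name": "Ivanov", "age": 54, "city": "Moscow"},
    {"name": "Petrova", "age": 37, "city": "Kazan"},
    {"name": "Sidorov", "age": 54, "city": "Kazan"},
    {"name": "Orlova", "age": 29, "city": "Moscow"},
]

# Sort by age first; rows with the same age are then ordered by city.
sorted_patients = sorted(patients, key=itemgetter("age", "city"))

for row in sorted_patients:
    print(row["age"], row["city"], row["name"])
```

Listing age before city in the key makes age the primary criterion; swapping the two would group patients by city first.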

One more example

From the STATISTICA module switcher, launch the Basic statistics and tables module. To do this, select the Basic statistics and tables module from the menu and click on it; the module will be highlighted in the list of modules. Then move the mouse cursor to the Switch to button and click it. The STATISTICA system will start, and the working window of the Basic statistics and tables module will appear on the screen. It is in this module that we will work.

In the Basic statistics and tables module, create a data file as shown in the picture.

The file contains the results of a survey of 10 women (the data are simulated) regarding their family status and level of anxiety. The first variable, SEM_GENDER, describes the woman's family status. It takes two values: P_family (complete family) and N_family (incomplete family). The second variable, ANXIETY, describes the woman's self-assessment of personal anxiety. It takes two values: low and high. Personal anxiety is known to be characterized by a persistent tendency to perceive a life situation as threatening (containing a hidden threat). You can see that the first woman interviewed, observation number 1 (the first row of the table), has a complete family and rates her level of anxiety as high. The second woman interviewed, observation number 2 (the second row of the table), has an incomplete family and rates her level of anxiety as low, and so on.

Name this file women1.sta.

Note that the variables in this file take text values, which is typical for sociological surveys.

Here is some advice to help you enter text data more efficiently. The variables take text values, and typing the text into the table every time would take too long. It is more convenient to enter numeric codes and then switch to text mode by clicking the button on the toolbar. To do this, the values of the variables must be coded. Let's show how this is done, starting with the variable SEM_GENDER. Double-click its title with the left mouse button, and the Text Value Manager - SEM_GENDER window will appear on the screen.

In this window, in the first line, type P_family in the Text column and 1 in the Number column. The text value P_family will thus be assigned the code 1. In the second line of the Text Value Manager, type N_family in the Text column and 2 in the Number column; the text value N_family will be assigned the code 2. Then click OK.


Now enter the value 1 in those cells of the variable SEM_GENDER that should contain the text value P_family.

Enter the value 2 in those cells of the variable SEM_GENDER that should contain the text value N_family.

Now you just need to click a button on the STATISTICA toolbar to get the desired text values.

In exactly the same way, enter the text values in the cells of the variable ANXIETY.
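The coding scheme described above amounts to a lookup table from numeric codes to text labels. A minimal Python sketch of the idea follows; it is not how STATISTICA stores text values internally. The codes 1 and 2 for P_family and N_family are those given in the text, while the variable names and the sample column of codes are hypothetical.

```python
# Codes assigned in the Text Value Manager, as described in the text.
family_labels = {1: "P_family", 2: "N_family"}

# Hypothetical column of numeric codes typed into the table.
family_codes = [1, 2, 1, 1, 2]

# "Switching to text mode": replace each numeric code with its text value.
family_text = [family_labels[code] for code in family_codes]

print(family_text)
```

Typing short codes and translating them in one step is less error-prone than typing the full labels in every cell, because a misspelled label would silently create a new category.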

So, you have created the women1.sta file. Now, based on this source data file, we will build a contingency table. This is very easy to do in STATISTICA.

Step 1. From the Analysis menu, open the Start panel.

You will see the different types of analysis available in the module. Select the analysis Tables and headers and click OK.

The Define tables window will appear on the screen.

Step 2. First, in the Analysis line, select Contingency tables (the other possible option is Tables of flags and headers).


Step 3. Next, click the Set tables button. In the window that appears, select the variables to be tabulated. These variables define how the source data are divided into groups, so they are often also called grouping variables. In this case, you need to tabulate the values of the variables SEM_GENDER and ANXIETY.

So select them as shown in the image below.


Note that in general you can select up to 6 lists of grouping variables, which lets you build extremely complex tables containing many more variables than in this example. Such tables often arise in mass surveys, and you need to be able to construct them.

After selecting the variables, click the button OK. You will be returned to the dialog box shown in the figure. Please note that the window has changed slightly: the number 1 has appeared next to the Number of tables inscription, because you selected the variables and asked the system to build one table.

Step 4. Press Enter on the keyboard or click the OK button.

The system will perform the calculations and offer to show the result in the Crosstabulation results window.


Step 5. In the Crosstabulation results window, click the View summary tables button. The following contingency table will appear on the screen:

You can see that the variables SEM_GENDER and ANXIETY are tabulated in this table. At the intersections of rows and columns are the absolute frequencies calculated from the original women1.sta data file.

We tabulated the values of two variables, SEM_GENDER and ANXIETY, jointly; this operation is often called crosstabulation (from the English cross).

From the constructed table, known in the jargon as a contingency table, it is clear that three women have a complete family and a low level of anxiety, two women have an incomplete family and a low level of anxiety, etc. If you are interested in the separate tabulation of each variable, look at the rightmost column and the bottom row of the table. You will see that among the women surveyed, five have a complete family and five an incomplete family (see the rightmost column); five have a high level of anxiety and five a low level (see the bottom row).
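The counts read off the table can be reproduced with a few lines of Python. This is only a sketch of what crosstabulation does, not STATISTICA's own procedure. The ten observations are reconstructed from the counts stated in the text: only the first two rows follow the order described for the data file, and the order of the remaining rows is an assumption. The variable names are illustrative.

```python
from collections import Counter

# (family status, anxiety) for the 10 women surveyed.
observations = [
    ("P_family", "high"),  # observation 1, as described in the text
    ("N_family", "low"),   # observation 2, as described in the text
    ("P_family", "low"),   # order of the remaining rows is assumed
    ("P_family", "low"),
    ("P_family", "low"),
    ("P_family", "high"),
    ("N_family", "low"),
    ("N_family", "high"),
    ("N_family", "high"),
    ("N_family", "high"),
]

cells = Counter(observations)                                # joint frequencies
row_totals = Counter(family for family, _ in observations)   # family status margin
col_totals = Counter(anxiety for _, anxiety in observations) # anxiety margin

for family in ("P_family", "N_family"):
    low, high = cells[(family, "low")], cells[(family, "high")]
    print(f"{family:9} low={low} high={high} total={row_totals[family]}")
print(f"{'Total':9} low={col_totals['low']} high={col_totals['high']} total={len(observations)}")
```

The joint counts fill the body of the table, while the two marginal counters correspond to the rightmost column and the bottom row.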

It is often necessary to show percentages in the table alongside the absolute values. The STATISTICA system lets you choose which percentages you need: row percentages only, column percentages, percentages of the total, or any combination of them.

Column percentages are calculated relative to the column totals. Row percentages are calculated relative to the row totals. Percentages of the total are calculated relative to the sum of all frequencies in the table. Let's look at how this is done.

You will return to the Crosstabulation results window.

Step 7. In the Crosstabulation results window, note the options on the right side, grouped under Tables.

For example, select the option Percentage of total: move the mouse cursor to the corresponding check box and click. Then, in the Crosstabulation results window, click the View summary tables button. The following table will appear on the screen:

Here, next to the absolute values, relative values have appeared: percentages calculated from the total number of women, that is, from 10.

So, from the table it is clear (please check!) that:

  • 30% of women have a two-parent family and a low level of anxiety (first cell of the table),
  • 20% of women have a two-parent family and a high level of anxiety (second cell of the table),
  • 20% of women have a single-parent family and a low level of anxiety,
  • 30% of women have a single-parent family and a high level of anxiety.
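The check suggested above can also be done by hand or with a short script. The sketch below is an illustration in Python, not a STATISTICA feature; it takes the four cell counts from the table and computes percentages of the total, row percentages, and column percentages. The function name percentages and its relative_to parameter are hypothetical.

```python
# Cell counts from the table in the text: rows are family type,
# columns are anxiety level (low, high).
table = [[3, 2],   # two-parent family
         [2, 3]]   # single-parent family

def percentages(table, relative_to="total"):
    """Cell percentages relative to row sums, column sums, or the grand total."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand_total = sum(row_sums)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, n in enumerate(row):
            if relative_to == "row":
                base = row_sums[i]
            elif relative_to == "column":
                base = col_sums[j]
            elif relative_to == "total":
                base = grand_total
            else:
                raise ValueError(f"unknown base: {relative_to!r}")
            out_row.append(100.0 * n / base)
        result.append(out_row)
    return result

total_pct = percentages(table, "total")    # [[30.0, 20.0], [20.0, 30.0]]
row_pct = percentages(table, "row")        # each row sums to 100
col_pct = percentages(table, "column")     # each column sums to 100

for name, values in [("total", total_pct), ("row", row_pct), ("column", col_pct)]:
    print(name, values)
```

With only 10 observations, each woman contributes exactly 10% to the percentages of the total, which makes the four values in the list easy to verify.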

The constructed table can be edited, its appearance, labels, etc. can be changed.

Step 8. Editing the table.

Double-click a field in the constructed table, for example Total %. In the Results table row name window that appears, enter % in place of Total %.

You will get a table like this:

Step 9. Construction of separate tables with percentages.

Return to the Crosstabulation results window and note the option Show selected % in separate tables.

Make the following settings: select the option Percentage of total and option Show selected % in separate tables. Then click the button


You will see two tables, one of which will contain only absolute values, and the other - percentages calculated from the total number of respondents.


Step 10. Create an auto report.


STATISTICA has a useful reporting tool that lets you collect all your results in RTF format; the report can then be edited, formatted, and sent to a printer.

Do the following: go to the View menu and select the option Text/Output Window. From the constructed tables (they are located in the working window of the system), select the one that you want to save for the report by clicking on it. Then open the File menu and select the Print option. The selected results table will be printed.

In this window, you can, for example, edit a table and prepare it in the format required for a research report or article.


Note that no programming language was used at any point during the work; all actions are interactive, and this is a great advantage of the STATISTICA system. It is as easy to work with as, for example, the MS Word text editor. In conclusion, you are offered an exercise that will consolidate the acquired skills.

Example. Create the file women2.sta in STATISTICA, using more realistic scales for the values of the variables. Scale of a woman's marital status: single, single-parent family, two-parent family. Scale of a woman's anxiety: low, moderate, high.

Graphical analysis of contingency tables

Contingency tables allow you to compactly describe data. They are convenient and require a minimum of comments, so they are popular among doctors, sociologists, and marketers. The STATISTICA system makes it very easy to build even the most complex contingency tables.

Here we will look at how to visualize constructed tables, i.e. we will get acquainted with STATISTICA tools that allow you to graphically analyze tables. Visually it is much easier to see the patterns contained in the tables. The examples use a small amount of data so that the basic techniques can be clearly presented. Imagine the difficult situation you would be in if you were dealing with huge tables, and these are the kind of tables that arise in practice. “Follow us!” - still remains our main motto.

So, the STATISTICA system is running on your computer, and you are working in the Basic Statistics and Tables module (this is its name in the English version of STATISTICA).

Example (continued)

The women1.sta data file you are working with is open in the work window. Recall that this file contains the results of a survey of 10 women (the data are simulated) about their marital status and level of anxiety.

The first variable, SEM_GENDER, is the woman's marital status. It takes two values: P_family (two-parent family) and N_family (single-parent family).

The second variable, ANXIETY, is the woman's self-assessment of her personal anxiety. It takes two values: low and high. Personal anxiety is known to be characterized by a stable tendency of an individual to perceive a life situation as threatening. In this simplified example, we use two levels of anxiety: low and high.

You can see that the first woman interviewed, observation number 1 (the first row of the table), has a two-parent family and rates her level of anxiety as high. The second woman interviewed, observation number 2 (the second row of the table), has a single-parent family and rates her level of anxiety as low, and so on.

Step 1. Move the mouse cursor to the Analysis menu item and click it. In the menu that appears, choose Startup Panel.

Select the analysis Tables and headers and click the OK button.

Use the options in the table setup window to tabulate the variables SEM_GENDER and ANXIETY.


Step 2. After the system builds the table, look carefully at the Crosstabulation results window.

Notice the buttons in the lower right corner of the Crosstabulation results dialog box.


Step 3. In the Crosstabulation results dialog box, click the Categorized histograms button:


The meaning of these histograms is as follows: the women surveyed are divided into two groups (categories): women from two-parent families and women from single-parent families. A typical histogram for these variables looks like this:

Here you can clearly see the difference between categorized histograms and regular ones. In a regular histogram, the numbers of women with high and low anxiety are the same. In the categorized histogram, the number of women with a high level of anxiety is larger in single-parent families than in two-parent families. In other words, the level of anxiety of women in two-parent families is lower than that of women in single-parent families.
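The difference between the two kinds of histogram comes down to whether the counts are taken over all women or separately within each family category. The following Python sketch illustrates this with text bars; it is not STATISTICA output. The observations are reconstructed from the cell counts of the example (the row order is assumed), and the helper text_bars is hypothetical.

```python
from collections import Counter, defaultdict

# Model data: (marital status, anxiety) for 10 women.
# The cell counts match the example; the row order is assumed.
observations = [
    ("P_family", "high"), ("N_family", "low"),
    ("P_family", "low"), ("P_family", "low"), ("P_family", "low"),
    ("P_family", "high"),
    ("N_family", "low"),
    ("N_family", "high"), ("N_family", "high"), ("N_family", "high"),
]

# Regular histogram: anxiety counts over all women together.
overall = Counter(anxiety for _, anxiety in observations)

# Categorized histogram: a separate set of anxiety counts per family type.
by_family = defaultdict(Counter)
for family, anxiety in observations:
    by_family[family][anxiety] += 1

def text_bars(counts, levels=("low", "high")):
    """Render counts as simple text bars, one line per level."""
    return "\n".join(f"  {lvl:5} {'#' * counts[lvl]} ({counts[lvl]})"
                     for lvl in levels)

print("All women")
print(text_bars(overall))
for family in ("P_family", "N_family"):
    print(family)
    print(text_bars(by_family[family]))
```

The overall bars are equal (5 and 5), while the bars within each family category differ, which is exactly the pattern described above.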

Continuation of the example

Consider the data file women2.sta. Here we used more realistic scales for the values of the variables. Scale of a woman's marital status: single, single-parent family, two-parent family. Scale of a woman's anxiety: low, moderate, high.

Step 1. Move the mouse cursor to the Analysis menu item and click it. In the menu that appears, choose Startup Panel.

Select Tables and headers and click the OK button.

Step 2. In the Analysis line, select Crosstabulation tables (the option Tables of flags and headers is also possible).


Next, click the Set tables button. In the window that appears, select the variables to be tabulated (see the details above). In this case, you need to tabulate the values of the variables SEM_GENDER and ANXIETY.

Click the Codes button and select the codes (values) of the categorical variables that are to be tabulated. In this example, the number of values of each variable has increased because a finer measurement scale is used.

If you want all values of the variables to be tabulated, click the Select all button in the lower right corner.


Note that you can generally select any set of codes. Variable codes can be viewed by clicking the button Inf..
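Selecting a set of codes amounts to restricting the table to the chosen values of each variable. The Python sketch below illustrates the idea only; the observations and the code names (single, single_parent, two_parent) are hypothetical stand-ins for the contents of women2.sta, which are not listed in the text, and the function tabulate_selected is likewise an assumption.

```python
from collections import Counter

# Hypothetical observations in the spirit of women2.sta;
# both the data and the code names are assumed for illustration.
observations = [
    ("single", "moderate"), ("two_parent", "low"), ("single_parent", "high"),
    ("two_parent", "moderate"), ("single", "high"), ("single_parent", "moderate"),
    ("two_parent", "low"), ("single_parent", "high"),
]

def tabulate_selected(pairs, row_codes, col_codes):
    """Crosstabulate only observations whose codes were selected for both variables."""
    kept = [(r, c) for r, c in pairs if r in row_codes and c in col_codes]
    counts = Counter(kept)
    return {r: {c: counts[(r, c)] for c in col_codes} for r in row_codes}

# "Select all": every code of both variables is tabulated.
all_rows = ["single", "single_parent", "two_parent"]
all_cols = ["low", "moderate", "high"]
full = tabulate_selected(observations, all_rows, all_cols)

# A narrower selection: two family codes and two anxiety codes.
subset = tabulate_selected(observations,
                           ["single_parent", "two_parent"], ["low", "high"])

print(full)
print(subset)
```

Observations with an unselected code on either variable simply drop out of the narrower table.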

For example, the variable SEM_GENDER takes the following values:

Step 3. Press Enter on the keyboard or click the OK button in the upper right corner of the dialog box.

STATISTICA will perform the calculations, tabulate the data, and present the result in the Crosstabulation results window (see picture).


Step 4. In the Crosstabulation results window, click the View summary tables button. A table will appear on the screen:



The meaning of the histograms is as follows: the women are divided into 3 groups, or categories: women from two-parent families, women from single-parent families, and single women (compare with the previous example). A separate histogram is built for each group, and all these histograms are collected on one graph, which lets you compare the groups visually.

Step 6. In the Crosstabulation results dialog box, click the 3D histograms button.

A 3D histogram will appear on the screen.


The meaning of this histogram is as follows: all possible combinations of the values of the two variables, marital status and anxiety level, are listed, and the number of times each combination occurs is counted.

The three-dimensional histogram reproduces the crosstabulation table very clearly. Imagine laying the table flat on a plane and placing a column in each cell whose height equals the number of observations in that cell.

If you are not satisfied with the viewing angle of the three-dimensional histogram, you can change it using the system tools. STATISTICA offers powerful tools for working with graphs; for example, graphs can be rotated.
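The construction just described can be sketched in Python: list every combination of the two variables and record how many observations fall into it, including combinations that never occur. This is an illustration only; the observations and code names below are hypothetical stand-ins for women2.sta, whose contents are not given in the text.

```python
from collections import Counter
from itertools import product

family_levels = ["single", "single_parent", "two_parent"]
anxiety_levels = ["low", "moderate", "high"]

# Hypothetical observations in the spirit of women2.sta (assumed data).
observations = [
    ("single", "moderate"), ("two_parent", "low"), ("single_parent", "high"),
    ("two_parent", "moderate"), ("single", "high"), ("single_parent", "moderate"),
    ("two_parent", "low"), ("single_parent", "high"),
]

counts = Counter(observations)

# One column per table cell; its height is the number of observations
# in that cell. Combinations that never occur get a height of 0.
heights = {combo: counts[combo]
           for combo in product(family_levels, anxiety_levels)}

for (family, anxiety), height in heights.items():
    print(f"{family:14} {anxiety:9} {'#' * height} {height}")
```

Each printed line corresponds to one column of the three-dimensional histogram, so the nine lines together carry the same information as the 3 x 3 table.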

Click the Rotation button on the toolbar.

A window will appear in which you can rotate the graph and choose the desired perspective.

Use the scroll bar to rotate the graph. Experiment with it a little. First, for example, use the mouse to drag the scroll box to the far left position. You will see the following picture:

Now move the scroll box to the right:

Each time the scroll box moves, the graph rotates. Choose the view that suits you and click the OK button. The desired graph will appear on the screen.

Step 7. Plotting interaction graphs of frequencies. In the Crosstabulation results window, click the Interaction plots of frequencies button. An interaction graph will appear on the screen:


The meaning of this graph is simple: it shows how the frequencies of observations from different groups interact or are related to each other.

All plotted graphs show that women from different families differ in their level of anxiety. Whether this difference is significant is determined by statistical tests.
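The text does not say which test is meant. One common choice for contingency tables is Pearson's chi-square test of independence, and the sketch below applies it, purely as an illustration, to the 2 x 2 counts from the first example (10 women). The function name chi_square is hypothetical, and with expected counts this small the chi-square approximation is rough, so the result should be read as a demonstration of the calculation rather than as a finding.

```python
import math

# Cell counts from the first example: rows are family type
# (two-parent, single-parent), columns are anxiety (low, high).
table = [[3, 2],
         [2, 3]]

def chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_sums[i] * col_sums[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

stat = chi_square(table)

# For a 2 x 2 table there is 1 degree of freedom, and the upper-tail
# p-value of the chi-square distribution equals erfc(sqrt(stat / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

The closed-form p-value used here holds only for 1 degree of freedom; larger tables such as the 3 x 3 table from women2.sta need the general chi-square distribution.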

Along with commercial statistical packages, there is a fairly large number of completely free statistical programs and applications. Some of these free programs are not merely on a par with commercial applications but even surpass them in functionality. Here is a list of the main free programs for statistical data processing.

  • EpiInfo: a free statistical package supported by the US Centers for Disease Control. Its main feature is the ability not only to perform statistical analysis but also to create questionnaires and data-entry forms (including forms for collecting information over the Internet). The latest version also supports integration with Google Maps and visualization of cartographic information. A fairly significant limitation for large amounts of data is that it uses the Microsoft Access format as its database.

  • OpenEpi: a set of statistical functions that lets you quickly apply relatively simple, commonly used statistical tests. OpenEpi can be used online on the developer's website or installed on your computer. The advantages of the package are its functions for calculating statistical power and group sizes and for generating random numbers, as well as the ability to calculate statistical significance from group-level statistics, which is useful when evaluating articles.

  • PSPP: in appearance and functionality it is very similar to SPSS (in fact, the name of the package is a mirror image of SPSS), and it is completely free.

  • SOFA: lets you perform basic statistical tests but not regression analysis. One of the distinctive features of the package is the quick creation of various standard graphs and summary tables that require no formatting, as well as the ability to run custom scripts in Python.

  • SEER-Stat: a free statistical package aimed at applications in oncology, whose development is supported by the US National Cancer Institute. The package has many functions for calculating morbidity, survival, and mortality (including age-standardized indicators).

  • WINPEPI: a program for analyzing epidemiological data. A detailed description of its functionality is available. The same author has created a number of other programs for use in epidemiology.

  • Statistical Analysis for Genetic Epidemiology: a statistical analysis program for geneticists and epidemiologists. It contains many functions for obtaining descriptive statistics, verifying data, quantifying the heritability of a trait or disease, estimating the most likely age of onset of a disease, identifying patterns in the occurrence of individual alleles or single-nucleotide changes, and more.