Summary of free statistical software
This is a review of OpenStat, EasyReg, EpiData, WinIDAMS, Instat, MicrOsiris and Epi Info. This is the information I have so far.

General notes:

1. All programs read .csv files.

2. MicrOsiris and Epi Info read files with blanks for missing values. Stat4U needs an explicit code for missing values, such as -9 or -9.99.
For Stat4U, all variables can share the same missing-value code, e.g., -9.99.
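If the .csv file has blanks for missing values, they can be filled with a code before importing. Here is a minimal Python sketch of that step. The sample data, the column names and the fill_missing helper are made up for illustration, and -9.99 is just the example code mentioned above.

```python
import csv
import io

def fill_missing(rows, code="-9.99"):
    """Replace empty or whitespace-only cells with a missing-value code."""
    return [[cell if cell.strip() else code for cell in row] for row in rows]

# Hypothetical sample data; a real file would be read with open(path, newline="").
sample = "country,gdpcap,birthrate\nA,600,\nB,,12.5\n"
rows = list(csv.reader(io.StringIO(sample)))
header, data = rows[0], fill_missing(rows[1:])

# Write the cleaned rows back out as .csv text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows([header] + data)
result = out.getvalue()
print(result)
```

The header row is left alone so that only data cells are changed.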


1. Import the .csv file, then call up commands.


WinIDAMS:

1. All values of each variable should have the same number of decimal places, so you need to open the file above in Excel and format each variable to, say, 2 decimal places. Also, WinIDAMS can't handle variables with more than 10 digits.

2. For WinIDAMS, each variable has to have a 'missing' indicator. I used -999 or -9, and these have to be clearly defined in the file definition. See the .dic file listed above.
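The same preparation could be done with a short script instead of Excel. The sketch below is only an illustration: format_column is a made-up helper, -999 is one of the codes I used, and counting every digit (including those after the decimal point) toward the 10-digit limit is my assumption about how that limit applies.

```python
def format_column(values, decimals=2, missing_code=-999, max_digits=10):
    """Format one numeric column with a fixed number of decimal places.

    Blank cells become missing_code. A ValueError is raised if a formatted
    value has more than max_digits digits (sign and decimal point are not
    counted; counting the decimals is an assumption).
    """
    formatted = []
    for raw in values:
        number = float(raw) if raw.strip() else float(missing_code)
        text = f"{number:.{decimals}f}"
        digits = sum(ch.isdigit() for ch in text)
        if digits > max_digits:
            raise ValueError(f"{text} has {digits} digits, limit is {max_digits}")
        formatted.append(text)
    return formatted

# Hypothetical column with one blank (missing) cell.
print(format_column(["600", "", "11586.087"]))
```

Every value in the column comes out with the same number of decimals, and the missing code is formatted the same way as the real values.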


OpenStat:

1. OpenStat runs through menus and can read .csv files. It can also use Excel files if you copy from Excel and paste into the data table in OpenStat; you then have to give labels to all variables. I had problems with my data set, but others may have better luck.

2. OpenStat has a great variety of file and variable manipulation functions like selecting cases, splitting, sorting and merging files, defining, recoding, and transforming variables, and importing and exporting files.

3. OpenStat also has a large number of analysis methods, ranging from basic descriptives (e.g., means, frequencies and crosstabs) all the way through many multivariate and nonparametric methods.

4. When you click on a procedure, such as an analysis method or a file manipulation method, a small window at the top of that procedure box briefly explains what the method does.

5. Output for each analysis appears in a separate output window. You can save the output as a separate file.

6. Some examples of analysis, as they appear from OpenStat output.

Mean of GDP Per Capita
DISTRIBUTION PARAMETER ESTIMATES

gdpcap (N = 230)  Sum =    2664800.000
Mean =  11586.087  Variance = 158882949.687  Std.Dev. =  12604.878
Std.Error of Mean =    831.141
0.950 Confidence Interval for mean :   9957.080 to  13215.093
Range =  69300.000  Minimum =    600.000  Maximum =  69900.000
Skewness =      1.645  Std. Error of Skew =      0.160
Kurtosis =      2.976  Std. Error Kurtosis =      0.320
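As a check on how these numbers fit together, the standard error and the confidence interval can be recomputed from N, the mean and the standard deviation. This Python sketch assumes the interval is the mean plus or minus a normal quantile times the standard error. That is my inference from the printed figures, not something taken from the OpenStat documentation.

```python
import math
from statistics import NormalDist

# Values copied from the OpenStat output for gdpcap.
n = 230
mean = 11586.087
std_dev = 12604.878

# Standard error of the mean: standard deviation divided by sqrt(N).
std_error = std_dev / math.sqrt(n)

# Two-sided 95% interval using the normal quantile (about 1.96).
# Using a normal rather than a t quantile is an assumption.
z = NormalDist().inv_cdf(0.975)
lower = mean - z * std_error
upper = mean + z * std_error

print(f"Std.Error of Mean = {std_error:.3f}")
print(f"0.950 Confidence Interval for mean : {lower:.3f} to {upper:.3f}")
```

The recomputed values agree with the printed output to within rounding of the inputs.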

Epi Info:

1. Epi Info runs through menus, but also has a program editor window, which shows you the commands that you've selected, and which you can copy and paste or edit and re-run, as you wish.

2. Epi Info reads .csv files, but you need to read them as delimited text. When the read box opens, click on the file type and select 'other files (*.csv)'. Epi Info also reads .xls files.

3. Epi Info has many basic file and variable manipulation commands, like merging files, writing (and exporting) files, variable definition and recode, and selecting and if options.

4. Epi Info seems to have a limited number of analysis functions, including frequencies, means, summarizing, graphs, two kinds of regression (linear and logistic), survival analysis and proportional hazards. Regression does not offer a choice of variable-selection method, such as forward or stepwise. Epi Info can also use weights when analyzing surveys from simple random (or unbiased systematic) samples. Epi Info does not have factor analysis.

5. Epi Info does have the ability to map results. (I haven't tried this yet.)

6. As far as I can tell, Epi Info doesn't have a menu command for getting means of multiple variables; it seems you can only get the mean of one variable at a time. You can add additional variables as crosstabs or for stratifying. I contacted CDC, and they said they would consider adding the ability to get means for multiple variables in their next version.
    The output for means gives the frequency of each value of the variable.
    Epi Info is the only program reviewed here that doesn't allow means of multiple variables.

7. Epi Info doesn't do correlation. You need to use regression with 2 variables to get the correlation coefficient.

8. Epi Info saves output to an htm file. While you are running Epi Info, all of the output is saved to one htm file.

9. Some examples of analysis, as they appear from Epi Info output.

Mean of GDP Per Capita
  Obs         Total         Mean         Variance      Std Dev
       2664800.0000   11586.0870   158882949.6867   12604.8780
   Minimum        25%     Median         75%     Maximum      Mode
  600.0000  2200.0000  6700.0000  17500.0000  69900.0000  600.0000

Regression of GDP Per Capita on Degrees North, Birthrate and Percent of Labor Force in Agriculture.
Variable     Coefficient  Std Error    F-test   P-Value
agriculture   -30249.510   6224.311   23.6186  0.000002
birthrate       -296.721     87.756   11.4326  0.000863
North            109.541     31.562   12.0458  0.000632
CONSTANT       20700.331   1969.627  110.4553  0.000000
Correlation Coefficient: r^2= 0.43
Source       df    Sum of Squares      Mean Square  F-statistic
Regression    3   14984390921.779   4994796973.926       53.004
Residuals   208   19600627757.466     94233787.296
Total       211   34585018679.245
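
The pieces of this regression output can be cross-checked by hand, which also shows how to get a correlation coefficient out of a two-variable regression (note 7 above). The Python sketch below uses only the numbers printed above. Reading each F-test as the squared ratio of coefficient to standard error is my interpretation of the output, and pearson_r is a made-up helper that only applies when there is a single predictor.

```python
import math

# Sums of squares and degrees of freedom copied from the Epi Info output.
ss_regression, df_regression = 14984390921.779, 3
ss_residual, df_residual = 19600627757.466, 208

# r^2 is the share of the total sum of squares explained by the regression.
r_squared = ss_regression / (ss_regression + ss_residual)

# Overall F-statistic: regression mean square over residual mean square.
f_overall = (ss_regression / df_regression) / (ss_residual / df_residual)

# Per-variable F-test, read here as (coefficient / std error) squared.
coefficients = {
    "agriculture": (-30249.510, 6224.311),
    "birthrate": (-296.721, 87.756),
    "North": (109.541, 31.562),
    "CONSTANT": (20700.331, 1969.627),
}
f_tests = {name: (coef / se) ** 2 for name, (coef, se) in coefficients.items()}

def pearson_r(slope, r2):
    """Correlation from a ONE-predictor regression: sqrt(r^2) with the slope's sign."""
    return math.copysign(math.sqrt(r2), slope)

print(f"r^2 = {r_squared:.2f}, F = {f_overall:.3f}")
for name, f_value in f_tests.items():
    print(f"{name}: F-test = {f_value:.4f}")
```

For example, a regression with one predictor, a negative slope and r^2 = 0.25 corresponds to pearson_r(-2.0, 0.25), which is -0.5. The function should not be used on the three-predictor model above, where the square root of r^2 is a multiple correlation rather than a correlation between two variables.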

Last modified 1/21/08