Free Statistical Tools on the Web


A short version of this article first appeared in the International Statistical Association newsletter, Vol 26, Number 1 (76),  2002, and is at   http://isi.cbs.nl/NLet/NLet021-04.htm   and    http://isi.cbs.nl/FreeTools.htm 


There is a great deal of research methods information available for free on the web.  This includes data sets, general statistical textbooks, email lists, software, and many sites about special topics, such as epidemiology, forecasting, data presentation, data editing, multiple imputation, and propensity score analysis.  This article is a brief review of some useful sites covering these topics.

Several sites offer general collections of links. One of these is the Intute statistics page  http://www.intute.ac.uk/socialsciences/statistics/  which also has sub-pages on demography, international, local, national, official and regional statistics, and statistical theory. The Intute site has a variety of categories, such as data, educational material, government sites, mailing lists and societies. Another general site is Betty Jung's statsites  http://www.bettycjung.net/Statsites.htm

The best place to start for learning about statistics is HyperStatistics Online, at http://davidmlane.com/hyperstat/index.html.  It is the best place because it is both a nice statistics book and a comprehensive list of other on line statistics books.  Most of these are basic to intermediate.  One book, the Statsoft text,  http://www.statsoft.com/textbook/stathome.html  has fairly advanced topics.  Another, Statistics at Square One  http://www.bmj.com/statsbk/  is a fairly introductory book.

Since statistics is difficult to learn and it is not always clear to the general public how statistics may be useful, there are several projects aimed at educating the public. One is the International Statistical Literacy Project   http://www.stat.auckland.ac.nz/~iase/islp/   The mission of this project "is to support, create and participate in statistical literacy activities and promotion around the world."  Also, the International Statistical Institute   http://isi.cbs.nl/   is preparing "Statistics in public life", which looks to be interesting.  The American Statistical Association  http://www.amstat.org/  also links to a variety of resources about teaching, at all grade levels.

There is also a great deal of free software on the net.  The best place to find free statistical software is the Free Statistical Software site at http://statpages.org/javasta2.html.  This site lists general purpose software, as well as software devoted to specific purposes, such as curve fitting, epidemiology, surveys, and programming.  There are also brief descriptions of each package. We also list software packages on our page   http://gsociology.icaap.org/methods/soft.html  along with a list of other sites that list free statistical packages.  One great site about learning how to use statistical software is the Statistical Computing site, at http://www.ats.ucla.edu/stat/default.htm.  It has a large number of links, how-to guides and other material, mostly for commercial packages.  One review of free statistical software is at   http://en.citizendium.org/wiki/Free_statistical_software   which briefly describes the history, quality, functions and limits of a number of free packages.

There are a number of email lists.  Allstat, at   https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=allstat   is a general list, although many of the postings appear to be about jobs or training courses.  Another list, stat-l, at   http://lists.mcgill.ca/archives/stat-l.html  focuses more on statistical questions. Another useful list, not on Allstat, is Epidemio, at  http://www.listes.umontreal.ca/wws/info/epidemio-l   This list is about epidemiology.

There are a number of comprehensive places to look for data.  One is Statistical Resources on the Web  http://www.lib.umich.edu/govdocs/stats.html.  This is a comprehensive guide to data on many topics, including health, demographics, labor, economics, environment, and much more. Another starting point for social, political and economic data is the Global Social Change Research Project  http://gsociology.icaap.org/  which has both links to a very large number of other data link sites, and a page of data sets compiled or created from other data sets. Many of the data sets listed on this project site are public domain. This UN site   http://data.un.org/   has data on nearly every topic, from the UN and its various associates.  This UN site  http://unstats.un.org/unsd/methods/inter-natlinks/sd_natstat.asp  and this BLS site  http://www.bls.gov/bls/other.htm  link to the national statistical centers of most countries of the world.
    Not all data sets are free to use. Some will charge for use, and you generally have to check each one.

There are resources about dozens of specific topics on the web.  Some of these topics include epidemiology, graphical analysis and presentation, missing data, forecasting, gathering data and meta-analysis.

Epidemiology: The two best places to start for epidemiology are EpiMonitor,   http://www.epimonitor.net/index.htm  which has a very comprehensive list of links, and the WWW Virtual Library: Epidemiology,  http://www.epibiostat.ucsf.edu/epidem/epidem.html  another gateway.  Another very good place to start is epidemiolog, at http://www.epidemiolog.net/.  This site also has a fairly comprehensive listing of epidemiology sites, as well as an on-line textbook. First time visitors should start at  http://www.epidemiolog.net/evolving/ .  Another free on-line textbook is Epidemiology for the Uninitiated, at http://www.bmj.com/epidem/epid.html.
    A very good place to find world epidemiological data, reports, issues and information is the WHO site   http://www.who.int/topics/epidemiology/en/   which includes, for example, the 10 leading causes of death and the Weekly Epidemiological Record.
    There are also two interesting sites for learning epidemiology. One is the Epidemiology Supercourse, http://www.pitt.edu/~super1/, which is a set of on line lectures from various epidemiology courses.  These lectures can be downloaded and used, whole or in part, in your own lectures.  The other is the North Carolina Center for Public Health Preparedness Training Website  http://nccphp.sph.unc.edu/training/   which has free on line training for biostatistics, epidemiology, and other topics. You can get a certificate for each class you complete. Each class takes half an hour to an hour.



Graphics: After analyzing data, it is very helpful to know how best to present the results.  Very good sites are:  Informative Presentation of Tables, Graphs and Statistics, at  http://www.rdg.ac.uk/ssc/publications/guides/toptgs.html ,  the Washington Statistical Society Methodology Seminars' Data Presentation: A Guide To Good Graphics   http://www.scs.gmu.edu/~wss/methods/zawitzg.html  and Presenting Data   http://lilt.ilstu.edu/gmklass/pos138/datadisplay/ .  Also, BTS’s Guide to Good Statistical Practice has a useful section on presenting results, at   http://www.bts.gov/publications/guide_to_good_statistical_practice_in_the_transportation_field/index.html .  For some interesting good and bad examples, see the Gallery of Data Visualization, at  http://www.math.yorku.ca/SCS/Gallery/

Missing Data:  Two sites that give overviews of missing data are the University of Texas Statistical Services FAQ page, #25, at   http://www.utexas.edu/its-archive/rc/answers/general/gen25.html   and Cornell's Office of Statistical Computing FAQ page,  http://www.osc.cornell.edu/news/archive.php   specifically FAQ #46 and #47.  One way to deal with missing data is multiple imputation, described at the Multiple Imputation FAQ page, at http://www.stat.psu.edu/~jls/mifaq.html.  Multiple imputation fills in missing data by using other variables to predict the missing values.  This method is also described in a 1999 article, "Multiple imputation: a primer", at Joseph Schafer’s site,   http://www.stat.psu.edu/~jls/index.html.   One software program for estimating missing data is AMELIA, at http://gking.harvard.edu/stats.shtml
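
As a rough illustration of the idea of predicting missing values from other variables, here is a minimal Python sketch. It is not the method used by AMELIA or by Schafer's software, and the data are made up. It assumes one variable (y) with gaps and one fully observed variable (x), predicts each missing y from x with a straight line plus random noise, repeats this several times, and combines the results with Rubin's rules. A full multiple imputation would also draw the regression coefficients at random each time; this sketch leaves that step out.

```python
import random
import statistics

# Made-up example data: x is fully observed, y has two missing values (None).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0]

def fit_line(x_vals, y_vals):
    """Least-squares intercept and slope for y = a + b*x."""
    mx, my = statistics.fmean(x_vals), statistics.fmean(y_vals)
    sxx = sum((x - mx) ** 2 for x in x_vals)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_vals, y_vals))
    slope = sxy / sxx
    return my - slope * mx, slope

def impute_once(x_vals, y_vals, rng):
    """Copy y_vals, replacing each None with a prediction from x plus noise."""
    observed = [(x, y) for x, y in zip(x_vals, y_vals) if y is not None]
    a, b = fit_line(*zip(*observed))
    # Rough spread of the observed points around the fitted line.
    resid_sd = statistics.stdev([y - (a + b * x) for x, y in observed])
    return [y if y is not None else a + b * x + rng.gauss(0, resid_sd)
            for x, y in zip(x_vals, y_vals)]

def pooled_mean(x_vals, y_vals, m=20, seed=0):
    """Impute m times and pool the m estimates of mean(y) with Rubin's rules."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        filled = impute_once(x_vals, y_vals, rng)
        estimates.append(statistics.fmean(filled))
        variances.append(statistics.variance(filled) / len(filled))
    within = statistics.fmean(variances)      # average sampling variance
    between = statistics.variance(estimates)  # variance across imputations
    return statistics.fmean(estimates), within + (1 + 1 / m) * between

estimate, total_variance = pooled_mean(xs, ys)
print("pooled mean of y:", estimate, "total variance:", total_variance)
```

The point of imputing several times rather than once is that the spread between the imputed data sets is added to the variance, so the final answer reflects the uncertainty about the values that were never observed.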

Forecasting: Two faculty members have lectures about forecasting on the web.  These are Bob Nau's class notes on forecasting, at http://www.duke.edu/~rnau/411out00.html, and Hossein Arsham's Time Series Analysis and Forecasting Techniques, at  http://home.ubalt.edu/ntsbarsh/Business-stat/stat-data/Forecast.htm   Another forecasting site is the Federal Forecasters Consortium, at   http://www1.va.gov/vhareorg/ffc.htm   Conference proceedings can be downloaded from this site.
 

Methods of gathering data:  There are a number of sites on gathering data.  Two places to start are Resources for Methods in Evaluation and Social Research, at   http://gsociology.icaap.org/methods/  and The World Wide Evaluation Information Gateway   http://www.policy-evaluation.org/    These sites link to other sites about methods, both quantitative and qualitative.  Some sites are about specific tools in data gathering.  The Statnotes site has a section on survey methods, at   http://faculty.chass.ncsu.edu/garson/PA765/survey.htm    Tom O'Connor's lecture notes, at   http://www.apsu.edu/oconnort/3760/3760lects.htm   cover various issues such as measurement, validity and reliability, and scales and indexes.

Meta-analysis:  There are several introductions to meta-analysis.  One is a BMJ site, Meta-Analysis, Education and debate,   http://www.bmj.com/collections/ma.htm   a collection of chapters describing methods and issues.  Another is an on line book, Meta-Analysis: Methods of Accumulating Results Across Research Domains, by Larry C. Lyons, at   http://www.lyonsmorris.com/MetaA/index.htm   (this link sometimes doesn't work).   One of the Epi Supercourse lectures is about meta-analysis, How to conduct a Meta-Analysis  http://www.pitt.edu/~super1/lecture/lec1171/index.htm   Another site is The Meta Analysis of Research Studies   http://echo.edres.org:8080/meta/   which has an overview and links to documents and resources.

Public education about statistics:  Three papers about how to read papers are: How to read a paper: Statistics for the non-statistician. I: Different types of data need different statistical tests, by Trisha Greenhalgh, BMJ 1997;315:364-366 (9 August), at  http://www.bmj.com/cgi/content/full/315/7104/364 ;  How to read a paper: Statistics for the non-statistician. II: "Significant" relations and their pitfalls, by Trisha Greenhalgh, BMJ 1997;315:422-425 (16 August), at  http://www.bmj.com/cgi/content/full/315/7105/422 ;  and How to read a paper: Papers that go beyond numbers, by Trisha Greenhalgh and Rod Taylor, BMJ 1997;315:740-743 (20 September), at  http://www.bmj.com/cgi/content/full/315/7110/740
    Also, the American Statistical Association has an on line journal, the Journal of Statistics Education, at http://www.amstat.org/publications/jse/  which has free articles about teaching statistics.  The Consortium for the Advancement of Undergraduate Statistics Education, at  http://www.causeweb.org/   also has a great many links to texts, notes, journals, data sets, etc., in particular in the resources section.

Other topics include a paper by Rubin explaining propensity score analysis, at http://www.symposion.com/nrccs/rubin.htm.  Propensity score analysis is a method of dealing with self selection bias.  Also, the Federal Committee on Statistical Methodology, at http://www.fcsm.gov/reports/ , has some interesting papers, especially RL2. Record Linkage Techniques - 1997: Proceedings of an International Workshop and Exposition.  (This is RL2, not RL1.)   Another interesting special topic site is the Centre for Multilevel Modelling, at http://www.cmm.bristol.ac.uk/   One site about data mining is kdnuggets, at http://www.kdnuggets.com/ (a newsletter and a general links site).
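
To show what dealing with self selection bias can look like in practice, here is a small Python sketch of one common way to use propensity scores, inverse probability weighting. It is an illustration only, not necessarily the approach taken in Rubin's paper (matching and stratification are other options). The records are made up: the outcome is 10 times x plus 2 for treated people, so the true treatment effect is 2, and people with x = 1 are more likely to choose treatment. The propensity score is estimated simply as the share of people treated at each value of x, which assumes a single discrete background variable.

```python
from collections import defaultdict

# Made-up records: (background variable x, treated flag, outcome).
# People with x = 1 self-select into treatment more often (6 of 8 versus 2 of 8).
records = (
    [(0, 1, 2.0)] * 2 + [(0, 0, 0.0)] * 6 +
    [(1, 1, 12.0)] * 6 + [(1, 0, 10.0)] * 2
)

def naive_difference(rows):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def propensity_scores(rows):
    """Estimate P(treated | x) as the share of people treated at each x."""
    counts = defaultdict(lambda: [0, 0])  # x -> [number treated, group size]
    for x, t, _ in rows:
        counts[x][0] += t
        counts[x][1] += 1
    return {x: treated / size for x, (treated, size) in counts.items()}

def weighted_difference(rows):
    """Treatment effect estimate using inverse probability weights.

    Assumes every x group contains both treated and untreated people,
    so each score is strictly between 0 and 1.
    """
    score = propensity_scores(rows)
    sums = {1: [0.0, 0.0], 0: [0.0, 0.0]}  # t -> [weighted outcome sum, weight sum]
    for x, t, y in rows:
        weight = 1 / score[x] if t == 1 else 1 / (1 - score[x])
        sums[t][0] += weight * y
        sums[t][1] += weight
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]

print("naive difference:   ", naive_difference(records))
print("weighted difference:", weighted_difference(records))
```

In this made-up data the naive comparison mixes the effect of treatment with the effect of x, while the weighted comparison gives each treatment group the same mix of x values and so comes much closer to the effect built into the data.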

I don't necessarily endorse any of the sites listed here, and do not assume responsibility for content of the web sites listed in this article. This article is solely presented for educational purposes.

 

last updated 6/21/09
last verified  6/21/09
