Free Statistical Tools on the Web


A short version of this article first appeared in the International Statistical Institute newsletter, Vol 26, Number 1 (76), 2002, and is at   http://isi.cbs.nl/NLet/NLet021-04.htm   and   http://isi.cbs.nl/FreeTools.htm


There is a great deal of information about statistics available for free on the web.  This information includes data sets, general statistical textbooks, email lists, software, and many sites about special topics, such as epidemiology, forecasting, data presentation, data editing, multiple imputation, and propensity score analysis.  This article is a brief review of some useful sites covering these topics.  Just one note: there are a number of lectures at pitt.edu. Linking to those sites from here doesn't seem to work, but if you copy and paste the URLs, they will work.

To start with, World Statistics Day  http://unstats.un.org/unsd/wsd/Default.aspx  was recently celebrated, on October 20, 2010. According to the UN, the goal of this day was to "pay tribute to statisticians’ outstanding work in producing and disseminating the necessary data to respond to the every day new challenges and to measure progress in people’s lives." (World Statistics press release, http://unstats.un.org/unsd/wsd/docs/WSD_18Oct2010.pdf .)  This was billed as the first World Statistics Day, so perhaps there will be more.

When looking for statistical information, there are several sites that serve as general collections of links. Two of these are Betty Jung's statsites  http://www.bettycjung.net/Statsites.htm  and statsci   http://www.statsci.org/index.html .  Another is the World Wide Web Virtual Library: Statistics  http://www.stat.ufl.edu/vlib/statistics.html .  Most of this page is about educational institutions, institutes, associations and the like, with one section on statistical resources.

The best place to start for learning about statistics is HyperStatistics Online, at http://davidmlane.com/hyperstat/.  This is a nice statistics book, and it also has a comprehensive list of other on line statistics books.  Most of these are basic to intermediate.  One book, the Statsoft text,  http://www.statsoft.com/textbook/   has the basics as well as fairly advanced topics.  Another, Statistics at square one  http://resources.bmj.com/bmj/readers/statistics-at-square-one/statistics-at-square-one   is a fairly introductory book, but from 1997.  Another approach is Robert Niles' site Statistics Every Writer Should Know   http://www.robertniles.com/stats/   with plain English explanations for many basic statistical concepts.

People can also take free on line training classes on statistics, for example, from the North Carolina Center for Public Health Preparedness Training Web Site, http://cphp.sph.unc.edu/training/index.php, or the University of Minnesota's Midwest Center for Life-Long Learning in Public Health   http://www.sph.umn.edu/ce/mclph/ .  These classes offer a certificate at the end of the training.  StatTrek   http://stattrek.com/   also has a couple of on line tutorials.  Another project, from Claremont Graduate University, is the Web Interface for Statistical Education   http://wise.cgu.edu/   also with some on line tutorials and links to resources. An open course from Carnegie Mellon   http://oli.web.cmu.edu/openlearning/forstudents/freecourses   presents the material used in the course taught at the University.

Also, the American Statistical Association has an on line journal, the Journal of Statistics Education, at http://www.amstat.org/publications/jse/  which has free articles about teaching statistics.  The Consortium for the Advancement of Undergraduate Statistics Education at  http://www.causeweb.org/   is also about teaching statistics and has a great many links to texts, notes, journals, data sets, etc., particularly in the resources section.

Since statistics is difficult to learn and it is not always clear to the general public how statistics may be useful, there is one project aimed at educating the public: the International Statistical Literacy Project  http://www.stat.auckland.ac.nz/~iase/islp/home .  The mission of this project "is to support, create and participate in statistical literacy activities and promotion around the world."  A similar project is Statistical Literacy   http://www.statlit.org/   which is a central resource for events, with links to presentations and other information. A related project is stats.org   http://www.stats.org/   from George Mason University. This project describes basic statistical terms, but the main focus seems to be discussing news stories and how to understand the statistics in those news stories.

Two government websites also try to help the public understand statistics.  The Australian Bureau of Statistics  http://www.abs.gov.au/websitedbs/a3121120.nsf/home/Understanding%20statistics  has an on line class and a page defining statistical terms.  The US National Atlas has a page on Understanding Descriptive Statistics   http://www.nationalatlas.gov/articles/mapping/a_statistics.html .

There are a number of statistical associations.  An international association is the International Statistical Institute  http://isi-web.org/ .  Some other associations are the American Statistical Association   http://www.amstat.org/ ,  the International Chinese Statistical Association   http://www.icsa.org/ ,  and the International Indian Statistical Association  http://www.intindstat.org/ . Statsci has a list of associations  http://www.statsci.org/soc.html  as does the International Statistical Institute  http://isi-web.org/statsoc/nsslist .

There is also a great deal of free software on the net.  The best place to find free statistical software is the Free Statistical Software site at http://statpages.org/javasta2.html.  This site lists general purpose software, as well as software devoted to specific purposes, such as curve fitting, epidemiology, surveys, and programming.  There are also brief descriptions of each package. We also list software packages on our page   http://gsociology.icaap.org/methods/soft.html  along with a list of other sites that list free statistical packages.  One great site about learning how to use statistical software is the Statistical Computing site, at http://www.ats.ucla.edu/stat/default.htm.  They have a large number of links, how-to guides and other material, mostly for commercial packages.  One review of free statistical software is here   http://en.citizendium.org/wiki/Free_statistical_software   which briefly describes the history, quality, functions and limits of a number of free packages.

There are a number of email lists.  Allstat, at   https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=allstat   is a general list, although many of the postings appear to be about jobs or training courses.  Another list, stat-l, at   http://lists.mcgill.ca/archives/stat-l.html  focuses more on statistical questions. Another useful list, not on Allstat, is Epidemio, at  http://www.listes.umontreal.ca/wws/info/epidemio-l .  This list is about epidemiology. Another form of discussion group is the forum. TalkStats   http://www.talkstats.com/   is one forum, with discussions ranging from basic to advanced and from homework to theory.  A smaller forum is from Statistics.com   http://www2.statistics.com/resources/discussionboards/   with only two general categories, statistical methods and homework.

There are a number of comprehensive places to look for data.  One starting point for social, political and economic data is the Global Social Change Research Project  http://gsociology.icaap.org/,  which has both links to a very large number of other data link sites, and a page of data sets compiled or created from other data sets. Many of the data sets listed on this project site are public domain.  All of the data are free to use.  This UN site   http://data.un.org/   has data on nearly every topic, from the UN and its various associates.  The World Bank also has a data page   http://data.worldbank.org/ .  Most of the data on the World Bank site and all of the data on the UN site may be used freely.  This UN site  http://unstats.un.org/unsd/methods/inter-natlinks/sd_natstat.asp  and this BLS site  http://www.bls.gov/bls/other.htm  link to national statistical centers of most countries of the world.

There are a number of statistical journals on the web with free content. Many of these are listed on the statistics page of the Directory of Open Access Journals   http://www.doaj.org/doaj?func=subject&cpid=59 .  Some of the journals listed there include the Latin American Journal of Probability and Mathematical Statistics  http://alea.impa.br/english/index_v7.htm ,  the Electronic Journal of Applied Statistical Analysis  http://siba-ese.unisalento.it/index.php/ejasa/index ,   and the Journal of Official Statistics  http://www.jos.nu/ .

There are resources about dozens of specific topics on the web.  Some of these topics include epidemiology, graphical analysis and presentation, missing data, forecasting, gathering data and meta-analysis.

Epidemiology: The two best places to start for epidemiology are EpiMonitor,   http://www.epimonitor.net/index.htm, which has a very comprehensive list of links, and the WWW Virtual Library: Epidemiology  http://www.epibiostat.ucsf.edu/epidem/epidem.html ,  another gateway.  Another very good place to start is epidemiolog, at http://www.epidemiolog.net/.  This site also has a fairly comprehensive listing of epidemiology sites, as well as an on-line textbook. First time visitors should start at  http://www.epidemiolog.net/evolving/ .  Another free on-line textbook is Epidemiology for the Uninitiated, at   http://resources.bmj.com/bmj/readers/epidemiology-for-the-uninitiated/epidemiology-for-the-uninitiated-fourth-edition   (from 1997).
A very good place to find world epidemiological data, reports, issues and information is the WHO site   http://www.who.int/topics/epidemiology/en/   which includes, for example, the 10 leading causes of death and the Weekly Epidemiological Record.


 

Presenting Results: After analyzing data, it is very helpful to know how best to present the results.  Very good sites are:  Informative Presentation of Tables, Graphs and Statistics, at  http://www.reading.ac.uk/ssc/publications/guides/toptgs.html ;  Washington Statistical Society Methodology Seminars, Data Presentation: A Guide To Good Graphics   http://www.scs.gmu.edu/~wss/methods/zawitzg.html ;  and Presenting Data   http://lilt.ilstu.edu/gmklass/pos138/datadisplay/ .  Also, BTS’s Guide to Good Statistical Practice has a useful section on presenting results, at   http://www.bts.gov/publications/guide_to_good_statistical_practice_in_the_transportation_field/index.html .  For some interesting good and bad examples, see the Gallery of Data Visualization, at   http://www.datavis.ca/gallery/index.php .  More recently, there are sites showing moving charts, like Gapminder   http://www.gapminder.org/   or mapping international data, like Show   http://show.mappingworlds.com/world/ .

Missing Data:  Two sites with overviews of missing data are the University of Texas Statistical Services FAQ page, #25, at   http://www.utexas.edu/its-archive/rc/answers/general/gen25.html   and Professor von Hippel's FAQ page   http://www.sociology.ohio-state.edu/people/ptv/  where he talks about whether data are missing at random or not, and how to deal with the missing data. Also see the first couple of paragraphs of Dr. Howell's page  http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Missing_Data/Missing.html .  One way to deal with missing data is multiple imputation, described at the Multiple Imputation FAQ page, at http://sites.stat.psu.edu/~jls/mifaq.html .  Multiple imputation fills in missing data by using other variables to predict the missing values.  One software program for estimating missing data is AMELIA, at   http://gking.harvard.edu/software/ .
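
To make the idea of multiple imputation a little more concrete, here is a minimal sketch in Python. It is only an illustration and is not how AMELIA or any of the sites above implement it. It assumes one fully observed variable x and one variable y with some missing values (marked None), predicts each missing y from x with a simple straight-line fit, adds random noise so that each filled-in data set differs, and then averages the estimate of interest (here, the mean of y) across the filled-in data sets. The function names and the example data are made up for this sketch, and a full multiple imputation procedure would also account for uncertainty in the fitted line itself.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def impute_once(xs, ys, rng):
    """Return a copy of ys with each None replaced by prediction plus noise."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)]


def multiply_impute_mean(xs, ys, m=5, seed=0):
    """Create m filled-in data sets and pool the mean of y across them."""
    rng = random.Random(seed)
    means = [statistics.fmean(impute_once(xs, ys, rng)) for _ in range(m)]
    return statistics.fmean(means), means


# Hypothetical example data: y is missing for two of the eight cases.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0]
pooled_mean, per_dataset_means = multiply_impute_mean(xs, ys)
print(pooled_mean, per_dataset_means)
```

The spread of the per-data-set means gives a rough sense of how much uncertainty the missing values add, which is the main reason for imputing several times rather than once.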

Forecasting: Two faculty members have lectures about forecasting on the web.  These are Bob Nau's class notes on forecasting at http://web.duke.edu/~rnau/411out00.html, and Hossein Arsham's Time Series Analysis and Forecasting Techniques, at  http://home.ubalt.edu/ntsbarsh/Business-stat/stat-data/Forecast.htm .  Another forecasting site is the Federal Forecasters Consortium, at   http://www1.va.gov/vhareorg/ffc.htm .  Conference proceedings can be downloaded from this site.
 

Methods of gathering data:  There are a number of sites on gathering data.  Two places to start are Resources for Methods in Evaluation and Social Research, at   http://gsociology.icaap.org/methods/  and The World Wide Evaluation Information Gateway   http://www.policy-evaluation.org/ .  These sites link to other sites about methods, quantitative and qualitative.  Some sites are about specific tools in data gathering.  Tom O'Connor's lecture notes, at  http://www.drtomoconnor.com/3760/default.htm  cover various issues such as measurement, validity and reliability, and scales and indexes.

Meta-analysis:  There are several introductions to meta-analysis.  One is an Epi Supercourse lecture, How to conduct a Meta-Analysis  http://www.pitt.edu/~super1/lecture/lec1171/index.htm .  Another is an on line book, Meta-Analysis: Methods of Accumulating Results Across Research Domains, by Larry C. Lyons, at   http://www.lyonsmorris.com/MetaA/index.htm   (this link sometimes doesn't work).

Other topics include propensity score analysis  http://www.epa.gov/caddis/da_advanced_5.html .  Propensity score analysis is a method of dealing with self selection bias.  Robert Pruzek has a paper describing propensity score analysis   http://rmpruzek.com/ .  Another interesting special topic site is the Centre for Multilevel Modelling at  http://www.bristol.ac.uk/cmm/ .  One site about data mining is kdnuggets at http://www.kdnuggets.com/ (a newsletter and a general links-to-links site).

I don't necessarily endorse any of the sites listed here, and do not assume responsibility for content of the web sites listed in this article. This article is solely presented for educational purposes.

 

last updated 9/28/2011
last verified  11/23/2011