Author: Jonathan Friedman


PySurvey: Interactive analysis of survey data 

``PySurvey`` is a `Python <>`__ package for interactive analysis of survey data: counts of occurrence of different categories across a collection of samples.
Specifically, ``PySurvey`` is developed in the context of genomic surveys, such as 16S surveys, where one studies the occurrence of OTUs across samples.
Though much of ``PySurvey``'s functionality is not unique to survey data, and equivalent features are implemented in many other packages, ``PySurvey`` is intended to serve as a 'one-stop shop'. It therefore attempts to include all the methods commonly used in the analysis of genomic survey data (often by wrapping other packages), with sensible default parameters (e.g. distance metrics).

``PySurvey`` is based on the powerful `pandas <>`__ package, which offers rich data structures that are
tailored and optimized for interactive analysis of large data tables.
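
As a rough illustration of the kind of table described above, the sketch below builds a small counts table directly in pandas. It does not use ``PySurvey`` itself; the sample and OTU names, the rows-as-samples orientation, and the prevalence threshold are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data: rows are samples, columns are OTUs,
# values are read counts (orientation is an assumption).
counts = pd.DataFrame(
    {"otu_1": [10, 0, 5], "otu_2": [3, 7, 0], "otu_3": [0, 0, 9]},
    index=["sample_a", "sample_b", "sample_c"],
)

# Per-sample sequencing depth (total counts in each row).
depth = counts.sum(axis=1)

# Number of samples in which each OTU is observed at least once.
prevalence = (counts > 0).sum(axis=0)

# Keep only OTUs observed in at least two samples (threshold is illustrative).
filtered = counts.loc[:, prevalence >= 2]
```

Because the table is an ordinary ``DataFrame``, the usual pandas tools for selection, grouping, and joining with metadata apply to it unchanged.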

``PySurvey`` Resources
- **Documentation:**
- **Source Repository:**

Key Features
  - General utility:
	- Metadata support.
	- Filtering of samples/components.
	- ML and Bayesian estimation of component fractions.

  - Exploratory analysis:
	- Dimension reduction: PCoA.
	- Clustering: hierarchical, Gaussian mixture models (GMM).
	- Compositional correlations via `SparCC <>`__.
	- Plotting: sorted heatmaps, stacked plots, ...

  - Ecological theory:
	- Sample diversities (alpha diversity).
	- Rarefaction.
	- Rank abundance plots.
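
To make a few of the features above concrete, here is a minimal sketch written with plain pandas and NumPy rather than ``PySurvey``'s own API, which is not shown here. Every function name below is hypothetical. The Bayesian estimate assumes a symmetric Dirichlet prior (a pseudocount added to every count), alpha diversity is illustrated with Shannon entropy, and rarefaction is done by sampling reads without replacement; the package may make different choices.

```python
import numpy as np
import pandas as pd

# Hypothetical toy counts: rows are samples, columns are OTUs.
counts = pd.DataFrame(
    {"otu_1": [10, 0, 5], "otu_2": [3, 7, 0], "otu_3": [0, 0, 9]},
    index=["sample_a", "sample_b", "sample_c"],
)


def ml_fractions(counts):
    """Maximum-likelihood fractions: counts divided by the sample total."""
    return counts.div(counts.sum(axis=1), axis=0)


def bayes_fractions(counts, pseudocount=1.0):
    """Posterior-mean fractions under a symmetric Dirichlet prior (assumed)."""
    shifted = counts + pseudocount
    return shifted.div(shifted.sum(axis=1), axis=0)


def shannon(fractions):
    """Shannon entropy (natural log) per sample; zero fractions contribute 0."""
    f = fractions.to_numpy(dtype=float)
    safe = np.where(f > 0, f, 1.0)  # log(1) = 0, so zeros drop out
    return pd.Series(-(f * np.log(safe)).sum(axis=1), index=fractions.index)


def rarefy(sample_counts, depth, rng):
    """Subsample one sample to `depth` reads without replacement."""
    draws = rng.multivariate_hypergeometric(sample_counts.to_numpy(), depth)
    return pd.Series(draws, index=sample_counts.index)


ml = ml_fractions(counts)
bayes = bayes_fractions(counts)
alpha = shannon(ml)

rng = np.random.default_rng(0)
rarefied = rarefy(counts.loc["sample_c"], depth=5, rng=rng)
```

The pseudocount keeps unobserved OTUs at a small nonzero fraction, which matters for log-ratio based methods; the maximum-likelihood estimate assigns them exactly zero.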