fgread-py

General Documentation

To learn how to use the readers, see our FASTGenomics documentation.

Details on our API

For details on the available functions, see the API section.

API

Reading data in FASTGenomics

fgread.ds_info(ds: Optional[str] = None, pretty: bool = None, output: bool = None, data_dir: pathlib.Path = PosixPath('/fastgenomics/data')) → pandas.core.frame.DataFrame[source]

Get information on all available datasets in this analysis.

Parameters:
  • ds (Optional[str], optional) – A single dataset ID or dataset title. If set, only this dataset is displayed; recommended in combination with pretty, by default None
  • pretty (bool, optional) – Whether to display some nicely formatted output, by default True
  • output (bool, optional) – Whether to return a DataFrame or not, by default True
  • data_dir (Path, optional) – Directory containing the datasets, e.g. fastgenomics/data, by default DATA_DIR
Returns:

A pandas DataFrame containing all, or a single dataset (depends on ds)

Return type:

pd.DataFrame
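The effect of ds (show everything, or only the dataset whose ID or title matches) can be pictured with plain Python. This is an illustrative sketch only, not fgread's implementation: the helper filter_datasets, the "id" and "title" keys, and the sample records are hypothetical, and returning an empty list when nothing matches is an assumption.

```python
def filter_datasets(datasets, ds=None):
    """Return all dataset records, or only those whose ID or title equals `ds`."""
    if ds is None:
        # No selector given: keep every dataset
        return list(datasets)
    # A selector may be either the dataset ID or the dataset title
    return [d for d in datasets if ds in (d["id"], d["title"])]


# Hypothetical records standing in for the rows of the returned DataFrame
datasets = [
    {"id": "dataset_0001", "title": "Example A"},
    {"id": "dataset_0002", "title": "Example B"},
]

by_id = filter_datasets(datasets, ds="dataset_0002")
by_title = filter_datasets(datasets, ds="Example A")
everything = filter_datasets(datasets)
```
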

fgread.load_data(ds: Optional[str] = None, data_dir: pathlib.Path = PosixPath('/fastgenomics/data'), additional_readers: dict = {}, expression_file: Optional[str] = None, as_format: Optional[str] = None)[source]

This function loads a single dataset into an AnnData object. If multiple datasets are available, you need to specify one by setting ds to a dataset ID or dataset title. To get an overview of the available datasets, use ds_info().

Parameters:
  • ds (str, optional) – A single dataset ID or dataset title to select a dataset to be loaded. If only one dataset is available you do not need to set this parameter, by default None
  • data_dir (Path, optional) – Directory containing the datasets, e.g. fastgenomics/data, by default DATA_DIR
  • additional_readers (dict, optional) – Used to specify your own readers for a specific dataset format. The dict key must be the file extension (e.g., h5ad) and the dict value a reader function. Still experimental, by default {}
  • expression_file (str, optional) – The name of the expression file to load. Only needed when there are multiple expression files in a dataset, by default None
  • as_format (str, optional) – Specifies which reader should be used for this dataset. Overrides the auto-detection of the format. Possible values are the file extensions of our supported data formats: h5ad, h5, hdf5, loom, rds, csv, tsv.
Returns:

A single AnnData object with the dataset ID in obs and all dataset metadata in uns

Return type:

AnnData Object

Examples

To use a custom reader for files with the extension “.fg”, you have to define a function first:

>>> def my_loader(file):
...     anndata = magic_file_loading(file)
...     return anndata

You can then use this reader like this:

>>> fgread.load_data("my_dataset", additional_readers={"fg": my_loader})
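How additional_readers and as_format interact can be thought of as a lookup keyed by file extension. The sketch below is a simplified model under stated assumptions, not fgread's actual code: pick_reader, DEFAULT_READERS, and the placeholder readers are hypothetical names, and letting user-supplied readers take precedence over built-in ones is an assumption. Only the rule that as_format overrides auto-detection comes from the parameter description above.

```python
from pathlib import Path


# Hypothetical placeholder readers; real readers would return an AnnData object
def read_h5ad(path):
    return ("h5ad", Path(path).name)


def read_loom(path):
    return ("loom", Path(path).name)


DEFAULT_READERS = {"h5ad": read_h5ad, "loom": read_loom}


def pick_reader(file, additional_readers=None, as_format=None):
    """Choose a reader function for `file` based on its extension or `as_format`."""
    # Assumption: user-supplied readers are merged over the defaults
    readers = {**DEFAULT_READERS, **(additional_readers or {})}
    # An explicit as_format replaces extension-based auto-detection
    ext = as_format if as_format is not None else Path(file).suffix.lstrip(".")
    try:
        return readers[ext]
    except KeyError:
        raise ValueError(f"no reader registered for format {ext!r}") from None


def my_loader(file):
    # Stand-in for a custom ".fg" reader
    return ("fg", Path(file).name)


reader = pick_reader("expression.fg", additional_readers={"fg": my_loader})
result = reader("expression.fg")
```

Keeping the mapping keyed by the bare extension (without the leading dot) mirrors the {"fg": my_loader} form used in the example above.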

Readers for supported formats

fgread.readers.read_10xhdf5_to_anndata(ds_file: pathlib.Path)[source]

Reads a dataset in the 10x hdf5 format into the AnnData format.

fgread.readers.read_10xmtx_to_anndata(ds_file: pathlib.Path)[source]

Reads a dataset in the 10x mtx format into the AnnData format.

fgread.readers.read_anndata_to_anndata(ds_file: pathlib.Path)[source]

Reads a dataset in the AnnData format into the AnnData format.

fgread.readers.read_densecsv_to_anndata(ds_file: pathlib.Path)[source]

Reads a dense text file in csv format into the AnnData format.

fgread.readers.read_densemat_to_anndata(ds_file: pathlib.Path, sep=None)[source]

Helper function to read dense text files in tsv and csv format. The separator (tab or comma) is passed by the corresponding function.

fgread.readers.read_densetsv_to_anndata(ds_file: pathlib.Path)[source]

Reads a dense text file in tsv format into the AnnData format.
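The relationship between the csv and tsv readers and their shared helper (one parser, with the separator supplied by a thin wrapper) can be sketched with the standard library alone. This is not the fgread implementation: the function names are hypothetical, the result is a tuple of plain lists rather than an AnnData object, and the file layout (a header row of column names, with row names in the first column) is an assumption.

```python
import csv
import tempfile
from pathlib import Path


def read_dense(ds_file, sep):
    """Parse a dense matrix text file using `sep` as the field separator."""
    with open(ds_file, newline="") as handle:
        rows = list(csv.reader(handle, delimiter=sep))
    header, body = rows[0], rows[1:]
    col_names = header[1:]                  # skip the corner cell
    row_names = [row[0] for row in body]    # first column holds row names
    values = [[float(x) for x in row[1:]] for row in body]
    return row_names, col_names, values


def read_dense_csv(ds_file):
    # The wrapper only fixes the separator
    return read_dense(ds_file, sep=",")


def read_dense_tsv(ds_file):
    return read_dense(ds_file, sep="\t")


# Usage: write a small tab-separated file and read it back
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "expr.tsv"
    path.write_text("id\tc1\tc2\nr1\t1\t0\nr2\t3.5\t2\n")
    row_names, col_names, values = read_dense_tsv(path)
```
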

fgread.readers.read_loom_to_anndata(ds_file: pathlib.Path)[source]

Reads a dataset in the loom format into the AnnData format.

fgread.readers.read_seurat_to_anndata(ds_file: pathlib.Path)[source]

Reads a dataset in the Seurat format into the AnnData format (not implemented).