API

Reading data in FASTGenomics

fgread.ds_info(ds: Optional[str] = None, pretty: bool = None, output: bool = None, data_dir: pathlib.Path = PosixPath('/fastgenomics/data')) → pandas.core.frame.DataFrame

Get information on all available datasets in this analysis.

Parameters:
    ds (Optional[str], optional) – A single dataset ID or dataset title. If set, only this dataset will be displayed. Recommended to use with `pretty`. By default None.
    pretty (bool, optional) – Whether to display some nicely formatted output. By default True.
    output (bool, optional) – Whether to return a DataFrame or not. By default True.
    data_dir (Path, optional) – Directory containing the datasets, e.g. `fastgenomics/data`. By default DATA_DIR.
Returns:	A pandas DataFrame containing all datasets or a single dataset, depending on `ds`
Return type:	pd.DataFrame
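As a minimal sketch of how `ds_info()` might be called from a notebook, the snippet below wraps it in a small helper. The helper name `dataset_overview` is hypothetical, and the guarded import is only there so the snippet also runs where `fgread` is not installed. It uses only the `ds`, `pretty` and `output` parameters documented above.

```python
try:
    import fgread  # available inside a FASTGenomics analysis
except ImportError:
    fgread = None  # lets this sketch run outside FASTGenomics as well


def dataset_overview(ds=None):
    """Hypothetical helper: return the ds_info() DataFrame, or None without fgread."""
    if fgread is None:
        return None
    # pretty=False skips the formatted display; output=True returns the DataFrame.
    return fgread.ds_info(ds=ds, pretty=False, output=True)


overview = dataset_overview()
if overview is not None:
    print(overview.head())
```

Passing a dataset ID or title as `ds` restricts the result to that dataset, as described in the parameter list.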

fgread.load_data(ds: Optional[str] = None, data_dir: pathlib.Path = PosixPath('/fastgenomics/data'), additional_readers: dict = {}, expression_file: Optional[str] = None, as_format: Optional[str] = None)

This function loads a single dataset into an AnnData object. If multiple datasets are available, you need to specify one by setting ds to a dataset ID or dataset title. To get an overview of the available datasets, use ds_info().

Parameters:
    ds (str, optional) – A single dataset ID or dataset title to select the dataset to be loaded. If only one dataset is available, you do not need to set this parameter. By default None.
    data_dir (Path, optional) – Directory containing the datasets, e.g. `fastgenomics/data`. By default DATA_DIR.
    additional_readers (dict, optional) – Used to specify your own readers for a specific dataset format. Each dict key needs to be a file extension (e.g., h5ad) and each dict value a function. Still experimental. By default {}.
    expression_file (str, optional) – The name of the expression file to load. Only needed when there are multiple expression files in a dataset.
    as_format (str, optional) – Specifies which reader should be used for this dataset. Overrides the auto-detection of the format. Possible values are the file extensions of our supported data formats: `h5ad`, `h5`, `hdf5`, `loom`, `rds`, `csv`, `tsv`.
Returns:	A single AnnData object with dataset id in obs and all dataset metadata in uns
Return type:	AnnData Object

Examples

To use a custom reader for files with the extension “.fg”, you have to define a function first:

>>> def my_loader(file):
...     anndata = magic_file_loading(file)
...     return anndata

You can then use this reader like this:

>>> fgread.load_data("my_dataset", additional_readers={"fg": my_loader})
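The interplay of file extension, `additional_readers` and `as_format` can be illustrated with a small, self-contained sketch. This is not fgread's implementation: `DEFAULT_READERS`, `pick_reader` and the string-returning stand-in readers are hypothetical. That custom readers take priority over built-in ones is also an assumption made for illustration.

```python
from pathlib import Path

# Hypothetical built-in table; the real fgread mapping may differ.
DEFAULT_READERS = {
    "h5ad": lambda path: f"anndata reader <- {path.name}",
    "loom": lambda path: f"loom reader <- {path.name}",
    "csv": lambda path: f"dense csv reader <- {path.name}",
}


def pick_reader(ds_file, additional_readers=None, as_format=None):
    """Choose a reader by file extension; custom readers win (an assumption)."""
    readers = {**DEFAULT_READERS, **(additional_readers or {})}
    # as_format, when given, replaces the extension-based auto-detection.
    fmt = as_format or Path(ds_file).suffix.lstrip(".")
    try:
        return readers[fmt]
    except KeyError:
        raise ValueError(f"No reader registered for format {fmt!r}") from None


def my_loader(file):
    # Stand-in for a real loader that would return an AnnData object.
    return f"custom reader <- {Path(file).name}"


reader = pick_reader(Path("expression.fg"), additional_readers={"fg": my_loader})
print(reader(Path("expression.fg")))  # custom reader <- expression.fg
```

The sketch defaults `additional_readers` to `None` rather than `{}` to avoid sharing one mutable default between calls. This is a design choice of the sketch, not a statement about fgread.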

Readers for supported formats

fgread.readers.read_10xhdf5_to_anndata(ds_file: pathlib.Path): Reads a dataset in the 10x hdf5 format into the AnnData format.

fgread.readers.read_10xmtx_to_anndata(ds_file: pathlib.Path): Reads a dataset in the 10x mtx format into the AnnData format.

fgread.readers.read_anndata_to_anndata(ds_file: pathlib.Path): Reads a dataset in the AnnData format into the AnnData format.

fgread.readers.read_densecsv_to_anndata(ds_file: pathlib.Path): Reads a dense text file in csv format into the AnnData format.

fgread.readers.read_densemat_to_anndata(ds_file: pathlib.Path, sep=None): Helper function to read dense text files in tsv and csv format. The separator (tab or comma) is passed by the corresponding function.

fgread.readers.read_densetsv_to_anndata(ds_file: pathlib.Path): Reads a dense text file in tsv format into the AnnData format.

fgread.readers.read_loom_to_anndata(ds_file: pathlib.Path): Reads a dataset in the loom format into the AnnData format.

fgread.readers.read_seurat_to_anndata(ds_file: pathlib.Path): Reads a dataset in the Seurat format into the AnnData format (not implemented).
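The csv and tsv readers listed above share one helper and differ only in the separator they pass to it. The following sketch shows that delegation pattern with the standard library only. The names `read_dense`, `read_dense_csv` and `read_dense_tsv` are hypothetical, and the sketch returns plain lists of rows instead of an AnnData object, so it should not be read as fgread's actual code.

```python
import csv
import tempfile
from functools import partial
from pathlib import Path


def read_dense(ds_file, sep=None):
    """Parse a delimited text file into a list of rows (stand-in for AnnData)."""
    if sep is None:
        raise ValueError("sep must be given, e.g. ',' or a tab character")
    with Path(ds_file).open(newline="") as handle:
        return list(csv.reader(handle, delimiter=sep))


# The format-specific readers only fix the separator.
read_dense_csv = partial(read_dense, sep=",")
read_dense_tsv = partial(read_dense, sep="\t")

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "counts.tsv"
    path.write_text("gene\tcell1\tcell2\nA\t1\t0\n")
    print(read_dense_tsv(path))  # [['gene', 'cell1', 'cell2'], ['A', '1', '0']]
```

Keeping the parsing in one helper means a fix to the dense reader applies to both formats at once.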