TCGA bulk data preprocessing with omicverse
Community
Preprocess TCGA data for survival analysis.
Category: Education & Research
Tags: #bioinformatics #bulk RNA-seq #clinical data #preprocessing #cancer genomics #omicverse #survival analysis #TCGA
Author: Starlitnightly
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Working with TCGA data means navigating complex file structures, integrating several data types (expression archives, clinical records, sample sheets), and preparing the combined data for survival analysis. This Skill automates the preprocessing and the survival analysis setup, replacing steps that are otherwise done by hand.
Core Features & Use Cases
- Automated Data Ingestion: Ingest TCGA sample sheets, expression archives, and clinical information into a unified AnnData object.
- Metadata Initialization: Initialize AnnData objects with raw counts, FPKM, and TPM layers, and attach patient clinical data.
- Survival Analysis Setup: Automatically prepare and integrate survival attributes for downstream analyses.
- Gene-Level Survival Analysis: Plot and analyze gene-level survival curves using DESeq-normalized counts.
- Use Case: Load a TCGA ovarian cancer dataset, preprocess all expression and clinical files, then perform survival analysis for a specific gene like MYC, and export the fully annotated AnnData object for further research.
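
The gene-level survival step listed above can be illustrated with a minimal, self-contained sketch. It does not call omicverse and is not the Skill's implementation: the toy tables, the column names (`sample_id`, `case_id`, `days`, `dead`), the median split, and the hand-written `kaplan_meier` helper are all assumptions made for illustration. It links samples to patients, splits them into high and low MYC expression groups, and computes a Kaplan-Meier estimate for each group.

```python
import numpy as np
import pandas as pd


def kaplan_meier(time, event):
    """Kaplan-Meier survival probabilities, indexed by distinct event time.

    time  : follow-up time per patient
    event : True/1 if the event (death) was observed, False/0 if censored
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    out = {}
    for t in np.unique(time[event]):
        at_risk = np.sum(time >= t)              # still under observation at t
        deaths = np.sum((time == t) & event)     # events occurring exactly at t
        surv *= 1.0 - deaths / at_risk
        out[float(t)] = surv
    return pd.Series(out, name="survival", dtype=float)


# Hypothetical toy inputs standing in for the three TCGA sources.
sample_sheet = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3", "S4", "S5", "S6"],
    "case_id":   ["P1", "P2", "P3", "P4", "P5", "P6"],
})
clinical = pd.DataFrame({
    "case_id": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "days":    [300, 100, 500, 200, 400, 250],
    "dead":    [1, 1, 0, 1, 0, 0],
})
expression = pd.DataFrame(
    {"MYC": [5.0, 9.0, 2.0, 8.0, 1.0, 7.0]},
    index=["S1", "S2", "S3", "S4", "S5", "S6"],
)

# Attach clinical data to each sample, then add the gene's expression.
table = (
    sample_sheet.merge(clinical, on="case_id", how="inner")
    .set_index("sample_id")
    .join(expression)
)

# Split samples at the median expression of the gene of interest.
high = table["MYC"] > table["MYC"].median()
km_high = kaplan_meier(table.loc[high, "days"], table.loc[high, "dead"])
km_low = kaplan_meier(table.loc[~high, "days"], table.loc[~high, "dead"])

print("High MYC:", km_high.to_dict())
print("Low MYC:", km_low.to_dict())
```

A median split is only one possible grouping rule; other cutoffs (quartiles, an optimized threshold) would slot into the same structure by changing how `high` is defined.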
Quick Start
Load my TCGA OV dataset, initialize clinical metadata, and plot the survival curve for the gene 'MYC'.
Dependency Matrix
Required Modules
omicverse, scanpy, pandas, matplotlib
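
Before running the Skill, it can help to confirm that the required modules are importable in the active environment. The snippet below is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of the Skill.

```python
from importlib.util import find_spec

# Modules named in the dependency matrix.
REQUIRED = ["omicverse", "scanpy", "pandas", "matplotlib"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.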
Components
references
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: TCGA bulk data preprocessing with omicverse
Download link: https://github.com/Starlitnightly/omicverse/archive/main.zip#tcga-bulk-data-preprocessing-with-omicverse
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.