TCGA bulk data preprocessing with omicverse

Community

Preprocess TCGA data for survival analysis.

Author: Starlitnightly
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Working with TCGA data means navigating complex file structures, integrating several data types (expression, clinical, sample sheets), and preparing the result for survival analysis. This Skill automates the preprocessing and the survival analysis setup, which removes most of the manual work.

Core Features & Use Cases

  • Automated Data Ingestion: Ingest TCGA sample sheets, expression archives, and clinical information into a unified AnnData object.
  • Metadata Initialization: Initialize AnnData objects with raw counts, FPKM, and TPM layers, and attach patient clinical data.
  • Survival Analysis Setup: Automatically prepare and integrate survival attributes for downstream analyses.
  • Gene-Level Survival Analysis: Plot and analyze gene-level survival curves using DESeq-normalized counts.
  • Use Case: Load a TCGA ovarian cancer dataset, preprocess all expression and clinical files, then perform survival analysis for a specific gene like MYC, and export the fully annotated AnnData object for further research.
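To make the gene-level survival feature concrete, here is a minimal sketch of the underlying idea: split patients into high and low expression groups at the median, then compute a Kaplan-Meier estimate for each group. It uses only the Python standard library and synthetic values. The patient IDs, expression values, and helper names (`kaplan_meier`, `split_by_median`) are illustrative assumptions, not omicverse's implementation, and the median split is an assumed grouping rule.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time where an event occurred.

    times  -- follow-up time per patient (e.g. days)
    events -- True if the patient died at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ev in zip(times, events) if ti == t and ev)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def split_by_median(expression):
    """Split sample IDs into (high, low) groups around the median expression."""
    cutoff = median(expression.values())
    high = [s for s, v in expression.items() if v > cutoff]
    low = [s for s, v in expression.items() if v <= cutoff]
    return high, low


# Synthetic data for one gene: normalized expression, follow-up days, vital status.
expression = {"P1": 12.0, "P2": 3.5, "P3": 9.1, "P4": 1.2, "P5": 7.7, "P6": 4.0}
days = {"P1": 200, "P2": 900, "P3": 350, "P4": 1200, "P5": 500, "P6": 800}
dead = {"P1": True, "P2": False, "P3": True, "P4": True, "P5": False, "P6": True}

high, low = split_by_median(expression)
km_high = kaplan_meier([days[s] for s in high], [dead[s] for s in high])
km_low = kaplan_meier([days[s] for s in low], [dead[s] for s in low])

for label, curve in (("high", km_high), ("low", km_low)):
    for t, s in curve:
        print(f"{label}-expression group: day {t}, survival {s:.3f}")
```

In a real analysis the two curves would be plotted as step functions and compared with a log-rank test; the Skill is described as handling that on DESeq-normalized counts.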

Quick Start

Load my TCGA OV dataset, initialize clinical metadata, and plot the survival curve for the gene 'MYC'.
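The request above roughly corresponds to the following call sequence, shown as a hedged sketch. The method names (`pyTCGA`, `adata_init`, `adata_meta_init`, `survial_init`, `survival_analysis`), including their unusual spellings, are written as recalled from omicverse's TCGA tutorial and should be checked against the installed version. The wrapper `run_tcga_survival` and its parameters are hypothetical, and the three paths must point to your own GDC sample sheet, download directory, and clinical cart directory.

```python
def run_tcga_survival(sample_sheet, download_dir, clinical_dir,
                      gene="MYC", out_path=None):
    """Hypothetical wrapper around omicverse's TCGA workflow (API names assumed)."""
    # Imported lazily so that defining this function does not require omicverse.
    import omicverse as ov

    # Combine the sample sheet, expression archive, and clinical files.
    tcga = ov.bulk.pyTCGA(sample_sheet, download_dir, clinical_dir)
    tcga.adata_init()        # build AnnData with raw count, FPKM, and TPM layers
    tcga.adata_meta_init()   # attach patient clinical metadata
    tcga.survial_init()      # prepare survival attributes (spelling as in the API)

    # Plot the survival curve for one gene on DESeq-normalized counts.
    tcga.survival_analysis(gene, layer="deseq_normalize", plot=True)

    # Optionally export the annotated AnnData object for later use.
    if out_path is not None:
        tcga.adata.write_h5ad(out_path, compression="gzip")
    return tcga
```

A call would then pass the three locations from your own GDC download and, if desired, an `.h5ad` output path.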

Dependency Matrix

Required Modules

omicverse, scanpy, pandas, matplotlib

Components

references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: TCGA bulk data preprocessing with omicverse
Download link: https://github.com/Starlitnightly/omicverse/archive/main.zip#tcga-bulk-data-preprocessing-with-omicverse

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.