Fig. 1

scRNA-seq data processing and analysis pipeline using natural language processing (NLP). (A) Construction of a gene similarity network from gene expression profiles, and generation of gene sequences by random walks on that network. (B) Conversion of gene sequences to gene vectors using the word2vec model, and calculation of cell vectors by aggregating gene vectors weighted by gene expression levels. (C) Downstream applications of cell vectors, including visualization, gene perturbation analysis, and tissue network structure analysis.
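
Panels (A) and (B) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the caption does not state the similarity measure, the edge rule, or the exact aggregation, so Pearson correlation, a fixed `threshold`, and a normalized expression-weighted average are assumptions made here. The function names and toy data are hypothetical. The word2vec step is not run; `gene_vecs` holds hand-picked stand-in vectors where trained embeddings of the random-walk sequences would go.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def build_network(expr, threshold=0.5):
    """(A) Link two genes when their profiles correlate at or above `threshold`.

    Assumption: the similarity measure and cutoff are illustrative choices.
    """
    genes = list(expr)
    adj = {g: [] for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(expr[g], expr[h]) >= threshold:
                adj[g].append(h)
                adj[h].append(g)
    return adj


def random_walk(adj, start, length, rng):
    """(A) Unbiased random walk; the visited genes form one gene 'sentence'."""
    walk = [start]
    while len(walk) < length and adj[walk[-1]]:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def cell_vector(cell_expr, gene_vecs):
    """(B) Expression-weighted average of gene vectors for one cell.

    Assumption: weights are normalized to sum to 1.
    """
    dim = len(next(iter(gene_vecs.values())))
    total = sum(cell_expr.get(g, 0.0) for g in gene_vecs)
    if total == 0:
        return [0.0] * dim
    return [
        sum(cell_expr.get(g, 0.0) * v[d] for g, v in gene_vecs.items()) / total
        for d in range(dim)
    ]


# Toy data: three genes measured in four cells (hypothetical values).
expr = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [4, 3, 2, 1]}
adj = build_network(expr)
walk = random_walk(adj, "G1", 5, random.Random(0))

# Stand-in gene vectors; in the pipeline these would come from word2vec.
gene_vecs = {"G1": [1.0, 0.0], "G2": [0.0, 1.0], "G3": [1.0, 1.0]}
vec = cell_vector({"G1": 1.0, "G2": 3.0}, gene_vecs)
```

In this toy example G1 and G2 are perfectly correlated and therefore linked, while G3 is anti-correlated with both and stays isolated, so a walk starting at G1 alternates between G1 and G2. The cell vector is pulled toward the vector of G2 because that gene is expressed three times as strongly as G1 in the example cell.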