Tumour evolution and microenvironment interactions in 2D and 3D space


  All samples were collected with informed consent at Washington University School of Medicine in St. Louis. Samples from BRCA, PDAC, CRC, CHOL, RCC and UCEC were collected during surgical resection and verified by standard pathology (Institutional Review Board protocols 201108117, 201411135 and 202106166). After verification, a 1.5 × 1.5 × 0.5 cm3 portion was removed, photographed, weighed and measured. Each portion was then subdivided into 6–9 pieces, which were further subdivided into 4 cross-sections. The four parts were then separately placed into formalin, flash-frozen in liquid nitrogen, placed in DMEM, or snap-frozen before embedding in OCT. The choice of grid processing over punch sampling was based on practicality, as it minimizes leftover tissue. Related protocols can be found at protocols.io (https://doi.org/10.17504/protocols.io.bszynf7w)46.

  OCT-embedded or FFPE tissue samples were sectioned and placed on Spatial Gene Expression slides following the Visium tissue preparation guide. Samples used for serial sectioning were sectioned and collected at intervals ranging from 5 to 100 μm. When serial sectioning was performed, the first section was named U1, followed by U2, U3 and so on. Selected sections were loaded onto the capture slides and the distance between each section was recorded. For OCT samples, detailed methods have been described in a previous publication10. In brief, fresh tissue samples were coated with OCT, free of air bubbles, at room temperature. After RNA quality checks of the OCT-embedded tissue samples and H&E staining, blocks were scored to a suitable size to fit the capture area and sectioned at 10 μm. Sections were then fixed in methanol, stained with H&E, and imaged at ×20 magnification using the bright-field imaging settings on a Leica DMi8 microscope. Tissue samples were then permeabilized for 18 min and ST libraries were constructed following the Visium Spatial Gene Expression Reagent Kits user guide CG000239 Rev A (10x Genomics). cDNA was reverse transcribed from polyadenylated messenger RNA captured using primers on the slides. Next, second strands were synthesized from the first strands and denatured. Free cDNA was then transferred from the slides to tubes for further amplification and library construction. Libraries were sequenced on the S4 flow cell of an Illumina NovaSeq 6000 system. For FFPE samples, detailed methods have been described in a previous publication47. In brief, RNA quality was assessed by evaluating the DV200 of RNA extracted from FFPE tissue sections according to the Qiagen RNeasy FFPE kit protocol, followed by the tissue adhesion test described in the 10x Genomics protocol. Sections (5 μm) were placed on Spatial Gene Expression slides following the Visium tissue preparation guide (10x Genomics, CG000408 Rev A). After drying overnight, slides were incubated at 60 °C for 2 h. Deparaffinization, H&E staining, imaging and decrosslinking were then performed following the Visium Spatial Gene Expression for FFPE protocol (10x Genomics, CG000409 Rev A). Sections were stained with H&E and imaged at ×20 magnification using the bright-field imaging settings on a Leica DMi8 microscope. Decrosslinking was performed immediately after H&E staining of the stained sections. Next, human whole-transcriptome probe panels were added to the tissue. After these probes hybridized to target genes and ligated to one another, the ligation products were released after RNase treatment and permeabilization. Ligated probes were then hybridized to the spatially barcoded oligonucleotides on the capture areas. ST libraries were generated from the probes and sequenced on the S4 flow cell of an Illumina NovaSeq 6000 system. Related protocols can be found at protocols.io (https://doi.org/10.17504/protocols.io.x54v9d33e/v1 and https://doi.org/10.17504/protocols.io.kxygx95ezg8j/v1).

  Carrier-free monoclonal or polyclonal anti-human antibodies (Supplementary Table 3) were purchased and validated using immunofluorescence (IF) in multiple channels. After screening, antibodies were conjugated to barcodes (Akoya Biosciences) using the Akoya antibody conjugation kit (Akoya Biosciences, SKU 7000009). Several common markers were purchased directly from Akoya Biosciences. CODEX staining and imaging were performed according to the manufacturer's instructions (CODEX user manual, Rev C). In brief, 5-μm FFPE sections were placed on coated coverslips (Sigma, 440140) and baked at 60 °C overnight before deparaffinization. The next day, tissues were incubated in xylene, rehydrated in ethanol, and washed in ddH2O before antigen retrieval with TE buffer, pH 9 (10-0046) in boiling water in a rice cooker for 10 min. Tissue samples were then blocked using blocking buffer (CODEX staining kit, SKU 7000008) and stained with the marker antibody panel for 3 h at room temperature. The dilution factor of each antibody is provided in the CODEX cycle information sheet (Supplementary Table 3). Imaging of CODEX multicycle experiments was performed using a Keyence fluorescence microscope (model BZ-X810) equipped with a Nikon CFI Plan Apo λ ×20/0.75 objective, a CODEX instrument (Akoya Biosciences) and CODEX Instrument Manager (Akoya Biosciences). Raw images were then stitched and processed using the CODEX Processor (Akoya Biosciences). After multiplex imaging was complete, H&E staining was performed on the same tissue. The staining quality of each antibody in CODEX is shown as a single channel in green, with DAPI in blue, in Supplementary Figs. 10 and 11.

  Approximately 20–30 mg of flash-frozen tissue in cryovials or 200 μm of OCT-embedded tissue was retrieved from each sample and aliquoted for nuclei preparation for the Next GEM Single Cell Multiome ATAC + Gene Expression kit or the Next GEM Single Cell 3′ Kit v.3.1. Samples were resuspended in lysis buffer (10 mM Tris-HCl (pH 7.4) (Thermo, 15567027), 10 mM NaCl (Thermo, AM9759), 3 mM MgCl2 (Thermo, AM9530G), 0.10% NP-40 substitute (% v/v) (646563), 1% stock BSA solution (% v/v) (Miltenyi, 130-091-376) and nuclease-free water (Invitrogen, AM9937), plus 0.1 U μl−1 RNase inhibitor), homogenized by douncing, and filtered through a 40-μm cell strainer before washing in BSA wash buffer (2% BSA, 1× PBS and RNase inhibitor). The filtrate was collected and centrifuged at 500g for 6 min at 4 °C. Nuclei pellets were then resuspended in BSA wash buffer with RNase inhibitor, stained with 7-AAD, and purified and sorted by FACS. Related protocols can be found at protocols.io (https://doi.org/10.17504/protocols.io.14egn7w6zv5/v1 and https://doi.org/10.17504/protocols.io.261gednx7v47/v1)50,51.

  Approximately 15–100 mg of each tumour was cut into small pieces using a blade. Enzymes and reagents from the human tumour dissociation kit (Miltenyi Biotec, 130-095-929) were added to the tumour tissue along with 1.75 ml of DMEM. The resulting suspension was loaded into a gentleMACS C-tube (Miltenyi Biotec, 130-093-237) and dissociated on a gentleMACS Octo Dissociator with Heaters (Miltenyi Biotec, 130-096-427). After running the heated dissociation programme (37h_TDK_1) for 30–60 min, samples were removed from the dissociator and filtered through a 40-μm mini strainer (PluriSelect, no. 43-10040-60) or 40-μm nylon mesh (Fisher Scientific, 22-363-547) into a 15-ml conical tube. Samples were then spun down at 400g for 5 min at 4 °C. After removal of the supernatant, the cell pellet was resuspended in 200 μl to 3 ml of ACK lysis solution (Thermo Fisher, A1049201) for 1–5 min when a red pellet was visible. To quench the reaction, 10 ml of PBS (Corning, 21-040-CM) with 0.5% BSA (Miltenyi Biotec, 130-091-376) was added, and samples were spun at 400g for 5 min at 4 °C. After removal of the supernatant, cells were resuspended in 1 ml of PBS with 0.5% BSA, and live and dead cells were visualized using trypan blue. Finally, samples were spun down at 400g for 5 min at 4 °C and resuspended in 500 μl to 1 ml of PBS with 0.5% BSA to a final concentration of 700–1,500 cells per μl. The protocol is available at protocols.io (https://doi.org/10.17504/protocols.io.bsnqnddw)52.

  Nuclei and cells were partitioned with barcoded beads in oil droplets using the 10x Genomics Chromium instrument. Single-nuclei suspensions were counted and adjusted to a range of 500–1,800 nuclei per μl using a haemocytometer. Reverse transcription was subsequently performed to incorporate cell- and transcript-specific barcodes. All snRNA-seq samples were run using the Chromium Next GEM Single Cell 3′ Library and Gel Bead Kit v.3.1 (10x Genomics). For Multiome kits, the Chromium Next GEM Single Cell Multiome ATAC + Gene Expression kit (10x Genomics) was used. Nuclei were then subjected to downstream protocols by 10x (Next GEM Single Cell Multiome ATAC + Gene Expression: https://cdn.10xgenomics.com/image/upload/v1666737555/support-documents/CG000338_ChromiumNextGEM_Multiome_ATAC_GEX_User_Guide_RevF.pdf; Next GEM Single Cell 3′ Kit v.3.1: https://support.10xgenomics.com/single-cell-gene-expression/library-prep/doc/user-guide-chromium-single-cell-3-reagent-kits-user-guide-v31-chemistry-dual-index). Single-cell suspensions were subjected to the Next GEM Single Cell 3′ Kit v.3.1 protocol. Barcoded libraries were then pooled and sequenced on an Illumina NovaSeq 6000 system with associated flow cells.

  Tumour tissue samples were obtained from surgically resected specimens. After removal for fresh single-cell preparation, the remaining samples were snap-frozen in liquid nitrogen and stored at −80 °C. Before bulk DNA extraction, samples were cryopulverized (Covaris) and aliquoted for bulk extraction. Genomic DNA was extracted from tissue samples with the DNeasy Blood and Tissue kit (Qiagen, 69504) or the QIAamp DNA Mini kit (Qiagen, 51304). Genomic germline DNA was purified from cryopreserved peripheral blood mononuclear cells using the QIAamp DNA Mini kit (Qiagen, 51304) according to the manufacturer's instructions (Qiagen). DNA quantity was assessed by fluorometry using the Qubit dsDNA HS assay (Q32854) according to the manufacturer's instructions (Thermo Fisher Scientific). The protocol is available at protocols.io (https://doi.org/10.17504/protocols.io.bsnhndb6)53.

  Approximately 100–250 ng of genomic DNA was fragmented on a Covaris LE220 instrument targeting 250-bp inserts. Automated dual-indexed libraries were constructed using the KAPA Hyper library prep kit (Roche) on a SciClone NGS platform (Perkin Elmer). Up to ten libraries were pooled at an equimolar ratio before hybrid capture targeting a 5-μg library pool. Library pools were hybridized using the xGen Exome Research Panel v.1.0 reagent (IDT Technologies), which spans a 39-Mb target region (19,396 genes) of the human genome. Libraries were hybridized at 65 °C for 16–18 h, followed by stringent washes to remove spuriously hybridized library fragments. Enriched library fragments were eluted and PCR cycle optimization was performed to prevent over-amplification. Enriched libraries were amplified with KAPA HiFi master mix (Roche) before sequencing. The concentration of each captured library pool was determined by qPCR using the KAPA library quantification kit according to the manufacturer's protocol (Roche) to produce cluster counts appropriate for the Illumina NovaSeq 6000 instrument. Next, 2 × 150 paired-end reads were generated, targeting 12 Gb of sequence to achieve 100× coverage per library.

  FFPE blocks were sectioned at 5 μm and placed on Xenium slides following the FFPE tissue preparation guide (10x Genomics, CG000578, Rev B). The slides underwent a series of xylene and ethanol washes for deparaffinization, followed by decrosslinking using the FFPE tissue enhancer, as outlined (10x Genomics, CG000580, Rev B). Overnight in situ probe hybridization was performed using 379 probes from the Xenium human multi-tissue panel (10x Genomics, 1000626) plus 100 additional custom probes (Supplementary Table 6). After hybridized probes were ligated, samples underwent rolling-circle amplification, and background was quenched using an autofluorescence mixture. Nuclei were stained with DAPI to improve sample tracking and approximate cell boundaries (10x Genomics, CG000582, Rev D). The samples, along with buffers and decoding consumables, were loaded into the Xenium Analyzer (10x Genomics, 1000481). Runs were initialized using the provided guidance (10x Genomics, CG000584, Rev C). Fluorescent reporters hybridized to the targeted complementary regions of the barcoded circularized cDNA. After run completion, H&E staining was performed on the same regions.

  All data analyses were performed in R and Python environments. Details of specific functions and libraries are provided in the relevant methods sections above. Significance was determined using Wilcoxon rank-sum tests, proportion tests, hypergeometric tests or Pearson correlation tests. P values < 0.05 were considered significant. Details of statistical tests are provided in the figure legends and the relevant methods sections.

  FASTQ files were preprocessed using trimGalore (v.0.6.7; with parameters: --length 36 and all other parameters set to default; https://github.com/FelixKrueger/TrimGalore). FASTQ files were then aligned to the GDC’s GRCh38 human reference genome (GRCh38.d1.vd1) using BWA-mem (v.0.7.17) with parameter -M and all others set to default. The output SAM file was converted to a BAM file using the samtools (https://github.com/samtools/samtools; v.1.14) view with parameters -Shb, and all others set to default. BAM files were sorted and duplicates were marked using Picard (v.2.6.26) SortSam tool with the following parameters: CREATE_INDEX=true, SORT_ORDER=coordinate, VALIDATION_STRINGENCY=STRICT, and all others set to default; and MarkDuplicates with parameter REMOVE_DUPLICATES=true, and all others set to default. The final BAM files were then indexed using the samtools (v.1.14) index with all parameters set to default.

  Somatic mutations were called from WES data using the Somaticwrapper pipeline (v.2.2; https://github.com/ding-lab/somaticwrapper), which includes four different callers: Strelka (v.2.9.10)54, MUTECT (v.1.1.7)55, VarScan (v.2.3.8)56 and Pindel (v.0.2.5)57. We kept exonic single nucleotide variants (SNVs) called by any two callers among MUTECT (v.1.1.7), VarScan (v.2.3.8) and Strelka (v.2.9.10) and insertions and deletions (indels) called by any two callers among VarScan (v.2.3.8), Strelka (v.2.9.10) and Pindel (v.0.2.5). For the merged SNVs and indels, we applied a 14× and 8× minimal coverage cut-off for tumour and normal tissue, respectively. We also filtered SNVs and indels by a minimal VAF of 0.05 in tumours and a maximal VAF of 0.02 in normal samples. We also filtered any SNV within 10 bp of an indel found in the same tumour sample. Finally, we rescued the rare mutations with VAFs within 0.015 and 0.05 based on an established gene consensus list58,59. In a downstream step, we used Somaticwrapper to combine adjacent SNVs into double-nucleotide polymorphisms using COCOON (https://github.com/ding-lab/COCOONS), as reported in a previous study60.
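The consensus and coverage/VAF filters above can be sketched in a few lines. This is an illustrative reimplementation, not the actual Somaticwrapper code; the function name and record fields are hypothetical.

```python
# Hypothetical sketch of the two-caller consensus and coverage/VAF filters
# for exonic SNVs described above; not the Somaticwrapper implementation.

def keep_snv(callers, tumour_depth, normal_depth, tumour_vaf, normal_vaf):
    """Apply the consensus and coverage/VAF cut-offs to one exonic SNV."""
    snv_callers = {"mutect", "varscan", "strelka"}
    if len(snv_callers & set(callers)) < 2:      # any two of the three callers
        return False
    if tumour_depth < 14 or normal_depth < 8:    # 14x tumour, 8x normal coverage
        return False
    return tumour_vaf >= 0.05 and normal_vaf <= 0.02

print(keep_snv({"mutect", "strelka"}, 60, 30, 0.12, 0.0))   # True
print(keep_snv({"pindel"}, 60, 30, 0.12, 0.0))              # False
```

Indels follow the same shape with the VarScan/Strelka/Pindel caller set, and the rescue of known driver mutations at VAF 0.015–0.05 would be an additional branch keyed on the gene consensus list.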

  We applied an in-house tool called scVarScan that can identify reads supporting the reference allele and variant allele covering the variant site in each cell by tracing cell and molecular barcode information in a snRNA-seq and single-cell RNA sequencing (scRNA-seq) or Visium bam file. The tool is freely available at GitHub (https://github.com/ding-lab/10Xmapping). For mapping, we used high-confidence somatic mutations from WES data produced by Somaticwrapper (described above). Visium reads were prefiltered with the flag ‘xf:i:25’ for reads contributing to unique molecular identifier counts.

  For each ST section, we applied two sets of statistical tests to all WES-based somatic mutations mapped to ST. First, for each mutation with greater than 30 reads of coverage on ST across all spots, the VAF was calculated for all tumour-region spots and all non-tumour-region spots as the number of variant reads across all spots divided by the number of total reads across all spots. A binomial test was then done using the VAF of non-tumour spots as the background: binom.test(alternative="greater"). Then, a proportion test was done between the VAFs in different spatial subclones with prop.test(alternative="two.sided"). Finally, multiple-testing correction was done on both sets of tests with the function p.adjust().
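A rough Python analogue of these two tests (the text uses R's binom.test and prop.test), with made-up read counts; `chi2_contingency` on a 2 × 2 table approximates prop.test's two-sided comparison of proportions.

```python
# Illustrative scipy versions of the binomial and proportion tests;
# all read counts below are invented for the example.
from scipy.stats import binomtest, chi2_contingency

# Background VAF from non-tumour spots: 2 variant reads out of 200 total.
background_vaf = 2 / 200

# Tumour spots: 15 variant reads out of 60 total reads (>30x coverage met).
res = binomtest(15, 60, p=background_vaf, alternative="greater")
print(res.pvalue < 0.05)   # tumour VAF significantly above background

# Two-sided comparison of VAFs between two spatial subclones,
# analogous to prop.test(alternative="two.sided") on a 2x2 table.
table = [[15, 45],   # subclone 1: variant vs reference reads
         [3, 57]]    # subclone 2
chi2, p, dof, _ = chi2_contingency(table)
```

Multiple-testing correction (R's p.adjust) would then be applied across all mutations tested on the section.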

  Somatic CNVs were called using GATK (v.4.1.9.0)61. Specifically, the hg38 human reference genome (NCI GDC data portal) was binned into target intervals using the PreprocessIntervals function, with the bin length set to 1,000 bp and using the interval-merging-rule of OVERLAPPING_ONLY. A panel of normals was then generated using each normal sample as input and the GATK functions CollectReadCounts with the argument --interval-merging-rule OVERLAPPING_ONLY, followed by CreateReadCountPanelOfNormals with the argument --minimum-interval-median-percentile 5.0. For tumour samples, reads that overlapped the target interval were counted using the GATK function CollectReadCounts. Tumour read counts were then standardized and denoised using the GATK function DenoiseReadCounts, with the panel of normals specified by --count-panel-of-normals. Allelic counts for tumours were generated for variants present in the af-only-gnomad.hg38.vcf according to GATK best practices (variants further filtered to 0.2 > AF > 0.01) using the GATK function CollectAllelicCounts. Segments were then modelled using the GATK function ModelSegments, with the denoised copy ratios and tumour allelic counts as input. Copy ratios for segments were then called in segment regions using the GATK function CallCopyRatioSegments.

  BedTools62 intersect was used to map copy-number ratios from segments to genes and assign amplification or deletion calls. For genes overlapping multiple segments, a custom Python script was used to call genes as amplified, neutral or deleted based on a weighted copy-number ratio, calculated from the copy ratio of each overlapping segment, the length of the overlap, and the z-score cut-offs used by the CallCopyRatioSegments function. If the resulting z-score cut-off fell within the range of the default z-score thresholds used by CallCopyRatioSegments (0.9, 1.1), the boundaries of the default z-score thresholds were used (replicating the logic of the CallCopyRatioSegments function).
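A hypothetical sketch of the weighted call for a gene overlapping several segments. For simplicity it thresholds the overlap-weighted copy ratio directly at the (0.9, 1.1) bounds quoted above, rather than reproducing the full z-score logic of the actual script.

```python
# Simplified, illustrative weighted copy-ratio call for one gene;
# the real script also applies CallCopyRatioSegments-style z-score logic.
def call_gene(segments):
    """segments: list of (copy_ratio, overlap_length) tuples for one gene."""
    total = sum(length for _, length in segments)
    weighted_ratio = sum(cr * length for cr, length in segments) / total
    if weighted_ratio > 1.1:
        return "amplified"
    if weighted_ratio < 0.9:
        return "deleted"
    return "neutral"

# gene mostly covered by an amplified segment (copy ratio 1.6)
print(call_gene([(1.6, 8000), (1.0, 2000)]))  # amplified
```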

  For each sample, we obtained the unfiltered feature–barcode matrix by passing the demultiplexed FASTQ files and associated H&E image to the Space Ranger (v.1.3.0, v.2.0.0 and v.2.1.0) count command with default parameters and the prebuilt GRCh38 genome reference 2020-A. Seurat was used for all subsequent analyses. We constructed a Seurat object for each slide using the Load10X_Spatial function. Each slide was then scaled and normalized using the SCTransform function to correct for batch effects. Any merged analyses or subsequent subsetting of cells and samples involving several slides underwent the same scaling and normalization method. Clustering was performed using the original Louvain algorithm and the top 30 principal component analysis dimensions via FindNeighbors and FindClusters, as described in the 'Analysis, visualization, and integration of spatial datasets with Seurat' vignette from Seurat (https://satijalab.org/seurat/).

  To detect large-scale chromosomal CNVs using scRNA-seq, snRNA-seq and Visium data, InferCNV was used with default parameters (https://github.com/broadinstitute/infercnv). InferCNV was run at the sample level using only post-quality-control filtered data with the raw count matrix. For snRNA-seq and scRNA-seq data, all non-malignant cells were used as reference with the annotation 'non-tumour', and all malignant cells shared the same annotation 'tumour', with the following parameters: analysis_mode='subclusters', --cluster_by_groups=T, --denoise=T and --HMM=T. For Visium ST data, the 200 spots annotated as non-malignant with the lowest ESTIMATE purity scores were used as the reference, and 'malignant' spots used their microregion IDs as annotations, with the following parameters: window_length=151, analysis_mode='sample', --cluster_by_groups=T, --denoise=T and --HMM=T. CalicoST (https://github.com/raphael-group/calicost)63 was run on Visium ST data with the same input annotations (microregion IDs). All spots from the same microregion were treated as the minimal unit of analysis. CalicoST was then run with default parameters and the results were manually inspected.

  To determine the similarity between two spatial CNV profiles, we used a modified Jaccard similarity score. A CNV profile was defined as a set of genomic windows annotated as copy-number neutral (0), amplified (1) or deleted (−1). Two CNV profiles were then compared, with overlapping genomic windows broken down so that both profiles had identical windows (disjoin function from the package GenomicRanges v.1.46.1). The CNV similarity score (sim) was then defined as follows:

  sim(A, B) = Σi si · 1(ai = bi) / Σi si,

  where si denotes the size of genomic window i, ai denotes the CNV annotation (0, 1 or −1) of profile A in genomic window i, bi denotes the CNV annotation of profile B in genomic window i, and the sums run over all genomic windows in which A or B is not CNV-neutral.
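A minimal sketch of this modified Jaccard score, assuming the encoding stated above (0 neutral, 1 amplified, −1 deleted) and size-weighted agreement over windows that are non-neutral in either profile:

```python
# Size-weighted modified Jaccard similarity between two CNV profiles;
# a and b are per-window annotations (0 neutral, 1 amplified, -1 deleted).
def cnv_similarity(sizes, a, b):
    num = den = 0
    for s, ai, bi in zip(sizes, a, b):
        if ai == 0 and bi == 0:        # skip windows neutral in both profiles
            continue
        den += s
        if ai == bi:
            num += s
    return num / den if den else 0.0

# windows 1 and 3 agree (sizes 100 + 300) out of 600 bp of non-neutral windows
sim = cnv_similarity([100, 200, 300], [1, 0, -1], [1, 1, -1])
```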

  To determine the similarity between spatial CNV profiles and WES-based CNVs (related to Extended Data Fig. 5a), we used a similarity score averaging sensitivity (the fraction of WES-based CNVs also detected in the spatial CNVs) and specificity (the fraction of spatial CNVs consistent with the WES-based CNVs). Specifically,

  sim(A, E) = (sensitivity + specificity) / 2,

  where A denotes the genomic windows of the spatial CNV profile, E denotes the genomic windows of the WES-based CNV profile, the size of each genomic window provides its weight, and the CNV annotation (0, 1 or −1) of each profile in each genomic window determines agreement.
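A sketch of the averaged sensitivity/specificity score under one plausible reading of the definition (size-weighted agreement over non-neutral windows); the authors' exact weighting may differ.

```python
# Hedged sketch: sensitivity = shared CNV length / WES CNV length,
# specificity = shared CNV length / spatial CNV length, averaged.
# Encodings: 0 neutral, 1 amplified, -1 deleted.
def wes_st_similarity(sizes, spatial, wes):
    wes_len = sum(s for s, w in zip(sizes, wes) if w != 0)
    st_len = sum(s for s, a in zip(sizes, spatial) if a != 0)
    shared = sum(s for s, a, w in zip(sizes, spatial, wes) if a == w and w != 0)
    sensitivity = shared / wes_len if wes_len else 0.0
    specificity = shared / st_len if st_len else 0.0
    return (sensitivity + specificity) / 2

# WES amplification fully recovered, but the spatial call extends further:
# sensitivity 1.0, specificity 0.5 -> score 0.75
score = wes_st_similarity([100, 100, 100], [1, 1, 0], [1, 0, 0])
```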

  In the OCT workflow (Supplementary Fig. 1a), CalicoST identified CNVs and spatial subclones simultaneously. In the FFPE workflow, confident spatial CNV events in each microregion were first selected by comparing them with the matched WES. Pairwise CNV similarity scores were then calculated across all tumour microregions. Finally, microregions were clustered by CNV similarity score using the function hclust(d = 1 − CNV similarity, method = "ward.D2") and separated into clusters with the function cutree(h = 0.8 × max(hclust$height)). Final subclone assignments were manually reviewed to avoid overclustering and to remove small outlier CNV profiles.

  Using Visium ST, tumour microregions were determined through a multistep process using H&E. Each ST spot was assigned as stroma or tumour by manual review of the morphology on the H&E-stained section. A spot was labelled as tumour if at least 50% of its pixels covered malignant cell morphology; otherwise, it was labelled as stroma. Next, we defined distinct tumour microregions using a set of three rules. The first rule specified that tumour spots immediately adjacent to each other were initially labelled as a single tumour microregion. The second rule stated that if two distinct tumour regions jointly occupied at least 50% of a spot, the spot was assigned to the distinct tumour region occupying the higher percentage. Finally, the third rule stipulated that if clear morphological differences existed among tumour spots within one microregion, the microregion had to be split into distinct microregions, each with a clear morphology.
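The first rule amounts to connected-component labelling on the spot adjacency graph. A minimal sketch with a hypothetical adjacency list (the actual implementation is not given in the text):

```python
# Connected-component labelling of tumour spots into microregions;
# spot ids and the adjacency list below are illustrative only.
from collections import deque

def label_microregions(tumour_spots, neighbours):
    """tumour_spots: set of spot ids; neighbours: dict spot -> adjacent spots."""
    labels, current = {}, 0
    for spot in tumour_spots:
        if spot in labels:
            continue
        current += 1                       # start a new microregion
        queue = deque([spot])
        labels[spot] = current
        while queue:
            s = queue.popleft()
            for n in neighbours.get(s, ()):
                if n in tumour_spots and n not in labels:
                    labels[n] = current
                    queue.append(n)
    return labels

# two disconnected tumour patches -> two microregions
nb = {1: [2], 2: [1], 3: [4], 4: [3]}
labs = label_microregions({1, 2, 3, 4}, nb)
```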

  Afterwards, we ran Morph (https://github.com/ding-lab/morph), a toolset that uses mathematical morphology to refine tumour microregions. Specifically, if the total number of spots in a microregion was less than or equal to three, we relabelled all of its spots as stroma. Finally, Morph assigned a layer (for example, T1) to each spot of a tumour microregion through a series of mathematical morphology operations described in the spot depth correlation analysis methods, representing the depth of a given spot within the microregion.

  To calculate the area of each spot, we used the spot size (55 µm) and the centre-to-centre distance between spots (100 µm) from 10x Genomics (https://kb.10xgenomics.com/hc/en-us/articles/360035487572). As shown in Supplementary Fig. 6, Visium spots form a hexagonal lattice covering the sample. The repeating unit of this lattice is a trapezoid composed of eight equilateral triangles, each with 50-µm sides (half the centre-to-centre distance between spots). Using the area equation for an equilateral triangle and multiplying by 8, we obtained an area of 8,660 µm2 per trapezoid, which is the average area occupied by each spot. To calculate microregion size, we multiplied the spot count by 8,660 and divided by 106 to obtain the size in mm2.
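The arithmetic can be checked directly:

```python
# Worked check of the per-spot area: eight equilateral triangles of
# side 50 um (half the 100 um centre-to-centre distance).
import math

side = 50                                        # um
triangle = (math.sqrt(3) / 4) * side ** 2        # area of one triangle
spot_area = 8 * triangle                         # um^2 per spot, ~8,660

def microregion_size_mm2(n_spots, spot_area_um2=8660):
    return n_spots * spot_area_um2 / 1e6         # convert um^2 to mm^2

print(round(spot_area))            # 8660
print(microregion_size_mm2(100))   # 0.866 mm^2 for a 100-spot microregion
```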

  We estimated the microregion density of each section using the following formula: density (microregions per µm2) = number of microregions per section divided by (section size in spots × 8,660 µm2 per spot). Then density per mm2 = density per µm2 × 106 (n microregions per mm2).

  Cell-type assignment was done based on the following known markers: B cells, CD79A, CD79B, CD19, MS4A1, IGHD, CD22 and CD52; cDC1, CADM1, XCR1, CLEC9A, RAB32 and C1orf54; cDC2, CD1C, FCER1A, CLEC10A and CD1E; mregDC, LAMP3, CCR7, FSCN1, CD83 and CCL22; pDC, IL3RA, BCL11A, CLEC4C and NRP1; macrophages, CX3CR1, CD80, CD86, CD163 and MSR1; mast cells, HPGD, TPSB2, HDC, SLC18A2, CPA3 and SLC8A3; endothelial cells, EMCN, FLT1, PECAM1, VWF, PTPRB, ACTA2 and ANGPT2; fibroblasts, COL1A1, COL3A1, COL5A1, LUM and MMP2; pericytes, RGS5, PLXDC1, FN1 and MCAM; NK cells, FCGR3A, GZMA and NCAM1; plasma cells, CD38, SDC1, IGHG1, IGKC and MZB1; T cells, IL7R, CD4, CD8A, CD8B, CD3G, CD3D and CD3E; and regulatory T cells, IL2RA, CTLA4, FOXP3, TNFRSF18 and IKZF2. Normal epithelial cells in the breast were annotated with the following markers: LumSec, GABRP, ELF5, CCL28, KRT15, BARX2 and HS3ST4; LumHR, ANKRD30A, ERBB4, AFF3, TTC6, ESR1, NEK10 and XBP1; and basal, SAMD5, FBXO32, TP63, RBBP8 and KLHL13. Normal epithelial cells in the liver were annotated with the following markers: hepatocytes, ALB, CYP3A7, HMGCS1, ACSS2 and AKR1C1; and cholangiocytes, SOX9, CFTR and PKD2. Normal epithelial cells in the pancreas, including ductal, acinar and islet cells, were annotated using the reference data BaronPancreasData('human').

  We determined the correlation between gene expression and spot depth within tumour microregions. First, each spot was assigned a depth, defined as the distance from the closest TME spot within its tumour microregion. This depth was quantified in layers through an iterative process, in which all malignant spots immediately adjacent to non-malignant spots were considered layer 1, then all malignant spots adjacent to layer 1 were considered layer 2, and the process was repeated until all layers were assigned. If a spot's layer was greater than the minimal distance between the spot and any Visium boundary (including the edge of the capture window, the edge of the tissue section, and any empty spots inside the section), we excluded that spot, because we only know the upper bound of its depth. In addition, tumour microregions with fewer than 3 layers or 50 spots were excluded from the analysis. The distance between layers was taken as the centre-to-centre distance of Visium spots (100 µm).

  To give the same weight to larger and smaller microregions, the depth was further normalized by the maximum depth of each microregion. We then independently performed a partial correlation test between gene expression (for genes with at least 1 transcript detected in more than 50% of all spots) and the normalized depth of each spot, with tumour purity as a covariate, as follows:

  expression ~ rho × layer fraction + b × purity,

  where layer fraction is the layer number divided by the total number of layers in the tumour microregion (normalizing for microregions of varying size), rho is the correlation coefficient for layer, and b is the coefficient for the covariate purity. Purity was inferred by deconvolution when matched snRNA-seq data were available (tumour fraction by RCTD deconvolution) or otherwise by ESTIMATE (per-spot tumour purity score). Each gene was checked against a list of snRNA-seq-derived non-malignant genes to ensure that changes in score did not stem from changes in cell-type composition. Finally, we performed multiple-testing adjustment on all tests done for each ST section.
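A sketch of the per-gene test, implementing the partial correlation by residualizing both expression and normalized depth on purity (a standard equivalence); this is not the authors' code, and the simulated inputs are made up.

```python
# Partial correlation of expression vs normalized layer depth, with
# tumour purity as covariate, via residualization; illustrative only.
import numpy as np
from scipy import stats

def partial_corr(expr, depth, purity):
    def residuals(y, x):
        slope, intercept, *_ = stats.linregress(x, y)
        return y - (slope * x + intercept)
    r, p = stats.pearsonr(residuals(expr, purity), residuals(depth, purity))
    return r, p

rng = np.random.default_rng(0)
purity = rng.uniform(0.2, 0.9, 100)                 # per-spot tumour purity
depth = rng.uniform(0, 1, 100)                      # normalized layer depth
expr = 2.0 * depth + 0.5 * purity + rng.normal(0, 0.1, 100)
rho, p = partial_corr(expr, depth, purity)          # strong depth effect
```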

  To summarize the biological programmes enriched towards the centres and peripheries of tumour microregions across sections, we first obtained cohort-level average layer correlation coefficients. If a test was not significant (P ≥ 0.05), rho was assigned as 0 to denote no correlation. If a gene was not tested in a section (<50% of the spots have at least one transcript), rho was also assigned as 0. When a case had multiple sections, we first took the average rho across sections to avoid bias towards tumours with more sections. Then, the average of rho was calculated for each cohort (all samples or samples from each cancer type).

  In the same fashion, rank statistics were calculated for each test as –log10(P value) × rho for tests with P < 0.05. For tests with P ≥ 0.05 or genes not tested, the rank statistic was 0. We then calculated average rank statistics per case, followed by the average per cancer type. Finally, with the full list of rank statistics calculated for all genes tested, we used the function GSEA (parameters: pvalueCutoff=0.5; package: clusterProfiler v.3.18.1) to obtain the normalized enrichment score of Hallmark pathways (package: msigdbr 7.5.1) from the MSigDB64. Finally, only pathways with P < 0.1 were kept in the final results.
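The rank statistic itself is a one-liner:

```python
# Rank statistic described above: -log10(P) * rho for significant tests,
# 0 for non-significant or untested genes.
import math

def rank_stat(p, rho):
    if p is None or p >= 0.05:     # not significant, or gene not tested
        return 0.0
    return -math.log10(p) * rho

print(rank_stat(0.001, 0.5))   # 1.5
print(rank_stat(0.2, 0.9))     # 0.0
```

Averaging these per case and then per cancer type, as described above, yields the ranked gene list passed to GSEA.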

  We used differential expression and per cent expression filters, comparing expression among cell types in the matching snRNA-seq data, to further characterize genes identified in the centre- and periphery-enriched analysis. The steps implemented in this workflow generated four categories: tumour-specific, stromal-specific, tumour-enriched and stromal-enriched (Supplementary Fig. 8a,b). Genes that did not pass the significance cut-off in any differential expression analysis were labelled separately as not DEG.

  To distinguish these four groups, we first performed differential gene analysis of cell types in the matching snRNA-seq data, filtered by a conventional significance cut-off (log2(fold change) > 0.5, adjusted P < 0.05, Bonferroni correction), to obtain DEGs (Supplementary Fig. 8a). Given the heterogeneity in tumours, certain tumour-specific genes might only exist in a subpopulation of tumours. Therefore, we first subclustered the tumour populations (using the Subcluster function in Seurat with a resolution of 0.5) to obtain tumour subclusters. We then compared each subcluster with all other non-tumour cells. A gene was considered a tumour DEG if at least one tumour subcluster showed significant expression compared with the non-tumour cells and vice versa for non-tumour DEGs (Supplementary Fig. 8a,b).

  For candidate tumour or stromal-specific genes, a DEG was designated as tumour-specific if it met both of the following criteria: (1) it is a DEG when compared with all non-tumour cell types from at least one tumour subcluster; and (2) its expression was <15% in all non-tumour cell types (Supplementary Fig. 8c).

  The reverse applied to candidate stromal-specific DEGs. If a DEG did not meet both of these requirements to be tumour or stromal specific, it was designated as either tumour-enriched or stromal-enriched based on whether the expression level was higher in tumour or stromal cell types (Supplementary Fig. 8a).

  We focused on ten cases (comprising four BRCA, two CRC and four PDAC samples) with multiple spatial subclones for this analysis. To obtain subclone-specific DEGs, we used the FindMarkers function in Seurat with the 'wilcox' test option to find DEGs between each subclone and the TME. We then applied cut-offs of adjusted P < 0.01, average log2(fold change) > 1 and per cent expression in at least one cell type > 0.4 to select significant DEGs. To infer treatment response, we used the perturbation database LINCS L1000 (ref. 65), specifically the LINCS_L1000_Chem_Pert_down dataset from Enrichr66, to evaluate gene-set overlap between DEGs upregulated in spatial subclones and genes downregulated after treatment. For the plots in Fig. 4, the data were sorted by 'odds' and the top compounds in each subclone were selected. The corresponding compound metadata (including mechanism of action) were obtained from CLUE (clue.io, 'Expanded CMap LINCS Resource 2020 release') to add annotations to the heatmap.

  To distinguish transcripts originating from cancerous versus non-malignant stromal or immune cells, we used merged snRNA-seq data from each organ (breast, kidney, liver and pancreas) for cell-type marker analysis. This analysis used the FindAllMarkers function in Seurat with the 'wilcox' test option. Subsequently, we refined the gene list by applying filters (average log2(fold change) > 2, per cent expression in at least one cell type > 0.4 and adjusted P value < 0.01) to ensure robust marker selection for each cell type. The resultant gene list is available in Supplementary Table 5. This list was instrumental in excluding non-cancerous cell genes from analyses pertaining to cancer-specific expression patterns, such as pairwise microregion similarity analysis. Of note, during the analysis, we observed a notable mapping of various epithelial cell types in the snRNA-seq reference dataset for BRCA when using the RCTD deconvolution method. This observation probably stems from the diverse BRCA subtypes present in the cohort. To address this, we opted to combine all epithelial cell types into a single category during the identification of cell-type markers and excluded them from the blacklist. For tumours originating from organs other than the four mentioned above, we aggregated all genes present in the blacklist across organs to form a comprehensive multiorgan blacklist, which aided in filtering out non-cancerous transcripts.

  For overall tumour heterogeneity, we selected Morph-identified spots then ran ROGUE (v.1.0)67 to measure heterogeneity as 1-ROGUE. We then compared the transcriptional profiles of microregions by selecting the top 500 most variable features after excluding stroma regions in ST samples following Morph processing. Our initial evaluation involved conducting Pearson correlation tests for each pair of microregions, using a range of the top 250–1,500 most variable genes with increments of 250 (that is, 250, 500, 750, …, 1,500). We observed consistent correlations for nearly all values beyond using more than 500, which led us to select the top 500 genes for this analysis. This choice reduced the risk of selecting too few variable genes (for example, <250 most variable genes) while also avoiding the inclusion of numerous genes with minimal effect on the transcriptional profile. GSEA analysis was done using the function GSEA (parameters: pvalueCutoff = 0.5; package: clusterProfiler v.3.18.1) to obtain the normalized enrichment score of Hallmark pathways (package: msigdbr v.7.5.1) from the MSigDB64.

  Module scores on top of each heatmap in Extended Data Fig. 6 were calculated with the AddModuleScore function from Seurat68 using the genes listed in each heatmap. This score represents the average expression levels of a gene set. The score was calculated for each spot and a box plot was used to show the distribution of module scores in each microregion.

  Cell-type composition per spot was deconvolved using RCTD18 with default parameters and doublet_mode = 'multi'. The reference for each run was the cell types manually annotated from the Seurat object of the matching snRNA-seq or Multiome sample. To quantify the spatial distribution of each cell type, the cell-type fraction across 6 layers (T3 and above, T2, T1, E1, E2, E3 and above) from each tumour microregion was calculated and averaged in each sample. To compare differential TME infiltration between spatial subclones, cell-type fractions from all spots were compared between spatial subclones with pairwise Wilcoxon rank-sum tests and FDR adjustment.

  We evaluated spatial-based cell–cell interactions (CCIs) in the ST samples using COMMOT69 with the CellChat database and a distance threshold of 1,000 µm, following the same threshold used in the original publication for Visium. The median sender and receiver signals for each interaction family were compared between all tumour boundary spots (including the tumour boundary layer and the TME boundary layer) and all non-boundary spots (Wilcoxon rank-sum test) in a sample. Interaction pathways with a signal difference greater than 0.1 and an FDR less than 0.05 were considered significantly boundary-enriched. Boundary DEGs were identified with the FindMarkers function on three sets of comparisons: boundary/tumour, boundary/TME and boundary/all non-boundary. A boundary DEG has an adjusted P value < 0.25 in the boundary/non-boundary test and log2(fold change) > 0 in the other two tests.

  We applied PASTE2 (ref. 70), the updated ST-based alignment tool PASTE70, to enable partial image alignment. Serial sections of the same tumour piece were aligned pairwise with default settings. Each Visium data point in every ST section received new coordinates, denoted as x′ and y′, based on the alignment results. We then identified the nearest spot on each adjacent section for every spot, connecting them along the z axis. This process facilitated the linking of spots across all sections on the z axis. To assess whether one microregion was connected to another in an adjacent section, we first removed stromal spots and then counted the connected spots. If any microregion on one section connected to the next section with more than three shared spots, then we considered these two microregions, located on different sections, as connected in 3D space and forming the same tumour volume. This connection was labelled as volume 1, volume 2, and so forth in the figures (Fig. 5d,e and Extended Data Fig. 9a–d).
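The nearest-spot linking and the more-than-three-shared-spots rule can be sketched as follows (coordinates and region memberships are made up; the authors' implementation may differ):

```python
# Illustrative nearest-spot linking between PASTE2-aligned adjacent
# sections, followed by the ">3 shared spots" microregion connection rule.
import numpy as np
from scipy.spatial import cKDTree

def link_spots(coords_a, coords_b):
    """For each spot on section A, index of the nearest spot on section B."""
    tree = cKDTree(coords_b)
    _, idx = tree.query(coords_a)
    return idx

def regions_connected(links, region_a_spots, region_b_spots, min_shared=4):
    """Connected if more than three spots of region A link into region B."""
    shared = sum(1 for s in region_a_spots if links[s] in region_b_spots)
    return shared >= min_shared

coords_a = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [5, 5]], dtype=float)
coords_b = np.array([[0.1, 0.1], [0.1, 1.1], [1.1, 0.1], [1.1, 1.1], [5.1, 5.1]])
links = link_spots(coords_a, coords_b)   # each spot maps to its offset twin
```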

  We used two geometric metrics to describe tumour volume: connectivity and loop. For connectivity (degree), this metric quantifies the number of connections from an individual microregion to adjacent sections. For example, if microregion 2 in section 2 connects to 3 microregions in section 1 and 2 in section 3, its connectivity is 5. The maximum connectivity of a tumour volume is the highest connectivity among its microregions. For loop, this metric was calculated as the total number of connections minus the total number of microregions plus one, identifying intricate loop structures within the tumour volume.
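Both metrics can be computed from the microregion connection graph. The toy graph below reproduces the worked example in the text (connectivity 5) and illustrates that a tree-shaped volume has zero loops; node names are hypothetical section:microregion labels.

```python
# Connectivity = degree of a microregion in the cross-section graph;
# loops = edges - nodes + 1 (cycle rank of a connected volume).
def connectivity(edges, node):
    return sum(1 for a, b in edges if node in (a, b))

def loops(edges, nodes):
    return len(edges) - len(nodes) + 1

# microregion "s2:m2" connects to three regions in section 1 and two in
# section 3, so its connectivity is 5
edges = [("s1:m1", "s2:m2"), ("s1:m2", "s2:m2"), ("s1:m3", "s2:m2"),
         ("s2:m2", "s3:m1"), ("s2:m2", "s3:m2")]
nodes = {"s1:m1", "s1:m2", "s1:m3", "s2:m2", "s3:m1", "s3:m2"}
print(connectivity(edges, "s2:m2"))   # 5
print(loops(edges, nodes))            # 0 (a tree has no loops)
```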

  Before registration, imaging data underwent the following transformations. Multiplex images were converted to greyscale images of DAPI intensity. The image was then downscaled by a factor of 5 before key point selection. H&E images (also downsampled by a factor of 5) were used for keypoint selection with Visium data.

  For registration, we used BigWarp71, which is packaged in the Fiji/ImageJ software application. To register each collection of serial images, we used the first serial section as the fixed image and the second image as the moving image. After the second image was warped to the first image, the second image was used as the fixed image for the transformation of the third image. Key point registration proceeded in this fashion for all images in the serial section experiment. A total of 4–20 key points were selected per image transformation. Once key points were selected, a moving displacement field was exported from BigWarp for each image transformation. This dense displacement field was then upscaled by a factor of 5 so it could be used to warp the full-resolution imaging data. The full-resolution dense displacement field was then used to register its corresponding multiplex or Visium data. The code used for registration is available at GitHub (https://github.com/ding-lab/mushroom/tree/subclone_submission).

  Once imaging data were registered, they were processed in the following manner before model input.

  For Visium ST data, genes were limited to genes expressed in a minimum of 5% of spots across all serial sections and expression counts were log2 transformed. CODEX, Visium ST and H&E data were normalized by subtracting the mean expression and dividing by the standard deviation for each gene.

  Expression profiles for each patch were generated differently for image-native data (CODEX and H&E) and point-based data (Visium). Expression for CODEX and H&E patches was calculated as the average pixel intensity for each image channel over all pixels within the patch bounds. Visium patches were calculated in largely the same manner; however, the expression profile of each spot within the patch was linearly weighted by its distance to the centre of the patch. This differential weighting helped to account for variation in expression due to the number of spots that fall within patch boundaries.
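A hypothetical version of the distance-weighted Visium patch profile; the linear falloff to the patch edge is our assumption, as the text does not give the exact weighting function.

```python
# Distance-weighted average of spot expression within a patch;
# the linear falloff (weight 1 at the centre, 0 at `radius`) is assumed.
import numpy as np

def patch_expression(spot_xy, spot_expr, centre, radius):
    d = np.linalg.norm(spot_xy - centre, axis=1)
    w = np.clip(1.0 - d / radius, 0.0, None)   # linear falloff to the edge
    if w.sum() == 0:
        return np.zeros(spot_expr.shape[1])
    return (w[:, None] * spot_expr).sum(axis=0) / w.sum()

# one spot at the centre, one near the edge of a radius-4 patch
xy = np.array([[0.0, 0.0], [3.0, 0.0]])
expr = np.array([[10.0, 0.0], [0.0, 10.0]])
prof = patch_expression(xy, expr, centre=np.array([0.0, 0.0]), radius=4.0)
```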

  The neighbourhood annotation model consisted of an autoencoder with a vision transformer (ViT) backbone (Supplementary Fig. 7). In brief, an autoencoder is an unsupervised training method for which an encoder (embedding component) and a decoder (reconstruction component) work together to learn how input data are generated. Specifically, the network derives an approximation, Q, to the true posterior generating function, P, for the output, given the input. The autoencoder used was asymmetric, meaning that the encoder and decoder were not inverse copies of one another. The encoder consisted of a ViT with a similar architecture to previously described architectures72,73 (Supplementary Table 4).

  ViTs work on image tokens as input. In brief, image tokens are n-dimensional representations of patches of the input image. During training, image tiles were sampled from a uniform distribution across the set of input sections (Supplementary Fig. 7a). The sampled tile was then split into patches, for which the number of patches was determined by two hyperparameters: patch height (ph) and patch width (pw). Each patch was then flattened to a 1 × (ph × pw × c) vector, where c is the number of channels in the image (in the case of spatial transcriptomics data, c is the number of genes). The unrolled patches were then concatenated into a n × (ph × pw × c) matrix, where n is the number of patches in the image tile. Each row in this matrix is a token that represents a patch in the image tile. The tokens were then projected by a linear layer to shape n × d, where d is the dimension of the transformer blocks.
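The tokenization step above (tile → n × (ph × pw × c) matrix) can be written in a few lines of numpy:

```python
# Split a (H, W, c) tile into (ph, pw) patches, each flattened to a
# 1 x (ph * pw * c) token, stacked into an n x (ph * pw * c) matrix.
import numpy as np

def tile_to_tokens(tile, ph, pw):
    H, W, c = tile.shape
    tokens = []
    for i in range(0, H, ph):
        for j in range(0, W, pw):
            tokens.append(tile[i:i + ph, j:j + pw, :].reshape(-1))
    return np.stack(tokens)

tile = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)  # toy 4x4, 3-channel tile
tokens = tile_to_tokens(tile, ph=2, pw=2)
print(tokens.shape)   # (4, 12): four 2x2 patches, each 2*2*3 values
```

In the model, these tokens are then projected by a linear layer to the transformer dimension d, as described above.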

  After this, a slide token was concatenated to patch tokens. The slide token (representing the slide from which the image tile was selected) was indexed from a trainable embedding of size n_slides × d, where n_slides is the number of slides in the serial section experiment. The motivation for the slide token is that as it is passed through the transformer blocks, along with the patch tokens, information can be shared across all tokens, allowing the slide token to learn to attend to useful representations of the patches. This feature allowed the model to be more robust to batch effects between serial sections. Following the addition of the slide token, positional embeddings were added to all tokens and passed through the transformer blocks comprising the ViT encoder. All variables above and details of the transformer architecture are available in Supplementary Table 4.

  Once passed through the encoder, patches were represented as an embedding of size n × d. The next step of the architecture was neighbourhood assignment. Neighbourhoods were assigned to patches in a hierarchical manner, meaning that a patch was classified into several neighbourhoods that differed in level of specificity. For each level of the neighbourhood hierarchy, the subsequent levels comprised partitions of the previous levels' neighbourhoods, that is, except for the first level, each neighbourhood was a subset of a neighbourhood in a previous level of the hierarchy. For this analysis, the model generated three levels of neighbourhoods, each with the capacity to discover up to n = 8 (level 1), n = 32 (level 2) and n = 64 (level 3) neighbourhoods, respectively. For this analysis, all neighbourhoods shown are neighbourhoods annotated at hierarchy level 3. The model contained three codebooks (one for each level) that are of size n_NBHDs × d, where n_NBHDs is the number of possible neighbourhoods that can be assigned for the given level. The patch embeddings output by the ViT encoder were projected by three independent blocks of linear layers (one for each level) that output each patch's probability of assignment to a given neighbourhood. These probabilities were then used to retrieve neighbourhood embeddings from the codebook corresponding to the neighbourhood level. Three linear blocks (one for each level) were then used to independently reconstruct patch embeddings at each level to each patch's original pixel values. The code used for training the model is available at GitHub (https://github.com/ding-lab/mushroom/tree/subclone_submission).

  The overall loss function has two main contributions: mean squared error (MSE) on the reconstruction of the input patches, and cross-entropy loss on the encoded distribution and the normal distribution with 0 mean and 1.0 variance.

  During training, the autoencoder was simultaneously trying to optimize two main tasks: the reconstruction of the expression profile of each image patch embedding and the alignment of neighbourhood labels between adjacent sections. These two competing objectives forced the model to learn representative expression patterns while also keeping neighbourhoods aligned between input sections, which helped to combat neighbourhood differences due to batch effects. Differences in patch expression were quantified by MSE, whereas neighbourhood adjacency was enforced by minimizing the cross-entropy of patches adjacent to each other in the z direction during training.

  The overall loss function is defined below:

  L = λMSE LMSE + λNBHD LNBHD

  Where λNBHD (maximum of 0.01) and λMSE (set to 1.0) are scalars for the neighbourhood loss (LNBHD) and reconstruction loss (LMSE), respectively. During training, λNBHD was linearly increased from 0 to its maximum value.
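The weighting scheme, with λMSE fixed at 1.0 and λNBHD ramped linearly from 0 to 0.01 over training, can be sketched as follows; the function name and the linear schedule endpoints are taken from the text, but this is an illustrative sketch rather than the authors' code.

```python
def total_loss(mse, nbhd_ce, step, total_steps,
               lambda_mse=1.0, lambda_nbhd_max=0.01):
    """Combined loss: the neighbourhood weight is ramped linearly
    from 0 to its maximum (0.01) over the course of training."""
    lambda_nbhd = lambda_nbhd_max * min(step / total_steps, 1.0)
    return lambda_mse * mse + lambda_nbhd * nbhd_ce
```

At step 0 the neighbourhood term contributes nothing, so the model first learns to reconstruct; by the end of training the adjacency constraint is fully weighted.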

  Two separate runs of the model were trained for HT397B1 (six H&E, four CODEX and two Visium ST slides) and HT268B1 (four Visium ST slides). Training hyperparameters, such as batch size and number of training steps, are provided in Supplementary Table 4. For HT268B1, only one instance was trained because only one data type was present. For HT397B1, three model instances were trained (one for each data type) and were subsequently merged following the procedure described in the section ‘3D neighbourhood construction and integration’.

  Following training, model inference was performed on overlapping image tiles for each slide using a sliding window of size 8 and a stride of 2. The 2 × 2 centre patches of each tile were extracted and retiled to match the original slide orientation; with a stride of 2, these centre patches tile the slide contiguously. Each reconstructed ‘patch embedding image’ was at a resolution of 50 µm per pixel (that is, each neighbourhood patch represents an area that is 50 µm wide), with the exception of Visium ST, for which the patch resolution was 100 µm per pixel.
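A one-dimensional sketch (toy patch count, not the authors' code) shows why extracting the 2 centre patches of each length-8 window at a stride of 2 retiles the slide contiguously:

```python
import numpy as np

window, stride = 8, 2
n_patches = 20
patch_ids = np.arange(n_patches)   # per-patch positions along one axis

centres = {}
for start in range(0, n_patches - window + 1, stride):
    tile = patch_ids[start:start + window]
    # Keep only the 2 centre patches of each length-8 tile (offsets 3 and 4);
    # successive tiles' centres abut exactly because the stride is 2.
    for offset in (3, 4):
        centres[start + offset] = tile[offset]

covered = sorted(centres)   # contiguous run of positions (borders excluded)
```

Only the slide borders (outside the first and last tile centres) are left uncovered by this scheme.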

  After the assignment of neighbourhoods for each section, slides were interpolated to generate a 3D neighbourhood volume. For this, we used linear interpolation of neighbourhood assignment probabilities with the torchio library74.
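A minimal NumPy stand-in for this interpolation step (the paper uses the torchio library; the section shapes and spacing here are toy values) linearly blends the neighbourhood assignment probabilities of two adjacent sections:

```python
import numpy as np

rng = np.random.default_rng(1)
# Per-pixel neighbourhood probabilities (H x W x n_neighbourhoods) for two
# adjacent sections; each pixel's probabilities sum to 1.
sec_a = rng.dirichlet(np.ones(4), size=(8, 8))
sec_b = rng.dirichlet(np.ones(4), size=(8, 8))

def interpolate_z(a, b, n_between):
    """Linear interpolation of probability maps between two sections
    (a simplified stand-in for the torchio-based resampling)."""
    weights = np.linspace(0, 1, n_between + 2)   # includes both endpoints
    return np.stack([(1 - w) * a + w * b for w in weights])

volume = interpolate_z(sec_a, sec_b, n_between=3)
```

Because convex combinations of probability vectors remain probability vectors, the interpolated voxels still sum to 1 over neighbourhoods and can be argmaxed into labels.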

  Following interpolation, we also integrated the neighbourhood volumes for HT397B1, for which multiple data-type-specific volumes had been generated, using a graph-based clustering approach. In brief, all overlapping neighbourhood voxel annotations were identified. A graph was then constructed in which nodes represented each neighbourhood partition combination and edges represented the distance (in expression profile space) between these partition combinations. This graph was then clustered with the Leiden graph clustering algorithm to identify integrated neighbourhoods. Hyperparameters for the above clustering process are provided in Supplementary Table 4. 3D neighbourhoods were displayed using the open-source visualization tool Napari (https://github.com/napari/napari).

  Neighbourhoods were then assigned to Visium ST spots as follows: each spot was assigned the label of the neighbourhood overlapping its centroid.

  To focus on the neighbourhoods most related to TME biology, we filtered out neighbourhoods with >50% overlap with copy-number-annotated subclones. In addition, we excluded neighbourhoods that mapped to fewer than ten spots across all ST sections of a sample.

  The subclone boundary region of a tumour clone was defined as the union of the outermost layer of subclone-annotated spots and the spots extending one layer outward from them, representing a region of approximately 100–150 µm at the tumour–TME interface. Subclone-specific fractions were calculated as the neighbourhood overlap with the outermost layer of each subclone.
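On a regular spot grid, the "outermost layer plus one outward layer" construction of a subclone boundary region can be sketched with binary morphology; the toy mask below and the use of scipy morphology are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy import ndimage

# Toy subclone annotation: spots of one subclone marked True on a grid.
subclone = np.zeros((9, 9), dtype=bool)
subclone[3:6, 3:6] = True

# Outermost layer of subclone spots: members removed by a single erosion.
outer = subclone & ~ndimage.binary_erosion(subclone)
# One layer of spots extending outward from the subclone.
ring = ndimage.binary_dilation(subclone) & ~subclone
# Boundary region = union of the outermost layer and the extended layer.
boundary = outer | ring
```

With ~100 µm spot spacing, this two-spot-wide band corresponds to the roughly 100–150 µm interface zone described in the text.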

  In HT397B1, DEGs were calculated for all neighbourhoods, not only those passing the subclone-overlap and spot-count filters. The top 50 DEGs of neighbourhoods 4 and 6 were divided into three categories: shared, unique to neighbourhood 4 and unique to neighbourhood 6. For display in Fig. 5, the top ten of each group were selected according to the following sorting criteria. The mean expression delta between neighbourhoods 4 and 6 was calculated by subtracting the mean expression in neighbourhood 6 from that in neighbourhood 4. Shared DEGs were sorted in ascending order of the absolute expression delta of each gene. Genes unique to neighbourhood 4 and to neighbourhood 6 were ordered by descending and ascending mean expression delta, respectively.
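The three sorting rules (ascending |delta| for shared DEGs; descending delta for neighbourhood-4-unique genes; ascending delta for neighbourhood-6-unique genes) can be sketched with pandas; the gene names and expression values below are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical mean expression of candidate DEGs in neighbourhoods 4 and 6.
df = pd.DataFrame({
    "gene":  ["g1", "g2", "g3", "g4"],
    "nbhd4": [5.0, 1.0, 4.0, 0.5],
    "nbhd6": [4.8, 3.0, 1.0, 1.5],
    "group": ["shared", "nbhd6", "nbhd4", "nbhd6"],
})
# Delta = mean expression in neighbourhood 4 minus neighbourhood 6.
df["delta"] = df["nbhd4"] - df["nbhd6"]

shared = df[df.group == "shared"].sort_values("delta", key=np.abs)      # ascending |delta|
only4  = df[df.group == "nbhd4"].sort_values("delta", ascending=False)  # descending delta
only6  = df[df.group == "nbhd6"].sort_values("delta", ascending=True)   # ascending delta
```

The top ten rows of each sorted frame would then be taken for display.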

  Our workflow for cell annotation consisted of four main steps: (1) image format conversion, (2) cell segmentation, (3) spatial feature generation and (4) cell type classification. First, we converted the image output of the CODEX platform (.qptiff) into the popular open-source OME-TIFF format. During this process, we also generated a separate image for each sample, because multiple tissues were sometimes included in the same imaging run. We then segmented nuclei and whole cells using the Mesmer pretrained nuclear + membrane segmentation model in the DeepCell framework75. DAPI was used as the nuclear intensity image, whereas the channels pan-cytokeratin, HLA-DR, SMA, CD4, CD45, Hep-Par-1, CD31, E-cadherin, CD68 and CD3e (those present in a given image) were averaged into a single channel and used as the membrane intensity image.

  We then used a gating process to identify cell types. First, to combat differences in protein intensity distributions between imaging runs and tissue types, thresholds were manually set by visual inspection for each image for all protein channels used during cell typing. Above this intensity threshold, a pixel was considered positive for a given marker; below it, the pixel was considered negative. We then used the cell segmentation boundaries from the previous step to calculate the fraction of positive pixels for all cell-typing markers in each cell. The result of this process was a feature matrix (number of cells × number of proteins) representing the positive-pixel fraction of each cell-typing protein in each cell. A cell was considered positive for a marker if at least 5% of its pixels were positive for that marker. Cells were then labelled with a gating strategy specific to each sample. During gating, each cell passed through a series of AND gates; if a cell met all the criteria of a given step, it was annotated as the cell type specified for that step, whereas if it failed the criteria, it was passed to the next downstream step of the gating strategy. The gating strategies used for the samples in this paper are listed in Supplementary Table 4.
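The positive-pixel-fraction feature underlying the gating can be sketched as follows; the label map, marker channel and helper function are toy constructions, not the authors' pipeline code.

```python
import numpy as np

# Toy segmentation: cell labels per pixel (0 = background) and one marker
# channel already thresholded (True = pixel above the manual threshold).
labels = np.array([[1, 1, 0],
                   [1, 1, 2],
                   [0, 2, 2]])
marker_pos = np.array([[True,  False, False],
                       [False, False, True],
                       [False, True,  True]])

def positive_fraction(labels, marker_pos, cell_id):
    """Fraction of a cell's pixels that are positive for the marker."""
    mask = labels == cell_id
    return marker_pos[mask].mean()

# The text's rule: a cell is called positive when >=5% of its pixels are
# positive; these toy cells sit well above that threshold.
frac_cell1 = positive_fraction(labels, marker_pos, 1)  # 1 of 4 pixels
frac_cell2 = positive_fraction(labels, marker_pos, 2)  # 3 of 3 pixels
```

Stacking these fractions over all markers yields the (number of cells × number of proteins) feature matrix that the AND gates operate on.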

  The following labels are the set of all possible cell type annotations: epithelial, CD4 T cell, CD8 T cell, regulatory T cell, T cell, macrophage, B cell, dendritic, immune, endothelial, fibroblast and hepatocyte. For some images, not all proteins required to call a specific cell type were present. For example, CD4 was not in every image panel and so could not always be used for the annotation of CD4 T cells. In these cases, the gating strategy was constructed so that a cell could be labelled with a more generic cell type if a specific protein was absent (that is, labelled T cell rather than CD4 T cell). If a cell was negative for all steps of the gating strategy, it was annotated as ‘unlabelled’. Code for image format conversion and cell segmentation is available at GitHub (https://github.com/estorrs/multiplex-imaging-pipeline).

  After registration, spots labelled as tumour were mapped using the coordinates of the aligned images. The coordinates of the centre of each spot in the CODEX-aligned slide were identical to those of its Visium counterpart. Each spot in the CODEX-aligned slide occupied the area of a circle with a radius of 150 pixels. The Euclidean distance transform of each pixel was then calculated using scipy.ndimage.distance_transform_edt in Python. Distances were calculated both from microregions and within microregions.
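The two distance maps (from microregions and within them) follow directly from applying the named scipy function to the mask and its complement; the toy microregion below is an illustrative assumption.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Toy microregion mask: True inside a tumour microregion.
micro = np.zeros((7, 7), dtype=bool)
micro[2:5, 2:5] = True

# Distance of each pixel to the nearest microregion pixel (0 inside)...
dist_from = distance_transform_edt(~micro)
# ...and depth within the microregion (distance to the nearest outside pixel).
dist_within = distance_transform_edt(micro)
```

distance_transform_edt measures the distance to the nearest zero-valued element, so inverting the mask switches between the "from" and "within" views.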

  Surface mesh visualizations of the tumour volumes of HT397B1 and HT268B1 were generated using the following steps: (1) tumour neighbourhood selection, (2) mesh construction and (3) mesh colouring. First, integrated neighbourhoods with a tumour metric (described below) exceeding a given threshold were considered part of the tumour volume. In HT268B1, the metric used to quantify epithelial character was the proportion of subclone-annotated ST spots in each neighbourhood; neighbourhoods with >60% subclone spots were considered tumour. In HT397B1, we used the fraction of CODEX-annotated epithelial cells, because CODEX sections outnumbered Visium ST sections for this sample; neighbourhoods with an epithelial cell fraction of >60% were considered tumour.

  A new volume was then constructed in which the voxels of neighbourhoods classified as tumour by the above criteria were considered tumour-positive and all other voxels were tumour-negative. This 3D tumour mask was then smoothed with a Gaussian kernel (sigma = 1.0). The resulting values were used as input to the marching cubes algorithm76,77 to generate a surface mesh of the tumour volume. We used the scikit-image implementation of the marching cubes algorithm (skimage.measure.marching_cubes) with default parameters.
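The mask-smoothing step before meshing can be sketched with scipy; the toy mask dimensions are assumptions, and the subsequent call to skimage.measure.marching_cubes (with default parameters, as in the text) is indicated in the comment rather than executed here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Binary 3D tumour mask: tumour-positive voxels = 1.
mask = np.zeros((10, 10, 10), dtype=float)
mask[3:7, 3:7, 3:7] = 1.0

# Smooth with a Gaussian kernel (sigma = 1.0, as in the text). The smoothed
# scalar field would then be passed to marching cubes, e.g.:
#   verts, faces, normals, values = skimage.measure.marching_cubes(smoothed)
smoothed = gaussian_filter(mask, sigma=1.0)
```

Smoothing the hard 0/1 mask gives marching cubes a graded field to contour, producing a rounded rather than blocky surface.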

  To colour the surface mesh, we generated 3D feature volumes (described below) and then coloured points on the surface mesh according to the voxel values at the corresponding positions in the feature volume. A feature volume is a volume in which each voxel describes some feature of the serial-section dataset (for example, the expression of a given gene or the fraction of a cell type). The feature volumes used in this analysis were constructed in the following manner. First, in the serial sections for which the feature was applicable, the feature was binned at the same resolution as the 3D neighbourhoods (in this case, 50 µm). The binned features were then interpolated in the z direction to fill the gaps between sections. The resulting volume had the same shape as the integrated neighbourhood volume, with the value of each voxel being the aggregated feature count of that voxel. For HT268B1, the features used were the log expression of TYMP1 and IGLC2. For HT397B1, we used the fibroblast and immune cell fractions. Cells were annotated as described in the section ‘Cell type annotation of CODEX imaging data’. Surface meshes were visualized and coloured according to the feature volumes using Napari (https://github.com/napari/napari). We also visualized the tissue volume of HT397B1 using the Imaris platform, for which we generated surfaces from the following CODEX markers: pan-cytokeratin (epithelial), CD45 (immune) and SMA (stromal).

  Custom Xenium gene and mutation probes were designed using the Xenium Panel Designer (https://cloud.10xgenomics.com/xenium-panel-designer) following the Xenium panel design instructions (https://www.10xgenomics.com/support/in-situ-gene-expression/documentation/steps/panel-design/xenium-panel-panel-getting-start-started#design-tool). In brief, 21-bp sequences flanking the variant sites of targeted transcripts were curated from Ensembl transcripts (v.100). All four possible ligation junctions were then evaluated (two for the WT allele and two for the variant allele; three in deletion cases, two for the WT allele and one for the variant allele). Variant sites offering only non-preferred junctions (CG, GT, GG and GC) were excluded. The two bases of a ligation junction are the last base of the RBD5 (RNA-binding domain) probe and the first base of the RBD3 probe. Preferred junctions were always prioritized over neutral junctions unless a neutral junction was needed to avoid hairpins, homopolymer regions, dimers or unfavourable annealing temperatures. The probe lengths of RBD5 and RBD3 were then adjusted from the starting length of 21 bp to target a melting temperature between 50 °C and 70 °C for each probe (68 °C to 82 °C in total). Variant sites whose probes were predicted by IDT’s OligoAnalyzer to form dimers or hairpins were excluded. Variant sites with homopolymer regions in the RBD5 or RBD3 probes were also excluded.
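The junction-screening rule (junction = last base of RBD5 + first base of RBD3; exclude a variant site only when every candidate junction is non-preferred) can be sketched directly; the probe sequences and helper names below are hypothetical.

```python
NON_PREFERRED = {"CG", "GT", "GG", "GC"}

def junction(rbd5, rbd3):
    """Ligation junction: last base of the RBD5 probe + first base of RBD3."""
    return rbd5[-1] + rbd3[0]

def site_usable(junctions):
    """A variant site is excluded only when every candidate junction is
    one of the non-preferred pairs (CG, GT, GG or GC)."""
    return any(j not in NON_PREFERRED for j in junctions)

# Hypothetical 21-bp probe arms flanking a variant position.
rbd5 = "ACGTACGTACGTACGTACGTA"   # ends in A
rbd3 = "TGCATGCATGCATGCATGCAT"   # starts with T
```

In the real design flow, the surviving junctions would then be ranked (preferred over neutral) before the Tm-driven length adjustment.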

  Here, we used the deconvolution results and cell-type-specific expression in the snRNA-seq data to deconvolve the Visium ST expression data (Supplementary Fig. 9). In brief, for a given gene1, we first calculated the mean expression of gene1 for each cell type in the matched snRNA-seq data, then filtered out cell types in which the expression of this gene was <5% of the highest mean expression, and divided the mean expression of each remaining cell type by the sum of all mean expressions, creating an expression contribution for each cell type. Then, for a given spot, the contribution of each cell type was multiplied by the cell type proportions from the deconvolution results (for example, RCTD) and normalized to 1, giving the final expression contribution matrix (Wn). For example, in Supplementary Fig. 9a, gene1 has contributions of 40%, 30% and 30% for cell types A, B and C, respectively, based on the filtered snRNA-seq expression. For spot1, because only cell type B is present in the spot, normalization gives a final 100% contribution of gene1 to cell type B in spot1. Spot2 contains 50% cell type A and 50% cell type B; therefore, the normalized contributions in spot2 are 50% × 40%/(50% × 40% + 50% × 30%) ≈ 57.1% for cell type A and 50% × 30%/(50% × 40% + 50% × 30%) ≈ 42.9% for cell type B. The observed spot-level expression of gene1 (for example, 5 in spot1 and 20 in spot2) is then multiplied by the corresponding cell-type contributions to obtain the final deconvolved expression values: spot1 cell type B = 5, spot2 cell type A ≈ 11.42 and spot2 cell type B ≈ 8.58.
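The worked example from Supplementary Fig. 9a can be reproduced in a few lines of NumPy; the function name is hypothetical, and the contribution and expression values are those of the example (40%/30%/30% contributions; spot totals of 5 and 20).

```python
import numpy as np

# Cell-type expression contributions for gene1 after snRNA-seq filtering
# (cell types A, B, C from the worked example).
contrib = np.array([0.40, 0.30, 0.30])

def deconvolve_spot(cell_props, spot_expr, contrib=contrib):
    """Weight contributions by deconvolved cell-type proportions (e.g. RCTD),
    renormalize to 1, and split the spot's expression accordingly."""
    w = contrib * cell_props
    w = w / w.sum()
    return w * spot_expr

# spot1: only cell type B present, total gene1 expression of 5.
spot1 = deconvolve_spot(np.array([0.0, 1.0, 0.0]), 5.0)
# spot2: 50% cell type A and 50% cell type B, total gene1 expression of 20.
spot2 = deconvolve_spot(np.array([0.5, 0.5, 0.0]), 20.0)
```

spot2 splits its expression of 20 in the ratio 4:3 between cell types A and B (≈11.42 and ≈8.58), matching the text.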

  Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
