性腺激素受体调控基因是脑性别差异的基础_国际

　　所有动物都保持在12小时的光/12小时黑暗周期上，并随意提供食物和水。所有小鼠实验均根据CSHL机构动物护理和使用委员会规定的严格指南进行。将所有动物随机分配给实验组。ESR1CRE（参考文献50），RPL22HA（参考51），Rosa26Cag-Sun1 – Sfgfp-sfgfp – Myc（参考文献52）;从Jackson Labs获得了缩写为Sun1 – GFP），VGATCRE（参考文献53）和C57BL6/J野生型小鼠。从S. A. Khan54收到ESR1LX小鼠。成年男性和雌性小鼠在8至12周之间使用。对于成年激素治疗实验，在皮下服用5μgE2（Sigma E8515）后4小时，将动物安乐死以进行组织收集，悬浮在玉米油（Sigma C8267）或载后进行后3周。对于新生儿切割和运行，ATAC-SEQ和RNA-SEQ实验，在P0上用5μgE2或媒介物处理动物，并在4小时后（ERα切和运行）或4天后收集（ATAC – SEQ和核RNA-SEQ）。对于新生儿多组，snRNA-seq和如果定量，则在P4（Multiome）或P14（Multiome，SnRNA-Seq以及染色）上收集动物。

　　细胞系包括MHYPOA CLU-175克隆（Cedarlane Labs）和MCF-7（ATCC）。细胞系未测试支原体污染。将细胞维持在补充10％FBS和青霉素/链霉素的标准DMEM中。在切割和运行之前，将MCF7细胞在含有10％木炭成FBS和青霉素/链霉素的无酚红色DMEM培养基中生长48小时，然后用20 nm 17-β-雌激素或媒介物（0.002％乙醇）处理45分钟。

　　如前所述进行实验55。简而言之，在迅速斩首的成年ESR1CRE/+; RPL22HA/+小鼠后，BNSTP被微解剖。使用Nugen Ovation RNA-Seq试剂盒（7102和0344），从四个生物复制样品（每个由8-9个合并动物组成）（每只由8-9种合并动物组成）中制备组织均质化，免疫沉淀和RNA提取，并从四个生物学复制样品（每个由8-9种合并动物组成）制备文库。在Illumina NextSeq上使用76 bp的单端读取对多路复用的库进行测序。如先前所述的16,55，通过对实验条件蒙蔽的研究者进行了原位杂交染色和定量验证。肋骨序列在扩展数据表1中列出。

　　成年ESR1CRE/+; SUN1 – GFPLX/+小鼠（每个条件下四个）用氯胺酮/右美托咪定深入麻醉。跨越BNSTP的500μM的切片在冰上的成年小鼠脑基质（肯特科学）中收集。The BNSTp was microdissected and collected in 1 ml of cold supplemented homogenization buffer (250 mM sucrose, 25 mM KCl, 5 mM MgCl2, 120 mM tricine-KOH, pH 7.8), containing 1 mM dithiothreitol, 0.15 mM spermine, 0.5 mM spermidine and 1× EDTA-free PIC (Sigma Aldrich11873580001）。将组织在带有松散的杵的1-ml玻璃组织研磨机（惠顿）中匀浆。接下来，添加了0.3％的igepal Ca-630，并用牢固的杵将悬架匀浆五次。通过40μm滤网过滤匀浆，然后在4°C下以500g离心15分钟。将沉淀重悬于0.5 mL均质化缓冲液中，其中含有1 mM二硫代醇，0.15 mM的精子，0.5 mM的精子和1×EDTA无PIC。使用Sony SH800S细胞分选仪（纯度模式），总共将30,000个GFP+核收集到冷ATAC-RSB（10 mM Tris-HCl pH 7.5，10 mM NaCl，3 mM MGCL2）中。排序后，加入0.1％的Tween-20，将核在4°C下以500克离心5分钟。将核的沉淀直接重悬于转座反应混合物中。

　　使用OMNI-ATAC协议56进行TN5换位。转座反应中使用了2.5μl的TN5酶（Illumina 20034197）。按照标准协议，使用NEBNEXT高保真2×PCR Master Mix（NEB M0541L）制备库。在最初的五个循环放大循环之后，根据qPCR优化，另外四个周期添加了。放大后，使用Ampure XP珠（Beckman Coulter a63880）选择了两次尺寸（0.5×–1.8×），以去除残留的引物和大基因组DNA。使用MID或高输出套件在Illumina NextSeq上使用配对的76 bp读取单独的条形码库并用配对的76 bp读取。

　　为了收集细胞进行切割，用汉克的缓冲盐溶液（HBSS）洗涤两次细胞，并在37°C/5％CO2下与预热的0.5％胰蛋白酶-EDTA（10×）一起孵育5分钟。用补充10％FBS和青霉素/链霉素（MHYPOA细胞）的DMEM将胰蛋白酶灭活，或补充了10％木炭剥离FBS和青霉素/链霉菌素（MCF-7细胞）的无苯酚红色DMEM（MCF-7细胞）。胰蛋白酶化后，将细胞在15毫升圆锥管中以500克离心，并重悬于新鲜培养基中。如前所述，进行剪切和运行，并进行了少量修改。将细胞在洗涤缓冲液中两次洗涤（20 mM HEPES，pH 7.5，150 mm NaCl，0.5 mm的精子素，1×PIC，0.02％digitonin）。在Coustess II FL自动细胞计数器（Thermo Fisher）上测量细胞浓度。每个样品总共使用了25,000个细胞。Cells were bound to 20 μl concanavalin A beads (Bangs Laboratories, BP531), washed twice in wash buffer, and incubated overnight with primary antibody (ERα: Santa Cruz sc-8002 or EMD Millipore Sigma 06-935, Nfix: Abcam ab101341) diluted 1:100 in antibody buffer (wash buffer containing 2 mMEDTA）。第二天，将细胞在洗涤缓冲液中两次洗涤，并加入700 ng ml-1蛋白A-MNase（PA-MNase，在内部制备）。在4°C下孵育1小时后，将细胞用洗涤缓冲液洗涤两次，并将其放在冰上的金属热块中。使用2 mM CaCl2启动PA-MNase消化。90分钟后，通过将1：1与2×停止缓冲液混合（340 mM NaCl，20 mM EDTA，4 mM EGTA，50μgml -1 rNase A，50μgml -1糖原，0.02％Digitonin）来停止消化。通过在37°C下孵育10分钟，释放消化的片段，然后在4°C下以16,000克离心5分钟。如前所述14，通过苯酚 - 氯仿提取从上清液中纯化DNA。

　　如先前所述，从解剖学的C57BL6/J小鼠中分离出核的POA，BNSTP和MEAP，如前所述57（图1A），如前所述52。组织发作后，将脑匀浆与50％的OPTIPREP溶液混合，并在38.5 mL超清晰的管中（Beckman-Coulter C14292），用4.8 mL的30％，然后是40％的OptipRep溶液将脑匀浆。用9,200 R.P.M.的Beckman SW-28摇杆转子进行超速离心。在4°C下18分钟。超速离心后，通过直接管穿刺从30/40％的OptipRep界面收集了约1.5 ml的核悬浮液，并用3-ml注射器连接到18号针头。在伯爵夫人II FL自动细胞计数器上测量核浓度。对于ERα切割和运行（1：100，EMD Millipore Sigma 06-935），从BNST，MPOA和MEA分离出400,000个核。对于Nfix切割和运行（1：100，ABCAM AB101341），从五只动物的BNSTP中分离出200,000个核。总共使用400,000个皮质核进行切割和运行的IgG对照（1：100，抗体 - 在线ABIN101961）。在珠子结合之前，将0.4％的igepal Ca-630添加到核悬浮液中，以增加对contanavalin a磁珠的亲和力。如上所述进行所有后续步骤，并使用修饰的洗涤缓冲液（20 mM HEPES，pH 7.5，150 mm NaCl，0.1％BSA，0.5 mm精子定，1×PIC）。

　　Cell line CUT&RUN libraries were prepared using the SMARTer ThruPLEX DNA-seq Kit (Takara Bio R400676), with the following PCR conditions: 72 °C for 3 min, 85 °C for 2 min, 98 °C for 2 min, (98 °C for 20 s, 67 °C for 20 s, 72 °C for 30 s) × 4 cycles, (98 °C for 20 s,72°C为15 s）×14个周期（MCF7）或10个周期（MHYPOA）。使用带有10个PCR周期的相同套件制备了大脑切割和运行的文库。所有样品均选择具有Ampure XP珠（0.5×–1.7×）的尺寸，以去除残留的适配器和大基因组DNA。使用MID或高输出套件在Illumina NextSeq上使用配对的76 bp读取单独的条形码库并用配对的76 bp读取。对于MHYPOA实验，将样品用配对的25 bp读取在Illumina Miseq上。

　　Brains were dissected from perfused P14 Esr1Cre/+;Sun1–GFPlx/+ animals and cryosectioned at 40 μm before immunostaining with primary antibodies to GFP (1:1,000, Aves GFP-1020) and Nfix (1:1,000, Thermo Fisher PA5-30897), and secondary antibodies against chicken (1:300, Jackson Immuno如前所述16，703-545-155）和兔子（1：800，杰克逊免疫711-165-152）。配备MBF Neurolucida软件的Zeiss AxioImager M2系统用于采用BNSTP跨越20×宽场图像堆栈（五个部分，两侧）。NFIX+，GFP+和NFIX+ GFP+细胞的数量使用中心的Fiji/ImageJ进行了三个光学切片的量子，该切片被视而不见条件。

　　皮下向5μgE2或p0上的雌性和雄性ESR1CRE/+; SUN1 -GFPLX/+小鼠在P4上收集（每种条件汇集的4-5只动物，每次重复）。如上所述，对BNSTP进行了微解析，并收集在300μl冷的补充均质缓冲液中。如成人大脑所述提取细胞核。通过40μm滤网过滤后，将细胞核用600μl冷的，补充的均质缓冲液稀释3：1，并立即用于分类。使用100μm排序芯片，将总共30,000个GFP+核收集到冷ATAC-RSB缓冲液中。排序后，如上所述进行了核换位和文库制备。

　　雌性ESR1CRE/+; SUN1 – GFPLX/+小鼠在皮下注射5μgE2或在P0上载体，并在4小时后收集（每种条件合并5只动物，每次重复）。如新生儿ATAC -SEQ实验所述，对BNSTP，MPOA和MEA进行了微分解，并提取了核。通过40μm滤网过滤后，将细胞核用600μl冷的，补充的均化缓冲液稀释3：1。添加了2 mM浓度的EDTA，并立即将样品用于排序。使用100μm排序芯片，将共收集150,000 GFP+核收集到冷切和运行洗涤缓冲液中。将GFP-事件收集到冷切和运行洗涤缓冲液中，随后将150,000个核对ERα-和IgG负面控制切割和运行的Countess II FL自动化细胞计数器计数。所有随后的步骤均如描述的成人脑切割和运行实验所述。使用10个PCR周期制备P0切割和运行库。

　　如上所述，BNST从P4和P14女性和雌性ESR1CRE/+; SUN1 -GFPLX/+小鼠中进行了微分解（如上所述（每种条件合并的4-5只动物）。提取细胞核，并准备对新生儿散装ATAC -SEQ实验进行排序，并在均质化缓冲液中包含1UμL -1保护剂RNase RNase抑制剂（Sigma）。总共将40,000-50,000个GFP+核收集到1 mL冷的ATAC-RSB缓冲液中，并补充了0.1％Tween-20，0.01％Digitonin，2％无菌过滤的BSA（Sigma A9576）和1UμL-1uμl-1保护剂RNase RNase RNase抗抑制剂。将细胞核在4°C下以500g的摇摆式转子离心10分钟。仔细去除约950μl的上清液，并将200μl10x基因组学稀释核缓冲液添加到管的侧面，而不会干扰颗粒。将细胞核在4°C下以500克再次离心10分钟。仔细去除约240μl上清液，并将核重悬于其余体积（约7μl）中。按照制造商的说明，将样品立即用于10倍基因组学单细胞多组ATAC +基因表达试剂盒（1000285）。使用高输出套件，在Illumina NextSeq上对SnRNA-Seq和Snatac – Seq库进行了测序。每个样品的深度为SNATAC库的每个单元格的深度约为40,000–80,000个平均读取，而SNRNA库的每个单元格的平均读数约为40,000-50,000个平均读数。

　　BNSTP从p14雌性和雄性VGATCRE; ESR1+/+; SUN1 – GFPLX以及雄性VGATCRE； ESR1LX/LX; SUN1 – GFPLX小鼠进行了微分解。立即将来自单个动物的组织样品闪烁在乙醇干冰浴中，并在-80°C下储存，直到每组收集n = 3只动物。在实验当天，将组织样品从-80°C去除并保持在干冰上。随着组织仍然冷冻，将冷，补充的均质缓冲液加入管中，并立即将组织转移到玻璃均质器中，并机械散发和过滤，如我们的其他新生儿实验所述。将总共80,000–90,000个GFP+细胞核收集到100μl冷的ATAC-RSB缓冲液中，并在0.5-mL DNA Lo-bindibe（Eppendor）中补充了1％无菌过滤BSA（Sigma A9576）和1UμL-1保护剂RNase RNase RNase RNase RNase RNase抑制剂。收集后，将核在4°C下的摆动式离心机中用两轮温和离心（200g持续1分钟）固定。在第二轮之后，仔细去除上清液，在管中留下约40μl。根据制造商的说明，将核轻轻地重悬于该剩余体积中，并立即用于10倍基因组单细胞3'基因表达试剂盒V3（1000424）。每个生物样品分为两个10×泳道，产生6个库，这些文库被合并并在Illumina NextSeq 2000上测序，深度约为45,000–60,000个平均每个单元的平均读数。

　　雌性ESR1CRE/+; SUN1 – GFPLX/+小鼠在皮下向5μgE2或P0上的媒介物皮下注入。四天后，将动物迅速被斩首，并使用微型型（Thermo Scientific Microm HM 650V）在冷均质缓冲液中收集400μm的切片。BNST被微解释（每条条件下4个动物），并收集在1 ml冷的，补充含有0.4 u ml -1 rnaseout的均匀的均化缓冲液中（Thermo Fisher，10777019）。如新生儿散装ATAC -SEQ所述分离核。使用Sony SH800S细胞分选仪（纯度模式），总共将12,000个GFP+核与1：100补充1：100的冷缓冲RLT和补充1：100。将核裂解物存储在-80°C下，直到收集所有重复。将所有重复的核样品在冰上解冻，并使用Qiagen rneasy Plus微型试剂盒（74034）分离RNA。根据制造商的指南，使用Ovation Solo RNA-Seq系统（Tecan Genomics，0501-32）制备了链特异性的RNA-Seq库。使用MID输出套件将单个条形码的库多路复用并用单端76 bp读取。

　　对配对末端的读数进行修剪以去除Illumina适配器和低质量的基本（Casadapt -Q 30）58。使用Bowtie2（参考文献59）和以下标志对修剪的读数对齐： - 燕尾 - 非常敏感的局部 - 毫无疑问 - 无 - 没有混合 - 没有 - 没有 - 没有 - phred33。使用PICARD（http://broadinstitute.github.io/picard/）markduplicates（remove_duplicates = true）删除了重复的读数。通过映射质量60（Samtools View -Q 40）和片段长度（DeepTools Arignmentsive-MaxFragmentLength 120）来过滤读取。读取与线粒体染色体的对齐，并使用Samtools除去了不完整的组件。过滤后，使用MACS2 CallPeak（-min-min-Length 25 -q 0.01）62在单个重复BAM文件上调用峰值。为了识别样品之间的共识NFIX峰，MacS2 CallPeak是在跨生物重复（n = 2）合并的BAM文件上进行的，随后在治疗和性别中相交。在下游分析之前，使用Bedtools相交（-v）63去除IgG对照中调用峰的TF峰。

　　使用difbind v2.10.0（参考文献64）执行切割和运行的差分峰调用。由单个重复BAM和MACS2 NarrowPeak文件（每个条件n = 2）创建一个计数矩阵。共识峰最近在最高读取密度的点附近达到±100 bp（summits = 100）。建立了性别和治疗之间的对比度（类别= C（DBA_处理，DBA_Condition）），并且使用EDGER 65进行差异峰值调用。PADJ的差分ERα峰< 0.1 were used for downstream analysis. For Nfix, differential peaks with a Padj < 0.1 and abs(log2[FC]) > 1 were used for downstream analysis. Differential peak calling for the MCF-7 CUT&RUN experiment was performed with DESeq2 (Padj < 0.1) in DiffBind. Differential peak calling for the P0 ERα CUT&RUN experiment was performed with DESeq2 (Padj < 0.01) in DiffBind. To identify sex-dependent, oestradiol-responsive peaks for adult brain ERα CUT&RUN, the DiffBind consensus peakset count matrix was used as input to edgeR, and an interaction between sex and treatment was tested with glmQLFTest.

　　Brain E2-induced ERα CUT&RUN peaks were annotated to NCBI RefSeq mm10 genes using ChIPseeker66. DeepTools plotHeatmap was used to plot ERα CUT&RUN (Fig. 1b), representing CPM-normalized bigwig files pooled across replicate and sex per condition, at E2-induced ERα peaks. Heatmaps of individual ERα CUT&RUN replicates are shown in Extended Data Fig. 2. CUT&RUNTools 67 was used to plot ERα CUT&RUN fragment ends surrounding ESR1 motifs (JASPAR MA0112.3) in E2-induced ERα ChIP–seq peaks. BETA (basic mode, -d 500000)68 was used to determine whether ERα peaks were significantly overrepresented at E2-regulated RNA-seq genes (P < 0.01), as well as sex-dependent E2-regulated genes (P < 0.01), compared to non-differential, expressed genes. Motif enrichment analysis of ERα peaks was performed with AME69 using the 2020 JASPAR core non-redundant vertebrate database. Motif enrichment analysis was performed using a control file consisting of shuffled primary sequences that preserves the frequency of k-mers (--control --shuffle--). The following seven ERα ChIP–seq files were lifted over to mm10 using UCSC liftOver and intersected with E2-induced ERα peaks to identify brain-specific and shared (≥4 intersections) ERα-binding sites: uterus (intersection of GEO: GSE36455 (uterus 1)70 and GEO: GSE49993 (uterus 2)71), liver (intersection of GEO: GSE49993 (liver 1)71 and GEO: GSE52351 (liver 2)72), aorta72 (GEO: GSE52351), efferent ductules73 (Supplementary Information) and mammary gland74 (GEO: GSE130032). ClusterProfiler75 was used to identify associations between brain-specific and shared ERα peak-annotated genes and Gene ontology (GO) biological process terms (enrichGO, ont = 'BP', Padj < 0.1). For Disease ontology (DO) and HUGO Gene Nomenclature Committee (HGNC) gene family enrichment, brain-specific ERα peak-associated gene symbols were converted from mouse to human using bioMart76 and then analysed with DOSE77; enrichDO, Padj < 0.1) and enricher (Padj < 0.1). Log-odds ESR1 and ESR2 motif scores in brain-specific and shared ERα peaks were calculated with FIMO78, using default parameters.

　　MCF7 ERα CUT&RUN data were compared to MCF7 ERα ChIP–seq data from ref. 79 (GEO: GSE59530). Single-end ChIP–seq fastq files for two vehicle-treated and two 17β-oestradiol (E2)-treated IP and input samples were accessed from the Sequence Read Archive and processed identically to ERα CUT&RUN data, with the exception of fragment size filtering. Differential ERα ChIP–seq peak calling was performed using DiffBind DESeq2 (Padj < 0.01). DeepTools was used to plot CPM-normalized ERα CUT&RUN signal at E2-induced ERα ChIP–seq binding sites. DREME80 and AME were used to compare de novo and enriched motifs between E2-induced MCF7 ERα CUT&RUN and ChIP–seq peaks.

　　Reads were adapter trimmed and quality filtered (q > 30) (http://hannonlab.cshl.edu/fastx_toolkit/), and then mapped to the mm10 reference genome using STAR81. The number of reads mapping to the exons of each gene was counted with featureCounts82, using the NCBI RefSeq mm10 gene annotation. Differential gene expression analysis was performed using DESeq2 (ref. 83) with the following designs: effect of treatment (design = ~ batch + hormone), effect of sex (design = ~ batch + sex), two-way comparison of treatment and sex (design = ~ batch + hormone_sex), four-way comparison (design = ~ 0 + hormone_sex) and sex–treatment interaction (design = ~ batch + sex + hormone + sex:hormone).

　　ATAC–seq data were processed using the ENCODE ATAC–seq pipeline (https://github.com/ENCODE-DCC/atac-seq-pipeline) with default parameters. To generate CPM-normalized bigwig tracks, quality-filtered, Tn5-shifted BAM files were converted to CPM-normalized bigwig files using DeepTools bamCoverage (--binSize 1 --normalizeUsing CPM).

　　ATAC–seq differential peak calling was performed with DiffBind v2.10.0. A DiffBind dba object was created from individual replicate BAM and MACS2 narrowPeak files (n = 3 per condition). A count matrix was created with dba.count, and consensus peaks were recentred to ±250 bp around the point of highest read density (summits = 250). Contrasts between sex and treatment were established (categories = c(DBA_TREATMENT, DBA_CONDITION)), and edgeR was used for differential peak calling. Differential peaks with an FDR < 0.05 and abs(log2[FC]) > 1 or abs(log2[FC] )>0用于下游分析。Deeptools ComputeMatrix和PlotheatMap用于在E2-OPEN ATAC峰处绘制平均ATAC CPM。为了识别性别依赖性的雌二醇响应峰，将差异共识峰值计数矩阵用作EDGER的输入，并用GLMQLFTEST测试了性别与治疗之间的相互作用。使用chipseeker将E2-OPEN ATAC峰和总车辆或E2 ATAC峰（在每个治疗条件的重复和性别相交）被注释到NCBI REFSEQ MM10基因。簇式膜用于计算GO生物过程项的富集。如上所述，对E2-OPEN ATAC峰相关基因进行了DO和HGNC基因家族富集，如上所述，用于ERα切割和运行分析。Beta（基本模式，-D 500000）68用于确定在E2调节的RNA-SEQ基因下是否明显过分代表E2-OPEN ATAC峰（P< 0.01), as well as sex-dependent E2-regulated genes (P < 0.01), compared to non-differential, expressed genes. Motif enrichment analysis of E2-open ATAC peaks was performed with AME, using the 2020 JASPAR core non-redundant vertebrate database. FIMO was used to determine the percentage of E2-open ATAC peaks containing the enriched motifs shown in Extended Data Fig. 4h, i.

　　ATAC–seq differential peak calling and comparison between gonadally intact (abbreviated as intact) and GDX ATAC samples were performed with DiffBind v2.10.0 and edgeR. A DiffBind dba object was created from individual replicate BAM and MACS2 narrowPeak files for the four groups: female intact (n = 2), male intact (n = 2), female GDX vehicle treated (n = 3), male GDX vehicle treated (n = 3). A count matrix was created with dba.count, and consensus peaks were recentred to ±250 bp around the point of highest read density (summits = 250). The consensus peakset count matrix was subsequently used as input to edgeR. Differential peaks (abs(log2[FC]) > 1, Padj < 0.05) were calculated between female intact and male intact and between female GDX vehicle treated and male GDX vehicle -treated groups using glmQLFTest. BETA was used to assess statistical association between gonadally intact, sex-biased ATAC peaks and sex DEGs called in BNSTp Esr1+ snRNA-seq clusters (top 500 genes per cluster, ranked by Padj). Sex DEGs ranked by ATAC regulatory potential score68, a metric that reflects the number of sex-biased peaks and distance of sex-biased peaks to the TSS, are shown in Extended Data Fig. 7g. HGNC gene family enrichment was performed on sex DEGs, using a background of expressed genes in any of the seven BNSTp Esr1+ clusters.

　　To identify differential peaks across the four conditions, an ANOVA-like design was created in edgeR by specifying multiple coefficients in glmQLFTest (coefficient = 2:4). A matrix of normalized counts in these differential peaks (Padj < 0.01) was clustered using k-means clustering (kmeans function in R), with k = 4 and iter.max = 50. For each k-means cluster, the cluster centroid was computed, and outlier peaks in each cluster were excluded on the basis of having low Pearson’s correlation with the cluster centroid (R < 0.8). Depth-normalized ATAC CPM values in these peak clusters are shown in Fig. 2i (mean across biological replicates per group) and Extended Data Fig. 7 (individual biological replicates). Peak cluster overlap with E2-open ATAC loci (abs(log2[FC]) > 0, Padj < 0.05) was computed with bedtools intersect (-wa). For each peak cluster, motif enrichment analysis was performed by first generating a background peak list (matching in GC content and accessibility) from the consensus ATAC peak matrix using chromVAR (addGCBias, getBackgroundPeaks)84, and then calculating enrichment with AME using the background peak list as the control (--control background peaks). In Fig. 2i, the JASPAR 2020 AR motif (MA0007.3) is labelled as ARE, and the ESR2 motif (MA0258.2) is labelled as ERE.

　　Mouse BNST snRNA-seq data containing 76,693 neurons across 7 adult female and 8 adult male biological replicates26 were accessed from GEO: GSE126836 and loaded into a Seurat object85. Mouse MPOA single-cell RNA-seq data containing 31,299 cells across 3 adult female and 3 adult male biological replicates32 were accessed from GEO: GSE113576 and loaded into a Seurat object. Cluster identity, replicate and sex were added as metadata features to each Seurat object. Pseudo-bulk RNA-seq analysis was performed to identify sex differences in gene expression in the BNST snRNA-seq dataset. Briefly, the Seurat object was converted to a SingleCellExperiment object (as.SingleCellExperiment). Genes were filtered by expression (genes with >1 count in ≥5 nuclei). NCBI-predicted genes were removed. For each cluster, nuclei annotated to the cluster were subsetted from the main Seurat object. Biological replicates containing ≤20 nuclei in the subsetted cluster were excluded. Gene counts were summed for each biological replicate in each cluster. Differential gene expression analysis across sex in each cluster was performed on the filtered, aggregated count matrix using DESeq2 (design = ~ sex) with alpha = 0.1. The BNSTp_Cplx3 cluster was excluded, as none of the replicates in this cluster contained more than 20 nuclei. Clusters containing ≥25% nuclei with ≥1 Esr1 counts in the main Seurat object were classified as Esr1+ (i1:Nfix, i2:Tac2, i3:Esr2, i4:Bnc2, i5:Haus4, i6:Epsti1, i7:Nxph2, i8:Zeb2, i9:Th, i10:Synpo2, i11:C1ql3, i12:Esr1, i13:Avp, i14:Gli3). To identify TFs that correlate with sex DEG number per cluster (Fig. 2g), a linear regression model with percentage of TF expression as the predictor variable and sex DEG number per cluster as the response variable was generated using the lm function in R stats (formula = percentage of TF expression ~ DEG number). This model was tested for all TFs in the SCENIC86 mm10 database. All TFs were then ranked by R2 to identify those most predictive of sex DEG number, and the ranked R2 values are shown in Fig. 2g.

　　To visualize BNSTp Esr1+ snRNA-seq data (Fig. 2a), BNSTp Esr1+ clusters were subsetted from the main Seurat object. Gene counts were normalized and log transformed (LogNormalize), and the top 2,000 variable features were identified using FindVariableFeatures (selection.method = vst). Gene counts were scaled, and linear dimensionality reduction was performed by principal component analysis (runPCA, npcs = 10). BNSTp Esr1+ clusters were visualized with UMAP (runUMAP, dims = 10). To generate the heatmaps in Extended Data Fig. 7a, pseudo-bulk counts for each biological replicate included in the analysis were normalized and transformed with variance-stabilizing transformation (DESeq2 vst), subsetted for sex-biased genes in each cluster, and z-scaled across pseudo-bulk replicates.

　　To examine differential abundance of BNSTp Esr1+ clusters between sexes (Fig. 2b), the proportion of total nuclei in each BNSTp Esr1+ cluster was calculated for each biological replicate. After calculating the proportions of nuclei, sample MALE6 was excluded as an outlier for having no detection (0 nuclei) of i1:Nfix and i2:Tac2 clusters and overrepresentation of the i5:Haus4 cluster. The one-sided Wilcoxon rank-sum test (wilcox.test in R stats) was used to test for male-biased abundance of nuclei across biological replicates in each cluster. P values were adjusted for multiple hypothesis testing using the Benjamini–Hochberg procedure (method = fdr).

　　To identify marker genes enriched in the i1:Nfix cluster relative to the remaining six BNSTp Esr1+ clusters (Extended Data Fig. 6b), differential gene expression analysis was performed using DESeq2 with design = ~ cluster_id (betaPrior = TRUE), alpha = 0.01, lfcThreshold = 2, altHypothesis = greater.

　　To identify the enrichment of Lamp5+ subclass markers in BNSTp and MPOA Esr1+ clusters (Extended Data Fig. 6e), a Seurat object was created from the Allen Brain Atlas Cell Types dataset. Gene counts per cell were normalized and log transformed (LogNormalize), and subclass-level marker genes were calculated with the Wilcoxon rank-sum test (FindAllMarkers, test.use = wilcox, min.diff.pct = 0.2). The mean expression of Lamp5+ subclass markers (avg_log[FC] >0.75，padj< 0.05, <40% in non-Lamp5+ subclasses) was calculated in BNSTp and MPOA Esr1+ clusters and visualized using pheatmap.

　　To generate the UMAP plots shown in Extended Data Fig. 6g, BNSTp Esr1+ clusters were integrated with MPOA/BNST Esr1-expressing clusters (e3: Cartpt_Isl1, i18: Gal_Tac2, i20: Gal_Moxd1, i28: Gaba_Six6, i29: Gaba_Igsf1, i38: Kiss1_Th) using Seurat. Anchors were identified between cells from the two datasets, using FindIntegrationAnchors. An integrated expression matrix was generated using IntegrateData (dims = 1:10). The resulting integrated matrix was used for downstream PCA and UMAP visualization (dims = 1:10).

　　metaNeighbor28 was used to quantify the degree of similarity between BNSTp Esr1+ clusters and MPOA Esr1+ clusters and between BNSTp Esr1+ clusters and cortical/hippocampal GABAergic neuron subclasses from the Allen Brain Atlas Cell Types database29. Briefly, the BNST and MPOA Seurat objects were subsetted for Esr1+ clusters, and then transformed and merged into one SingleCellExperiment object. For the BNSTp and cortex comparison, BNSTp Esr1+ clusters were merged into a SingleCellExperiment with cortical/hippocampal GABAergic cortical clusters. Unsupervised metaNeighbor analysis was performed between BNST and MPOA clusters, and between BNST and cortical/hippocampal clusters, using highly variable genes identified across datasets (called with the variableGenes function). The median AUROC value per cortical/hippocampal GABAergic subclass across Allen Brain Atlas datasets for each BNSTp Esr1+ cluster is shown in Fig. 2d.

　　Differential peak calling on the neonatal bulk ATAC–seq experiment was performed with DiffBind v2.10.0 and edgeR. A count matrix was created from individual replicate BAM and MACS2 narrowpeak files (n = 3 per condition). Consensus peaks were recentred to ±250 bp around the point of highest read density (summits = 250), and the consensus peakset count matrix was subsequently used as input to edgeR. Differential peaks across the three treatment groups (NV female, NV male, NE female) were calculated by specifying multiple coefficients in glmQLFTest (coefficient = 4:5). To identify accessibility patterns across differential peaks (Padj < 0.05), a matrix of normalized counts in differential peaks was hierarchically clustered using pheatmap, and the resulting dendrogram tree was cut with k = 6 to achieve 6 peak clusters (Extended Data Fig. 8a). The two largest clusters were identified as having higher accessibility in NV males and NE females compared to NV females (cluster 3, labelled as NE open), or lower accessibility in NV male and NE female compared to NV females (cluster 5, labelled as NE close). Motif enrichment analysis of NE-open peaks was performed with AME using the 2020 JASPAR core non-redundant vertebrate database. GO biological process, DO and HGNC gene family enrichment analyses were performed, as described above for adult GDX treatment ATAC–seq data analysis.

　　Raw sequencing data were processed using the Cell Ranger ARC pipeline (v2.0.0) with the cellranger-arc mm10 reference. Default parameters were used to align reads, count unique fragments or transcripts, and filter high-quality nuclei. Individual HDF5 files for each sample containing RNA counts and ATAC fragments per cell barcode were loaded into Seurat (Read10X_h5). Nuclei with lower-end ATAC and RNA QC metrics (<1,000 ATAC fragments, <500 counts, nucleosomal signal > 3, TSS enrichment < 2) were removed. DoubletFinder87 was then used to remove predicted doublets from each sample (nExp = 9% of nuclei per sample). Following doublet removal, nuclei surpassing upper-end ATAC and RNA QC metrics (>60,000 ATAC fragments, >去除20,000个RNA计数，> 6,000个基因）。过滤后，将每个样品的Seurat对象用于RNA分析并合并。对基因计数进行标准化并进行对数转换（log Normalize），并使用FindVariableFeatures（selection.method ='vst'）确定了前2,000个变量特征。对基因计数进行缩放，从而回归以下变量：RNA计数的数量，RNA基因数量，线粒体计数的百分比和生物学性别。线性维度降低是通过主成分分析（RUNPCA，NPCS = 25）进行的。基于PCA空间中的欧几里得距离构建了k-near-neybours图并精制（Findneighbors，npcs = 25），然后使用Louvain算法（Findclusters，分辨率= 0.8）聚集了核。用UMAP（RunuMap，dims = 25）可视化snRNA簇。为了减少聚类的粒度，从PCA空间（BuildClusterTree）中构建的距离基质（buildClusterTree）中产生了聚类身份的系统发育树，并将其视为树状图（PlotClusterTree）。计算了系统发育树的末端节点中的簇之间的学位（findmarkers，test.use ='wilcox'，padj< 0.05), and clusters were merged if they had fewer than 10 DEGs with the following parameters: >0.5 avg_log [fc]， <10% expression in negative nuclei, and >阳性核中25％的表达。最终的从NOVO SNRNA-SEQ簇显示在扩展数据中图10c。

　　随后使用Seurat将来自新生儿多组数据集的抑制性神经元簇（SLC32A1/GAD2+）分配给成人BNST ESR1+群集标签。成年BNST ESR1+簇（如上所述）是从成年SnRNA-Seq对象子集中的，并随机下采样至5,000个核。与新生儿和成年ESR1+抑制性神经元簇相同的参数进行了归一化，数据缩放和线性维度降低。首先使用FindTransferancher鉴定成人（参考）和新生儿（查询）数据集之间的锚固细胞。参考群集标签以及相应的UMAP结构随后使用MapQuery转移到新生儿数据集中。如前所述，预测分数测量了参考和查询数据集的邻居结构的一致性，用于量化从成年人到新生儿核的标签转移的置信度。扩展数据图10D显示了每个参考群集的预测评分和映射到成人参考群集标签上的核的时间点，以及每个从头群集中映射到每个成人参考群集的每个从头群集的核百分比（预测评分> 0.5）。为了进一步验证成人和新生儿数据集之间的标签转移质量，我们计算了新生儿簇之间的DEG邮政标签传输（findmarkers，test.use.use ='wilcox'，padj< 0.05, min.diff.pct = 0.1, avg_log[FC] >0.5）并计算了其背景提取的新生儿和成年BNST ESR1+核中的平均表达（AddModulesCore）（在扩展数据中可视化图10E）。

　　为了生成伪泡，为每个SNATAC群集的ATAC BIGWIG标准化，我们首先使用SamTools（-Q 30 -F 2）重新处理每个样品的CellRanger ARC输出BAM文件，并使用Picard MarkDuplicates（Barcode_tuplicates（barcode_tuplicate）删除了每个细胞条形码的重复读数（barcode_tag = cb = cb emove_drem emect __dueplicates = true）Sinto（https://timoast.github.io/sinto/）使用从Seurat对象提取的单元格代码将每个群集的ATAC对齐分为单个BAM文件。使用DeepTools BamCoverage（-binsize 1- normalizeused cpm）计算每个伪堆BAM文件的CPM标准化大wig文件。

　　为了分析新生儿多组SNATAC数据，我们使用了ARCHR88。为每个多组示例创建了单独的箭头文件，然后合并到一个单个ARCHR项目中。在创建箭头文件时计算每个核的基因活性评分（AddGenesCoremat = true）。元数据（群集标签，性别，时间和QC指标）通过细胞条形码匹配转移到先前生成的Seurat对象到ARCHR项目。使用Archr的迭代潜在语义索引方法（Additerativelsi）对SNATAC数据进行了降低。使用魔术89在ARCRR（addimputeWeaights）中添加人均归类权重，以Denoise Sparse ATAC数据以进行UMAP可视化。使用Archr的迭代重叠峰合并方法（addReproduciblePeaks，groupby ='cluster'）进行群集感知的ATAC峰调用。峰值呼叫后，为每个峰（addPeakantation）添加了CISBP人类基序注释，并计算了每个基序的Chromvar偏差得分（AddDeviationsMatrix）。此外，Chromvar用于计算共识BNSTP NFIX切割峰的每核偏差分数。为了执行神经元同一性调节剂分析（扩展数据图10G），计算了CISBP基序数据库中所有TF的TF RNA表达和基序偏差评分之间的相关性。相关系数> 0.5的TF和前50％的每个群集之间的最大TF RNA log2 [FC]值分类为神经元身份调节剂（扩展数据中的粉红色图10G）。

　　为了可视化基因活性和CISBP基序偏差得分（图3C和扩展数据图10G），将分数（Imputematrix）估算，通过细胞条形码匹配转移到原始的Seurat对象，并使用目录图可视化。SIGNAC90用于为每个样品生成和存储逐个峰值计数矩阵。计算每个群集的SNATAC标记（FindAllmarkers，test.use ='lr'，vars.to.to.to.regress ='ncount_atac'，min.pct = 0.1，min.diff.pct = 0.05，logfc.threshord = 0.15）。使用Deeptools Multibigsummary计算每个标记峰的伪泡Snatac群集CPM，并使用pheatmap可视化（扩展数据图10F）。使用FindMotifs对每个群集的SNATAC标记峰进行了基元富集分析。每个SNATAC群集的前三个富集基序显示在扩展数据中。

　　为了鉴定跨P4 SNATAC簇的Ne-Open基因座的性偏见富集（图3D），我们首先过滤了低含量的P4 SNATAC簇（<400 nuclei), and then computed the difference in ATAC CPM between males and females at NE-open loci in each cluster. Differential ATAC CPM values were scaled across clusters, then grouped using k-means clustering (k = 12, iter.max = 50) and visualized with pheatmap (Fig. 3d). To call sex DEGs (Padj < 0.05) in each cluster and time point, we used MAST91 in Seurat (FindMarkers, test.use = 'MAST', min.pct = 0.05, logfc.threshold = 0.2, latent.vars = 'nFeature_RNA', 'nCount_RNA').

　　To link NE-regulated loci to sex DEGs at P4 and P14 (Fig. 3e and Extended Data Fig. 11h), we computed the Pearson correlation coefficient between sex DEG expression and NE-regulated peak accessibility for each cluster (linkPeaks, min.distance = 2,000, distance = 1,000,000, min.cells = 2% of cluster size). Sex DEG log2[FC] values and NE-regulated ATAC site correlation coefficients were hierarchically clustered and visualized using ComplexHeatmap92.

　　Raw sequencing data were processed using the Cell Ranger pipeline (v6.0.0) with the refdata-gex-mm10-2020-A reference. Default parameters were used to align reads, count unique transcripts and filter high-quality nuclei. Individual HDF5 files for each sample were loaded into Seurat. Nuclei with lower-end RNA QC metrics (<1,000 counts) were removed. DoubletFinder87 was then used to remove predicted doublets from each sample (nExp = 9% of nuclei per sample). Following doublet removal, nuclei surpassing upper-end RNA QC metrics (>20,000 counts, >去除了6,000个基因）。过滤后，合并了Seurat对象。如单核多组数据处理所述，对基因计数进行了标准化和缩放。

　　使用seurat将p14 snRNA-seq数据集分配给成人BNST抑制簇标记。从成年的SnRNA-Seq对象将成年BNST抑制簇取代，并随机下采样至10,000个核。对于P14和成人抑制性神经元簇，进行了归一化，数据缩放和线性维度降低。然后，如单核多组数据处理所述进行标签传输。扩展数据图12b显示了映射到成人参考群集标签上的P14核的预测评分。为了验证成人和p14数据集之间的标签转移质量，我们在上述标签传输后计算了p14簇之间的DEG，并计算了其背景减少的平均表达（addModulesCore）和成人BNST抑制簇（显示在扩展数据中图12C中）。对照女性和对照或有条件的ERαKO雄性之间的性别为每个p14簇，如上所述，用于多组分析。计算每个组的群集丰度，并在扩展数据中绘制图12d。

　　修剪读数以去除Illumina适配器和低质量的基本词（CASADAPT -Q 30），然后使用Star映射到MM10参考基因组。使用nudup.py python软件包（https://github.com/tecangenomics/nudup）删除了技术重复读数（具有相同链方向和相同的链方向和相同分子标识符的相同的起始和终端位置）。使用mm10.refgene.gtf文件计算每个链（-s 1）上每个基因（包括内含子）的读取映射数（包括内含子）。通过表达预滤基因后（Rowmeans≥5），使用DESEQ2（设计=〜处理）进行差异基因表达分析。

　　有关研究设计的更多信息可在与本文有关的自然研究报告摘要中获得。

左文资讯声明：未经许可，不得转载。