Nucleosome fibre topology guides transcription factor binding to enhancers

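　　All animal experiments for the generation of iPS cells and iTS cells from MEFs were approved by the University of Edinburgh Animal Welfare and Ethical Review Body, performed at the University of Edinburgh and carried out according to the regulations specified by the Home Office and the project licence. All reprogramming experiments were approved by the University of Edinburgh SBS Ethics Committee (ASOUFI-0001). The study was performed according to the joint ethics committee (IACUC) of the Hebrew University and Hadassah Medical Center, the National Ethics Committee (Israel Ministry of Health) and the NIH, which approved the study protocol for animal welfare. The Hebrew University is an AAALAC International accredited institute.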

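　　Primary MEFs were generated from 129 and 129/C57BL/6 mouse embryos at E12.5–13.5 after removal of the internal organs and head. The remaining body of each embryo was incubated in 200 µl trypsin-EDTA (0.25%, Gibco) for 15 min at 37 °C. Trypsin was then inactivated by adding 800 µl MEF medium, and the embryo was rapidly dissociated using an 18-gauge needle fitted to a 1 ml syringe. The embryo suspension was passed through the syringe several times (4 to 6) until it became homogeneously cloudy and was then transferred, drop-wise, to a 15 ml Falcon tube containing 9 ml of warm MEF medium (GMEM (Sigma-Aldrich, G5154), 10% FCS, 1 mM sodium pyruvate, 1 mM L-glutamine, 1× non-essential amino acids (Thermo Fisher Scientific, 11140035)). The suspension was left to sediment by gravity until a pellet of cell debris formed. Most of the supernatant (10 ml) was gently removed and plated on a 10 cm dish containing warm MEF medium. Cells were monitored daily under the microscope and discarded if they did not become confluent. Confluent cells were trypsinized and either cryopreserved or used immediately (passage 0).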

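　　Blastocyst injections were performed using CB6F1 host embryos. Embryos were obtained at day 3.5 (blastocyst stage) after hormone priming with PMSG (M.I.P. Veterinary) and hCG (Merck) and mating with CB6F1 males. Then, 10–20 PB cells carrying a tdTomato reporter were injected into each blastocyst (Origio) in a drop of FHM medium (Zenith Biotech, ZeHP-050) covered with mineral oil. Shortly after injection, blastocysts were transferred to day 2.5 pseudopregnant CD1/ICR females (10–15 blastocysts per female). Chimeric embryos and placentas were isolated at E13.5 and examined under a fluorescence microscope (Nikon Eclipse Ti). Gonads were dissected from chimeric embryos at E13.5 and examined by fluorescence microscopy (Nikon Eclipse Ti).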

　　MEFs were maintained in MEF medium (GMEM (Sigma-Aldrich, G5154), 10% FCS, 1 mM sodium pyruvate, 1 mM L-glutamine, 1× non-essential amino acids (Thermo Fisher Scientific, 11140035), 0.1 mM β-mercaptoethanol and […] (50 µg ml−1)) at 37 °C and 5% CO2. Human embryonic kidney 293T (HEK293T) cells (Lenti-X, Takara, 632180) were maintained in HEK medium (GMEM, 10% FCS, 1 mM sodium pyruvate, 1 mM L-glutamine) at 37 °C and 5% CO2. Mouse ES cells were grown on 0.2% gelatin and maintained in ES cell medium (GMEM, 10% FCS, 1 mM L-glutamine, 0.1 mM β-mercaptoethanol, 1× non-essential amino acids, 100 U ml−1 leukaemia inhibitory factor (LIF)) at 37 °C and 5% CO2. TS cells were maintained on γ-irradiated feeders on 0.2% gelatin in TS cell medium (RPMI-1640 (Thermo Fisher Scientific, 21875034), 20% FCS, 0.1 mM β-mercaptoethanol, […], hFGF4 (R&D, 235-F4-025), 1 µg ml−1 heparin (Sigma-Aldrich, H3149)). Feeder-free TS cells were maintained on Matrigel-coated plates (Corning) in TX medium (base medium without HEPES and glutamine (Life Technologies), 64 mg l−1 L-ascorbic acid 2-phosphate magnesium, 14 µg l−1 sodium selenite, 19.4 mg l−1 insulin, […], holo-transferrin (all Sigma-Aldrich), 2 mM L-glutamine, 1% penicillin and streptomycin), freshly supplemented with 25 ng ml−1 hFGF4, 2 ng ml−1 hTGF-β1 (PeproTech) and 1 µg ml−1 heparin (Sigma-Aldrich). All ChIP–seq, ATAC–seq, RNA-seq, Micro-C and MNase–seq experiments were performed under feeder-free conditions.

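　　Reprogramming to iPS cells and iTS cells was performed as described previously5,7. All infections were performed on MEFs (passage 0 or 1) that had been seeded at 60–80% confluency 2 days before the first infection. For infection, replication-incompetent lentiviruses expressing vectors encoding the reprogramming TFs, at the following ratios (GETM, 3:3:3:1; TMR, 4.5:1:4.5; GETMR, 2:2.5:2:1:2.5; OSKM, 3:3:3:1; and BS9G4M, 3:3:3:1), were packaged with a lentiviral packaging mix (5.1 µg psPAX2 and 2.4 µg pMD2.G) in 10 cm dishes containing HEK293T cells and collected 48 h after transfection. Supernatants were filtered through a 0.45 µm filter, supplemented with 8 µg ml−1 polybrene (Sigma-Aldrich) and then used to infect MEFs. Then, 24 h after infection, the medium was replaced with fresh GMEM containing 10% FBS (Thermo Fisher Scientific). To initiate reprogramming, 2 µg ml−1 doxycycline was added to the medium (GMEM containing 10% FBS) for the first 48 h before switching to the relevant reprogramming medium. For iPS cell reprogramming, the medium was replaced with ES cell medium supplemented with LIF at a final concentration of 200 U ml−1 and with 2 µg ml−1 doxycycline for 12 days, after which doxycycline was withdrawn. For iTS cell reprogramming, the medium was replaced with TS cell reprogramming medium and 2 µg ml−1 doxycycline. To reprogram iTS cells or iPS cells with GETMR, the reprogramming medium with doxycycline was replaced every other day for 20 days, followed by 10 days of culture without doxycycline. Plates were monitored for primary iPS cell and iTS cell colonies. For iPS cell clone isolation, single colonies were trypsinized (0.25%) and plated individually in separate wells of six-well plates on feeder cells. The morphology of the isolated colonies was monitored under the microscope and the medium was changed every other day for five to ten passages until stable iPS cell colonies had formed.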

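　　Owing to the large amount of chromatin required for ChIP–seq in early reprogramming, large-scale concentrated lentivirus was generated for each TF. First, HEK293T cells were seeded at a density of 2 × 10^6 cells per 15 cm plate and grown in 30 ml HEK medium for 24 h before transfection with the relevant lentiviral plasmid. Each virus was prepared in a separate dish. For transfection, 2.4 µg pMD2.G, 5.1 µg psPAX2 and 7.5 µg of the corresponding FUW-tet-O-TF vector were dissolved in 1,710 µl Opti-MEM medium (Thermo Fisher Scientific, 31985062) with 90 µl FuGENE 6 reagent (Promega, E2692) and incubated at room temperature before being added to a 15 cm plate containing HEK293T cells, which was then incubated for 16 h. The transfection medium was replaced with fresh HEK medium and the transfected cells were cultured for a further 60 h. Lentivirus was collected by harvesting the 30 ml of supernatant, which was passed through a syringe fitted with a 0.45 µm polyethersulfone filter and incubated with 10 ml Lenti-X concentrator reagent (Clontech, 631232) for 16 h at 4 °C. The virus was then pelleted by centrifugation at 1,500g for 1 h at 4 °C. The supernatant was removed and the viral pellet was dissolved in 200 µl GMEM overnight at 4 °C, then aliquoted and stored at −80 °C. On average, the titre of each virus was determined to be 7 × 10^8 infectious units per ml.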

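　　For ChIP–seq analysis of early reprogramming, 4.8 × 10^6 MEFs (passage 1) were cultured in MEF medium in 15 cm dishes for 16 h. The next morning, cells were infected by replacing the medium with MEF medium containing Tet-on OSKM, GETM, GETMR or BS9G4M lentivirus at a multiplicity of infection (MOI) of […] for each TF, rtTA2M2 lentivirus at an MOI of 5 and 8 µg ml−1 polybrene. After 24 h, the medium was changed to MEF medium without polybrene. The next day, the infected cells reached approximately 90% confluency and were split 1 in 2, incubated for a further 16 h and then induced by adding 2 µg ml−1 doxycycline to the medium and incubating for 48 h. Cells were then cross-linked for chromatin collection (see the 'ChIP–seq' section).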

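　　Chromatin fragments were prepared from approximately 1.5 × 10^7 cells per TF. For cell cross-linking, 3 ml formaldehyde cross-linking buffer (50 mM HEPES-KOH pH 7.5, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 11% formaldehyde) was added to each 15 cm dish (Corning, 430599) for every 30 ml of medium, and dishes were rotated for 10 min at room temperature. Cross-linking was stopped by adding 1.65 ml of 2.5 M glycine and rotating for 5 min at room temperature. Cells were collected in their medium using a silicone scraper (Thermo Fisher Scientific, 08100240) and centrifuged at 1,350 rcf for 5 min at 4 °C. Cross-linked pellets were washed three times by resuspension in 10 ml ice-cold PBS followed by centrifugation at 1,350 rcf for 5 min at 4 °C. Five 15 cm dishes of ES cells or iTS cells, or seven 15 cm dishes of infected MEFs, were combined into a single pellet for processing. Pellets were then flash-frozen in liquid nitrogen and stored at −80 °C.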
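　　As a quick sanity check on the cross-linking volumes, the sketch below computes the final formaldehyde and glycine concentrations from the volumes quoted above. It is only an illustration: the helper name `final_concentration` is hypothetical, and simple additive mixing of volumes is assumed.

```python
def final_concentration(stock_conc, stock_volume, start_volume):
    """Concentration after adding stock_volume of stock to start_volume.

    Assumes volumes are additive; units of the result follow stock_conc.
    """
    return stock_conc * stock_volume / (stock_volume + start_volume)


# 3 ml of 11% formaldehyde buffer added to 30 ml of medium
formaldehyde_pct = final_concentration(11.0, 3.0, 30.0)

# 1.65 ml of 2.5 M glycine added to the resulting 33 ml
glycine_molar = final_concentration(2.5, 1.65, 33.0)

print(f"{formaldehyde_pct:.2f}% formaldehyde, {glycine_molar * 1000:.0f} mM glycine")
```

　　Under these assumptions the dish ends up at 1% formaldehyde, quenched with roughly 0.12 M glycine.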

　　For efficient lysis, MEF samples were flash-frozen in liquid nitrogen and thawed on ice three times, and then thawed on ice for 1 h. Cell pellets were resuspended in 10 ml lysis buffer 1 (50 mM HEPES-KOH pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40 substitute (Sigma-Aldrich, 74385), 0.25% Triton X-100 and cOmplete Ultra Protease Inhibitor (Roche, 5892970001)) with rotation for 10 min at 4 °C. Nuclei were extracted by passing the cell lysate through a tight 7 ml Dounce homogenizer with 40 strokes on ice. Nuclei were collected by centrifugation at 1,350 rcf for 5 min at 4 °C. Nuclei were washed in 10 ml lysis buffer 2 (10 mM Tris-HCl pH 8, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA and cOmplete Ultra Protease Inhibitor) for 10 min with rotation at room temperature. Nuclei were then collected by centrifugation at 1,350 rcf for 5 min at 4 °C and resuspended in 5 ml lysis buffer 3 (10 mM Tris-HCl pH 8, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% Na-deoxycholate, 0.5% N-lauroylsarcosine and cOmplete Ultra Protease Inhibitor).

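　　The resuspended nuclei were divided into five aliquots in 1 ml milliTUBEs with AFA fibre (Covaris, 520130) and sonicated using a Covaris M220 focused-ultrasonicator (Covaris) (peak power, 75 W; […], 10; cycles […], 200; […]). The milliTUBEs were sonicated sequentially for 10 min each and kept on ice. Sonicated chromatin was transferred to protein low-binding tubes (Eppendorf). Then, 100 µl of 10% Triton X-100 was added to each 1 ml of sonicated chromatin to improve chromatin solubility. The chromatin samples were then centrifuged (20,000g for 10 min at 4 °C) and the supernatant was transferred to fresh tubes. The optimal sonication time was determined by taking 50 µl aliquots at 10 min intervals and checking the DNA fragment size distribution by agarose gel electrophoresis until a band of predominantly 150–350 bp was produced. Early reprogramming samples were sonicated for 60–70 min, ES cell samples for 30–40 min and iTS cells for 50–60 min. A further 50 µl aliquot from the final sonication was retained as the input DNA control for ChIP analysis. Sonicated chromatin and input DNA samples were flash-frozen in liquid nitrogen and stored at −80 °C.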

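　　For each ChIP replicate, 30 µl Protein G Dynabeads (Thermo Fisher Scientific, 10004D) were washed three times in blocking solution (PBS, 0.5% (w/v) BSA). Beads were saturated with 10 µg of antibody raised against the appropriate TF (Supplementary Table 2), diluted in 200 µl blocking solution, with rotation for 6 h at 4 °C. Beads were then washed a further three times in blocking solution. ChIP was performed by incubating the beads with 40 µg chromatin (based on DNA content) on a rotator for 20 h at 4 °C. Beads were then transferred to fresh tubes and washed with RIPA wash buffer (50 mM HEPES-KOH pH 7.5, 500 mM LiCl, 1 mM EDTA, 1% NP-40 substitute, 0.7% Na-deoxycholate) and then once with TE NaCl (10 mM Tris-HCl pH 8, 1 mM […]). Bound chromatin was eluted by resuspending the beads in 200 µl ChIP elution buffer (50 mM Tris-HCl pH 8, 10 mM EDTA, 1% SDS) and shaking for 30 min at 65 °C, after which the supernatant was transferred to a new tube. Cross-links were reversed by incubating at 65 °C with shaking for 16 h. Samples were diluted with 200 µl TE (10 mM Tris-HCl pH 8, 1 mM EDTA) and incubated with 0.2 mg ml−1 RNase A (Sigma-Aldrich, R4642) for 2 h at 37 °C. Proteins were then digested by incubation with 0.2 mg ml−1 proteinase K (Ambion, AM2546) for 2 h at 55 °C. DNA was then purified by phenol–chloroform extraction followed by ethanol precipitation. The precipitated DNA was eluted in 20 µl of 10 mM Tris-HCl pH 8.5 for library generation or qPCR analysis. ChIP reactions were quantified on a Qubit 2.0 using the HS dsDNA quantification kit (Thermo Fisher Scientific, Q32854).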

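　　ChIP–seq DNA libraries were prepared using the NEBNext Ultra II library preparation kit (NEB, E7645S) with dual-index primers (NEB, E7600S). For each TF, libraries were prepared using 5–20 ng ChIP DNA corresponding to a pool of at least three ChIP replicates. Input libraries were generated using 20 ng of sonicated DNA as starting material. Size selection (200 bp) was performed according to the manufacturer's instructions. PCR amplification during library preparation was limited, such that samples with 5–10 ng ChIP DNA underwent 11 cycles of PCR amplification and samples with 10–20 ng ChIP DNA underwent 10 cycles. PCR clean-up was performed using 45 µl Sera-Mag SpeedBeads in 10% PEG-8000 solution. Libraries were quantified on a Qubit 2.0 instrument with the high-sensitivity dsDNA kit (Thermo Fisher Scientific, Q32854), and fragment size was determined on an Agilent 2200 TapeStation with D1000 HS reagents (Agilent, 5067-5584, 5067-5585). Samples were sequenced by Edinburgh Genomics on an Illumina HiSeq 4000 using 75 bp or 50 bp paired-end settings.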

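　　Total RNA was isolated using the Qiagen RNeasy kit. All mRNA libraries were prepared using the SENSE mRNA-Seq Library Prep Kit V2 (Lexogen), and pooled libraries were sequenced on an Illumina NextSeq 500 platform to generate 75 bp single-end reads.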

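　　ATAC–seq library preparation was performed as previously described5,7,50. In brief, 100,000 cells per replicate (two biological replicates per line) were incubated with 0.1% NP-40 to isolate nuclei. Nuclei were then transposed for 30 min at 37 °C with adaptor-loaded Nextera Tn5 (Illumina, FC-121-1030). Transposed fragments were directly PCR amplified and sequenced on an Illumina NextSeq 500 platform to generate 2 × 36 bp paired-end reads.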

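　　For H1 OE and H1 KD, 400,000 cells per sample were incubated with 0.1% NP-40, 0.1% Tween-20 and 0.01% digitonin (Calbiochem, 300410) to isolate nuclei. Nuclei were then split into four replicates of 100,000 cells each, which were transposed separately for 30 min at 37 °C using the Illumina Tagment DNA Enzyme and Buffer Small Kit (20034210). Transposed fragments were directly PCR amplified and sequenced on a NovaSeq 6000 system to generate 50 bp paired-end reads.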

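　　MNase samples were prepared from approximately 1.5 × 10^7 cells per digestion condition. For cross-linking, 1.1 ml of cross-linking buffer (Dulbecco's PBS and 11% formaldehyde) was added to 10 ml of medium in 10 cm cell culture plates (Corning, 430167), which were rotated for 10 min at room temperature. Cross-linking was stopped by adding 0.55 ml of 2.5 M glycine and rotating for 5 min at room temperature. The medium was aspirated from the cross-linked cells and the cells were washed twice with 10 ml ice-cold SST (150 mM NaCl, 0.5 M trisodium citrate, 10 mM Tris-HCl pH 7.5). Cells were scraped into 5 ml ice-cold RSB (10 mM Tris-HCl pH 7.5, 10 mM NaCl, 3 mM MgCl2 and 10 mM sodium butyrate with cOmplete Ultra EDTA-Free Protease Inhibitor (Roche, 5892953001)) supplemented with 0.5% NP40 substitute (Roche, 11332473001). Cells were then pelleted at 1,000 rpm at 4 °C in a Rotina 380R centrifuge (Hettich) using a swing-out rotor (Hettich, 1754). RSB with NP40 was added to the sample, which was centrifuged at 1,400 rpm for 7 min at 4 °C. The pellet was resuspended in 600 µl cold RSB.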

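　　For the MNase reaction, 5 ml of nuclei at OD260 = 1 was transferred to a 15 ml tube. One tube was processed at a time. Then, 150 µl of 100 mM CaCl2 was added to a final concentration of 3 mM and the sample was incubated in a 37 °C water bath for 90 s. Micrococcal nuclease (Worthington Biochemicals, LS004797) was added to a final concentration of 0, 1, 4, 16 or 64 U ml−1 and chromatin was digested for 2 min in a 37 °C water bath. To inactivate MNase, 5.2 ml of room-temperature 2× TNESK (20 mM Tris-HCl pH 7.5, 200 mM NaCl, 2 mM EDTA, 2% SDS and 0.2 mg ml−1 proteinase K) was added and the sample was mixed vigorously. Samples were left at 37 °C for at least 2 h and then at 65 °C overnight to reverse cross-links. Samples were purified by phenol–chloroform extraction followed by ethanol precipitation. RNase A was then added to a concentration of 0.2 mg ml−1 and samples were incubated for 2 h at 37 °C. DNA samples were subsequently purified by phenol–chloroform extraction and ethanol precipitation. Then, 7.5 µg of this sample was run on a 1.3% agarose gel to check the digestion pattern.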

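　　To purify the digested samples, the digested DNA was run on a 6% polyacrylamide TBE gel for 3 h 30 min on a 20 × 20 cm vertical electrophoresis system. After staining the gel with ethidium bromide, the band corresponding to 90–210 bp was excised. The excised gel was crushed by centrifugation through a 0.5 ml tube that had been pierced with a needle and placed inside a 1.5 ml tube. Two gel volumes of diffusion buffer (ammonium acetate, 10 mM magnesium acetate, 1 mM EDTA, 0.1% SDS) were added and samples were shaken overnight at 37 °C. Samples were then rotated on a wheel for 2 h at room temperature and centrifuged at 20,000g for 10 min, and the supernatant was transferred to a new tube. This was centrifuged at 20,000g for a further 10 min to remove gel fragments, and the supernatant was transferred to a new tube. DNA was then purified by ethanol precipitation and further purified using the Monarch PCR & DNA Cleanup Kit (NEB, T1030). DNA was quantified on a Qubit 2.0 fluorometer (Thermo Fisher Scientific) using the HS dsDNA quantification kit (Thermo Fisher Scientific, Q32854).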

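　　MNase–seq libraries were prepared using 30 ng, 300 ng, 500 ng or 1 µg DNA for 1 U ml−1, 4 U ml−1, 16 U ml−1 or 64 U ml−1, respectively, using the NEBNext Ultra II Library Preparation Kit (NEB, E7645S) with dual-index primers (NEB, E7600S). The manufacturer's instructions were followed for library preparation, except for the bead-based size selection and PCR clean-up. A modified size-selection protocol was performed before PCR cycling, in which the volumes of size-selection beads for a 200 bp library were changed to 42 µl and 37.5 µl for the first and second bead additions, respectively. The size-selection beads used here were Sera-Mag SpeedBeads carboxylate-modified particles (GE Healthcare, GE65152105050250), prepared at a 1 in 50 dilution in 18% (w/v) PEG-8000 solution (10 mM Tris-HCl pH 8, EDTA, NaCl, […]). This deviation from the manufacturer's protocol was made to avoid the loss of small fragments. The 1 U ml−1 MNase samples underwent ten cycles of PCR amplification and all other samples underwent seven cycles. PCR clean-up was performed using 45 µl of Sera-Mag SpeedBeads prepared in 18% PEG-8000. Library fragment size was determined on an Agilent 2200 TapeStation using D1000 HS reagents (Agilent, 5067-5584, 5067-5585). MNase–seq libraries were sequenced by Edinburgh Genomics on an Illumina NovaSeq platform using an S2 flow cell with 50 bp paired-end settings, to a depth of approximately 160 million reads per library.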

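　　To prepare cross-linked samples for Micro-C, cells were grown on 15 cm cell culture dishes to approximately 80% confluency. Before starting, a 15 cm or 10 cm plate prepared in parallel was trypsinized and used to obtain an approximate cell number by counting with a haemocytometer. The Micro-C samples were then brought to room temperature before the medium was aspirated, and the samples were washed twice with 30 ml DPBS. For cross-linking, 3.3 ml of cross-linking buffer (11% formaldehyde in DPBS) was added to plates containing 30 ml DPBS, which were incubated for 10 min at room temperature with rotation. Cross-linking was stopped by adding 1.65 ml of 2.5 M glycine and rotating for 5 min at room temperature. Samples were then transferred to ice and incubated for 15 min. Cells were then scraped on ice and transferred to 50 ml conical tubes. Cells were pelleted at 1,000 rpm for 5 min at 4 °C in a Rotina 380R centrifuge (Hettich) with a swing-out rotor (Hettich, 1754). Pellets of the same type were subsequently combined in a single 15 ml conical tube, resuspended in 10 ml cold DPBS and pelleted at 1,000 rpm for 5 min at 4 °C. Cell pellets were then resuspended at 4 million cells per ml in 3 mM DSG in DPBS and rotated for 40 min at room temperature (the DSG stock was first made up at 300 mM in DMSO and then diluted into DPBS). DSG was quenched by adding glycine to a final concentration of 400 mM and incubating for 5 min at room temperature, followed by transfer to ice for 15 min. Cells were then washed twice with DPBS containing 0.5% BSA, snap-frozen in liquid nitrogen as pellets of 5 million cells and stored at −80 °C. One cell pellet was used per Micro-C library.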

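　　To prepare for MNase digestion, two cell pellets were resuspended in 600 µl PBS with 0.1 mg ml−1 BSA (NEB, B9000) and kept on ice for 20 min. Cells were then collected by centrifugation at 5,000g for 5 min at 4 °C. The cell pellet was then washed with MB1 (10 mM Tris-HCl pH 7.5, 50 mM NaCl, 5 mM MgCl2, 1 mM CaCl2, 0.2% NP-40, 1× Roche cOmplete EDTA-free (Roche Diagnostics, 04693132001)) and centrifuged at 5,000g. The pellet was resuspended in 225 µl MB1 per 1 million cells (1,125 µl per sample). The suspension was then split into five 200 µl digestion aliquots (with 100 µl kept as an undigested control). For each set of five 200 µl aliquots, 15 U or 20 U of MNase was added per aliquot, by adding 15 or 20 µl of 1 U µl−1 MNase, and samples were incubated in a 37 °C water bath for 10 min. A 20 min digestion was used for H1-OE MEF samples. To stop the digestion, 2 µl of 0.5 M EGTA was added and samples were incubated at 65 °C for 10 min to inactivate MNase.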

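　　MNase-digested samples of the same MNase concentration were then recombined into a single tube, and 100 µl was kept as a no-ligation control. The remaining recombined sample was divided between two tubes and cells were collected by centrifugation (5,000g for 5 min at 4 °C). Cells were then washed with 500 µl 1× NEBuffer 2.1 (NEB, B7202), centrifuged at 5,000g for 5 min at 4 °C and resuspended in 45 µl 1× NEBuffer 2.1. Then, 5 µl shrimp alkaline phosphatase (rSAP; NEB, M0203) was added and samples were incubated for 45 min at 37 °C to dephosphorylate DNA ends. The reaction was stopped by incubating for 5 min at 65 °C. Next, 40 µl Klenow premix buffer (5 µl 10× NEBuffer 2.1, 2 µl 100 mM ATP (Thermo Fisher Scientific, R0441), 3 µl 100 mM DTT, 30 µl water), 8 µl large Klenow fragment (NEB, M0210L) and 2 µl T4 PNK (NEB) were added, in that order. 5′ DNA overhangs were then generated by incubating for 15 min at 37 °C. Then, 100 µl biotin premix (10 µl 1 mM biotin-14-dATP (Jena Bioscience, NU-835-BIO14-L), 10 µl 1 mM biotin-14-dCTP (Jena Bioscience, NU-956-BIO[…]), […] 10 mM dGTP and 10 mM dTTP (NEB, N0446), 10 µl 10× T4 ligase buffer (NEB, B0202S), 0.5 µl 200× BSA (NEB, B9000S), 67.5 µl water) was added and samples were incubated for 45 min at 25 °C. Then, 12 µl of 0.5 M EDTA was added and samples were incubated for 20 min at 65 °C to stop the reaction.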

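　　Samples were centrifuged at 10,000g for 5 min at 4 °C, the supernatant was removed and the pellet was resuspended in 500 µl 1× T4 ligase buffer with 50 mM NaCl. Samples were centrifuged at 10,000g for 5 min at 4 °C and the supernatant was removed. Samples were then resuspended in 500 µl ligation premix (5 µl 200,000 U ml−1 T4 ligase (NEB, M0202M), 1.25 µl 200× BSA, 50 µl 10× T4 ligase buffer, 443.75 µl water) and incubated for 2.5 h at room temperature. Next, 5 µl of 5 M NaCl was added, samples were centrifuged at 16,000g at 4 °C and the supernatant was discarded.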
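　　The ligation premix volumes can be checked and scaled programmatically. The sketch below is illustrative only: the dictionary keys, the `scale_premix` helper and the 10% pipetting overage are assumptions for the example and are not part of the published protocol.

```python
# Per-reaction ligation premix, in µl, as listed in the text
LIGATION_PREMIX_UL = {
    "T4 ligase (200,000 U/ml)": 5.0,
    "200x BSA": 1.25,
    "10x T4 ligase buffer": 50.0,
    "water": 443.75,
}


def scale_premix(recipe, n_reactions, overage=0.1):
    """Scale a per-reaction recipe to n_reactions plus a fractional overage.

    The overage is a hypothetical allowance for pipetting loss.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 2) for name, volume in recipe.items()}


total_ul = sum(LIGATION_PREMIX_UL.values())

# Units of ligase per reaction: 200,000 U/ml x 0.005 ml
ligase_units = 200_000 * LIGATION_PREMIX_UL["T4 ligase (200,000 U/ml)"] / 1000

# Final buffer strength: 50 µl of 10x stock in the total volume
buffer_strength = 10 * LIGATION_PREMIX_UL["10x T4 ligase buffer"] / total_ul

print(total_ul, ligase_units, buffer_strength)
print(scale_premix(LIGATION_PREMIX_UL, n_reactions=2))
```

　　The components sum to the stated 500 µl, which gives 1× ligase buffer and 1,000 U of T4 ligase per reaction.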

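　　To remove biotinylated nucleotides from unligated DNA ends, the pellet was resuspended in exonuclease mix (20 µl 10× NEBuffer 1 (NEB, B7001S), 170 µl water and 10 µl 100,000 U ml−1 exonuclease III (NEB, M0206L)) and incubated with agitation at 37 °C. Subsequently, 1.25 µl 20 mg ml−1 RNase A, 10 µl 20 mg ml−1 proteinase K and 25 µl 10% SDS were added. At this point, the no-ligation control sample was also processed in the same way, by diluting it to 200 µl with 100 µl water and adding RNase A, proteinase K and 10% SDS. Samples were incubated at 65 °C overnight to lyse the cells. DNA was then purified by phenol–chloroform extraction followed by ethanol precipitation and eluted in 100 µl of 10 mM Tris-HCl pH 8.5. A further round of DNA purification was performed using the DNA Clean & Concentrator-5 kit (Zymo Research, D4013), eluting in 6.5 µl of 10 mM Tris-HCl pH 8.5. Ligation efficiency was tested by comparing the no-ligation control and ligated samples on an Agilent 2200 TapeStation using HS D1000 reagents. At this point, the individual replicates of the ligated samples (that is, the replicates of 2.5 million cells produced by splitting the MNase digest across two tubes) were combined.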

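　　To purify dinucleosome-sized ligation fragments, a 1.5% gel was prepared with NuSieve GTG low-melting-point agarose (Lonza, 50081) or TopVision low-melting-point agarose (Thermo Fisher Scientific, R0801). The TAE running buffer was pre-chilled to 4 °C and the gel was run on ice at 60 V for 2.5 h. The band corresponding to approximately 250–400 bp was excised. DNA was purified from it using the Zymoclean Gel DNA Recovery Kit (Zymo Research, D4001T), with 31 µl of 10 mM Tris-HCl pH 8.5 as the elution buffer. The DNA concentration was determined using a Qubit 2.0 system with high-sensitivity dsDNA reagents.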

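　　To prepare Micro-C libraries, 2.5–10 µl Dynabeads MyOne Streptavidin C1 beads (Invitrogen, 65001) were prepared, according to the binding capacity of the beads specified by the manufacturer. These beads were washed once with 300 µl 1× TBW (5 mM Tris-HCl pH 7.5, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween-20) and resuspended in 150 µl 2× BB (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 2 M NaCl). Micro-C samples were diluted to a final volume of 150 µl by adding 120 µl nuclease-free water and then added to the bead suspension. Samples were incubated with agitation for 20 min at room temperature. Beads were washed twice with 300 µl 1× TBW, incubating for 5 min at 55 °C. Beads were then resuspended in 35 µl 10 mM Tris-HCl pH 8.5, 3.5 µl End Prep reaction buffer and 1.5 µl End Prep enzyme mix (from the NEBNext Ultra II DNA Library Prep Kit). Samples were then incubated with agitation for 30 min at 20 °C, followed by 30 min at 65 °C with agitation. Then, 0.5 µl NEBNext adaptor, 15 µl NEBNext ligation master mix and 0.5 µl NEBNext ligation enhancer were added and samples were incubated with agitation for 30 min at 20 °C. Next, 1.5 µl NEBNext USER enzyme was added and samples were incubated with agitation for 30 min at 37 °C. Beads were then washed once with 100 µl 1× TBW, incubating with agitation for 5 min at 55 °C. Beads were then washed once with 100 µl 10 mM Tris pH 7.5 and resuspended in 20 µl 10 mM Tris pH 7.5. Then, 2 µl of the bead suspension was used in a test quantitative PCR (qPCR) reaction to find the appropriate number of PCR cycles for library generation. The beads were then split across 9 PCR tubes (to reduce the number of beads settling in a single PCR tube during PCR cycling), and 10 µl NEBNext Ultra II Q5 Master Mix, 2 µl 10 µM NEBNext i5 primer, 2 µl 10 µM NEBNext i7 primer (NEB, E7600S) and 4 µl water were added. PCR was then performed according to the NEBNext Ultra II library kit cycling conditions, typically using 9 or 10 PCR cycles. The supernatants from the individual PCR reactions were then pooled into a single tube per library and DNA was purified using 0.9× NEB sample purification beads (NEB, E7103S). Library fragment size was determined on an Agilent 2200 TapeStation with D1000 HS reagents. Micro-C libraries were sequenced by Edinburgh Genomics on an Illumina NovaSeq platform using S1 or SP flow cells with 50 bp paired-end settings, to a depth of approximately 1 billion read pairs per cell type.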

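　　Cells were fixed in PBS containing 4% paraformaldehyde for 10 min at room temperature. Cells were permeabilized with 0.1% Triton X-100 for 10 min at room temperature and blocked with 4% donkey or goat serum (Sigma-Aldrich) in PBS for 60 min at room temperature or overnight at 4 °C. Blocked cells were incubated overnight in blocking buffer (4% serum in PBS) containing the appropriate antibody concentration (Supplementary Table 2). Antibody-stained cells were washed three times with TBST (20 mM Tris-HCl pH 7.4, 0.15 M NaCl, 0.05% Tween-20) and then incubated with the appropriate secondary antibody in blocking solution for 2 h at room temperature. Nuclei were stained with 3 mg ml−1 4′,6-diamidino-2-phenylindole (DAPI) (Invitrogen, Thermo Fisher Scientific) for 10 min at room temperature. Fluorescence images were taken using an iRiS Digital Cell Imaging System (Logos Biosystems) and visualized using ImageJ51. Infection efficiency was quantified by counting TF-positive nuclei as a percentage of DAPI-positive nuclei across multiple images.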
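　　The infection-efficiency calculation described above can be sketched as follows. This is a minimal illustration, assuming that counts are pooled across images before the percentage is taken (the text does not say whether per-image percentages were averaged instead); the function name and the example counts are hypothetical.

```python
def infection_efficiency(image_counts):
    """Percentage of DAPI-positive nuclei that are also TF-positive.

    image_counts is a list of (tf_positive, dapi_positive) tuples, one per image.
    Counts are pooled across images before computing the percentage.
    """
    tf_total = sum(tf for tf, _ in image_counts)
    dapi_total = sum(dapi for _, dapi in image_counts)
    if dapi_total == 0:
        raise ValueError("no DAPI-positive nuclei were counted")
    return 100.0 * tf_total / dapi_total


# Hypothetical counts from three images
counts = [(45, 50), (38, 40), (52, 60)]
print(infection_efficiency(counts))
```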

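　　For CDX2-positive iTS cell colonies, cells were fixed in 4% paraformaldehyde in PBS for 20 min, rinsed three times with PBS, blocked for 1 h with PBS containing 0.1% Triton X-100 and 5% FBS, and incubated overnight with primary antibody (1:500) in PBS containing 0.1% Triton X-100 and 1% FBS. Cells were then washed three times with PBS and incubated for 1 h with the relevant (Alexa) secondary antibody (1:500 dilution) in PBS containing 0.1% Triton X-100 and 1% FBS. DAPI (1:1,000) was added for the last 10 min of the incubation. Cells were washed three times with PBS and visualized under a fluorescence microscope (Nikon Eclipse Ti).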

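　　Cells reprogrammed for 48 h were lysed for 20 min with lysis buffer (100 mM Tris-HCl, 300 mM NaCl, 2% Triton X-100, 0.2% sodium deoxycholate, 10 mM CaCl2) supplemented with EDTA-free protease inhibitor (Roche, 118443580001). Lysates were then centrifuged at 14,000 rpm for 20 min to remove cell debris, and the protein-containing supernatant was pre-cleared by adding Dynabeads (A and G mix) (Invitrogen, 10004D/10002D) and incubating for 1 h at 4 °C. The precleared supernatant was then incubated overnight with pre-bound Dynabeads (A and G mix) using anti-TFAP2C (Abcam, ab110635), anti-ESRRB (Perseus Proteomics, PP-H6705-00), anti-EOMES (Abcam, ab3345) or anti-IgG (Santa Cruz, sc-2025, sc-2027). Samples were then washed twice with ice-cold lysis buffer, and the Dynabeads carrying the protein complexes were resuspended in sample buffer, boiled at 100 °C for 10 min and analysed by western blotting. Blots were probed with the following primary antibodies: anti-TFAP2C (Abcam, ab110635) and anti-MYC (Abcam, ab32072), followed by the appropriate IgG-HRP secondary antibody (1:10,000), and visualized using an ECL detection kit.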

　　Whole-cell extracts were prepared from doxycycline-induced and uninduced MEFs using RIPA extraction buffer (25 mM Tris HCl pH 7.6, 150 mM NaCl, 1% Na-deoxycholate, 1% NP-40, 0.1% SDS) supplemented with cOmplete ultra protease inhibitor and Pierce phosphatase inhibitor cocktail (Thermo Fisher Scientific, A32957).

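　　The protein concentration of the lysates was quantified using the Pierce BCA protein assay kit according to the manufacturer's instructions (Thermo Fisher Scientific). Proteins resolved by SDS–polyacrylamide gel electrophoresis were electroblotted onto PVDF membranes. Membranes were blocked overnight at 4 °C with shaking in PBST with milk (0.1% Tween-20, 10% non-fat dry milk). Membranes were washed three times for 5 min with PBST on a rocker at room temperature. Primary antibody incubation was performed for 4 h at room temperature with antibodies diluted in PBST 5% BSA (Supplementary Table 2). Membranes were washed three times for 5 min with PBST on a rocker at room temperature. Secondary antibody incubation was performed in 10% non-fat dry milk for 2 h at room temperature, followed by three washes with PBST. Blots were visualized using SuperSignal West Pico chemiluminescent substrate (Thermo Fisher Scientific) and Amersham Hyperfilm ECL (GE Healthcare) developed in an MI5 processor (Jet X-Ray).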

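　　Histones were isolated from MEF129 cells and TNG-MKOS-MEFs, either 72 h after doxycycline induction alongside uninfected cell line controls, or 144 h after infection with H1.4 shRNA alongside empty-vector-infected cell line controls, by extraction with 0.2 N sulfuric acid as previously described. In brief, cells were resuspended in 0.3 M sucrose buffer and nuclei were obtained using a Dounce homogenizer. Nuclei were lysed using a high-salt buffer containing 0.35 M KCl, and histones were then solubilized using 0.2 N sulfuric acid, precipitated with ethanol and finally resuspended in nuclease-free water.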

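　　The protein concentration of the acid-extracted histones was quantified using the Pierce BCA protein assay kit according to the manufacturer's instructions (Thermo Fisher Scientific). Proteins resolved by SDS–polyacrylamide gel electrophoresis were electroblotted onto PVDF membranes at 200 mA for 2.5 h. Membranes were blocked for 4 h at room temperature in PBST with milk (0.1% Tween-20, 10% non-fat dry milk). Membranes were washed once with PBST at room temperature. Primary antibodies against H1 and the H3 loading control were diluted in PBST 5% BSA (Supplementary Table 2) and incubated overnight at 4 °C. Membranes were washed six times for 5 min and once for 10 min with PBST on a rocker at room temperature. Secondary antibody incubation was performed in PBST 5% BSA for 1 h at room temperature, followed by six washes of 5 min and one wash of 1 min with PBST with shaking. Blots were visualized using SuperSignal West Pico chemiluminescent substrate (Thermo Fisher Scientific) and a Bio-Rad ChemiDoc imager, using the chemiluminescence setting on a white tray. A list of the antibodies used in this study is provided in Supplementary Table 2.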

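　　To prepare cell lysates, MEFs (WT 129) were infected with doxycycline-inducible lentivirus encoding the TF of interest, which was overexpressed for 48 h by doxycycline treatment (see the lentivirus protocol above). A total of 10 million cells was collected per preparation. Cells were then lysed in buffer A (10 mM HEPES pH 7.5, 1.5 mM MgCl2, 10 mM KCl, 0.5 mM DTT) for 10 min and dounced 40 times (tight Dounce). The material was then pelleted, resuspended in 100 µl of buffer B (20 mM HEPES pH 7.5, 30% glycerol, 420 mM NaCl, 1.5 mM MgCl2, 0.2 mM EDTA, 0.5 mM DTT) per 10 million cells and incubated for 30 min at 4 °C. After centrifugation, the supernatant was dialysed for 2 h in dialysis buffer (20 mM HEPES pH 7.5, 30% glycerol, 100 mM KCl, 0.83 mM EDTA pH 8, 1.66 mM DTT, 0.2 mM PMSF). Cell lysates were aliquoted, flash-frozen in liquid nitrogen and stored at −80 °C until use in EMSA.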

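　　For EMSA, Cy5 end-labelled oligonucleotide duplexes (50 nM) were prepared as previously described12. The Cy5 end-labelled oligonucleotide duplexes were incubated with increasing amounts of cell lysate (0.5 µl to 4 µl) and non-specific competitor (G-C) oligonucleotide (1 µg) in binding buffer (50 mM Tris-HCl pH 7.5, 5 mM MgCl2, 50 mM KCl, […] (v/v) glycerol, 2.5 mg ml−1 BSA) in a final volume of 10 µl for 1 h at 21 °C in the dark. For EMSA supershifts, 5 µg of antibody or a 20× excess of unlabelled oligonucleotide competitor was mixed with the TF–DNA mixture in binding buffer and incubated for 20 min at room temperature. Samples were run on 5% polyacrylamide gels in 0.5× TBE (45 mM Tris-borate, 1 mM EDTA) at 90 V and 100 mA, and imaged using a Bio-Rad ChemiDoc MP (Bio-Rad).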

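　　For flow cytometry analysis, cells were first trypsinized and the trypsin was then neutralized with medium containing 10% fetal bovine serum (FBS). Cells were then centrifuged and washed twice with PBS to ensure removal of any residual trypsin and medium. The washed cells were resuspended in PBS for subsequent analysis.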

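　　The fluorescent markers EGFP and tdTomato were used to identify and quantify specific cell populations. Flow cytometry was performed using a Beckman Coulter (Gallios) flow cytometer. Data acquisition and analysis were performed using Kaluza software (v.1.0.14029.14028).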

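　　To remove dead cells, all samples were first gated using FSC-A/SSC-A to identify the live cell population (below 200 FS area). To remove cell doublets, single cells were selected by gating forward scatter height versus area. Fluorescence-positive cells were gated on the basis of the fluorescence intensity of positive control cells. An example of the gating strategy for EGFP and tdTomato is shown in Fig. 2.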

　　The plasmids constructed in this study are as follows:

　　The pFUW-TetO-hEsrrb plasmid was generated by PCR amplifying human ESRRB from pPB-PGK-hEsrrb (Addgene, 60434)54 and inserting the amplified fragment into an EcoRI-digested FUW-tet-O-hOct4 plasmid (Addgene, 20726) backbone using an In-Fusion HD Cloning Plus kit according to the manufacturer's instructions (Takara Clontech) and the following primers: 5'-GCCTCCGCGCCCCCCCCCCCCCATCCACCACCATGCCTCCTCGGACGACA-3' and 5'-ataagcttgatatcgaAttctttattActggtggccagagatgctt-3'.

　　H1.4 cDNA was generated synthetically by Twist Bioscience and inserted into the pET-28a(+) bacterial plasmid. H1.4 cDNA was then amplified by PCR using the pET28-H1.4 construct as a template and primers containing an EcoRI site and Kozak fragment (forward) and an XbaI restriction site (reverse) (5′-CCCCGAATTCGCCACCATGTCCGAGACTGCGCCT-3′ and 5′-TATCTCTAGACCTACTTTTTCTTGGCTGCCGCC-3′). The PCR product was digested with EcoRI and XbaI and ligated into a linearized FUW-TetO plasmid digested with the same enzymes.

　　The dual piggyBac reporter (PB-TAP-INSX3-NANOG_ENH-EGFP-NANOG_FLIP-TDOMATO) plasmid was constructed according to the following steps:

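　　The remaining plasmids were obtained from the following sources. The lentiviral plasmids FUW-tet-O-hOct4 (Addgene, 20726), FUW-tetO-hSox2 (Addgene, 20724) and FUW-tetO-hKlf4 […] were generated in the R. Jaenisch laboratory56,57. FUW-tetO-mSox9 and FUW-tetO-mGata4 (Addgene, 41084) […] as previously described5. The pWPT-rtTA2M2 vector was generated in the K. Zaret laboratory58. The pFUW-tetO-mBrn2 (Addgene, 27151) vector was generated in the Wernig laboratory22. A set of four hairpin shRNAs targeting the Hist1h1e gene (H1.4) in the pLKO.1 lentiviral vector was designed by The RNAi Consortium (TRC)59 and obtained from Horizon/Dharmacon. The empty pLKO.1 plasmid was obtained from Addgene (8453)60. pCMV-hyPBase was obtained from the Kaji laboratory61. A list of all of the DNA constructs used in this study and their sources is provided in Supplementary Table 3.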

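　　Two days before nucleofection, near-confluent mES cell cultures (70–80%) were split at a ratio of 1:10. For each nucleofection, 2 × 10^6 mES cells and a 15 ml Falcon tube containing 9.5 ml of warm medium were prepared. After washing with PBS, mES cells were treated with 0.25% trypsin-EDTA and incubated for 2–3 min at 37 °C. Trypsin was inhibited by adding serum-containing medium, and mES cells were collected by centrifugation at 300 rcf for 3 min. The cell pellet was washed with PBS and centrifuged at 300 rcf for 3 min. In a 1.5 ml tube, 1 µg PBase and 1 µg PB vector were mixed (high-quality plasmids at a concentration of 0.5–2 µg µl−1 are needed to keep the volume below 10 µl). The nucleofection mix was prepared by adding 90 µl Nucleofector solution and 20 µl Supplement 1 from the Mouse Embryonic Stem Cell Nucleofector Kit (Lonza, VAPH-1001). The mES cell pellet (2 × 10^6 cells) was quickly resuspended in the nucleofection mix. The cell suspension was then transferred to a cuvette without introducing air bubbles (air bubbles short-circuit the current and negatively affect cell viability). The cuvette was placed in the Nucleofector machine (Amaxa Biosystems) and pulsed with program A-023. The cuvette was quickly taken to the tissue culture hood and 500 µl of pre-warmed medium was added. The cell suspension was removed from the cuvette using a Lonza Pasteur pipette and transferred to the prepared 15 ml Falcon tube containing 9.5 ml of warm medium. The cell suspension was plated on a gelatin-coated 10 cm dish. Cells were incubated at 37 °C and the medium was changed every 2 days. Green (EGFP) and red (tdTomato) fluorescence was checked under the microscope and cells were sorted by fluorescence-activated cell sorting, as described below.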

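　　For transgene expression analysis, total RNA was extracted from the indicated samples using the Macherey-Nagel kit (Ornat). Between 500 and 2,000 ng of total RNA was reverse transcribed using the iScript cDNA synthesis kit (Bio-Rad). qPCR analysis was performed on three biological replicates (n = 3) using 1/100 of the reverse transcription reaction in a StepOnePlus Real-Time PCR System (Applied Biosystems) with SYBR Green Fast qPCR mix (Applied Biosystems).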

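　　Specific primers were used to detect transgene expression exclusively. For the GATA3, EOMES, TFAP2C, MYC, ESRRB and SOX2 genes, primers targeting the last exon (forward primer) and the WPRE element of the FUW-tetO vector (reverse primer) were used. For the OCT4 and KLF4 genes, primers targeting the first exon (reverse primer) and the viral vector (tetO, forward primer) were used. The amount of cDNA in each sample was normalized to the level of the housekeeping control gene Gapdh. A list of the primers used in this study is provided in Supplementary Table 4.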
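　　One common way to normalize qPCR data to a housekeeping gene is the ΔCt approach, sketched below. The text states only that cDNA amounts were normalized to Gapdh, so the ΔCt formula, the assumption of perfect doubling per cycle, the function name and the Ct values are all illustrative assumptions rather than the authors' stated method.

```python
def relative_expression(ct_transgene, ct_gapdh):
    """Transgene level relative to Gapdh, assuming perfect doubling per cycle."""
    return 2.0 ** -(ct_transgene - ct_gapdh)


# Hypothetical Ct pairs (transgene, Gapdh) for three biological replicates
replicates = [(24.0, 18.0), (24.5, 18.5), (23.0, 17.0)]

values = [relative_expression(transgene, gapdh) for transgene, gapdh in replicates]
mean_expression = sum(values) / len(values)
print(mean_expression)
```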

　　For infection, HEK293T cells were seeded at a density of 2.4 × 10^6 cells per 15 cm plate and grown in 30 ml HEK medium for 24 h before transfection with the rtTA2 or H1.4 lentiviral plasmid. Each virus was prepared in a separate dish. For transfection, 2.4 µg pMD2.G, 5.1 µg psPAX2 and 7.5 µg of the corresponding plasmid were dissolved in 1,710 µl Opti-MEM medium (Thermo Fisher Scientific, 31985062), and 90 µl FuGENE 6 reagent (Promega, E2692) was then added. The mixture was vortexed thoroughly and added to a 15 cm plate containing HEK293T cells, which was incubated for 16 h. The transfection medium was replaced with fresh HEK medium and the transfected cells were cultured for a further 60 h. Lentivirus was collected by harvesting the 30 ml of supernatant, which was passed through a syringe fitted with a 0.45 µm polyethersulfone filter. The virus was then pelleted in ultracentrifuge tubes at 25,000 rpm (77,000g) at 4 °C using a Beckman Coulter Optima XPN-80 ultracentrifuge and an SW32 Ti swinging-bucket rotor (Beckman Coulter). The supernatant was removed and the viral pellet was dissolved in 300 µl GMEM by rotation, then aliquoted and stored at −80 °C on the same day. On average, the titre was determined to be 5 × 10^7 infectious units per ml.

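　　At 24 h before infection, MEF129 cells (passage 2) were seeded at 25,000 cells per cm2. For the H1 infection, two 10 cm dishes (1.4 million cells per dish) were used for Omni-ATAC and for western blotting to confirm overexpression compared with the uninfected cell line control. Seven 15 cm dishes (3.6 million cells per dish) were used for Micro-C. The next morning, the medium was changed to MEF medium supplemented with 8 µg ml−1 polybrene and the pFUW-tetO-H1.4 and pWPT-rtTA2M2 viruses at an MOI of 2. Then, 24 h after infection, the medium on all plates was changed to fresh MEF medium. Next, 48 h after infection, expression of H1.4 was induced by adding MEF medium containing doxycycline at a final concentration of 2 µg ml−1, and the cells were incubated for 72 h. Next, 72 h after doxycycline induction, the infected and uninfected 10 cm plates were collected by trypsinization and counted using a haemocytometer. A total of 400,000 cells from the infected and uninfected samples was immediately subjected to the Omni-ATAC protocol50 (see the 'ATAC–seq' section (libraries and sequencing for H1 OE and KD experiments)), whereas the remaining cells were acid extracted for HPLC quantification and western blotting (see the 'Western blotting' section). The 15 cm plates were double cross-linked for Micro-C (see the 'Micro-C' section (MNase digestion and ligation)).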
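　　The per-dish cell numbers follow from the seeding density and the growth area of each dish. In the sketch below, the growth areas (56 cm2 for a 10 cm dish and 144 cm2 for a 15 cm dish) are assumptions back-calculated from the quoted cell numbers, not values given in the text; the names are hypothetical.

```python
# Assumed growth areas in cm2, back-calculated from the quoted cell numbers
GROWTH_AREA_CM2 = {"10 cm": 56, "15 cm": 144}


def cells_to_seed(density_per_cm2, dish):
    """Number of cells needed to seed one dish at the given density."""
    return density_per_cm2 * GROWTH_AREA_CM2[dish]


for dish in GROWTH_AREA_CM2:
    print(dish, cells_to_seed(25_000, dish))
```

　　With these areas, 25,000 cells per cm2 corresponds to 1.4 million cells per 10 cm dish and 3.6 million cells per 15 cm dish, matching the numbers above.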

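　　For infection, HEK293T cells were seeded at a density of 2 × 10^6 cells per 15 cm plate and grown in 30 ml HEK medium for 24 h. Twice as many plates of HEK cells as 15 cm plates of MEFs were prepared, owing to the two rounds of infection with viral supernatant (VSN). Twenty-two 15 cm plates of HEK cells were used for the H1.4-targeting shRNA and two 15 cm plates of HEK cells were used for the empty vector control. For transfection, 2.4 µg pMD2.G, 5.1 µg psPAX2 and either a mixture of the four H1.4-targeting shRNA plasmids (Horizon Discovery; TRCN0000096935 […], TRCN0000096937: TCTTAGCCTTAGTTGCCTTG, TRCN0000096938: TAGCTGCCTTAGGCTGCTGGGG) or 7.5 µg empty pLKO plasmid were dissolved in 1,710 µl Opti-MEM medium (Thermo Fisher Scientific, 31985062) with FuGENE 6 reagent (Promega, E2692). The shRNA and empty-vector transfection mixtures were mixed thoroughly by vortexing and incubated for 15 min at room temperature before being added to 15 cm plates containing HEK293T cells. After 16 h of incubation with the transfection mixture, the medium was replaced with fresh HEK medium and the transfected cells were cultured for 60 h. Lentivirus was collected by harvesting the 30 ml of supernatant, which was passed through a 0.45 µm PVDF filter unit (Stericup, Millipore) and supplemented with 8 µg ml−1 polybrene. Half of the VSN was flash-frozen in liquid nitrogen and stored at −80 °C.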

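　　At 24 h before virus collection, 2.8 million MEF129 cells (passage 2) were seeded into two 10 cm dishes (density, 25,000 per cm2) for Omni-ATAC and western blotting of the empty vector control. The 15 cm dishes (3.6 million cells per dish) were infected for Micro-C, Omni-ATAC and western blotting. For infection, 25 ml of H1.4-targeting shRNA VSN was added to each 15 cm plate together with 15 ml MEF medium supplemented with 8 µg ml−1 polybrene. A total of 8.5 ml of empty pLKO vector VSN was added to each 10 cm plate with 5 ml MEF medium containing 8 µg ml−1 polybrene. The remaining VSN with 8 µg ml−1 polybrene was flash-frozen in liquid nitrogen and stored at −80 °C until the second round of infection 72 h later. At 24 h after infection, the medium was changed to MEF medium containing 1 µg ml−1 puromycin to select for MEFs containing the pLKO vector. Then, 72 h after the initial infection, puromycin selection was paused and a second round of infection was performed using the frozen VSN, pre-warmed at 37 °C. Then, 24 h later, the medium was changed to MEF medium containing 1 µg ml−1 puromycin. Next, 144 h after the initial infection, two of the 15 cm plates infected with H1.4-targeting shRNA VSN and the two 10 cm plates infected with empty vector control VSN were collected by trypsinization and counted using a haemocytometer. Approximately 400,000 cells from the infected and uninfected samples were immediately subjected to the Omni-ATAC protocol (see the 'ATAC–seq' section (libraries and sequencing for H1 OE and KD experiments)), and the remaining cells were acid extracted for HPLC quantification and western blotting (see the 'Western blotting' section). The remaining 15 cm plates (infected with H1.4 shRNA) were double cross-linked for Micro-C (see the 'Micro-C' section (MNase digestion and ligation)).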

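　　H1.4 OE and KD were performed in TNG-MKOS-MEFs62 for ATAC–seq in a similar manner to MEF129 WT cells (see the 'H1 overexpression' and 'H1 KD' sections). In brief, for H1.4 OE, TNG-MKOS-MEFs (passage 3) were seeded at 27,000 cells per cm2 before viral infection (seven 10 cm plates for H1 overexpression and three 10 cm plates left uninfected). Cells were infected with the pFUW-tetO-H1.4 and pWPT-rtTA2M2 viruses at an MOI of 2. The medium was changed the next day, and viral gene expression was induced by applying doxycycline (2 µg ml−1) 48 h after infection to both infected and uninfected cells. ATAC was performed on infected and uninfected samples at 0 h of induction and at 72 h after induction. Western blotting was performed on histone extracts 72 h after doxycycline addition, compared with the uninfected cell line control, to confirm successful overexpression.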

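　　To achieve H1.4 KD, HEK293T cells were seeded 72 h before MEF infection and transfected 24 h later to produce VSN of the H1.4 shRNA pool (see the 'H1 KD' section above). At 24 h before infection, TNG-MKOS-MEFs (passage 3) were seeded at 21,500 cells per cm2. Three 10 cm plates were infected with 10 ml of H1.4 shRNA VSN and two plates were infected with 10 ml of empty vector VSN. The remaining VSN was flash-frozen in liquid nitrogen and stored at −80 °C. All VSN was supplemented with 8 µg ml−1 polybrene, and a further 5 ml of fresh MEF medium with polybrene was added to all plates. The medium was then changed to fresh MEF medium containing 1 µg ml−1 puromycin. Then, 48 h later, one plate infected with H1.4 shRNA VSN and one plate infected with empty vector VSN were collected for the 0 h doxycycline time point of the ATAC experiment. The remaining plates were infected with VSN a second time as before, with doxycycline added to a final concentration of 2 µg ml−1 to induce MKOS expression. The next day, the medium was changed to MEF medium containing 2 µg ml−1 doxycycline. Then, 72 h after the start of doxycycline induction, the remaining plates were collected for ATAC (see the 'ATAC–seq' section (libraries and sequencing for H1 OE and KD experiments)).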

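　　To downregulate H1.4 expression in fibroblasts, four different shRNA sequences targeting H1.4 (pLKO.1 vector) were packaged into replication-incompetent lentiviruses. Packaging used a mixture of lentiviral packaging vectors (psPAX2 and pMD2.G, ratio 1:1) and the four shRNAs (ratio 1:1:1:1), combined at a 1:1 ratio. Packaging was performed in HEK293T cells and the VSN was collected 48 h after transfection. Supernatants were filtered through a 0.45 µm filter, supplemented with 8 µg ml−1 polybrene (Sigma-Aldrich) and used to infect MEFs. Then, 24 h after infection, the medium was replaced with fresh DMEM containing 10% FBS.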

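　　Next, 4 days after infection, replication-incompetent lentiviruses containing the GETM factors (ratio 1:1:1:0.3) were packaged similarly and used to infect the H1.4-knockdown cells. Twenty-four hours after the second infection, the medium was replaced with fresh DMEM containing 10% FBS and 2 µg ml−1 doxycycline. Two weeks later, the medium was switched to TS cell reprogramming medium (supplemented with 20% FBS, 0.1 mM β-mercaptoethanol, 2 mM L-glutamine, 25 ng ml−1 human recombinant FGF4 (PeproTech), 1 µg ml−1 heparin (Sigma-Aldrich) and 2 µg ml−1 doxycycline). One week later, the medium was replaced with TX medium without doxycycline. Then, one week later, the plates were fixed and stained to identify positive colonies.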

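　　Similarly, for H1.4 overexpression, MEFs were infected with lentivirus encoding H1.4 using the pFUW-tetO-H1.4 plasmid. Lentivirus was packaged in HEK293T cells as described above. To initiate iTS cell reprogramming, H1.4 overexpression was induced using doxycycline (2 µg ml−1) as described above.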

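　　Histone proteins were isolated by extraction with 0.2 N sulfuric acid as previously described52. In brief, cells were resuspended in 0.3 M sucrose buffer and nuclei were obtained using a Dounce homogenizer. Nuclei were lysed using a high-salt buffer containing 0.35 M KCl, and histones were then solubilized using 0.2 N sulfuric acid, precipitated with ethanol and finally resuspended in nuclease-free water. Acid-extracted histones were quantified using the Pierce BCA protein assay kit according to the manufacturer's instructions (Thermo Fisher Scientific). Acid-extracted histones were analysed by reversed-phase high-pressure LC using a Waters 2695 system equipped with a Vydac 218TP C18 HPLC column. The effluent was monitored and peaks were recorded using a Waters 996 photodiode array detector at 214 nm. H1 peak integration was performed using Waters Empower Pro software (v.2) and normalized to the H2B peak.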

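　　Acid extracts were reduced in 10 mM DTT, 0.02% NP-40 and 100 mM NH4HCO3 for 1 h at 37 °C. Samples were then alkylated with 30 mM IAA for 45 min at room temperature in the dark. Reactions were then desalted into 50 mM NH4HCO3 using Zeba Spin 7K columns (Thermo Fisher Scientific), and the eluate was supplemented with trypsin (0.1 mg ml−1) and digested for 2 h at 37 °C. At the end of the 2 h, samples were supplemented with additional trypsin and allowed to digest overnight. Digestion was quenched with 1% formic acid, and samples were dried in a SpeedVac and resuspended in 130 µl MS sample buffer (0.1% formic acid, 1% acetonitrile in water).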

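　　LC–MS analysis was performed on a 5600+ mass spectrometer (AB Sciex) coupled to an M5 MicroLC system (AB Sciex/Eksigent) and a PAL3 autosampler. LC separation was performed in a trap-elute configuration, consisting of a trap column (Luna C18(2), 100 Å, 5 µm, 20 × 0.3 mm cartridge, Phenomenex) and an analytical column (Kinetex 2.6 µm XB-C18, 100 Å, 50 × 0.3 mm microflow column, Phenomenex). Mobile phase A consisted of 0.1% formic acid in water and mobile phase B consisted of 0.1% formic acid in acetonitrile.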

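　　Peptides in MS sample buffer were injected into a 50 µl sample loop, trapped on the trap column at 3% mobile phase B at a flow rate of 25 µl min−1 for 4 min and then separated on the analytical column by gradient elution at a flow rate of 5 µl min−1. The gradient was set as follows: 0–24 min, 3% to 35% phase B; 24–27 min, 35% to 80% phase B; 27–32 min, 80% phase B; 32–33 min, 80% to 3% phase B; 33–38 min, 3% phase B. An equal volume of each sample (30 µl) was injected four times, once for information-dependent acquisition (IDA), immediately followed by DIA/SWATH in triplicate. Acquisitions of different samples were separated by blank injections (80 µl MS sample buffer) to prevent sample carry-over. The mass spectrometer was operated in positive ion mode at 5,200 V, with source gas 1 at 30 psi, source gas 2 at 20 psi, curtain gas at 25 psi and a source temperature of 200 °C.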
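　　The gradient table above can be turned into a small lookup for the mobile phase B percentage at any time point. This is a sketch only: it assumes that each ramp is linear between the stated end points, which the text does not state explicitly, and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % mobile phase B) break points taken from the gradient description
GRADIENT = [(0, 3), (24, 35), (27, 80), (32, 80), (33, 3), (38, 3)]


def percent_b(t_min):
    """Percentage of mobile phase B at time t_min, assuming linear ramps."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-38 min gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 12, 25.5, 30, 35):
    print(t, percent_b(t))
```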

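　　IDA was performed to generate a reference spectral library for SWATH data quantification. The IDA method was set up with a 250 ms TOF-MS scan from 300 to 1,250 Da, followed by MS/MS scans in high-sensitivity mode from 100 to 1,500 Da of the top 25 precursor ions above a 100 cps threshold (100 ms accumulation time, 100 ppm mass tolerance, rolling collision energy and dynamic accumulation) for charge states from +2 […]. IDA files were searched using ProteinPilot (v.5.0.2, AB Sciex) with default settings for tryptic digestion and IAA alkylation against a protein sequence database.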

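　　The Mus musculus proteome FASTA file (54,910 protein entries, UniProt: UP000000589), supplemented with the sequences of common contaminants, was used as the reference for the search. Up to two missed cleavage sites were allowed. The mass tolerance for precursor and fragment ions was set to 100 ppm. A false discovery rate (FDR) of 5% was used as the cut-off for peptide identification.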

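　　For sequential window acquisition of all theoretical mass spectra (SWATH-MS)63, a 50 ms TOF-MS scan from 300 to 1,250 Da was performed, followed by MS/MS scans in high-sensitivity mode from 100 to 1,500 Da (15 ms accumulation time, 100 ppm mass tolerance, +2 to +5 z, rolling collision energy […]). DIA data were quantified using PeakView (v.2.2.2.0.11391, AB Sciex) with the SWATH Acquisition MicroApp (v.2.0.1.2133, AB Sciex) against the selected spectral library generated in ProteinPilot. Retention times of individual SWATH acquisitions were calibrated using 23 peptides of the core histone H4C1 (UniProt: P62806), which was highly represented in the IDA ion library and in all SWATH acquisitions. The following software settings were used: up to 25 peptides per protein, 6 transitions per peptide, 95% peptide confidence threshold, 5% peptide FDR, XIC extraction window of 10 min and XIC width of 100 ppm. Across all SWATH files, the quantification data for core and linker histone proteins were manually curated to exclude from consideration peptides that showed aberrant retention times in at least one SWATH acquisition (a difference of >20% from the IDA/ion library or from other SWATH acquisitions). Protein peak areas were exported as Excel files and processed as described below.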

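　　Individual H1 subtypes in MEFs were quantified using a combination of relative LC–MS measurements and absolute HPLC quantification, as previously described65.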

　　Initial quality-control analysis was performed using the FastQC toolkit (https://github.com/s-andrews/fastqc). ES cell H1 ChIP–seq reads and their associated input were trimmed to remove adapters and bases with Phred scores <30 using Cutadapt66 (cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -q 30). ChIP–seq, ATAC–seq and MNase–seq samples were aligned to mouse reference genome MGSCv37 (mm9) using Bowtie267 v.2.3.4.1, using a --very-sensitive call and paired-end settings (or single-end settings where appropriate). Aligned reads were sorted and subsequently converted to BAM format using the samtools suite68. RNA-seq samples were aligned using STAR (v.2.7)69 with --outFilterMultimapNmax 1. Duplicated reads were eliminated using the Picard (https://github.com/broadinstitute/picard) function MarkDuplicates, except for MNase–seq and RNA-seq, for which duplicates were retained. Sequencing replicates were merged using samtools merge. The sequencing coverage and the insert size distribution were measured from the resulting BAM files using Qualimap (v.2.2.1)70.

　　Micro-C libraries were aligned to the mm9 reference genome and processed using the Nextflow (https://www.nextflow.io/) pipeline distiller-nf (https://github.com/open2c/distiller-nf) using the following configurations; make_pairsam = False, drop_readid = False, parsing_options: ‘--add-columns mapq --walks-policy mask’, max_mismatch_bp = 1. Balanced multi-resolution cool (mcool) files were outputted with the following bin sizes: 10,000,000, 5,000,000, 2,500,000, 1,000,000, 500,000, 250,000, 100,000, 50,000, 25,000, 10,000, 5,000, 2,000, 1,000, 500, 100. 15U and 20U Micro-C libraries for each cell type were merged using pairtools merge (https://github.com/open2c/pairtools).

　　ChIP–seq narrowPeaks and summits showing significant enrichment over input DNA were called using MACS2 (v.2.1.1.20160309)71, and were controlled to a q-value (minimum FDR) cut-off of 0.01. To identify broadPeaks of TF binding, peaks were called using MACS2 with the following flags: -B --broad-cutoff 0.1 --broad --nomodel --extsize 200. Regions that overlapped with the ENCODE blacklist72 were removed using the bedtools73 intersect function (flag -v).

　　To obtain a consensus list of nucleosome positions, the alignments for each MNase concentration were merged into a single BAM file using samtools merge. Nucleosome and nucleosome dyad positions were called using the DANPOS274 function dpos with a 1% FDR, paired-end settings and bin size of 1 bp to ensure dyad position accuracy. A file of nucleosome dyad positions was then generated by taking the summit position and adding 1 to create a bed file of 1 bp chromosome coordinates. The smoothened.wig file of MNase signal from DANPOS2 was converted to bigwig using wigToBigWig (http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/) and used for heat maps and profiles of MNase signal.

　　The aligned reads (BAM files) were normalized for sequencing coverage to 1× genome depth (RPGC) using the bamCoverage tool from DeepTools275 with a bin size of 10 bp and extendReads parameter, chromosome X was ignored for normalization. The resulting bigwig files were converted to wig format using the UCSC bigWigToWig tool76, and subsequently converted into a bed file using the wig2bed77. To sort peaks of individual TFs based on either ChIP–seq or ATAC–seq enrichment, 1 bp summits produced by MACS2 were extended by 150 bp on each side to produce a 301 bp peak using the bedtools slop function73. The tag density under these peaks was then quantified using the bedmap function of BEDOPS77, against the RPGC-normalized bed file of either the ChIP or ATAC samples. Peaks were then sorted from highest-to-lowest enrichment using the UNIX command line sort function.
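　　The summit extension step can be illustrated in a few lines of Python. This sketch mirrors what a symmetric 150 bp extension of a 1 bp BED interval does, assuming 0-based, half-open BED coordinates; the function name and the optional chromosome-size clamp are illustrative and do not reproduce the bedtools implementation.

```python
def summit_to_peak(chrom, summit_start, flank=150, chrom_size=None):
    """Extend a 1 bp summit (summit_start, summit_start + 1) by flank bp per side.

    Coordinates are 0-based and half-open, as in BED files. The start is
    clamped at 0 and the end at chrom_size when a chromosome size is given.
    """
    start = max(0, summit_start - flank)
    end = summit_start + 1 + flank
    if chrom_size is not None:
        end = min(end, chrom_size)
    return chrom, start, end


chrom, start, end = summit_to_peak("chr1", 1000)
print(chrom, start, end, end - start)
```

　　A summit at position 1,000 therefore becomes the interval 850–1,151, which is 301 bp wide with the summit at its centre.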

　　For ATAC–seq-sorted peaks, peaks were split based on RPGC into open (>20 RPGC) or closed chromatin (<20 RPGC), representing the value whereby no ATAC enrichment is observed within the central 301 bp peak over the flanking 350 bp either side (total region of 1 kb). As there is no input DNA for ATAC–seq, we compared the enrichment of ATAC–seq within the peak to a 1 kb local region. By plotting ATAC–seq enrichment of TF sites as a function of the number of reads (sequence coverage normalized in RPGC or reads per genome coverage), we identified the baseline of 20 RPGC.

　　To generate the read density heat maps and line profiles, we first computed a density matrix using the DeepTools2 tool computeMatrix reference-point and the following parameters: --referencePoint center, --binsize 10, -b 1000 -a 1000, --sortregions keep, --missingDataAsZero and --averageTypeBins sum using the peak bed files as reference files (-R) and the normalized ChIP–seq and ATAC–seq bigwig files as score files (-S)75. The ENCODE blacklist was excluded. The resulting matrix was subsequently used to generate heat maps and profiles using Deeptools2 functions plotHeatmap and plotProfile, respectively75.

　　Histone H1 ChIP–seq data were processed as described above except that the Deeptools2 function bigwigCompare75 was used to subtract the RPGC-normalized input signal from the RPGC-normalized H1 ChIP–seq signal. ChIP–seq data for H1c and H1d78 were merged for analysing H1 in ES cells to obtain maximum coverage of H1-bound regions in ES cells.

　　Profiles of Micro-C contact junctions around TF sites were produced by generating a bed file containing 1 bp coordinates for each junction in a ‘.pairs’ contact file generated by the distiller-nf pipeline. This was then used to generate a genome coverage bedgraph using the bedtools73 function genomecov, which was subsequently converted to a bigwig file using bedGraphToBigWig (http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/). This bigwig file was then used as a sample file with DeepTools2 (ref. 75).

　　To assess peak overlaps between conditions (but not co-bound sites), all peaks were considered as 301 bp regions centred on the summit. This is because the average peak size was identified by MACS to be ~300 bp, and one nucleotide was added to place the summit in the middle. Overlapping peaks between conditions were identified using the Intervene venn function with the flag --save-overlaps79, such that regions would be called as overlapping based on a 1 bp or greater overlap. Bar plots were generated by counting the number of peaks in each list. For comparison of MYC peaks within closed and open chromatin across all reprogramming systems, intersection over union (the Jaccard index) was measured using the bedtools jaccard function, and ggplot2 was used to generate the resulting heat map73. Peaks were assigned to transcription start sites using the GREAT tool available online with mm9 association settings for ‘Single Nearest Gene’ with a maximum distance of 1,000 kb80.
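
　　The intersection-over-union measure can be sketched in a few lines of Python. This is an illustrative reimplementation of the base-pair Jaccard statistic, not the bedtools code; the dictionary layout (chromosome to list of half-open intervals) and function names are assumptions.

```python
def merge(intervals):
    """Merge overlapping or touching half-open intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered(intervals):
    """Total number of bases covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def intersect_bp(a, b):
    """Bases shared by two sorted, non-overlapping interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(peaks_a, peaks_b):
    """Intersection over union in bp; inputs map chrom -> [(start, end), ...]."""
    inter = union = 0
    for chrom in set(peaks_a) | set(peaks_b):
        a = merge(peaks_a.get(chrom, []))
        b = merge(peaks_b.get(chrom, []))
        shared = intersect_bp(a, b)
        inter += shared
        union += covered(a) + covered(b) - shared
    return inter / union if union else 0.0
```

　　Two 100 bp peaks that overlap by 50 bp share 50 of 150 covered bases, giving an index of one third.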

　　To quantify ATAC–seq on co-bound TF sites, 301 bp peaks for each TF were labelled with a single-letter identifier for each TF and combined into a single file using bedops --everything77. Bedtools merge was used to collapse overlapping peaks, with a --distinct setting applied to the single-letter label column to label each peak with the letter code for each TF present (that is, ‘OS’ for OCT4 and SOX2); awk was then used to count the number of TFs present by counting the number of letters. RPGC-normalized ATAC–seq data on these merged peaks were then quantified using bedmap --echo and the value was scaled by dividing by the peak width, to account for variability in peak size. These values were then used to generate violin plots using the ggplot2 functions geom_violin() and geom_boxplot().
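
　　The labelling and merging logic can be sketched as follows. This is a simplified stand-in for the bedops, bedtools and awk steps, with hypothetical function names; like the default behaviour of bedtools merge, it also merges directly adjacent (book-ended) intervals.

```python
def merge_labelled_peaks(peaks):
    """Collapse overlapping peaks and collect the distinct TF letters on each.

    `peaks` is an iterable of (chrom, start, end, label) tuples. Returns
    (chrom, start, end, labels, n_tfs) tuples, where `labels` is the sorted
    set of single-letter codes and `n_tfs` is the number of letters.
    """
    merged = []
    for chrom, start, end, label in sorted(peaks):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)
            last[3].add(label)
        else:
            merged.append([chrom, start, end, {label}])
    return [(c, s, e, "".join(sorted(tfs)), len(tfs)) for c, s, e, tfs in merged]


def signal_per_bp(signal_sum, start, end):
    """Scale a summed signal by peak width to account for variable peak size."""
    return signal_sum / (end - start)


# Hypothetical peaks: O = OCT4, S = SOX2, K = KLF4
example = merge_labelled_peaks([
    ("chr1", 100, 401, "O"),
    ("chr1", 300, 601, "S"),
    ("chr1", 1000, 1301, "K"),
    ("chr2", 100, 401, "O"),
])
```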

　　To generate lists of TF sites distal or proximal to MYC, closed chromatin peaks of all TFs within a combination were combined using bedops --everything without merging overlapping peaks. The master peak list was used as the input to bedtools window, with MYC peaks from that condition as the -b file and a -w flag of 350 to ensure detection of nearby MYC peaks. Proximal or distal sites were then obtained with the -u or -v flags, respectively.
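
　　A minimal sketch of the proximal/distal split is given below. It mimics the windowed overlap test conceptually (a peak is proximal if a MYC peak overlaps it after extending it by 350 bp on both sides); the function name and tuple layout are assumptions, and the quadratic scan is only suitable for small examples.

```python
def split_by_proximity(tf_peaks, myc_peaks, window=350):
    """Split (chrom, start, end) peaks into those near MYC and those not."""
    proximal, distal = [], []
    for chrom, start, end in tf_peaks:
        lo, hi = max(0, start - window), end + window
        near = any(
            c == chrom and s < hi and e > lo  # half-open interval overlap
            for c, s, e in myc_peaks
        )
        (proximal if near else distal).append((chrom, start, end))
    return proximal, distal


# Hypothetical coordinates
near_myc, far_from_myc = split_by_proximity(
    [("chr1", 1000, 1301), ("chr1", 5000, 5301), ("chr2", 1000, 1301)],
    [("chr1", 1500, 1801)],
)
```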

　　De novo motif analysis was performed using the MEME suite installed on a local Linux server81. First, the DNA sequences (FASTA) were generated from the central 200 bp of the ChIP–seq peak regions using bedtools getfasta73. To use as the background, DNA sequences (200 bp) were extracted from genomic regions located 1 kb upstream from the summit of each peak using bedtools shift73. All regions were filtered through the ENCODE blacklist. Finally, meme-chip was run using the Fasta sequence files and the corresponding Markov model and the following parameters: -nmeme 600, -meme-mod zoops, -meme-minw 6, -meme-maxw 18, -meme-maxsize 50000000, -dreme-e 0.00001, -dreme-m 20 using the JASPAR core motif database82. The most enriched de novo motifs discovered by MEME83 and DREME84 were analysed using CentriMO to confirm their central enrichment over the background sequences and compared to the canonical motifs using Tomtom.

　　Gene expression quantification was performed using the featureCounts function of the R package Rsubread85, using a gtf file containing the UCSC genes for mm9 with paired or single-end settings depending on the samples. Tables generated for the paired and single-end data were combined using cbind(). Differential gene expression analysis was performed using the package DESeq2 (v.1.22.2)86 with DESeqDataSetFromMatrix() followed by DESeq(). Genes with 0 counts in all of the conditions were excluded and the samples were normalized according to library size using sizeFactors(). Values then underwent regularized log transformation with rlog() and counts were obtained using assay(). Pearson correlation analysis was performed on the top 500 most variable genes using cor() with method=c(“pearson”), and the result was plotted using the package pheatmap. A PCA plot was generated using plotPCA() on the regularized log-transformed matrix. Differentially expressed genes at 72 h in each of the early reprogramming systems were identified using the results() function in DESeq2 using a contrast versus MEFs, lfcThreshold=1, altHypothesis=“greaterAbs” and alpha = 0.05.

　　To perform upset analysis of DEGs, unique DEG gene IDs were combined into a dataframe in R. This was then used as the input for the function upset() from the package upsetR. DEGs targeted by MYC were identified by taking the gene IDs from the output of the GREAT analysis of ChIP peaks and finding matching gene IDs in the DEG lists with join(). These were combined into a data.frame and plotted with upset().

　　To analyse TF enrichment at differentially expressed genes, the gene IDs of differentially expressed genes were combined with a list of coordinates of transcription start sites for the mm9 genome using UCSC refGene TSS mm9 coordinates of seqMINER87 with join(), and a bed file was generated. Approximately 4–5% of genes per set did not have matching gene IDs due to release differences in the annotation and were excluded. The Deeptools function plotProfile was used to plot TF enrichment as described for the ChIP/ATAC–seq analysis.

　　To define whether differentially expressed genes were targets of a specific TF, the nearest gene from each TF summit was obtained using GREAT. This gene list was compared with the list of differentially expressed genes using join() such that each gene appeared once in a final list of genes that are both TF targets and differentially expressed. Overlaps were identified using the package UpsetR88.

　　Fragment-size enrichment heat maps were drawn using plot2DO89 with ChIP–seq peak summits or TSS as a reference and the aligned raw BAM files as the sample. Only fragment sizes between 50 and 250 bp were considered. Heat-map scales were set to the same value between open and closed chromatin to allow for direct comparison of fragment enrichment.

　　Bound nucleosomes were identified by selecting the closest 1 bp nucleosome dyads to ChIP–seq summits using the closest features function from bedops77 (closest-features --delim ‘ ’ --dist --closest). Nucleosomes where the ChIP–seq summit was greater than 80 bp from the dyad were filtered out using awk and the remaining nucleosomes were labelled with a column containing a single letter identifier for that TF (that is, ‘O’ for OCT4). Co-bound mononucleosomes were identified by combining the lists of bound nucleosome dyads for each individual TF and merging using bedtools merge73, such that the single-letter TF label column would contain multiple identifiers if the same dyad was present in each list of bound nucleosomes. The number of TFs present on each nucleosome was counted by using awk to count the number of characters in this column.
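
　　The dyad assignment can be sketched with a binary search over sorted dyad positions. The sketch below works on a single chromosome, uses hypothetical function names, and assumes that a summit exactly 80 bp from the dyad is retained (the text only states that greater distances were removed).

```python
import bisect


def closest_dyad(summit, dyads_sorted):
    """Return the dyad position nearest to a summit, or None if there are none."""
    i = bisect.bisect_left(dyads_sorted, summit)
    candidates = dyads_sorted[max(0, i - 1):i + 1]
    return min(candidates, key=lambda d: abs(d - summit)) if candidates else None


def bound_dyads(summits, dyads, max_dist=80):
    """Dyads whose nearest ChIP-seq summit lies within `max_dist` bp."""
    dyads_sorted = sorted(dyads)
    bound = set()
    for summit in summits:
        dyad = closest_dyad(summit, dyads_sorted)
        if dyad is not None and abs(dyad - summit) <= max_dist:
            bound.add(dyad)
    return sorted(bound)


def cobound(bound_by_tf):
    """Map each dyad to the sorted letters of all TFs bound to it."""
    labels = {}
    for letter, dyads in bound_by_tf.items():
        for dyad in dyads:
            labels.setdefault(dyad, set()).add(letter)
    return {dyad: "".join(sorted(tfs)) for dyad, tfs in labels.items()}
```

　　Counting the characters in each label then gives the number of TFs on each nucleosome, as in the awk step described above.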

　　Bound nucleosome dyad positions were used to generate a 1 bp GRanges object90. IRanges90 was used to extend this object to 160 bp, representing our average nucleosome fragment size. Sequences were obtained for the positive strand using the BSgenomes function getSeq(). The position weight matrix (pwm) was obtained from the MEME-ChIP91 and used to scan each strand separately for nucleosome sequence using the seqPattern92 function motifScanHits() with 100% match score. To count motifs in the reverse orientation, the pwm was passed through Biostrings function reverseComplement() before scanning. The motif count for each strand was then assigned to the corresponding nucleosome dyad by counting the number of times that each sequence identifier appeared in the motifScanHits() output. Total motif counts were obtained by summing the values for the positive and negative strands.
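
　　The strand-wise counting can be illustrated with a simplified sketch in which the position weight matrix is replaced by an exact consensus string, which is an assumption made purely for brevity (the analysis itself scans with a pwm at a 100% match score). The motif in the example is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_matches(sequence, motif):
    """Count occurrences of `motif`, allowing overlapping matches."""
    count = start = 0
    while True:
        idx = sequence.find(motif, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1


def motif_counts(sequence, motif):
    """Return (forward, reverse, total) counts on the positive-strand sequence.

    Note that a palindromic motif is counted once in each orientation.
    """
    forward = count_matches(sequence, motif)
    reverse = count_matches(sequence, reverse_complement(motif))
    return forward, reverse, forward + reverse
```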

　　To generate heat maps of motif density, nucleosome dyads were extended symmetrically by 500 bp in each direction using IRanges90. An image matrix for each strand was generated using the function PatternHeatmap() of the R package heatmaps93 with the pwm and a minimum score between 80 and 95% depending on the motif length (shorter motifs used a higher match score)92. To generate a matrix for the reverse orientation of the motif, the sequences on the positive DNA strand were queried using the pwm reverse complemented with the Biostrings function reverseComplement(). Kernel smoothing was applied to the matrix using smoothHeatmap(). To plot both strands together, the matrix produced for the motif reverse complement was multiplied by −1. Positive and negative matrices were converted to data frames and then combined using rbindlist() from the package data.table (https://github.com/Rdatatable/data.table) by alternating lines according to the row number in the data frame such that, for every line of positive-strand scores on a sequence, the next line contains the corresponding scores for the motif reverse complement on that same sequence. The combined data frame was then converted back to a matrix using data.matrix(). Heat maps were then plotted using the R package heatmaps functions Heatmap() and plotHeatmapList().

　　To generate density plots of motif position around the dyad, the positive and negative strands were considered independently. Dyad positions were extended symmetrically by 200 bp using IRanges90, and sequences were obtained using getSeq(). The seqPattern function plotMotifOccurrenceAverage() was used with the MEME pwm and its reverse complement. A smoothing window of 3 bp was used for plotting. To increase the resolution of motif identification around the nucleosome dyad, only perfect motif matches were considered.

　　To identify TF-bound nucleosome arrays, a column containing a single-letter label was added to the broadpeak file for each TF. These broadpeaks were then combined into a single file using bedops --everything. Bedtools merge was used to collapse overlapping broadpeaks, with a --distinct setting applied to the single-letter label column to label each peak with the letter code for each TF present (that is, ‘O’ for OCT4); awk was then used to count the number of TFs present by counting the number of letters. The RPGC-normalized ATAC–seq signal was then quantified on these broadpeaks using bedmap --echo --sum --delim ‘ ’; this value was scaled by the broadpeak length in kb, and open and closed sites were separated using a read counts per kb cut-off value of 40. The positions of flanking nucleosome dyads were identified using bedops closest-features (with flags --delim ‘ ’ --dist --no-overlaps) and the array width was obtained by subtracting the first coordinate position of the upstream dyad from that of the downstream dyad using awk. This value was used for sorting on the basis of distance using the UNIX command line sort function. TF combinations were identified by selecting for different letter combinations using an awk equality. Oligonucleosomes were centred on their array midpoint by taking the coordinates of the upstream nucleosome dyad and shifting them by half the array width using awk. Arrays were centred on the left or right edge by shifting to either the upstream or downstream dyad coordinate. Array width histograms were generated by passing the array width values to the geom_histogram() function of ggplot2.
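
　　The array geometry and the per-kb ATAC–seq scaling reduce to simple arithmetic, sketched below. Function names are hypothetical; the midpoint is rounded down for odd widths, and a value exactly at the cut-off of 40 is assumed to count as closed, neither of which is specified in the text.

```python
def array_geometry(upstream_dyad, downstream_dyad):
    """Return (array width, array midpoint) from the two flanking dyads."""
    width = downstream_dyad - upstream_dyad
    midpoint = upstream_dyad + width // 2  # shift upstream dyad by half the width
    return width, midpoint


def atac_per_kb(signal_sum, start, end):
    """Scale the summed RPGC signal by the broadpeak length in kb."""
    return signal_sum / ((end - start) / 1000)


def is_open(signal_sum, start, end, cutoff=40):
    """Open if the per-kb signal exceeds the cut-off (assumed strict)."""
    return atac_per_kb(signal_sum, start, end) > cutoff
```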

　　Motif analysis for TF-bound nucleosome arrays was performed similarly to mononucleosomes. The seqPattern function plotMotifOccurrenceAverage() with the MEME pwm and its reverse complement were used to generate density plots. A smoothing window of 10 bp was used for plotting and percentage match cut-offs were set between 80 and 95% depending on the motif length. Motif heat maps were generated using the heat-map library as on mononucleosomes.

　　To identify motif occurrences within arrays, a GRanges object was created using the 1 bp array midpoint coordinates and adding metadata columns for the array width, half the array width, a left boundary of (5,000 − half the array width) and a right boundary of (5,000 + half the array width). This object was then extended ±5 kb using promoters() and the sequences were obtained using getSeq(). This gives arrays a maximum array size for motif identification of 10 kb but prevents most sequences from extending off the chromosome boundaries. Arrays that were extended off the chromosome boundary were filtered using GenomicRanges:::get_out_of_bound_index() (which occurred for approximately 1 in every 15,000 arrays). This filtering was applied to the 10 kb extended sequences, the 1 bp midpoints and the left boundaries as applicable. An ID column was then generated for each array using seq_along() and added as metadata. To identify motifs occurring within an array (such as for SOX2), motifScanHits() was used to identify motif occurrences on each 10 kb sequence using a MEME pwm (with a 95% match score used for the SOX2 motif). This produces a table of motif positions in which each line contains two columns: the sequence ID and the start position of a single motif on that sequence. A left_join() was used to match the motif table with the GRanges object of array midpoints by sequence ID. subset() was used to remove motifs whose start position fell outside the array edges, retaining those with motif_position >= left_boundary and motif_position <= right_boundary. A frequency table was then made to count the occurrence of each sequence ID in the filtered motif table and these counts were appended to the GRanges object of array midpoints to produce a column containing motif counts within each array. This process was repeated to add another column of motif counts on the bottom strand by passing the motif pwm through reverseComplement(). Motif counts per kb were obtained by dividing the motif count within the array by the array width in kb (after setting the width values for any array >10 kb to 10 kb). This value was used to filter arrays on the basis of motif counts per kb. Count histograms were generated by passing these motif counts to the geom_histogram() function of ggplot2. Motif heat maps and profiles were generated as described above for mononucleosomes, with the percentage match for the motif pwm typically set between 80% and 95%, depending on the length and degeneracy of the motif.
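
　　The boundary filter and the per-kb normalization can be sketched as follows. Positions are motif start coordinates within the 10 kb extended sequence, so the array midpoint sits at 5,000; the function names are hypothetical and this is not the deposited R code.

```python
def motifs_within_array(motif_starts, array_width, flank=5000):
    """Count motif starts between the left and right array boundaries."""
    half = array_width / 2
    left_boundary, right_boundary = flank - half, flank + half
    return sum(1 for pos in motif_starts if left_boundary <= pos <= right_boundary)


def motifs_per_kb(count, array_width, max_width=10_000):
    """Motif count divided by array width in kb, capping widths at 10 kb."""
    width_kb = min(array_width, max_width) / 1000
    return count / width_kb


# Hypothetical motif start positions on one extended sequence
starts = [100, 4700, 5000, 5300, 5301, 9000]
inside = motifs_within_array(starts, array_width=600)
density = motifs_per_kb(inside, array_width=600)
```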

　　All scripts for motif analysis with nucleosome positions have been deposited on GitHub (https://git.ecdf.ed.ac.ac.uk/soufi_lab/motif_nucleosome_arrays).

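　　Micro-C pile-up analysis was performed using the coolpup.py package94. To generate pile-up heat maps, a bed file containing TF-bound sites and the Micro-C mcool file were used with the --local setting and --ignore_diags set to 0. For matrices of Micro-C contacts in 20 kb padding windows around TF sites, 100 bp mcool bins were used; for 400 kb padding windows, 2 kb mcool bins were used.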

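　　Statistically significant loops connecting TF-binding sites were called using the FitHiChIP pipeline95 (https://ay-lab.github.io/fithichip/html/index.html). The following settings were used in the configuration file: COOL = path to .mcool file (see above), PeakFile = path …, MergeInt = 1.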

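　　Micro-C contact maps of individual loci were generated using the cLoops2 package96 (https://github.com/yaqiangcao/cloops2). First, Micro-C .pairs files were preprocessed using cLoops2 pre -format. A reasonable contact matrix resolution was estimated using the cLoops2 estRes function. cLoops2 plot was used to plot contact density over the specified genomic coordinates with -m obs, with -arch specified to draw arch plots and -triu specified to draw the binned triangular contact matrix. A bin size of 20 kb was used for regions larger than 1 Mb; otherwise, 500 bp bins were used.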

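　　Circular tracks of Micro-C contacts, MNase, ChIP–seq and ATAC–seq were generated using the HOMER package97 in combination with Circos98. First, duplicate-filtered ChIP–seq and ATAC–seq BAM files and merged MNase BAM files were converted to HOMER tag directories using makeTagDirectory with the flag -keepAll. To prepare Micro-C samples, .pairs files were converted to .hicsummary format by rearranging the columns as follows: egrep -v "(^#.*|^$)" filename.pairs | awk 'BEGIN {OFS="\t"} {print($1, $2, $3, $6, $4, $5, $7)}' - > file.hicsummary. This file was then converted to a HOMER tag directory using makeTagDirectory with the flag -format HiCsummary. Tracks were generated using the analyzeHiC command and the following parameters: -res 5000 -superRes 10000 -circos ciroutput -nomatrix -minDist 20000 -pvalue 0.000000000000001. Track scales, line thickness and colours were edited in the ciroutput.config file and the plot was regenerated using circos -conf.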
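
　　The column rearrangement can be sketched in Python as follows. The sketch assumes the common seven-column .pairs layout (read ID, chromosome 1, position 1, chromosome 2, position 2, strand 1, strand 2) and a tab-separated output of read ID, chromosome 1, position 1, strand 1, chromosome 2, position 2 and strand 2; the function name is hypothetical.

```python
def pairs_to_hicsummary(lines):
    """Yield tab-separated Hi-C summary lines from .pairs lines.

    Header lines (starting with '#') and empty lines are skipped, and the
    strand of the first read is moved next to its chromosome and position.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        read_id, chrom1, pos1, chrom2, pos2, strand1, strand2 = fields[:7]
        yield "\t".join([read_id, chrom1, pos1, strand1, chrom2, pos2, strand2])


# Hypothetical input lines
converted = list(pairs_to_hicsummary([
    "## pairs format v1.0\n",
    "\n",
    "r1\tchr1\t100\tchr1\t500\t+\t-\n",
]))
```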

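　　To define nucleosome orientations from Micro-C ligations, the .pairs files were split into three files according to read-pair orientation using awk. First, intrachromosomal ligation events were isolated, and reads were then filtered to obtain ligations spanning between 200 bp and 2 kb. Inward (in–in) pairs were defined by matching read pairs with the read orientation +/−: egrep -v "(^#. … awk 'BEGIN {OFS="\t"} …' -. Outward (out–out) pairs were defined by matching the −/+ read orientation: egrep -v "(^#. … awk 'BEGIN {OFS="\t"} …' -. Tandem (in–out or out–in) pairs were identified by the +/+ or −/− orientation as follows: egrep -v "(^#. … awk 'BEGIN {OFS="\t"} …' -. Tandem orientations were considered theoretically interchangeable and were not separated30. The distance between the ligation junctions of each pair was then determined and plotted using the ggplot2 function geom_density().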
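
　　The orientation classes and the distance filter can be sketched as below. The sketch assumes the common convention for position-sorted pairs, in which +/− is inward, −/+ is outward and same-strand pairs are tandem; the function names are hypothetical and this is not the awk code used in the analysis.

```python
def classify_pair(strand1, strand2):
    """Classify a read pair by strand orientation."""
    if strand1 == "+" and strand2 == "-":
        return "inward"
    if strand1 == "-" and strand2 == "+":
        return "outward"
    return "tandem"  # +/+ or -/-, treated as interchangeable


def junction_distance(chrom1, pos1, chrom2, pos2, min_dist=200, max_dist=2000):
    """Distance between ligation junctions, or None if the pair is filtered out.

    Pairs on different chromosomes, or separated by less than 200 bp or
    more than 2 kb, are discarded.
    """
    if chrom1 != chrom2:
        return None
    distance = abs(pos2 - pos1)
    return distance if min_dist <= distance <= max_dist else None
```

　　Collecting the retained distances separately for each orientation class gives the three distributions that are then plotted as density curves.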

　　Genome track screenshots were generated from the genome coverage (RPGC) data using the Integrative Genomics Viewer99.

　　The mononucleosome structure was built using Protein Data Bank (PDB) 5NL0 (ref. 100) and visualized using the PyMOL Molecular Graphics System v.3.0 (Schrödinger).

　　Nucleosome arrays were modelled in the open-source 3D computer graphics software Blender102 using PDB 6IPU and 6HKT (refs. 36,101) and EMD-2601 (ref. 37).

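　　Structure prediction of the MYC/MAX–TFAP2C complex was run using AlphaFold-Multimer in the COSMIC2 portal, using the amino acid sequences of the TF DBDs only. The resulting complex was then aligned with the crystal structure of human TFAP2A bound to DNA (PDB: 8J0K)104 using the PyMOL align function. Electrostatic surface charges were calculated using the APBS plugin in PyMOL105.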

　　Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
