Figure 7 depicts the number of shared clonotypes in the CF twin samples as a frequency distribution of the amino acid clonotypes present in two or more CF siblings. In this presentation we did not distinguish between sharing within a twin pair and sharing among unrelated individuals.
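The counting behind such a sharing distribution can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `sharing_distribution`, the sample identifiers and the CDR3 sequences are hypothetical, and each sample is assumed to be available as a plain list of CDR3 amino acid sequences.

```python
from collections import Counter


def sharing_distribution(samples):
    """Count, for each sharing level k >= 2, how many clonotypes occur in exactly k samples.

    samples: dict mapping a sample id to an iterable of CDR3 amino acid sequences.
    """
    samples_per_clonotype = Counter()
    for cdr3_list in samples.values():
        # A clonotype is counted once per sample, regardless of its copy number.
        samples_per_clonotype.update(set(cdr3_list))
    shared = {c: n for c, n in samples_per_clonotype.items() if n >= 2}
    return Counter(shared.values())


# Hypothetical toy input (sequences and sample ids are made up for illustration).
samples = {
    "twin1a": ["CASSLGQYF", "CASSPTGELFF", "CASSIRSSYEQYF"],
    "twin1b": ["CASSLGQYF", "CASSIRSSYEQYF"],
    "twin2a": ["CASSIRSSYEQYF", "CASSQDRGYEQYF"],
}
distribution = sharing_distribution(samples)
for k in sorted(distribution):
    print(f"{distribution[k]} clonotype(s) shared by {k} samples")
```

In line with the presentation described above, the sketch does not distinguish whether the sharing samples belong to the same twin pair.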

Utilizing the OLGA algorithm recently published by Sethna and colleagues (39), we calculated the probability of generating CDR3 amino acid sequences for various subsets of shared clonotypes.

Of the CDR3 amino acid sequences exclusively identified in both siblings of a twin pair, the median probability of generating one of these clonotypes was computed to be 3. A 2-fold higher median generation probability of 6.

We were curious to know whether and to what extent the shared CDR3 sequences had already been detected in healthy individuals or patients with other diseases. Presence of frequent amino acid clonotypes of CF siblings in the samples deposited in the immuneACCESS database. These public clones are very frequent within the immuneACCESS database.

The majority of these clonotypes are present only at low frequency in the immuneACCESS database. In contrast, clonotypes present only in twins or only in unrelated samples are comparatively rare. Just 73 shared clonotypes had not been detected before in the other projects, 31 of them in a twin pair and 42 in unrelated CF patients.

Their median generation probability was computed to be 7. These values are substantially lower than those calculated for the three sets of shared clonotypes described in the previous paragraph.

A subset of these sequences may have emerged in the monozygotic CF twins during shared exposure to opportunistic pathogens as a consequence of their inherited susceptibility to infection. Consistent with this interpretation, none of these 73 clonotypes was listed in the VDJdb database (38), which contains clonotypes with known specificities. Conversely, 47 of the 84 clonotypes that were identified in more than half of the CF twin samples could be mapped to the epitopes listed for Mycobacterium tuberculosis (one entry) and one or more viruses (67 entries) in the VDJdb database (38).

Forty clonotypes could be assigned to CMV epitopes and 11 clonotypes to EBV epitopes, eight of which were mapped to both CMV and EBV epitopes.

These 84 most abundant clonotypes in the CF twin cohort have a median generation probability of 1. The three data subsets of shared clonotypes of our CF cohort (twin pair only, unrelated patients only, twin pairs and unrelated patients) were analyzed for their frequency distribution among the non-CF samples of the immuneACCESS database (Figure 8, Table 4).
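The partition into the three data subsets can be sketched as below. This is an illustrative reading of the subset definitions, not the authors' code: it assumes that "twin pair only" means both siblings of exactly one pair and nobody else, "unrelated only" means two or more pairs but never both siblings of the same pair, and the third subset covers both conditions at once. The function `classify_shared`, the sample ids and the sequences are hypothetical.

```python
from collections import Counter


def classify_shared(clonotype_samples, twin_pair_of):
    """Assign each shared clonotype to one of three subsets.

    clonotype_samples: dict mapping a CDR3 sequence to the set of sample ids carrying it.
    twin_pair_of: dict mapping a sample id to the id of its twin pair.
    """
    subsets = {"twin_pair_only": set(), "unrelated_only": set(), "twin_and_unrelated": set()}
    for cdr3, sample_ids in clonotype_samples.items():
        if len(sample_ids) < 2:
            continue  # private clonotype, not shared
        per_pair = Counter(twin_pair_of[s] for s in sample_ids)
        in_both_twins = any(n >= 2 for n in per_pair.values())
        in_several_pairs = len(per_pair) >= 2
        if in_both_twins and in_several_pairs:
            subsets["twin_and_unrelated"].add(cdr3)
        elif in_both_twins:
            subsets["twin_pair_only"].add(cdr3)
        else:
            subsets["unrelated_only"].add(cdr3)
    return subsets


# Hypothetical toy data: two twin pairs, P1 and P2.
twin_pair_of = {"1a": "P1", "1b": "P1", "2a": "P2", "2b": "P2"}
clonotype_samples = {
    "CASSLGQYF": {"1a", "1b"},
    "CASSPTGELFF": {"1a", "2a"},
    "CASSIRSSYEQYF": {"1a", "1b", "2b"},
    "CASSQDRGYEQYF": {"2a"},
}
subsets = classify_shared(clonotype_samples, twin_pair_of)
```

Each subset can then be looked up separately in an external collection of non-CF samples to obtain its frequency distribution.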

If a clonotype had been identified only in the two siblings of a twin pair, it was generally infrequent in the non-CF samples. Conversely, if a clonotype had been identified in the two siblings of a twin pair and in addition in unrelated CF patients, the frequency among non-CF samples showed a skewed Gaussian distribution (Figure 8B).

This subset of CF clonotypes mainly consists of widely distributed public clones (Table 4), i. Since these frequent CDR3 sequences are shared by both monozygotic twin pairs and unrelated individuals, we hypothesized that they represent a pool of sequences that are non-randomly selected by both global and individual genetic and environmental factors. The OLGA algorithm predicted a high median generation probability of 2.

These conserved public sequences are central within TCR sequence-similarity networks (47, 48). Hence, we conclude that the clonotypes that are shared both within twin pairs and between unrelated CF patients represent public clones that are common in healthy humans.

Frequency distribution of shared CF amino acid clonotypes in samples of healthy humans (A), patients with infectious disease (B) and patients with cancer (C) deposited in the immuneACCESS database.

Phe508del homozygous CF twins. Contrary to expectation, we identified only a few hitherto unknown CDR3 sequences at the amino acid sequence level. We predominantly examined CF children and adolescents who were carrying individual CDR3 sequences at low copy numbers.

Individual clonal expansions were primarily seen in the clinically most severely affected patients, suggesting that clonality may be higher in CF adults with more advanced lung disease.

Alternatively, the compartmentalization in CF airways, with neutrophils accumulating in the lumen, whereas T cells stay in the submucosa and lymph nodes and are excluded from the lumen (9), could prevent a more active role of T cells against the luminal pathogens and the neutrophil-driven chronic inflammation.

Previous studies in healthy adults (18, 21, 51) have shown a genetic influence on the usage of V gene segments and, to a lesser, non-significant extent, on the usage of J gene segments. In contrast, we observed a significant genetic influence on the usage of both V and J segments in our monozygotic CF twin cohort. This difference may reflect sample size: the previous analyses examined only three (21), five or six pairs (18), whereas this study is based on 16 twin pairs.

This genetic bias showed up in the Jensen-Shannon distance of V and J segment usage, which was typically shortest between an individual and his or her co-twin (Figure 7). Recently the T-cell receptor repertoire has been reported for 28 monozygotic twin pairs of whom one or both twins were affected by the polyfactorial inflammatory bowel disease (IBD) (52). Deep sequencing of the TCR repertoire in cord blood and in healthy twins and unrelated individuals of different ages (23) has provided convincing evidence that a large portion of these clones are derived from the same progenitor T cells generated during fetal development.
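The Jensen-Shannon distance between two gene-usage profiles can be computed as sketched below. This is a generic illustration under stated assumptions, not the authors' implementation: the distance is taken as the square root of the Jensen-Shannon divergence with base-2 logarithms (so it ranges from 0 to 1), and the V gene counts and the function name `js_distance` are hypothetical.

```python
import math


def js_distance(p_counts, q_counts):
    """Jensen-Shannon distance (base 2) between two gene-usage count profiles.

    p_counts, q_counts: dicts mapping a gene segment name to its clonotype count.
    """
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    divergence = 0.0
    for gene in set(p_counts) | set(q_counts):
        p = p_counts.get(gene, 0) / p_total
        q = q_counts.get(gene, 0) / q_total
        m = 0.5 * (p + q)  # mixture distribution
        if p > 0:
            divergence += 0.5 * p * math.log2(p / m)
        if q > 0:
            divergence += 0.5 * q * math.log2(q / m)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


# Hypothetical V gene usage counts for two co-twins and one unrelated individual.
twin_a = {"TRBV5-1": 30, "TRBV7-9": 50, "TRBV20-1": 20}
twin_b = {"TRBV5-1": 28, "TRBV7-9": 52, "TRBV20-1": 20}
unrelated = {"TRBV5-1": 10, "TRBV7-9": 20, "TRBV20-1": 70}

d_twin = js_distance(twin_a, twin_b)
d_unrelated = js_distance(twin_a, unrelated)
print(f"co-twin: {d_twin:.3f}, unrelated: {d_unrelated:.3f}")
```

Computing this distance from every individual to every other individual in a cohort, and checking whether the smallest value falls on the co-twin, is one way to express the comparison described above.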

However, besides the persistence of fetal clones, further mechanisms should exist that generate an increased repertoire of shared clonotypes among monozygotic CF twin pairs.

The subsets of shared clonotypes in the CF cohort showed divergent frequency distributions in the non-CF cohort. Clonotypes were typically re-identified in only a few non-CF individuals if the sequence was shared by either (A) the siblings of a twin pair or (B) two or more unrelated CF patients. In either scenario, the frequency distribution of clonotypes in the non-CF cohort showed an exponential decline, indicating that the probability is comparatively low that the same clonotype will re-emerge in an unrelated individual.

In contrast, if both scenarios (A) and (B) applied, the respective clonotypes were widely distributed in the non-CF cohort (cf. These CDR3 sequences are public clones in the narrow sense which survive thymic selection in healthy humans with high probability (Figure 9). Public clones have been frequently detected in mouse models of autoimmune and infectious diseases, but the immuneACCESS database tells us that these clones are characteristic for healthy humans.

The genetic constraints of TCR repertoire formation showed up in highly similar distributions of the usage of J and V genes of in-frame clonotypes in a CF twin pair.

The original contributions presented in the study are publicly available.

The studies involving human participants were reviewed and approved by the Ethics Committee of Hannover Medical School (no.

FS and BT planned the study. SF and BT analyzed the data. SF and BT wrote the manuscript.



