The number of reads retained at each processing step. Values are shown as raw counts (bar labels) and as percentages (bar height) of the number of reads in the previous step.
AIRR-Seq - Preprocessing
An adaptive immune repertoire sequencing (AIRR-Seq) pipeline, part of the Online Pipelines Platform (OP²).
Pipeline overview
Our OP² immune repertoire sequencing pipeline is a bioinformatics workflow used to characterize the complementarity-determining region (CDR) of T-cell receptors (TCRs). You get insights into the quality of your data, an overview of the clonotype contents, somatic hypermutation rates, amino acid properties, and gene usage.
The workflow processes raw data from FastQ input files, aligns the sequences to the germline database, and then proceeds to the immunoprofiling of the samples. The results are made available via two reports, and the data is provided in the standardized AIRR format for downstream analyses. The pre-processing workflow processes the raw sequence data up to the point where the sequences are aligned against the IMGT germline reference. The post-processing workflow provides a set of analyses and metrics that describe the basic characteristics of the immune repertoire.
See the pipeline page for a more detailed overview.
Do you have any questions about these results? Just email us at helpdesk@excelra.com.
Report info
- Generated on: 2025-02-20, 21:01 UTC
- Experiment: Experiment_Bulk_TCR
- Pipeline: AIRR-Seq
- Report: Pre-processing Report
- Species: mouse
- Species Build: tr
General Statistics
Showing 36/36 rows and 4/6 columns.

| Sample Name | % Dups | % GC | Median Read Length | M Seqs |
|---|---|---|---|---|
| 10_TCRbeta_NOD_Rep2_R1 | 68.2% | 49% | 300 bp | 4.1 |
| 10_TCRbeta_NOD_Rep2_R2 | 91.6% | 48% | 300 bp | 4.1 |
| 11_TCRbeta_B6_Rep3_R1 | 54.1% | 48% | 300 bp | 5.7 |
| 11_TCRbeta_B6_Rep3_R2 | 95.6% | 48% | 300 bp | 5.7 |
| 12_TCRbeta_NOD_Rep3_R1 | 58.1% | 48% | 300 bp | 5.5 |
| 12_TCRbeta_NOD_Rep3_R2 | 95.3% | 48% | 300 bp | 5.5 |
| 1_TCRalpha_B6_Rep1_R1 | 94.1% | 50% | 300 bp | 3.1 |
| 1_TCRalpha_B6_Rep1_R2 | 95.3% | 48% | 300 bp | 3.1 |
| 2_TCRalpha_NOD_Rep1_R1 | 93.5% | 49% | 300 bp | 3.8 |
| 2_TCRalpha_NOD_Rep1_R2 | 93.5% | 48% | 300 bp | 3.8 |
| 3_TCRalpha_B6_Rep2_R1 | 75.7% | 49% | 300 bp | 3.8 |
| 3_TCRalpha_B6_Rep2_R2 | 91.5% | 48% | 300 bp | 3.8 |
| 4_TCRalpha_NOD_Rep2_R1 | 82.7% | 48% | 300 bp | 4.4 |
| 4_TCRalpha_NOD_Rep2_R2 | 91.8% | 48% | 300 bp | 4.4 |
| 5_TCRalpha_B6_Rep3_R1 | 79.6% | 47% | 300 bp | 6.9 |
| 5_TCRalpha_B6_Rep3_R2 | 96.4% | 48% | 300 bp | 6.9 |
| 6_TCRalpha_NOD_Rep3_R1 | 76.0% | 47% | 300 bp | 6.6 |
| 6_TCRalpha_NOD_Rep3_R2 | 96.4% | 48% | 300 bp | 6.6 |
| 7_TCRbeta_B6_Rep1_R1 | 85.2% | 46% | 300 bp | 6.4 |
| 7_TCRbeta_B6_Rep1_R2 | 96.4% | 47% | 300 bp | 6.4 |
| 8_TCRbeta_NOD_Rep1_R1 | 80.5% | 47% | 300 bp | 4.8 |
| 8_TCRbeta_NOD_Rep1_R2 | 96.6% | 48% | 300 bp | 4.8 |
| 9_TCRbeta_B6_Rep2_R1 | 70.9% | 49% | 300 bp | 3.3 |
| 9_TCRbeta_B6_Rep2_R2 | 91.7% | 48% | 300 bp | 3.3 |
| SRR12772040_L001_atleast-2 | 90.8% | 49% | 454 bp | 0.0 |
| SRR12772041_L001_atleast-2 | 93.1% | 50% | 464 bp | 0.0 |
| SRR12772042_L001_atleast-2 | 87.5% | 49% | 404 bp | 0.0 |
| SRR12772043_L001_atleast-2 | 89.6% | 49% | 404 bp | 0.0 |
| SRR12772044_L001_atleast-2 | 81.1% | 49% | 454 bp | 0.0 |
| SRR12772045_L001_atleast-2 | 82.3% | 49% | 444 bp | 0.0 |
| SRR12772046_L001_atleast-2 | 70.9% | 49% | 384 bp | 0.0 |
| SRR12772047_L001_atleast-2 | 88.8% | 49% | 374 bp | 0.0 |
| SRR12772048_L001_atleast-2 | 89.4% | 48% | 374 bp | 0.0 |
| SRR12772049_L001_atleast-2 | 88.0% | 48% | 374 bp | 0.0 |
| SRR12772050_L001_atleast-2 | 88.0% | 48% | 344 bp | 0.0 |
| SRR12772051_L001_atleast-2 | 93.9% | 49% | 464 bp | 0.0 |
Summary of Processing Steps
Quality Scores
Quality filtering is an essential step in most sequencing workflows. A Phred quality score is assigned to each nucleotide base call in automated sequencer traces, and it is logarithmically related to the probability that the base call is incorrect. The most commonly used approach is to remove reads with an average quality score below 20, which corresponds to a base being called incorrectly 1 in 100 times. pRESTO’s FilterSeq tool removes reads with mean Phred quality scores below 20.
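
The relationship between a Phred score and the error probability, and the mean-quality cutoff described above, can be sketched in a few lines of Python. This is a minimal illustration, not pRESTO's implementation: the function names are hypothetical, the quality strings are assumed to use the Phred+33 ASCII encoding, and the mean is assumed to be the arithmetic mean of the per-base scores.

```python
def phred_to_error_prob(q):
    """Convert a Phred score Q to the base-call error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def mean_phred(quality_string, offset=33):
    """Arithmetic mean of per-base Phred scores from a FASTQ quality string.

    Assumes Phred+33 encoding (offset=33), an assumption of this sketch.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def passes_quality_filter(quality_string, min_mean_q=20, offset=33):
    """Keep a read only if its mean Phred score is at least min_mean_q."""
    return mean_phred(quality_string, offset) >= min_mean_q


if __name__ == "__main__":
    # A Phred score of 20 corresponds to an error probability of 1 in 100.
    print(phred_to_error_prob(20))
    # 'I' encodes Q40 and '+' encodes Q10 in Phred+33.
    for qual in ("IIIIIIII", "++++++++"):
        print(qual, mean_phred(qual), passes_quality_filter(qual))
```

Under these assumptions, a read whose bases are all Q40 is retained, while a read whose bases are all Q10 falls below the cutoff of 20 and is removed.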