
Data Submission FAQs
What type of data is BLOODPAC interested in to support research?
We are agnostic to the molecular profiling approach (eg, Next Generation Sequencing [NGS], polymerase chain reaction [PCR], enzyme-linked immunosorbent assay [ELISA] or other protein-based techniques).
For the clinical or research samples that undergo molecular profiling, we are agnostic to the specific type of “bio-fluid” (eg, plasma, urine, pleural fluid, CSF, saliva). Importantly, since solid tumor tissue or cells (eg, lymphoma, leukemia) are the gold standard for clinical diagnostics, we are especially interested in paired solid-liquid tumor samples: for instance, NGS data from a lung cancer solid tumor along with NGS data from a blood plasma sample from the same patient. We are very interested in comparing and contrasting NGS results (or results from other molecular profiling modalities) from matched solid tumor tissue with the corresponding bio-fluid sample data.
The clinical context of the tumor samples and the corresponding molecular profiling data in the BLOODPAC Data Commons is critically important. This includes, for instance, patient demographics, tumor pathology, and salient clinical data such as treatment response, prior treatment regimens, findings from medical imaging, and comorbidities.
What type of scientific datasets are accepted?
Initial datasets contributed to BLOODPAC should support one of the current BLOODPAC Consortium projects including:
Data supporting Project Exhale - patient data focusing on liquid tissue concordance in lung cancer
Data supporting one or more of the BLOODPAC Pre-Analytical MTDEs
If you do not have a dataset to support the BLOODPAC Data Commons in one of the areas noted above, please suggest an alternative initial dataset to contribute and explain how it supports one or more of the BLOODPAC Consortium projects.
Does BLOODPAC require any patient-level demographics (eg, gender)?
No.
What are the requirements for members to submit datasets to the BLOODPAC Data Commons?
A BLOODPAC data submission by a BLOODPAC Consortium Member should be accompanied by:
A published technical paper or a technical report.
The dataset itself.
A metadata file that provides metadata about the dataset.
A data dictionary that describes the variables in the dataset.
Additional information, if necessary, so that there is enough information to replicate some of the results in the paper (state which results), or reproduce one of the figures with scientific content in the paper (state which figure).
What are the minimum technical data elements for data submitted to the BLOODPAC Data Commons?
The BLOODPAC Minimum Technical Data Element (MTDE) Working Group developed recommendations for 11 required preanalytic attributes that are essential for studies that it sponsors and for data contributed to the BLOODPAC Data Commons. These 11 recommended preanalytic data elements (formerly MTDEs), along with the process used to identify them, are described here.
What formats can I use to submit genomic data?
We accept the following four formats: BAM, CRAM, unaligned BAM, and FASTQ. For BAM/CRAM/uBAM, we expect proper read group information in the file header.
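As a rough pre-submission check for read group information, the sketch below inspects SAM-style header text, such as the output of samtools view -H on a BAM, CRAM, or unaligned BAM file. It is an illustration, not a BLOODPAC tool: the function names are hypothetical, and since this FAQ does not define "proper" read group information, the check only assumes what the SAM specification requires (at least one @RG line, each with a unique ID tag). Additional tags can be passed in if a project asks for them.

```python
def read_groups(header_text):
    """Collect the TAG:VALUE pairs of every @RG line in SAM-style header text."""
    groups = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@RG":
            continue
        # Each remaining field is TAG:VALUE; split on the first colon only.
        tags = dict(field.split(":", 1) for field in fields[1:] if ":" in field)
        groups.append(tags)
    return groups


def header_problems(header_text, required_tags=("ID",)):
    """Return a list of problems; an empty list means the header passed.

    required_tags defaults to ID only (an assumption based on the SAM
    specification, not a rule stated by BLOODPAC).
    """
    groups = read_groups(header_text)
    if not groups:
        return ["no @RG line found in header"]
    problems = []
    seen_ids = set()
    for index, tags in enumerate(groups, start=1):
        for tag in required_tags:
            if not tags.get(tag):
                problems.append(f"@RG line {index}: missing {tag}")
        rg_id = tags.get("ID")
        if rg_id and rg_id in seen_ids:
            problems.append(f"@RG line {index}: duplicate ID {rg_id}")
        if rg_id:
            seen_ids.add(rg_id)
    return problems


if __name__ == "__main__":
    # Made-up header used purely for illustration.
    example = "@HD\tVN:1.6\n@RG\tID:rgA\tSM:sample1\n@RG\tID:rgB\tSM:sample1\n"
    print(header_problems(example))
```

Working on header text rather than on the binary file keeps the sketch free of third-party dependencies; the header can be exported once per file and checked in bulk.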
What is the best format for FASTQ submission?
One FASTQ tarball is expected for each sample, either .tar or .tar.gz. If a sample has multiple read groups, include them all in the same tarball. Files inside the tarball should be named “readgroupname_[12s].(fq|fastq)(.gz)?”. The suffix can be fq, fastq, fq.gz, or fastq.gz. The prefixes are readgroupname_1 and readgroupname_2 for paired-end reads, or readgroupname_s for single-end reads. At this time, we cannot accept FASTQ chunks.
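The naming rule above can be checked before upload with a short script. The following is a minimal sketch, not an official validator: the function names are hypothetical, it assumes the files sit at the top level of the tarball (no directory prefixes), and it assumes that a paired-end read group should supply both its _1 and _2 files, which this FAQ implies but does not state outright.

```python
import re

# Pattern taken from the FAQ text: readgroupname_[12s].(fq|fastq)(.gz)?
MEMBER_RE = re.compile(
    r"^(?P<read_group>.+)_(?P<mate>[12s])\.(?:fq|fastq)(?:\.gz)?$"
)
TARBALL_SUFFIXES = (".tar", ".tar.gz")


def parse_member(name):
    """Return (read_group, mate) for a conforming file name, else None."""
    match = MEMBER_RE.match(name)
    if match is None:
        return None
    return match.group("read_group"), match.group("mate")


def check_tarball_listing(tarball_name, member_names):
    """Return a list of problems; an empty list means the listing looks fine."""
    problems = []
    if not tarball_name.endswith(TARBALL_SUFFIXES):
        problems.append(f"{tarball_name}: expected .tar or .tar.gz")
    mates = {}  # read group name -> set of mate labels seen
    for name in member_names:
        parsed = parse_member(name)
        if parsed is None:
            problems.append(f"{name}: does not match the naming pattern")
            continue
        read_group, mate = parsed
        mates.setdefault(read_group, set()).add(mate)
    for read_group, seen in sorted(mates.items()):
        # Assumption: paired-end groups carry both _1 and _2; single-end use _s.
        if seen not in ({"1", "2"}, {"s"}):
            problems.append(
                f"{read_group}: incomplete or mixed mates {sorted(seen)}"
            )
    return problems


if __name__ == "__main__":
    # Made-up file names used purely for illustration.
    print(check_tarball_listing(
        "sample1.tar.gz", ["rgA_1.fq.gz", "rgA_2.fq.gz", "rgB_s.fastq"]
    ))
```

The member names can come from tarfile.open(path).getnames() in the Python standard library, so the same check works on a real archive without unpacking it.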
Are there filename expectations?
For BAM/CRAM/unaligned BAM, we expect proper read group information in the file header but have no restrictions on the file names.
What formats can I use to send processed / downstream genomic data?
We can accept VCFs. There is no standard naming/header convention for submitted VCFs. If you submit VCFs, we prefer to receive the raw data as well.
May I submit data in increments?
Yes. Please fill out a new data inventory form for every dataset you submit. If you are going to submit data in three increments, please fill out the data inventory form for each of the three datasets.