BDOSE v1.1 file format

A BDOSE file consists of the following sections, in this order:

1. Header
2. SNP identifiers from Z file (sequence of NSNPs blocks)
3. Sample IDs (sequence of NSamples blocks)
4. Dosage data offsets
5. Dosage data (sequence of NSNPs blocks)

The header has the following layout:

| Bytes | Description |
|---|---|
| 8 | Magic number (bdose1.1) |
| 4 | Unsigned integer indicating the length LBGEN_filename in bytes of the BGEN filename used to generate the BDOSE file |
| LBGEN_filename | Name of the BGEN file |
| 8 | Unsigned integer indicating the size SBGEN_file of the BGEN file in bytes |
| Min(1000, SBGEN_file) | First Min(1000, SBGEN_file) bytes of the BGEN file |
| 8 | Unsigned integer indicating the size SBDOSE_file of the BDOSE file in bytes |
| 4 | Unsigned integer indicating the number of samples NSamples in the BDOSE file |
| 4 | Unsigned integer indicating the number of SNPs NSNPs in the BDOSE file |
| 1 | Unsigned integer representing the compression level Clevel used in the BDOSE file. Clevel sets the number of bytes Nbytes used for each dosage value: Clevel = 0 indicates 2 bytes, Clevel = 1 indicates 4 bytes, Clevel = 2 indicates 8 bytes, Clevel = 3 indicates 1 byte |
| 8 | Unsigned integer indicating the start position of the sample IDs in the BDOSE file |
| 8 | Unsigned integer indicating the start position of the dosage data offsets in the BDOSE file |
| 8 | Unsigned integer indicating the start position of the dosage data in the BDOSE file |
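
The header layout above can be read field by field. The following is a minimal sketch, not a reference implementation. It assumes little-endian byte order and UTF-8 text, neither of which is stated in the table, and the names `BdoseHeader` and `read_header` are hypothetical.

```python
import io
import struct
from dataclasses import dataclass

MAGIC = b"bdose1.1"  # 8-byte magic number from the header table


@dataclass
class BdoseHeader:
    bgen_filename: str
    bgen_size: int
    bgen_first_bytes: bytes
    bdose_size: int
    n_samples: int
    n_snps: int
    clevel: int
    sample_ids_start: int
    offsets_start: int
    dosage_start: int


def _take(stream, n):
    """Read exactly n bytes or fail loudly."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of file in BDOSE header")
    return data


def _uint(stream, fmt):
    """Read one unsigned integer; '<' (little-endian) is an assumption."""
    return struct.unpack(fmt, _take(stream, struct.calcsize(fmt)))[0]


def read_header(stream):
    """Parse the BDOSE v1.1 header from a binary stream positioned at byte 0."""
    if _take(stream, 8) != MAGIC:
        raise ValueError("not a BDOSE v1.1 file")
    name_len = _uint(stream, "<I")
    name = _take(stream, name_len).decode("utf-8")  # encoding is an assumption
    bgen_size = _uint(stream, "<Q")
    first_bytes = _take(stream, min(1000, bgen_size))
    bdose_size = _uint(stream, "<Q")
    n_samples = _uint(stream, "<I")
    n_snps = _uint(stream, "<I")
    clevel = _uint(stream, "<B")
    sample_ids_start = _uint(stream, "<Q")
    offsets_start = _uint(stream, "<Q")
    dosage_start = _uint(stream, "<Q")
    return BdoseHeader(name, bgen_size, first_bytes, bdose_size, n_samples,
                       n_snps, clevel, sample_ids_start, offsets_start,
                       dosage_start)
```

A caller would typically open the file in binary mode, call `read_header`, and compare `bdose_size` with the actual file size as a basic integrity check.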

Each SNP identifier block has the following layout:

| Bytes | Description |
|---|---|
| 4 | Unsigned integer indicating the length LBlock of the SNP identifier block in bytes |
| 4 | Unsigned integer indicating the line in which the SNP appears in the Z file |
| 2 | Unsigned integer indicating the length Lrsid of the entry in column rsid of the Z file in bytes |
| Lrsid | Entry in column rsid of the Z file |
| 4 | Unsigned integer indicating the entry in column position of the Z file |
| 2 | Unsigned integer indicating the length Lchromosome of the entry in column chromosome of the Z file in bytes |
| Lchromosome | Entry in column chromosome of the Z file |
| 4 | Unsigned integer indicating the length Lallele1 of the entry in column allele1 of the Z file in bytes |
| Lallele1 | Entry in column allele1 of the Z file |
| 4 | Unsigned integer indicating the length Lallele2 of the entry in column allele2 of the Z file in bytes |
| Lallele2 | Entry in column allele2 of the Z file |
| LBlock = 20 + Lrsid + Lchromosome + Lallele1 + Lallele2 is the number of bytes in the SNP identifier block, not counting the 4-byte LBlock field itself | |
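
As an illustration of the SNP identifier layout, the sketch below parses one block from a bytes-like buffer and checks the LBlock arithmetic. It assumes little-endian integers and UTF-8 strings, which the table does not specify, and `SnpId` and `parse_snp_block` are hypothetical names.

```python
import struct
from typing import NamedTuple


class SnpId(NamedTuple):
    z_line: int
    rsid: str
    position: int
    chromosome: str
    allele1: str
    allele2: str


def parse_snp_block(buf, offset=0):
    """Parse one SNP identifier block; return (SnpId, offset of next block)."""
    (block_len,) = struct.unpack_from("<I", buf, offset)  # LBlock
    pos = offset + 4
    end = pos + block_len  # LBlock does not count its own 4 bytes

    def take_int(fmt):
        nonlocal pos
        (value,) = struct.unpack_from(fmt, buf, pos)  # little-endian assumed
        pos += struct.calcsize(fmt)
        return value

    def take_str(length):
        nonlocal pos
        text = bytes(buf[pos:pos + length]).decode("utf-8")  # assumed encoding
        pos += length
        return text

    z_line = take_int("<I")
    rsid = take_str(take_int("<H"))
    position = take_int("<I")
    chromosome = take_str(take_int("<H"))
    allele1 = take_str(take_int("<I"))
    allele2 = take_str(take_int("<I"))
    if pos != end:
        raise ValueError("SNP identifier block length does not match LBlock")
    return SnpId(z_line, rsid, position, chromosome, allele1, allele2), end
```

Because each block carries its own length, a reader that only needs to count or skip SNPs can jump ahead by `4 + LBlock` bytes without decoding the strings.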

Each sample ID block has the following layout:

| Bytes | Description |
|---|---|
| 4 | Unsigned integer indicating the length Lsample_ID of the sample ID in bytes |
| Lsample_ID | Sample ID |
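
The sample ID section is a run of length-prefixed strings, so it can be read with a simple loop. This is a sketch under the assumptions of little-endian lengths and UTF-8 text. `parse_sample_ids` is a hypothetical helper, and `n_samples` is the NSamples value from the header.

```python
import struct


def parse_sample_ids(buf, n_samples, offset=0):
    """Read n_samples length-prefixed IDs; return (ids, offset after the last)."""
    ids = []
    pos = offset
    for _ in range(n_samples):
        (length,) = struct.unpack_from("<I", buf, pos)  # Lsample_ID, assumed LE
        pos += 4
        raw = bytes(buf[pos:pos + length])
        if len(raw) != length:
            raise ValueError("sample ID runs past the end of the buffer")
        ids.append(raw.decode("utf-8"))  # encoding is an assumption
        pos += length
    return ids, pos
```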

The dosage data offsets section has the following layout:

| Bytes | Description |
|---|---|
| 8 × NSNPs | Unsigned integers indicating the start position of the compressed dosage data for each SNP |
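
The offsets section is what makes random access to a single SNP possible: a reader can look up one entry and seek directly to that SNP's dosage block. A minimal sketch, assuming little-endian 8-byte integers (`parse_dosage_offsets` is a hypothetical name):

```python
import struct


def parse_dosage_offsets(buf, n_snps, offset=0):
    """Return the NSNPs start positions stored in the offsets section."""
    # '<' (little-endian) is an assumption; 'Q' is an 8-byte unsigned integer.
    return list(struct.unpack_from(f"<{n_snps}Q", buf, offset))
```

With a file object `f`, reading SNP `i` would then start with `f.seek(offsets[i])`, assuming the stored positions are measured from the start of the BDOSE file as the other start positions in the header are.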

Each dosage data block has the following layout:

| Bytes | Description |
|---|---|
| 4 | Unsigned integer indicating the size Scompressed of the compressed dosage data in bytes |
| 4 | Unsigned integer indicating the size Suncompressed of the uncompressed dosage data in bytes |
| Scompressed - 4 | Zstandard-compressed dosage data in integer format |

Missing values are coded as follows:

| Clevel | Missing value y |
|---|---|
| 0 | 53248 |
| 1 | 3489660928 |
| 2 | 14987979559889010688 |
| 3 | 208 |


Convert a dosage value y from integer format to floating-point format x with the following transformation:

$$x = 2^{2 - 8 \times \text{Nbytes}} \times y$$

See Clevel in the header for the number of bytes Nbytes.
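
The transformation and the missing-value codes can be combined into one decoding step. The sketch below works on a payload that has already been Zstandard-decompressed (decompression is left out because it needs a third-party binding) and assumes little-endian integers. `CLEVELS` and `decode_dosages` are hypothetical names. Each missing code would itself map to 3.25 under the transformation, so it is checked before scaling.

```python
import struct

# Clevel -> (Nbytes, struct code, missing-value code), taken from the tables.
CLEVELS = {
    0: (2, "H", 53248),
    1: (4, "I", 3489660928),
    2: (8, "Q", 14987979559889010688),
    3: (1, "B", 208),
}


def decode_dosages(raw, clevel):
    """Convert decompressed integer dosages to floats; missing becomes None."""
    nbytes, code, missing = CLEVELS[clevel]
    if len(raw) % nbytes:
        raise ValueError("payload size is not a multiple of Nbytes")
    count = len(raw) // nbytes
    scale = 2.0 ** (2 - 8 * nbytes)  # x = 2^(2 - 8 * Nbytes) * y
    values = struct.unpack(f"<{count}{code}", raw)  # little-endian assumed
    return [None if y == missing else scale * y for y in values]
```

For example, with Clevel = 0 the scale is 2^-14, so a stored integer of 16384 decodes to a dosage of 1.0.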