BioD

A library for computational biology & bioinformatics.

Installing a D compiler

The easiest way is to download the DMD compiler for your platform and follow the platform-specific instructions to install the precompiled binaries. Users on macOS can use Homebrew to install the dmd or ldc compilers.

Use your favourite text editor to create a file named hello.d and type the following code:

print ready to use
import std.stdio;

void main() {
    writeln("I am ready to use BioD!");
}
Compile and run the program
compiling and running a D program with DMD
$ dmd hello.d   # compile it

$ ./hello       # run the compiled program

I am ready to use BioD!
Congratulations, your D compiler is waiting for your instructions.

Using BioD in your project

To use BioD in your project, you can add the library as a dependency with DUB, D's package manager. If you prefer not to use DUB, you can clone the library into your existing project directory and point the compiler at it during the compile step, as shown below:

$ dmd -I../path_to_biod -i my_biod_source.d

Examples of using BioD

BioD is very efficient at manipulating BAM files. BAM is the compressed binary version of the SAM format and is used to represent aligned sequences. A BAM file contains a header section and an alignment section. The header contains information about the file, such as the sample name, reference sequence length, and alignment method. The alignment section contains the read name, read sequence, read quality, alignment information, and custom tags for each read.
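
To make the alignment-section fields concrete, here is a minimal sketch that splits one line of SAM text (the uncompressed counterpart of BAM) into some of its mandatory fields. It uses only the D standard library, not BioD. The `SamRecord` struct and `parseSamLine` function are hypothetical names chosen for illustration, and the sample line is made up.

```d
import std.array : split;
import std.conv : to;
import std.exception : enforce;
import std.stdio : writeln;

/// Hypothetical container for a few of the 11 mandatory SAM fields.
struct SamRecord {
    string name;      // QNAME: read name
    string reference; // RNAME: reference sequence name
    int position;     // POS: 1-based leftmost mapping position
    string cigar;     // CIGAR: alignment description
    string sequence;  // SEQ: read bases
    string quality;   // QUAL: ASCII-encoded base qualities
    string[] tags;    // optional custom tags (TAG:TYPE:VALUE)
}

/// Splits one tab-separated SAM alignment line into a SamRecord.
SamRecord parseSamLine(string line) {
    auto fields = line.split("\t");
    enforce(fields.length >= 11, "a SAM alignment line has at least 11 fields");
    return SamRecord(fields[0], fields[2], fields[3].to!int,
                     fields[5], fields[9], fields[10], fields[11 .. $]);
}

void main() {
    // Made-up record for illustration only.
    auto line = "read1\t0\tchr1\t150\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tNM:i:0";
    auto rec = parseSamLine(line);
    writeln(rec.name, " aligned to ", rec.reference, " at ", rec.position);
    writeln("CIGAR: ", rec.cigar, ", tags: ", rec.tags);
}
```

In practice BioD's `BamReader` decodes the binary BAM records for you, so a parser like this is only useful for understanding which fields a read carries.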

  1. Reading a BAM file

    We can use the BioD library to read a BAM file. This example reads a BAM file and performs a query for specific columns. We create a BAM object and retrieve information about specific reads spanning a region. We then create a pileup and iterate over this pileup.

    Reading a BAM file
    
    import std.stdio;
    
    import bio.std.hts.bam.reader;
    import bio.std.hts.bam.pileup;
    
    void main() {
    
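        // Open the BAM file
        auto bam = new BamReader("my_file.bam");
        // Fetch the reads overlapping the region 150 .. 160 on chr1
        auto reads = bam["chr1"][150 .. 160];
        // Build a pileup limited to reference positions 155 up to, but not including, 158;
        // `false` means MD tags are not used to recover reference bases
        auto pileup = makePileup(reads, false, 155, 158);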
    
        foreach (column; pileup) {
            writeln("Reference position: ", column.position);
            writeln("    Coverage: ", column.coverage);
            writeln("    Reads:");
    
            foreach (read; column.reads) {
                writefln("%30s\t%s\t%.2d\t%s\t%2s/%2s\t%2s/%2s\t%10s\t%s %s",
                read.name,
                read.current_base,
                read.current_base_quality,
                read.cigar_operation,
                read.cigar_operation_offset + 1, read.cigar_operation.length,
                read.query_offset + 1, read.sequence.length,
                read.cigarString(),
                read.cigar_before, read.cigar_after);
            }
        }
    }
    
    The output for one of these positions looks like this.
    example output for a single position
    
    Reference position: 155
        Coverage: 11
        Reads:
            EAS221_3:4:30:1452:1563   A       27      35M     35/35   35/35          35M      [] []
            EAS114_45:1:77:1000:1780  A       22      35M     34/35   34/35          35M      [] []
            EAS114_45:4:48:310:473    A       24      35M     34/35   34/35          35M      [] []
            B7_591:2:279:124:41       A       20      36M     33/36   33/36          36M      [] []
            EAS112_32:8:89:254:332    A       26      35M     33/35   33/35          35M      [] []
            B7_597:7:103:731:697      A       25      35M     32/35   32/35          35M      [] []
            EAS139_11:2:71:83:58      A       27      9M       9/ 9    9/35      9M2I24M      [] [2I, 24M]
            EAS192_3:4:63:5:870       A       27      9M       9/ 9    9/35      9M2I24M      [] [2I, 24M]
            EAS139_19:2:29:1822:1881  A       27      7M       7/ 7    7/40      7M2I31M      [] [2I, 31M]
            EAS221_3:2:100:1147:124   G       27      35M      7/35    7/35          35M      [] []
            EAS192_3:8:6:104:118      G       27      35M      3/35    3/35          35M      [] []
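
    The coverage of a pileup column is, roughly, the number of reads that overlap that reference position. The following sketch illustrates the idea on made-up intervals using only the D standard library. It does not use BioD, and `Interval` and `coverageAt` are hypothetical names. BioD's pileup additionally tracks per-read state such as the current base and CIGAR operation.

    ```d
    import std.algorithm : count;
    import std.stdio : writeln;

    /// Hypothetical half-open interval [start, end) spanned by one aligned read.
    struct Interval {
        ulong start;
        ulong end;
    }

    /// Counts how many intervals overlap the given reference position.
    size_t coverageAt(const(Interval)[] reads, ulong position) {
        return reads.count!(r => r.start <= position && position < r.end);
    }

    void main() {
        // Made-up alignments for illustration only.
        auto reads = [Interval(150, 157), Interval(152, 160), Interval(156, 170)];
        foreach (position; 155 .. 158)
            writeln("Reference position: ", position,
                    "    Coverage: ", coverageAt(reads, position));
    }
    ```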
    
  2. Reading multiple BAM files

    We can also read from several BAM files at once.

    reading multiple bam files
    
    import bio.std.hts.bam.multireader;
    import bio.std.hts.bam.read : compareCoordinates;
    import bio.std.hts.bam.pileup;
    
    import std.algorithm;
    import std.conv;
    import std.stdio;
    
    void main(){
     // if the BAM files can be merged, they can be traversed simultaneously
     auto bam = new MultiBamReader(["../test/data/illu_20_chunk.bam", "../test/data/ion_20_chunk.bam"]);
     auto pileup = makePileup(bam.reads,true,32_000_083,32_000_089);
    
     foreach (column; pileup) {
        writeln("Column position: ", column.position);
        writeln("   Ref.base: ", column.reference_base);
        writeln("   Coverage: ", column.coverage);
     }
    
    }
    
    Expected output
    
    Column position: 32000083
       Ref.base: G
       Coverage: 23
    Column position: 32000084
       Ref.base: C
       Coverage: 23
    Column position: 32000085
       Ref.base: C
       Coverage: 23
    Column position: 32000086
       Ref.base: C
       Coverage: 24
    Column position: 32000087
       Ref.base: C
       Coverage: 24
    Column position: 32000088
       Ref.base: T
       Coverage: 24
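
    The idea behind traversing several coordinate-sorted files at once can be sketched with the standard library alone: two sorted inputs are merged into a single sorted stream. This illustrates the concept and is not BioD's implementation. `Read`, `mergeByPosition`, and the sample data are hypothetical.

    ```d
    import std.algorithm.sorting : merge;
    import std.array : array;
    import std.stdio : writeln;

    /// Hypothetical stand-in for an aligned read.
    struct Read {
        string name;
        ulong position;
    }

    /// Merges two inputs that are each already sorted by position.
    /// `merge` itself is lazy; `.array` materializes the result for convenience.
    Read[] mergeByPosition(Read[] first, Read[] second) {
        return merge!((a, b) => a.position < b.position)(first, second).array;
    }

    void main() {
        // Made-up reads for illustration only.
        auto illumina = [Read("illu_1", 32_000_083), Read("illu_2", 32_000_087)];
        auto ion = [Read("ion_1", 32_000_084), Read("ion_2", 32_000_086)];

        foreach (read; mergeByPosition(illumina, ion))
            writeln(read.position, "\t", read.name);
    }
    ```

    Merging only gives a meaningful result when every input is sorted by the same key, which is why the files must be mergeable before they can be traversed together.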