BAM are usually used to store short read alignments. The size of BAM files is usually in the range of the gigabytes. As such, a backend server is required for any practical applications. For the development of a realistic use case of the BioJS BAMViewer. My first open source software is a binder for samtools in ruby (https://github.com/helios/bioruby-samtools). As I know pretty well the library I thought I could use it for my backend. Originally, I intended to use Rails, but at the end I settled up to Sintra(http://www.sinatrarb.com) as the web server.
As I intend to use this service to make a BAM viewer, I implemented the following functions:, I query the list of references (so, you can list all the chromosomes without having them in a database),
- alignment: Gets the alignments in a region
- wig: Converts the BAM file to wig, I will use it to display the coverage across the region
- reference: Returns the string of the region. This way, you don’t need to load a full chromosome.
- list: gets all the reference chromosome/contigs.
Each service has a function to dispatch its content, but since creating the object containing the sam file is the same across them, I have a function to wrap this behaviour
def get_bam(bam,reference) return @bam_files[bam] if @bam_files[bam] bam_path = "#{self.settings.folder.to_s}/#{bam}.bam" reference_path = "#{self.settings.reference.to_s}/#{reference}" return nil unless File.file?(bam_path) @bam_files[bam] = Bio::DB::Sam.new( :fasta => reference_path, :bam => bam_path ) return @bam_files[bam] end
Then, each query has a method to dispatch it, the alignment function is:
get '/*/*/alignment' do |ref, bam| region = params[:region] reg = Bio::DB::Fasta::Region.parse_region(region) stream do |out| get_bam(bam, ref).fetch(reg.entry, reg.start, reg.end) do |sam| out << "#{sam.sam_string}\n" end end end
For the wig output, I didn’t have in the library a function to
convert from the SAM file, so I monkey patched the Region class, and
with that I can generate wig files with arbitrary bin sizes.
class Bio::DB::Fasta::Region def to_wig(opts={}) step_size = opts[:step_size] ? opts[:step_size] : 1 out = StringIO.new out << "fixedStep chrom=#{self.entry} span=#{step_size}\n" return out.string if self.pileup.size == 0 current_start = self.pileup[0].pos current_acc = 0.0 current_end = current_start + step_size self.pileup.each do |pile| if pile.pos >= current_end out << current_start << " " << (current_acc/step_size).to_s << "\n" current_start = pile.pos current_acc = 0.0 current_end = current_start + step_size end current_acc += pile.coverage end out.string end end
The source code can be found in https://github.com/homonecloco/bioruby-samtools-server
No hay comentarios:
Publicar un comentario