BAM files are usually used to store short-read alignments, and they are typically several gigabytes in size. As such, a backend server is required for any practical application, and I needed one to develop a realistic use case for the BioJS BAMViewer. My first open source software was a Ruby binding for samtools (https://github.com/helios/bioruby-samtools). As I know the library pretty well, I thought I could use it for my backend. Originally I intended to use Rails, but in the end I settled on Sinatra (http://www.sinatrarb.com) as the web framework.
As I intend to use this service to build a BAM viewer, I implemented the following functions:
- alignment: Gets the alignments in a region.
- wig: Converts the BAM file to wig. I will use it to display the coverage across the region.
- reference: Returns the sequence string of the region. This way, you don’t need to load a full chromosome.
- list: Gets all the reference chromosomes/contigs, so you can list all the chromosomes without having them in a database.
Each service has a function to dispatch its content, but since creating the object that wraps the BAM file is the same across them, I have a function to wrap this behaviour:
# Returns a cached Bio::DB::Sam for the given BAM name, opening it on first use.
def get_bam(bam, reference)
  return @bam_files[bam] if @bam_files[bam]
  bam_path = "#{self.settings.folder.to_s}/#{bam}.bam"
  reference_path = "#{self.settings.reference.to_s}/#{reference}"
  # Unknown BAM name: nothing to open.
  return nil unless File.file?(bam_path)
  @bam_files[bam] = Bio::DB::Sam.new(
    :fasta => reference_path,
    :bam => bam_path
  )
  return @bam_files[bam]
end
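The caching in get_bam is plain memoization keyed by the BAM name. Here is a minimal sketch of the same pattern that runs without bio-samtools. `HandleCache` is a hypothetical name used only for illustration, and the block passed to `fetch` stands in for the call to `Bio::DB::Sam.new`.

```ruby
require 'tmpdir'

# Memoizing cache for expensive-to-open handles.
# Hypothetical helper: not part of bio-samtools or of the server.
class HandleCache
  def initialize(folder)
    @folder = folder
    @handles = {}
  end

  # Returns the cached handle for +name+, opening it with the given block
  # the first time. Returns nil (and caches nothing) when the file is missing.
  def fetch(name)
    return @handles[name] if @handles.key?(name)

    path = File.join(@folder, "#{name}.bam")
    return nil unless File.file?(path)

    @handles[name] = yield(path)
  end
end

# Usage: the block runs only on the first request for a given name.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'sample.bam'), '')
  cache = HandleCache.new(dir)
  opened = 0
  2.times { cache.fetch('sample') { |path| opened += 1; path } }
  puts "opened #{opened} time(s)"
end
```

Keying only on the BAM name, as get_bam does, assumes that a given BAM is always requested with the same reference.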
Then, each query has a route to dispatch it. The alignment function is:
get '/*/*/alignment' do |ref, bam|
  region = params[:region]
  reg = Bio::DB::Fasta::Region.parse_region(region)
  # Stream one SAM record per line as the alignments are fetched.
  stream do |out|
    get_bam(bam, ref).fetch(reg.entry, reg.start, reg.end) do |sam|
      out << "#{sam.sam_string}\n"
    end
  end
end
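The route relies on the `region` parameter being split into an entry, a start and an end. As a rough illustration of that step, here is a stand-alone sketch that assumes the samtools-style `entry:start-end` notation. `SimpleRegion` and `parse_simple_region` are hypothetical names, and this is not the implementation of `Bio::DB::Fasta::Region.parse_region`.

```ruby
# Stand-in for the region object; the real class is Bio::DB::Fasta::Region.
SimpleRegion = Struct.new(:entry, :start, :end)

# Parses "entry:start-end" (for example "chr1:100-200").
# Raises ArgumentError on malformed input.
def parse_simple_region(text)
  match = /\A(?<entry>.+):(?<start>\d+)-(?<end>\d+)\z/.match(text.to_s)
  raise ArgumentError, "invalid region: #{text.inspect}" unless match

  SimpleRegion.new(match[:entry], match[:start].to_i, match[:end].to_i)
end

region = parse_simple_region('chr1:100-200')
puts "#{region.entry} #{region.start} #{region.end}"
```

Rejecting malformed input early like this would let a route answer with a client error instead of failing halfway through a streamed response.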
For the wig output, the library had no function to convert from the SAM file, so I monkey patched the Region class. With that, I can generate wig files with arbitrary bin sizes.
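class Bio::DB::Fasta::Region
  # Returns the region's coverage as wig text, averaged over bins of
  # opts[:step_size] bases (default 1).
  def to_wig(opts={})
    step_size = opts[:step_size] || 1
    out = StringIO.new
    # "position value" lines are the variableStep flavour of wig;
    # fixedStep would need start= and step= and one value per line.
    out << "variableStep chrom=#{self.entry} span=#{step_size}\n"
    return out.string if self.pileup.size == 0
    current_start = self.pileup[0].pos
    current_acc = 0.0
    current_end = current_start + step_size
    self.pileup.each do |pile|
      if pile.pos >= current_end
        out << current_start << " " << (current_acc/step_size).to_s << "\n"
        current_start = pile.pos
        current_acc = 0.0
        current_end = current_start + step_size
      end
      current_acc += pile.coverage
    end
    # Write the last bin, which the loop leaves pending.
    out << current_start << " " << (current_acc/step_size).to_s << "\n"
    out.string
  end
end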
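To see what the binning does, here is a small worked example that runs without bio-samtools. It assumes the pileup can be reduced to `[position, coverage]` pairs sorted by position. `bin_coverage` is a hypothetical helper written for this illustration, and unlike a loop that only writes when a new bin starts, it also emits the final bin.

```ruby
# Averages per-position coverage into bins of step_size bases.
# +pileup+ is an array of [position, coverage] pairs sorted by position
# (a stand-in for the pileup objects returned by bio-samtools).
# Returns an array of [bin_start, mean_coverage] pairs.
def bin_coverage(pileup, step_size = 1)
  bins = []
  return bins if pileup.empty?

  current_start = pileup.first[0]
  current_acc = 0.0
  pileup.each do |pos, coverage|
    if pos >= current_start + step_size
      # The position falls outside the open bin: close it and start a new one.
      bins << [current_start, current_acc / step_size]
      current_start = pos
      current_acc = 0.0
    end
    current_acc += coverage
  end
  bins << [current_start, current_acc / step_size] # flush the last bin
  bins
end

pileup = [[100, 2], [101, 4], [102, 6], [103, 8], [110, 10]]
bin_coverage(pileup, 2).each { |start, mean| puts "#{start} #{mean}" }
# prints:
# 100 3.0
# 102 7.0
# 110 5.0
```

Note that the sum is always divided by the bin size, so a bin with uncovered positions (like the one starting at 110) reports a lower mean. A new bin starts at the next covered position, so bins are not aligned to a fixed grid.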
The source code can be found at https://github.com/homonecloco/bioruby-samtools-server
