Fix bed_index_core() handling of overlapping entries
bed_index_core() needs to skip the current entry if it ends
before the previous one does, which can happen if the input BED
file has entries that are completely contained in an earlier
region.  Failing to do this could cause last_end to jump
backwards, resulting in some incorrect entries being put in the
index.  This could lead to hits between the end of the innermost
and outermost nested BED entries being missed.

Fixes samtools#2104 ("samtools view -L " incomplete output for BED with
nested targets)
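
The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in bedidx.c: it assumes a linear index in which each window records the first region a lookup should examine, and the names toy_reg_t, toy_index, toy_overlap and N_WIN are hypothetical. Regions are given directly as inclusive window ranges, sorted by start.

```c
#include <assert.h>

#define N_WIN 16  /* toy index size, in windows (hypothetical) */

/* Hypothetical region type: inclusive window range; input is sorted by beg. */
typedef struct { int beg, end; } toy_reg_t;

/*
 * Build a toy linear index: idx[w] holds the first region a lookup in
 * window w should examine, or -1 if no region reaches that far.
 * skip_contained selects behaviour with (1) or without (0) the skip.
 */
void toy_index(const toy_reg_t *a, int n, int *idx, int skip_contained)
{
    int i, j, last_end = -1;
    for (j = 0; j < N_WIN; j++) idx[j] = -1;
    for (i = 0; i < n; i++) {
        assert(a[i].beg >= 0 && a[i].beg <= a[i].end && a[i].end < N_WIN);
        if (skip_contained && a[i].end < last_end)
            continue;          /* nested entry: windows already indexed */
        for (j = last_end + 1; j <= a[i].end; j++)
            idx[j] = i;        /* windows not yet claimed start at region i */
        last_end = a[i].end;   /* without the skip, this can move backwards */
    }
}

/* Return 1 if any region overlaps window w, scanning forward from idx[w]. */
int toy_overlap(const toy_reg_t *a, int n, const int *idx, int w)
{
    int i;
    if (w < 0 || w >= N_WIN || idx[w] < 0) return 0;
    for (i = idx[w]; i < n && a[i].beg <= w; i++)
        if (a[i].end >= w) return 1;
    return 0;
}
```

With the regions {0, 10}, {1, 1} and {13, 13}, building the toy index without the skip lets last_end fall back from 10 to 1, so the third region overwrites windows 2 to 10 and a lookup in window 5 starts at the wrong region and reports no overlap. With the skip enabled, those windows still point at the outer region and the lookup succeeds.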
daviesrob committed Aug 28, 2024
1 parent a621383 commit 8433eee
Showing 4 changed files with 48 additions and 0 deletions.
2 changes: 2 additions & 0 deletions bedidx.c
@@ -110,6 +110,8 @@ static int bed_index_core(bed_reglist_t *regions)
         hts_pos_t beg = a[i].beg >= 0 ? a[i].beg >> LIDX_SHIFT : 0;
         hts_pos_t end = a[i].end >= 0 ? a[i].end >> LIDX_SHIFT : 0;
         hts_pos_t j;
+        if (end < last_end)
+            continue; // Can happen for a containment
         if (end + 1 >= SIZE_MAX / sizeof(*idx)) { // Ensure no overflow
             errno = ENOMEM;
             free(idx);
3 changes: 3 additions & 0 deletions test/dat/nested.bed
@@ -0,0 +1,3 @@
ref2 1 436871168 reg1a
ref2 8192 8193 reg1b
ref2 541556270 541556290 reg3
34 changes: 34 additions & 0 deletions test/dat/nested.expected.sam
@@ -0,0 +1,34 @@
@HD VN:1.4 SO:coordinate
@RG ID:grp1 DS:Group 1 LB:Library 1 SM:Sample
@RG ID:grp2 DS:Group 2 LB:Library 2 SM:Sample
@RG ID:grp3 DS:Group 3 LB:Library 3 SM:Sample
@PG ID:prog1 PN:emacs CL:emacs VN:23.1.1
@CO
@CO Copyright (C) 2014 Genome Research Ltd.
@CO
@CO Permission is hereby granted, free of charge, to any person obtaining
@CO a copy of this software and associated documentation files (the
@CO "Software"), to deal in the Software without restriction, including
@CO without limitation the rights to use, copy, modify, merge, publish,
@CO distribute, sublicense, and/or sell copies of the Software, and to
@CO permit persons to whom the Software is furnished to do so, subject
@CO to the following conditions:
@CO
@CO The above copyright notice and this permission notice shall be
@CO included in all copies or substantial portions of the Software.
@CO
@CO THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
@CO EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
@CO MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
@CO IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
@CO CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
@CO TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
@CO SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@CO
@SQ SN:ref1 LN:56 M5:08c04d512d4797d9ba2a156c1daba468
@SQ SN:ref2 LN:541556283 M5:7c35feac7036c1cdef3bee0cc4b21437
ref2_grp3_p001 83 ref2 1 99 15M = 31 45 ATTCTATAGTGTCAC ~~~~~~~~~~~~~~~ RG:Z:grp3 NM:i:0 MD:Z:15
ref12_grp1_p001 145 ref2 2 50 10M ref1 36 0 TTCTATAGTG BBBBBBBBBB RG:Z:grp1 NM:i:0 MD:Z:10
ref12_grp2_p001 145 ref2 12 50 10M ref1 46 0 TCACCTAAAT BBBBBBBBBB RG:Z:grp2 NM:i:0 MD:Z:10
ref2_grp3_p002 147 ref2 436870911 99 15M = 46 45 CTAAATAGCTTGGCG }}}}}}}}}}}}}}} RG:Z:grp3 NM:i:0 MD:Z:15
ref2_grp3_p002 99 ref2 541556280 99 15M = 16 -45 CTGTTTCCTGTGTGA {{{{{{{{{{{{{{{ RG:Z:grp3 NM:i:0 MD:Z:15
9 changes: 9 additions & 0 deletions test/test.pl
@@ -2368,6 +2368,15 @@ sub test_view
         }
     }
 
+    # -L option with nested regions
+    run_view_test($opts,
+                  msg => "$test: -L with nested regions",
+                  args => ['-h', '-L', "$$opts{path}/dat/nested.bed", '--no-PG',
+                           "$$opts{path}/dat/large_chrom.sam"],
+                  out => sprintf("%s.test%03d.sam", $out, $test),
+                  compare => "$$opts{path}/dat/nested.expected.sam");
+    $test++;
+
     # -T / -t options
     my $sam_no_sq = "$$opts{tmp}/view.001.no_sq.sam";
     filter_sam($sam_no_ur, $sam_no_sq, {no_sq => 1});