Add early culling for lines fully left of the viewport.#1368
Add early culling for lines fully left of the viewport.#1368
Conversation
There was a problem hiding this comment.
Very cool, and I think this is roughly what it should look like. I do have a few observations.
Early culling is conditionally enabled based on whether a culling opportunity was detected in the previous strip generation pass.
I'm not sure culling_opportunity are the best semantics. A path having had a culling opportunity may not be a good predictor of culling being possible the next time generate_with_clip is called, and same if the path didn't have a culling opportunity, the next path may in actuality well have one. These semantics will of course mean that we're always going to miss the performance benefit of culling at least during the first path in a sequence of cullable paths.
Perhaps, instead we should be conditional only on whether we want culling at all -- for example, if we're caching strips, perhaps all types of culling should be disabled (including culling where we don't account for winding, i.e., above, to the right, and below the viewport). In that case it may still make sense sense to have the dispatch be const-generic over the culling, as you have now, but perhaps the difference in performance can be so minimal that we don't have to bother.
E.g., for tiling, with the current implementation it doesn't seem to matter if we always cull or not:
tile/Ghostscript_Tiger time: [184.92 µs 185.14 µs 185.41 µs]
tile/Ghostscript_Tiger-cull time: [181.96 µs 182.34 µs 182.84 µs]
That of course doesn't preclude dispatching to a strip rendering implementation that does or does not use culling, depending on whether we've culled something. That may well be important for performance.
Should also benchmark strip rendering, but that requires some bigger changes to the benchmark code...
It does seem both tiling and strip rendering are about 6% slower with these changes, but there is a little bit of noise so that has to be measured a bit more carefully (and this can perhaps be optimized more).
Currently this does not use the bitvec, and instead traverses the histogram.
We have to carefully check what the performance impact is of this where rendering a bunch of small geometry (like glyphs) which will usually just occupy a few rows.
These are good points! For more context, we made the decision to make the culling conditional on the previous So here's a thought: WDYT about culling always enabled in Bitvec to come! |
Yes! In fact, I edited my message (probably as you were typing) to say this:
|
|
@LaurenzV do you know if it's safe to early cull when the storage generation mode is not For safety, I make the assumption no, but after thinking about it, I think it's fine? Once the strips and alpha are generated (and cached), there's no need to hold onto the culled winding contributions (similarly to how tiles are also cleared regardless of the storage mode)? |
|
Otherwise, ready for review. |
|
I don't believe it should make a difference for the generation mode, as long as the strips are there it should be fine! |
|
Early culling is now always enabled. I've retained the |
|
I have not forgotten about this, but haven't had a chance for a proper look yet! |
0333485 to
eb1a069
Compare
|
PTAL! Note, the behavior is different than what I had before. Now, left edge crossing lines are culled (if possible) inside Re moving culling earlier into Re using clip instead of viewport rect: I will add that as a follow on CL. |
|
Added more testing, some benchmarking, and fixed the issue identified with hayro. |
1f4c083 to
17e6d54
Compare
|
PDF tests seem to pass now! |
LaurenzV
left a comment
There was a problem hiding this comment.
Still need to review strip.rs, but here some first comments!
| alpha_buf.len() as u32, | ||
| false, | ||
| )); | ||
| alpha_buf.extend([255_u8; Tile::HEIGHT as usize * Tile::WIDTH as usize]); |
There was a problem hiding this comment.
I'm wondering if those are necessary (same for above), but not for you to figure out. For now, that's definitely the better choice, unsure if there is code that assumes strips have at least a width of 4.
There was a problem hiding this comment.
Not sure about other downstream affects, but they are necessary for at least clipping. Without injecting in the alphas, this:
#[inline(always)]
fn cur_strip_width(&self) -> u16 {
let cur = self.cur_strip();
let next = self.next_strip();
((next.alpha_idx() - cur.alpha_idx()) / Tile::HEIGHT as u32) as u16
}
returns 0, resulting in a 0 width fill region, so downstream in:
fn overlap_relationship(&self, other: &Region<'_>) -> OverlapRelationship {
if self.end() <= other.start() {
OverlapRelationship::Advance(Advance::Left)
} else if self.start() >= other.end() {
OverlapRelationship::Advance(Advance::Right)
} else {
. . .
}
}
if you have a clip that also starts at the left edge (also possibly culled?), you get:
if self.end() <= other.start() { // 0 <= 0 (Evaluates to TRUE)
OverlapRelationship::Advance(Advance::Left)
}
Resulting in no overlap
|
PTAL! |
Apologies this took an embarrassingly long time due to a funny bug. Adds changes as discussed.
UPDATED:
Lines left of the viewport are not visible but still contribute winding, creating a left to right dependency for rendering. This patch simplifies rendering in this scenario by culling the left-of-viewport portion of lines, and recording their equivalent winding contribution inside a histogram. If a line is entirely left of the viewport, it is completely culled and produces no tiles.
Correspondingly
strip.rshas been updated to consume this histogram. Notably,strip.rsnow handles cases where no tiles are present, but culling occured, and cases where winding is "captive" either between left edge of the viewport and the first tile in the row, or between two non-consecutive rows.Removes the previous handling of left of edge lines in
strip.rs.No culling, Ghostscript_Tiger shifted 50% off screen:
With culling: