Skip to content

Feature/query by row#745

Open
hongzhi-gao wants to merge 8 commits intoapache:developfrom
hongzhi-gao:feature/query-by-row
Open

Feature/query by row#745
hongzhi-gao wants to merge 8 commits intoapache:developfrom
hongzhi-gao:feature/query-by-row

Conversation

@hongzhi-gao
Copy link
Contributor

@hongzhi-gao hongzhi-gao commented Mar 17, 2026

Summary

Add queryByRow(paths/table, offset, limit) for both tree and table model. Results are equivalent to “full query then skip first offset rows and take at most limit rows,” but offset/limit are pushed down so that Chunk/Page-level skipping avoids decoding unnecessary data where possible.

Changes

Tree model

  • API: TsFileReader::queryByRow(path_list, offset, limit) / TsFileTreeReader::queryByRow(devices, measurements, offset, limit).
  • Pushdown: Single-path: set_row_range(offset, limit) on SSI → Chunk/Page skipped by count. Multi-path: offset/limit applied in merge loop; min_time_hint used to skip stale Chunks.
  • Tests: Correctness (no offset/limit, offset only, limit only, offset+limit, boundaries, multi-path merge) + QueryByRowFasterThanManualNext (timing: queryByRow faster than full query + manual next, 5% tolerance).

Table model

  • API: TsFileReader::queryByRow(table_name, column_names, offset, limit).
  • Pushdown:
    • Device: skip whole device when remaining_offset >= device_row_count (Dense (in this codebase) means: within one device, every queried column has the same number of rows and the same timestamps.).
    • SSI: when dense, set_row_range(offset, limit) on each column’s SSI → Chunk/Page skip by count.
    • TsBlock: when sparse or not fully consumed at SSI, offset/limit applied in merge loop.
  • Tests: Correctness (single/multi device, offset/limit, boundaries, equivalence with manual skip, SSI-level pushdown) + QueryByRowFasterThanManualNext (same timing check as tree).

Review focus

  • Semantics: queryByRow(offset, limit) matches “full query + skip offset + take limit” (existing equivalence tests).
  • Performance: New timing tests require queryByRow to be no slower than manual next within 5% (min of 5 runs); confirms pushdown is used in practice.

@hongzhi-gao hongzhi-gao deleted the feature/query-by-row branch March 17, 2026 09:23
@hongzhi-gao hongzhi-gao restored the feature/query-by-row branch March 17, 2026 09:24
@hongzhi-gao hongzhi-gao reopened this Mar 17, 2026
Comment on lines +507 to +510
bool ChunkReader::should_skip_page_by_time(int64_t min_time_hint) {
if (min_time_hint < 0) {
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though rare, the timestamp can be negative.
May use the minimum of int64 for this case.

Comment on lines +181 to +185
// Apply offset: skip this row.
if (remaining_offset_ > 0) {
remaining_offset_--;
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May skip setting row_record's field if offset > 0.

Comment on lines +328 to +342
} else {
if (time_iters_[index]) {
delete time_iters_[index];
time_iters_[index] = nullptr;
}
if (value_iters_[index]) {
delete value_iters_[index];
value_iters_[index] = nullptr;
}
if (tsblocks_[index]) {
ssi_vec_[index]->destroy();
tsblocks_[index] = nullptr;
}
ret = E_OK;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure status cannot be sensed in this branch. Is this safe?

for (size_t i = 0; i < lower_case_column_names.size(); ++i) {
auto ind = table_schema->find_column_index(lower_case_column_names[i]);
if (ind < 0) {
delete time_filter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is time_filter deleted here?


int TableQueryExecutor::query(const std::string& table_name,
const std::vector<std::string>& columns,
Filter* time_filter, Filter* id_filter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id_filter -> tag_filter

int ret = E_OK;
TsFileMeta* tsfile_meta = tsfile_executor_->get_tsfile_meta();
if (tsfile_meta == nullptr) {
return E_TSFILE_WRITER_META_ERR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TSFILE_WRITER ?


bool TsFileSeriesScanIterator::should_skip_chunk_by_time(
ChunkMeta* cm, int64_t min_time_hint) {
if (min_time_hint < 0 || cm->statistic_ == nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beware of negative timestamp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants