Skip to content
This repository was archived by the owner on Sep 12, 2018. It is now read-only.

Conversation

@ncalexan
Copy link
Member

Datomic used to expose :db.fn/retractEntity and
:db.fn/retractAttribute, but there latest Cloud offering has only
:db/retractEntity. Since that's what I am interested in using, that's
all I've implemented.

This transformation doesn't follow the existing pattern of inserting
into the temp.*_searches table. It instead populates
temp.search_results directly from datoms, and in so doing it makes
some assumptions about how the searches tables will be accessed.

It might be even more efficient to have an entirely new temporary
table just for these retractions. One advantage with the current
scheme is that indexing restrictions placed on the search results
table will apply to the datoms retracted by :db/retractEntity as well.

There are a few remaining items TODO.

TODO: ensure that we mark the schema as potentially modified when we
:db/retractEntity. It's not clear to me how to do this efficiently.

TODO: ensure that transaction watchers get the correct transacted
datom stream. I didn't try to address this yet because it appears to
me that the existing watcher implementation isn't quite correct: it
tells watchers about datoms that are potentially to be transacted but
not about datoms that actually are transacted. This, of course,
matters when watching :db/retractEntity entities being transacted.

Datomic used to expose `:db.fn/retractEntity` and
`:db.fn/retractAttribute`, but there latest Cloud offering has only
:db/retractEntity.  Since that's what I am interested in using, that's
all I've implemented.

This transformation doesn't follow the existing pattern of inserting
into the temp.*_searches table.  It instead populates
temp.search_results directly from datoms, and in so doing it makes
some assumptions about how the searches tables will be accessed.

It might be even more efficient to have an entirely new temporary
table just for these retractions.  One advantage with the current
scheme is that indexing restrictions placed on the search results
table will apply to the datoms retracted by :db/retractEntity as well.

There are a few remaining items TODO.

TODO: ensure that we mark the schema as potentially modified when we
:db/retractEntity.  It's not clear to me how to do this efficiently.

TODO: ensure that transaction watchers get the correct transacted
datom stream.  I didn't try to address this yet because it appears to
me that the existing watcher implementation isn't quite correct: it
tells watchers about datoms that are potentially to be transacted but
not about datoms that actually are transacted.  This, of course,
matters when watching :db/retractEntity entities being transacted.
@ncalexan
Copy link
Member Author

@rnewman @fluffyemily this is ready for initial review.

I have an open question about transaction watching, and don't see any documentation. Right now, it appears that a transaction watcher will be told about any datom flowing through the system, whether it is or is not transacted. That is, a duplicate datom that is dropped because it's already asserted will be provided to a transaction watcher. That seems wrong to me, but I'm not sure what the intention actually is.

This matters for :db/retractEntity because it's non-trivial work to surface the set of datoms that are actually retracted.

@ncalexan ncalexan requested review from fluffyemily and rnewman April 20, 2018 22:25
@rnewman
Copy link
Collaborator

rnewman commented Apr 23, 2018

Re tx watchers: yes, to avoid reading the transacted datoms back out of the store, we just use the input to the transactor. Naturally we might not get around reading the transacted datoms (or some aspects of them) for synced transactions, and for retractEntity we probably just want a way to tell the listener that an entity was retracted. The idea is that a watcher can always go to the store with the tx-id and interrogate it; the notification itself is as much detail as the transactor can cheaply provide.

WHERE d.e IN ({}) OR (d.value_type_tag = 0 AND d.v IN ({}))
"#, values, values);

let mut stmt = self.prepare_cached(s.as_str())?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're preparing the statement here but only using it once.

Your entity IDs are always safe to inline in the SQL, so I think it's better to do that (and thus avoid the max_vars check) unless you're reusing the prepared statement.

}).collect();

// Two copies, first for testing e, then for testing v.
// TODO: perhaps we can reference the same named parameter multiple times, making this
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can: ?1 is the first supplied parameter, ?2 the second, and so on. (Yes, 1-indexed.)

// search type is also ignored.
let s: String = format!(r#"
INSERT INTO temp.search_results
SELECT d.e, d.a, d.v, d.value_type_tag, 0, -1, ':db.cardinality/many', d.rowid, d.v
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is search_type really an ident string?! Can we please have an issue to use a TINYINT instead? That string will be duplicated for every row…

}

/// Insert datoms corresponding to :db/retractEntity entities into the search results.
fn insert_retract_entities(&self, entids: &[Entid]) -> Result<()> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to allow retracting:

  • Core vocabulary.
  • Attributes.
  • Transactions.

Could you add validation here?

assert_matches!(conn.datoms(),
"[[333 :db/ident :test/dangling]
[333 :db/valueType :db.type/ref]]");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test to make sure I can't retract a tx-id or an attribute.

},

Term::RetractEntity(KnownEntid(e)) => {
// TODO: Might update metadata?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't allow retracting vocabulary, so this might update idents… but that's a can of worms.


Term::RetractEntity(KnownEntid(e)) => {
// TODO: Might update metadata?
// TODO: Watching/counting datoms?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the observer service needs to be extended to be told about retracted entities.

self.store.insert_fts_searches(&fts_many[..], db::SearchType::Exact)?;
}

self.store.insert_retract_entities(&retract_entities[..])?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do this first, if only for conceptual clarity?

@ncalexan
Copy link
Member Author

ncalexan commented May 2, 2018

I realize that this also is not recursing to retract :db/isComponent attributes. In fact, we don't retract :db/isComponent attributes at all right now, and that's not clearly messaged.

I ended up building some substantial work on top of this (#673) and then, in the course of that work, started to doubt this implementation strategy. Fundamentally, this represents RetractEntity as an Entity (high-level) and as a Term (low-level). That's awkward for the term building apparatus, and I think that awkwardness speaks to an interface mismatch: maybe we shouldn't be exposing high-level concepts to the low-level interface? If it's better to keep Term low-level, then the implementation of RetractEntity might instead query the store to determine relevant terms as part of the transactor pipeline from Entity to Term, rather than pushing the work to a lower level. That's probably slightly less efficient, but it might simplify the expression of the transactor. I'm particularly interested in the same question for excision (#21).

In light of this uncertainty, I'm not going to try to push this across the line just now.

@rnewman
Copy link
Collaborator

rnewman commented May 2, 2018

I don't see much wrong with querying for the work to do. Also useful for notifying observers.

You might consider introducing a wildcard value for retractions to avoid pulling larger values out of the DB — I think you can do everything you need using only e, a.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants