Skip to content

Conversation

@akshaymankar
Copy link
Member

@akshaymankar akshaymankar commented Aug 11, 2021

The API will only support getting metadata of a given list of ids.
https://wearezeta.atlassian.net/wiki/spaces/CORE/pages/477660475/Federated+Conversations+Pagination

Checklist

  • The PR Title explains the impact of the change.
  • The PR description provides context as to why the change should occur and what the code contributes to that effect. This could also be a link to a JIRA ticket or a Github issue, if there is one.
  • If end-points have been added or changed: the endpoint / config-flag checklist (see Wire-employee only backend wiki page) has been followed.
  • If a schema migration has been added, I ran make git-add-cassandra-schema to update the cassandra schema documentation.
  • Section Unreleased of CHANGELOG.md contains the following bits of information:
    • A line with the title and number of the PR in one or more suitable sub-sections.
    • If /a: measures to be taken by instance operators.
    • If /a: list of cassandra migrations.
    • If public end-points have been changed or added: does nginz need upgrade?
    • If internal end-points have been added or changed: which services have to be deployed in a specific order?

@akshaymankar akshaymankar changed the title Allow only requests with conversation ids in POST /list-conversations Introduce new endpoint to deal allow fetching conversation metadata by qualified ids Aug 11, 2021
@akshaymankar akshaymankar force-pushed the akshaymankar/list-convs-only-by-id branch from 241062f to d81418b Compare August 17, 2021 14:22
@akshaymankar akshaymankar force-pushed the akshaymankar/list-convs-only-by-id branch from 5963d5e to cc7d45a Compare August 19, 2021 09:55
@akshaymankar akshaymankar changed the title Introduce new endpoint to deal allow fetching conversation metadata by qualified ids Introduce new endpoint to allow fetching conversation metadata by qualified ids Aug 19, 2021
@akshaymankar akshaymankar marked this pull request as ready for review August 19, 2021 10:37
@akshaymankar akshaymankar requested a review from jschaul August 19, 2021 14:53
findRemoteConv :: Remote ConvId -> m (Maybe (Identity UserId))
findRemoteConv remoteConvId = do
let (Qualified conv domain) = unTagged remoteConvId
query1 Cql.selectRemoteConvMembership (params Quorum (usr, domain, conv))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably fine as-is in terms of functionality, though in this particular case it is not be necessary to jump through parallelization hoops, and performance wise it's probably better (we should really measure these things though...) to use an IN query here.. The existing selectUserConvsIn = "select conv from user where user = ? and conv in ? order by conv" query should be fine to be used with a list of conversation IDs, since the IN part of the query concerns only the clustering key, not the partition key of the user table. I.e. all the conversation IDs for a given user are stored on the same 3 cassandra nodes so you can do one query instead of 1000 queries and performance wise it's fine to do one query here, as there is not the one cassandra node contacts them all effect here.

The resulting list would then just need to be diffed with the original list to get the lists of found and not found.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much better, thanks!

Copy link
Member Author

@akshaymankar akshaymankar Aug 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I misunderstood this and applied the change to both tables. I will revert it for the remote table.
Actually, user id is still the partition key in the remote user table too, so maybe what I did is ok. Can you please confirm?

-- | Tests getting many converations given their ids.
--
-- In this test, Alice is a local user, who will be asking for metadata of these
-- conversations:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice list of cases 🏆

@akshaymankar akshaymankar force-pushed the akshaymankar/list-convs-only-by-id branch from 90e9e9b to 4ebd9e2 Compare August 30, 2021 10:18
@akshaymankar akshaymankar requested a review from jschaul August 30, 2021 11:30
Copy link
Member

@jschaul jschaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good now (with respect to cassandra queries), I think, at least in theory 😄

Another thought: Would it make sense to run one manual test (probably would take too long to run each time) which creates 1000 conversations in the remote table and then gets its metadata in one request? Just to ensure there are no CQL limits to an IN [1..1000] query?

@akshaymankar
Copy link
Member Author

I wrote this test and it took more than a minute for galley to respond. Interestingly, registering a 1000 conversation, which took 1000 requests to galley, was very quick (~2s). I will check why the list request is slow.

testBulkGet1000QualifiedConvs :: TestM ()
testBulkGet1000QualifiedConvs = do
  aliceQ <- randomQualifiedUser
  bobId <- randomId
  let remoteDomainA = Domain "a.far-away.example.com"
  let alice = qUnqualified aliceQ
      bobQ = Qualified bobId remoteDomainA

  remoteConvIds <- replicateM 1000 $ randomQualifiedId remoteDomainA


  let aliceAsOtherMember = OtherMember aliceQ Nothing roleNameWireAdmin
  preRegisterTime <- liftIO getCurrentTime
  print (preRegisterTime, "start registering" :: String)
  for_ remoteConvIds $ \conv ->
    registerRemoteConv conv bobQ Nothing (Set.fromList [aliceAsOtherMember])
  postRegisterTime <- liftIO getCurrentTime
  print (postRegisterTime, "finished registering" :: String)

  let aliceAsSelfMember = Member (qUnqualified aliceQ) Nothing False Nothing Nothing False Nothing False Nothing roleNameWireAdmin
      bobAsOtherMember = OtherMember bobQ Nothing roleNameWireAdmin
      mockConversations = map (\conv -> mkConv conv bobId aliceAsSelfMember [bobAsOtherMember]) remoteConvIds
      req = ListConversationsV2 . unsafeRange $ remoteConvIds
  opts <- view tsGConf
  (respAll, receivedRequests) <-
    withTempMockFederator'
      opts
      remoteDomainA
      ( \fedReq -> do
          let success = pure . F.OutwardResponseBody . LBS.toStrict . encode
          case F.domain fedReq of
            d | d == domainText remoteDomainA -> success $ GetConversationsResponse mockConversations
            _ -> assertFailure $ "Unrecognized domain: " <> show fedReq
      ) $ do
       preListTime <- liftIO getCurrentTime
       print (preListTime, "listing convs" :: String)
       listRes <- listConvsV2 alice req
       postListTime <- liftIO getCurrentTime
       print (postListTime, "retrieved the list" :: String)
       pure listRes

  convs <- responseJsonUnsafe <$> (pure respAll <!! const 200 === statusCode)

  liftIO $ do
    let expectedFound = sortOn cnvQualifiedId mockConversations
        actualFound = sortOn cnvQualifiedId $ crFound convs
    assertEqual "found conversations" expectedFound actualFound

    -- Assumes only one request is made
    let [requestedConvIds] = sort $ fmap (FederatedGalley.gcrConvIds . fromJust . decode @FederatedGalley.GetConversationsRequest . (LBS.fromStrict . F.body) . fromJust . F.request)  receivedRequests
    assertEqual "only locally found conversations should be queried" (sort $ qUnqualified <$> remoteConvIds) requestedConvIds

    let expectedNotFound = []
        actualNotFound = sort $ crNotFound convs
    assertEqual "not founds" expectedNotFound actualNotFound
    assertEqual "failures" [] (crFailed convs)

@akshaymankar
Copy link
Member Author

With the power of lots of print statements, I could pin-point this to the part which grpcCall, so either our mock server is encoding the response too slowly or our grpc client is decoding it too slowly. Either way, this taking more than a minute is not acceptable, so I will keep digging. Another thing, I noted was this doesn't happen when we return one or two conversations.

@akshaymankar
Copy link
Member Author

I was able to figure out that problem existed in http2-client and it was already fixed by haskell-grpc-native/http2-client#75
Pulling http2-client from HEAD and http2-grpc-client from haskell-grpc-native/http2-grpc-haskell#46 makes the above test run in 2.06 seconds, most of which is spent in making 1000 calls to setup those conversations. I am convinced that this PR is fine.

I will make a separate PR with make us depend on the new http2-client and a version of http2-grpc-client which is a merge of haskell-grpc-native/http2-grpc-haskell#46 and haskell-grpc-native/http2-grpc-haskell#48

@akshaymankar akshaymankar merged commit 89b8d80 into develop Sep 1, 2021
@akshaymankar akshaymankar deleted the akshaymankar/list-convs-only-by-id branch September 1, 2021 14:22
@akshaymankar akshaymankar mentioned this pull request Sep 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants