Skip to content

Conversation

@yonicswe
Copy link

hey matan.
please check these patches

Leon Romanovsky and others added 23 commits January 31, 2017 14:26
The current code creates an idr per type. Since types are currently
common for all drivers and known in advance, this was good enough.
However, the proposed ioctl based infrastructure allows each driver
to declare only some of the common types and declare its own specific
types.

Thus, we decided to implement idr to be per uverbs_file.

issue: 776663
Change-Id: I5c267a875b52af5fb1a8a95ac2b7f829cd26f6dd
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
The new ioctl infrastructure supports driver specific objects.
Each such object type has a hot unplug function, allocation size and
an order of destruction.

When a ucontext is created, a new list is created in this ib_ucontext.
This list contains all objects created under this ib_ucontext.
When a ib_ucontext is destroyed, we traverse this list several time
destroying the various objects by the order mentioned in the object
type description. If few object types have the same destruction order,
they are destroyed in an order opposite to their creation.

Adding an object is done in two parts.
First, an object is allocated and added to idr tree. Then, the
command's handlers (in downstream patches) could work on this object
and fill in its required details.
After a successful command, the commit part is called and the user
objects become ucontext visible. If the handler failed, alloc_abort
should be called.

Removing an uboject is done by calling lookup_get with the write flag
and finalizing it with destroy_commit. A major change from the previous
code is that we actually destroy the kernel object itself in
destroy_commit (rather than just the uobject).

We should make sure idr (per-uverbs-file) and list (per-ucontext) could
be accessed concurrently without corrupting them.

issue: 776663
Change-Id: I264f705951d646a47629ff22f7ec58e60e59e2c4
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
This patch adds the standard idr based types. These types are
used in downstream patches in order to initialize, destroy and
lookup IB standard objects which are based on idr objects.

An idr object requires filling out several parameters. Its op pointer
should point to uverbs_idr_ops and its size should be at least the
size of ib_uobject. We add a macro to make the type declaration easier.

Change-Id: I63ef0534833c1391d87d333120d791ff7493ffc7
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
This changes only the handlers which deals with idr based objects to
use the new idr allocation, fetching and destruction schema.
This patch consists of the following changes:
(1) Allocation, fetching and destruction is done via idr ops.
(2) Context initializing and release is done through
    uverbs_initialize_ucontext and uverbs_cleanup_ucontext.
(3) Ditching the live flag. Mostly, this is pretty straight
    forward. The only place that is a bit trickier is in
    ib_uverbs_open_qp. Commit [1] added code to check whether
    the uobject is already live and initialized. This mostly
    happens because of a race between open_qp and events.
    We delayed assigning the uobject's pointer in order to
    eliminate this race without using the live variable.

[1] commit a040f95
	("IB/core: Fix XRC race condition in ib_uverbs_open_qp")

Change-Id: I847e3e0a48c05c257ab54fc448208cc14a21c295
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
When two handlers used the same object in the old schema, we blocked
the process in the kernel. The new schema just returns -EBUSY. This
could lead to different behaviour in applications between the old
schema and the new schema. In most cases, using such handlers
concurrently could lead to crashing the process. For example, if
thread A destroys a QP and thread B modifies it, we could have the
destruction happens before the modification. In this case, we are
accessing freed memory which could lead to crashing the process.
This is true for most cases. However, attaching and detaching
a multicast address from QP concurrently is safe. Therefore, we
preserve the original behaviour by adding a lock there.

Change-Id: Id5cf94882197f51e9184999490524afdb00c8c0e
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
The completion channel we use in verbs infrastructure is FD based.
Previously, we had a separate way to manage this object. Since we
strive for a single way to manage any kind of object in this
infrastructure, we conceptually treat all objects as subclasses
of ib_uobject.

This commit adds the necessary mechanism to support FD based objects
like their IDR counterparts. FD objects release need to be synchronized
with context release. We use the cleanup_mutex on the uverbs_file for
that.

issue: 776663
Change-Id: I15754b3906e0028953ff900e4699a3e2f670c081
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
This patch adds the standard fd based type - completion_channel.
The completion_channel is now prefixed with ib_uobject, similarly
to the rest of the uobjects.
This requires a few changes:
(1) We define a new completion channel fd based object type.
(2) completion_event and async_event are now two different types.
    This means they use different fops.
(3) We release the completion_channel exactly as we release other
    idr based objects.
(4) Since ib_uobjects are already kref-ed, we only add the kref to the
    async event.

A fd object requires filling out several parameters. Its op pointer
should point to uverbs_fd_ops and its size should be at least the
size if ib_uobject. We use a macro to make the type declaration
easier.

Change-Id: I415f4aa3309ea9543c0f16a1b9425470f19f7a72
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
The ioctl infrastructure treats all user-objects in the same manner.
It gets an id from the user-space and by using the object type and
operation mentioned in the action specification, it executes this
operation. An operation is split to two stages. The first one is
carried out before executing the handler and the second one is
executed afterwards.

In order to abstract these details from the ioctl infrastructure
layer, we add uverbs_get_uobject_from_context and
uverbs_finalize_object functions which corresponds to the first
and second stages respectively.

Change-Id: I021307020edca93a40e845ffb4be42fc39ccfc21
Signed-off-by: Matan Barak <matanb@mellanox.com>
The new IOCTL based infrastructure either commits or rollbacks
all objects of the command as one transaction. In order to do
that, we introduce a notion of dealing with a collection of
objects that are related to a specific action.

This also requires adding a notion of an action and attribute.
An action contains a groups of attributes, where each group
contains several attributes.

When declaring these actions and attributes, we actually declare
their specifications. When a command is executed, we actually
allocates some space to hold auxiliary information.

issue: 776663
Change-Id: I5a96c0dc1cd97df994e7fa340ed6879055e1fd82
Signed-off-by: Matan Barak <matanb@mellanox.com>
In this ioctl interface, processing the command starts from
properties of the command and fetching the appropriate user objects
before calling the handler.

Parsing and validation is done according to a specifier declared by
the driver's code. In the driver, all supported types are declared.
These types are separated to different type groups, each could be
declared in a different place (for example, common types and driver
specific types).

For each type we list all supported actions. Similarly to types,
actions are separated to actions groups too. Each group is declared
separately. This could be used in order to add actions to an existing
type.

Each action has a specifies a handler, which could be either a
standard command or a driver specific command.
Along with the handler, a group of attributes is specified as well.
This group lists all supported attributes and is used for automatic
fetching and validation of the command, response and its related
objects.

When a group of elements is used, the high bits of the elements ids
are used in order to calculate the group index. Then, these high bits
are masked out in order to have a zero based namespace for every
group. This is mandatory for compact representation and O(1) array
access.

A group of attributes is actually an array of attributes. Each
attribute has a type (PTR_IN, PTR_OUT, IDR and FD) and a length.
Attributes could be validated through some attributes, like:
(*) Minimum size / Exact size
(*) Fops for FD
(*) Object type for IDR

If an IDR/fd attribute is specified, the kernel also states the object
type and the required access (NEW, WRITE, READ or DESTROY).
All uobject/fd management is done automatically by the infrastructure,
meaning - the infrastructure will fail concurrent commands that at
least one of them requires concurrent access (WRITE/DESTROY),
synchronize actions with device removals (dissociate context events)
and take care of reference counting (increase/decrease) for concurrent
actions invocation. The reference counts on the actual kernel objects
shall be handled by the handlers.

 types
+--------+
|        |
|        |   actions                                                                +--------+
|        |   group      action      action_spec                           +-----+   |len     |
+--------+  +------+[d]+-------+   +----------------+[d]+------------+    |attr1+-> |type    |
| type   +> |action+-> | spec  +-> +  attr_groups   +-> |common_chain+--> +-----+   |idr_type|
+--------+  +------+   |handler|   |                |   +------------+    |attr2|   |access  |
|        |  |      |   +-------+   +----------------+   |vendor chain|    +-----+   +--------+
|        |  |      |                                    +------------+
|        |  +------+
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
+--------+

[d] = distribute ids to groups using the high order bits

The right types table is also chosen by using the high bits from
uverbs_types_groups.

Once validation and object fetching (or creation) completed, we call
the handler:
int (*handler)(struct ib_device *ib_dev, struct ib_ucontext *ucontext,
               struct uverbs_attr_array *ctx, size_t num);

Where ctx is an array of uverbs_attr_array. Each element in this array
is an array of attributes which corresponds to one group of attributes.
For example, in the usually used case:

 ctx                               core
+----------------------------+     +------------+
| core: uverbs_attr_array    +---> | valid      |
+----------------------------+     | cmd_attr   |
| driver: uverbs_attr_array  |     +------------+
|----------------------------+--+  | valid      |
                                |  | cmd_attr   |
                                |  +------------+
                                |  | valid      |
                                |  | obj_attr   |
                                |  +------------+
                                |
                                |  vendor
                                |  +------------+
                                +> | valid      |
                                   | cmd_attr   |
                                   +------------+
                                   | valid      |
                                   | cmd_attr   |
                                   +------------+
                                   | valid      |
                                   | obj_attr   |
                                   +------------+

Ctx array's indices corresponds to the attributes groups order. The indices
of core and driver corresponds to the attributes name spaces of each
group. Thus, we could think of the following as one object:
1. Set of attribute specification (with their attribute IDs)
2. Attribute group which owns (1) specifications
3. A function which could handle this attributes which the handler
   could call
4. The allocation descriptor of this type uverbs_obj_type.

Upon success of a handler invocation, reference count of uobjects and
use count will be a updated automatically according to the
specification.

issue: 776663
Change-Id: I0cb47d94530169e29605b46aacdd2ea4a2d5edea
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Switch all uverbs_type_attrs_xxxx with DECLARE_UVERBS_TYPE
macros. This will be later used in order to embed the types
specific actions in the types as well.

issue: 776663
Change-Id: Id4709b05b6063de82861af5c1c8a6ef220674ca5
Signed-off-by: Matan Barak <matanb@mellanox.com>
This adds the DEVICE type. This type support creating the context all
objects are created from. Moreover, it supports executing actions
which are related to the device itself, such as QUERY_DEVICE.
The only related object to this type is a singleton (per file
instance) context.

All standard types are put in the root structure.

issue: 776663
Change-Id: Idf6a7717957374a914b3fa90761a7e9743e5019b
Signed-off-by: Matan Barak <matanb@mellanox.com>
In order to accelerate the validation and parsing process, we
calculate the number of attributes of all groups in an action
and the mandatory attributes bitmask in advance.

Change-Id: I5f8665b6d366fe1ee126350a81e4cd49d8b7304c
Signed-off-by: Matan Barak <matanb@mellanox.com>
This patch adds macros for declaring action groups, actions,
attribute groups and attributes. These definitions are later
used by downstream patches to declare some of the common types.

In addition, we add some helper inline functions to copy_{from,to}
user-space buffers and check if an attribute is valid.

issue: 776663
Change-Id: I16a094e0da1b62e6a257d87a430040100da4fd7f
Signed-off-by: Matan Barak <matanb@mellanox.com>
We add the common (core) code for init context, query device,
reg_mr, create_cq, create_qp, modify_qp create_comp_channel and
init_pd.
This includes the following parts:
* Macros for defining commands and validators
* For each command
    * type declarations
          - destruction order
          - free function
          - uverbs action group
    * actions
    * handlers
    * attributes

Drivers could use the these attributes, actions or types when they
want to alter or add a new type. They could use the uverbs handler
directly in the action (or just wrap it in the driver's custom code).

Currently we use ib_udata to pass vendor specific information to the
driver. This should probably be refactored in the future.

issue: 776663
Change-Id: Ic9df96bc92e766814d366ab25a2212be31d099c1
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Add CONFIG_INFINIBAND_EXP_USER_ACCESS that enables the ioctl
interface. This interface is experimental and is subject to change.

Change-Id: I35ecfcae1ffc3dbc82ca29726c1d9204136514e4
Signed-off-by: Matan Barak <matanb@mellanox.com>
In order to allow compatibility header, allow passing
ib_uverbs_ioctl_hdr and ib_uverbs_attr from kernel space.

issue: 776663
Change-Id: I1c1774db58027c47450024b0f3212afc2f9f2118
Signed-off-by: Matan Barak <matanb@mellanox.com>
Implement write->ioctl compatibility layer for ib_uverbs_get_context
by translating the headers to ioctl headers and invoke the ioctl
parser.

issue: 776663
Change-Id: I3473c897153965ee6a79f0eb814cb8137f2c8a00
Signed-off-by: Matan Barak <matanb@mellanox.com>
Change-Id: I543efb86c0e224e1896d33911fa5ee5a7f2b2f95
Signed-off-by: Matan Barak <matanb@mellanox.com>
Change-Id: I166c5d8e6620f903b6126695bd7dc42b97528f72
Signed-off-by: Matan Barak <matanb@mellanox.com>
rdma_user_ioctl_cmd.h will be included from libibverbs to share
common definitions of ioctl commands.
UVERBS_TYPE_QP,
UVERBS_ACCESS_READ,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
UVERBS_ATTR_PTR_IN(QUERY_QP_ATTR_MASK, u32),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it would be better to split the response to the actual fields we care of (instead of getting an attr_mask). In that way, you could just ask if a field exists, query and output this specific data.

UVERBS_ATTR_IDR(REREG_MR_HANDLE, UVERBS_TYPE_MR, UVERBS_ACCESS_READ,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
UVERBS_ATTR_IDR(REREG_MR_PD_HANDLE, UVERBS_TYPE_PD, UVERBS_ACCESS_READ,
UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to break the struct to attr fields.
Meaning, having an attribute for "change access", another one for "translation", etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants