Add (native) VECTOR data type support #158

@erics465

Recent CockroachDB versions (I think 25+) added the VECTOR data type, which stores fixed-length arrays of floating-point numbers representing points in a multi-dimensional space. It is often used for vector search in LLM applications, e.g. RAG (retrieval-augmented generation). See the docs for more info.

Support for this data type in sequelize-cockroachdb would make it much easier to build such applications with CockroachDB, Sequelize and Node.js.

sequelize-cockroachdb currently has no VECTOR type, and ARRAY cannot stand in for it because the SQL syntax for ARRAY and VECTOR values differs. The workaround I'm using now (documented here for anyone else who hits this problem) is to declare the column as STRING in the Sequelize schema while creating it as VECTOR in the actual database, and then to convert the array of floats manually to the format CockroachDB expects using JSON.stringify().

// Schema file
...
  embedding: {
    type: Sequelize.DataTypes.STRING, // Should be VECTOR when supported
    allowNull: true
  },
...
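One possible refinement of this workaround is to move the conversion into the attribute definition, using Sequelize's attribute getters and setters (`getDataValue` / `setDataValue`). The sketch below is untested against CockroachDB. The function name `embeddingAttribute` is hypothetical, and the getter assumes the driver returns the VECTOR value as text such as `[1,2,3]`.

```javascript
// Hypothetical helper: builds the attribute definition for the workaround column.
// `DataTypes` is expected to be Sequelize.DataTypes.
function embeddingAttribute(DataTypes) {
  return {
    type: DataTypes.STRING, // Should be VECTOR when supported
    allowNull: true,
    // Assumption: the stored value comes back as text like "[1,2,3]".
    get() {
      const raw = this.getDataValue("embedding");
      return raw == null ? raw : JSON.parse(raw);
    },
    // Serialize arrays of floats to the text format on assignment.
    set(value) {
      this.setDataValue(
        "embedding",
        value == null ? value : JSON.stringify(value)
      );
    },
  };
}
```

With this in place the schema entry would become `embedding: embeddingAttribute(Sequelize.DataTypes)`. Whether the setter runs for every write path (for example bulk updates) is something I have not verified.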

Inserting/updating data in this column can be done like this:

// Update data in VECTOR column
let embedding = [-0.03704405203461647,  0.024347146973013878,  -0.007968421094119549, ...];
await model.update({ embedding: JSON.stringify(embedding) });
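One caveat with plain `JSON.stringify()`: non-finite numbers are serialized as `null` (for example `JSON.stringify([NaN])` yields `"[null]"`), which is not a valid vector. A minimal sketch of a stricter conversion follows. The helper name `toVectorLiteral` and the optional `dimensions` check are my own additions, not part of any library.

```javascript
// Hypothetical helper: converts an array of floats to the text form used above,
// rejecting values that JSON.stringify would silently turn into null.
function toVectorLiteral(values, dimensions) {
  if (!Array.isArray(values) || values.length === 0) {
    throw new TypeError("embedding must be a non-empty array");
  }
  if (dimensions !== undefined && values.length !== dimensions) {
    throw new RangeError(
      `expected ${dimensions} dimensions, got ${values.length}`
    );
  }
  for (const v of values) {
    if (typeof v !== "number" || !Number.isFinite(v)) {
      throw new TypeError(`embedding contains an invalid value: ${String(v)}`);
    }
  }
  return JSON.stringify(values);
}
```

The update call then becomes `model.update({ embedding: toVectorLiteral(embedding) })`.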

Querying works with literals:

let queryEmbedding = [-0.03704405203461647,  0.024347146973013878,  -0.007968421094119549, ...];
let results = await model.findAll({
  where: Sequelize.literal(`embedding <=> '${JSON.stringify(queryEmbedding)}'::vector < 0.5`)
});
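Because the literal is built by string interpolation, it is worth making sure that only numbers end up inside it. The following sketch wraps the filter in a small function that validates its inputs first. The name `vectorDistanceFilter` is hypothetical, and the sketch assumes a plain, unquoted column identifier. As far as I know, `<=>` is the cosine distance operator in pgvector-compatible syntax.

```javascript
// Hypothetical helper: returns the SQL fragment used in the query above,
// after checking that nothing but a plain identifier and finite numbers
// is interpolated into it.
function vectorDistanceFilter(column, embedding, maxDistance) {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(column)) {
    throw new TypeError(`invalid column name: ${String(column)}`);
  }
  const isFiniteNumber = (v) => typeof v === "number" && Number.isFinite(v);
  if (
    !Array.isArray(embedding) ||
    embedding.length === 0 ||
    !embedding.every(isFiniteNumber)
  ) {
    throw new TypeError("embedding must be a non-empty array of finite numbers");
  }
  if (!isFiniteNumber(maxDistance)) {
    throw new TypeError("maxDistance must be a finite number");
  }
  return `${column} <=> '${JSON.stringify(embedding)}'::vector < ${maxDistance}`;
}
```

It would be used as `where: Sequelize.literal(vectorDistanceFilter("embedding", queryEmbedding, 0.5))`.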
