-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Fixes: #12711 Added sparse vector support in Oracle #12712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hi, can you fix the link error? after that I'll add it to gerrit to thanks. You can install pre-commit to run the linters |
f4ae7d1
to
d052522
Compare
I have fixed them. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of CaselIT to try to get revision d052522 of this pull request into gerrit so we can run tests and reviews and stuff
New Gerrit review created for change d052522: https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/6023 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Federico Caselli (CaselIT) wrote:
left a suggestion regarding an improvement on the typing of the sparse array.
Also a question considering typing etc? Wouldn't this make more sense as another type? So VECTOR could be typed as VECTOR(TypeEngine[list[Any]])
and an ipotetical different class SparseVector could be typed as SparseVector(TypeEngine[SparseValue])
?
again just a suggestion
View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/6023
|
||
from sqlalchemy import insert, select | ||
|
||
sparse_val = (10, [1, 2], array.array("d", [23.45, 221.22])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Federico Caselli (CaselIT) wrote:
we are adding python side objects to ease usage of more advantage types, like Range or BitString in postgresql. links https://docs.sqlalchemy.org/en/20/dialects/postgresql.html#sqlalchemy.dialects.postgresql.Range and https://docs.sqlalchemy.org/en/21/dialects/postgresql.html#sqlalchemy.dialects.postgresql.BitString
Maybe we could provide a class for this? Or a named tuple? Something like
class SparseValue(NamedTuple):
num_dimensions: int
indexes: Sequence[int]
values: array.array[Any] | np.ndarray # assuming this is supported by the driver
This could be used also as the return type of sparse array columns?
Threat this as a suggestion, it's likely ok as is
View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/6023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Federico Caselli (CaselIT) wrote:
we could even just reuse the value defined in the driver? https://github.com/oracle/python-oracledb/blob/2325059859ca9b4a517b8bfae7fa3c4f5b536ab2/src/oracledb/sparse_vector.py#L41
View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/6023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CaselIT Including the SparseVector object in sqlalchemy.dialects.oracle seems reasonable to me.
Also importing the value directly from oracledb also feels like a sensible approach—I’d go ahead with that change. The only consideration is that, with this setup, users will need to explicitly import oracledb
and then create the object using oracledb.SparseVector()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably alias that in the oracle dialect module.
@zzzeek would you be ok doing a try/expect at module lever to try and import that oracledb class and otherwise use a function that would error when invoked?
I know that's not how we usually do this, but it may make sense in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MMMmmmm I'm a pretty strong -1 on having SQLAlchemy users need to import symbols from the oracledb DBAPI module directly. as you noted, this is exactly the thing we've been trying to move away from in the few cases where it was happening.
I would much prefer if we made our own SparseVector object in sqlalchemy.dialects.oracle , which would likely look very similar to oracledb's. But I think having our own object presented to the user is best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a value object returned per-row? even then I'd be +1 marshall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it's a value object returned per row.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CaselIT Based on the points raised, I see a couple of possible approaches we could take. Before proceeding, I wanted to check with you on which approach you’d prefer I follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the easiest thing it likely to just make a simple class in sqlalchemy and marshal to-from that in the type, like mike suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've uploaded the latest changes based on your suggestion. Could you please review and let me know if anything needs to be updated?
Federico Caselli (CaselIT) wrote: code review left on gerrit View this in Gerrit at https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/6023 |
I feel it might be more consistent to keep Vector and SparseVector together, since SparseVector is essentially a format or variant of Vector, rather than a separate type. Treating it separately could potentially lead to redundancy or fragmentation. Would love to hear your thoughts on this approach. |
It's ok for me, it was just an idea if useful from a typing prospective. We can always add such "useful for typing" subclasses later on if we get requests for it |
4392553
to
9b542b8
Compare
Fixes: sqlalchemy#12711 Added sparse vector support in Oracle Fixes: sqlalchemy#12711 Added sparse vector support in Oracle Fixes: sqlalchemy#12711 Added sparse vector support in Oracle Fixes: sqlalchemy#12711 Added sparse vector support in Oracle Fixes: sqlalchemy#12711 Added sparse vector support in Oracle Fixes: sqlalchemy#12711 Added sparse vector support in Oracle
9b542b8
to
0be516b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is sqla-tester setting up my work on behalf of CaselIT to try to get revision 0be516b of this pull request into gerrit so we can run tests and reviews and stuff
Patchset 0be516b added to existing Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/6023 |
Description
A Sparse vector is a vector which has zero value for most of its dimensions.
Sparse vectors are supported in Oracle Database 23.7 and later.
This PR extends vector support by adding functionality for sparse vectors in Oracle
Fixes #12711
Checklist
This pull request is:
must include a complete example of the issue. one line code fixes without an
issue and demonstration will not be accepted.
Fixes: #<issue number>
in the commit messageinclude a complete example of how the feature would look.
Fixes: #<issue number>
in the commit messageHave a nice day!