Comparing changes

6 commits
30 files changed
3 contributors

Commits on Nov 21, 2025

Change default use_hybrid_disposition to False (#714)
```
This changes the default value of use_hybrid_disposition from True to False
in the SEA backend, disabling hybrid disposition by default.
```
samikshya-db authored Nov 21, 2025

b8494ff

Commits on Nov 26, 2025

Circuit breaker changes using pybreaker (#705)

* Added driver connection params

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* Added model fields for chunk/result latency

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed linting issues

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* lint issue fixing

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* circuit breaker changes using pybreaker

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* Added interface layer on top of HTTP client to use circuit breaker

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* Added test cases to validate circuit breaker

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixing broken tests

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed linting issues

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed failing test cases

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed urllib3 issue

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* added more test cases for telemetry

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* simplified CB config

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* poetry lock

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fix minor issues & improvement

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* improved circuit breaker for handling only 429/503

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* linting issue fixed

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* raise CB only for 429/503

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fix broken test cases

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed untyped references

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* added more tests to verify the changes

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* description changed

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* removed CB config class in favor of constants

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* removed mocked response and used a new excluded exception in CB

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed broken test

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* added e2e test to verify circuit breaker

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* lower log level for telemetry

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed broken test, removed tests on log assertions

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* modified unit tests to reduce noise and follow the DRY principle

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

---------

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

nikhilsuri-db authored Nov 26, 2025
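
The commit above describes a circuit breaker that trips only on HTTP 429/503 and treats other errors as excluded. The sketch below illustrates that idea with the standard library only. It is not pybreaker and not the driver's code: `SimpleCircuitBreaker`, `RateLimitError`, `CircuitOpenError`, `send`, and the `fail_max`/`reset_timeout` values are all hypothetical names chosen for illustration.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised for HTTP 429/503 responses."""


class CircuitOpenError(Exception):
    """Hypothetical error raised when the breaker skips a call."""


class SimpleCircuitBreaker:
    """Minimal stand-in for a circuit breaker; this is not pybreaker's API."""

    def __init__(self, fail_max=3, reset_timeout=30.0, clock=time.monotonic):
        self.fail_max = fail_max
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("circuit is open, call skipped")
        try:
            result = func(*args, **kwargs)
        except RateLimitError:
            # Only rate-limit failures count toward opening the circuit.
            # A failure during a half-open trial re-opens it immediately.
            self._failures += 1
            if self._failures >= self.fail_max or self._opened_at is not None:
                self._opened_at = self._clock()
            raise
        # Any other exception propagates above without touching the counters.
        self._failures = 0
        self._opened_at = None
        return result


def send(status_code):
    """Hypothetical HTTP call: 429/503 become RateLimitError, other 4xx/5xx ValueError."""
    if status_code in (429, 503):
        raise RateLimitError(status_code)
    if status_code >= 400:
        raise ValueError(status_code)
    return status_code
```

With pybreaker itself, the equivalent behaviour would presumably come from configuring which exceptions the breaker excludes; the exact configuration used in #705 is not shown in this log.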

Commits on Nov 27, 2025

perf: Optimize telemetry latency logging to reduce overhead (#715)

perf: Optimize telemetry latency logging to reduce overhead

Optimizations implemented:
1. Eliminated extractor pattern - replaced wrapper classes with direct
   attribute access functions, removing object creation overhead
2. Added feature flag early exit - checks cached telemetry_enabled flag
   to skip heavy work when telemetry is disabled
3. Simplified code structure with early returns for better readability
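
As a rough illustration of points 1 and 2 above, the sketch below shows a latency decorator that returns early when a cached flag is off and reads fields through a plain function instead of a wrapper object. The names `log_latency`, `_get_session_id`, `FakeCursor`, and the `events` list are assumptions made for this example; the driver's actual decorator and attribute names are not shown in this log.

```python
import functools
import time


def _get_session_id(obj):
    """Direct attribute access; no extractor/wrapper object is created."""
    return getattr(obj, "session_id_hex", None)


def log_latency(func):
    """Hypothetical decorator that records latency only when telemetry is on."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Early exit: skip timing and event construction when the cached
        # flag says telemetry is disabled.
        if not getattr(self, "telemetry_enabled", False):
            return func(self, *args, **kwargs)

        start = time.perf_counter()
        try:
            return func(self, *args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            self.events.append(
                {
                    "session_id": _get_session_id(self),
                    "op": func.__name__,
                    "latency_ms": elapsed_ms,
                }
            )

    return wrapper


class FakeCursor:
    """Stand-in object for the example; not the driver's Cursor."""

    def __init__(self, telemetry_enabled):
        self.telemetry_enabled = telemetry_enabled
        self.session_id_hex = "abc123"
        self.events = []

    @log_latency
    def execute(self, query):
        return f"ran {query}"
```

When the flag is off, the wrapped call costs one attribute lookup beyond the original function.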


Signed-off-by: Samikshya Chand <samikshya.chand@databricks.com>

samikshya-db authored Nov 27, 2025

Commits on Nov 28, 2025

basic e2e test for force telemetry verification (#708)

* basic e2e test for force telemetry verification

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* Added more integration test scenarios

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* default on telemetry + logs to investigate failing test

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fixed linting issue

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* added more logs to identify server side flag evaluation

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* remove unused logs

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* fix broken test case for default enable telemetry

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* reduced test length and made code more reusable

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

* moved telemetry e2e tests to a single daily run

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

---------

Signed-off-by: Nikhil Suri <nikhil.suri@databricks.com>

nikhilsuri-db authored Nov 28, 2025

Commits on Dec 3, 2025

feat: Implement host-level telemetry batching to reduce rate limiting (#718)

* feat: Implement host-level telemetry batching to reduce rate limiting

Changes telemetry client architecture from per-session to per-host batching,
matching the JDBC driver implementation. This reduces the number of HTTP
requests to the telemetry endpoint and prevents rate limiting in test
environments.

Key changes:
- Add _TelemetryClientHolder with reference counting for shared clients
- Change TelemetryClientFactory to key clients by host_url instead of session_id
- Add getHostUrlSafely() helper for defensive null handling
- Update all callers (client.py, exc.py, latency_logger.py) to pass host_url

Before: 100 connections to same host = 100 separate TelemetryClients
After:  100 connections to same host = 1 shared TelemetryClient (refcount=100)

This fixes rate limiting issues seen in e2e tests where 300+ parallel
connections were overwhelming the telemetry endpoint with 429 errors.
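
The before/after behaviour described above can be sketched as a factory keyed by host with a reference count per shared client. This is a simplified illustration, not the driver's implementation: `SketchTelemetryClient`, `SharedClientHolder`, `HostKeyedFactory`, and the `acquire`/`release` method names are hypothetical.

```python
import threading


class SketchTelemetryClient:
    """Stand-in client for the example; only tracks whether it was closed."""

    def __init__(self, host_url):
        self.host_url = host_url
        self.closed = False

    def close(self):
        self.closed = True


class SharedClientHolder:
    """Pairs one shared client with the number of connections using it."""

    def __init__(self, client):
        self.client = client
        self.refcount = 0


class HostKeyedFactory:
    """Hypothetical factory: one client per host_url instead of per session."""

    _holders = {}
    _lock = threading.Lock()

    @classmethod
    def acquire(cls, host_url):
        with cls._lock:
            holder = cls._holders.get(host_url)
            if holder is None:
                holder = SharedClientHolder(SketchTelemetryClient(host_url))
                cls._holders[host_url] = holder
            holder.refcount += 1
            return holder.client

    @classmethod
    def release(cls, host_url):
        with cls._lock:
            holder = cls._holders.get(host_url)
            if holder is None:
                return
            holder.refcount -= 1
            # The shared client is closed only when the last user releases it.
            if holder.refcount <= 0:
                del cls._holders[host_url]
                holder.client.close()
```

Any code that iterates over the stored values has to unwrap `holder.client` before calling client methods, which is the kind of mistake the later "Access actual client from holder in flush worker" fix addresses.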

* chore: Change all telemetry logging to DEBUG level

Reduces log noise by changing all telemetry-related log statements
(info, warning, error) to debug level. Telemetry operations are
background tasks and should not clutter logs with operational messages.

Changes:
- Circuit breaker state changes: info/warning -> debug
- Telemetry send failures: error -> debug
- All telemetry operations now consistently use debug level

* chore: Fix remaining telemetry warning log to debug

Changes remaining logger.warning in telemetry_push_client.py to debug level
for consistency with other telemetry logging.

* fix: Update tests to use host_url instead of session_id_hex

- Update circuit breaker test to check logger.debug instead of logger.info
- Replace all session_id_hex test parameters with host_url
- Apply Black formatting to exc.py and telemetry_client.py

This fixes test failures caused by the signature change from session_id_hex
to host_url in the Error class and TelemetryClientFactory.

* fix: Revert session_id_hex in tests for functions that still use it

Only Error classes changed from session_id_hex to host_url.
Other classes (TelemetryClient, ResultSetDownloadHandler, etc.) still use session_id_hex.

Reverted:
- test_telemetry.py: TelemetryClient and initialize_telemetry_client
- test_downloader.py: ResultSetDownloadHandler
- test_download_manager.py: ResultFileDownloadManager

Kept as host_url:
- test_client.py: Error class instantiation

* fix: Update all Error raises and test calls to use host_url

Changes:
1. client.py: Changed all error raises from session_id_hex to host_url
   - Connection class: session_id_hex=self.get_session_id_hex() -> host_url=self.session.host
   - Cursor class: session_id_hex=self.connection.get_session_id_hex() -> host_url=self.connection.session.host

2. test_telemetry.py: Updated get_telemetry_client() and close() calls
   - get_telemetry_client(session_id) -> get_telemetry_client(host_url)
   - close(session_id) -> close(host_url=host_url)

3. test_telemetry_push_client.py: Changed logger.warning to logger.debug
   - Updated test assertion to match debug logging level

These changes complete the migration from session-level to host-level
telemetry client management.

* fix: Update thrift_backend.py to use host_url instead of session_id_hex

Changes:
1. Added self._host attribute to store server_hostname
2. Updated all error raises to use host_url=self._host
3. Changed method signatures from session_id_hex to host_url:
   - _check_response_for_error
   - _hive_schema_to_arrow_schema
   - _col_to_description
   - _hive_schema_to_description
   - _check_direct_results_for_error
4. Updated all method calls to pass self._host instead of self._session_id_hex

This completes the migration from session-level to host-level error reporting.

* Fix Black formatting by adjusting fmt directive placement

Moved the `# fmt: on` directive to the except block level instead
of inside the if statement to resolve Black parsing confusion.

* Fix telemetry feature flag tests to set mock session host

The tests were failing because they called get_telemetry_client("test")
but the mock session didn't have .host set, so the telemetry client was
registered under a different key (likely None or MagicMock). This caused
the factory to return NoopTelemetryClient instead of the expected client.

Fixed by setting mock_session_instance.host = "test" in all three tests.

* Add teardown_method to clear telemetry factory state between tests

Without this cleanup, tests were sharing telemetry clients because they
all used the same host key ("test"), causing test pollution. The first
test would create an enabled client, and subsequent tests would reuse it
even when they expected a disabled client.

* Clear feature flag context cache in teardown to fix test pollution

The FeatureFlagsContextFactory caches feature flag contexts per session,
causing tests to share the same feature flag state. This resulted in the
first test creating a context with telemetry enabled, and subsequent tests
incorrectly reusing that enabled state even when they expected disabled.
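
The test pollution described in the two fixes above comes from class-level caches that outlive a single test. A minimal sketch, assuming a hypothetical `CachedFactory` with a class-level `_clients` dict, shows how a pytest `teardown_method` that clears the cache restores isolation:

```python
class CachedFactory:
    """Hypothetical factory with a class-level cache shared by all tests."""

    _clients = {}

    @classmethod
    def get(cls, host, enabled):
        # The first caller decides the cached state for this host key.
        if host not in cls._clients:
            cls._clients[host] = {"host": host, "enabled": enabled}
        return cls._clients[host]


class TestTelemetryFlag:
    def teardown_method(self, method):
        # Runs after every test method; without it the second test would
        # receive the client cached by the first.
        CachedFactory._clients.clear()

    def test_enabled(self):
        assert CachedFactory.get("test-host", True)["enabled"] is True

    def test_disabled(self):
        assert CachedFactory.get("test-host", False)["enabled"] is False
```

pytest calls `teardown_method` automatically for class-based tests, so both tests can keep using the same host key.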

* fix: Access actual client from holder in flush worker

The flush worker was calling _flush() on _TelemetryClientHolder objects
instead of the actual TelemetryClient. Fixed by accessing holder.client
before calling _flush().

Fixes AttributeError in e2e tests: '_TelemetryClientHolder' object has
no attribute '_flush'

* Clear telemetry client cache in e2e test teardown

Added _clients.clear() to the teardown fixture to prevent telemetry
clients from persisting across e2e tests, which was causing session ID
pollution in test_concurrent_queries_sends_telemetry.

* Pass session_id parameter to telemetry export methods

With host-level telemetry batching, multiple connections share one
TelemetryClient. Each client stores session_id_hex from the first connection
that created it. This caused all subsequent connections' telemetry events
to use the wrong session ID.

Changes:
- Modified telemetry export method signatures to accept optional session_id
- Updated Connection.export_initial_telemetry_log() to pass session_id
- Updated latency_logger.py export_latency_log() to pass session_id
- Updated Error.__init__() to accept optional session_id_hex and pass it
- Updated all error raises in Connection and Cursor to pass session_id_hex
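
The problem and fix above can be illustrated with a shared client that accepts the caller's session ID per call and falls back to the ID it was created with. This is a sketch under assumptions: `SharedClientSketch`, `export_event`, `SketchError`, and the `telemetry_client` parameter are hypothetical names, not the driver's signatures.

```python
class SharedClientSketch:
    """Stand-in for a host-level client shared by several connections."""

    def __init__(self, host_url, session_id_hex):
        self.host_url = host_url
        # Session ID of whichever connection created the client first.
        self._session_id_hex = session_id_hex
        self.events = []

    def export_event(self, name, session_id=None):
        # Prefer the caller's session ID; fall back to the creator's.
        effective = session_id if session_id is not None else self._session_id_hex
        self.events.append((name, effective))


class SketchError(Exception):
    """Hypothetical error that reports itself with the caller's session ID."""

    def __init__(self, message, *, host_url=None, session_id_hex=None,
                 telemetry_client=None):
        super().__init__(message)
        self.host_url = host_url
        self.session_id_hex = session_id_hex
        if telemetry_client is not None:
            telemetry_client.export_event("error", session_id=session_id_hex)
```

The optional arguments are keyword-only here, in line with the later "Make session_id_hex keyword-only parameter in Error.__init__" change, so a call site cannot pass a host URL where a session ID is expected.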

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix Black formatting in telemetry_client.py

* Use 'test-host' instead of 'test' for mock host in telemetry tests

* Replace test-session-id with test-host in test_client.py

* Fix telemetry client lookup to use test-host in tests

* Make session_id_hex keyword-only parameter in Error.__init__

---------

Co-authored-by: Claude <noreply@anthropic.com>

samikshya-db and claude authored Dec 3, 2025

Commits on Dec 4, 2025

Prepare for a release with telemetry on by default (#717)

* Prepare for a release with telemetry on by default

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

* Make edits

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

* Update version

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

* Fix CHANGELOG formatting to match previous style

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

* Fix telemetry e2e tests for default-enabled behavior

- Update test expectations to reflect telemetry being enabled by default
- Add feature flags cache cleanup in teardown to prevent state leakage between tests
- This ensures each test runs with fresh feature flag state

* Add wait after connection close for async telemetry submission

* Remove debug logging from telemetry tests

* Mark telemetry e2e tests as serial - must not run in parallel

Root cause: Telemetry tests share host-level client across pytest-xdist workers,
causing test isolation issues with patches. Tests pass serially but fail with -n auto.

Solution: Add @pytest.mark.serial marker. CI needs to run these separately without -n auto.

* Split test execution to run serial tests separately

Telemetry e2e tests must run serially due to shared host-level
telemetry client across pytest-xdist workers. Running with -n auto
causes test isolation issues where futures aren't properly captured.

Changes:
- Run parallel tests with -m 'not serial' -n auto
- Run serial tests with -m 'serial' without parallelization
- Use --cov-append for serial tests to combine coverage
- Mark telemetry e2e tests with @pytest.mark.serial
- Update test expectations for default telemetry behavior
- Add feature flags cache cleanup in test teardown
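
One way to wire up the split described above is to register the `serial` marker in `conftest.py` and run pytest twice. The sketch below is an assumption about how that could look, not the repository's CI configuration: `build_pytest_commands` is a hypothetical helper, and the `--cov` flag is assumed because `--cov-append` only makes sense alongside it.

```python
def pytest_configure(config):
    """conftest.py hook: register the marker so pytest does not warn about it."""
    config.addinivalue_line(
        "markers",
        "serial: tests that must not run in parallel under pytest-xdist",
    )


def build_pytest_commands(test_path):
    """Return the two invocations as argument lists (hypothetical helper)."""
    # Everything not marked serial can be distributed across xdist workers.
    parallel = ["pytest", test_path, "-m", "not serial", "-n", "auto", "--cov"]
    # Serial tests run in a single process and append to the same coverage data.
    serial = ["pytest", test_path, "-m", "serial", "--cov", "--cov-append"]
    return parallel, serial
```

Tests opt in with `@pytest.mark.serial`, and the second run must follow the first so that the coverage data it appends to already exists.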

* Mark telemetry e2e tests as serial - must not run in parallel

The concurrent telemetry e2e test globally patches telemetry methods
to capture events. When run in parallel with other tests via pytest-xdist,
it captures telemetry events from other concurrent tests, causing
assertion failures (expected 60 events, got 88).

All telemetry e2e tests must run serially to avoid cross-test
interference with the shared host-level telemetry client.

---------

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

samikshya-db authored Dec 4, 2025