2 posts tagged with "adbc"

Arrow Database Connectivity (ADBC) related topics and discussions

Spice v2.0-rc.2 (Apr 10, 2026)

28 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v2.0-rc.2! 🔥

v2.0.0-rc.2 is the second release candidate for advanced testing of v2.0, building on v2.0.0-rc.1.

Highlights in this release candidate include:

  • Distributed Spice Cayenne Query and Write Improvements with data-local query routing and partition-aware write-through
  • DataFusion v52.4.0 Upgrade with aligned arrow-rs, datafusion-federation, and datafusion-table-providers
  • MERGE INTO for Spice Cayenne catalog tables with distributed support across executors
  • PARTITION BY Support for Cayenne enabling SQL-defined partitioning in CREATE TABLE statements
  • ADBC Data Connector & Catalog with full query federation, BigQuery support, and schema/table discovery
  • Databricks Lakehouse Federation Improvements with improved reliability, resilience, DESCRIBE TABLE fallback, and source-native type parsing
  • Delta Lake Column Mapping supporting Name and Id mapping modes
  • HTTP Pagination support for paginated API endpoints in the HTTP data connector
  • New Catalog Connectors for PostgreSQL, MySQL, MSSQL, and Snowflake
  • JSON Ingestion Improvements with single-object support, soda (Socrata Open Data) format support, json_pointer extraction, and auto-detection
  • Per-Model Rate-Limited AI UDF Execution for controlling concurrent AI function invocations
  • Dependency upgrades including Turso v0.5.3, iceberg-rust v0.9, and Vortex improvements

What's New in v2.0.0-rc.2

Distributed Cayenne Query and Write Improvements

Distributed query for Cayenne-backed tables now has better partition awareness for both reads and writes.

Key improvements:

  • Data-Local Query Routing: Cayenne catalog queries can now be routed to executors that hold the relevant partitions, improving distributed query efficiency.
  • Partition-Aware Write Through: Scheduler-side Flight DoPut ingestion now splits partitioned Cayenne writes and forwards them to the responsible executors instead of routing through a single raw-forward path.
  • Dynamic Partition Assignment: Newly observed partitions can be added and assigned atomically as data arrives, with persisted partition metadata for future routing.
  • Better Cluster Coordination: Partition management is now separated for accelerated and federated tables, improving routing behavior for distributed Cayenne catalog workloads.
  • Distributed UPDATE/DELETE DML: UPDATE and DELETE statements for Cayenne catalog tables are now forwarded to all executors in distributed mode, with all executors required to succeed.
  • Distributed runtime.task_history: Task history is now replicated across the distributed cluster for observability.
  • RefreshDataset Control Stream: Dataset refresh operations are now distributed via the control stream to executors.
  • Executor DDL Sync: When an executor connects, it receives DDL for all existing tables, ensuring late-joining executors have full table state.
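
The routing ideas in the list above (partition-aware write-through and dynamic partition assignment) can be pictured with a small sketch. This is an illustration only, not Spice's implementation: the `PartitionRouter` class, the least-loaded assignment rule, and the executor names are all hypothetical.

```python
from collections import defaultdict


class PartitionRouter:
    """Toy scheduler-side router (hypothetical): maps partition values to executors."""

    def __init__(self, executors):
        if not executors:
            raise ValueError("at least one executor is required")
        self.executors = list(executors)
        self.assignments = {}  # partition value -> executor name

    def executor_for(self, partition):
        # Dynamic assignment (assumed policy): a newly observed partition goes
        # to the executor that currently owns the fewest partitions.
        if partition not in self.assignments:
            load = {name: 0 for name in self.executors}
            for owner in self.assignments.values():
                load[owner] += 1
            self.assignments[partition] = min(self.executors, key=lambda n: load[n])
        return self.assignments[partition]

    def split_write(self, rows, key):
        # Group incoming rows by owning executor so each executor receives
        # only the partitions it is responsible for.
        batches = defaultdict(list)
        for row in rows:
            batches[self.executor_for(row[key])].append(row)
        return dict(batches)


router = PartitionRouter(["executor-a", "executor-b"])
rows = [
    {"id": 1, "region": "us"},
    {"id": 2, "region": "eu"},
    {"id": 3, "region": "us"},
]
batches = router.split_write(rows, key="region")
```

Because the assignment map is kept after the write, a later read that filters on the same partition value can be sent to the executor that already holds the data, which is the idea behind data-local query routing.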

MERGE INTO for Spice Cayenne

Spice now supports MERGE INTO statements for Cayenne catalog tables, enabling upsert-style data operations with full distributed support.

Key improvements:

  • MERGE INTO Support: Execute MERGE INTO statements against Cayenne catalog tables for combined insert/update/delete operations.
  • Distributed MERGE: MERGE operations are automatically distributed across executors in cluster mode.
  • Data Safety: Duplicate source keys are detected and prevented to avoid data loss during MERGE operations.
  • Chunked Delete Filters: Large MERGE delete filter lists are chunked to prevent stack overflow with Vortex IN-list expressions.
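
As a rough illustration of the two safeguards above, the sketch below models an upsert-style merge over an in-memory dictionary. It assumes a delete-then-insert strategy, which is one plausible way duplicate source keys could lose data; the function names and the chunk size are hypothetical and are not taken from the Spice codebase.

```python
from collections import Counter


def find_duplicate_keys(source_rows, key):
    # Return every key value that appears more than once in the source.
    counts = Counter(row[key] for row in source_rows)
    return sorted(k for k, n in counts.items() if n > 1)


def chunk_keys(keys, chunk_size):
    # Split a long key list into bounded chunks, so no single delete filter
    # (think: one IN-list) grows without limit.
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    keys = list(keys)
    return [keys[i:i + chunk_size] for i in range(0, len(keys), chunk_size)]


def merge(target, source_rows, key, chunk_size=2):
    """Upsert source_rows into target, a dict keyed by the merge key."""
    dupes = find_duplicate_keys(source_rows, key)
    if dupes:
        # With duplicates, one source row would silently overwrite another.
        raise ValueError(f"duplicate source keys: {dupes}")
    matched = [row[key] for row in source_rows if row[key] in target]
    for chunk in chunk_keys(matched, chunk_size):
        for k in chunk:  # one bounded delete per chunk
            del target[k]
    for row in source_rows:
        target[row[key]] = row
    return target


table = {1: {"id": 1, "v": "a"}, 2: {"id": 2, "v": "b"}}
merged = merge(table, [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}], key="id")
```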

PARTITION BY Support for Cayenne

SQL Partition Management: Spice now supports PARTITION BY for Cayenne-backed CREATE TABLE statements, enabling partition definitions to be expressed directly in SQL and persisted in the Cayenne catalog.

Key improvements:

  • SQL Partition Definition: Define Cayenne table partitioning directly in SQL using CREATE TABLE ... PARTITION BY (...).
  • Partition Validation: Partition expressions are parsed and validated during DDL analysis before table creation.
  • Persisted Partition Metadata: Partition metadata is stored in the Cayenne catalog and can be reloaded by the runtime after restart.
  • Distributed DDL Support: Partition metadata is forwarded when CREATE TABLE is distributed to executors in cluster mode.
  • Improved Type Support: Partition utilities now support newer string scalar variants such as Utf8View.

Example:

CREATE TABLE events (id INT, region TEXT, ts TIMESTAMP) PARTITION BY (region)

Catalog Connector Enhancements

Spice now includes additional catalog connectors for major database systems, improving schema discovery and federation workflows across external data systems.

Key improvements:

  • New Catalog Connectors: Added catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake.
  • Schema and Table Discovery: Connectors use native metadata catalogs such as information_schema / INFORMATION_SCHEMA to discover schemas and tables.
  • Improved Federation Workflows: These connectors make it easier to expose external database metadata through Spice for cross-system federation scenarios.
  • PostgreSQL Partitioned Tables: Fixed schema discovery for PostgreSQL partitioned tables.

Example PostgreSQL catalog configuration:

catalogs:
  - from: pg
    name: pg
    include:
      - 'public.*'
    params:
      pg_host: localhost
      pg_port: 5432
      pg_user: postgres
      pg_pass: ${secrets:POSTGRES_PASSWORD}
      pg_db: my_database
      pg_sslmode: disable

JSON Ingestion Improvements

JSON ingestion is now more flexible and robust.

Key improvements:

  • More JSON Formats: Added support for single-object JSON documents, auto-detected JSON formats, and Socrata SODA responses.
  • json_pointer Extraction: Extract nested payloads before schema inference and reading using RFC 6901 JSON Pointer syntax.
  • Better Auto-Detection: JSON format detection now handles arrays, objects, JSONL, and BOM-prefixed input more reliably, including single multi-line objects.
  • SODA Support: Added schema extraction and data conversion for Socrata Open Data API responses.
  • Broader Compatibility: Improved handling for BOM-prefixed files, CRLF-delimited JSONL, nested payloads, mixed structures, and wrapped documents.
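
To make the auto-detection cases above concrete, here is a minimal sketch of one way to tell arrays, single objects, and JSONL apart while tolerating a BOM and CRLF line endings. The `detect_json_format` function and its return labels are hypothetical; the actual detection logic in Spice may differ.

```python
import json


def detect_json_format(text):
    # Drop a leading byte-order mark and surrounding whitespace first.
    text = text.lstrip("\ufeff").strip()
    if not text:
        return "empty"
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        pass  # not a single JSON document; try line-delimited JSON below
    else:
        if isinstance(doc, list):
            return "array"
        if isinstance(doc, dict):
            return "object"  # includes single objects spread over many lines
        return "unknown"
    # splitlines() handles both LF and CRLF delimiters.
    lines = [line for line in text.splitlines() if line.strip()]
    try:
        for line in lines:
            json.loads(line)
    except json.JSONDecodeError:
        return "unknown"
    return "jsonl"
```

Parsing the whole input before falling back to line-by-line parsing is what lets a pretty-printed single object be classified as an object rather than as broken JSONL.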

Example using json_pointer to extract nested data from an API response:

datasets:
  - from: https://api.example.com/v1/data
    name: users
    params:
      json_pointer: /data/users
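
For readers unfamiliar with RFC 6901, the sketch below shows how a pointer such as /data/users selects a nested value. It is a standalone illustration using only Python's standard library; the `resolve_pointer` helper and the sample response are hypothetical and say nothing about how Spice evaluates the pointer internally.

```python
import json


def resolve_pointer(document, pointer):
    # An empty pointer refers to the whole document (RFC 6901, section 5).
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw in pointer[1:].split("/"):
        # Unescape in the order the RFC requires: "~1" -> "/", then "~0" -> "~".
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            # Array indices are decimal digits without leading zeros.
            valid = token.isascii() and token.isdigit()
            if not valid or (len(token) > 1 and token[0] == "0"):
                raise KeyError(f"invalid array index: {token!r}")
            current = current[int(token)]
        elif isinstance(current, dict):
            current = current[token]
        else:
            raise KeyError(f"cannot descend into a scalar at {token!r}")
    return current


response = json.loads(
    '{"data": {"users": [{"id": 1}, {"id": 2}]}, "meta": {"page": 1}}'
)
users = resolve_pointer(response, "/data/users")
```

Here `users` is the nested array of user objects, which is the part of the payload that schema inference would then see.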

DataFusion v52.4.0 Upgrade

Apache DataFusion has been upgraded from v52.2.0 to v52.4.0, with aligned updates across arrow-rs, datafusion-federation, and datafusion-table-providers.

Key improvements:

  • DataFusion v52.4.0: Brings the latest fixes and compatibility improvements across query planning and execution.
  • Strict Overflow Handling: try_cast_to now uses a strict cast that returns an error on overflow instead of silently producing NULL values.
  • Federation Fix: Fixed SQL unparsing for Inexact filter pushdown with aliases.
  • Partial Aggregation Optimization: Improved partial aggregation performance for FlightSQLExec.
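
The difference between the old and new overflow behavior can be shown with a small model. This sketch only mimics the semantics described above for a 32-bit integer target; the function names are hypothetical and it does not call DataFusion or Arrow.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def lenient_cast_to_int32(values):
    # Previous behavior (as described): out-of-range values become NULL (None).
    return [
        v if v is not None and INT32_MIN <= v <= INT32_MAX else None
        for v in values
    ]


def strict_cast_to_int32(values):
    # New behavior (as described): out-of-range values raise an error.
    out = []
    for index, v in enumerate(values):
        if v is not None and not (INT32_MIN <= v <= INT32_MAX):
            raise OverflowError(f"value {v} at index {index} does not fit in int32")
        out.append(v)
    return out
```

With the lenient version, a value of 2**31 silently turns into a NULL that is indistinguishable from a genuinely missing value; the strict version surfaces the problem at cast time.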

Dependency Upgrades

| Dependency | Version / Update |
| --- | --- |
| Turso (libsql) | v0.5.3 (from v0.4.4) |
| iceberg-rust | v0.9 |
| Vortex | Map type support, stack-safe IN-lists |
| arrow-rs | Arrow v57.2.0 |
| datafusion-federation | Updated for DataFusion v52.4.0 alignment |
| datafusion-table-providers | Updated for DataFusion v52.4.0 alignment |
| datafusion-ballista | Bumped to fix BatchCoalescer schema mismatch panic |

Other Improvements

  • Cayenne released as RC: Cayenne data accelerator is now promoted to release candidate status.

  • File Update Acceleration Mode: Added mode: file_update acceleration mode for file-based data refresh.

  • spice completions Command: New CLI command for generating shell completion scripts, with auto-detection of shell directory.

  • --endpoint Flag: Added --endpoint flag to spice run with scheme-based routing for custom endpoints.

  • mTLS Client Auth: Added mTLS client authentication support to the spice sql REPL.

  • DynamoDB DML: Implemented DML (INSERT, UPDATE, DELETE) support for the DynamoDB table provider.

  • Caching Retention: Added retention policies for cached query results.

  • GraphQL Custom Auth Headers: Added custom authorization header support for the GraphQL connector.

  • ClickHouse Date32 Support: Added Date32 type support for the ClickHouse connector.

  • AWS IAM Role Source: Added iam_role_source parameter for fine-grained AWS credential configuration.

  • S3 Metadata Columns: Metadata columns renamed to _location, _last_modified, _size for consistency, with more robust handling in projected queries.

  • S3 URL Style: Added s3_url_style parameter for S3 connector URL addressing (path-style vs virtual-hosted). Useful for S3-compatible stores like MinIO:

    params:
      s3_endpoint: https://minio.local:9000
      s3_url_style: path

  • S3 Parquet Performance: Improved S3 parquet read performance.

  • HTTP Caching: Transient HTTP error responses such as 429 and 5xx are no longer cached, preventing stale error payloads from being served from cache.

  • HTTP Connector Metadata: Added response_headers as structured map data for HTTP datasets.

  • Views on_zero_results: Accelerated views now support on_zero_results: use_source to fall back to the source when no results are found:

    views:
      - name: sales_summary
        sql: |
          SELECT region, SUM(amount) as total
          FROM sales
          GROUP BY region
        acceleration:
          enabled: true
          on_zero_results: use_source

  • Flight DoPut Ingestion Metrics: Added rows_written and bytes_written metrics for Flight DoPut / ADBC ETL ingestion.

  • EXPLAIN ANALYZE Metrics: Added metrics for EXPLAIN ANALYZE in FlightSQLExec.

  • Scheduler Executor Metrics: Added scheduler_active_executors_count metric for monitoring active executors.

  • Query Memory Limit: Updated default query memory limit from 70% to 90%, with GreedyMemoryPool for improved memory management.

  • MetastoreTransaction Support: Added transaction support to prevent concurrent metastore transaction conflicts.

  • Iceberg REST Catalog: Coerce unsupported Arrow types to Iceberg v2 equivalents in the REST catalog API.

  • CDC Cache Invalidation: Improved cache invalidation for CDC-backed datasets.

  • Spice.ai Connector Alignment: Parameter names aligned across catalog and data connectors for Spice.ai Cloud.

  • Cayenne File Size: Cayenne now correctly respects the configured target file size (defaults to 128MB).

  • Cayenne Primary Keys: Properly set primary_keys/on_conflict for Cayenne tables.

  • Turso Metastore Performance: Cached metastore connections and prepared statements for improved Turso and SQLite metastore performance.

  • Turso SQL Robustness: More robust SQL unparsing and date comparison handling for Turso.

  • Dictionary Type Normalization: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration.

  • GitHub Connector Resilience: Improved GraphQL client resilience, performance, and ref filter handling.

  • ODBC Fix: Fixed ODBC queries silently returning 0 rows on query failure.

  • Anthropic Fixes: Fixed compatibility issues with Anthropic model provider.

  • v1/responses API Fix: The /v1/responses API now correctly preserves client instructions when system_prompt is set.

  • Shared Acceleration Snapshots: Show an error when snapshots are enabled on a shared acceleration file.

  • Distributed Mode Error Handling: Improved error handling for distributed mode and state_location configuration.

  • Helm Chart: Added support for ServiceAccount annotations and AWS IRSA example.

  • Perplexity Removed: Removed Perplexity model provider support.

  • Rust v1.93.1: Upgraded Rust toolchain to v1.93.1.

Breaking Changes

  • S3 metadata columns renamed: The location, last_modified, and size columns are now named _location, _last_modified, and _size. Queries that reference the old names must be updated.
  • v1/evals API removed: The /v1/evals endpoint has been removed.
  • Perplexity removed: Perplexity model provider support has been removed.
  • Default query memory limit changed: Default query memory limit increased from 70% to 90%.

Upgrading

To upgrade to v2.0.0-rc.2, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.2

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.2 image:

docker pull spiceai/spiceai:2.0.0-rc.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.2

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • ci: fix E2E CLI upgrade test to use latest release for spiced download by @phillipleblanc in #9613
  • fix(DF): Lazily initialize BatchCoalescer in RepartitionExec to avoid schema type mismatch by @sgrebnov in #9623
  • feat: Implement catalog connectors for various databases by @lukekim in #9509
  • Refactor and clean up code across multiple crates by @lukekim in #9620
  • fix: Improve error handling for distributed mode and state_location configuration by @lukekim in #9611
  • Properly install postgres in install-postgres action by @krinart in #9629
  • fix: Use Python venv for schema validation in CI by @phillipleblanc in #9637
  • Update spicepod.schema.json by @app/github-actions in #9640
  • Update testoperator dispatch to use release/2.0 branch by @phillipleblanc in #9641
  • fix: Align CUDA asset names in Dockerfile and install tests with build output by @phillipleblanc in #9639
  • Fix expect test scripts in E2E Installation AI test by @sgrebnov in #9643
  • testoperator for partitioned arrow accelerator by @Jeadie in #9635
  • Remove default 1s refresh_check_interval from spidapter for hive datasets by @phillipleblanc in #9645
  • Fix scheduler panic and cancel race condition by @phillipleblanc in #9644
  • Align Spice.ai connector parameter names across catalog/data connectors by @lukekim in #9632
  • docs: update distribution details and add NAS support in release notes by @lukekim in #9650
  • Enable postgres-accel in CI builds for benchmarks by @sgrebnov in #9649
  • perf: Cache Turso metastore connection across operations by @penberg in #9646
  • Add 'scheduler_state_location' to spidapter by @Jeadie in #9655
  • Implement Cayenne S3 Express multi-zone live test with data validation by @lukekim in #9631
  • chore(spidapter): bump default memory limit from 8Gi to 32Gi by @phillipleblanc in #9661
  • perf: Use prepare_cached() in Turso and SQLite metastore backends by @penberg in #9662
  • Improve CDC cache invalidation by @krinart in #9651
  • Refactor Cayenne IDs to use UUIDv7 strings by @lukekim in #9667
  • fix: add liveness check for dead executors in partition routing by @Jeadie in #9657
  • fix(s3): Fix metadata column schema mismatches in projected queries by @sgrebnov in #9664
  • s3_metadata_columns tests: include test for location outside table prefix by @sgrebnov in #9676
  • docs: Update DuckDB, GCS, Git connector and Cayenne documentation by @lukekim in #9671
  • Add s3_url_style support for S3 connector URL addressing by @phillipleblanc in #9642
  • Consolidate E2E workflows and require WSL for Windows runtime by @lukekim in #9660
  • Upgrade to Rust v1.93.1 by @lukekim in #9669
  • Security fixes and improvements by @lukekim in #9666
  • feat(flight): add DoPut rows/bytes written metrics for DoPut ETL ingestion tracking by @phillipleblanc in #9663
  • Skip caching http error response + add response_headers by @krinart in #9670
  • refactor: Remove v1/evals functionality by @Jeadie in #9420
  • Make a test harness for Distributed Spice integration tests by @Jeadie in #9615
  • Enable on_zero_results: use_source for views by @krinart in #9699
  • fix(spidapter): Lower memory limit, passthrough AWS secrets, override flight URL by @peasee in #9704
  • Show an error on a shared acceleration file with snapshots enabled by @krinart in #9698
  • Fixes for anthropic by @Jeadie in #9707
  • Use max_partitions_per_executor in allocate_initial_partitions by @Jeadie in #9659
  • [SpiceDQ] Accelerations must have partition key by @Jeadie in #9711
  • Upgrade to Turso v0.5 by @lukekim in #9628
  • feat: Rename metadata columns to _location, _last_modified, _size by @phillipleblanc in #9712
  • fix: bump datafusion-ballista to fix BatchCoalescer schema mismatch panic by @phillipleblanc in #9716
  • fix: Ensure Cayenne respects target file size by @peasee in #9730
  • refactor: Make DDL preprocessing generic from Iceberg DDL processing by @peasee in #9731
  • [SpiceDQ] Distribute query of Cayenne Catalog to executors with data by @Jeadie in #9727
  • Properly set primary_keys/on_conflict for Cayenne tables by @krinart in #9739
  • Add executor resource and replica support to cloud app config by @ewgenius in #9734
  • feat: Support PARTITION BY in Cayenne Catalog table creation by @peasee in #9741
  • Update datafusion and related packages to version 52.3.0 by @lukekim in #9708
  • Route FlightSQL statement updates through QueryBuilder by @phillipleblanc in #9754
  • JSON file format improvements by @lukekim in #9743
  • [SpiceDQ] Partition Cayenne catalogs writes through to executors by @Jeadie in #9737
  • Update to DF v52.3.0 versions of datafusion & datafusion-tableproviders by @lukekim in #9756
  • Make S3 metadata column handling more robust by @sgrebnov in #9762
  • Fetch API keys from dedicated endpoint instead of apps response by @phillipleblanc in #9767
  • Update arrow-rs, datafusion-federation, and datafusion-table-providers dependencies by @phillipleblanc in #9769
  • Chunk metastore batch inserts to respect SQLite parameter limits by @phillipleblanc in #9770
  • Improve JSON SODA support by @lukekim in #9795
  • Add ADBC Data Connector by @lukekim in #9723
  • docs: Release Cayenne as RC by @peasee in #9766
  • cli[feat]: cloud mode to use region-specific endpoints by @lukekim in #9803
  • Include updated JSON formats in HTTPS connector by @lukekim in #9800
  • Flight DoPut: Partition-aware write-through forwarding by @Jeadie in #9759
  • Pass through authentication to ADBC connector by @lukekim in #9801
  • Move scheduler_state_location from adapter metadata to env var by @phillipleblanc in #9802
  • Fix Cayenne DoPut upsert returning stale data after 3+ writes by @phillipleblanc in #9806
  • Fix JSON column projection producing schema mismatch by @sgrebnov in #9811
  • Fix http connector by @krinart in #9818
  • Fix ADBC Connector build and test by @lukekim in #9813
  • Support update & delete DML for distributed cayenne catalog by @Jeadie in #9805
  • Set allow_http param when S3 endpoint uses http scheme by @phillipleblanc in #9834
  • fix: Cayenne Catalog DDL requires a connected executor in distributed mode by @Jeadie in #9838
  • fix: Add conditional put support for file:// scheduler state location by @Jeadie in #9842
  • fix: Require the DDL primary key contain the partition key by @Jeadie in #9844
  • fix: Databricks SQL Warehouse schema retrieval with INLINE disposition and async retry by @lukekim in #9846
  • Filter pushdown improvements for SqlTable by @lukekim in #9852
  • feat: add iam_role_source parameter for AWS credential configuration by @lukekim in #9854
  • Fix ODBC queries silently returning 0 rows on query failure by @lukekim in #9864
  • feat(adbc): Add ADBC catalog connector with schema/table discovery by @lukekim in #9865
  • Make Turso SQL unparsing more robust and fix date comparisons by @lukekim in #9871
  • Fix Flight/FlightSQL filter precedence and mutable query consistency by @lukekim in #9876
  • Partial Aggregation optimisation for FlightSQLExec by @lukekim in #9882
  • fix: v1/responses API preserves client instructions when system_prompt is set by @Jeadie in #9884
  • feat: emit scheduler_active_executors_count and use it in spidapter by @Jeadie in #9885
  • feat: Add custom auth header support for GraphQL connector by @krinart in #9899
  • Add --endpoint flag to spice run with scheme-based routing by @lukekim in #9903
  • When executor connects, send DDL for existing tables by @Jeadie in #9904
  • fix: Improve ADBC driver shutdown handling and error classification by @lukekim in #9905
  • fix: require all executors to succeed for distributed DML (DELETE/UPDATE) forwarding by @Jeadie in #9908
  • fix(cayenne catalog): fix catalog refresh race condition causing duplicate primary keys by @Jeadie in #9909
  • Remove Perplexity support by @Jeadie in #9910
  • Fix refresh_sql support for debezium constraints by @krinart in #9912
  • Implement DML for DynamoDBTableProvider by @lukekim in #9915
  • chore: Update iceberg-rust fork to v0.9 by @lukekim in #9917
  • Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
  • Improve Databricks error message when dataset has no columns by @sgrebnov in #9928
  • Delta Lake: fix data skipping for >= timestamp predicates by @sgrebnov in #9932
  • fix: Ensure distributed Cayenne DML inserts are forwarded to executors by @Jeadie in #9948
  • Add full query federation support for ADBC data connector by @lukekim in #9953
  • Make time_format deserialization case-insensitive by @vyershov in #9955
  • Hash ADBC join-pushdown context to prevent credential leaks in EXPLAIN plans by @lukekim in #9956
  • fix: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration by @sgrebnov in #9959
  • ADBC BigQuery: Improve BigQuery dialect date/time and interval SQL generation by @lukekim in #9967
  • Make BigQueryDialect more robust and add BigQuery TPC-H benchmark support by @lukekim in #9969
  • fix: Show proper unauthorized error instead of misleading runtime unavailable by @lukekim in #9972
  • fix: Enforce target_chunk_size as hard maximum in chunking by @lukekim in #9973
  • Add caching retention by @krinart in #9984
  • fix: improve Databricks schema error detection and messages by @lukekim in #9987
  • fix: Set default S3 region for opendal operator and fix cayenne nextest by @phillipleblanc in #9995
  • fix(PostgreSQL): fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
  • fix: Defer cache size check until after encoding for compressed results by @krinart in #10001
  • fix: Rewrite numeric BETWEEN to CAST(AS REAL) for Turso by @lukekim in #10003
  • fix: Handle integer time columns in append refresh for all accelerators by @sgrebnov in #10004
  • fix: preserve s3a:// scheme when building OpenDalStorageFactory with custom endpoint by @phillipleblanc in #10006
  • Fix ISO8601 time_format with Vortex/Cayenne append refresh by @sgrebnov in #10009
  • fix: Address data correctness bugs found in audit by @sgrebnov in #10015
  • fix(federation): fix SQL unparsing for Inexact filter pushdown with alias by @lukekim in #10017
  • Improve GitHub connector ref handling and resilience by @lukekim in #10023
  • feat: Add spice completions command for shell completion generation by @lukekim in #10024
  • fix: Fix data correctness bugs in DynamoDB decimal conversion and GraphQL pagination by @sgrebnov in #10054
  • Implement RefreshDataset for distributed control stream by @Jeadie in #10055
  • perf: Improve S3 parquet read performance by @sgrebnov in #10064
  • fix: Prevent write-through stalls and preserve PartitionTableProvider during catalog refresh by @Jeadie in #10066
  • feat: spice completions auto-detects shell directory and writes file by @lukekim in #10068
  • fix: Bug in DynamoDB, GraphQL, and ISO8601 refresh data handling by @sgrebnov in #10063
  • fix partial aggregation deduplication on string checking by @lukekim in #10078
  • fix: add MetastoreTransaction support to prevent concurrent transaction conflicts by @phillipleblanc in #10080
  • fix: Use GreedyMemoryPool, add spidapter query memory limit arg by @phillipleblanc in #10082
  • feat: Add metrics for EXPLAIN ANALYZE in FlightSQLExec by @lukekim in #10084
  • Use strict cast in try_cast_to to error on overflow instead of silent NULL by @sgrebnov in #10104
  • feat: Implement MERGE INTO for Cayenne catalog tables by @peasee in #10105
  • feat: Add distributed MERGE INTO support for Cayenne catalog tables by @peasee in #10106
  • Improve JSON format auto-detection for single multi-line objects by @lukekim in #10107
  • Add mode: file_update acceleration mode by @krinart in #10108
  • Coerce unsupported Arrow types to Iceberg v2 equivalents in REST catalog API by @peasee in #10109
  • fix: Update default query memory limit to 90% from 70% by @phillipleblanc in #10112
  • feat: Add mTLS client auth support to spice sql REPL by @lukekim in #10113
  • fix(datafusion-federation): report error on overflow instead of silent NULL by @sgrebnov in #10124
  • fix: Prevent data loss in MERGE when source has duplicate keys by @peasee in #10126
  • feat: Add ClickHouse Date32 type support by @sgrebnov in #10132
  • Add Delta Lake column mapping support (Name/Id modes) by @sgrebnov in #10134
  • fix: Restore Turso numeric BETWEEN rewrite lost in DML revert by @lukekim in #10139
  • fix: Enable arm64 Linux builds with fp16 and lld workarounds by @lukekim in #10142
  • fix: remove double trailing slash in Unity Catalog storage locations by @sgrebnov in #10147
  • fix: Improve GitHub GraphQL client resilience and performance by @lukekim in #10151
  • Enable reqwest compression and optimize HTTP client settings by @lukekim in #10154
  • fix: executor startup failures by @Jeadie in #10155
  • feat: Distributed runtime.task_history support by @Jeadie in #10156
  • fix: Preserve timestamp timezone in DDL forwarding to executors by @peasee in #10159
  • feat: Per-model rate-limited concurrent AI UDF execution by @Jeadie in #10160
  • fix(Turso): Reject subquery/outer-ref filter pushdown in Turso provider by @lukekim in #10174
  • Fix linux/macos spice upgrade by @phillipleblanc in #10194
  • Improve CREATE TABLE LIKE error messages, success output, EXPLAIN, and validation by @peasee in #10203
  • fix: chunk MERGE delete filters and update Vortex for stack-safe IN-lists by @peasee in #10207
  • Propagate runtime.params.parquet_page_index to Delta Lake connector by @sgrebnov in #10209
  • Properly mark dataset as Ready on Scheduler by @Jeadie in #10215
  • fix: handle Utf8View/LargeUtf8 in GitHub connector ref filters by @lukekim in #10217
  • fix(databricks): Fix schema introspection and timestamp overflow by @lukekim in #10226
  • fix(databricks): Fix schema introspection failures for non-Unity-Catalog environments by @lukekim in #10227
  • feat: Add pagination support to HTTP data connector by @lukekim in #10228
  • feat(databricks): DESCRIBE TABLE fallback and source-native type parsing for Lakehouse Federation by @lukekim in #10229
  • fix(databricks): harden HTTP retries, compression, and token refresh by @lukekim in #10232
  • feat[helm chart]: Add support for ServiceAccount annotations and AWS IRSA example by @peasee in #9833
  • fix: Log warning and fall back gracefully on Cayenne config change by @krinart in #9092
  • fix: Handle engine mismatch gracefully in snapshot fallback loop by @krinart in #9187

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.1...v2.0.0-rc.2

Spice v1.2.0 (Apr 28, 2025)

16 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.2.0! 🚀

Spice v1.2.0 is a significant update. It upgrades DataFusion to v45 and Arrow to v54. This release brings faster query performance, support for parameterized queries in SQL and HTTP APIs, and the ability to accelerate views. Several bugs have been fixed and dependencies updated for better stability and speed.

DataFusion v45 Highlights

Spice.ai is built on the DataFusion query engine. The v45 release brings:

  • Faster Performance 🚀: DataFusion is now the fastest single-node engine for Apache Parquet files in the ClickBench benchmark. Performance improved by over 33% from v33 to v45. Arrow StringView is now on by default, making string and binary data queries much faster, especially with Parquet files.

  • Better Quality 📋: DataFusion now runs over 5 million SQL tests per push using the SQLite sqllogictest suite. There are new checks for logical plan correctness and more thorough pre-release testing.

  • New SQL Functions ✨: Added show functions, to_local_time, regexp_count, map_extract, array_distance, array_any_value, greatest, least, and arrays_overlap.

See the DataFusion 45.0.0 release notes for details.

Spice.ai stays one release behind the latest DataFusion version so that each upgrade has had adequate testing and time to stabilize. The next upgrade, to DataFusion v46, is planned for Spice v1.3.0 in May.

What's New in v1.2.0

  • Parameterized Queries: Parameterized queries are now supported in the Flight SQL API and the HTTP API. Positional arguments use the $1 syntax and named arguments use the :param syntax. Logical plans for SQL statements are cached for faster repeated queries.

    Example Cookbook recipes:

    See the API Documentation for additional details.

  • Accelerated Views: Views, not just datasets, can now be accelerated. This provides much better performance for views that perform heavy computation.

    Example spicepod.yaml:

    views:
      - name: accelerated_view
        acceleration:
          enabled: true
          engine: duckdb
          primary_key: id
          refresh_check_interval: 1h
        sql: |
          select * from dataset_a
          union all
          select * from dataset_b

    See the Data Acceleration documentation.

  • Memory Usage Metrics & Configuration: The runtime now tracks memory usage as a metric, and a new runtime memory_limit parameter is available. The limit applies only to the runtime and should be set in addition to existing memory settings such as duckdb_memory_limit. Queries that exceed the memory limit spill to disk.

    See the Memory Reference for details.

  • New Worker Component: Workers are new configurable compute units in the Spice runtime. They help manage compute across models and tools, handle errors, and balance load. Workers are configured in the workers section of spicepod.yaml.

    Example spicepod.yaml:

    workers:
      - name: round-robin
        description: |
          Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
        models:
          - from: foo
          - from: bar
      - name: fallback
        description: |
          Tries 'bar' first, then 'foo', then 'baz' if earlier models fail.
        models:
          - from: foo
            order: 2
          - from: bar
            order: 1
          - from: baz
            order: 3

    See the Workers Documentation for details.

  • Databricks Model Provider: Databricks models can now be used with from: databricks:model_name.

    Example spicepod.yaml:

    models:
      - from: databricks:llama-3_2_1_1b_instruct
        name: llama-instruct
        params:
          databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
          databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }

    See the Databricks model documentation.

  • spice chat CLI Improvements: The spice chat command now supports an optional --temperature parameter. A one-shot chat can also be sent with spice chat <message>.

  • More Type Support: Added support for Postgres JSON type and DuckDB Dictionary type.

  • Other Improvements:

    • New image tags let you pick memory allocators for different use-cases: jemalloc, sysalloc, and mimalloc.
    • Better error handling and logging for chat and model operations.
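
The two worker strategies described in the Workers item above, round-robin and ordered fallback, can be sketched in a few lines. This is a conceptual model only: the `RoundRobinWorker` and `FallbackWorker` classes and the `flaky` callback are hypothetical and do not reflect the runtime's internals.

```python
from itertools import cycle


class RoundRobinWorker:
    """Hands out models in rotation (hypothetical model of the strategy)."""

    def __init__(self, models):
        self._models = cycle(models)

    def pick(self):
        return next(self._models)


class FallbackWorker:
    """Tries models in ascending `order` until one succeeds (hypothetical)."""

    def __init__(self, models_with_order):
        ranked = sorted(models_with_order, key=lambda item: item[1])
        self.models = [name for name, _ in ranked]

    def call(self, invoke):
        errors = []
        for name in self.models:
            try:
                return name, invoke(name)
            except Exception as exc:  # record the failure and try the next model
                errors.append((name, exc))
        raise RuntimeError(f"all models failed: {errors}")


rr = RoundRobinWorker(["foo", "bar"])
picks = [rr.pick() for _ in range(3)]

fb = FallbackWorker([("foo", 2), ("bar", 1), ("baz", 3)])


def flaky(name):
    # Simulate the first-choice model being unavailable.
    if name == "bar":
        raise RuntimeError("bar is down")
    return f"reply from {name}"


served_by, reply = fb.call(flaky)
```

In this run the fallback worker tries 'bar' first because it has the lowest order value, then moves on to 'foo' after 'bar' fails, matching the ordering in the example configuration.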

Cookbook Updates

New recipes for:

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.2.0 image:

docker pull spiceai/spiceai:1.2.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Spice is now built with Rust 1.85.0 and the Rust 2024 edition.

Changelog

- Update end_game.md (#5312) by @peasee in https://github.com/spiceai/spiceai/pull/5312
- feat: Add initial testoperator query validation (#5311) by @peasee in https://github.com/spiceai/spiceai/pull/5311
- Update Helm + Prepare for next release (#5317) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5317
- Update spicepod.schema.json (#5319) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5319
- add integration test for reading encrypted PDFs from S3 (#5308) by @kczimm in https://github.com/spiceai/spiceai/pull/5308
- Stop `load_components` during runtime shutdown (#5306) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5306
- Update openapi.json (#5321) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5321
- feat: Implement record batch data validation (#5331) by @peasee in https://github.com/spiceai/spiceai/pull/5331
- Update QA analytics for v1.1.1 (#5320) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5320
- fix: Update benchmark snapshots (#5337) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5337
- Enforce pulls with Spice v1.0.4 (#5339) by @lukekim in https://github.com/spiceai/spiceai/pull/5339
- Upgrade to DataFusion 45, Arrow 54, Rust 1.85 & Edition 2024 (#5334) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5334
- feat: Allow validating testoperator in benchmark workflow (#5342) by @peasee in https://github.com/spiceai/spiceai/pull/5342
- Upgrade `delta_kernel` to 0.9 (#5343) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5343
- deps: Update odbc-api (#5344) by @peasee in https://github.com/spiceai/spiceai/pull/5344
- Fix schema inference for Snowflake tables with large number of columns (#5348) by @ewgenius in https://github.com/spiceai/spiceai/pull/5348
- feat: Update testoperator dispatch for validation, version metric (#5349) by @peasee in https://github.com/spiceai/spiceai/pull/5349
- fix: validate_results not validate (#5352) by @peasee in https://github.com/spiceai/spiceai/pull/5352
- revert to previous pdf-extract; remove test for encrypted pdf support (#5355) by @kczimm in https://github.com/spiceai/spiceai/pull/5355
- Stablize the test `verify_similarity_search_chat_completion` (#5284) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5284
- Turn off `delta_kernel::log_segment` logging and refactor log filtering (#5367) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5367
- Upgrade to DuckDB 1.2.2 (#5375) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5375
- Update Readme - fix broken and outdated links (#5376) by @ewgenius in https://github.com/spiceai/spiceai/pull/5376
- Upgrade dependabot dependencies (#5385) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5385
- fix: Remove IMAP oauth (#5386) by @peasee in https://github.com/spiceai/spiceai/pull/5386
- Bump Helm chart to 1.1.2 (#5389) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5389
- Refactor accelerator registry as part of runtime. (#5318) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5318
- Include `vnd.spiceai.sql/nsql.v1+json` response examples (openapi docs) (#5388) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5388
- docs: Update endgame template with SpiceQA, update qa analytics (#5391) by @peasee in https://github.com/spiceai/spiceai/pull/5391
- Make graceful shutdown timeout configurable (#5358) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5358
- docs: Update release criteria with note on max columns (#5401) by @peasee in https://github.com/spiceai/spiceai/pull/5401
- Update openapi.json (#5392) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5392
- FinanceBench: update scorer instructions and switch scoring model to `gpt-4.1` (#5395) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5395
- feat: Write OTel metrics for testoperator (#5397) by @peasee in https://github.com/spiceai/spiceai/pull/5397
- Update nsql openapi title (#5403) by @ewgenius in https://github.com/spiceai/spiceai/pull/5403
- Track `ai_inferences_count` with used tools flag. Extensible runtime request context. (#5393) by @ewgenius in https://github.com/spiceai/spiceai/pull/5393
- Include newly detected view as changed view (#5408) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5408
- Track used_tools in ai_inferences_with_spice_count as number (#5409) by @ewgenius in https://github.com/spiceai/spiceai/pull/5409
- Update openapi.json (#5406) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5406
- Tweak enforce pulls with Spice (#5411) by @lukekim in https://github.com/spiceai/spiceai/pull/5411
- Allow `flightsql` and `spiceai` connectors to override flight max message size (#5407) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5407
- Retry model graded scorer once on successful, empty response (#5405) by @Jeadie in https://github.com/spiceai/spiceai/pull/5405
- use span task name in 'spice trace' tree, not span_id (#5412) by @Jeadie in https://github.com/spiceai/spiceai/pull/5412
- Rename to `track_ai_inferences_with_spice_count` in all places (#5410) by @ewgenius in https://github.com/spiceai/spiceai/pull/5410
- Update qa_analytics.csv (#5421) by @peasee in https://github.com/spiceai/spiceai/pull/5421
- Remove the filter for the `list_datasets` tool in the AI inferences metric count. (#5417) by @ewgenius in https://github.com/spiceai/spiceai/pull/5417
- fix: Testoperator uses an exact API key for benchmark metric submission (#5413) by @peasee in https://github.com/spiceai/spiceai/pull/5413
- feat: Enable testoperator metrics in workflow (#5422) by @peasee in https://github.com/spiceai/spiceai/pull/5422
- Upgrade mistral.rs (#5404) by @Jeadie in https://github.com/spiceai/spiceai/pull/5404
- Include all FinanceBench documents in benchmark tests (#5426) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5426
- Handle second Ctrl-C to force runtime termination (#5427) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5427
- Add optional `--temperature` parameter for `spice chat` CLI command (#5429) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5429
- Remove `with_runtime_status` from the `RuntimeBuilder` (#5430) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5430
- Fix spice chat error handling (#5433) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5433
- Add more test models to FinanceBench benchmark (#5431) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5431
- support 'from: databricks:model_name' (#5434) by @Jeadie in https://github.com/spiceai/spiceai/pull/5434
- Upgrade Pulls with Spice to v1.0.6 and add concurrency control (#5442) by @lukekim in https://github.com/spiceai/spiceai/pull/5442
- Upgrade DataFusion table providers (#5443) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5443
- Test spice chat in e2e_test_spice_cli (#5447) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5447
- Allow for one-shot chat request using `spice chat <message>` (#5444) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5444
- Enable parallel data sampling for NSQL (#5449) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5449
- Upgrade Go from v1.23.4 to v1.24.2 (#5462) by @lukekim in https://github.com/spiceai/spiceai/pull/5462
- Update PULL_REQUEST_TEMPLATE.md (#5465) by @lukekim in https://github.com/spiceai/spiceai/pull/5465
- Enable captured outputs by default when spiced is started by the CLI (spice run) (#5464) by @lukekim in https://github.com/spiceai/spiceai/pull/5464
- Parameterized queries via Flight SQL API (#5420) by @kczimm in https://github.com/spiceai/spiceai/pull/5420
- fix: Update benchmarks readme badge (#5466) by @peasee in https://github.com/spiceai/spiceai/pull/5466
- delay auth check for binding parameterized queries (#5475) by @kczimm in https://github.com/spiceai/spiceai/pull/5475
- Add support for `?` placeholder syntax in parameterized queries (#5463) by @kczimm in https://github.com/spiceai/spiceai/pull/5463
- enable task name override for non static span names (#5423) by @Jeadie in https://github.com/spiceai/spiceai/pull/5423
- Allow parameter queries with no parameters (#5481) by @kczimm in https://github.com/spiceai/spiceai/pull/5481
- Support unparsing UNION for distinct results (#5483) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5483
- add rust-toolchain.toml (#5485) by @kczimm in https://github.com/spiceai/spiceai/pull/5485
- Add parameterized query support to the HTTP API (#5484) by @kczimm in https://github.com/spiceai/spiceai/pull/5484
- E2E test for spice chat <message> behavior (#5451) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5451
- Renable and fix huggingface models integration tests (#5478) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5478
- Update openapi.json (#5488) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5488
- feat: Record memory usage as a metric (#5489) by @peasee in https://github.com/spiceai/spiceai/pull/5489
- fix: update dispatcher to run all benchmarks, rename metric, update spicepods, add scale factor (#5500) by @peasee in https://github.com/spiceai/spiceai/pull/5500
- Fix ILIKE filters support (#5502) by @ewgenius in https://github.com/spiceai/spiceai/pull/5502
- fix: Update test spicepod locations and names (#5505) by @peasee in https://github.com/spiceai/spiceai/pull/5505
- fix: Update benchmark snapshots (#5508) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5508
- fix: Update benchmark snapshots (#5512) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5512
- Fix Delta Lake bug for: Found unmasked nulls for non-nullable StructArray field "predicate" (#5515) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5515
- fix: working directory for duckdb e2e test spicepods (#5510) by @peasee in https://github.com/spiceai/spiceai/pull/5510
- Tweaks to README.md (#5516) by @lukekim in https://github.com/spiceai/spiceai/pull/5516
- Cache logical plans of SQL statements (#5487) by @kczimm in https://github.com/spiceai/spiceai/pull/5487
- Fix `content-type: application/json` (#5517) by @Jeadie in https://github.com/spiceai/spiceai/pull/5517
- Validate postgres results in testoperator dispatch (#5504) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5504
- fix: Update benchmark snapshots (#5511) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5511
- Fix results cache by SQL with prepared statements (#5518) by @kczimm in https://github.com/spiceai/spiceai/pull/5518
- Add initial support for views acceleration (#5509) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5509
- fix: Update benchmark snapshots (#5527) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5527
- Support switching the memory allocator Spice uses via `alloc-*` features. (#5528) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5528
- fix: Update benchmark snapshots (#5525) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5525
- Add test spicepod for tpch mysql-duckdb[file acceleration] (#5521) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5521
- Fix nightly arm build - change tag `-default` to `-models` (#5529) by @ewgenius in https://github.com/spiceai/spiceai/pull/5529
- LLM router via `worker` spicepod component (#5513) by @Jeadie in https://github.com/spiceai/spiceai/pull/5513
- Apply Spice advanced acceleration logic and params support to accelerated views (#5526) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5526
- Enable DatasetCheckpoint logic for accelerated views (#5533) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5533
- Fix public '.model' name for router workers (#5535) by @Jeadie in https://github.com/spiceai/spiceai/pull/5535
- feat: Add Runtime memory limit parameter (#5536) by @peasee in https://github.com/spiceai/spiceai/pull/5536
- For fallback worker, check first item in `chat/completion` stream. (#5537) by @Jeadie in https://github.com/spiceai/spiceai/pull/5537
- Move rate limit check to after parameterized query binding (#5540) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5540
- Update spicepod.schema.json (#5545) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5545
- Accelerate views: refresh_on_startup, ready_state, jitter params support (#5547) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5547
- Add integration test for accelerated views (#5550) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5550
- Don't install make or expect on spiceai-macos runners (#5554) by @lukekim in https://github.com/spiceai/spiceai/pull/5554
- `event_stream` crate for emitting events from tracing::Span; used in v1/chat/completions streaming. (#5474) by @Jeadie in https://github.com/spiceai/spiceai/pull/5474
- Fix typo in method (#5559) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5559
- Run test operator every day and current and previous commits (#5557) by @lukekim in https://github.com/spiceai/spiceai/pull/5557
- Add aws_allow_http parameter for delta lake connector (#5541) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5541
- feat: Add branch name to metric dimensions in testoperator (#5563) by @peasee in https://github.com/spiceai/spiceai/pull/5563
- fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/odbc[databricks].yaml (#5565) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5565
- fix: Split scheduled dispatch into a separate job (#5567) by @peasee in https://github.com/spiceai/spiceai/pull/5567
- fix: Use outputs.SPICED_COMMIT (#5568) by @peasee in https://github.com/spiceai/spiceai/pull/5568
- fix: Use refs in testoperator dispatch instead of commits (#5569) by @peasee in https://github.com/spiceai/spiceai/pull/5569
- fix: actions/checkout ref does not take a full ref (#5571) by @peasee in https://github.com/spiceai/spiceai/pull/5571
- fix: Testoperator dispatch (#5572) by @peasee in https://github.com/spiceai/spiceai/pull/5572
- Respect `update-snapshots` when running all benchmarks manually (#5577) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5577
- Use FETCH_HEAD instead of ${{ inputs.ref }} to list commits in setup_spiced (#5579) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5579
- Add additional test scenarios for benchmarks (#5582) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5582
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-duckdb[file].yaml (#5590) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5590
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/mysql-duckdb[file].yaml (#5591) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5591
- Fix Snowflake data connector rows ordering (#5599) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5599
- fix: Update benchmark snapshots (#5595) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5595
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-arrow.yaml (#5594) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5594
- fix: Update benchmark snapshots (#5589) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5589
- fix: Update benchmark snapshots (#5583) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5583
- Downgrade DuckDB to 1.1.3 (#5607) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5607
- Add prepared statement integration tests (#5544) by @kczimm in https://github.com/spiceai/spiceai/pull/5544

Full Changelog: v1.1.2...v1.2.0