Version: v1.10

FTP/SFTP Data Connector

FTP (File Transfer Protocol) and SFTP (SSH File Transfer Protocol) are network protocols for transferring files between a client and a server. FTP sends credentials and data unencrypted, whereas SFTP encrypts the transfer over SSH.

The FTP/SFTP Data Connector enables federated SQL queries across supported file formats stored on FTP/SFTP servers.

Quickstart

Connect to an SFTP server and query CSV files:

datasets:
  - from: sftp://remote-sftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      sftp_port: 22
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}

Configuration

from

The from field takes one of two forms, ftp://<host>/<path> or sftp://<host>/<path>, where <host> is the server to connect to and <path> is the file or directory to read from.

If a folder is provided, all child files will be loaded.
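
As a rough sketch of the two cases, the configuration below defines one dataset that points at a single file and one that points at a folder. The host files.example.com, the paths, and the dataset names are hypothetical. Omitting file_format for the single-file dataset is an assumption based on the parameter being documented as required only for directories.

```yaml
datasets:
  # Single file: only this file is read.
  # file_format is omitted on the assumption that it is only required for directories.
  - from: sftp://files.example.com/reports/2024/summary.csv
    name: summary_report
    params:
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}

  # Folder: all child files are loaded, so file_format is specified.
  - from: sftp://files.example.com/reports/2024/
    name: all_reports
    params:
      file_format: csv
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
```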

name

The dataset name, which is used as the table name in SQL queries. It cannot be a reserved keyword.

params

FTP

| Parameter Name | Description |
| --- | --- |
| file_format | Required when connecting to a directory. See File Formats. |
| ftp_user | Username for FTP authentication. |
| ftp_pass | Password for FTP authentication. Use secrets syntax: ${secrets:my_ftp_pass}. |
| ftp_port | FTP server port. Default: 21. |
| client_timeout | Connection timeout duration. E.g. 30s, 1m. No timeout when unset. |
| hive_partitioning_enabled | Enable Hive-style partitioning from folder structure. Default: false. |
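
To illustrate how these parameters combine, here is a minimal sketch of an FTP dataset that overrides the default port and sets a connection timeout. The host ftp.example.com, the path, the dataset name, and the port 2121 are hypothetical values chosen for illustration.

```yaml
datasets:
  - from: ftp://ftp.example.com/exports/
    name: ftp_exports
    params:
      file_format: csv                  # required because the path is a directory
      ftp_port: 2121                    # hypothetical non-default port; 21 is used when omitted
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_pass}
      client_timeout: 30s               # connection timeout; no timeout applies when unset
```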

SFTP

| Parameter Name | Description |
| --- | --- |
| file_format | Required when connecting to a directory. See File Formats. |
| sftp_user | Username for SFTP authentication. |
| sftp_pass | Password for SFTP authentication. Use secrets syntax: ${secrets:my_sftp_pass}. |
| sftp_port | SFTP server port. Default: 22. |
| client_timeout | Connection timeout duration. E.g. 30s, 1m. No timeout when unset. |
| hive_partitioning_enabled | Enable Hive-style partitioning from folder structure. Default: false. |
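
The hive_partitioning_enabled parameter is easiest to picture with a concrete folder layout. The sketch below assumes the conventional Hive key=value directory naming. The host, paths, and dataset name are hypothetical, and the exact column names and types that Spice derives from the folders are not specified here.

```yaml
# Hypothetical server layout using key=value folder names:
#   /events/year=2024/month=01/part-0.parquet
#   /events/year=2024/month=02/part-0.parquet
datasets:
  - from: sftp://files.example.com/events/
    name: events
    params:
      file_format: parquet
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_pass}
      # Assumption: with Hive-style partitioning, year and month become
      # queryable partition columns derived from the folder names.
      hive_partitioning_enabled: true
```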

Examples

Connecting to FTP

- from: ftp://remote-ftp-server.com/path/to/folder/
  name: my_dataset
  params:
    file_format: csv
    ftp_user: my-ftp-user
    ftp_pass: ${secrets:my_ftp_password}
    hive_partitioning_enabled: false

Connecting to SFTP

- from: sftp://remote-sftp-server.com/path/to/folder/
  name: my_dataset
  params:
    file_format: csv
    sftp_port: 22
    sftp_user: my-sftp-user
    sftp_pass: ${secrets:my_sftp_password}
    hive_partitioning_enabled: false

Secrets

Spice integrates with multiple secret stores for secure credential management. Store FTP/SFTP credentials in a secret store and reference them using the ${secrets:key} syntax.

datasets:
  - from: sftp://files.example.com/data/
    name: secure_data
    params:
      file_format: parquet
      sftp_user: ${secrets:sftp_username}
      sftp_pass: ${secrets:sftp_password}

For detailed information, refer to the secret stores documentation.

Troubleshooting

Connection Timeouts

If connections frequently time out, increase the client_timeout value:

params:
  client_timeout: 120s

Authentication Failures

Verify credentials are correctly stored in your secret store and that the user has read access to the specified path on the server.

File Format Errors

When connecting to a directory, ensure file_format is specified and matches the actual file types in the directory. Spice expects all files in a directory to have the same format.
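
One possible way to work around a directory that mixes formats, assuming the files can be reorganized on the server, is to move each format into its own subfolder and define one dataset per folder. The host, folder names, and dataset names below are hypothetical.

```yaml
datasets:
  # CSV files only
  - from: sftp://files.example.com/data/csv/
    name: data_csv
    params:
      file_format: csv
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}

  # Parquet files only
  - from: sftp://files.example.com/data/parquet/
    name: data_parquet
    params:
      file_format: parquet
      sftp_user: my-sftp-user
      sftp_pass: ${secrets:my_sftp_password}
```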

Cookbook

Refer to the FTP cookbook recipe to see an example of the FTP connector in use.