Overview of Supported Data Sources
Privacera Discovery supports a wide range of data sources, including databases, cloud storage, and file systems. This document provides an overview of the supported data sources and the steps to connect them to Privacera Discovery.
Connecting Data Sources to Discovery
Follow the steps below to create a System to which the relevant data sources can be added.
1. For Self-Managed, log in to the Privacera Portal; for Data Plane, log in to the Privacera Discovery Admin Console.
2. Go to Settings > Data Source Registration.
3. Click ADD SYSTEM.
4. Enter the System name in the Name field and, optionally, a description in the Description field.
5. Click SAVE.
Data Source Connectors Supported by Privacera Discovery
Category | Data Source Connectors |
---|---|
Databases | Vertica, Databricks Unity Catalog, Databricks Spark SQL, Amazon Redshift, Apache Hive, Apache HBase, Apache Kudu, Apache Phoenix, Azure SQL Database, AWS DynamoDB, Google BigQuery, Microsoft SQL Server, Synapse, MySQL, Oracle, PostgreSQL, Snowflake, Trino, Teradata, Dremio |
Cloud Storage | Amazon S3, Azure Data Lake Storage (ADLS), Google Cloud Storage (GCS) |
File Systems | HDFS, Local Filesystem, NFS, SFTP |
Supported File Formats for Discovery Scans
Type | Formats |
---|---|
Structured Data | .avro, .avro (nested), .csv, .html, .json, .json (nested), .orc, .parquet, .parquet (nested), .sas, .tsv, .xls, .xlsx, .xml, .dat |
Compressed/Archive Data | .gzip (single or multiple files), .gz (single or multiple files), .lzo/.lzop, .jar (single or multiple files), .tar.gz (single or multiple files), .snappy.parquet, .snappy.orc, .snappy.avro, .zip (single or multiple files), .zlib.orc, .zlib.parquet, .zlib.avro |
Unstructured Data | .doc, .docx, .pdf, .txt |
Media Data | For metadata extraction only: .jpeg, .mp4, .mpeg |
Database Data | Binary-type columns are skipped; only data in non-binary columns is scanned. |
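
Before pointing a scan at a folder, it can help to check which files fall under the formats listed above. The following is a minimal sketch of such a pre-check and is not part of Privacera Discovery: the names `SUPPORTED_FORMATS` and `classify` are hypothetical, and the extension sets are simply copied from the table. It assumes matching by file name suffix is good enough, which may not reflect how Discovery itself detects formats.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extensions copied from the table above (hypothetical helper, not a Privacera API).
SUPPORTED_FORMATS = {
    "Structured Data": {
        ".avro", ".csv", ".html", ".json", ".orc", ".parquet",
        ".sas", ".tsv", ".xls", ".xlsx", ".xml", ".dat",
    },
    "Compressed/Archive Data": {
        ".gzip", ".gz", ".lzo", ".lzop", ".jar", ".tar.gz", ".zip",
        ".snappy.parquet", ".snappy.orc", ".snappy.avro",
        ".zlib.orc", ".zlib.parquet", ".zlib.avro",
    },
    "Unstructured Data": {".doc", ".docx", ".pdf", ".txt"},
    "Media Data (metadata only)": {".jpeg", ".mp4", ".mpeg"},
}


def classify(path: str) -> Optional[str]:
    """Return the table category for a file path, or None if no listed suffix matches."""
    name = PurePosixPath(path).name.lower()
    best = None  # (suffix length, category) of the longest match so far
    for category, suffixes in SUPPORTED_FORMATS.items():
        for suffix in suffixes:
            # Prefer the longest suffix so "x.snappy.parquet" is not
            # reported as plain ".parquet".
            if name.endswith(suffix) and (best is None or len(suffix) > best[0]):
                best = (len(suffix), category)
    return best[1] if best else None


if __name__ == "__main__":
    for sample in ["events.snappy.parquet", "report.PDF", "notes.md"]:
        print(sample, "->", classify(sample))
```

Longest-suffix matching is a design choice made here so that compound extensions such as `.tar.gz` or `.snappy.parquet` land in the Compressed/Archive row rather than being read as their final component.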