
Overview of Supported Data Sources

Privacera Discovery supports a wide range of data sources, including databases, cloud storage, and file systems. This document provides an overview of the supported data sources and the steps to connect them to Privacera Discovery.

Connecting Data Sources to Discovery

The steps below create a System, to which you can then add the relevant data sources.

  1. For Self-Managed, log in to the Privacera Portal; for Data Plane, log in to the Privacera Discovery Admin Console.

  2. Go to Settings > Data Source Registration.

  3. Click ADD SYSTEM.

  4. Enter the System name and an optional description in the Name and Description fields, respectively.

  5. Click SAVE.

Data Source Connectors Supported by Privacera Discovery

| Category | Data Source Connectors |
|---|---|
| Databases | Vertica, Databricks Unity Catalog, Databricks Spark SQL, Amazon Redshift, Apache Hive, Apache HBase, Apache Kudu, Apache Phoenix, Azure SQL Database, AWS DynamoDB, Google BigQuery, Microsoft SQL Server, Synapse, MySQL, Oracle, PostgreSQL, Snowflake, Trino, Teradata, Dremio |
| Cloud Storage | Amazon S3, Azure Data Lake Storage (ADLS), Google Cloud Storage (GCS) |
| File Systems | HDFS, Local Filesystem, NFS, SFTP |

Supported File Formats for Discovery Scans

| Type | Formats |
|---|---|
| Structured Data | .avro, .avro (nested), .csv, .html, .json, .json (nested), .orc, .parquet, .parquet (nested), .sas, .tsv, .xls, .xlsx, .xml, .dat |
| Compressed/Archive Data | .gzip (single or multiple files), .gz (single or multiple files), .lzo/.lzop, .jar (single or multiple files), .tar.gz (single or multiple files), .snappy.parquet, .snappy.orc, .snappy.avro, .zip (single or multiple files), .zlib.orc, .zlib.parquet, .zlib.avro |
| Unstructured Data | .doc, .docx, .pdf, .txt |
| Media Data | Metadata extraction only: .jpeg, .mp4, .mpeg |
| Database Data | Binary-type columns are skipped; only data in non-binary columns is scanned. |
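
Before registering a storage location, it can help to check which of the format categories above a given file name falls into. The following is a minimal sketch and not part of Privacera Discovery: the `SUPPORTED_FORMATS` mapping and the `classify_file` helper are hypothetical names, the extension lists are transcribed from the table above, and matching by case-insensitive file-name suffix is an assumption about how formats are recognized.

```python
from typing import Optional

# Extensions transcribed from the file-format table; keys are its "Type" column.
# Hypothetical helper data, not a Privacera API.
SUPPORTED_FORMATS = {
    "Structured Data": (
        ".avro", ".csv", ".html", ".json", ".orc", ".parquet",
        ".sas", ".tsv", ".xls", ".xlsx", ".xml", ".dat",
    ),
    "Compressed/Archive Data": (
        ".gzip", ".gz", ".lzo", ".lzop", ".jar", ".tar.gz", ".zip",
        ".snappy.parquet", ".snappy.orc", ".snappy.avro",
        ".zlib.orc", ".zlib.parquet", ".zlib.avro",
    ),
    "Unstructured Data": (".doc", ".docx", ".pdf", ".txt"),
    "Media Data": (".jpeg", ".mp4", ".mpeg"),
}


def classify_file(name: str) -> Optional[str]:
    """Return the table category for a file name, or None if nothing matches.

    The longest matching suffix wins, so "x.snappy.parquet" is reported as
    compressed rather than as plain ".parquet".
    Assumption: matching is done on the lowercased file-name suffix.
    """
    lowered = name.lower()
    best = None  # (suffix length, category) of the best match so far
    for category, suffixes in SUPPORTED_FORMATS.items():
        for suffix in suffixes:
            if lowered.endswith(suffix):
                if best is None or len(suffix) > best[0]:
                    best = (len(suffix), category)
    return best[1] if best else None
```

For example, `classify_file("events.snappy.parquet")` returns `"Compressed/Archive Data"`, while `classify_file("notes.md")` returns `None` because `.md` is not in the table.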
