Overview of Supported Data Sources

Privacera Discovery supports a wide range of data sources, including databases, cloud storage, and file systems. This document provides an overview of the supported data sources and the steps to connect them to Privacera Discovery.

Connecting Data Sources to Discovery

Follow the steps below to create a System, to which the relevant data sources can then be added.

1. For Self-Managed, log in to the Privacera Portal; for Data Plane, log in to the Privacera Discovery Admin Console.
2. Go to Settings > Data Source Registration.
3. Click ADD SYSTEM.
4. Enter the System name in the Name field and, optionally, a description in the Description field.
5. Click SAVE.

Data Source Connectors Supported by Privacera Discovery

Category	Data Source Connectors
Databases	Vertica, Databricks Unity Catalog, Databricks Spark SQL, Amazon Redshift, Apache Hive, Apache HBase, Apache Kudu, Apache Phoenix, Azure SQL Database, AWS DynamoDB, Google BigQuery, Microsoft SQL Server, Synapse, MySQL, Oracle, PostgreSQL, Snowflake, Trino, Teradata, Dremio
Cloud Storage	Amazon S3, Azure Data Lake Storage (ADLS), Google Cloud Storage (GCS)
File Systems	HDFS, Local Filesystem, NFS, SFTP

Supported File Formats for Discovery Scans

Type	Formats
Structured Data	.avro, .avro (nested), .csv, .html, .json, .json (nested), .orc, .parquet, .parquet (nested), .sas, .tsv, .xls, .xlsx, .xml, .dat
Compressed/Archive Data	.gzip (single or multiple files), .gz (single or multiple files), .lzo/.lzop, .jar (single or multiple files), .tar.gz (single or multiple files), .snappy.parquet, .snappy.orc, .snappy.avro, .zip (single or multiple files), .zlib.orc, .zlib.parquet, .zlib.avro
Unstructured Data	.doc, .docx, .pdf, .txt
Media Data	For metadata extraction only: .jpeg, .mp4, .mpeg
Database Data	Binary-type columns are skipped; only the data in non-binary columns is scanned.
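
As a rough illustration of how the file-format table above could be used outside the product, the following Python sketch checks a file name against the listed extensions before a scan is set up. It is not part of Privacera Discovery or any Privacera API: the names `SUPPORTED_FORMATS` and `classify` are hypothetical, and the case-insensitive matching is an assumption rather than documented behavior. Suffixes are matched longest first so that compound extensions such as `.snappy.parquet` or `.tar.gz` are not mistaken for their shorter endings.

```python
from typing import Optional

# Extensions copied from the table above; keys are the table's row labels.
# This mapping is illustrative only and is not read from Privacera Discovery.
SUPPORTED_FORMATS = {
    "Structured Data": [
        ".avro", ".csv", ".html", ".json", ".orc", ".parquet",
        ".sas", ".tsv", ".xls", ".xlsx", ".xml", ".dat",
    ],
    "Compressed/Archive Data": [
        ".gzip", ".gz", ".lzo", ".lzop", ".jar", ".tar.gz",
        ".snappy.parquet", ".snappy.orc", ".snappy.avro", ".zip",
        ".zlib.orc", ".zlib.parquet", ".zlib.avro",
    ],
    "Unstructured Data": [".doc", ".docx", ".pdf", ".txt"],
    "Media Data": [".jpeg", ".mp4", ".mpeg"],  # metadata extraction only
}

# Flatten to (suffix, category) pairs, longest suffix first, so that
# "events.snappy.parquet" matches ".snappy.parquet" before ".parquet".
_SUFFIXES = sorted(
    ((ext, category)
     for category, exts in SUPPORTED_FORMATS.items()
     for ext in exts),
    key=lambda pair: len(pair[0]),
    reverse=True,
)


def classify(filename: str) -> Optional[str]:
    """Return the table category for a file name, or None if unlisted."""
    name = filename.lower()  # assumption: extensions compared case-insensitively
    for ext, category in _SUFFIXES:
        if name.endswith(ext):
            return category
    return None


if __name__ == "__main__":
    for sample in ("events.snappy.parquet", "notes.txt", "photo.png"):
        print(sample, "->", classify(sample))
```

A file for which `classify` returns `None` has an extension that does not appear in the table; whether such a file is skipped or rejected by an actual scan is not stated here.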
