Loading a Data Source
In Acadia, a Data Source represents the origin of the data that feeds a dataset. Data sources range from simple CSV files to complex APIs, and they drive the first steps of data ingestion in the system.
Overview
A data source in Acadia abstracts the mechanics of data ingestion: you specify what data you need, and the data source handles how it is extracted. Each data source type is tailored to a particular data format or storage mechanism.
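To make the idea of abstraction concrete, here is a minimal sketch of a shared interface that hides where rows come from. It is not Acadia's actual API: the names `RowSource`, `InMemorySource`, and `collect_column` are hypothetical and exist only for illustration.

```python
from typing import Dict, Iterator, List, Protocol


class RowSource(Protocol):
    """Hypothetical interface: anything that can yield rows as dicts."""

    def rows(self) -> Iterator[Dict[str, str]]: ...


class InMemorySource:
    """Hypothetical stand-in for a real origin such as a file or an API."""

    def __init__(self, records: List[Dict[str, str]]) -> None:
        self._records = records

    def rows(self) -> Iterator[Dict[str, str]]:
        return iter(self._records)


def collect_column(source: RowSource, column: str) -> List[str]:
    """Consumer code depends only on rows(), not on where the data lives."""
    return [row[column] for row in source.rows()]
```

Under this assumed design, code that consumes rows would not need to change when a different kind of source is substituted.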
Supported Data Source Types
Currently, Acadia supports the following data source types:
- CSVDataSource: Loads data from CSV files, a format commonly used for its simplicity and wide adoption in data storage and exchange.
Usage
To use a data source, first instantiate a data source object with the parameters that define how the data should be read and processed. The following example loads a CSV file as a data source:
# Import the necessary module
import acadia
# Create a data source from a CSV file
data_source = acadia.data_sources.CSVDataSource("human_eval.csv", sample_size=1000)
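The behavior of `sample_size` is not described here. As a rough illustration only, the sketch below uses Python's standard library and assumes that `sample_size` caps the number of rows by drawing a random subset. The function `load_csv_sample` and its `seed` parameter are hypothetical and are not part of Acadia.

```python
import csv
import io
import random
from typing import Dict, List, Optional, TextIO


def load_csv_sample(
    handle: TextIO,
    sample_size: Optional[int] = None,
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Read CSV rows as dicts and optionally keep a random subset.

    Assumption: sample_size means "at most this many rows, chosen at random".
    """
    rows = list(csv.DictReader(handle))
    # Return everything when no cap is given or the file is small enough.
    if sample_size is None or sample_size >= len(rows):
        return rows
    # A seeded generator keeps the sample reproducible across runs.
    return random.Random(seed).sample(rows, sample_size)


if __name__ == "__main__":
    text = "id,prompt\n1,a\n2,b\n3,c\n"
    sample = load_csv_sample(io.StringIO(text), sample_size=2)
    print(len(sample))  # prints 2
```

Reading from an open file handle instead of a path keeps the sketch easy to try with in-memory text. With a real file you would pass `open("human_eval.csv", newline="")`.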