pdreadsqlSELECT
pdreadsqlSELECT is a programmatic interface provided by the DataFrame Analytics Library (DAL), an extension of the popular pandas toolkit. It allows users to execute arbitrary SQL SELECT statements directly against a variety of data sources—including relational databases, CSV files, and in‑memory data stores—and import the results into a pandas‑compatible DataFrame. The function is designed to integrate SQL’s expressive querying capabilities with Python’s data‑science ecosystem, helping analysts avoid repetitive data‑cleaning boilerplate while maintaining familiar DataFrame manipulation patterns.
The key concept behind pdreadsqlSELECT is that it accepts a SQL SELECT string, a database connection or
```python
pdreadsqlSELECT(sql, con, params=None, index_col=None, chunksize=None, **kwargs)
```
The `sql` argument is a string containing a valid SELECT statement. The `con` argument can be a
Usage examples include loading a subset of a remote PostgreSQL table:
```python
df = pdreadsqlSELECT(
    "SELECT id, name, value FROM metrics WHERE value > :threshold",
    # con and params arguments not shown
)
```
or querying a local Parquet file:
```python
df = pdreadsqlSELECT(
    "SELECT * FROM parquet_table WHERE year = 2023",
    # con argument not shown
)
```
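The `:threshold` placeholder in the first query is a named bind parameter, which a `params` mapping would normally supply. How pdreadsqlSELECT binds it is not shown in this section, so the following is only a minimal sketch of the same pattern using Python's standard `sqlite3` module as a stand-in. The `select_rows` helper, the in-memory `metrics` table, and its sample rows are all hypothetical and exist purely for illustration.

```python
import sqlite3


def select_rows(con, sql, params=None):
    """Run a SELECT with named bind parameters and return rows as dicts.

    Hypothetical helper: it mimics the (sql, con, params) shape of the
    signature above but is not part of DAL.
    """
    cur = con.execute(sql, params or {})
    columns = [d[0] for d in cur.description]  # column names from the cursor
    return [dict(zip(columns, row)) for row in cur.fetchall()]


# Hypothetical in-memory table standing in for the remote `metrics` table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE metrics (id INTEGER PRIMARY KEY, name TEXT, value REAL)")
con.executemany(
    "INSERT INTO metrics (id, name, value) VALUES (?, ?, ?)",
    [(1, "cpu", 0.42), (2, "mem", 0.87), (3, "disk", 0.65)],
)

# The value for :threshold is passed separately from the SQL text,
# so it is never spliced into the query string.
rows = select_rows(
    con,
    "SELECT id, name, value FROM metrics WHERE value > :threshold ORDER BY id",
    {"threshold": 0.5},
)
print(rows)
```

Passing the threshold as a bound parameter rather than formatting it into the SQL string avoids quoting mistakes and SQL injection. With pandas installed, `pandas.read_sql_query(sql, con, params=...)` accepts the same query and mapping against a `sqlite3` connection and returns a DataFrame instead of a list of dicts.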
While pdreadsqlSELECT supports standard SELECT syntax, it does not directly handle DML statements such as INSERT,
pdreadsqlSELECT was introduced in DAL version 2.1 as part of the library’s effort to bridge the gap