Skip to content

Flexible CSV processing for Postgres

License

Notifications You must be signed in to change notification settings

PostgREST/pg_csv

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pg_csv

PostgreSQL version Coverage Status Tests

Postgres has CSV support on the COPY command, but COPY has problems:

  • It uses a special protocol, so it doesn't work with other standard features like prepared statements, pipeline mode or pgbench.
  • Is not composable. You can't use COPY inside CTEs, subqueries, view definitions or as function arguments.

pg_csv offers flexible CSV processing as a solution.

  • Includes a CSV aggregate that composes with SQL expressions.
  • Native C extension, x2 times faster than SQL queries that try to output CSV (see our CI results).
  • No dependencies except Postgres.

Installation

PostgreSQL >= 12 is supported. Clone this repo and run:

make && make install

To install the extension:

create extension pg_csv;

csv_agg

Aggregate that builds a CSV respecting RFC 4180, quoting as required.

create table projects as
select *
from (
  values
    (1, 'Death Star OS', 1),
    (2, 'Windows 95 Rebooted', 1),
    (3, 'Project "Comma,Please"', 2),
    (4, 'Escape ""Plan""', 2),
    (NULL, 'NULL & Void', NULL)
) as _(id, name, client_id);
select csv_agg(x) from projects x;
            csv_agg
--------------------------------
 id,name,client_id             +
 1,Death Star OS,1             +
 2,Windows 95 Rebooted,1       +
 3,"Project ""Comma,Please""",2+
 4,"Escape """"Plan""""",2     +
 ,NULL & Void,
(1 row)

Custom Delimiter

Custom delimiters can be used to produce different formats like pipe-separated values, tab-separated values or semicolon-separated values.

select csv_agg(x, csv_options(delimiter := '|')) from projects x;
           csv_agg
-----------------------------
 id|name|client_id          +
 1|Death Star OS|1          +
 2|Windows 95 Rebooted|1    +
 3|Open Source Lightsabers|2+
 4|Galactic Payroll System|2+
 7|Bugzilla Revival|3
(1 row)

select csv_agg(x, csv_options(delimiter := E'\t')) from projects x;
              csv_agg
-----------------------------------
 id      name    client_id        +
 1       Death Star OS   1        +
 2       Windows 95 Rebooted     1+
 3       Open Source Lightsabers 2+
 4       Galactic Payroll System 2+
 7       Bugzilla Revival        3
(1 row)

Note

  • Newline, carriage return and double quotes are not supported as delimiters to maintain the integrity of the separated values format.
  • The delimiter can only be a single char, if a longer string is specified only the first char will be used.
  • Why use a csv_options constructor function instead of extra arguments? Aggregates don't support named arguments in postgres, see a discussion on #2 (comment).

BOM

You can include a byte-order mark (BOM) to make the CSV compatible with Excel.

select csv_agg(x, csv_options(bom := true)) from projects x;

      csv_agg
-------------------
id,name,client_id+
 1,Death Star OS,1
 2,Windows 95 Rebooted,1
 3,Open Source Lightsabers,2
 4,Galactic Payroll System,2
 5,Bugzilla Revival,3
(1 row)

Header

You can omit or include the CSV header.

select csv_agg(x, csv_options(header := false)) from projects x;

           csv_agg
-----------------------------
 1,Death Star OS,1          +
 2,Windows 95 Rebooted,1    +
 3,Open Source Lightsabers,2+
 4,Galactic Payroll System,2+
 7,Bugzilla Revival,3
(1 row)

Null string

NULL values are represented by an empty string by default. This can be changed with the nullstr option.

SELECT csv_agg(x, csv_options(nullstr:='<NULL>')) AS body
FROM   projects x;

              body
--------------------------------
 id,name,client_id             +
 1,Death Star OS,1             +
 2,Windows 95 Rebooted,1       +
 3,"Project ""Comma,Please""",2+
 4,"Escape """"Plan""""",2     +
 <NULL>,NULL & Void,<NULL>
(1 row)

Limitations

  • For large bulk exports and imports, COPY ... CSV should still be preferred as its faster due to streaming support.

About

Flexible CSV processing for Postgres

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors