==================
echemdb Change Log
==================

.. current developments

v0.13.4
====================

**Added:**

* Added ``Echemdb.from_remote(version=...)`` to download a specific version of the echemdb database. The version is defined as ``ECHEMDB_DATABASE_VERSION`` in ``remote.py`` and serves as the single source of truth.

**Changed:**

* Changed ``Collection.from_remote`` to log the URL it downloads data from.
* Changed argument order in ``Collection.from_remote``, ``Echemdb.from_remote``, and ``collect_datapackages``: ``outdir`` now comes before ``data``.

**Removed:**

* Removed ``ECHEMDB_DATABASE_URL`` module-level constant from ``remote.py``; use ``get_echemdb_database_url()`` instead.
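
A minimal sketch of why a function can replace a module-level URL constant: the URL is derived from a version at call time, so a single version constant remains the only value to maintain. The names ``DATABASE_VERSION``, ``URL_TEMPLATE``, and ``get_database_url`` and the ``example.org`` URL are placeholders chosen for illustration; they are not taken from ``remote.py``.

```python
# Hypothetical names and URL template, for illustration only.
DATABASE_VERSION = "0.0.0"  # placeholder default version
URL_TEMPLATE = "https://example.org/data/{version}.zip"  # placeholder URL


def get_database_url(version=None):
    """Return the download URL for `version`, falling back to the default."""
    if version is None:
        version = DATABASE_VERSION
    return URL_TEMPLATE.format(version=version)


default_url = get_database_url()
pinned_url = get_database_url("1.2.3")
```

Passing ``outdir`` and ``data`` as keyword arguments keeps calling code independent of the changed argument order.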


v0.13.3
====================

**Performance:**

* Improved the speed of returning the echemdb description by caching the bibliography data.


v0.13.2
====================

**Added:**

* Added shifting and loading of reference electrodes from aliases in `ReferenceElectrode.shift` and `ReferenceElectrode.load` methods.

**Changed:**

* Changed the citation output format of `EchemdbEntry.citation()`, removing the URL and DOI from the output.


v0.13.1
====================

**Changed:**

* Changed the CSV loader to emit a warning instead of raising a ``ValueError`` when the number of fields in data rows is inconsistent with column headers. Extra columns are auto-labeled as ``unknown 1``, ``unknown 2``, etc., and missing values are represented as ``NaN``.
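
The behavior described above can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the loader's actual code: the function name ``read_ragged_csv`` is hypothetical, values are kept as strings, and only the warning, the ``unknown N`` labels, and the ``NaN`` padding follow the changelog entry.

```python
import csv
import io
import math
import warnings


def read_ragged_csv(text, delimiter=","):
    """Parse CSV text, tolerating rows whose length differs from the header."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    width = max([len(header)] + [len(row) for row in data])
    if any(len(row) != len(header) for row in data):
        # Warn instead of raising a ValueError.
        warnings.warn("Number of fields in data rows is inconsistent with column headers.")
    # Label surplus columns "unknown 1", "unknown 2", ...
    columns = header + [f"unknown {i}" for i in range(1, width - len(header) + 1)]
    # Pad short rows with NaN so that every row has the same width.
    padded = [row + [math.nan] * (width - len(row)) for row in data]
    return columns, padded


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    columns, rows = read_ragged_csv("t,E\n0,1,9\n1\n")
```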


v0.13.0
====================

**Added:**

* Added shared `to_builtin()` methods on descriptor wrappers.

**Changed:**

* Changed upper bound for `pandas` dependency from 3 to 4.

**Fixed:**

* Fixed collection positional access to use deterministic alphabetical ordering, so iteration, integer indexing, and slicing follow the same identifier order.
* Fixed CSV-to-dataframe reconstruction from tabular resources to honor frictionless descriptor dialect and encoding metadata, avoiding silent misparsing for non-default delimiters.
* Fixed the CSV loader API by replacing the ambiguous `delimiters` argument with explicit `delimiter` and `candidate_delimiters` parameters.
* Fixed calculation of reference electrode shift.
* Fixed CLI metadata loading for the `unitpackage csv` command by applying parsed YAML metadata via `entry.metadata.from_dict(...)`.
* Fixed `create_unitpackage(...)` to always store metadata as a dictionary when no metadata is provided.
* Fixed metadata aliasing when creating derived resources by deep-copying metadata in `Entry._create_new_df_resource(...)`.
* Fixed plotting with the option `original` when some fields have no units.
* Fixed YAML export for descriptors containing lists, including nested list structures.
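
The metadata aliasing fix above relies on deep copying. The following sketch uses plain dictionaries rather than unitpackage objects (the dictionary layout is made up for illustration) to show the difference between a shallow copy, which shares nested metadata, and ``copy.deepcopy``, which does not.

```python
import copy

original = {"name": "entry", "metadata": {"source": {"figure": "1a"}}}

# A shallow copy shares the nested metadata dictionaries, so a change made
# through the copy is visible in the original as well.
shallow = dict(original)
shallow["metadata"]["source"]["figure"] = "2b"
aliased_value = original["metadata"]["source"]["figure"]

# Restore the original value, then derive an independent deep copy.
original["metadata"]["source"]["figure"] = "1a"
derived = copy.deepcopy(original)
derived["metadata"]["source"]["figure"] = "3c"
```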


v0.12.0
====================

**Added:**

* Added ``Entry.apply_scaling_factor`` which multiplies a column by a given value and tracks the cumulative scaling factor in the field metadata.
* Added ``EchemdbEntry.scan_rate`` property returning the scan rate as an astropy quantity from ``figureDescription.scanRate``.
* Added ``EchemdbEntry.rescale_scan_rate``, which rescales the current (density) axis by the ratio of a new scan rate to the original one; this amounts to applying a scaling factor.
* Added `entry.fields` as a property to access the list of fields in the entry's schema (`entry.resource.schema.fields`).
* Added `entry.rename_field(field_name, new_name, keep_original_name_as=None)` for renaming a single field.
* Added `entry.remove_column(field_name)` for removing a single column.
* Added `_modify_field_name()` helper method for modifying a single field's name.
* Added `device` parameter to `Entry.from_csv()` to select instrument-specific loaders (e.g., ``device='eclab'`` for BioLogic MPT files, ``device='gamry'`` for Gamry DTA files).
* Added `BaseLoader.metadata` property, which returns file structure information (loader name, delimiter, decimal, header, column headers) stored as ``dsvDescription`` in the entry's metadata.
* Added `EchemdbEntry.from_mpt()` classmethod to load BioLogic EC-Lab MPT files with automatic field updates, renaming, and filtering.
* Added `eclab_fields.py` module (renamed from ``column_names.py``) containing ``biologic_fields`` and ``biologic_fields_alt_names`` for standardized electrochemistry field definitions.
* Added `Entry.remove_columns()`, which removes a column from both the dataframe and the fields.
* Added `Entry.load_metadata()` for loading metadata from YAML or JSON files with method chaining support.
* Added `MetadataDescriptor` class for enhanced metadata handling with dict and attribute-style access.
* Added `Entry.default_metadata_key` class attribute to control metadata access patterns in subclasses.
* Added `Entry._default_metadata` property to access the appropriate metadata subset.
* Added `encoding`, `header_lines`, `column_header_lines`, `decimal`, `delimiters`, and `device` parameters to `Entry.from_csv()` for handling complex CSV formats and instrument-specific file types.
* Added `create_tabular_resource_from_csv()` to create resources from CSV files with auto-detection of standard vs. complex formats.
* Added `create_df_resource_from_csv()` for creating pandas dataframe resources from CSV files with custom formats.
* Added `create_df_resource_from_df()` for creating resources directly from pandas DataFrames.
* Added `create_df_resource_from_tabular_resource()` for converting tabular resources to pandas dataframe resources.
* Added `update_fields()` function to update schema fields with additional information.
* Added ``Entry.create_example(name=None)`` which returns a single example entry. Defaults to ``'alves_2011_electrochemistry_6010_f1a_solid'`` when no name is provided.
* Added `_create_new_df_resource()`, which returns a new pandas resource when the schema of the resource changed.
* Added `_df_resource` as a cached property, which transforms a tabular_resource into a pandas resource when first accessed and caches the result for improved performance.
* Added `collection.rescale(units)` method to rescale the units of all entries in a collection at once. Accepts a dict of `{field_name: new_unit}` and silently ignores fields not present in an entry.
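
The idea behind ``Entry.apply_scaling_factor`` and ``EchemdbEntry.rescale_scan_rate`` can be sketched with plain lists and dictionaries. This is not the library's implementation: the helper names and the ``scalingFactor`` key are assumptions made for illustration. The sketch shows how a cumulative factor can be tracked in a field description, and that rescaling to a new scan rate uses the ratio of the new rate to the original one.

```python
def apply_scaling_factor(values, field, factor):
    """Return scaled values and a copy of `field` with the cumulative factor."""
    scaled = [value * factor for value in values]
    updated = dict(field)
    # Multiply with any factor applied earlier so that the total is tracked.
    updated["scalingFactor"] = field.get("scalingFactor", 1.0) * factor
    return scaled, updated


def scan_rate_factor(original_rate, new_rate):
    """Ratio by which the current axis is rescaled for a new scan rate."""
    return new_rate / original_rate


field = {"name": "j", "unit": "A / m2"}
values, field = apply_scaling_factor([1.0, 2.0], field, 2.0)
# Rescale from a scan rate of 50 to 100 (same unit assumed for both).
values, field = apply_scaling_factor(values, field, scan_rate_factor(50.0, 100.0))
```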

**Changed:**

* Changed `entry.remove_columns()`, `entry.add_columns()`, and `entry.update_fields()` to use frictionless Schema's built-in methods (`schema.remove_field()`, `schema.add_field()`, `schema.update_field()`).
* Changed `entry.rename_fields()` to perform batch rename operations on all fields at once instead of creating intermediate entries.
* Changed `entry.remove_columns()` to perform batch column removal at once instead of creating intermediate entries.
* Changed `_modify_fields()` to `_modify_fields_names()` for clearer naming, and updated its parameter names (`original` → `fields`, `alternative` → `name_mappings`).
* Changed `update_fields()` in `unitpackage.local` to use frictionless `schema.update_field()` instead of manual field dictionary manipulation.
* Changed `Entry.update_fields()` and `create_unitpackage()` in `unitpackage.local` to delegate to `local.update_fields()` for consistent field validation and logging.
* Changed all docstrings referencing `entry.resource.schema.fields` to use the simpler `entry.fields` property.
* Changed `Entry.field_unit()` to return an empty string instead of raising a KeyError when a field does not have a unit.
* Changed `Entry.from_df()` to no longer require the `outdir` parameter.
* Changed `Entry.from_df()` to directly create entries from pandas DataFrames without temporary CSV files.
* Changed `Entry.from_df()` to require `basename` as a keyword-only argument.
* Changed `Entry.save()` to automatically convert pandas resources to CSV format.
* Changed `Entry.metadata` to return a cached `MetadataDescriptor` object supporting enhanced metadata operations while still reflecting metadata changes.
* Changed workflows to use pixi v0.63.2.
* Changed ``Collection.create_example`` to use ``Collection.from_local`` internally.
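
Combined dict and attribute-style access, as described for ``MetadataDescriptor``, can be sketched with a small wrapper class. The class ``AttrDict`` below is hypothetical and far simpler than the real descriptor, and the ``value``/``unit`` layout of the example metadata is an assumption. Because the wrapper keeps a reference to the underlying dictionary, later changes to that dictionary remain visible.

```python
class AttrDict:
    """Read-only wrapper giving dict-style and attribute-style access."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        value = self._data[key]
        # Wrap nested dictionaries so that chained access keeps working.
        return AttrDict(value) if isinstance(value, dict) else value

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc


raw = {"figureDescription": {"scanRate": {"value": 50, "unit": "mV / s"}}}
metadata = AttrDict(raw)
by_key = metadata["figureDescription"]["scanRate"]["value"]
by_attribute = metadata.figureDescription.scanRate.unit
raw["comment"] = "added later"  # visible through the wrapper
```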

**Removed:**

* Removed deprecated module `cv_collection`.
* Removed deprecated module `cv_entry`.
* Removed `metadata` argument from `Entry.from_df()` and `Entry.from_csv()`.
* Removed ``Entry.create_examples``. Use ``Entry.create_example`` instead.
* Removed `entry.mutable_resource`.

**Fixed:**

* Fixed creating plots from entries without units in the fields (#128).
* Fixed resource naming when importing complex CSV files with multiple headers.

**Performance:**

* Improved field handling performance by performing batch operations for `rename_fields()` and `remove_columns()`, avoiding O(N) intermediate entry creation.
* Improved schema copying efficiency in `_create_new_df_resource()` by using direct schema descriptor copying when no updates are needed.
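
A batch rename can be pictured as a single pass over the field descriptions with a name mapping, instead of one intermediate object per renamed field. The function below is an illustrative sketch on plain dictionaries; its name and the ``originalName`` key are assumptions, while the ``name_mappings`` and ``keep_original_name_as`` parameter names mirror those mentioned in this release.

```python
def rename_fields(fields, name_mappings, keep_original_name_as=None):
    """Return new field dicts with names replaced according to `name_mappings`."""
    renamed = []
    for field in fields:
        new = dict(field)  # leave the input untouched
        if field["name"] in name_mappings:
            if keep_original_name_as is not None:
                # Remember the previous name under the requested key.
                new[keep_original_name_as] = field["name"]
            new["name"] = name_mappings[field["name"]]
        renamed.append(new)
    return renamed


fields = [{"name": "t", "unit": "s"}, {"name": "E", "unit": "V"}, {"name": "x"}]
renamed = rename_fields(
    fields, {"t": "time", "E": "potential"}, keep_original_name_as="originalName"
)
```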


v0.11.2
====================

**Added:**

* Added `unitpackage.electrochemistry.reference_electrodes`, which contains reference electrode data, a dataclass to interact with the data (`_reference_electrodes`), and a `ReferenceElectrode` object, which determines the shift of the potential between different reference scales.
* Added `Entry.add_offset` which allows shifting the values of a specified column of the entry by a certain offset and tracking the information in the fields description.
* Added `EchemdbEntry.rescale_reference` which allows shifting the potential scale onto another potential scale known to `unitpackage.electrochemistry._reference_electrodes`.
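
The arithmetic behind shifting between reference scales can be sketched as follows. The electrode names and potentials are made-up placeholders rather than data from the library, and the helper names and the ``offset`` key are assumptions. If both reference electrodes are known against a common scale, a potential measured against the old reference is converted by adding the old reference potential and subtracting the new one.

```python
# Placeholder potentials of two made-up reference electrodes versus a
# common scale, in volts. These numbers are not library data.
REFERENCE_POTENTIALS = {"REF-A": 0.25, "REF-B": 0.5}


def reference_shift(old, new):
    """Offset to add to potentials measured vs `old` to express them vs `new`."""
    return REFERENCE_POTENTIALS[old] - REFERENCE_POTENTIALS[new]


def add_offset(values, field, offset):
    """Shift values by `offset` and record the total offset in the field."""
    shifted = [value + offset for value in values]
    updated = dict(field)
    updated["offset"] = field.get("offset", 0.0) + offset
    return shifted, updated


shift = reference_shift("REF-A", "REF-B")
potentials, field = add_offset([0.0, 1.0], {"name": "E", "unit": "V"}, shift)
```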


v0.11.1
====================

**Removed:**

* Removed the dependency on clevercsv; Python's built-in csv module is used instead.


v0.11.0
====================

**Added:**

* Added modules from `echemdb-converters` (https://github.com/echemdb/echemdb-converters v4.0.1) as modules in unitpackage within `unitpackage.loaders`.
* Added support for Python 3.14.
* Added dependency `click>=8,<9`
* Added dependency `clevercsv>=0.7.0,<0.9.0`

**Removed:**

* Removed support for Python 3.9.
* Removed dependency on iteration_utilities.


v0.10.1
====================

**Added:**

* Added property `identifiers` to `Collection`, returning a list of identifiers of the collection.
* Added additional methods to `collection.__getitem__()`, allowing for creating new collections from existing collections by providing a list of identifiers (`db["id1","id2"]`), integers (`db[0,2]`) or simply a slice (`db[2:3]`). Additionally, entries can now be selected by their position in the collection (`entry = db[3]`).
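
The access patterns listed above can be mimicked with a toy class. ``MiniCollection`` is hypothetical and stores arbitrary values instead of entries. It assumes that positions refer to identifiers in alphabetical order, in line with the ordering fix noted for v0.13.0.

```python
class MiniCollection:
    """Toy collection keyed by identifier, for illustration only."""

    def __init__(self, entries):
        self._entries = dict(entries)

    @property
    def identifiers(self):
        return sorted(self._entries)

    def __getitem__(self, key):
        if isinstance(key, str):  # single identifier -> entry
            return self._entries[key]
        if isinstance(key, int):  # position -> entry
            return self._entries[self.identifiers[key]]
        if isinstance(key, slice):  # slice -> new collection
            names = self.identifiers[key]
        else:  # tuple of identifiers or positions -> new collection
            names = [k if isinstance(k, str) else self.identifiers[k] for k in key]
        return MiniCollection({name: self._entries[name] for name in names})


db = MiniCollection({"b": 2, "a": 1, "c": 3})
```

Here ``db["a", "c"]`` and ``db[0, 2]`` both reach ``__getitem__`` as tuples, which is why one branch handles identifiers and positions alike.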

**Changed:**

* Changed `collection.bibliography` to a cached_property.

**Fixed:**

* Fixed showing plotly plots in the documentation, by using plotly 5 in the workflow to build the documentation.


v0.10.0
====================

**Added:**

* Added `unitpackage.database.echemdb_entry.CVEntry` and `unitpackage.database.echemdb.EchemdbEntry`, with specific functionalities for the echemdb data repository.
* Added tests for Python 3.13.

**Changed:**

* Changed metadata example keys to use camelCase for consistency with JSON naming conventions.
* Use electrochemistry-data release 0.5.0 for remote data tests.
* Changed upper version bound for plotly from "<6" to "<7".
* Changed version bound for pybtex from ">=0.24,<0.25" to ">=0.25,<0.26".

**Deprecated:**

* Deprecated `unitpackage.cv.cv_entry.CVEntry` and `unitpackage.cv.cv_collection.CVCollection`.

**Removed:**

* Removed unused dependency `filelock`.


v0.9.2
====================

**Changed:**

* Changed upper version bound for astropy from <=7 to <8.


v0.9.1
====================

**Removed:**

* Removed `unitpackage.local.collect_datapackage` since it is identical to frictionless `package = Package()`.

**Fixed:**

* Fixed creating and saving entries containing uppercase characters; these are now converted to lowercase to match the frictionless specifications.


v0.9.0
====================

**Added:**

* Added `entry.add_column` which allows adding a column to an existing pandas dataframe and extends the Data Package fields with given units.
* Added the property `entry.mutable_resource`, which is a virtual modifiable copy of the original resource excluding its metadata.
* Added `unitpackage.local.collect_resources`, which collects all resources from a list of frictionless Data Packages.
* Added `collection.from_local_file` to create a collection from the resources included in a Data Package (JSON).
* Added validation to check for duplicate resource names upon creating a collection.
* Added wheel upload on new release.
* Added dependency `iteration_utilities`.

**Changed:**

* Changed `unitpackage.entry.Entry` from being a frictionless Data Package into a frictionless Resource.
* Changed `unitpackage.collection.Collection` from being a collection of frictionless Data Packages into a collection of frictionless Resources forming a Data Package.
* Changed the virtual `echemdb` Resource into an `entry.mutable_resource`.
* Changed `unitpackage.local.create_df_resource` to create a resource from an actual frictionless Resource instead of a frictionless Data Package.
* Changed packages for development to be provided by pixi instead of conda directly.

**Removed:**

* Removed argument `resource_name` in `unitpackage.local.create_df_resource` and all other instances where resources were named "echemdb".

**Fixed:**

* Fixed parsing of arguments `data` and `outdir` for `collection.from_remote` downloading data from the default remote url.
* Fixed failing tests on GitHub (tests should be more stable now that pixi provides locked versions of dependencies).

**Performance:**

* Improved loading collections via `collection.from_local` or `collection.from_remote` and entries via `entry.from_local`. In contrast to the previous version, dataframes are now only loaded when a method or property is called that requires access to the resource's data. This also increases the speed for filtering the data based on metadata predicates.
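
Deferred loading of this kind can be sketched with ``functools.cached_property`` from the standard library. The class ``LazyEntry`` and the loader function are hypothetical stand-ins, not unitpackage code. The data is read on first access only, so operations that need only the identifier or metadata never trigger a load.

```python
from functools import cached_property


class LazyEntry:
    """Toy entry that defers loading its data until first access."""

    def __init__(self, identifier, loader):
        self.identifier = identifier
        self._loader = loader  # callable returning the data

    @cached_property
    def data(self):
        # Runs once, on first access; the result is cached on the instance.
        return self._loader()


load_calls = []


def load_demo_data():
    """Stand-in for an expensive read; records each call."""
    load_calls.append("load")
    return [1, 2, 3]


entry = LazyEntry("demo", load_demo_data)
calls_before_access = len(load_calls)  # nothing has been loaded yet
first = entry.data   # triggers the load
second = entry.data  # served from the cache
```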


v0.8.5
====================

**Added:**

* Added `Entry.rename_fields`, returning a new entry with updated field names and dataframe column names.
* Added the classmethod `Entry._modify_fields` that updates a list of fields, and allows keeping the original name.


v0.8.4
====================

**Fixed:**

* Fixed `entry.save`, where the saved datapackages contained the `echemdb` resource.


v0.8.3
====================

**Fixed:**

* Fixed saving entries with `date` and `datetime` objects.


v0.8.2
====================

**Fixed:**

* Fixed creation of fields, when not all fields were provided to `local.create_unitpackage`.
* Fixed creating an entry from the parent directory with `Entry.from_csv`.


v0.8.1
====================

**Fixed:**

* Fixed binder configuration files and links.
* Upgraded to pandas version 2.


v0.8.0
====================

**Added:**

* Added `entry.save` which creates a unitpackage, i.e., a CSV file and a JSON file, in the directory `outdir`.
* Added `collection.save_entries`, which saves the entries of this collection using `entry.save`.
* Added `Collection.from_local`, creating a collection from local datapackages.
* Added `Collection.from_remote`, creating a collection from remote datapackages collected from a url containing a ZIP.
* Added `Entry.from_local`, loading an entry from a local datapackage.
* Added `Entry.from_csv`, creating an entry from a CSV file, optionally adding metadata to the entry and modifying the field properties.
* Added `Entry.from_df`, creating an entry from a pandas dataframe, optionally adding metadata to the entry and modifying the field properties.
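
The CSV-plus-JSON pair written by ``entry.save`` can be pictured with the following standard-library sketch. The function ``save_package`` is hypothetical, and the descriptor is a simplified assumption loosely following the Data Package layout (``resources``, ``path``, ``schema.fields``) with a ``unit`` per field. The files written by unitpackage may contain more than this.

```python
import csv
import json
import tempfile
from pathlib import Path


def save_package(name, fields, rows, outdir):
    """Write `name`.csv and a `name`.json descriptor into `outdir`."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    with open(outdir / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([field["name"] for field in fields])
        writer.writerows(rows)
    # Simplified descriptor pointing at the CSV file next to it.
    descriptor = {
        "resources": [
            {"name": name, "path": f"{name}.csv", "schema": {"fields": fields}}
        ]
    }
    (outdir / f"{name}.json").write_text(
        json.dumps(descriptor, indent=2), encoding="utf-8"
    )
    return descriptor


demo_fields = [{"name": "t", "unit": "s"}, {"name": "E", "unit": "V"}]
with tempfile.TemporaryDirectory() as tmp:
    save_package("demo", demo_fields, [[0, 0.1], [1, 0.2]], tmp)
    written = sorted(path.name for path in Path(tmp).iterdir())
    csv_header = (Path(tmp) / "demo.csv").read_text(encoding="utf-8").splitlines()[0]
    loaded = json.loads((Path(tmp) / "demo.json").read_text(encoding="utf-8"))
```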

**Changed:**

* Changed default url to collect datapackages to assets in https://github.com/echemdb/electrochemistry-data/releases.
* Changed loading echemdb data, using `Collection.from_remote()` instead of `Collection()`.
* Changed `Entry.create_examples` to load pre-defined datapackages instead of generating datapackages from SVGs with the `svgdigitizer`.

**Removed:**

* Removed `svgdigitizer` as dependency used in automated tests.
* Removed `Entry._digitize_example`, used to create datapackages for automated tests.

**Fixed:**

* Fixed the description of the project in the `setup.py`.


v0.7.1
====================

**Fixed:**

* Fixed content of the Changelog to indicate that `unitpackage` originates from https://github.com/echemdb/echemdb.


v0.7.0
====================

**Fixed:**

* Fixed `entry.rescale` which returned an erroneous entry.


Older Versions
==============

For versions older than 0.7.0, please refer to `echemdb <https://github.com/echemdb/echemdb>`_.