NEWS
nc 2024.9.20 (2024-09-20)
- list(given_first=list(given, family)) examples + v5 + docs update.
nc 2024.9.19
- bugfix: error for literal groups (un-escaped parens in pattern strings) now happens in apply_type_funs instead of capture_first_vec.
- bugfix: capture_all_str(engine="PCRE") will now produce that same error when there are no matches (match matrix with 0 rows now takes number of columns from gregexpr trivial no-match info, rather than from number of arguments defining pattern in R code).
nc 2024.8.15
- bugfix: using alternatives_with_shared_groups with nc::capture_first_vec now works on a subject of length=1.
nc 2024.7.16
- arrow::arrow_with_dataset() to avoid CRAN NOTE.
nc 2024.2.21 (2024-02-22)
- various minor updates, thanks CRAN.
nc 2024.2.12
- if(requireNamespace(suggested)) in tests, thanks CRAN.
nc 2024.2.1
- Fix vignette 7 by adding if(requireNamespace("arrow")).
nc 2024.1.30
- Imports data.table >= 1.15.0.
- Remove code which uses old pre-measure data.table.
- Fix eurostat URL in vignette 3.
nc 2024.1.4
- new vignette v7-capture-glob.
nc 2023.8.24 (2023-08-31)
- provide argument descriptions for un-exported functions (required to avoid NOTE on CRAN using new R-devel).
- data.table::setDTthreads(1) in examples and tests to avoid CRAN NOTE.
- examples fread plain text instead of gz files, to save time on CRAN.
nc 2023.5.1 (2023-05-01)
- melt uses variable_table if data.table has measure.
- VignetteEngine knitr -> rmarkdown.
nc 2023.3.30
- new arrow hive partition example for capture_first_glob.
nc 2023.1.10
- new function capture_first_glob.
nc 2021.11.30
- capture_first_vec: when there are more than 10 subjects which do not match, error message only shows first 5 and last 5 indices (instead of all indices as previously). Uses new collapse_some function (internal use only; also used in check_names).
nc 2021.11.17
- alternatives_with_shared_groups.
nc 2021.7.16
- Suggest re2 instead of re2r.
nc 2021.6.28
- capture_first_df has new existing.error argument, default TRUE same as existing behavior (error when capture group name same as column name in subject), FALSE means to overwrite existing column (useful when experimenting with regex that modifies DT).
nc 2021.6.25
- capture_first_df uses data.table::set when input is data table, for memory efficiency. Entire table is still copied if input is data frame.
- capture_melt_multiple example for fill=TRUE.
nc 2021.6.23
- apply_type_funs stops for NA to non-NA conversion.
nc 2021.5.20
- new capture_longer_spec function.
nc 2021.4.2
- New v0-overview vignette.
nc 2021.3.31
- Duplicate capture group names now allowed in alternatives.
- New function altlist which helps when defining alternatives with common sub-patterns.
nc 2020.10.6
- New error for capture_melt_multiple with only one input variable per value captured in column group.
- New internal functions measure_single, measure_multiple, melt_list, check_names, check_df_names.
- capture_melt_single/multiple now use data.table::melt with new variable_table attribute of measure.vars (more efficient than previous join approach). Therefore Imports changed to require at least version
nc 2020.9.27
- capture_melt_multiple(fill=TRUE)
- Move helper functions from README to vignette 5.
- New more informative/helpful error message for capture_melt_* when some columns match but they are converted to NA by the type conversion functions.
- capture_melt_multiple(na.rm=TRUE) works even when there are entire groups which are missing (actually nc code did NOT change, bug fix was in data.table).
- PROVEDIt data with 100 peaks x 3 features per peak columns to melt.
nc 2020.8.6 (2020-08-10)
- data.table::fread version 1.13.0 now gives IDate instead of character, thanks to @MichaelChirico for updating test-CRAN-multiple.R
- New test for default/specified engine via features that only work with ICU (verbose character classes) and PCRE (recursion). Old test used non-representable characters which did not work on windows (but did work on Ubuntu).
- Suggest markdown for vignettes.
nc 2020.7.1
- nc::capture_* functions were using a named argument starting with "s" as the subject argument, which was confusing since the first argument should be the subject, and it is never named. Fixed by removing subject argument and instead taking subject from the first element of ... (in capture_df_names and new subject_var_args function).
- Various documentation clarifications.
nc 2020.5.16
nc 2020.5.14
- Remove big print from vignette 4 comparisons, use str instead.
nc 2020.5.13 (2020-05-14)
- tidyr::pivot_longer examples use names_transform (not present in tidyr
nc 2020.3.25
- fix typo in capture_melt_single example.
- stop_for_engine tells user to get re2r from github.
nc 2020.3.23 (2020-03-25)
- reduce repetition in tests, test_engines defined in a separate inst/test_engines.R file.
nc 2020.2.27 (2020-03-04)
- PROVEDit file names example in capture first vignette.
- alternatives(pattern1, pattern2, ...) generates pattern1|pattern2, with support for named arguments.
nc 2020.2.24
- capture_melt_multiple bugfix when "count" is one of the columns to output.
nc 2020.1.16 (2020-01-31)
- test for pattern with literal groups.
nc 2019.12.12
- time command line example for capture_all_str.
- more informative error message for capture_first_df.
nc 2019.11.22
- quantifier.
- Do not run emoji examples on Solaris.
nc 2019.10.19 (2019-10-28)
- dot prefix now ok for capture_melt_multiple.
- informative error messages to enforce unique column names in input and output of capture_melt_*.
nc 2019.10.14
- capture_first_melt -> capture_melt_single.
- capture_first_melt_multiple -> capture_melt_multiple.
nc 2019.10.11
- new vignettes 3=melt 4=comparison.
- move unicode data from example to extdata in order to avoid the following on CRAN solaris:
- Warning in deparse(e[[2L]]) : it is not known that wchar_t is Unicode on this platform
nc 2019.9.28
- capture_first_melt_multiple inspired by data.table::patterns.
- capture_first_melt inspired by tidyr::pivot_longer.
- Suggest tidyr for who data example.
nc 2019.9.18
- additional citation/medline web page data/examples.
nc 2019.9.16 (2019-09-23)
- do not do engine tests (with rename library dirs) on CRAN.
- remove print/str from error messages for CRAN.
- group, field functions.
nc 2019.9.12
- always output data.table.
nc 2019.9.10
- test 0 column result via NULL names and rownames, not identical/expected data.frame.
nc 2019.9.4
- engine error message includes install.packages.
nc 2019.9.3
- dput used for did you mean error message.
- vignettes separated, 1=first 2=all.
nc 2019.8.22
- DESCRIPTION updated.
- Informative error for named character string patterns.
nc 2019.8.21
- main functions renamed to capture_* for auto-complete convenience.
- on https://github.com/gagolews/stringi/issues/279 it says ICU >= 59 is required for emoji support. but travis uses Ubuntu xenial system ICU (55.1) by default. Ubuntu bionic has ICU 60.2 https://launchpad.net/ubuntu/bionic/+source/icu so travis build will most likely be fixed by telling travis to use bionic. Actually dist: bionic builds use xenial still for language: r, https://travis-ci.community/t/for-dist-bionic-xenial-vm-is-used-instead-with-language-rust/4487/2 So instead we fix the build by making if interactive() for this example, and adding another engine="ICU" example based on http://userguide.icu-project.org/strings/regexp for str_capture_all.
- docs.
- vignette.
- FIXED: needed to escape backslashes in examples.
nc 2019.8.14
- first version forked from namedCapture.
- vec_capture_first, str_capture_all, df_capture_first pass tests for three engines: ICU, PCRE, RE2.