refcodes-tabular: Process tabular data and CSV files using POJOs
Siegfried Steiner

README

The REFCODES.ORG codes are a group of artifacts consolidating parts of my work over the past years. They cover several topics which I consider useful for programmers, developers and software engineers.

What is this repository for?

This artifact helps to process table-like data structures, including CSV files with records, headers and comments, while supporting plain old Java objects (POJOs) as well as the newer Java record types.

Quick start archetype

For a jump start into developing Java-driven command line tools, I created some fully pre-configured Maven Archetypes, available on Maven Central. These archetypes already provide the means to directly create native executables, bundles and launchers, and they support command line argument parsing as well as property file handling out of the box.

Use the refcodes-archetype-alt-csv archetype to create a bare metal CSV (CLI) driven Java application:

Replace my.corp with your actual Group-ID and myapp with your actual Artifact-ID:

mvn archetype:generate \
  -DarchetypeGroupId=org.refcodes \
  -DarchetypeArtifactId=refcodes-archetype-alt-csv \
  -DarchetypeVersion=4.0.0 \
  -DgroupId=my.corp \
  -DartifactId=myapp \
  -Dversion=0.0.1

With the defaults, this generates a CSV processing application built on the refcodes-tabular toolkit, with a command line interface provided by the refcodes-cli toolkit.

How do I get set up?

To get up and running, include the following dependency (without the three dots “…”) in your pom.xml:

<dependencies>
	...
	<dependency>
		<artifactId>refcodes-tabular</artifactId>
		<groupId>org.refcodes</groupId>
		<version>4.0.0</version>
	</dependency>
	...
</dependencies>

The artifact is hosted directly at Maven Central. Jump straight to the source code at Bitbucket. Read the artifact’s javadoc at javadoc.io.

Snippets of interest

Below are some code snippets which demonstrate various aspects of using the refcodes-tabular artifact (and, if applicable, its offspring). See also the example source code of this artifact for further information on its usage.

Reading CSV files

The example below presumes a type Tupel with the attributes date (Date), n1 (int), n2 (int), n3 (int), n4 (int), n5 (int), n6 (int), jackpot (float) and currency (String). An input CSV file is read line by line; each line represents a Record whose fields are mapped to the Tupel type. To exemplify writing, each Tupel instance is then written to an output CSV file.

...

import static org.refcodes.tabular.TabularSugar.*;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;

import org.refcodes.tabular.CsvRecordReader;
import org.refcodes.tabular.CsvRecordWriter;
import org.refcodes.tabular.Header;

public class SomeClass {		
	...
	public void someMethod(boolean hasHeader) {
		...
		File theSourceFile = ...
		File theDestFile = ...
		...
		Tupel eTupel;
		Header<?> theHeader = headerOf( dateColumn( "date", DateFormat.MIN_DATE_FORMAT.getFormatter() ), intColumn( "n1" ), intColumn( "n2" ), intColumn( "n3" ), intColumn( "n4" ), intColumn( "n5" ), intColumn( "n6" ), floatColumn( "jackpot" ), stringColumn( "currency" ) );
		try (CsvRecordReader<?> theCsvReader = theSourceFile != null ? new CsvRecordReader<>( theHeader, theSourceFile, theSourceDelimiter ) : new CsvRecordReader<>( theHeader, new BufferedInputStream( System.in ), theSourceDelimiter )) {
			try (CsvRecordWriter<?> theCsvWriter = theDestFile != null ? new CsvRecordWriter<>( theHeader, theDestFile, theDestDelimiter ) : new CsvRecordWriter<>( theHeader, new BufferedOutputStream( System.out ), theDestDelimiter )) {
				if ( hasHeader ) {
					theCsvReader.readHeader();
					theCsvWriter.writeHeader();
				}
				while ( theCsvReader.hasNext() ) {
					eTupel = theCsvReader.nextType( Tupel.class );
					theCsvWriter.writeType( eTupel );
				}
			}
		}
		...
	}
	...
}

Note: As said above, the type Tupel corresponds to the header theHeader in terms of attributes and their types!
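
The following is a minimal sketch of what such a Tupel type could look like, written as a Java record. The component names and types are taken from the header above. Using java.util.Date for the date component, and the idea that columns are matched to components by name, are assumptions for illustration and are not confirmed by the snippet above.

```java
import java.util.Date;

// Hypothetical Tupel type: one component per column of theHeader,
// using the same names and compatible types (mapping by name is assumed).
record Tupel( Date date, int n1, int n2, int n3, int n4, int n5, int n6, float jackpot, String currency ) {}
```

A classic POJO exposing the same nine properties through getters and setters would serve the same purpose.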

A sample of the CSV file being processed might look as follows:

date;n1;n2;n3;n4;n5;n6;jackpot;currency
2021-02-24;1;5;13;36;44;46;26529922;EUR
2021-02-17;4;10;25;29;40;48;26362228;EUR
2021-02-10;5;19;21;27;32;46;28438716;EUR
2021-02-03;3;6;10;12;34;47;27384502;EUR
2021-01-27;1;4;15;16;44;45;25298324;EUR
2021-01-20;5;6;11;31;35;42;26535136;EUR
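
To make the column-to-attribute mapping concrete, the following plain-Java sketch converts one line of the sample above into typed values by hand. It does not use refcodes-tabular and does not reflect how the library works internally. The class SampleLineDemo, the record Draw and the date pattern yyyy-MM-dd are assumptions chosen for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

final class SampleLineDemo {

	// Hypothetical holder mirroring the columns of the sample file.
	record Draw( Date date, int[] numbers, float jackpot, String currency ) {}

	// Splits one semicolon-separated line and converts each field by position.
	static Draw parse( String aLine ) {
		String[] theFields = aLine.split( ";" );
		if ( theFields.length != 9 ) {
			throw new IllegalArgumentException( "Expected 9 fields but got " + theFields.length );
		}
		Date theDate;
		try {
			// Assumed date pattern, matching the sample's "2021-02-24" style.
			theDate = new SimpleDateFormat( "yyyy-MM-dd" ).parse( theFields[0] );
		}
		catch ( ParseException e ) {
			throw new IllegalArgumentException( "Bad date: " + theFields[0], e );
		}
		int[] theNumbers = new int[6];
		for ( int i = 0; i < theNumbers.length; i++ ) {
			theNumbers[i] = Integer.parseInt( theFields[i + 1] ); // columns n1..n6
		}
		float theJackpot = Float.parseFloat( theFields[7] );
		return new Draw( theDate, theNumbers, theJackpot, theFields[8] );
	}

	public static void main( String[] args ) {
		Draw theDraw = parse( "2021-02-24;1;5;13;36;44;46;26529922;EUR" );
		System.out.println( theDraw.numbers()[0] + " ... " + theDraw.numbers()[5] + " " + theDraw.currency() );
	}
}
```

With a header definition as in the reading example, this positional conversion is what the typed columns (dateColumn, intColumn, floatColumn, stringColumn) take off your hands.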

Examples

See the sources of the funcodes-calcsv and funcodes-waves command line tools for a demo of this toolkit as well as the source code examples of this artifact.

Contribution guidelines

  • Report issues
  • Find bugs
  • Help fix bugs
  • Improve code and documentation
  • Enhance the code

Who do I talk to?

Licensing Philosophy

This project follows a dual-licensing model designed to balance openness, pragmatism and fair attribution.

You may choose between the LGPL v3.0 or later and the Apache License v2.0 when using this software.

The intention behind this model is simple:

  • Enable use in both open-source and proprietary projects
  • Keep the codebase approachable and reusable
  • Ensure that improvements to the library itself remain available to the community
  • Preserve clear attribution to the original author and the ecosystem

Under the LGPL v3.0+, you are free to use this library in any application. If you modify the library itself, those modifications must be made available under the same license and must retain proper attribution.

Alternatively, the Apache License v2.0 allows broad use, modification and distribution, including commercial usage, provided that copyright notices and the accompanying NOTICE file are preserved.

This dual-licensing approach intentionally avoids artificial barriers while discouraging closed, uncredited forks of the core library. Contributions, improvements and refinements are encouraged to flow back into the project, benefiting both the community and downstream users.

For licensing questions, alternative licensing arrangements or commercial inquiries, please contact the copyright holder.