refcodes-tabular: Process tabular data and CSV files using POJOs

README

The REFCODES.ORG codes represent a group of artifacts consolidating parts of my work in the past years. Several topics are covered which I consider useful for you, programmers, developers and software engineers.

What is this repository for?

This artifact helps with processing table-like data structures, including CSV files with records, headers and comments, while supporting plain old Java objects (POJOs) as well as Java record types.

Quick start archetype

For a jump start into developing Java-driven command line tools, I created some fully pre-configured Maven Archetypes, available on Maven Central. These archetypes already provide the means to directly create native executables, bundles and launchers, and they support command line argument parsing and property file handling out of the box.

Use the refcodes-archetype-alt-csv archetype to create a bare metal CSV (CLI) driven Java application. Please replace my.corp with your actual Group-ID and myapp with your actual Artifact-ID:

mvn archetype:generate \
  -DarchetypeGroupId=org.refcodes \
  -DarchetypeArtifactId=refcodes-archetype-alt-csv \
  -DarchetypeVersion=3.4.1 \
  -DgroupId=my.corp \
  -DartifactId=myapp \
  -Dversion=0.0.1

Using the defaults, this will generate a CSV processing application harnessing the refcodes-tabular toolkit whilst providing a command line interface by harnessing the refcodes-cli toolkit.

How do I get set up?

To get up and running, include the following dependency (without the three dots “…”) in your pom.xml:

<dependencies>
	...
	<dependency>
		<artifactId>refcodes-tabular</artifactId>
		<groupId>org.refcodes</groupId>
		<version>3.4.1</version>
	</dependency>
	...
</dependencies>

The artifact is hosted directly at Maven Central. Jump straight to the source codes at Bitbucket. Read the artifact’s javadoc at javadoc.io.

Snippets of interest

Below find some code snippets which demonstrate the various aspects of using the refcodes-tabular artifact (and, if applicable, its offspring). See also the example source code of this artifact for further information on its usage.

Reading CSV files

The example below presumes a type Tupel defined with the attributes date (Date), n1 (int), n2 (int), n3 (int), n4 (int), n5 (int), n6 (int), jackpot (float) and currency (String). An input CSV file is read line by line, where each line represents a Record whose fields are mapped to the Tupel type. To exemplify writing, each Tupel instance is then written back to an output CSV file.

...

import static org.refcodes.tabular.TabularSugar.*;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import org.refcodes.tabular.CsvRecordReader;
import org.refcodes.tabular.CsvRecordWriter;
import org.refcodes.tabular.Header;
		
public class SomeClass {		
	...
	public void someMethod(boolean hasHeader) {
		...
		File theSourceFile = ...
		File theDestFile = ...
		...
		Tupel eTupel;
		Header<?> theHeader = headerOf( dateColumn( "date", DateFormat.MIN_DATE_FORMAT.getFormatter() ), intColumn( "n1" ), intColumn( "n2" ), intColumn( "n3" ), intColumn( "n4" ), intColumn( "n5" ), intColumn( "n6" ), floatColumn( "jackpot" ), stringColumn( "currency" ) );
		try (CsvRecordReader<?> theCsvReader = theSourceFile != null ? new CsvRecordReader<>( theHeader, theSourceFile, theSourceDelimiter ) : new CsvRecordReader<>( theHeader, new BufferedInputStream( System.in ), theSourceDelimiter )) {
			try (CsvRecordWriter<?> theCsvWriter = theDestFile != null ? new CsvRecordWriter<>( theHeader, theDestFile, theDestDelimiter ) : new CsvRecordWriter<>( theHeader, new BufferedOutputStream( System.out ), theDestDelimiter )) {
				if ( hasHeader ) {
					theCsvReader.readHeader();
					theCsvWriter.writeHeader();
				}
				while ( theCsvReader.hasNext() ) {
					eTupel = theCsvReader.nextType( Tupel.class );
					theCsvWriter.writeType( eTupel );
				}
			}
		}
		...
	}
	...
}

Note: As said above, the type Tupel corresponds to the header theHeader in terms of attributes and their types!
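To make that correspondence concrete, here is a minimal sketch of what such a Tupel type might look like, written as a Java record. This is an assumption for illustration only: the component names and types simply mirror the columns of theHeader as described above, and a classic POJO with getters and setters for the same attributes would serve equally well.

```java
import java.util.Date;

// Hypothetical shape of the Tupel type (not taken from the artifact's sources).
// Each component matches one column of theHeader by name and by type.
record Tupel(
	Date date,        // dateColumn( "date", ... )
	int n1,           // intColumn( "n1" )
	int n2,           // intColumn( "n2" )
	int n3,           // intColumn( "n3" )
	int n4,           // intColumn( "n4" )
	int n5,           // intColumn( "n5" )
	int n6,           // intColumn( "n6" )
	float jackpot,    // floatColumn( "jackpot" )
	String currency   // stringColumn( "currency" )
) {}
```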

A sample of the CSV file being processed might look as follows:

date;n1;n2;n3;n4;n5;n6;jackpot;currency
2021-02-24;1;5;13;36;44;46;26529922;EUR
2021-02-17;4;10;25;29;40;48;26362228;EUR
2021-02-10;5;19;21;27;32;46;28438716;EUR
2021-02-03;3;6;10;12;34;47;27384502;EUR
2021-01-27;1;4;15;16;44;45;25298324;EUR
2021-01-20;5;6;11;31;35;42;26535136;EUR
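
To illustrate what the column types of the header imply for such a line, the following sketch converts one sample record by hand using plain JDK calls only. It does not use refcodes-tabular; the class name SampleLineDemo, the semicolon delimiter and the ISO date pattern (yyyy-MM-dd) are assumptions derived from the sample above.

```java
import java.time.LocalDate;

// Hypothetical helper, only meant to show the column-to-type correspondence.
class SampleLineDemo {

	// Splits one data line on ';' and converts each field to its column's type.
	static Object[] parse( String line ) {
		String[] fields = line.split( ";" );
		Object[] values = new Object[fields.length];
		values[0] = LocalDate.parse( fields[0] ); // "date", ISO pattern assumed
		for ( int i = 1; i <= 6; i++ ) {
			values[i] = Integer.parseInt( fields[i] ); // "n1" to "n6"
		}
		values[7] = Float.parseFloat( fields[7] ); // "jackpot"
		values[8] = fields[8]; // "currency"
		return values;
	}

	public static void main( String[] args ) {
		Object[] values = parse( "2021-02-24;1;5;13;36;44;46;26529922;EUR" );
		System.out.println( values[0] + " " + values[1] + " " + values[8] );
	}
}
```

The CsvRecordReader shown earlier takes care of this conversion itself, driven by the header's column definitions.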

Examples

See the sources of the funcodes-calcsv and funcodes-waves command line tools for a demo of this toolkit as well as the source code examples of this artifact.

Contribution guidelines

Reporting issues
Finding bugs
Helping to fix bugs
Making code and documentation better
Enhancing the code

Who do I talk to?

Siegfried Steiner (steiner@refcodes.org)

Terms and conditions

The REFCODES.ORG group of artifacts is published under open source licenses, which are covered by the refcodes-licensing artifact (org.refcodes group). This is evident in each artifact from the corresponding dependency declared in its pom.xml.
