Sorting text files by DateTime in content

I have made a GitHub project

This project was developed to propose a solution to a problem that was raised by Dave Evers in the Katalon Community forum at

Problem to solve

One day I was asked a question in the “Katalon Community” forum. The original poster described the problem as follows:

  • He has several hundred text files in a directory. Every file name ends with the suffix “.eml”, which stands for “Email message”, like 79edddc6-ce98-4eff-b8ea-414e392bce1f.eml.
  • The files have similar content like this:
X-Sender: "Do Not Reply" donotreply@anywhere.com
X-Receiver: xyz.com
MIME-Version: 1.0
From: "Do Not Reply" donotreply@anywhere.com
To: xyz.com
Date: 26 Oct 2022 14:43:15 -0700
Subject: Hello world, what is my URL?
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: base64

aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbQ==
  • He wants to get a list of file names sorted in descending order of the “Date” header value in the file content, for example: Date: 26 Oct 2022 14:43:15 -0700.
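
To make the requirement concrete, here is a minimal JDK-only sketch of the idea: find the “Date” header in each file's header block, parse it, and sort the file names newest first. It is not the code of the jar described below. The class name EmlDateSketch and its methods are hypothetical, and the sketch assumes that every file carries a Date header in exactly the format shown above.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch class; it is not part of the author's library.
final class EmlDateSketch {

    // Matches values such as "26 Oct 2022 14:43:15 -0700" (assumed format).
    static final DateTimeFormatter EMAIL_DATE =
            DateTimeFormatter.ofPattern("d MMM yyyy HH:mm:ss Z", Locale.ENGLISH);

    // Returns the value of the first "<name>: ..." line in the header block, if any.
    static Optional<String> headerValue(List<String> lines, String name) {
        String prefix = name + ":";
        for (String line : lines) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            if (line.regionMatches(true, 0, prefix, 0, prefix.length())) {
                return Optional.of(line.substring(prefix.length()).trim());
            }
        }
        return Optional.empty();
    }

    static OffsetDateTime parseDate(String value) {
        return OffsetDateTime.parse(value, EMAIL_DATE);
    }

    // Sorts file names by their Date header, newest first.
    // Assumption: every entry has a parseable Date header (get() throws otherwise).
    static List<String> sortByDateDescending(Map<String, List<String>> files) {
        List<String> names = new ArrayList<>(files.keySet());
        names.sort(Comparator.comparing(
                (String n) -> parseDate(headerValue(files.get(n), "Date").get()))
                .reversed());
        return names;
    }

    public static void main(String[] args) {
        // File name -> lines of the file; in-memory stand-ins for real .eml files.
        Map<String, List<String>> files = new LinkedHashMap<>();
        files.put("a.eml", Arrays.asList("To: xyz.com", "Date: 26 Oct 2022 11:00:00 -0700", "", "body"));
        files.put("b.eml", Arrays.asList("To: xyz.com", "Date: 26 Oct 2022 14:43:15 -0700", "", "body"));
        System.out.println(sortByDateDescending(files)); // prints [b.eml, a.eml]
    }
}
```

With real files, the lines could come from java.nio.file.Files.readAllLines(path). Parsing into an OffsetDateTime means that dates with different UTC offsets are compared by the instant they denote, not by their text.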

How to try the solution in Katalon Studio

You can try the solution in your local Katalon Studio. I assume you have a Katalon project created already. The project should have some .eml files in the following folder:

  • <projectDir>/Include/data

You also need to download an external jar that I made from

Download the “SortingFilesByDateTimeInContent-0.1.0.jar” file and store it in the Drivers folder of your local Katalon project.

Then create a Test Case script like this:

import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths
import java.util.stream.Collectors

import com.kms.katalon.core.configuration.RunConfiguration

import com.kazurayam.study20221030.PathComparableByDateTime
import com.kazurayam.study20221030.PathComparableByContentEmailHeaderValue

Path dataDir = Paths.get(RunConfiguration.getProjectDir()).resolve("Include/data")

List<PathComparableByDateTime> emlFiles =
	Files.list(dataDir)
		.filter({ p -> p.toString().endsWith(".eml") })
		// wrap the Path object in an adapter class
		// to sort the Path objects by the Email Date in the file content
		.map({ p -> new PathComparableByContentEmailHeaderValue(p, "Date") })
		// in the descending order of the Date value
		.sorted(Comparator.reverseOrder())
		.collect(Collectors.toList())

// print path of all eml files
emlFiles.eachWithIndex  { p, index ->
	println((index + 1) + "\t" + p.getValue() + "\t" + dataDir.relativize(p.get()))
}

This Test Case prints messages to the console, for example:

2022-10-31 07:42:18.223 DEBUG testcase.TC                              - 1: dataDir = get(getProjectDir()).resolve("Include/data")
2022-10-31 07:42:18.249 DEBUG testcase.TC                              - 2: emlFiles = reverseOrder()).collect(Collectors.toList())
2022-10-31 07:42:18.451 DEBUG testcase.TC                              - 3: emlFiles.eachWithIndex({ java.lang.Object p, java.lang.Object index -> ... })
1	20221026_144315	79edddc6-ce98-4eff-b8ea-414e392bce1f.eml
2	20221026_120000	f503182a-d0ef-444d-a52b-658b5d10de81.eml
3	20221026_110000	test.eml
4	20221026_110000	ccc.eml
5	20221026_110000	bb.eml
6	20221026_110000	a.eml
2022-10-31 07:42:18.484 INFO  c.k.katalon.core.main.TestCaseExecutor   - END Test Cases/TC

Please note that the timestamp values are picked out of the file content.
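
As an illustration of how a header value such as “26 Oct 2022 14:43:15 -0700” relates to the compact value 20221026_144315 in the output above, here is a small JDK-only sketch. It is an assumption about the conversion, inferred from the printed output, not the library's actual code; the class name CompactTimestampSketch is hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch class; the library may convert the value differently.
final class CompactTimestampSketch {

    // Format of the Date header value, as seen in the sample file.
    static final DateTimeFormatter EMAIL_DATE =
            DateTimeFormatter.ofPattern("d MMM yyyy HH:mm:ss Z", Locale.ENGLISH);

    // Compact form as seen in the console output; the UTC offset is dropped.
    static final DateTimeFormatter COMPACT =
            DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

    static String compact(String headerValue) {
        return OffsetDateTime.parse(headerValue, EMAIL_DATE).format(COMPACT);
    }

    public static void main(String[] args) {
        System.out.println(compact("26 Oct 2022 14:43:15 -0700")); // prints 20221026_144315
    }
}
```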

The com.kazurayam.study20221030.PathComparableByContentEmailDate class implements everything needed to solve Dave’s problem. If you want to study more, please visit the following GitHub project and read the source.

Dave’s problem is a pure Java/Groovy programming problem, with nothing specific to Katalon Studio. So the SortingFilesByDateTimeInContent project is not a Katalon project; it is a plain Java 8 + Gradle + JUnit 5 project. I used my favorite IDE (IntelliJ IDEA) to develop it.

I released version 0.3.2. It supports filtering text files by the value of an Email header (such as “To: xyz.com”) AND sorting the files by the lastModified property of java.io.File.

You can perform it like this:

...
List<IPathComparable> files =
    Files.list(this.dir)
        .filter(p -> p.getFileName().toString().endsWith(".eml"))
        // filter files with "To: xyz.com" header
        .map(p -> new PathComparableByContentEmailHeaderValue(p, "To"))
        .filter(p -> p.getValue().matches("xyz.com"))
        // to sort by the lastModified timestamp
        .map(p -> new PathComparableByFileLastModified(p.get()))
        .sorted(reverseOrder())
        .collect(Collectors.toList());
...
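
For readers who want to see the same filter-then-sort pipeline without the jar, here is a minimal sketch that uses only the JDK. It does not use or reproduce the library's classes. The class FilterThenSortSketch and its helper methods are hypothetical, and the sketch assumes the files are readable as UTF-8 and that the header block ends at the first blank line.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch class; it does not use the library's classes.
final class FilterThenSortSketch {

    // True if the header block has a "<name>: <expected>" line (name is case-insensitive).
    static boolean hasHeader(Path file, String name, String expected) {
        String prefix = name + ":";
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                if (line.isEmpty()) {
                    break; // a blank line ends the header block
                }
                if (line.regionMatches(true, 0, prefix, 0, prefix.length())) {
                    return line.substring(prefix.length()).trim().equals(expected);
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static long lastModifiedMillis(Path file) {
        try {
            return Files.getLastModifiedTime(file).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // .eml files in dir whose "To" header equals `to`, most recently modified first.
    static List<Path> select(Path dir, String to) throws IOException {
        try (Stream<Path> paths = Files.list(dir)) {
            return paths
                .filter(p -> p.getFileName().toString().endsWith(".eml"))
                .filter(p -> hasHeader(p, "To", to))
                .sorted(Comparator.<Path>comparingLong(FilterThenSortSketch::lastModifiedMillis).reversed())
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        select(dir, "xyz.com").forEach(System.out::println);
    }
}
```

The sketch compares the header value with equals instead of a regular expression, so the dot in “xyz.com” is matched literally.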

The javadoc is here

Please note that I wrote my sample code in Java, using a lambda expression like

filter(p -> p.getFileName().toString().endsWith(".eml"))

If you copy and paste the above code into a Test Case in Katalon Studio, you will get a compilation error. You need to rewrite the lambda expression in Groovy's Closure syntax, like

filter({p -> p.getFileName().toString().endsWith(".eml")})