Verify if sorting of alphanumeric with symbol characters is correct

Sorry, this is off topic. Not relevant to the original issue at all.

I played with java.text.RuleBasedCollator some more. I wanted to sort numbers (days of the month, in the range 1 - 31) written in Japanese kanji characters in “natural order”. This is my code:

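import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;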

List<String> stringList = Arrays.asList("十","五","一","七","八","四","六","二","九",
	"三","十一","二十","廿","廿三","三十","二十一","十三");
// [10,5,1,7,8,4,6,2,9,3,11,20,20,23,30,21,13]

RuleBasedCollator localRules = (RuleBasedCollator) Collator.getInstance(Locale.JAPAN);
StringBuilder sb = new StringBuilder();
sb.append("一 < 二 < 三 < 四 < 五 < 六 < 七 < 八 < 九 < 十");
sb.append(" < 十一 < 十二 < 十三 < 十四 < 十五 < 十六 < 十七 < 十八 < 十九 < 二十,廿");
sb.append(" < 二十一,廿一 < 二十二,廿二 < 二十三,廿三 < 二十四,廿四 < 二十五,廿五");
sb.append(" < 二十六,廿六 < 二十七,廿七 < 二十八,廿八 < 二十九,廿九 < 三十 < 三十一");

RuleBasedCollator c = new RuleBasedCollator(localRules.getRules() + " & " + sb.toString());
Collections.sort(stringList, c);
System.out.println(stringList);
// [一, 二, 三, 四, 五, 六, 七, 八, 九, 十, 十一, 十三, 二十, 廿, 二十一, 廿三, 三十]
// [1,2,3,4,5,6,7,8,9,10,11,13,20,20,21,23,30]

Wow! Perfectly natural.


A memo for myself.

The concept of “Collation” is documented in detail by ICU. The ICU User Guide says:

Combinations of letters can be treated as if they were one letter. For example, in traditional Spanish “ch” is treated as a single letter, and sorted between “c” and “d”.

This is the point.

For example, take the set of strings 1, 10 and 2. The RuleBasedCollator can treat each of them as a single letter, so I can define the order as 1 < 2 < 10.
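
To make the 1 < 2 < 10 idea concrete, here is a minimal sketch using only java.text.RuleBasedCollator. The class name ContractionOrderSketch, the RULES constant and the sortWithRules helper are names I made up for illustration; the rule string is an assumption based on the rule syntax in the RuleBasedCollator Javadoc, where a multi-character sequence such as "10" is handled as a single collation element.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContractionOrderSketch {

    // Each "<" introduces the next item in the order. "10" is a
    // two-character sequence that the collator treats as one "letter",
    // placed after "2". (Illustrative rule string, not from the original post.)
    static final String RULES = "< 1 < 2 < 10";

    // Returns a sorted copy of the input, ordered by the custom rules.
    static List<String> sortWithRules(List<String> input) throws ParseException {
        RuleBasedCollator collator = new RuleBasedCollator(RULES);
        List<String> sorted = new ArrayList<>(input);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) throws ParseException {
        // With the rules above, "10" should come after "2".
        System.out.println(sortWithRules(Arrays.asList("1", "10", "2")));
    }
}
```

This only covers the strings listed in the rules. Anything else (for example 3 or 11) is not given a numeric position, so the approach fits small, closed sets such as days of the month.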


hi,

once more, by another road: with Python

TESTCASE

        List<String> list = new ArrayList<String>();
        list.add("AA-10");
        list.add("AA-1");
        list.add("AA-2");
        list.add("AA-2 (1)");

        //String result = list.join(",")

        StringBuilder sb = new StringBuilder();
        for (String item : list) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(item);
        }
        String result = sb.toString();

        CustomKeywords.'demo.PythonKeywords.sortList'(result)

        Katalon keyword:

        @Keyword
        def sortList(String names) {
            runPython("keywords.sort_string", names)
        }

        Python keyword:

        from natsort import natsorted
        def sort_string(AllArgs, names):
          res = names.strip('][').split(', ')
          res = natsorted(res, key=lambda y: y.lower())
          print(res)
          return res
result:
2020-04-18 15:02:19.704 INFO  com.kms.katalon.core.util.KeywordUtil    - ['AA-1', 'AA-2', 'AA-2 (1)', 'AA-10']
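
For anyone who wants the same ordering without calling out to Python, here is a rough Java sketch of a natural-order comparator. It is not how natsort is implemented; it is my own simplified approximation, and the names NaturalOrderSketch, NATURAL and naturalSort are hypothetical. It assumes ASCII digits only, compares digit runs by numeric value, and compares everything else character by character, ignoring case (similar in spirit to the key=lambda y: y.lower() above).

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NaturalOrderSketch {

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Walks both strings in parallel. Runs of digits are compared as
    // numbers; all other characters are compared case-insensitively.
    static final Comparator<String> NATURAL = (a, b) -> {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i);
            char cb = b.charAt(j);
            if (isAsciiDigit(ca) && isAsciiDigit(cb)) {
                // Find the end of the digit run in each string.
                int endA = i;
                while (endA < a.length() && isAsciiDigit(a.charAt(endA))) endA++;
                int endB = j;
                while (endB < b.length() && isAsciiDigit(b.charAt(endB))) endB++;
                // BigInteger avoids overflow on very long digit runs.
                int cmp = new BigInteger(a.substring(i, endA))
                        .compareTo(new BigInteger(b.substring(j, endB)));
                if (cmp != 0) return cmp;
                i = endA;
                j = endB;
            } else {
                int cmp = Character.compare(
                        Character.toLowerCase(ca), Character.toLowerCase(cb));
                if (cmp != 0) return cmp;
                i++;
                j++;
            }
        }
        // The string that ran out first sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    // Returns a sorted copy of the input in natural order.
    static List<String> naturalSort(List<String> input) {
        List<String> sorted = new ArrayList<>(input);
        sorted.sort(NATURAL);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(
                naturalSort(Arrays.asList("AA-10", "AA-1", "AA-2", "AA-2 (1)")));
    }
}
```

For the four test strings this gives the same order as the log line above. I have not checked it against natsort for signs, decimals or non-ASCII digits, so treat it as a starting point only.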