Description
This is a follow-on from #41. I initially posted the comments below to that closed issue, which I couldn't re-open. They really warrant a separate issue, as #41 is related only to the Polish language. I'll delete the original comments on #41 after posting in this issue.
I've attached 8 files named <language>_new.pm.txt. Each one merges the changes required to parse dates on macOS Monterey 12.5.1 and Oracle Linux Server 7.9. I appended .txt to permit attachment. There are some parse failures, which I note below.
I've also attached two crude scripts I used to test all the language configuration files, date-manip-test-macos.sh.txt and date-manip-test-linux.sh.txt. They aren't ideal tests, as they will fail to parse a date if, for example, either a day abbreviation or a month abbreviation isn't recognised. When all date time elements are recognised, they parse, which is how I used them. I ran both scripts with the final merged language files.
I'm not a speaker of any of these languages, so I simply updated the files to achieve successful parsing, using the strings output by date(1).
Do you know yet which version number will incorporate these changes? I'll look for it downstream.
Explaining the Romanian Tuesday day_name update
I added an entry to the Romanian Tuesday array, the last element below, which both Linux and macOS output as the full day name:
['marți', 'marti', 'marþi', 'marţi'],
The ț in the first element is U+021B, Latin small letter t with comma below.
The ţ in the last element is U+0163, Latin small letter t with cedilla.
Both are accepted as Tuesday in Romanian, e.g. see reverso translation of both.
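To show why both spellings need their own array entry, here is a minimal sketch that prints the UTF-8 bytes of the two characters. The helper name utf8_hex is hypothetical, and the characters are written as octal escapes so the result doesn't depend on the editor's encoding. A parser that matches literal strings sees two different byte sequences.

```shell
# Hypothetical helper: print the UTF-8 bytes of a string as hex, no separators.
utf8_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# U+021B (t with comma below) and U+0163 (t with cedilla) as octal escapes.
t_comma=$(printf '\310\233')
t_cedilla=$(printf '\305\243')

utf8_hex "$t_comma"; echo      # c89b
utf8_hex "$t_cedilla"; echo    # c5a3
```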
Parse failures
I'm including this section just FYI. None of this is causing me a problem. Just stuff I noticed.
Russian on macOS
The default output of date on macOS fails to parse. This is due to the "г." after the year and the parentheses around the TZ name in the output below. The failed parse shows up as a blank line. Using a format that omits the "г." and the parentheses results in a successful parse.
% LANG=ru_RU date
четверг, 6 октября 2022 г. 18:34:30 (AEDT)
% LANG=ru_RU date | perl -I/Users/USERNAME/Downloads/Date-Manip-SBECK-github/lib/ -MDate::Manip -lpe 'Date_Init("Language=Russian", "DateFormat=non-US"); $_=UnixDate(ParseDate($_), "%Y%m%d %T")'
% LANG=ru_RU date +"%A, %e %B %Y %T %Z"
четверг, 6 октября 2022 18:35:53 AEDT
% LANG=ru_RU date +"%A, %e %B %Y %T %Z" | perl -I/Users/USERNAME/Downloads/Date-Manip-SBECK-github/lib/ -MDate::Manip -lpe 'Date_Init("Language=Russian", "DateFormat=non-US"); $_=UnixDate(ParseDate($_), "%Y%m%d %T")'
20221006 18:36:10
%
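If changing the date format isn't an option, the default output could be pre-processed instead. This is a minimal sketch, not something I ran through Date::Manip: the function name normalize_ru_date is hypothetical, and it only rewrites the string into the same shape as the format that parsed above.

```shell
# Hypothetical helper: strip the " г." year suffix and the parentheses
# around the time zone from the default ru_RU output of date on macOS.
normalize_ru_date() {
    sed -e 's/ г\.//' -e 's/[()]//g'
}

printf '%s\n' 'четверг, 6 октября 2022 г. 18:34:30 (AEDT)' | normalize_ru_date
# четверг, 6 октября 2022 18:34:30 AEDT
```

The output could then be piped into the same perl one-liner used above.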
German, Italian and Norwegian on Linux
German
The default format doesn’t parse, with or without .UTF-8:
$ LANG=de_DE.UTF-8 date --date="2022-01-03 11:00:00"
Mo 3. Jan 11:00:00 AEST 2022
It appears to be due to the period after the day of the month. This does parse, with or without .UTF-8:
$ LANG=de_DE.UTF-8 date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
Mo 3 Jan 2022 11:00:00 AEST
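As an alternative to passing a format, the default German output could be rewritten into the layout that parses. This is a rough sketch only: the function name normalize_de_date is hypothetical, it assumes the default output always has the six fields shown above, and I have not run its result through Date::Manip.

```shell
# Hypothetical helper: drop the period after the day of the month and move
# the year in front of the time, e.g.
#   "Mo 3. Jan 11:00:00 AEST 2022" -> "Mo 3 Jan 2022 11:00:00 AEST"
normalize_de_date() {
    sed -e 's/^\([^ ]*\)  *\([0-9][0-9]*\)\.  *\([^ ]*\)  *\([0-9:]*\)  *\([^ ]*\)  *\([0-9]*\)$/\1 \2 \3 \6 \4 \5/'
}

printf '%s\n' 'Mo 3. Jan 11:00:00 AEST 2022' | normalize_de_date
# Mo 3 Jan 2022 11:00:00 AEST
```

Lines that don't match the expected six-field layout pass through unchanged.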
Italian
The default format doesn’t parse, with or without .UTF-8:
$ LANG=it_IT.UTF-8 date --date="2022-01-03 11:00:00"
lun 3 gen 2022, 11.00.00, AEST
It appears to be due to the commas; I didn't check whether the periods in the time also contribute. This does parse, with or without .UTF-8:
$ LANG=it_IT.UTF-8 date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
lun 3 gen 2022 11:00:00 AEST
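The default Italian output can also be turned into the parsing layout by text substitution. A minimal sketch, with the hypothetical function name normalize_it_date: it removes the commas and also rewrites the periods in the time as colons, since it isn't established which of the two causes the failure.

```shell
# Hypothetical helper: remove commas and turn "HH.MM.SS" into "HH:MM:SS".
normalize_it_date() {
    sed -e 's/,//g' \
        -e 's/\([0-9][0-9]\)\.\([0-9][0-9]\)\.\([0-9][0-9]\)/\1:\2:\3/'
}

printf '%s\n' 'lun 3 gen 2022, 11.00.00, AEST' | normalize_it_date
# lun 3 gen 2022 11:00:00 AEST
```

The result matches, character for character, the output of the explicit format that parsed above.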
Norwegian
The default format doesn’t parse:
$ LANG=no_NO date --date="2022-01-03 11:00:00"
ma. 03. jan. 11:00:00 +1000 2022
It seems to be due to the period after the day of the month.
With .UTF-8, date seems to fall back to the default LANG (en_AU) and prints English:
$ LANG=no_NO.UTF-8 date --date="2022-01-03 11:00:00"
Mon Jan 3 11:00:00 AEST 2022
This does parse without .UTF-8:
$ LANG=no_NO date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
ma. 3 jan. 2022 11:00:00 AEST
This doesn’t parse with .UTF-8, unless you use English, i.e. Date_Init("Language=English", "DateFormat=non-US"):
$ LANG=no_NO.UTF-8 date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
Mon 3 Jan 2022 11:00:00 AEST
Files
finnish_new.pm.txt
french_new.pm.txt
norwegian_new.pm.txt
polish_new.pm.txt
portugue_new.pm.txt
romanian_new.pm.txt
russian_new.pm.txt
turkish_new.pm.txt
date-manip-test-macos.sh.txt
date-manip-test-linux.sh.txt