Updates for language files #42

@davidh2075

This is a follow-on from #41. I initially posted the comments below to that closed issue, which I couldn't re-open. They really warrant a separate issue, as #41 is related only to the Polish language. I'll delete the original comments on #41 after posting in this issue.

I've attached 8 files named <language>_new.pm.txt. They merge the changes required to parse dates on macOS Monterey 12.5.1 and Oracle Linux Server 7.9. I appended .txt to permit attachment. I note some remaining parse failures below.

I've also attached two crude scripts I used to test all the language configuration files, date-manip-test-macos.sh.txt and date-manip-test-linux.sh.txt. They aren't ideal tests, as they will fail to parse a date if, for example, either a day abbreviation or a month abbreviation isn't recognised. When all date time elements are recognised, they parse, which is how I used them. I ran both scripts with the final merged language files.

I'm not a speaker of any of these languages, so I simply updated the files to achieve successful parsing, using the strings output by date(1).

Do you know yet which version number will incorporate these changes? I'll look for it downstream.

Explaining the Romanian Tuesday day_name update

I added an entry to the Romanian Tuesday array, the last element below, which both Linux and macOS output as the full day name:
['marți', 'marti', 'marþi', 'marţi'],
The ț in the first element is U+021B, Latin small letter t with comma below.
The ţ in the last element is U+0163, Latin small letter t with cedilla.
Both are accepted as Tuesday in Romanian, e.g. see reverso translation of both.
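Because the two characters look almost identical on screen, dumping the UTF-8 bytes is a quick way to tell which one a given string contains. This is only a minimal sketch: hex_bytes is a hypothetical helper name, not part of Date::Manip or the attached scripts. In UTF-8, U+021B encodes as the bytes c8 9b and U+0163 as c5 a3.

```shell
#!/bin/sh
# Hypothetical helper: print the bytes of its argument as one hex string.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

# First array element: t with comma below (U+021B, bytes c8 9b)
hex_bytes 'marți'
# Last array element: t with cedilla (U+0163, bytes c5 a3)
hex_bytes 'marţi'
```

The same check can be applied to the day name printed by date(1) to see which form a given platform emits.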

Parse failures

I'm including this section just FYI. None of this is causing me a problem. Just stuff I noticed.

Russian on macOS

The standard output of date on macOS fails to parse. It appears to be due to the "г. " after the year and the parentheses around the TZ name in the output below. The blank line is the failed parse. Using a format that eliminates the "г. " and the parentheses results in a successful parse.

% LANG=ru_RU date
четверг,  6 октября 2022 г. 18:34:30 (AEDT)
% LANG=ru_RU date | perl -I/Users/USERNAME/Downloads/Date-Manip-SBECK-github/lib/ -MDate::Manip -lpe 'Date_Init("Language=Russian", "DateFormat=non-US"); $_=UnixDate(ParseDate($_), "%Y%m%d %T")'

% LANG=ru_RU date +"%A, %e %B %Y %T %Z"
четверг,  6 октября 2022 18:35:53 AEDT
% LANG=ru_RU date +"%A, %e %B %Y %T %Z" | perl -I/Users/USERNAME/Downloads/Date-Manip-SBECK-github/lib/ -MDate::Manip -lpe 'Date_Init("Language=Russian", "DateFormat=non-US"); $_=UnixDate(ParseDate($_), "%Y%m%d %T")'
20221006 18:36:10
%
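
Where changing the date format isn't an option, the default output could instead be normalised before it is handed to ParseDate. This is a minimal sketch, not something I have run against Date::Manip: normalize_ru_date is a hypothetical helper name, and it assumes the only obstacles are the "г." after the year and the parentheses around the TZ name, as described above. The result has the same layout as the explicit format that parsed successfully.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the default ru_RU date output into the
# layout produced by +"%A, %e %B %Y %T %Z".
normalize_ru_date() {
    # 1st expression: drop the "г." year marker.
    # 2nd expression: remove the parentheses around the time zone name.
    sed -e 's/ г\. / /' -e 's/(\([A-Za-z0-9+-]*\))/\1/'
}

printf '%s\n' 'четверг,  6 октября 2022 г. 18:34:30 (AEDT)' | normalize_ru_date
# prints: четверг,  6 октября 2022 18:34:30 AEDT
```

In practice the function would sit between date and the perl one-liner, e.g. LANG=ru_RU date | normalize_ru_date | perl ...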

German, Italian and Norwegian on Linux

German

The default format doesn’t parse, with or without .UTF-8:

$ LANG=de_DE.UTF-8 date --date="2022-01-03 11:00:00"
Mo 3. Jan 11:00:00 AEST 2022

It appears to be due to the period after the day of the month. This does parse, with or without .UTF-8:

$ LANG=de_DE.UTF-8 date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
Mo  3 Jan 2022 11:00:00 AEST

Italian

The default format doesn’t parse, with or without .UTF-8:

$ LANG=it_IT.UTF-8 date --date="2022-01-03 11:00:00"
lun  3 gen 2022, 11.00.00, AEST

It appears to be due to the commas; I didn't check whether the periods in the time also contribute. This does parse, with or without .UTF-8:

$ LANG=it_IT.UTF-8 date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
lun  3 gen 2022 11:00:00 AEST
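
The default Italian output differs from the format that parses only in its commas and in the periods used as time separators, so it can be rewritten into that layout. This is a minimal sketch under that assumption: normalize_it_date is a hypothetical helper name, and since I didn't check whether the periods matter, converting them to colons here is a precaution rather than a confirmed requirement.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the default it_IT date output into the
# layout produced by +"%a %e %b %Y %T %Z".
normalize_it_date() {
    # 1st expression: remove the commas.
    # 2nd expression: turn HH.MM.SS into HH:MM:SS.
    sed -e 's/,//g' \
        -e 's/\([0-9][0-9]\)\.\([0-9][0-9]\)\.\([0-9][0-9]\)/\1:\2:\3/'
}

printf '%s\n' 'lun  3 gen 2022, 11.00.00, AEST' | normalize_it_date
# prints: lun  3 gen 2022 11:00:00 AEST
```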

Norwegian

The default format doesn’t parse:

$ LANG=no_NO date --date="2022-01-03 11:00:00"
ma. 03. jan. 11:00:00 +1000 2022

It seems to be due to the period after the day of the month.

With .UTF-8 appended, date seems to output in the default LANG (en_AU) instead of Norwegian:

$ LANG=no_NO.UTF-8 date --date="2022-01-03 11:00:00"
Mon Jan  3 11:00:00 AEST 2022

This does parse without .UTF-8:

$ LANG=no_NO date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
ma.  3 jan. 2022 11:00:00 AEST

This doesn’t parse with .UTF-8, unless you use English, i.e. Date_Init("Language=English", "DateFormat=non-US"):

$ LANG=no_NO.UTF-8 date --date="2022-01-03 11:00:00" +"%a %e %b %Y %T %Z"
Mon  3 Jan 2022 11:00:00 AEST

Files

finnish_new.pm.txt
french_new.pm.txt
norwegian_new.pm.txt
polish_new.pm.txt
portugue_new.pm.txt
romanian_new.pm.txt
russian_new.pm.txt
turkish_new.pm.txt
date-manip-test-macos.sh.txt
date-manip-test-linux.sh.txt
