Commit f2e59f3

update translate script, add new py script to resolve bugs
1 parent 3ae1584 commit f2e59f3

5 files changed: +74 -162 lines changed

.gitignore

Lines changed: 1 addition & 3 deletions
@@ -40,9 +40,7 @@ Thumbs.db
 *.sqlite3
 
 # translation files
-untranslated_fr.tmp
-untranslated_ja.tmp
-untranslated_zh.tmp
+*.tmp
 
 locales/fr/LC_MESSAGES/messages.po.bak
 locales/ja/LC_MESSAGES/messages.po.bak
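
The three per-language entries collapse into a single `*.tmp` pattern, which also covers the new `untranslated_pot.tmp`. A minimal sketch of that effect, assuming that a slash-free gitignore pattern is matched against the file's base name; `ignored_by_tmp_rule` is a hypothetical helper, not part of the repository, and `fnmatchcase` only approximates git's own matcher:

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def ignored_by_tmp_rule(path: str) -> bool:
    """Approximate the `*.tmp` gitignore rule by matching the base name only."""
    return fnmatchcase(PurePosixPath(path).name, "*.tmp")


# The old explicit entries and the new pot file are all covered by the wildcard.
for candidate in (
    "locales/untranslated_fr.tmp",
    "locales/untranslated_pot.tmp",
    "locales/fr/LC_MESSAGES/messages.po.bak",
):
    print(candidate, ignored_by_tmp_rule(candidate))
```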

README.md

Lines changed: 14 additions & 4 deletions
@@ -17,18 +17,19 @@ A practical command-line interface (CLI) for selected [astroquery](https://astro
 - **Gaia**: Cone search
 - **IRSA**: Infrared Science Archive queries
 - **Heasarc**: HEASARC Archive queries
-- **JPLSBDB**: JPL Small-Body Database queries
+- **JPL**: JPL Small-Body Database queries
 - **MAST**: Mikulski Archive for Space Telescopes queries
-- **NASA-ADS**: NASA Astrophysics Data System literature search and BibTeX retrieval, allows simple commands to search for "latest papers" or "highly cited reviews".
+- **ADS**: NASA Astrophysics Data System literature search and BibTeX retrieval, allows simple commands to search for "latest papers" or "highly cited reviews".
 - **NED**: NASA/IPAC Extragalactic Database name resolution
 - **NIST**: National Institute of Standards and Technology Atomic Spectra Database queries
+- **Exoplanet**: NASA Exoplanet Archive queries
 - **SDSS**: Sloan Digital Sky Survey queries
 - **ESO**: European Southern Observatory queries
 - **SIMBAD**: SIMBAD Astronomical Database basic query
 - **Splatalogue**: Molecular line queries
 - **VizieR**: VizieR Catalogue Database catalog search, basic query
 
-_Some modules and commands are not fully implemented. Please refer to `aqc --help` for the latest status._
+_Some modules and commands are not fully implemented. Aliases are available for some modules (e.g., `sim` for `simbad`, `viz` for `vizier`, `spl` for `splatalogue`, `hea` for `heasarc`, `exo` for `exoplanet`). Please refer to `aqc --help` for the latest status._
 
 ---
 
@@ -126,7 +127,16 @@ aqc --field simbad
 
 ### Updating Translations
 
-Helper scripts in the `locales/` directory assist with extracting, updating, and compiling translation files. See script comments for details.
+Helper scripts in the `locales/` directory assist with extracting, updating, and compiling translation files. The general workflow is as follows:
+
+1. **Extract untranslated entries**: Run `locales/extract-untranslated.sh`. This script generates `untranslated_pot.tmp` (for new entries in `messages.pot`) and `untranslated_<lang>.tmp` files (for untranslated entries in language-specific `.po` files).
+2. **Translate `untranslated_pot.tmp`**: Manually translate the entries in `locales/untranslated_pot.tmp`. These are new `msgid` entries that need to be added to all language files.
+3. **Merge translations**: After translating `untranslated_pot.tmp`, merge these translations into the respective `untranslated_<lang>.tmp` files. This step typically involves copying the translated `msgstr` from `untranslated_pot.tmp` to the corresponding entries in `untranslated_<lang>.tmp`.
+4. **Update `.po` files**: Run `locales/update-po.sh` to incorporate the translated entries from the `untranslated_<lang>.tmp` files into the `messages.po` files for each language.
+5. **Check for updates**: Run `locales/check-update.sh` to ensure all translation files are consistent and up-to-date.
+6. **Compile `.mo` files**: After updating `.po` files, compile them into `.mo` files using `locales/compile-mo.sh` (or similar command if not explicitly provided as a script).
+
+Refer to the comments within each script in the `locales/` directory for more detailed instructions.
 
 ---
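
Step 3 of the workflow added to the README (merging translations) is described only in prose. The sketch below shows one way it could be done, assuming the translator types each translation after the `|||` separator that `extract_untranslated.py` writes at the end of every line. `parse_tmp_lines` and `merge_translations` are hypothetical helpers, and the format that `update-po.sh` actually expects is not shown in this commit. Multi-line msgids would span several lines in the `.tmp` file and are not handled here.

```python
from typing import Dict, Iterable, List

SEPARATOR = "|||"  # delimiter the extraction script appends after each msgid


def parse_tmp_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Map each msgid to the text after the separator (empty if untranslated)."""
    translations: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if SEPARATOR not in line:
            continue  # skip blank or malformed lines
        msgid, msgstr = line.split(SEPARATOR, 1)
        translations[msgid] = msgstr
    return translations


def merge_translations(pot_lines: Iterable[str], lang_lines: Iterable[str]) -> List[str]:
    """Fill still-empty entries of a language file from the translated pot file."""
    pot = parse_tmp_lines(pot_lines)
    lang = parse_tmp_lines(lang_lines)
    merged = []
    for msgid, msgstr in lang.items():
        # Keep an existing translation; otherwise fall back to the pot entry.
        merged.append(f"{msgid}{SEPARATOR}{msgstr or pot.get(msgid, '')}")
    return merged


if __name__ == "__main__":
    pot = ["Hello|||Bonjour\n", "Bye|||\n"]
    lang = ["Hello|||\n", "Bye|||\n", "Yes|||Oui\n"]
    print("\n".join(merge_translations(pot, lang)))
```

Entries that already carry a translation in the language file are left alone, so rerunning the merge does not overwrite manual edits.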

locales/extract-untranslated.sh

Lines changed: 17 additions & 155 deletions
@@ -9,166 +9,28 @@ PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
 # Full path to locales directory
 LOCALES_DIR="$SCRIPT_DIR"
 DOMAIN="messages"
-POT_FILE="$LOCALES_DIR/$DOMAIN.pot" # This needs to be an absolute path
 
-# AWK script to extract and clean msgid strings, handling multi-line and unescaping
-# This script is written to a temporary file to avoid issues with 'read -r -d'
-AWK_EXTRACT_MSGID_SCRIPT_PATH="$LOCALES_DIR/awk_extract_msgid.awk"
-cat << 'EOF_AWK_EXTRACT_MSGID' > "$AWK_EXTRACT_MSGID_SCRIPT_PATH"
-BEGIN {
-    current_msgid_raw = "";
-    in_msgid_block = 0;
-}
-
-/^msgid / {
-    if (in_msgid_block) {
-        cleaned_msgid = current_msgid_raw;
-        sub(/^msgid /, "", cleaned_msgid);
-        if (length(cleaned_msgid) > 0 && substr(cleaned_msgid, 1, 1) == "\"" && substr(cleaned_msgid, length(cleaned_msgid), 1) == "\"") {
-            cleaned_msgid = substr(cleaned_msgid, 2, length(cleaned_msgid) - 2);
-        }
-        gsub(/\n"/, "\n", cleaned_msgid);
-        gsub(/\\"/, "\"", cleaned_msgid);
-        gsub(/\\n/, "\n", cleaned_msgid);
-        print cleaned_msgid;
-    }
-    current_msgid_raw = $0;
-    in_msgid_block = 1;
-    next;
-}
-
-/^msgstr / {
-    cleaned_msgid = current_msgid_raw;
-    sub(/^msgid /, "", cleaned_msgid);
-    if (length(cleaned_msgid) > 0 && substr(cleaned_msgid, 1, 1) == "\"" && substr(cleaned_msgid, length(cleaned_msgid), 1) == "\"") {
-        cleaned_msgid = substr(cleaned_msgid, 2, length(cleaned_msgid) - 2);
-    }
-    gsub(/\n"/, "\n", cleaned_msgid);
-    gsub(/\\"/, "\"", cleaned_msgid);
-    gsub(/\\n/, "\n", cleaned_msgid);
-    print cleaned_msgid;
+echo "Extracting untranslated and missing entries using Python..."
+for file in "$LOCALES_DIR"/$DOMAIN.pot "$LOCALES_DIR"/*/LC_MESSAGES/$DOMAIN.po; do
+    [ -f "$file" ] || continue
 
-    current_msgid_raw = "";
-    in_msgid_block = 0;
-    next;
-}
-
-/^"/ {
-    if (in_msgid_block) {
-        current_msgid_raw = current_msgid_raw "\n" $0;
-    }
-    next;
-}
-
-/^#/ {
-    next;
-}
-
-/^$/ {
-    if (in_msgid_block) {
-        cleaned_msgid = current_msgid_raw;
-        sub(/^msgid /, "", cleaned_msgid);
-        if (length(cleaned_msgid) > 0 && substr(cleaned_msgid, 1, 1) == "\"" && substr(cleaned_msgid, length(cleaned_msgid), 1) == "\"") {
-            cleaned_msgid = substr(cleaned_msgid, 2, length(cleaned_msgid) - 2);
-        }
-        gsub(/\n"/, "\n", cleaned_msgid);
-        gsub(/\\"/, "\"", cleaned_msgid);
-        gsub(/\\n/, "\n", cleaned_msgid);
-        print cleaned_msgid;
-    }
-    current_msgid_raw = "";
-    in_msgid_block = 0;
-    next;
-}
-
-END {
-    if (in_msgid_block && current_msgid_raw != "") {
-        cleaned_msgid = current_msgid_raw;
-        sub(/^msgid /, "", cleaned_msgid);
-        if (length(cleaned_msgid) > 0 && substr(cleaned_msgid, 1, 1) == "\"" && substr(cleaned_msgid, length(cleaned_msgid), 1) == "\"") {
-            cleaned_msgid = substr(cleaned_msgid, 2, length(cleaned_msgid) - 2);
-        }
-        gsub(/\n"/, "\n", cleaned_msgid);
-        gsub(/\\"/, "\"", cleaned_msgid);
-        gsub(/\\n/, "\n", cleaned_msgid);
-        print cleaned_msgid;
-    }
-}
-EOF_AWK_EXTRACT_MSGID
-
-echo "Extracting untranslated and missing entries..."
-for po in "$LOCALES_DIR"/*/LC_MESSAGES/$DOMAIN.po; do
-    [ -f "$po" ] || continue
-    lang=$(basename "$(dirname "$(dirname "$po")")")
-    tmpfile="$LOCALES_DIR/untranslated_${lang}.tmp"
-    tmpfile_pot_msgids="$LOCALES_DIR/all_pot_msgids.tmp"
-    tmpfile_po_translated_msgids="$LOCALES_DIR/po_translated_msgids_${lang}.tmp"
+    filename=$(basename "$file")
+    if [ "$filename" = "$DOMAIN.pot" ]; then
+        lang="pot"
+        tmpfile="$LOCALES_DIR/untranslated_${lang}.tmp"
+    else
+        lang=$(basename "$(dirname "$(dirname "$file")")")
+        tmpfile="$LOCALES_DIR/untranslated_${lang}.tmp"
+    fi
 
-    # Clear tmp files
+    # Clear tmp file
     : > "$tmpfile"
-    : > "$tmpfile_pot_msgids"
-    : > "$tmpfile_po_translated_msgids"
-
-    echo "--- Processing language: $lang ---"
 
-    # 1. Extract all msgids from the .pot file
-    echo "Step 1: Extracting all msgids from $POT_FILE..."
-    awk -f "$AWK_EXTRACT_MSGID_SCRIPT_PATH" "$POT_FILE" | sort -u > "$tmpfile_pot_msgids"
-    echo "Step 1 Complete: All msgids from $POT_FILE extracted to $tmpfile_pot_msgids"
+    echo "--- Processing file: $file (Language: $lang) ---"
 
-    # 2. Extract all *translated* msgids from the .po file
-    echo "Step 2: Extracting translated msgids from $po..."
-    grep -P -A 1 '^msgid ' "$po" | awk '
-    BEGIN {
-        current_msgid_block = "";
-        in_msgid_section = 0;
-    }
-    /^msgid / {
-        current_msgid_block = $0;
-        in_msgid_section = 1;
-        next;
-    }
-    /^msgstr "[^"]+"$/ { # msgstr is not empty
-        if (in_msgid_section) {
-            print current_msgid_block; # Print the msgid block
-        }
-        in_msgid_section = 0;
-        current_msgid_block = "";
-        next;
-    }
-    /^"/ { # Continuation lines for msgid
-        if (in_msgid_section) {
-            current_msgid_block = current_msgid_block "\n" $0;
-        }
-        next;
-    }
-    /^#/ { next; } # Ignore comments
-    /^$/ { # Empty line, end of entry
-        in_msgid_section = 0;
-        current_msgid_block = "";
-        next;
-    }
-    END {
-        # Handle case where last entry is a translated msgid
-        if (in_msgid_section && current_msgid_block != "") {
-            # This case is tricky, as we only print if msgstr is non-empty.
-            # The grep -A 1 should handle this by providing the msgstr line.
-            # So, no need to print here, as it would have been printed by the /^msgstr/ block.
-        }
-    }
-    ' | awk -f "$AWK_EXTRACT_MSGID_SCRIPT_PATH" - | sort -u > "$tmpfile_po_translated_msgids"
-    echo "Step 2 Complete: Translated msgids from $po extracted to $tmpfile_po_translated_msgids"
+    # Use the Python script to extract untranslated msgid,msgstr pairs
+    echo "Extracting untranslated entries from $file to $tmpfile..."
+    poetry run python "$LOCALES_DIR/extract_untranslated.py" "$file" "$tmpfile"
+    echo "Complete: Untranslated entries written to: $tmpfile"
 
-    # 3. Compare the two lists to find untranslated/missing entries
-    echo "Step 3: Comparing msgids to find untranslated and missing entries..."
-    comm -23 "$tmpfile_pot_msgids" "$tmpfile_po_translated_msgids" > "$tmpfile"
-    echo "Step 3 Complete: Untranslated and missing entries written to: $tmpfile"
-
-    # Clean up temporary files for the current language
-    echo "Cleaning up temporary files for language $lang..."
-    rm "$tmpfile_pot_msgids" "$tmpfile_po_translated_msgids"
 done
-
-# Clean up the AWK script file after all languages are processed
-echo "Cleaning up AWK script file..."
-rm "$AWK_EXTRACT_MSGID_SCRIPT_PATH"
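
The rewritten loop now treats `messages.pot` as one more input and derives the output file name from the input path. That mapping can be restated as a small function; this is only a sketch of the loop's naming logic, and `tmp_target` is a hypothetical name that does not exist in the repository:

```python
from pathlib import Path
from typing import Tuple


def tmp_target(file: Path, locales_dir: Path, domain: str = "messages") -> Tuple[str, Path]:
    """Return (lang, tmpfile) following the naming used by the new shell loop."""
    if file.name == f"{domain}.pot":
        # The template itself is reported under the pseudo-language "pot".
        lang = "pot"
    else:
        # <locales>/<lang>/LC_MESSAGES/<domain>.po: the language is two levels up.
        lang = file.parent.parent.name
    return lang, locales_dir / f"untranslated_{lang}.tmp"


if __name__ == "__main__":
    locales = Path("locales")
    print(tmp_target(locales / "messages.pot", locales))
    print(tmp_target(locales / "fr" / "LC_MESSAGES" / "messages.po", locales))
```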

locales/extract_untranslated.py

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+import polib
+import os
+import sys
+
+def extract_untranslated(po_file_path, output_file_path):
+    """
+    Extracts untranslated or fuzzy entries from a .po file and writes them to an output file.
+    """
+    try:
+        po = polib.pofile(po_file_path)
+    except Exception as e:
+        print(f"Error reading PO file {po_file_path}: {e}", file=sys.stderr)
+        return
+
+    untranslated_entries = []
+    for entry in po:
+        # An entry is considered untranslated if msgstr is empty or identical to msgid,
+        # or if it is marked fuzzy. Obsolete entries are skipped.
+        if not entry.obsolete and (not entry.msgstr or entry.msgstr == entry.msgid or entry.fuzzy):
+            untranslated_entries.append(entry.msgid)
+
+    # Sort and write unique entries to the output file
+    untranslated_entries = sorted(list(set(untranslated_entries)))
+
+    try:
+        with open(output_file_path, 'w', encoding='utf-8') as f:
+            for entry in untranslated_entries:
+                # polib handles unescaping, so we just write the msgid directly
+                f.write(f"{entry}|||\n")
+        print(f"Complete: Untranslated entries written to: {output_file_path}")
+    except Exception as e:
+        print(f"Error writing to output file {output_file_path}: {e}", file=sys.stderr)
+
+if __name__ == "__main__":
+    if len(sys.argv) != 3:
+        print("Usage: python extract_untranslated.py <po_file_path> <output_file_path>", file=sys.stderr)
+        sys.exit(1)
+
+    po_file = sys.argv[1]
+    output_file = sys.argv[2]
+    extract_untranslated(po_file, output_file)
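
The selection rule in this script is compact enough to check in isolation. The sketch below restates it over a hypothetical `Entry` dataclass that mirrors only the four fields the script reads, so it runs without polib installed; the function names are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Entry:
    """Hypothetical stand-in for the fields the script reads from each PO entry."""
    msgid: str
    msgstr: str = ""
    fuzzy: bool = False
    obsolete: bool = False


def needs_translation(entry: Entry) -> bool:
    """Same condition as the script: skip obsolete entries, then flag
    empty, copied-from-msgid, or fuzzy translations."""
    if entry.obsolete:
        return False
    return not entry.msgstr or entry.msgstr == entry.msgid or entry.fuzzy


def untranslated_msgids(entries: Iterable[Entry]) -> List[str]:
    """Unique msgids that still need work, sorted as the script sorts them."""
    return sorted({e.msgid for e in entries if needs_translation(e)})


if __name__ == "__main__":
    sample = [
        Entry("Save", "Enregistrer"),          # translated: skipped
        Entry("Open"),                         # empty msgstr: reported
        Entry("OK", "OK"),                     # identical to msgid: reported
        Entry("Quit", "Quitter", fuzzy=True),  # fuzzy: reported
        Entry("Old", obsolete=True),           # obsolete: skipped
    ]
    print(untranslated_msgids(sample))
```

Because a `.pot` template normally carries empty `msgstr` values throughout, this rule reports every non-obsolete entry when it is applied to `messages.pot`. Strings that are legitimately identical in both languages, such as "OK", are also reported on every run.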

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ pyvo = "^1.4.1"
 [tool.poetry.group.dev.dependencies]
 pytest = "^7.4.2"
 snakeviz = "^2.2.0"
+polib = "^1.2.0"
 
 [build-system]
 requires = ["poetry-core"]
